> **Scope note:** This document supersedes the v1.0 architecture research. It reflects the *actual* v1.0 codebase (Drizzle ORM, local `uploads/` directory, `@cantoo/pdf-lib`, Auth.js v5) and focuses specifically on how the four v1.1 feature areas integrate with what already exists. The previous research doc described a Prisma + S3 design that was never built — disregard it for implementation.
**Multi-page handling:** Extract all pages, prefix each with `[Page N]` so OpenAI can reference page numbers when placing fields. For real estate forms (typically 8–20 pages), total token count will be 2,000–8,000 tokens — well within gpt-4o-mini's 128k context window.
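The page-prefixing step can be sketched as a small pure function. This is a minimal illustration, not the project's code: `buildPromptText` and `estimateTokens` are hypothetical names, and the 4-characters-per-token ratio is only a rough rule of thumb for English text.

```typescript
// Hypothetical helper: pageTexts[i] holds the extracted text of page i + 1.
function buildPromptText(pageTexts: string[]): string {
  return pageTexts
    .map((text, i) => `[Page ${i + 1}]\n${text.trim()}`)
    .join("\n\n");
}

// Rough size check before calling the API.
// Assumption: about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

The estimate is only useful as a sanity check that a document is nowhere near the context limit; it is not a substitute for a real tokenizer.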
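Use OpenAI's Structured Outputs with `response_format: { type: "json_schema", json_schema: { name, strict: true, schema } }` so that the returned JSON conforms to the schema. This removes the need for schema-validation retry loops, though the route should still handle refusals and truncated responses (`finish_reason: "length"`).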
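As a sketch of what the `response_format` payload might look like: the field types come from this document, but the schema name `field_placement` and the `widthPct`/`heightPct` properties are assumptions made for illustration. Strict mode requires `additionalProperties: false` and every property listed in `required`.

```typescript
// Field types taken from this document.
const FIELD_TYPES = [
  "client-signature", "initials", "text", "checkbox", "date", "agent-signature",
];

// Assumption: the model returns one entry per suggested field,
// with percentage-based geometry (widthPct/heightPct are hypothetical).
const suggestedFieldSchema = {
  type: "object",
  additionalProperties: false, // required by strict mode
  required: ["page", "type", "xPct", "yPct", "widthPct", "heightPct"],
  properties: {
    page: { type: "integer" },
    type: { type: "string", enum: FIELD_TYPES },
    xPct: { type: "number" },
    yPct: { type: "number" },
    widthPct: { type: "number" },
    heightPct: { type: "number" },
  },
};

// Value for the `response_format` parameter of a Chat Completions request.
const fieldPlacementFormat = {
  type: "json_schema" as const,
  json_schema: {
    name: "field_placement", // hypothetical schema name
    strict: true,
    schema: {
      type: "object",
      additionalProperties: false,
      required: ["fields"],
      properties: { fields: { type: "array", items: suggestedFieldSchema } },
    },
  },
};
```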
**Coordinate conversion note:** AI returns percentage-based coordinates (`xPct`, `yPct`) rather than absolute PDF points because the AI cannot know page dimensions from text alone. The conversion from percentages to PDF user-space points happens in the API route after reading the PDF with pdfjs-dist to get actual page dimensions.
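The conversion itself is simple arithmetic. The sketch below makes several assumptions that the text above does not confirm: percentages are on a 0–100 scale, `yPct` is measured from the top of the page, `widthPct`/`heightPct` exist alongside `xPct`/`yPct`, and the stored `x`/`y` refer to the box's lower-left corner in PDF user space (origin at the bottom-left of the page).

```typescript
interface PctBox { xPct: number; yPct: number; widthPct: number; heightPct: number }
interface PageSize { width: number; height: number } // in PDF points
interface PdfBox { x: number; y: number; width: number; height: number }

// Convert a percentage-based box into PDF user-space points.
function pctToPdfPoints(box: PctBox, page: PageSize): PdfBox {
  const width = (box.widthPct / 100) * page.width;
  const height = (box.heightPct / 100) * page.height;
  const x = (box.xPct / 100) * page.width;
  // Assumption: yPct is the distance from the TOP edge, so flip the axis
  // and subtract the box height to get the lower-left corner.
  const y = page.height - (box.yPct / 100) * page.height - height;
  return { x, y, width, height };
}
```

For a US Letter page (612 × 792 points), a box at `xPct: 10, yPct: 50` that is 25% wide and 5% tall lands at roughly `x = 61.2, y = 356.4` with size `153 × 39.6`.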
**Multi-page strategy:** Send all pages in one request (prefixed `[Page N]`). For real estate forms, the full text fits in 8k tokens. Do NOT split into multiple requests — the AI needs the full document context to understand which pages need signatures vs. text fields.
The existing signing flow in `src/app/api/sign/[token]/route.ts` (POST handler, step 8) reads `signatureFields` and maps client-supplied `dataURL` values to each field using `field.id`. It then calls `embedSignatureInPdf`, which draws an image at the position given by `field.x`, `field.y`, `field.width`, `field.height`, and `field.page`.
The critical invariant: **the client signing page must know which fields require a drawn signature (canvas) versus which are already filled (text/checkbox/date/agent-sig).**
### Schema Extension Strategy: Discriminated Union with Backward Compatibility
Extend `SignatureFieldData` by adding an optional `type` property. When `type` is absent or `'client-signature'`, existing behavior is preserved exactly. All field types share the same geometry properties.
**DB column:** No migration needed for the JSONB column itself. JSONB accepts any JSON; the schema change is TypeScript-only. The existing `signatureFields jsonb` column in `documents` stores the extended array.
**Backward compatibility rule:** Any `DocumentField` where `type` is `undefined` or `'client-signature'` is treated identically to the original `SignatureFieldData`. The existing `FieldPlacer.tsx` creates fields without `type` — those continue to work as client signature fields.
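One possible shape for the extended type, sketched under assumptions: the geometry property names follow the fields mentioned earlier (`id`, `page`, `x`, `y`, `width`, `height`), while the exact union members, the `value`/`checked` property placement, and the `effectiveType` helper are illustrative rather than the codebase's actual definitions.

```typescript
type FieldType =
  | "client-signature" | "initials" | "text" | "checkbox" | "date" | "agent-signature";

// Geometry shared by every field type.
interface FieldGeometry {
  id: string;
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

type DocumentField =
  | (FieldGeometry & { type?: "client-signature" }) // legacy rows omit `type`
  | (FieldGeometry & { type: "initials" })
  | (FieldGeometry & { type: "text"; value?: string })
  | (FieldGeometry & { type: "date"; value?: string })
  | (FieldGeometry & { type: "checkbox"; checked?: boolean })
  | (FieldGeometry & { type: "agent-signature"; value?: string }); // PNG dataURL

// Assumption: the old name is kept as an alias so existing imports compile.
type SignatureFieldData = DocumentField;

// Hypothetical helper that applies the backward-compatibility rule.
function effectiveType(field: DocumentField): FieldType {
  return field.type ?? "client-signature";
}
```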
The client signing page (`SigningPageClient.tsx`) currently iterates `signatureFields` and presents every field for signature. With the extended schema, it must only present `client-signature` and `initials` fields:
- The `POST /api/sign/[token]` route uses `doc.signatureFields` from the DB and server-stored coordinates. Text/checkbox/date/agent-sig fields are already baked into the prepared PDF by the time the client signs. The signing API should filter `signatureFields` the same way — only embed images for `client-signature`/`initials` fields.
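The same predicate can serve both the signing page and the signing API. A minimal sketch follows; `requiresClientInput` and `clientFields` are hypothetical names, and the structural `FieldLike` type stands in for the real field type.

```typescript
// Minimal structural view of a stored field; the real type has geometry too.
interface FieldLike { id: string; type?: string }

// True for fields the client must draw: legacy (no type),
// explicit client signatures, and initials.
function requiresClientInput(field: FieldLike): boolean {
  return (
    field.type === undefined ||
    field.type === "client-signature" ||
    field.type === "initials"
  );
}

// Generic so callers keep their full field type after filtering.
function clientFields<T extends FieldLike>(fields: T[]): T[] {
  return fields.filter(requiresClientInput);
}
```

Sharing one predicate keeps the client and server from drifting apart on which fields expect a drawn image.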
Extend the palette with new draggable tokens, each creating a `DocumentField` with the appropriate `type`. The existing drag-drop, move, resize, and persist logic does not change — it operates on the shared geometry properties.
The existing `preparePdf` function draws a blue rectangle + "Sign Here" for every entry in `sigFields`. With extended types, it needs type-aware rendering:
- `client-signature` / no type: existing blue rectangle + "Sign Here" label (unchanged)
- `text` with `value`: stamp the value directly (use AcroForm fill if name matches, else drawText)
- `date` with `value`: stamp the date text
- `checkbox` with `checked`: draw checkmark glyph or an X
- `agent-signature` with `value` (dataURL): embed the PNG image (same logic as `embedSignatureInPdf`)
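The dispatch above could be separated from the actual PDF drawing by first computing a render action per field. The sketch below is illustrative only: `planRender` and `RenderAction` are hypothetical, skipping fields that lack a value is an assumption, and the handling of `initials` (a placeholder with an "Initials" label) is a guess, since the list above does not specify it.

```typescript
type RenderAction =
  | { kind: "placeholder"; label: string } // rectangle + label for the client
  | { kind: "text"; text: string }
  | { kind: "checkmark" }
  | { kind: "image"; dataUrl: string }
  | { kind: "skip" };

interface PreparedField { type?: string; value?: string; checked?: boolean }

function planRender(field: PreparedField): RenderAction {
  switch (field.type ?? "client-signature") {
    case "client-signature":
      return { kind: "placeholder", label: "Sign Here" };
    case "initials":
      // Assumption: initials get a placeholder like signatures do.
      return { kind: "placeholder", label: "Initials" };
    case "text":
    case "date":
      return field.value ? { kind: "text", text: field.value } : { kind: "skip" };
    case "checkbox":
      return field.checked ? { kind: "checkmark" } : { kind: "skip" };
    case "agent-signature":
      return field.value ? { kind: "image", dataUrl: field.value } : { kind: "skip" };
    default:
      return { kind: "skip" }; // unknown types are ignored in this sketch
  }
}
```

Keeping the decision pure makes the backward-compatibility rule easy to unit test without loading a PDF.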
1. **Size:** A signature drawn on a 400×140px canvas is typically 2–8 KB as a PNG. Base64 encoding adds roughly 33% (8 KB becomes ~11 KB), which is negligible at this scale.
2. **Access pattern:** The signature is always fetched alongside an authenticated agent session. A single-row `SELECT` by user ID returns it immediately. No streaming, no presigned URLs.
3. **Existing stack:** The codebase already stores binary-ish data as JSONB text (`signatureFields` containing base64 in `embedSignatureInPdf`). Base64 `data:` URLs are the native format of `signature_pad.toDataURL()` and `canvas.toDataURL()` — no conversion needed.
4. **File on disk:** Rejected. Files on disk create path management complexity, require auth-gated API routes to serve, and must survive container restarts. The `uploads/` pattern works for documents (immutable blobs) but is overkill for a single small image per user.
5. **BYTEA:** Rejected. Drizzle ORM's BYTEA support requires additional type handling. The dataURL string is already the right format for `@cantoo/pdf-lib`'s `embedPng()` — no conversion needed.
When the agent places an `agent-signature` field and has a saved signature, `PreparePanel` sends the `agentSignatureData` as the `value` on that field when calling `POST /api/documents/[id]/prepare`. The `prepare-document.ts` function embeds it as a PNG image at the field's coordinates — exactly the same logic as `embedSignatureInPdf`.
**The agent signature is embedded during prepare, before the document is sent to the client.** The client sees the agent's signature already in the PDF as a real image, not a placeholder rectangle.
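A sketch of how the saved signature might be attached to the placed fields before the prepare call. `attachAgentSignature` is a hypothetical helper, and the PNG dataURL prefix check is an added safeguard that the text above does not require.

```typescript
interface FieldWithValue { id: string; type?: string; value?: string }

const PNG_DATA_URL_PREFIX = "data:image/png;base64,";

// Returns a new array in which every agent-signature field carries the saved
// signature as its value. Fields are left untouched when no usable signature exists.
function attachAgentSignature<T extends FieldWithValue>(
  fields: T[],
  agentSignatureData: string | null,
): T[] {
  if (!agentSignatureData || !agentSignatureData.startsWith(PNG_DATA_URL_PREFIX)) {
    return fields;
  }
  return fields.map((field) =>
    field.type === "agent-signature" ? { ...field, value: agentSignatureData } : field,
  );
}
```

Because the function returns copies rather than mutating, the FieldPlacer state stays unchanged if the prepare request fails.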
---
## Integration 4: Filled Preview Approach
### Question answered: Re-render via react-pdf with overlaid values, or generate a new temporary PDF server-side?
### Decision: Generate a temporary prepared PDF server-side; render with existing react-pdf viewer
Overlaying text/checkmarks on top of a react-pdf canvas in the browser is fragile. Text positions must be pixel-perfect, and the coordinate math between PDF user space and screen pixels is already complex (demonstrated by the existing `FieldPlacer.tsx`). Overlaid values would not be "in" the PDF — they would be CSS layers that look different from the final embedded result.
The existing `preparePdf` function already generates a complete prepared PDF from the source PDF + text values + field geometries. For preview, call the same function but write to a temporary path, then serve it through the existing `/api/documents/[id]/file` pattern.
**PreparePanel preview modal:** Add a "Preview" button that calls `POST /api/documents/[id]/preview` with current field + text-fill state, receives the PDF bytes, converts to an object URL via `URL.createObjectURL`, and opens a modal containing `<Document file={objectUrl}>` from react-pdf. Uses the same `PdfViewerWrapper` pattern (dynamic import, `ssr: false`).
**Why not stream PDF to a new browser tab:** Object URL in a modal keeps the preview in-app and avoids browser popup blockers. The agent can review without leaving the prepare page.
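The client side of the preview flow could look roughly like the sketch below. The request body shape (`PreviewRequest`) is an assumption, as are the helper names; only the route path and the use of `URL.createObjectURL` come from the description above.

```typescript
// Hypothetical request body; the real field and text-fill shapes live in PreparePanel.
interface PreviewRequest {
  fields: unknown[];
  textFillData: Record<string, string>;
}

// Wrap raw PDF bytes in a blob: URL suitable for <Document file={url}>.
function pdfObjectUrl(pdfBytes: ArrayBuffer): string {
  return URL.createObjectURL(new Blob([pdfBytes], { type: "application/pdf" }));
}

// POST the current state to the preview route and return an object URL.
// fetchImpl is injectable so the function can be tested without a server.
async function requestPreviewUrl(
  docId: string,
  body: PreviewRequest,
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  const res = await fetchImpl(`/api/documents/${docId}/preview`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Preview failed with status ${res.status}`);
  return pdfObjectUrl(await res.arrayBuffer());
}
```

When the modal closes, the URL should be released with `URL.revokeObjectURL(url)` so repeated previews do not accumulate blobs in memory.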
**Preview file persistence:** The `_preview.pdf` file is overwritten each time the agent clicks Preview. It is not stored in the DB and is never sent to the client. It can be cleaned up on a schedule or simply overwritten on each preview request.
| `src/lib/db/schema.ts` | Add `DocumentField` discriminated union; add `agentSignatureData` to users table; add `propertyAddress` to clients table | Keep `SignatureFieldData` as type alias — zero breaking changes |
| `src/lib/pdf/prepare-document.ts` | Add type-aware rendering for text/checkbox/date/agent-sig/initials fields | Existing signature field path (no type / `client-signature`) must behave identically |
| `src/app/api/documents/[id]/fields/route.ts` | Accept `DocumentField[]` (union type) instead of `SignatureFieldData[]` — structurally identical, TypeScript type change only | No behavior change |
| `src/app/api/documents/[id]/prepare/route.ts` | Fetch agent signature from users table if any agent-sig fields present | Must remain backward-compatible with no-fields body |
| `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` | Add new draggable tokens to palette (text, checkbox, initials, date, agent-sig); render type-specific labels and colors for placed fields | Do NOT change the drag/drop/move/resize/persist mechanics |
| `src/app/portal/(protected)/documents/[docId]/_components/PreparePanel.tsx` | Add "AI Auto-place" button + loading state; add "Preview" button + PreviewModal; add AgentSignaturePanel; connect new API routes | Do NOT change existing "Prepare and Send" flow |
| `src/app/sign/[token]/_components/SigningPageClient.tsx` | Filter signatureFields to client-interaction types only (`client-signature`, `initials`, no-type) | Do NOT change submit logic or embed-signature flow |
| `src/app/api/sign/[token]/route.ts` (POST) | Filter signatureFields to client-sig/initials before building `signaturesWithCoords` | Other field types are already baked into the prepared PDF |
| `src/app/portal/_components/ClientModal.tsx` | Add `propertyAddress` field | Straightforward form field addition |
| `src/lib/signing/embed-signature.ts` | Agent signature embedding reuses this logic, but via `prepare-document.ts` not here. Client signing path is unchanged. |
| `src/lib/signing/audit.ts` | No new audit event types needed for v1.1 |
-- No migration needed for documents.signature_fields JSONB —
-- new field types are stored in existing column, backward-compatible
```
**Migration safety:** Both new columns are nullable with no default, so existing rows are unaffected. The `signatureFields` JSONB change requires no migration — JSONB stores arbitrary JSON.
Never call the OpenAI API from a Client Component or import `lib/ai/*.ts` in a component. All OpenAI calls must go through `POST /api/documents/[id]/ai-prepare`. Add a `'server-only'` import to `lib/ai/field-placement.ts` to get a build error if accidentally imported on the client.
localStorage is ephemeral (cleared on browser data wipe), not shared across devices, and not available server-side during prepare. Keeping the current localStorage fallback is fine as a UX shortcut during the signing session, but the source of truth must be the DB.
The client signing flow embeds signatures from client-drawn dataURLs. Do not modify `embed-signature.ts` or the POST `/api/sign/[token]` logic to handle new field types — handle agent-sig and text during `preparePdf` only. The signing route should filter fields before calling embed.
### 4. Making preview a full "saved" prepared file
The `_preview.pdf` file is ephemeral and not recorded in the DB. Do not confuse it with `preparedFilePath`. If the agent proceeds to send after previewing, the actual `POST /api/documents/[id]/prepare` generates a fresh `_prepared.pdf` as before. Preview is read-only and stateless.
### 5. Using AI field placement as authoritative without agent review
AI placement is a starting point. The "AI Auto-place" button fills the FieldPlacer with suggested fields, but the agent must be able to adjust before the fields are committed to the DB. Coordinates from the AI response should populate the client-side field state, not directly write to DB.
### 6. Skipping path traversal guard on new preview route
The preview route writes a file to `uploads/`. Apply the same `destPath.startsWith(UPLOADS_DIR)` guard used in the prepare route.
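A minimal sketch of such a guard, assuming `UPLOADS_DIR` is an absolute path and that `resolveInsideUploads` is a hypothetical helper name. This version compares against the directory plus a trailing separator, which is slightly stricter than a bare `startsWith(UPLOADS_DIR)`: it also rejects sibling directories such as `uploads-evil/`.

```typescript
import * as path from "node:path";

// Assumption: uploads live in <cwd>/uploads; the real constant may differ.
const UPLOADS_DIR = path.resolve(process.cwd(), "uploads");

// Resolve a relative path and refuse anything that escapes the uploads directory.
function resolveInsideUploads(relativePath: string, uploadsDir: string = UPLOADS_DIR): string {
  const destPath = path.resolve(uploadsDir, relativePath);
  if (!destPath.startsWith(uploadsDir + path.sep)) {
    throw new Error("Path escapes uploads directory");
  }
  return destPath;
}
```

For example, `resolveInsideUploads("doc1/abc_preview.pdf")` returns an absolute path under `uploads/`, while `resolveInsideUploads("../uploads-evil/x.pdf")` throws.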
### 7. Using `pdf-parse` as an additional dependency
`pdfjs-dist` is already installed (dependency of `react-pdf`). Use the legacy build server-side. Adding `pdf-parse` would be a duplicate dependency with no benefit.
- [pdfjs-dist legacy build — Node.js text extraction](https://lirantal.com/blog/how-to-read-and-parse-pdfs-pdfjs-create-pdfs-pdf-lib-nodejs)
- [unpdf vs pdf-parse vs pdfjs-dist — 2026 comparison](https://www.pkgpulse.com/blog/unpdf-vs-pdf-parse-vs-pdfjs-dist-pdf-parsing-extraction-nodejs-2026)
- [OpenAI Structured Outputs — official docs](https://platform.openai.com/docs/guides/structured-outputs)
- [Introducing Structured Outputs in the API](https://openai.com/index/introducing-structured-outputs-in-the-api/)
- [Next.js 15 Server Actions vs API Routes — 2025 patterns](https://medium.com/@sparklewebhelp/server-actions-in-next-js-the-future-of-api-routes-06e51b22a59f)
- [PostgreSQL BYTEA vs TEXT for image storage](https://www.postgrespro.com/list/thread-id/1509166)