From 622ca3dc2174b36176a979a2a56acf112b1eb252 Mon Sep 17 00:00:00 2001 From: Chandler Copeland Date: Fri, 3 Apr 2026 14:47:06 -0600 Subject: [PATCH] docs: complete project research --- .planning/research/ARCHITECTURE.md | 1388 ++++++++++++---------------- .planning/research/FEATURES.md | 254 +++++ .planning/research/PITFALLS.md | 886 +++++++++++------- .planning/research/STACK.md | 434 +++++++-- .planning/research/SUMMARY.md | 243 ++--- 5 files changed, 1943 insertions(+), 1262 deletions(-) diff --git a/.planning/research/ARCHITECTURE.md b/.planning/research/ARCHITECTURE.md index e24c0da..b9fb4ae 100644 --- a/.planning/research/ARCHITECTURE.md +++ b/.planning/research/ARCHITECTURE.md @@ -1,824 +1,666 @@ -# Architecture Research +# Architecture: Multi-Signer Extension + Docker Deployment -**Domain:** Real estate agent website + PDF document signing web app — v1.1 integration -**Researched:** 2026-03-21 -**Confidence:** HIGH +**Project:** teressa-copeland-homes +**Milestone:** v1.2 — Multi-Signer Support + Deployment Hardening +**Researched:** 2026-04-03 +**Confidence:** HIGH — based on direct codebase inspection + official Docker/Next.js documentation --- -> **Scope note:** This document supersedes the v1.0 architecture research. It reflects the *actual* v1.0 codebase (Drizzle ORM, local `uploads/` directory, `@cantoo/pdf-lib`, Auth.js v5) and focuses specifically on how the four v1.1 feature areas integrate with what already exists. The previous research doc described a Prisma + S3 design that was never built — disregard it for implementation. +## Summary + +The existing system is a clean, well-factored single-signer flow. Every document has exactly one signing token, one recipient, and one atomic "mark used" operation. Multi-signer requires four categories of change: + +1. **Schema:** Tag fields to signers, expand signingTokens to identify whose token it is, replace the single-recipient model with a per-signer recipients structure. +2. 
**Completion detection:** Replace the single `usedAt` → `status = 'Signed'` trigger with an "all tokens claimed" check after each signing submission.
3. **Final PDF assembly:** The per-signer merge model (each signer embeds into the same prepared PDF, sequentially via advisory lock) accumulates signatures into `signedFilePath`. A completion pass fires after the last signer claims their token.
4. **Migration:** Existing signed documents must remain intact — achieved by treating the absence of `signerEmail` on a field as "legacy single-signer" (same as the existing `type` coalescing pattern already used in `getFieldType`).

Docker deployment has a distinct failure mode from local dev: **env vars that exist in `.env.local` are absent in the Docker container unless explicitly provided at runtime.** The email-sending failure in production Docker is caused by `CONTACT_SMTP_HOST`, `CONTACT_EMAIL_USER`, and `CONTACT_EMAIL_PASS` never reaching the container. The fix is `env_file` injection at `docker compose up` time, not Docker Secrets (which mount as files, not env vars, and require app-side entrypoint shim code that adds no security benefit for a single-server deployment).

---

## Existing Architecture (Actual v1.0 State)

The actual implementation diverges from the v1.0 research document.
Key facts confirmed by reading the codebase: - -| Concern | Actual v1.0 | -|---------|-------------| -| ORM | Drizzle ORM (`drizzle-orm` + `drizzle-kit`) | -| DB | Local PostgreSQL in Docker | -| PDF write | `@cantoo/pdf-lib` (fork of pdf-lib, same API) | -| PDF view | `react-pdf` (pdfjs-dist backed) — client-side, `ssr: false` | -| PDF storage | `uploads/` at project root (never under `public/`) | -| Auth | Auth.js v5 (`next-auth@5.0.0-beta.30`) | -| Field types | Only `signature` — all fields in `signatureFields` JSONB as `SignatureFieldData[]` | -| Text fill | Free-form key/value pairs in `textFillData` JSONB — manual entry only | -| Agent signature | Stored in client `localStorage` (key: `teressa_homes_saved_signature`) — ephemeral | -| Field placement | `FieldPlacer.tsx` — dnd-kit palette + move/resize, persisted to DB via `PUT /api/documents/[id]/fields` | -| Signing flow | Client signs only; agent signature not embedded before sending | -| Preview | No dedicated preview step — agent goes straight from prepare to send | - ---- - -## System Overview — v1.1 Additions - -``` -┌──────────────────────────────────────────────────────────────────────────────┐ -│ DOCUMENT PREPARE PAGE /portal/documents/[docId] │ -│ │ -│ ┌────────────────────────────────┐ ┌──────────────────────────────────┐ │ -│ │ PdfViewerWrapper (existing) │ │ PreparePanel (extended) │ │ -│ │ FieldPlacer (extended) │ │ │ │ -│ │ │ │ [AI Auto-place] NEW │ │ -│ │ + new field type tokens in │ │ [Text fill (AI pre-filled)] EXT │ │ -│ │ palette: text, checkbox, │ │ [Agent signature] NEW │ │ -│ │ initials, date, agent-sig │ │ [Filled preview] NEW │ │ -│ │ │ │ [Send] existing │ │ -│ └────────────────────────────────┘ └──────────────────────────────────┘ │ -│ │ -│ NEW API Routes: NEW DB Columns: │ -│ POST /api/documents/[id]/ai-prepare users.agentSignatureData (text) │ -│ GET/PUT /api/agent/signature clients.propertyAddress (text) │ -│ GET /api/documents/[id]/preview signatureFields schema extended │ -│ 
(type discriminant on DocumentField) │ -└──────────────────────────────────────────────────────────────────────────────┘ - │ │ - ▼ ▼ -┌────────────────────────┐ ┌──────────────────────────────────────┐ -│ lib/ai/ │ │ lib/pdf/ │ -│ extract-text.ts NEW │ │ prepare-document.ts (extended) │ -│ field-placement.ts NEW│ │ preview-document.ts NEW │ -└────────────────────────┘ └──────────────────────────────────────┘ - │ │ - ▼ ▼ -┌────────────────────────┐ ┌──────────────────────────────────────┐ -│ OpenAI gpt-4o-mini │ │ @cantoo/pdf-lib │ -│ (server-only) │ │ (existing, unchanged) │ -└────────────────────────┘ └──────────────────────────────────────┘ -``` - ---- - -## Integration 1: OpenAI PDF Text Extraction and Field Placement - -### Question answered: Which extraction library? What prompt structure? Multi-page handling? - -### Extraction Library: pdfjs-dist (already installed) - -`react-pdf` depends on `pdfjs-dist`, which is already in `node_modules`. Use it server-side for text extraction rather than adding a new dependency. - -The correct server-side import uses the **legacy build** to avoid worker/canvas requirements: +### 1. 
`SignatureFieldData` JSONB — add optional `signerEmail` +**Current shape (from `src/lib/db/schema.ts`):** ```typescript -// lib/ai/extract-text.ts (NEW FILE — server-only) -import * as pdfjsLib from 'pdfjs-dist/legacy/build/pdf.mjs'; -import { readFile } from 'node:fs/promises'; - -export async function extractPdfText(filePath: string): Promise { - const data = new Uint8Array(await readFile(filePath)); - const pdf = await pdfjsLib.getDocument({ data }).promise; - - const pages: string[] = []; - for (let i = 1; i <= pdf.numPages; i++) { - const page = await pdf.getPage(i); - const content = await page.getTextContent(); - const pageText = content.items - .filter((item): item is { str: string } => 'str' in item) - .map((item) => item.str) - .join(' '); - pages.push(`[Page ${i}]\n${pageText}`); - } - return pages.join('\n\n'); -} -``` - -**Why not `pdf-parse`?** It wraps pdfjs-dist but adds a dependency that would be redundant. Use pdfjs-dist directly since it is already installed. - -**Multi-page handling:** Extract all pages, prefix each with `[Page N]` so OpenAI can reference page numbers when placing fields. For real estate forms (typically 8–20 pages), total token count will be 2,000–8,000 tokens — well within gpt-4o-mini's 128k context window. - -### OpenAI API Route: Server-Only, NOT a Server Action - -Use a dedicated API route at `POST /api/documents/[id]/ai-prepare` for the following reasons: -1. The operation is long-running (OpenAI call + text extraction). Server Actions are better for quick mutations. -2. The route needs to return structured JSON (field placements + prefill data) that the client then applies. -3. The existing pattern in this codebase is API routes for document operations. 
- -```typescript -// app/api/documents/[id]/ai-prepare/route.ts (NEW FILE) -import { auth } from '@/lib/auth'; -import { db } from '@/lib/db'; -import { documents, clients } from '@/lib/db/schema'; -import { eq } from 'drizzle-orm'; -import path from 'node:path'; -import { extractPdfText } from '@/lib/ai/extract-text'; -import { callFieldPlacementAI } from '@/lib/ai/field-placement'; - -const UPLOADS_DIR = path.join(process.cwd(), 'uploads'); - -export async function POST( - _req: Request, - { params }: { params: Promise<{ id: string }> } -) { - const session = await auth(); - if (!session) return new Response('Unauthorized', { status: 401 }); - - const { id } = await params; - const doc = await db.query.documents.findFirst({ - where: eq(documents.id, id), - with: { client: true }, - }); - if (!doc || !doc.filePath) return Response.json({ error: 'Not found' }, { status: 404 }); - - const srcPath = path.join(UPLOADS_DIR, doc.filePath); - const pdfText = await extractPdfText(srcPath); - - const result = await callFieldPlacementAI(pdfText, { - clientName: doc.client?.name ?? '', - clientEmail: doc.client?.email ?? '', - propertyAddress: doc.client?.propertyAddress ?? '', - }); - - return Response.json(result); -} -``` - -### Prompt Structure - -Use OpenAI's Structured Outputs with `response_format: { type: "json_schema", strict: true }` to guarantee the schema. This eliminates validation and retry loops. - -```typescript -// lib/ai/field-placement.ts (NEW FILE) -// This file must NEVER be imported from client components. -// Place in lib/ (not app/) to enforce server-side execution. 
- -interface ClientContext { - clientName: string; - clientEmail: string; - propertyAddress: string; -} - -export interface AiFieldPlacement { - type: 'text' | 'checkbox' | 'initials' | 'date' | 'client-signature'; - label: string; // human label, becomes the key in textFillData - page: number; // 1-indexed - xPct: number; // 0–100, percentage of page width from left - yPct: number; // 0–100, percentage of page height from bottom (PDF coords) - prefillValue?: string; // AI-suggested value for text/date fields -} - -export interface AiPrepareResult { - fields: AiFieldPlacement[]; - prefillData: Record; // label → value for known fields -} - -const FIELD_PLACEMENT_SCHEMA = { - type: 'object', - properties: { - fields: { - type: 'array', - items: { - type: 'object', - properties: { - type: { type: 'string', enum: ['text', 'checkbox', 'initials', 'date', 'client-signature'] }, - label: { type: 'string' }, - page: { type: 'integer' }, - xPct: { type: 'number' }, - yPct: { type: 'number' }, - prefillValue: { type: 'string' }, - }, - required: ['type', 'label', 'page', 'xPct', 'yPct'], - additionalProperties: false, - }, - }, - prefillData: { - type: 'object', - additionalProperties: { type: 'string' }, - }, - }, - required: ['fields', 'prefillData'], - additionalProperties: false, -}; - -export async function callFieldPlacementAI( - pdfText: string, - context: ClientContext -): Promise { - const systemPrompt = `You are a Utah real estate document assistant. -Analyze the provided PDF text and identify where form fields should be placed. -For each field, estimate its position as a percentage (0-100) of page width (xPct) -and page height from the bottom (yPct, because PDF coords start at bottom-left). -Pre-fill fields where you can infer the value from the client context provided. 
-Only place fields where blank lines, underscores, or form labels indicate input is expected.`; - - const userPrompt = `Client name: ${context.clientName} -Client email: ${context.clientEmail} -Property address: ${context.propertyAddress || 'unknown'} - -PDF text content: -${pdfText} - -Identify all form fields that need to be filled or signed. -For text fields, prefill any you can infer from the client context. -Return results as JSON matching the specified schema.`; - - const response = await fetch('https://api.openai.com/v1/chat/completions', { - method: 'POST', - headers: { - Authorization: `Bearer ${process.env.OPENAI_API_KEY}`, - 'Content-Type': 'application/json', - }, - body: JSON.stringify({ - model: 'gpt-4o-mini', - messages: [ - { role: 'system', content: systemPrompt }, - { role: 'user', content: userPrompt }, - ], - response_format: { - type: 'json_schema', - json_schema: { - name: 'field_placement', - schema: FIELD_PLACEMENT_SCHEMA, - strict: true, - }, - }, - temperature: 0.2, // lower temperature for more deterministic placement - }), - }); - - const data = await response.json(); - return JSON.parse(data.choices[0].message.content) as AiPrepareResult; -} -``` - -**Coordinate conversion note:** AI returns percentage-based coordinates (`xPct`, `yPct`) rather than absolute PDF points because the AI cannot know page dimensions from text alone. The conversion from percentages to PDF user-space points happens in the API route after reading the PDF with pdfjs-dist to get actual page dimensions. - -**Multi-page strategy:** Send all pages in one request (prefixed `[Page N]`). For real estate forms, the full text fits in 8k tokens. Do NOT split into multiple requests — the AI needs the full document context to understand which pages need signatures vs. text fields. - ---- - -## Integration 2: Extending signatureFields JSONB Schema - -### Question answered: How to add field types without breaking existing client signature functionality? 
- -### Analysis of Risk - -The existing signing flow in `src/app/api/sign/[token]/route.ts` (POST handler, step 8) reads `signatureFields` and maps client-supplied `dataURL` values to each field using `field.id`. It then calls `embedSignatureInPdf` which draws an image at `field.x`, `field.y`, `field.width`, `field.height`, `field.page`. - -The critical invariant: **the client signing page must know which fields require a drawn signature (canvas) versus which are already filled (text/checkbox/date/agent-sig).** - -### Schema Extension Strategy: Discriminated Union with Backward Compatibility - -Extend `SignatureFieldData` by adding an optional `type` property. When `type` is absent or `'client-signature'`, existing behavior is preserved exactly. All field types share the same geometry properties. - -```typescript -// src/lib/db/schema.ts — MODIFY existing SignatureFieldData - -// NEW: base interface — shared geometry, all field types -interface BaseFieldData { +interface SignatureFieldData { id: string; - page: number; // 1-indexed - x: number; // PDF user space, bottom-left origin, points + page: number; + x: number; y: number; width: number; height: number; + type?: SignatureFieldType; // optional — v1.0 had no type } - -// NEW: discriminated union -export type DocumentField = - | (BaseFieldData & { type?: 'client-signature' }) // type omitted = client-sig (backward compat) - | (BaseFieldData & { type: 'text'; label: string; value?: string }) - | (BaseFieldData & { type: 'checkbox'; label: string; checked?: boolean }) - | (BaseFieldData & { type: 'initials'; label: string }) // client initials - | (BaseFieldData & { type: 'date'; label: string; value?: string }) - | (BaseFieldData & { type: 'agent-signature'; value?: string }) // PNG dataURL, embedded before send - -// Keep old name as alias for migration safety -export type SignatureFieldData = DocumentField; ``` -**DB column:** No migration needed for the JSONB column itself. 
JSONB accepts any JSON; the schema change is TypeScript-only. The existing `signatureFields jsonb` column in `documents` stores the extended array. - -**Backward compatibility rule:** Any `DocumentField` where `type` is `undefined` or `'client-signature'` is treated identically to the original `SignatureFieldData`. The existing `FieldPlacer.tsx` creates fields without `type` — those continue to work as client signature fields. - -### Client Signing Page — Filter to Client-Only Fields - -The client signing page (`SigningPageClient.tsx`) currently iterates `signatureFields` and presents every field for signature. With the extended schema, it must only present `client-signature` and `initials` fields: - +**New shape:** ```typescript -// src/app/sign/[token]/_components/SigningPageClient.tsx — MODIFY -// Filter to fields the client needs to interact with: -const clientFields = signatureFields.filter( - (f) => !f.type || f.type === 'client-signature' || f.type === 'initials' +interface SignatureFieldData { + id: string; + page: number; + x: number; + y: number; + width: number; + height: number; + type?: SignatureFieldType; + signerEmail?: string; // NEW — optional; absent = legacy single-signer or agent-owned field +} +``` + +**Backward compatibility:** The `signerEmail` field is optional. Existing documents stored in `signature_fields` JSONB have no `signerEmail`. The signing page already filters fields via `isClientVisibleField()`. A new `getSignerEmail(field, fallbackEmail)` helper mirrors `getFieldType()` and returns `field.signerEmail ?? fallbackEmail` — where `fallbackEmail` is the document's legacy single-recipient email. This keeps existing signed documents working without a data backfill. + +**No SQL migration needed** for the JSONB column itself — it is already `jsonb`, schema-less at the DB level. + +--- + +### 2. 
`signingTokens` table — add `signerEmail` column + +**Current:** +```sql +CREATE TABLE signing_tokens ( + jti text PRIMARY KEY, + document_id text NOT NULL REFERENCES documents(id) ON DELETE CASCADE, + created_at timestamp DEFAULT now() NOT NULL, + expires_at timestamp NOT NULL, + used_at timestamp ); ``` -**What the client signing page does NOT change:** -- The `POST /api/sign/[token]` route uses `doc.signatureFields` from the DB and server-stored coordinates. Text/checkbox/date/agent-sig fields are already baked into the prepared PDF by the time the client signs. The signing API should filter `signatureFields` the same way — only embed images for `client-signature`/`initials` fields. - -### FieldPlacer.tsx — Add New Field Type Tokens to Palette - -Extend the palette with new draggable tokens, each creating a `DocumentField` with the appropriate `type`. The existing drag-drop, move, resize, and persist logic does not change — it operates on the shared geometry properties. - -```typescript -// Palette additions in FieldPlacer.tsx — add to the palette row: - - - - - -// Keep existing: - +**New column to add:** +```sql +ALTER TABLE signing_tokens ADD COLUMN signer_email text; ``` -### prepare-document.ts — Handle All Field Types - -The existing `preparePdf` function draws a blue rectangle + "Sign Here" for every entry in `sigFields`. With extended types, it needs type-aware rendering: - -- `client-signature` / no type: existing blue rectangle + "Sign Here" label (unchanged) -- `text` with `value`: stamp the value directly (use AcroForm fill if name matches, else drawText) -- `date` with `value`: stamp the date text -- `checkbox` with `checked`: draw checkmark glyph or an X -- `agent-signature` with `value` (dataURL): embed the PNG image (same logic as `embedSignatureInPdf`) -- `initials`: blue rectangle + "Initials" label - ---- - -## Integration 3: Agent Saved Signature Persistence - -### Question answered: DB BYTEA/text column vs file on disk? How served? 
- -### Decision: `text` column on the `users` table (base64 PNG dataURL) - -**Recommendation: Store as `TEXT` in the `users` table** — not BYTEA, not a file on disk. - -**Rationale:** - -1. **Size:** A signature drawn on a 400×140px canvas is typically 2–8 KB as a PNG dataURL string. PostgreSQL's 33% size penalty for base64 is negligible at this scale (8 KB becomes ~11 KB). - -2. **Access pattern:** The signature is always fetched alongside an authenticated agent session. A single-row `SELECT` by user ID returns it immediately. No streaming, no presigned URLs. - -3. **Existing stack:** The codebase already stores binary-ish data as JSONB text (`signatureFields` containing base64 in `embedSignatureInPdf`). Base64 `data:` URLs are the native format of `signature_pad.toDataURL()` and `canvas.toDataURL()` — no conversion needed. - -4. **File on disk:** Rejected. Files on disk create path management complexity, require auth-gated API routes to serve, and must survive container restarts. The `uploads/` pattern works for documents (immutable blobs) but is overkill for a single small image per user. - -5. **BYTEA:** Rejected. Drizzle ORM's BYTEA support requires additional type handling. The dataURL string is already the right format for `@cantoo/pdf-lib`'s `embedPng()` — no conversion needed. 
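The `createSigningToken(documentId, signerEmail?)` extension described for the token layer can be sketched as follows. This is a hypothetical shape: the real helper in `src/lib/signing/token.ts` performs the Drizzle INSERT, so only the row construction is shown here, and the 7-day TTL is an assumption, not a value from the codebase.

```typescript
import { randomUUID } from 'node:crypto';

// Mirrors the signing_tokens row shape described in this section.
interface SigningTokenRow {
  jti: string;
  documentId: string;
  signerEmail: string | null; // null = legacy single-signer token
  expiresAt: Date;
  usedAt: Date | null;
}

const TOKEN_TTL_MS = 7 * 24 * 60 * 60 * 1000; // assumed 7-day expiry

// NEW optional signerEmail param — omitted by the legacy single-signer path,
// supplied once per signer by the multi-signer send loop.
function buildSigningTokenRow(
  documentId: string,
  signerEmail?: string,
): SigningTokenRow {
  return {
    jti: randomUUID(),
    documentId,
    signerEmail: signerEmail ?? null,
    expiresAt: new Date(Date.now() + TOKEN_TTL_MS),
    usedAt: null,
  };
}
```

Because `signerEmail` defaults to `null`, every existing call site keeps compiling and keeps producing legacy-shaped rows.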
- -### DB Migration - +**Drizzle schema change:** ```typescript -// src/lib/db/schema.ts — ADD to users table: -export const users = pgTable('users', { - id: text('id').primaryKey().$defaultFn(() => crypto.randomUUID()), - email: text('email').notNull().unique(), - passwordHash: text('password_hash').notNull(), - agentSignatureData: text('agent_signature_data'), // NEW: base64 PNG dataURL or null +export const signingTokens = pgTable('signing_tokens', { + jti: text('jti').primaryKey(), + documentId: text('document_id').notNull() + .references(() => documents.id, { onDelete: 'cascade' }), + signerEmail: text('signer_email'), // NEW — null for legacy tokens createdAt: timestamp('created_at').defaultNow().notNull(), + expiresAt: timestamp('expires_at').notNull(), + usedAt: timestamp('used_at'), }); ``` -Run: `npm run db:generate && npm run db:migrate` +**Why not store field IDs in the token?** Field filtering should happen server-side by matching `field.signerEmail === tokenRow.signerEmail`. Storing field IDs in the token creates a second source of truth and complicates migration. The signing GET endpoint already fetches the document's `signatureFields` and filters them — adding a `signerEmail` comparison is a one-line change. -### API Routes for Agent Signature +**Backward compatibility:** `signer_email` is nullable. Existing tokens have `null`. The signing endpoint uses `tokenRow.signerEmail` to filter fields; `null` falls back to `isClientVisibleField()` (current behavior). + +--- + +### 3. `documents` table — add `signers` JSONB column + +**Current problem:** `assignedClientId` is a single-value text column. Signers in multi-signer are identified by email, not necessarily by a `clients` row (requirement: "signers may not be in clients table"). The current `emailAddresses` JSONB column holds email strings but lacks per-signer identity (name, signing status, token linkage). 
**Decision: add a `signers` JSONB column; leave `assignedClientId` in place for legacy**

```sql
ALTER TABLE documents ADD COLUMN signers jsonb;
```

**New TypeScript type:**
```typescript
export interface DocumentSigner {
  email: string;
  name?: string; // display name for email greeting, optional
  tokenJti?: string; // populated at send time — links token back to signer record
  signedAt?: string; // ISO timestamp — populated when their token is claimed
}
```

**Drizzle:**
```typescript
// In documents table:
signers: jsonb('signers').$type<DocumentSigner[]>(),
```

**Why JSONB array instead of a new table?** A `document_signers` join table would be cleanest long-term, but for a solo-agent app with document-level granularity and no need to query "all documents this email signed across the system", JSONB avoids an extra join on every document fetch. The `tokenJti` field on each signer record gives the bidirectional link without the join table.

**Why keep `assignedClientId`?** It is still used by the send route to resolve the `clients` row for `client.email` and `client.name`. For multi-signer, the agent provides emails directly in the `signers` array. The two flows coexist:
- Legacy: `assignedClientId` is set, `signers` is null → single-signer behavior
- New: `signers` is set (non-null, non-empty) → multi-signer behavior

The send route checks `if (doc.signers?.length) { /* multi-signer path */ } else { /* legacy path */ }`.

**`emailAddresses` column:** Currently stores `[client.email, ...ccAddresses]`. In multi-signer this is superseded by `signers[].email`. The column can remain and be ignored for new documents, or be populated with all signer emails so the audit trail reads consistently.

---

### 4. 
`auditEvents` — new event types + +**Current enum values:** +``` +document_prepared | email_sent | link_opened | document_viewed | signature_submitted | pdf_hash_computed +``` + +**New values to add:** + +| Event Type | When Fired | Metadata | +|---|---|---| +| `signer_email_sent` | Per-signer email sent (supplements `email_sent` for multi-signer) | `{ signerEmail, tokenJti }` | +| `signer_signed` | Per-signer token claimed | `{ signerEmail }` | +| `document_completed` | All signers have signed — triggers final notification | `{ signerCount, mergedFilePath }` | + +**Backward compatibility:** Postgres enums cannot have values removed, only added. The existing `email_sent` and `signature_submitted` events stay in the enum and continue to be fired for legacy single-signer documents. New multi-signer documents fire the new, more specific events. Adding values to a Postgres enum requires raw SQL that Drizzle cannot auto-generate: + +```sql +ALTER TYPE audit_event_type ADD VALUE 'signer_email_sent'; +ALTER TYPE audit_event_type ADD VALUE 'signer_signed'; +ALTER TYPE audit_event_type ADD VALUE 'document_completed'; +``` + +**Important:** In Postgres 12+, `ALTER TYPE ... ADD VALUE` can run inside a transaction. In Postgres < 12 it cannot. Write migration with `-- statement-breakpoint` between each ALTER to prevent Drizzle from wrapping them in a single transaction. + +--- + +### 5. `documents` table — completion tracking columns unchanged + +Multi-signer "completion" is no longer a single event. The existing columns serve all needs: + +| Column | Current use | Multi-signer use | +|---|---|---| +| `status` | Draft → Sent → Viewed → Signed | "Signed" now means ALL signers complete. Status transitions to Signed only after `document_completed` fires. | +| `signedAt` | Timestamp of the single signing | Timestamp of completion (last signer claimed) — same semantic, set later. 
| +| `signedFilePath` | Path to the merged signed PDF | Accumulator path — updated by each signer as they embed; final value = completed PDF. | +| `pdfHash` | SHA-256 of signed PDF | Same — hash of the final merged PDF. | + +Per-signer completion is tracked in `signers[].signedAt` (the JSONB array). No new columns required. + +--- + +## Part 2: Multi-Signer Data Flow + +### Field Tagging (Agent UI) ``` -GET /api/agent/signature — returns { dataURL: string | null } -PUT /api/agent/signature — body: { dataURL: string }, saves to DB +Agent places field on PDF canvas + ↓ +FieldPlacer shows signer email selector (from doc.signers[]) + ↓ +SignatureFieldData.signerEmail = "buyer@example.com" + ↓ +PUT /api/documents/[id]/fields persists to signatureFields JSONB ``` +**Chicken-and-egg consideration:** The agent must know the signer list before tagging fields. Resolution: the PreparePanel collects signer emails first (a new multi-signer entry UI replaces the single email textarea). These are saved to `documents.signers` via `PUT /api/documents/[id]/signers`. The FieldPlacer palette then offers a signer email selector when placing a client-visible field. + +**Unassigned client fields:** If `signerEmail` is absent on a client-visible field in a multi-signer document, behavior must be defined. Recommended: block sending until all client-signature and initials fields have a `signerEmail`. The UI shows a warning. Text, checkbox, and date fields do not require a signer tag (they are embedded at prepare time and never shown to signers). 
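The "block sending until every client-visible field is tagged" rule can be sketched as a small pure helper. The shapes below are trimmed to what the check needs and are illustrative; `isClientVisibleField` exists in the codebase, and its reimplementation here (plus the extra "email must appear in `doc.signers`" condition) is an assumption about how the gate would be wired.

```typescript
interface SignatureFieldData {
  id: string;
  type?: 'client-signature' | 'initials' | 'text' | 'checkbox' | 'date';
  signerEmail?: string;
}

interface DocumentSigner {
  email: string;
  name?: string;
}

// Mirrors the legacy coalescing rule: an absent type means client-signature.
function isClientVisibleField(f: SignatureFieldData): boolean {
  return f.type === undefined || f.type === 'client-signature' || f.type === 'initials';
}

// Field IDs that block sending: client-visible but untagged, or tagged with
// an email that is not in the document's signer list. Empty array = safe to send.
function untaggedClientFieldIds(
  fields: SignatureFieldData[],
  signers: DocumentSigner[],
): string[] {
  const known = new Set(signers.map((s) => s.email.toLowerCase()));
  return fields
    .filter(isClientVisibleField)
    .filter((f) => !f.signerEmail || !known.has(f.signerEmail.toLowerCase()))
    .map((f) => f.id);
}
```

The send route would return a 400 with these IDs so the PreparePanel can highlight the offending fields.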
+ +### Token Creation (Send) + +``` +Agent clicks "Prepare and Send" + ↓ +POST /api/documents/[id]/prepare + - embeds agent signatures, text fills → preparedFilePath + - reads doc.signers[] to confirm signer list exists + ↓ +POST /api/documents/[id]/send + - if doc.signers?.length: multi-signer path + Promise.all(doc.signers.map(signer => { + createSigningToken(documentId, signer.email) + → INSERT signing_tokens (jti, document_id, signer_email, expires_at) + sendSigningRequestEmail({ to: signer.email, signingUrl: /sign/[token] }) + logAuditEvent('signer_email_sent', { signerEmail: signer.email, tokenJti: jti }) + })) + update doc.signers[*].tokenJti + set documents.status = 'Sent' + - else: legacy single-signer path (unchanged) +``` + +### Signing Page (Per Signer) + +``` +Signer opens /sign/[token] + ↓ +GET /api/sign/[token] + - verifySigningToken(token) → { documentId, jti } + - fetch tokenRow (jti) → tokenRow.signerEmail + - fetch doc.signatureFields + - if tokenRow.signerEmail: + filter fields where field.signerEmail === tokenRow.signerEmail AND isClientVisibleField + else: + filter fields with isClientVisibleField (legacy path — unchanged) + - return { status: 'pending', document: { ...doc, signatureFields: filteredFields } } + ↓ +Signer sees only their fields; draws signatures; submits + ↓ +POST /api/sign/[token] + 1. Verify JWT + 2. Atomic claim: UPDATE signing_tokens SET used_at = NOW() WHERE jti = ? AND used_at IS NULL + → 0 rows = 409 already-signed + 3. Acquire Postgres advisory lock on document ID (prevents concurrent PDF writes) + 4. Read current accumulatorPath = doc.signedFilePath ?? doc.preparedFilePath + 5. Embed this signer's signatures into accumulatorPath → write to new path + (clients/{id}/{uuid}_partial.pdf, updated with atomic rename) + 6. Update doc.signedFilePath = new path + 7. Update doc.signers[signerEmail].signedAt = now + 8. Release advisory lock + 9. Check completion: COUNT(signing_tokens WHERE document_id = ? 
AND used_at IS NOT NULL)
      vs COUNT(signing_tokens WHERE document_id = ?)
  10a. Not all signed: logAuditEvent('signer_signed'); return 200
  10b. All signed (completion):
       - Compute pdfHash of final signedFilePath
       - UPDATE documents SET status='Signed', signedAt=now, pdfHash=hash
       - logAuditEvent('signer_signed')
       - logAuditEvent('document_completed', { signerCount, mergedFilePath })
       - sendAgentNotificationEmail (all signed)
       - sendAllSignersCompletionEmail (each signer receives final PDF link)
       - return 200
```

### Advisory Lock Implementation

```typescript
// Within the signing POST, wrap the PDF write in an advisory lock:
await db.execute(sql`SELECT pg_advisory_xact_lock(hashtext(${documentId}))`);
// All subsequent DB operations in this transaction hold the lock.
// Lock released automatically when transaction commits or rolls back.
```
Drizzle's `db.execute(sql`...`)` supports raw SQL. `pg_advisory_xact_lock` takes a transaction-scoped advisory lock: it is released automatically at commit or rollback, so no explicit unlock call is needed, which is the right scope for serializing per-document PDF writes.

---

## Part 3: Migration Strategy

### Existing signed documents — no action required

The `signerEmail` field is absent from all existing `signatureFields` JSONB. For existing tokens, `signer_email = null`. The signing endpoint's null path falls through to `isClientVisibleField()` — identical to current behavior. Existing documents never enter multi-signer code paths.

### Migration file (single file, order matters)

Write as `drizzle/0010_multi_signer.sql`:

```sql
-- 1. Expand signing_tokens
ALTER TABLE "signing_tokens" ADD COLUMN "signer_email" text;

-- statement-breakpoint

-- 2. Add signers JSONB to documents
ALTER TABLE "documents" ADD COLUMN "signers" jsonb;

-- statement-breakpoint

-- 3. 
Expand audit event enum +-- Must be outside a transaction in Postgres < 12 — use statement-breakpoint +ALTER TYPE "audit_event_type" ADD VALUE 'signer_email_sent'; + +-- statement-breakpoint + +ALTER TYPE "audit_event_type" ADD VALUE 'signer_signed'; + +-- statement-breakpoint + +ALTER TYPE "audit_event_type" ADD VALUE 'document_completed'; +``` + +**No backfill required.** Existing rows have `null` for new columns, which is the correct legacy sentinel value at every call site. + +### TypeScript changes after migration + +1. Add `signerEmail?: string` to `SignatureFieldData` interface +2. Add `DocumentSigner` interface +3. Add `signers` column to `documents` Drizzle table definition +4. Add `signerEmail` to `signingTokens` Drizzle table definition +5. Add three values to `auditEventTypeEnum` array in schema +6. Add `getSignerEmail(field, fallback)` helper function + +All changes are additive. No existing function signatures break. + +--- + +## Part 4: Multi-Signer Build Order + +Each step is independently deployable. Deploy schema migration first, then backend changes, then UI. + +``` +Step 1: DB migration (0010_multi_signer.sql) + → System: DB ready. App unchanged. No user impact. + +Step 2: Schema TypeScript + token layer + - Add DocumentSigner type, signerEmail to SignatureFieldData + - Update signingTokens and documents Drizzle definitions + - Update createSigningToken(documentId, signerEmail?) + - Add auditEventTypeEnum new values + → System: Token creation accepts signer email. All existing behavior unchanged. + +Step 3: Signing GET endpoint — field filtering + - Read tokenRow.signerEmail + - Filter signatureFields by signerEmail (null → legacy) + → System: Signing page shows correct fields per signer. Legacy tokens unaffected. 
+ +Step 4: Signing POST endpoint — accumulator + completion + - Add advisory lock + - Add accumulator path logic + - Add completion check + - Add document_completed event + notifications + → System: Multi-signer signing flow complete end-to-end. Single-signer legacy unchanged. + +Step 5: Send route — per-signer token loop + - Detect doc.signers vs legacy + - Loop: create token + send email per signer + - Log signer_email_sent per signer + → System: New documents get per-signer tokens. Old documents still use legacy path. + +Step 6: New endpoint — PUT /api/documents/[id]/signers + - Validate email array + - Update documents.signers JSONB + → System: Agent can set signer list from UI. + +Step 7: UI — PreparePanel signer list + - Replace single email textarea with name+email rows (add/remove) + - Call PUT /api/documents/[id]/signers on change + - Warn if client-visible fields lack signerEmail + → System: Agent can define signers before placing fields. + +Step 8: UI — FieldPlacer signer tagging + - Add signerEmail selector per client-visible field + - Color-code placed fields by signer + - Pass signerEmail through persistFields + → System: Full multi-signer field placement. + +Step 9: Email — completion notifications + - sendAllSignersCompletionEmail in signing-mailer.tsx + - Update sendAgentNotificationEmail for completion context + → System: All parties notified and receive final PDF link. 
+ +Step 10: End-to-end verification + - Test with two signers on a real Utah form + - Verify field isolation, sequential PDF accumulation, final hash +``` + +--- + +## Part 5: Multi-Signer Components — New vs Modified + +### Modified + +| Component | File | Nature of Change | +|---|---|---| +| `SignatureFieldData` interface | `src/lib/db/schema.ts` | Add `signerEmail?: string` | +| `auditEventTypeEnum` | `src/lib/db/schema.ts` | Add 3 new values | +| `signingTokens` Drizzle table | `src/lib/db/schema.ts` | Add `signerEmail` column | +| `documents` Drizzle table | `src/lib/db/schema.ts` | Add `signers` column | +| `createSigningToken()` | `src/lib/signing/token.ts` | Add `signerEmail?` param; INSERT includes it | +| GET `/api/sign/[token]` | `src/app/api/sign/[token]/route.ts` | Signer-aware field filtering (null path = legacy) | +| POST `/api/sign/[token]` | `src/app/api/sign/[token]/route.ts` | Accumulator PDF logic, advisory lock, completion check, completion notifications | +| POST `/api/documents/[id]/send` | `src/app/api/documents/[id]/send/route.ts` | Per-signer token + email loop; legacy path preserved | +| `PreparePanel` | `src/app/portal/(protected)/documents/[docId]/_components/PreparePanel.tsx` | Multi-signer list entry UI (replaces single textarea) | +| `FieldPlacer` | `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` | Signer email selector on field place; per-signer color coding | +| `signing-mailer.tsx` | `src/lib/signing/signing-mailer.tsx` | Add `sendAllSignersCompletionEmail` function | + +### New + +| Component | File | Purpose | +|---|---|---| +| `DocumentSigner` interface | `src/lib/db/schema.ts` | Shape of `documents.signers[]` JSONB entries | +| `getSignerEmail()` helper | `src/lib/db/schema.ts` | Returns `field.signerEmail ?? 
fallback`; mirrors `getFieldType()` pattern | +| PUT `/api/documents/[id]/signers` | `src/app/api/documents/[id]/signers/route.ts` | Save/update signer list on the document | +| Migration file | `drizzle/0010_multi_signer.sql` | All DB schema changes in one file | + +### Not Changed + +| Component | Reason | +|---|---| +| `embedSignatureInPdf()` | Works on any path; accumulator pattern reuses it as-is | +| `verifySigningToken()` | JWT payload unchanged; `signerEmail` is DB-only, not a JWT claim | +| `logAuditEvent()` | Accepts any enum value; new values are additive | +| `isClientVisibleField()` | Logic unchanged; still used for legacy null-signer tokens | +| GET `/api/sign/[token]/pdf` | Serves prepared PDF; no signer-specific logic needed | +| `clients` table | Signers are email-identified, not FK-linked | +| `preparePdf()` / prepare endpoint | Unchanged; accumulation happens during signing, not preparation | + +--- + +## Part 6: Docker Compose — Secrets and Environment Variables + +### The Core Problem + +The existing email failure in Docker production is a **runtime env var injection gap**: `CONTACT_SMTP_HOST`, `CONTACT_EMAIL_USER`, `CONTACT_EMAIL_PASS` (and `CONTACT_SMTP_PORT`) exist in `.env.local` during development but are never passed to the Docker container at runtime. The nodemailer transporter in `src/lib/signing/signing-mailer.tsx` reads these directly from `process.env`. When they are undefined, `nodemailer.createTransport()` silently creates a transporter with no credentials, and `sendMail()` fails at send time. + +### Why Not Docker Secrets (file-based)? + +Docker Compose's `secrets:` block mounts secret values as files at `/run/secrets/` inside the container. This is designed for Docker Swarm and serves a specific security model (encrypted transport in Swarm, no env var exposure in `docker inspect`). It requires the application to read from the filesystem instead of `process.env`. 
Bridging this to `process.env` requires an entrypoint shell script that reads each `/run/secrets/` file and exports its value before starting `node server.js`. + +For this deployment (single VPS, secrets are SSH-managed files on the server, not Swarm), the file-based secrets approach adds complexity with no meaningful security benefit over a properly permissioned `.env.production` file. **Use `env_file` injection, not `secrets:`.** + +**Decision:** Confirmed approach is `env_file:` with a server-side `.env.production` file (not committed to git, permissions 600, owned by deploy user). + +### Environment Variable Classification + +Not all env vars are equal. Next.js has two distinct categories: + +| Category | Prefix | When Evaluated | Who Can Read | Example | +|---|---|---|---|---| +| Server-only runtime | (none) | At request time, via `process.env` | API routes, Server Components, Route Handlers | `DATABASE_URL`, `CONTACT_SMTP_HOST`, `OPENAI_API_KEY` | +| Public build-time | `NEXT_PUBLIC_` | At `next build` — inlined into JS bundle | Client-side code | (none in this app) | + +This app has **no `NEXT_PUBLIC_` variables.** All secrets are server-only and evaluated at request time. They do not need to be present at `docker build` time — only at `docker run` / `docker compose up` time. This is the ideal case: the same Docker image can run in any environment by providing different `env_file` values. + +**Verified:** Next.js App Router server-side code (`process.env.X` in API routes, Server Components) reads env vars at request time when the route is dynamically rendered. Source: [Next.js deploying docs](https://nextjs.org/docs/app/getting-started/deploying), [vercel/next.js docker-compose example](https://github.com/vercel/next.js/tree/canary/examples/with-docker-compose). 
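
Because these variables are read lazily at request time, a missing one only surfaces when the affected code path runs, which is exactly how the silent nodemailer failure described above escaped notice. A small fail-fast guard at process start makes the same gap visible in `docker compose logs` immediately. This is a hypothetical sketch, not existing code: the `missingEnvVars` / `assertEnv` names and the idea of calling the guard from a startup hook are assumptions.

```typescript
// Hypothetical startup guard: fail fast when required server-only env vars
// are missing, instead of failing silently at the first email send.
const REQUIRED_ENV = [
  'DATABASE_URL',
  'SIGNING_JWT_SECRET',
  'AUTH_SECRET',
  'CONTACT_SMTP_HOST',
  'CONTACT_SMTP_PORT',
  'CONTACT_EMAIL_USER',
  'CONTACT_EMAIL_PASS',
] as const;

type Env = Record<string, string | undefined>;

// Returns the names that are absent or blank in the given environment.
export function missingEnvVars(env: Env): string[] {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name]!.trim() === '');
}

// Throws at boot so the problem shows up in `docker compose logs` right away,
// rather than as a delayed SMTP error during a signing flow.
export function assertEnv(env: Env = process.env): void {
  const missing = missingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`);
  }
}
```

Calling `assertEnv()` once from a server entry point (a Next.js `instrumentation.ts` `register()` hook is one option) turns a misconfigured `env_file:` into an immediate, loggable crash instead of a send-time failure.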
+
+### Required Secrets for Production
+
+Derived from `.env.local` inspection — all server-only, none `NEXT_PUBLIC_`:
+
+```
+DATABASE_URL           — Neon PostgreSQL connection string
+SIGNING_JWT_SECRET     — JWT signing key for signing tokens
+AUTH_SECRET            — Next Auth / Iron Session secret
+AGENT_EMAIL            — Agent login email
+AGENT_PASSWORD         — Agent login password hash seed
+BLOB_READ_WRITE_TOKEN  — Vercel Blob storage token
+CONTACT_EMAIL_USER     — SMTP username (fixes email delivery bug)
+CONTACT_EMAIL_PASS     — SMTP password (fixes email delivery bug)
+CONTACT_SMTP_HOST      — SMTP host (fixes email delivery bug)
+CONTACT_SMTP_PORT      — SMTP port (fixes email delivery bug)
+OPENAI_API_KEY         — GPT-4.1 for AI field placement
+```
+
+`SKYSLOPE_*` and `URE_*` credentials are script-only (seed/scrape scripts), not needed in the production container.
+
+### Compose File Structure
+
+**`docker-compose.yml`** (production):
+
+```yaml
+services:
+  app:
+    build:
+      context: ./teressa-copeland-homes
+      dockerfile: Dockerfile
+    restart: unless-stopped
+    ports:
+      - "3000:3000"
+    env_file:
+      - .env.production   # server-side secrets — NOT committed to git
+    environment:
+      NODE_ENV: production
+    healthcheck:
+      test: ["CMD", "node", "-e",
+             "fetch('http://localhost:3000/api/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+      start_period: 15s
+```
+
+**`.env.production`** (on server, never committed):
+
+```bash
+DATABASE_URL=postgres://...
+SIGNING_JWT_SECRET=...
+AUTH_SECRET=...
+AGENT_EMAIL=teressa@...
+AGENT_PASSWORD=...
+BLOB_READ_WRITE_TOKEN=...
+CONTACT_EMAIL_USER=...
+CONTACT_EMAIL_PASS=...
+CONTACT_SMTP_HOST=smtp.fastmail.com
+CONTACT_SMTP_PORT=465
+OPENAI_API_KEY=sk-...
+```
+
+**`.gitignore`** must include:
+```
+.env.production
+.env.production.local
+.env.local
+```
+
+### Dockerfile — next.config.ts Change Required
+
+The current `next.config.ts` does not set `output: 'standalone'`.
The standalone output is required for the official multi-stage Docker pattern — it produces a self-contained `server.js` with only necessary files, yielding a ~60-80% smaller production image compared to copying all of `node_modules`. + +**Change needed in `next.config.ts`:** +```typescript +const nextConfig: NextConfig = { + output: 'standalone', // ADD THIS + transpilePackages: ['react-pdf', 'pdfjs-dist'], + serverExternalPackages: ['@napi-rs/canvas'], +}; +``` + +**Caution:** `@napi-rs/canvas` is a native addon. Verify the production base image (Debian slim recommended, not Alpine) has the required glibc version. Alpine uses musl libc which is incompatible with pre-built `@napi-rs/canvas` binaries. The official canary Dockerfile uses `node:24-slim` (Debian). + +### Dockerfile — Recommended Pattern + +Three-stage build based on the [official Next.js canary example](https://github.com/vercel/next.js/blob/canary/examples/with-docker/Dockerfile): + +```dockerfile +ARG NODE_VERSION=20-slim + +# Stage 1: Install dependencies +FROM node:${NODE_VERSION} AS dependencies +WORKDIR /app +COPY package.json package-lock.json* ./ +RUN npm ci --no-audit --no-fund + +# Stage 2: Build +FROM node:${NODE_VERSION} AS builder +WORKDIR /app +COPY --from=dependencies /app/node_modules ./node_modules +COPY . . +ENV NODE_ENV=production +# No NEXT_PUBLIC_ vars needed — all secrets are server-only runtime vars +RUN npm run build + +# Stage 3: Production runner +FROM node:${NODE_VERSION} AS runner +WORKDIR /app +ENV NODE_ENV=production +ENV PORT=3000 +ENV HOSTNAME="0.0.0.0" + +RUN mkdir .next && chown node:node .next +COPY --from=builder --chown=node:node /app/public ./public +COPY --from=builder --chown=node:node /app/.next/standalone ./ +COPY --from=builder --chown=node:node /app/.next/static ./.next/static + +USER node +EXPOSE 3000 +CMD ["node", "server.js"] +``` + +**No `ARG`/`ENV` lines for secrets in the Dockerfile.** Secrets are never baked into the image. 
They arrive exclusively at runtime via `env_file:` in the Compose file.
+
+### Health Check Endpoint
+
+Required for the Compose `healthcheck:` to work. Create at `src/app/api/health/route.ts`:
+
+```typescript
 export async function GET() {
-  const session = await auth();
-  if (!session?.user?.id) return new Response('Unauthorized', { status: 401 });
-
-  const user = await db.query.users.findFirst({
-    where: eq(users.id, session.user.id),
-    columns: { agentSignatureData: true },
-  });
-  return Response.json({ dataURL: user?.agentSignatureData ?? null });
-}
-
-export async function PUT(req: Request) {
-  const session = await auth();
-  if (!session?.user?.id) return new Response('Unauthorized', { status: 401 });
-
-  const { dataURL } = await req.json() as { dataURL: string };
-  // Validate: must be a PNG dataURL
-  if (!dataURL.startsWith('data:image/png;base64,')) {
-    return Response.json({ error: 'Invalid signature format' }, { status: 400 });
-  }
-
-  await db.update(users)
-    .set({ agentSignatureData: dataURL })
-    .where(eq(users.id, session.user.id));
-
-  return Response.json({ ok: true });
+  return Response.json({ status: 'ok', uptime: process.uptime() });
 }
 ```
 
-### How Agent Signature Is Applied
+Neither `wget` nor `curl` is installed by default in `node:20-slim` (the Debian slim base purges its download tools after installing Node), so the container healthcheck should shell out to `node -e` and use the built-in `fetch` (available since Node 18) rather than assume either binary is present or add a separate `RUN apt-get install` layer.
 
-When the agent places an `agent-signature` field and has a saved signature, `PreparePanel` sends the `agentSignatureData` as the `value` on that field when calling `POST /api/documents/[id]/prepare`. The `prepare-document.ts` function embeds it as a PNG image at the field's coordinates — exactly the same logic as `embedSignatureInPdf`.
+### `.dockerignore`
 
-**The agent signature is embedded during prepare, before the document is sent to the client.** The client sees the agent's signature already in the PDF as a real image, not a placeholder rectangle.
+Prevents secrets and large dirs from entering the build context: + +``` +node_modules +.next +.env* +uploads/ +seeds/ +scripts/ +drizzle/ +*.png +*.pdf +``` + +### Deployment Procedure (First Time) + +``` +1. SSH into VPS +2. git clone (or git pull) the repo +3. Create .env.production with all secrets (chmod 600 .env.production) +4. Run database migration: docker compose run --rm app npm run db:migrate + (or run migration against Neon directly before starting) +5. docker compose build +6. docker compose up -d +7. docker compose logs -f app (verify email sends on first signing test) +``` + +### Common Pitfall: `db:migrate` in Container + +Drizzle `db:migrate` reads `DATABASE_URL` from env. In the container, this is provided via `env_file:`. Run migration as a one-off: + +```bash +docker compose run --rm app node -e " + const { drizzle } = require('drizzle-orm/neon-http'); + const { migrate } = require('drizzle-orm/neon-http/migrator'); + // ... +" +``` + +Or more practically: run `npx drizzle-kit migrate` from the host with `DATABASE_URL` set in the shell, pointing at the production Neon database, before deploying the new container. This avoids needing `drizzle-kit` inside the production image. --- -## Integration 4: Filled Preview Approach - -### Question answered: Re-render via react-pdf with overlaid values, or generate a new temporary PDF server-side? - -### Decision: Generate a temporary prepared PDF server-side; render with existing react-pdf viewer - -**Rejected approach: Client-side overlay rendering** - -Overlaying text/checkmarks on top of a react-pdf canvas in the browser is fragile. Text positions must be pixel-perfect, and the coordinate math between PDF user space and screen pixels is already complex (demonstrated by the existing `FieldPlacer.tsx`). Overlaid values would not be "in" the PDF — they would be CSS layers that look different from the final embedded result. 
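
For completeness, the entrypoint shim that the rejected file-based `secrets:` approach would have required looks roughly like the following. This is a hypothetical sketch, not something this deployment ships; `load_secrets_env` and the `SECRETS_DIR` override are illustrative names, and the point is only to show the extra moving part that `env_file:` avoids.

```shell
#!/bin/sh
# Hypothetical entrypoint shim for the rejected file-based `secrets:` approach:
# export each file in SECRETS_DIR (normally /run/secrets) as an env var named
# after the file, then hand off to the real command.
SECRETS_DIR="${SECRETS_DIR:-/run/secrets}"

load_secrets_env() {
  for f in "$SECRETS_DIR"/*; do
    [ -f "$f" ] || continue              # skip when the glob matches nothing
    name=$(basename "$f")
    export "$name=$(cat "$f")"           # e.g. CONTACT_SMTP_HOST=smtp.example.com
  done
}

load_secrets_env
exec "$@"                                # e.g. exec node server.js
```

Every secret added later would also need a file mount in the Compose file, which is exactly the kind of drift `env_file:` sidesteps.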
- -**Recommended approach: Reuse `preparePdf` in preview-only mode** - -The existing `preparePdf` function already generates a complete prepared PDF from the source PDF + text values + field geometries. For preview, call the same function but write to a temporary path, then serve it through the existing `/api/documents/[id]/file` pattern. - -``` -PreparePanel clicks "Preview" - │ - ▼ -POST /api/documents/[id]/preview - │ (auth-gated, agent only) - │ body: { textFillData, fields (DocumentField[]) } - ▼ -preparePdf(srcPath, tmpPath, textFillData, fields) - │ (same function, new tmp destination) - ▼ -Response: streams the tmp PDF bytes directly - │ (or writes to uploads/{docId}_preview.pdf, serves via file route) - ▼ -PreparePanel opens preview in a modal with from react-pdf - (same PdfViewerWrapper pattern — ssr: false) -``` - -**Preview route:** - -```typescript -// app/api/documents/[id]/preview/route.ts (NEW FILE) -import { auth } from '@/lib/auth'; -import { db } from '@/lib/db'; -import { documents } from '@/lib/db/schema'; -import { eq } from 'drizzle-orm'; -import { preparePdf } from '@/lib/pdf/prepare-document'; -import { readFile } from 'node:fs/promises'; -import path from 'node:path'; -import type { DocumentField } from '@/lib/db/schema'; - -const UPLOADS_DIR = path.join(process.cwd(), 'uploads'); - -export async function POST( - req: Request, - { params }: { params: Promise<{ id: string }> } -) { - const session = await auth(); - if (!session) return new Response('Unauthorized', { status: 401 }); - - const { id } = await params; - const body = await req.json() as { - textFillData?: Record; - fields?: DocumentField[]; - }; - - const doc = await db.query.documents.findFirst({ where: eq(documents.id, id) }); - if (!doc?.filePath) return Response.json({ error: 'Not found' }, { status: 404 }); - - const srcPath = path.join(UPLOADS_DIR, doc.filePath); - const previewPath = path.join(UPLOADS_DIR, doc.filePath.replace(/\.pdf$/, '_preview.pdf')); - - // Path 
traversal guard - if (!previewPath.startsWith(UPLOADS_DIR)) return new Response('Forbidden', { status: 403 }); - - await preparePdf(srcPath, previewPath, body.textFillData ?? {}, body.fields ?? []); - - const pdfBytes = await readFile(previewPath); - return new Response(pdfBytes, { - headers: { - 'Content-Type': 'application/pdf', - 'Content-Disposition': 'inline', - 'Cache-Control': 'no-store', - }, - }); -} -``` - -**PreparePanel preview modal:** Add a "Preview" button that calls `POST /api/documents/[id]/preview` with current field + text-fill state, receives the PDF bytes, converts to an object URL via `URL.createObjectURL`, and opens a modal containing `` from react-pdf. Uses the same `PdfViewerWrapper` pattern (dynamic import, `ssr: false`). - -**Why not stream PDF to a new browser tab:** Object URL in a modal keeps the preview in-app and avoids browser popup blockers. The agent can review without leaving the prepare page. - -**Preview file persistence:** The `_preview.pdf` file is overwritten each time the agent clicks Preview. It is not stored in the DB and is never sent to the client. It can be cleaned up on a schedule or simply overwritten on each preview request. 
- ---- - -## Data Flow Changes - -### New Flow: AI Auto-Prepare - -``` -Agent clicks "AI Auto-place" - │ - ▼ -Client: POST /api/documents/[id]/ai-prepare - │ - ▼ -Server: extractPdfText(filePath) — pdfjs-dist legacy build - │ - ▼ -Server: callFieldPlacementAI(text, clientContext) — OpenAI gpt-4o-mini - │ structured output - ▼ -Server: convert xPct/yPct → PDF points using page dimensions - │ - ▼ -Response: { fields: DocumentField[], prefillData: Record } - │ - ▼ -Client: setFields(aiFields) + setTextFillData(prefillData) - │ — renders in FieldPlacer (existing render path) - ▼ -Agent adjusts if needed, then saves via existing PUT /api/documents/[id]/fields -``` - -### New Flow: Agent Signature Draw & Save - -``` -Agent opens "My Signature" panel (new section in PreparePanel or sidebar) - │ - ▼ -Agent draws signature on SignaturePad canvas (reuse SignatureModal pattern) - │ - ▼ -Agent clicks "Save Signature" - │ - ▼ -Client: PUT /api/agent/signature { dataURL } - │ - ▼ -Server: UPDATE users SET agent_signature_data = ? WHERE id = session.user.id - │ - ▼ -Response: { ok: true } - │ - ▼ -Agent can now place "Agent Sig" tokens in FieldPlacer — on prepare, the saved -dataURL is fetched and embedded as a PNG image at those field coordinates. 
-``` - -### Modified Flow: Prepare (Extended) - -``` -Agent clicks "Prepare and Send" (existing button) - │ - ▼ -PreparePanel: POST /api/documents/[id]/prepare - body: { - textFillData: Record, // existing - emailAddresses: string[], // existing - fields: DocumentField[], // MODIFIED: full typed field array - } - │ - ▼ -prepare/route.ts: fetch agent signature if any agent-sig fields present - │ GET users.agentSignatureData WHERE id = session.user.id - │ - ▼ -preparePdf(srcPath, destPath, textFillData, fields, agentSigDataURL) - │ — extended to handle all field types - │ - ▼ -[existing] DB update, audit log, redirect -``` - ---- - -## Component Boundaries - -### New Files (create from scratch) - -| File | Type | Purpose | -|------|------|---------| -| `src/lib/ai/extract-text.ts` | Server-only lib | pdfjs-dist text extraction for OpenAI input | -| `src/lib/ai/field-placement.ts` | Server-only lib | OpenAI structured output call, prompt, schema | -| `src/app/api/documents/[id]/ai-prepare/route.ts` | API route (auth-gated) | Orchestrates extract + AI call + coord conversion | -| `src/app/api/documents/[id]/preview/route.ts` | API route (auth-gated) | Calls preparePdf in preview mode, streams bytes | -| `src/app/api/agent/signature/route.ts` | API route (auth-gated) | GET/PUT agent saved signature from DB | -| `src/app/portal/(protected)/documents/[docId]/_components/AgentSignaturePanel.tsx` | Client component | Draw/save/display agent signature; calls PUT /api/agent/signature | -| `src/app/portal/(protected)/documents/[docId]/_components/PreviewModal.tsx` | Client component | Modal wrapping react-pdf Document for preview display | - -### Modified Files (targeted additions only) - -| File | Change | Constraint | -|------|--------|------------| -| `src/lib/db/schema.ts` | Add `DocumentField` discriminated union; add `agentSignatureData` to users table; add `propertyAddress` to clients table | Keep `SignatureFieldData` as type alias — zero breaking changes | -| 
`src/lib/pdf/prepare-document.ts` | Add type-aware rendering for text/checkbox/date/agent-sig/initials fields | Existing signature field path (no type / `client-signature`) must behave identically | -| `src/app/api/documents/[id]/fields/route.ts` | Accept `DocumentField[]` (union type) instead of `SignatureFieldData[]` — structurally identical, TypeScript type change only | No behavior change | -| `src/app/api/documents/[id]/prepare/route.ts` | Fetch agent signature from users table if any agent-sig fields present | Must remain backward-compatible with no-fields body | -| `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` | Add new draggable tokens to palette (text, checkbox, initials, date, agent-sig); render type-specific labels and colors for placed fields | Do NOT change the drag/drop/move/resize/persist mechanics | -| `src/app/portal/(protected)/documents/[docId]/_components/PreparePanel.tsx` | Add "AI Auto-place" button + loading state; add "Preview" button + PreviewModal; add AgentSignaturePanel; connect new API routes | Do NOT change existing "Prepare and Send" flow | -| `src/app/sign/[token]/_components/SigningPageClient.tsx` | Filter signatureFields to client-interaction types only (`client-signature`, `initials`, no-type) | Do NOT change submit logic or embed-signature flow | -| `src/app/api/sign/[token]/route.ts` (POST) | Filter signatureFields to client-sig/initials before building `signaturesWithCoords` | Other field types are already baked into the prepared PDF | -| `src/app/portal/_components/ClientModal.tsx` | Add `propertyAddress` field | Straightforward form field addition | -| `src/lib/actions/clients.ts` | Add propertyAddress to create/update mutations | Simple Drizzle update | - -### Files to Leave Completely Unchanged - -| File | Reason | -|------|--------| -| `src/lib/signing/embed-signature.ts` | Agent signature embedding reuses this logic, but via `prepare-document.ts` not here. Client signing path is unchanged. 
| -| `src/lib/signing/audit.ts` | No new audit event types needed for v1.1 | -| `src/lib/signing/token.ts` | Signing token flow unchanged | -| `src/app/sign/[token]/_components/SignatureModal.tsx` | Client signing modal unchanged | -| `src/app/api/sign/[token]/download/route.ts` | Download route unchanged | -| `src/app/api/documents/[id]/send/route.ts` | Send flow unchanged | -| All public marketing pages | No changes to homepage, contact form, listings | - ---- - -## Database Schema Changes - -``` --- Migration: v1.1 additions --- Generated by: npm run db:generate after schema.ts changes - -ALTER TABLE "users" - ADD COLUMN IF NOT EXISTS "agent_signature_data" text; -- base64 PNG dataURL - -ALTER TABLE "clients" - ADD COLUMN IF NOT EXISTS "property_address" text; -- for AI pre-fill - --- No migration needed for documents.signature_fields JSONB — --- new field types are stored in existing column, backward-compatible -``` - -**Migration safety:** Both new columns are nullable with no default, so existing rows are unaffected. The `signatureFields` JSONB change requires no migration — JSONB stores arbitrary JSON. - ---- - -## Coordinate Conversion: AI Output to PDF Points - -The AI returns `xPct` / `yPct` (0–100). The API route must convert to PDF user-space points before returning to the client or writing to DB. - -```typescript -// Inside ai-prepare/route.ts, after AI call: -import * as pdfjsLib from 'pdfjs-dist/legacy/build/pdf.mjs'; - -const pdfDoc = await pdfjsLib.getDocument({ data: new Uint8Array(await readFile(srcPath)) }).promise; -const convertedFields: DocumentField[] = []; - -for (const aiField of result.fields) { - const page = await pdfDoc.getPage(aiField.page); - const viewport = page.getViewport({ scale: 1.0 }); - const pageW = viewport.width; // PDF points - const pageH = viewport.height; - - // Default dimensions by type - const defaultW = aiField.type === 'checkbox' ? 14 : aiField.type === 'initials' ? 
72 : 144; - const defaultH = aiField.type === 'checkbox' ? 14 : 28; - - convertedFields.push({ - id: crypto.randomUUID(), - type: aiField.type, - label: aiField.label, - page: aiField.page, - x: (aiField.xPct / 100) * pageW, - y: (aiField.yPct / 100) * pageH, - width: defaultW, - height: defaultH, - ...(aiField.prefillValue ? { value: aiField.prefillValue } : {}), - }); -} -``` - ---- - -## Build Order for v1.1 - -Dependencies flow in this order. Each item can only start after its prerequisites. - -``` -Step 1 — Schema foundation (no deps within v1.1) -├── Extend DocumentField union in schema.ts -├── Add agentSignatureData to users table -├── Add propertyAddress to clients table -└── Run db:generate + db:migrate - -Step 2 — Agent signature persistence (depends on Step 1) -├── GET/PUT /api/agent/signature route -└── AgentSignaturePanel component (draw + save + display) - -Step 3 — New field types in FieldPlacer (depends on Step 1) -├── Extend FieldPlacer palette with 5 new token types -├── Update field rendering (type-specific colors/labels) -└── Update PUT /api/documents/[id]/fields to accept DocumentField[] - -Step 4 — Extended prepare-document.ts (depends on Step 1, 2, 3) -├── Add type-aware rendering in preparePdf -├── Handle agent-sig field type: fetch + embed PNG -└── Update POST /api/documents/[id]/prepare to pass agent sig - -Step 5 — Client signing page filter (depends on Step 3) -├── Filter signatureFields to client-interaction types in SigningPageClient -└── Filter in POST /api/sign/[token] before embedSignatureInPdf - -Step 6 — Preview (depends on Step 4) -├── POST /api/documents/[id]/preview route -└── PreviewModal component + "Preview" button in PreparePanel - -Step 7 — AI auto-place (depends on Step 1, 3) -├── lib/ai/extract-text.ts -├── lib/ai/field-placement.ts -├── POST /api/documents/[id]/ai-prepare route -└── "AI Auto-place" button + client-side field application in PreparePanel - -Step 8 — Property address for AI (depends on Step 1) -├── Add 
propertyAddress field to ClientModal -└── Update clients server action to save propertyAddress -``` - -**Why this order:** -- Steps 1–3 establish the data foundation that everything else reads from. -- Step 4 (prepare) depends on knowing what all field types look like. -- Step 5 (client signing filter) must happen before Step 4 ships, otherwise clients see agent-sig/text fields as signature prompts. -- Step 6 (preview) depends on the extended prepare function being complete. -- Step 7 (AI) depends on field types being placeable and the FieldPlacer accepting them. - ---- - -## Anti-Patterns to Avoid - -### 1. Exposing OPENAI_API_KEY to the client - -Never call the OpenAI API from a Client Component or import `lib/ai/*.ts` in a component. All OpenAI calls must go through `POST /api/documents/[id]/ai-prepare`. Add a `'server-only'` import to `lib/ai/field-placement.ts` to get a build error if accidentally imported on the client. - -### 2. Storing agent signature in localStorage only (current v1.0 state) - -localStorage is ephemeral (cleared on browser data wipe), not shared across devices, and not available server-side during prepare. Keeping the current localStorage fallback is fine as a UX shortcut during the signing session, but the source of truth must be the DB. - -### 3. Changing the `embedSignatureInPdf` signing flow for client signatures - -The client signing flow embeds signatures from client-drawn dataURLs. Do not modify `embed-signature.ts` or the POST `/api/sign/[token]` logic to handle new field types — handle agent-sig and text during `preparePdf` only. The signing route should filter fields before calling embed. - -### 4. Making preview a full "saved" prepared file - -The `_preview.pdf` file is ephemeral and not recorded in the DB. Do not confuse it with `preparedFilePath`. If the agent proceeds to send after previewing, the actual `POST /api/documents/[id]/prepare` generates a fresh `_prepared.pdf` as before. Preview is read-only and stateless. 
- -### 5. Using AI field placement as authoritative without agent review - -AI placement is a starting point. The "AI Auto-place" button fills the FieldPlacer with suggested fields, but the agent must be able to adjust before the fields are committed to the DB. Coordinates from the AI response should populate the client-side field state, not directly write to DB. - -### 6. Skipping path traversal guard on new preview route - -The preview route writes a file to `uploads/`. Apply the same `destPath.startsWith(UPLOADS_DIR)` guard used in the prepare route. - -### 7. Using `pdf-parse` as an additional dependency - -`pdfjs-dist` is already installed (dependency of `react-pdf`). Use the legacy build server-side. Adding `pdf-parse` would be a duplicate dependency with no benefit. - ---- - -## Integration Points Summary - -| New Feature | Touches | Does NOT Touch | -|-------------|---------|----------------| -| AI auto-place | lib/ai/* (new), ai-prepare route (new), FieldPlacer (palette only), PreparePanel (button only) | Signing flow, embed-signature, FieldPlacer drag logic | -| New field types | schema.ts (type union), FieldPlacer (palette + render), prepare-document.ts (type switch), sign route (filter) | Signature modal, signing token, audit log | -| Agent signature | users table (new column), signature route (new), PreparePanel (new panel), prepare-document.ts (embed agent sig) | Client signing, embed-signature.ts, signing token | -| Filled preview | preview route (new), PreviewModal (new), PreparePanel (button only), prepare-document.ts (reused as-is) | Prepare/send flow, signing flow, DB records | +## Part 7: Multi-Signer Key Risks and Mitigations + +| Risk | Severity | Mitigation | +|---|---|---| +| Two signers submit simultaneously, both read same PDF | HIGH | Postgres advisory lock `pg_advisory_xact_lock(hashtext(documentId))` on signing POST | +| Accumulator path tracking lost between signers | MEDIUM | `documents.signedFilePath` always tracks current 
accumulator; null = use preparedFilePath |
+| Agent sends before all fields tagged to signers | MEDIUM | PreparePanel validates: block send if any client-visible field has no `signerEmail` in a multi-signer document |
+| `ALTER TYPE ADD VALUE` in Postgres < 12 fails in transaction | MEDIUM | Use `--> statement-breakpoint` between each ALTER; verify Postgres version |
+| Resending to a signer (token expired) | LOW | Issue new token via a resend endpoint; existing tokens remain valid |
+| Legacy documents break | LOW | `signerEmail` optional at every layer; null path = unchanged behavior throughout |
+
+## Part 8: Docker Key Risks and Mitigations
+
+| Risk | Severity | Mitigation |
+|---|---|---|
+| `.env.production` committed to git | HIGH | `.gitignore` entry required; never add to repo |
+| `@napi-rs/canvas` binary incompatible with Alpine | HIGH | Use `node:20-slim` (Debian), not `node:20-alpine` |
+| Secrets baked into Docker image layer | MEDIUM | Zero `ARG`/`ENV` secret lines in Dockerfile; all secrets via `env_file:` at compose up |
+| `standalone` output omits required files | MEDIUM | Test locally with `output: 'standalone'` before pushing; watch for missing static assets |
+| Neither `curl` nor `wget` present in slim image | LOW | Healthcheck shells out to `node -e` with the built-in `fetch`; no extra packages to install |
+| Migration runs against wrong DB | LOW | Run `drizzle-kit migrate` from host against Neon URL before container start; never inside production image |
+
+---
+
 ## Sources
 
-- [pdfjs-dist legacy build — Node.js text extraction](https://lirantal.com/blog/how-to-read-and-parse-pdfs-pdfjs-create-pdfs-pdf-lib-nodejs)
-- [unpdf vs pdf-parse vs pdfjs-dist — 2026 comparison](https://www.pkgpulse.com/blog/unpdf-vs-pdf-parse-vs-pdfjs-dist-pdf-parsing-extraction-nodejs-2026)
-- [OpenAI Structured Outputs — official docs](https://platform.openai.com/docs/guides/structured-outputs)
-- [Introducing Structured Outputs in the
API](https://openai.com/index/introducing-structured-outputs-in-the-api/) -- [Next.js 15 Server Actions vs API Routes — 2025 patterns](https://medium.com/@sparklewebhelp/server-actions-in-next-js-the-future-of-api-routes-06e51b22a59f) -- [PostgreSQL BYTEA vs TEXT for image storage](https://www.postgrespro.com/list/thread-id/1509166) -- [Drizzle ORM — add column migration](https://orm.drizzle.team/docs/drizzle-kit-generate) -- [react-pdf npm — v10 (current)](https://www.npmjs.com/package/react-pdf) +- [Docker Compose Secrets — Official Docs](https://docs.docker.com/compose/how-tos/use-secrets/) — HIGH confidence +- [Next.js with-docker official example — Dockerfile (canary)](https://github.com/vercel/next.js/blob/canary/examples/with-docker/Dockerfile) — HIGH confidence +- [Next.js with-docker-compose official example (canary)](https://github.com/vercel/next.js/tree/canary/examples/with-docker-compose) — HIGH confidence +- [Next.js env var classification (runtime vs build-time)](https://nextjs.org/docs/pages/guides/environment-variables) — HIGH confidence +- Direct codebase inspection: `src/lib/signing/signing-mailer.tsx`, `src/lib/db/schema.ts`, `.env.local` key names, `next.config.ts` — HIGH confidence --- - -*Architecture research for: v1.1 Smart Document Preparation — integration with existing Next.js 15 app* -*Researched: 2026-03-21* +*Architecture research for: teressa-copeland-homes v1.2* +*Researched: 2026-04-03* diff --git a/.planning/research/FEATURES.md b/.planning/research/FEATURES.md index 4b19941..ebc9eee 100644 --- a/.planning/research/FEATURES.md +++ b/.planning/research/FEATURES.md @@ -257,3 +257,257 @@ Features that go beyond the baseline — meaningful specifically for this app's --- *Feature research for: Teressa Copeland Homes — v1.1 Smart Document Preparation* *Researched: 2026-03-21* + +--- +--- + +# Feature Research — v1.2 Multi-Signer Workflow + +**Domain:** Multi-signer document signing — parallel dispatch, signer-scoped signing pages, 
completion tracking
+**Researched:** 2026-04-03
+**Confidence:** HIGH — grounded in existing codebase inspection (ARCHITECTURE.md already defines the implementation model), verified against DocuSign/HelloSign/PandaDoc behavioral patterns from web research
+
+---
+
+## Scope Note
+
+This section covers only the v1.2 milestone additions. The existing v1.0/v1.1 features (single-signer flow, field placement, AI pre-fill, agent signature, filled preview) are already built.
+
+**New features under research (multi-signer only):**
+1. Signer list entry on the document (replace single recipient with named signer rows)
+2. Field-to-signer assignment when placing fields
+3. Per-signer signing tokens and parallel dispatch
+4. Signer-filtered signing page
+5. Per-signer completion tracking
+6. Document completion: all-signed notification + final PDF to all parties
+
+---
+
+## Feature Landscape
+
+### Table Stakes (Users Expect These)
+
+These are behaviors that DocuSign, HelloSign (Dropbox Sign), and PandaDoc all implement as defaults. An agent who has used any of these tools will expect them. Missing any one of these makes the multi-signer feature feel broken.
+
+| Feature | Why Expected | Complexity | Notes |
+|---------|--------------|------------|-------|
+| Name each signer (email + optional display name) | Every major tool requires identifying signers before placing fields. Without it, field assignment has no target and the email has no greeting. | LOW | Replaces the single email textarea in PreparePanel with a list of name+email rows. Names used in the email greeting ("Hi Sarah,"). Emails drive field routing. Minimum 1 signer; no enforced maximum for MVP. |
+| Assign each client-visible field to a specific signer | Core of multi-signer — a field without a signer assignment is ambiguous: does the Buyer or the Seller sign it? DocuSign, PandaDoc, and dotloop all require explicit field-to-recipient assignment before sending. 
| MEDIUM | Per-field signer selector in FieldPlacer. Agent-owned fields (agent signature, agent-pre-filled text) have no signer assignment — they are filled at prepare time and locked. Only client-visible fields (signature, initials, checkbox) need assignment. Date fields are client-visible but auto-stamp — they inherit the assignment of the adjacent signature/initials field they accompany. |
+| Block send if any client-visible field is unassigned | DocuSign enforces this as a hard error before sending. Agents expect it because sending an unassigned signature field means no one will ever sign it. | LOW | UI validation: "Send" button is disabled and shows "2 fields have no signer assigned." Agent fixes in FieldPlacer before proceeding. |
+| Each signer receives their own unique signing link | Standard behavior. Sharing one link among multiple signers is not a pattern any tool uses — it removes identity, makes completion tracking impossible, and creates legal ambiguity about who signed what. | MEDIUM | Each signer gets their own token, with their email recorded in the `signer_email` column of `signing_tokens`. Links sent simultaneously via `Promise.all()`. |
+| Each signer sees only their own fields | Expected by analogy with single-signer: when you receive a DocuSign link, you see only your fields. Seeing another signer's name on an empty signature field you cannot fill is confusing and creates "who signs where?" support calls. | MEDIUM | Signing GET endpoint filters `signatureFields` by `field.signerEmail === tokenRow.signerEmail`. Other signers' fields are simply not sent to the client. The PDF renders in full; only the interactive overlay elements are scoped. |
+| Signing page shows progress ("X of Y signers have signed") | Users expect to know where the document stands. DocuSign shows "1 of 3 recipients have signed." HelloSign shows a progress indicator. In real estate, agents are frequently asked "has everyone signed yet?" 
| LOW | Simple count from `signing_tokens` WHERE `document_id = ? AND used_at IS NOT NULL` vs total tokens. Shown on the signing confirmation page and in the agent portal document detail. |
+| Agent is notified when all signers complete | Universal behavior. DocuSign sends a "Document Completed" email to the sender by default. Without this, the agent has no way to know the document is fully executed without polling the portal. | LOW | Triggered once all tokens are claimed (`document_completed` audit event). One email to agent. Subject: "All parties have signed — [Document Name]". |
+| All parties receive the final PDF when complete | DocuSign attaches the completed PDF to the completion email (subject to a 5 MB size limit). Real estate agents expect all parties to have a copy of the fully-executed document. It is also an industry norm for legal record-keeping. | MEDIUM | Each signer gets an email with a time-limited signed download link to the final merged PDF, served by the app (files live on the server's uploads volume, not Vercel Blob). Agent gets the same. Not an attachment — avoids mail-provider size limits and attachment-stripping policies. |
+
+### Differentiators (Competitive Advantage)
+
+These are not expected behaviors in a basic multi-signer implementation, but they fit the context of this specific app.
+
+| Feature | Value Proposition | Complexity | Notes |
+|---------|-------------------|------------|-------|
+| Per-signer color coding on the field placement canvas | When the agent is placing fields, it is easy to lose track of which fields belong to which signer. DocuSign uses color-coded recipient fields (blue for Recipient 1, green for Recipient 2). This prevents misassignment. | LOW | CSS border/background color per `signerEmail`. Color assigned deterministically (first signer = blue, second = green, etc.). Legend shown in FieldPlacer sidebar. No user-configurable color. 
|
+| Signing confirmation page shows what was just signed | After submitting, the signer sees "Thanks Sarah — you've signed. 1 of 2 signers still needs to sign." This is the Dropbox Sign post-sign experience. It reduces "did it work?" anxiety without exposing the other signers' names or contact details. | LOW | Simple post-submit page. Shows: signer's own name, document name, completion timestamp, number of remaining signers (not their names/emails). No mid-process download — the final PDF arrives by email once all parties have signed. |
+| Document status shows per-signer completion in agent portal | The agent can see at a glance which signers have completed and which have not. In single-signer this was binary (Sent/Signed). Multi-signer needs granularity: "Buyer signed March 3 — Seller has not yet signed." | MEDIUM | Agent portal document detail reads `doc.signers[]` array. Each entry shows: signer name/email, `signedAt` timestamp if completed, "Awaiting" if not. This is the `signers` JSONB column from ARCHITECTURE.md. |
+
+### Anti-Features (Commonly Requested, Often Problematic)
+
+| Feature | Why Requested | Why Problematic | Alternative |
+|---------|---------------|-----------------|-------------|
+| Sequential signing (Signer 2 cannot sign until Signer 1 completes) | "Agent should sign before the buyer" or "Seller must sign before buyer sees it" | PROJECT.md explicitly specifies parallel signing ("any order"). Sequential adds routing logic, waiting states, and re-notification emails — significant complexity for a feature the project owner explicitly rejected. For this app, the agent signs first at prepare time (before sending). The remaining signers are all clients who sign in parallel. | Agent signs before sending (already built in v1.1 as agent-signature flow). All client signers receive links simultaneously. |
+| Signing reminder emails (automated resend if not signed in N days) | "What if the client doesn't sign for a week?" 
| Requires background job infrastructure (cron or worker queue). For a solo agent sending to known clients, a manual "Resend reminder" button is sufficient and avoids the infra cost. | Manual resend: agent can click "Resend link" in the portal for any signer who has not yet completed. This creates a new token and sends a fresh email. LOW complexity, same user outcome for 95% of cases. | +| Signer can see other signers' completed signatures on the PDF before they sign | "It feels more complete to see the buyer already signed when the seller opens the doc" | Requires a different field visibility model: fetching all signers' submitted data and rendering it on the PDF before the current signer submits. The accumulator model in ARCHITECTURE.md handles this post-hoc (each signer signs into the accumulator), not pre-emptively. Getting this right without creating privacy or sequence issues (what if buyer changes their mind?) is genuinely complex. | Signers see the base PDF with their own fields. After ALL signers complete, the final merged PDF goes to everyone — this is the standard industry pattern. | +| Signing order labels on fields ("Sign here - Signer 2") | "Helps signers know which fields are theirs" | Since each signer only sees their own fields on their signing page, the label is redundant. It adds visual noise and raises questions ("why does it say Signer 2 — am I in the right place?"). | Per-signer filtered signing page already solves this. Each signer sees a clean page with only their fields, labeled by type ("Sign here", "Initial here"). | +| Signer-level expiry (Signer A's link expires in 7 days, Signer B's in 14 days) | "Different parties have different deadlines" | Adds per-token configuration to the send flow. For real estate, all parties in a transaction are on the same deadline. Universal expiry (e.g., 14 days) covers 100% of Teressa's use cases. | All tokens for a document share the same expiry duration. 
Agent sets it once at send time (or uses the default). A future resend flow handles expired tokens. | +| Email address validation against a contacts database | "Warn me if I enter an email that's not in my clients list" | Signers may be parties Teressa hasn't added to her contacts (co-buyers, lenders, title officers). Restricting to the clients table introduces friction for the 30-40% of signers who aren't pre-loaded. | Free-text email input with standard email format validation only. Signer does not need a client record. This is confirmed in ARCHITECTURE.md: "signers may not be in clients table." | +| Bulk send to a document template (one form, many different transactions) | "Send this listing agreement template to 10 different clients at once" | Requires template management, variable substitution, and per-recipient field instances. Equivalent to DocuSign PowerForms. Multi-signer in v1.2 is scoped to one document, one transaction, multiple parties. | Template save is deferred to v2 per v1.1 research. Bulk send depends on templates. Correct scope is: one document, name the signers, send. 
| + +--- + +## Feature Dependencies (v1.2) + +``` +[v1.0 Single-Signer Signing Flow] + └──extended by──> [Per-Signer Token Creation] + └──requires──> [Signer List Entry UI] + └──enables──> [Parallel Email Dispatch] + └──enables──> [Signer-Filtered Signing Page] + +[v1.1 Field Placement UI] + └──extended by──> [Field-to-Signer Assignment] + └──requires──> [Signer List Entry UI] (must know signers before tagging fields) + └──enables──> [Per-Signer Color Coding] + +[Signer List Entry UI] + └──gates──> [Field-to-Signer Assignment] (chicken-and-egg: signers must exist before fields can be tagged) + └──gates──> [Per-Signer Token Creation + Dispatch] + +[Field-to-Signer Assignment] + └──gates──> [Send] (unassigned client-visible fields block send) + +[Per-Signer Token Creation + Dispatch] + └──enables──> [Signer-Filtered Signing Page] + └──enables──> [Per-Signer Completion Tracking] + +[Per-Signer Completion Tracking] + └──enables──> [Document Completion Detection] + └──triggers──> [Agent Completion Notification Email] + └──triggers──> [All-Parties Final PDF Email] + +[v1.1 Accumulator PDF (signed by one signer, file updated)] + └──extended by──> [Multi-Signer Sequential Accumulation] + └──requires──> [Postgres Advisory Lock] (concurrent signer writes) + └──produces──> [Final Merged PDF] +``` + +### Dependency Notes + +- **Signer list must be entered before fields are placed.** This is the most important UX sequencing constraint. If the agent enters signers after placing fields, existing fields have no `signerEmail` and the validation block prevents sending. Resolution: the PreparePanel prompts for signers as the first step in the prepare flow. Fields can be placed after signers are defined. + +- **The chicken-and-egg problem with signer entry order.** Some agents will want to place fields first, then decide who the signers are. The system should allow out-of-order workflows: agent places fields with "Unassigned" status, then enters signers and assigns them. 
The send block validates that no client-visible field is unassigned at send time — it does not prevent field placement before signer entry. + +- **Accumulator PDF depends on the advisory lock.** Two signers submitting simultaneously without a lock will produce a corrupted PDF (both read the same input file and write conflicting outputs). The `pg_advisory_xact_lock` pattern from ARCHITECTURE.md is mandatory, not optional. + +- **Date fields inherit their signer from context.** A date field adjacent to a buyer signature field should be tagged to the buyer. Currently date fields are auto-stamped at session completion — in multi-signer, the timestamp is written at the moment that signer's token is claimed, not at "all done." A date field on buyer's row uses buyer's signing timestamp. This means date fields need a `signerEmail` assignment just like signature fields. The agent should assign date fields to the adjacent signer, or UI could auto-assign based on proximity. + +- **Completion emails require the final merged PDF path to be stable.** The all-parties completion email contains a download link. That link must point to the final accumulator output, which is only stable after the last signer submits. The completion detection step in ARCHITECTURE.md (step 10b of the signing POST) correctly gates the email send until after all tokens are claimed. + +--- + +## v1.2 Feature Definitions + +### Launch With (v1.2 — this milestone) + +- [ ] **Signer list entry UI** — PreparePanel adds a "Signers" section. Agent enters rows of (optional name, required email). Minimum 1 signer. Add/remove rows. Saved to `documents.signers` JSONB via PUT `/api/documents/[id]/signers`. Displayed above field placement step so signers are known before tagging begins. + +- [ ] **Field-to-signer assignment** — In FieldPlacer, when placing a client-visible field (signature, initials, checkbox), a dropdown appears to select which signer the field belongs to. Options populated from `doc.signers[]`. 
Color-coded by signer. Unassigned fields shown in a distinct "no signer" color. Field data stores `signerEmail` in `SignatureFieldData` JSONB. + +- [ ] **Per-signer color coding** — Each signer is assigned a color deterministically (first signer = blue, second = green, etc.). Placed fields render with that signer's color border/background. A color legend appears in the FieldPlacer sidebar. Maximum 4-5 signers before colors become indistinct — adequate for real estate documents which rarely have more than 3 parties. + +- [ ] **Send validation: block if fields unassigned** — Before sending, server validates all client-visible fields have a `signerEmail`. UI shows: "3 fields are not assigned to a signer." Send button disabled. Agent resolves in FieldPlacer. + +- [ ] **Per-signer token creation and parallel dispatch** — Send route detects `doc.signers?.length > 0`. For each signer: create a signing token with `signer_email`, send signing request email, log `signer_email_sent` audit event. All dispatched via `Promise.all()` (parallel, any order). Legacy single-signer path (no `doc.signers`) remains unchanged. + +- [ ] **Signer-filtered signing page** — Signing GET endpoint reads `tokenRow.signerEmail`. Returns only `signatureFields` where `field.signerEmail === tokenRow.signerEmail`. Legacy null `signer_email` falls through to `isClientVisibleField()` (unchanged). The signing page renders the full PDF with only the current signer's fields as interactive overlays. + +- [ ] **Per-signer completion tracking in portal** — Agent portal document detail shows a signers list: name/email, status ("Signed March 3 at 2:14 PM" or "Awaiting signature"). Data sourced from `doc.signers[].signedAt`. Simple read from JSONB — no new query required. + +- [ ] **Document completion detection** — After each signer submits, the signing POST checks: all tokens for this document have `used_at IS NOT NULL`. 
If yes, fires `document_completed` audit event, sets `documents.status = 'Signed'`, sets `signedAt`, computes `pdfHash`.
+
+- [ ] **Agent completion notification email** — On `document_completed`, send one email to the agent: "All parties have signed [Document Name]. Download the final document: [link]." Time-limited signed download link served by the app, 30-day expiry.
+
+- [ ] **All-parties final PDF email** — On `document_completed`, send to each signer: "The document is fully signed. Here is your copy: [link]." Same time-limited download link. Personalized greeting using `signer.name` if available, fallback to email address.
+
+- [ ] **Post-sign confirmation page** — After a signer submits, they see: document name, their name/email, signing timestamp, "X of Y signers have completed." If they are the last signer: "The document is fully executed. All parties will receive a copy by email." No download link at this stage — the final PDF is delivered by email once all signers have completed.
+
+### Add After Validation (v1.x)
+
+- [ ] **Manual resend link to a specific signer** — Agent portal: "Resend link" button per signer row in the document detail. Deletes the old signing token, creates a new one, and sends a fresh email. (Do not mark the old token with a sentinel `used_at` — that would falsely satisfy the all-tokens-claimed completion check.) Triggered by: agent reporting a signer didn't receive their link or token expired.
+
+- [ ] **Token expiry handling at signing time** — Currently expired tokens show a generic "link expired" error. In multi-signer context, show: "This signing link has expired. Contact [agent name] to request a new link." Include agent contact info from agent profile.
+
+### Future Consideration (v2+)
+
+- [ ] **Automated reminder emails** — Reminder sent to unsigned signers after N days. Requires a background job (cron). For current volume (solo agent, known clients), manual resend covers the need. Add when agent requests automation.
+
+- [ ] **Sequential signing order** — Signer 2 does not receive their link until Signer 1 completes. 
PROJECT.md explicitly out of scope for v1.2. If Teressa requests this in the future, it requires: routing state machine, deferred email dispatch, per-signer "waiting" status. + +- [ ] **Template-based multi-signer** — Save a field layout + signer role map as a reusable template. "Buyer/Seller roles" pre-assigned; agent enters names/emails at send time. Requires template management UI and role-based field assignment (rather than email-based). Depends on template save feature (also v2). + +- [ ] **Signing deadline per document** — Agent sets an expiry date ("all signers must sign by March 15"). System rejects signing submissions after deadline. Useful for offer deadlines in real estate. Not needed for v1.2 — use token expiry as a proxy for now. + +--- + +## Feature Prioritization Matrix (v1.2) + +| Feature | User Value | Implementation Cost | Priority | +|---------|------------|---------------------|----------| +| Signer list entry UI | HIGH | LOW | P1 | +| Field-to-signer assignment | HIGH | MEDIUM | P1 | +| Send validation: block if fields unassigned | HIGH | LOW | P1 | +| Per-signer token creation + parallel dispatch | HIGH | MEDIUM | P1 | +| Signer-filtered signing page | HIGH | MEDIUM | P1 | +| Agent completion notification email | HIGH | LOW | P1 | +| All-parties final PDF email | HIGH | LOW | P1 | +| Document completion detection | HIGH | MEDIUM | P1 | +| Per-signer completion tracking in portal | MEDIUM | LOW | P1 | +| Post-sign confirmation page | MEDIUM | LOW | P1 | +| Per-signer color coding | MEDIUM | LOW | P2 | +| Manual resend link | MEDIUM | MEDIUM | P2 | +| Token expiry UX for multi-signer | LOW | LOW | P2 | +| Automated reminder emails | MEDIUM | HIGH | P3 | +| Sequential signing order | LOW | HIGH | P3 | +| Template-based multi-signer | MEDIUM | HIGH | P3 | + +**Priority key:** +- P1: Must have for v1.2 milestone +- P2: Should have, add when possible within this milestone +- P3: Defer to v2+ + +--- + +## Competitor Feature Analysis (v1.2 Scope: 
Multi-Signer)
+
+| Feature | DocuSign | Dropbox Sign (HelloSign) | PandaDoc | This App (v1.2) |
+|---------|----------|--------------------------|----------|-----------------|
+| Signer identity model | Named recipients with email + role | Named recipients with email | Named signers (Signer 1, 2, 3) | Name + email rows in PreparePanel; no role abstraction |
+| Field-to-signer assignment | Explicit per-field recipient assignment, color-coded | Explicit per-field signer assignment | Explicit; template roles pre-assigned | Per-field signer dropdown with deterministic color coding |
+| Parallel vs sequential dispatch | Both; configurable via signing order numbers (same number = parallel group) | Both; set via signing order | Both | Parallel only (any order); PROJECT.md explicitly rejects sequential for v1.2 |
+| Signer sees only own fields | Yes — recipient-scoped signing page | Yes | Yes | Yes — field filtering by `signerEmail` in token at signing GET |
+| Completion notification to sender | Yes — automatic email + PDF | Yes | Yes | Agent email on `document_completed` with download link |
+| Completion PDF to all signers | Yes — configurable; PDF attached or linked | Yes | Yes | Time-limited signed download link sent to each signer on completion |
+| Per-signer status in sender dashboard | Yes — "Delivered", "Viewed", "Signed" per recipient | Yes | Yes | "Signed [timestamp]" or "Awaiting" per signer in portal |
+| Reminder emails | Yes — configurable auto-reminders | Yes — "Send reminder" manually or auto | Yes | Manual resend (v1.2); auto-reminders deferred to v2 |
+| In-progress PDF visibility | Sender can view; signers do not see others' submissions | Same | Same | Final PDF only sent on completion; no mid-process access |
+
+---
+
+## Behavioral Specifications (v1.2)
+
+### UX Flow: Agent Prepares and Sends a Multi-Signer Document
+
+1. **Agent opens document** in portal → navigates to Prepare tab
+2. **Signer entry step** (new): "Who needs to sign this document?" 
— agent adds rows: "Sarah Chen (sarah@example.com)" and "Mike Chen (mikec@example.com)" — clicks Save Signers +3. **Field placement**: agent uses drag-drop or AI auto-place — fields appear on canvas; each client-visible field shows a signer selector; agent assigns "Signature (Sarah)" to the buyer fields and "Signature (Mike)" to the co-buyer fields — color distinction makes it clear at a glance +4. **Validation**: agent clicks Send — system checks: all client-visible fields have a signer. If not, shows "2 fields unassigned — resolve in field placement" +5. **Dispatch**: send route creates two signing tokens, sends two emails simultaneously. Agent sees: "Signing links sent to 2 signers." +6. **Signing**: Sarah opens her link and sees only her 4 fields. She signs and submits. Mike opens his link and sees only his 3 fields. He signs and submits. (Order does not matter.) +7. **Completion**: after both submit, agent receives "All parties have signed — Sales Agreement Chen." Sarah and Mike each receive an email with the final PDF download link. + +### What Happens at the Signing Page (Per Signer) + +The signing page is functionally identical to the existing single-signer page with one difference: only the current signer's fields appear as interactive. The PDF renders in full — the signer can read the entire document (this is legally important — they are agreeing to the whole document, not just their fields). The interactive signature/initials/checkbox overlays appear only at the positions assigned to their email. + +There is no indication of where the other signers' fields are. No greyed-out fields, no "this field belongs to someone else" markers. Clean presentation with only actionable elements visible. + +### Accumulator PDF Integrity + +Each signer's signing POST acquires a Postgres advisory lock keyed to the document ID, reads the current accumulator path (`doc.signedFilePath ?? 
doc.preparedFilePath`), embeds that signer's captured signatures at their field positions, writes the result to a new path, updates `doc.signedFilePath`, then releases the lock. This is sequential within the lock — two signers who submit simultaneously are serialized, not concurrent. The final accumulator after the last signer completes is the authoritative fully-signed document. See ARCHITECTURE.md for the full implementation detail. + +--- + +## Sources + +**Multi-signer signing behavior — DocuSign:** +- [DocuSign Trainer Tips: Mastering Signing Order](https://community.docusign.com/tips-from-docusign-155/docusign-trainer-tips-mastering-signing-order-22771) — MEDIUM confidence (community doc) +- [Quick Tip: Setting a Signing Order for Recipients — DocuSign](https://www.docusign.com/en-gb/blog/quick-tip-setting-signing-order) — HIGH confidence (official) +- [Does fully executed contract automatically get emailed back to all parties? — DocuSign Community](https://community.docusign.com/esignature-111/does-fully-executed-contracts-automatically-get-emailed-back-to-all-parties-3772) — MEDIUM confidence (confirms default completion behavior) +- [Why are documents not attached to the Completed email notification — DocuSign Support](https://support.docusign.com/s/articles/Why-are-documents-not-attached-to-the-Completed-email-notification?language=en_US) — HIGH confidence (official support) + +**Multi-signer signing behavior — HelloSign / PandaDoc:** +- [PandaDoc vs HelloSign feature comparison — eSignGlobal](https://www.esignglobal.com/blog/pandadoc-vs-hellosign-comparison) — MEDIUM confidence +- [How to set up a signature workflow with multiple signers — PandaDoc Blog](https://www.pandadoc.com/blog/digital-signature-workflow/) — MEDIUM confidence +- [PandaDoc vs Hellosign: Detailed Comparison 2025 — DocuPilot](https://www.docupilot.com/vs/pandadoc-vs-hellosign) — MEDIUM confidence + +**Real estate multi-signer and reminder patterns:** +- [Multi-party signing workflow — 
eSignGlobal](https://www.esignglobal.com/blog/multi-party-signing-workflow) — MEDIUM confidence +- [Electronic Signatures for Real Estate: 2026 Guide — DocuPilot](https://www.docupilot.com/blog/electronic-signature-for-real-estate) — MEDIUM confidence +- [Lone Wolf Authentisign — Real estate's leading eSignature solution](https://www.lwolf.com/operate/esignature) — HIGH confidence (industry tool) +- [BoldSign — E-Signature for Real Estate](https://boldsign.com/solutions/electronic-signature-for-real-estate/) — MEDIUM confidence + +**Implementation architecture (codebase-derived — HIGH confidence):** +- `/Users/ccopeland/temp/red/.planning/research/ARCHITECTURE.md` — direct codebase inspection, schema decisions, data flow, build order + +--- +*Feature research for: Teressa Copeland Homes — v1.2 Multi-Signer Workflow* +*Researched: 2026-04-03* diff --git a/.planning/research/PITFALLS.md b/.planning/research/PITFALLS.md index 80ab11d..1cc64ee 100644 --- a/.planning/research/PITFALLS.md +++ b/.planning/research/PITFALLS.md @@ -1,381 +1,641 @@ # Pitfalls Research -**Domain:** Real estate broker web app — v1.1 additions: AI field placement, expanded field types, agent saved signature, filled document preview -**Researched:** 2026-03-21 -**Confidence:** HIGH (all pitfalls grounded in the actual v1.0 codebase reviewed; no speculative claims) +**Domain:** Real estate broker web app — v1.2 additions: multi-signer support and Docker production deployment +**Researched:** 2026-04-03 +**Confidence:** HIGH — all pitfalls grounded in the v1.1 codebase reviewed directly; no speculative claims. Source code line references included throughout. --- -## Context: What v1.1 Is Adding to the Existing System +## Context: What v1.2 Is Adding to the Existing System -The v1.0 codebase has been reviewed. Key facts that shape every pitfall below: +The v1.1 codebase has been reviewed in full. 
Key facts that make every pitfall below concrete: -- `SignatureFieldData` (schema.ts) has **no `type` field** — it stores only `{ id, page, x, y, width, height }`. Every field is treated as a signature. -- `FieldPlacer.tsx` has **one draggable token** labeled "Signature" — no other field types exist in the palette. -- `SigningPageClient.tsx` **iterates `signatureFields`** and opens the signature modal for every field. It has no concept of field type. -- `embed-signature.ts` **only draws PNG images** — no logic for text, checkboxes, or dates. -- `prepare-document.ts` uses `@cantoo/pdf-lib` (confirmed import), fills AcroForm text fields and draws blue rectangles for signature placeholders. It does not handle the new field types. -- Prepared PDF paths are stored as relative local filesystem paths (not Vercel Blob URLs). The signing route builds absolute paths from these. -- Agent saved signature: no infrastructure exists yet. The v1.0 `SignatureModal` checks `localStorage` for a saved signature — that is the only "save" mechanism today, and it is per-browser only. +- `signingTokens` table has one row per document, no `signerEmail` column. One token = one signer = current architecture. +- `SignatureFieldData` (schema.ts) stores `{ id, page, x, y, width, height, type? }` — no `signerEmail` field. All fields belong to the single signer. +- `send/route.ts` calls `createSigningToken(doc.id)` once and emails `client.email`. Multi-signer needs iteration. +- `documents.status` enum is `Draft | Sent | Viewed | Signed`. No per-signer completion state exists. +- `POST /api/sign/[token]` marks `documents.status = 'Signed'` when its one token is claimed. With multiple signers, the first signer to complete will trigger this transition prematurely. +- PDF files live at `process.cwd() + '/uploads'` — a local filesystem path. Docker containers have ephemeral filesystems by default. +- `NEXT_PUBLIC_BASE_URL` is used to construct signing URLs. 
Variables prefixed `NEXT_PUBLIC_` are inlined at build time in Next.js, not resolved at container startup. +- Nodemailer transporter in `signing-mailer.tsx` calls `createTransporter()` per send — healthy pattern, but reads `CONTACT_SMTP_HOST` at call time, which only works if the env var is present in the container. +- `src/lib/db/index.ts` uses `postgres(url)` with no explicit `max` connection limit. In Docker, the `postgres` npm package defaults to `10` connections per instance. Against Neon, the free tier allows 10 concurrent connections total — one container saturates this budget entirely. +- `next.config.ts` declares `serverExternalPackages: ['@napi-rs/canvas']`. This native binary must be present in the Docker image. The package ships platform-specific `.node` files selected by npm at install time. If the Docker image is built on ARM (Apple Silicon) and run on x86_64 Linux, the wrong binary is included. +- `package.json` lists `@vercel/blob` as a production dependency. It is not used anywhere in the codebase. Its presence creates a risk of accidental use in future code that would break in a non-Vercel Docker deployment. --- -## Critical Pitfalls +## Summary -### Pitfall 1: Breaking the Signing Page by Adding Field Types Without Type Discrimination +Eight risk areas for v1.2: + +1. **Multi-signer completion detection** — the current "first signer marks Signed" pattern will falsely complete documents. +2. **Docker filesystem and env var** — Next.js bakes `NEXT_PUBLIC_*` at build time; container loses uploads unless a volume is mounted; `DATABASE_URL` and SMTP secrets silently absent in container. +3. **SMTP in Docker** — not a DNS problem for external SMTP services, but env var injection failure is the confirmed root cause of the reported email breakage. +4. **PDF assembly on partial completion** — the final merged PDF must only be produced once, after all signers complete, without race conditions. +5. 
**Token security** — multiple tokens per document opens surfaces that a single-token system didn't have. +6. **Neon connection pool exhaustion** — `postgres` npm client's default 10 connections saturates Neon's free tier connection limit in a single container. +7. **`@napi-rs/canvas` native binary** — cross-platform Docker builds break this native module without explicit platform targeting. +8. **`@vercel/blob` dead dependency** — installed but unused; its presence risks accidental use in code that would silently fail outside Vercel. + +--- + +## Multi-Signer Pitfalls + +### Pitfall 1: First Signer Marks Document "Signed" — Completion Fires Prematurely **What goes wrong:** -`SignatureFieldData` has no `type` field. `SigningPageClient.tsx` opens the signature-draw modal for every field in `signatureFields`. When new field types (text, checkbox, initials, date, agent-signature) are stored in that same array with only coordinates, the client signing page either (a) shows a signature canvas for a checkbox field, or (b) crashes with a runtime error when it encounters a field type it doesn't handle, blocking the entire signing page. - -**Why it happens:** -The schema change is made on the agent side first (adding a `type` discriminant to `SignatureFieldData` and new field types to `FieldPlacer`), but the signing page is not updated in the same commit. Even one deployed document with mixed field types — sent before the signing page update — will be broken for that client. - -**How to avoid:** -Add `type` to `SignatureFieldData` as a string literal union **before** any field placement UI changes ship. Make the signing page's field renderer branch on `type` defensively: unknown types default to a placeholder ("not required") rather than throwing. Ship both changes atomically — schema migration, `FieldPlacer` update, and `SigningPageClient` update must be deployed together. Never have a deployed state where the schema supports types the signing page doesn't handle. 
- -**Warning signs:** -- `SignatureFieldData` in `schema.ts` gains a `type` property but `SigningPageClient.tsx` still iterates fields without branching on it. -- The FieldPlacer palette has more tokens than the signing page has rendering branches. -- A document is sent before the signing page is updated to handle the new types. - -**Phase to address:** -Phase 1 of v1.1 (schema and signing page update) — must be the first change, before any AI or UI work touches field types. - ---- - -### Pitfall 2: AI Coordinate System Mismatch — OpenAI Returns CSS-Space Percentages, pdf-lib Expects PDF Points - -**What goes wrong:** -The OpenAI response for field placement will return bounding boxes in one of several formats: percentage of page (0–1 or 0–100), pixel coordinates at an assumed render resolution, or CSS-style top-left origin. The existing `SignatureFieldData` schema stores **PDF user space coordinates** (bottom-left origin, points). When the AI output is stored without conversion, every AI-placed field appears at the wrong position — often inverted on the Y axis. The mismatch is not obvious during development if you test with PDFs where fields land approximately near the correct area. - -**Why it happens:** -The current `FieldPlacer.tsx` already has a correct `screenToPdfCoords` function for converting drag events. But that function takes rendered pixel dimensions as input. When AI output arrives as a JSON payload, developers mistakenly store the raw AI coordinates directly into the database without passing them through the same conversion. The sign-on-screen overlay in `SigningPageClient.tsx` then applies `getFieldOverlayStyle()` which expects PDF-space coords, producing the wrong position. 
- -**Concrete example from the codebase:** -`screenToPdfCoords` in `FieldPlacer.tsx` computes: -``` -pdfY = ((renderedH - screenY) / renderedH) * pageInfo.originalHeight -``` -If the AI returns a y_min as fraction of page height from the top (0 = top), storing it directly as `field.y` means the field appears at the bottom of the page instead of the top, because PDF Y=0 is the bottom. - -**How to avoid:** -Define a canonical AI output format contract before building the prompt. Use normalized coordinates (0–1 fractions from top-left) in the AI JSON response, then convert server-side using a single `aiCoordsToPagePdfSpace(norm_x, norm_y, norm_w, norm_h, pageWidthPts, pageHeightPts)` utility. This utility mirrors the existing `screenToPdfCoords` logic. Unit-test it against a known Utah purchase agreement with known field positions before shipping. - -**Warning signs:** -- AI-placed fields appear clustered at the bottom or top of the page regardless of document content. -- The AI integration test uses visual eyeballing rather than coordinate assertions. -- The conversion function is not covered by the existing test suite (`prepare-document.test.ts`). - -**Phase to address:** -AI field placement phase — write the coordinate conversion utility and its test before the OpenAI API call is made. - ---- - -### Pitfall 3: OpenAI Token Limits on Large Utah Real Estate PDFs - -**What goes wrong:** -Utah standard real estate forms (REPC, listing agreements, buyer representation agreements) are 10–30 pages. Sending the raw PDF bytes or a base64-encoded PDF to GPT-4o-mini will immediately hit the 128k context window limit for multi-page forms, or produce truncated/hallucinated field detection when the document is silently cut off mid-content. GPT-4o-mini's vision context limit is further constrained by image tokens — a single PDF page rendered at 72 DPI costs roughly 1,700 tokens; a 20-page document at standard resolution consumes ~34,000 tokens before any prompt text. 
- -**Why it happens:** -Developers prototype with short test PDFs (2–3 pages) where the approach works, then discover it fails on production forms. The failure mode is not a hard error — the API returns a response, but field positions are wrong or missing because the model never saw the later pages. - -**How to avoid:** -Page-by-page processing: render each PDF page to a base64 PNG (using `pdfjs-dist` or `sharp` on the server), send each page image in a separate API call, then merge the field results. Cap input image resolution to 1024px wide (sufficient for field detection). Set a token budget guard before each API call and log when pages approach the limit. Use structured output (JSON mode) so partial responses fail loudly rather than silently returning incomplete data. - -**Warning signs:** -- AI analysis is tested with only a 2-page or 3-page sample PDF. -- The implementation sends the entire PDF to OpenAI in a single request. -- Field detection success rate degrades noticeably on page 8+. - -**Phase to address:** -AI integration phase — establish the page-by-page pipeline pattern before testing with real Utah forms. - ---- - -### Pitfall 4: Prompt Design — AI Hallucinates Fields That Don't Exist or Misses Required Fields - -**What goes wrong:** -Without a carefully constrained prompt, GPT-4o-mini will "helpfully" infer field locations that don't exist in the PDF (e.g., detecting a printed date as a fillable date field) or will use inconsistent field type names that don't match the application's `type` enum (`"text_input"` instead of `"text"`, `"check_box"` instead of `"checkbox"`). This produces spurious fields in the agent's document and breaks the downstream field type renderer. - -**Why it happens:** -The default behavior of vision models is to be helpful and infer structure. Without explicit constraints (exact allowed types, instructions to return empty array when no fields exist, max field count), the output is non-deterministic and schema-incompatible. 
- -**How to avoid:** -Use OpenAI's structured output (JSON schema mode) with an explicit enum for field types matching the application's type discriminant exactly. Include a negative instruction: "Only detect fields that have an explicit visual placeholder (blank line, box, checkbox square) — do not infer fields from printed text labels." Include a `confidence` score per field so the agent UI can filter low-confidence placements. Validate the response JSON against a Zod schema server-side before storing — reject the entire AI response if any field has an invalid type. - -**Warning signs:** -- The prompt asks the model to "detect all form fields" without specifying what counts as a field. -- The response is stored directly in the database without Zod validation. -- The agent sees unexpected fields on pages with no visual placeholders. - -**Phase to address:** -AI integration phase — validate prompt output against Zod before the first real Utah form is tested. - ---- - -### Pitfall 5: Agent Saved Signature Stored as Raw DataURL — Database Bloat and Serving Risk - -**What goes wrong:** -A canvas signature exported as `toDataURL('image/png')` produces a base64-encoded PNG string. A typical signature on a 400x150 canvas is 15–60KB as base64. If this is stored directly in the database (e.g., a `TEXT` column in the `users` table), every query that fetches the user row will carry 15–60KB of base64 data it may not need. More critically, if the dataURL is ever sent to the client to pre-populate a form field, it exposes the full signature as a downloadable string in page source. - -**How to avoid:** -Store the signature as a file (Vercel Blob or the existing `uploads/` directory), and store only the file path/URL in the database. On the signing page and preview, serve the signature through an authenticated API route that streams the file bytes — never expose the raw dataURL to the client page. 
Alternatively, convert the dataURL to a `Uint8Array` immediately on the server (for PDF embedding only) and discard the string — only the file path goes to the DB. - -**Warning signs:** -- A `savedSignatureDataUrl TEXT` column is added to the `users` table. -- The agent dashboard page fetches the user row and passes `savedSignatureDataUrl` to a React component prop. -- The signature appears in the React devtools component tree as a base64 string. - -**Phase to address:** -Agent saved signature phase — establish the storage pattern (file + path, not dataURL + column) before any signature saving UI is built. - ---- - -### Pitfall 6: Race Condition — Agent Updates Saved Signature While Client Is Mid-Signing - -**What goes wrong:** -The agent draws a new saved signature and saves it while a client has the signing page open. The signing page has already loaded the signing request data (including `signatureFields`). When the agent applies their new saved signature to an agent-signature field and re-prepares the document, there are now two versions of the prepared PDF on disk: the one the client is looking at and the newly generated one. If the client submits their signature concurrently with the agent's re-preparation, `embedSignatureInPdf()` may read a partially-written prepared PDF (before the atomic rename completes) or the document may be marked "Sent" again after already being in "Viewed" state, breaking the audit trail. - -**Why it happens:** -The existing prepare flow in `PreparePanel.tsx` allows re-preparation of Draft documents. Once agent signing is added, the agent can re-run preparation on a "Sent" or "Viewed" document to swap their signature, creating a mutable prepared PDF while a client session is active. - -**How to avoid:** -Lock prepared documents once the first signing link is sent. Gate the agent re-prepare action behind a confirmation: "Resending will invalidate the existing signing link — the client will receive a new email." 
On confirmation, atomically: (1) mark the old signing token as `usedAt = now()` with reason "superseded", (2) delete the old prepared PDF (or rename to `_prepared_v1.pdf`), (3) generate a new prepared PDF, (4) issue a new signing token, (5) send a new email. This prevents mid-session clobber. The existing `embedSignatureInPdf` already uses atomic rename (`tmp → final`) which prevents partial-read corruption — preserve this. - -**Warning signs:** -- Agent can click "Prepare and Send" on a document with status "Sent" without any confirmation dialog. -- The prepared PDF path is deterministic and overwritten in place (e.g. always `{docId}_prepared.pdf`). -- No "superseded" state exists in the `signingTokens` table. - -**Phase to address:** -Agent signing phase — implement the supersede-and-resend flow before any agent signature is applied to a sent document. - ---- - -### Pitfall 7: Filled Preview Is Served From the Same Path as the Prepared PDF — Stale Preview After Field Changes - -**What goes wrong:** -The agent makes changes to field placement or pre-fill values after generating a preview. The preview file on disk is now stale. The preview URL is cached by the browser (or a CDN). The agent sees the old preview and believes the document is correct, then sends it to the client. The client receives a document with the old pre-fill values, not the updated ones. - -**Why it happens:** -The existing `prepare-document.ts` writes to a deterministic path: `{docId}_prepared.pdf`. If the preview is served from the same path, any browser cache of that URL shows the old version. The agent has no visual indication that the preview is stale. - -**How to avoid:** -Generate preview PDFs to a separate path with a timestamp or version suffix: `{docId}_preview_{timestamp}.pdf`. Never serve the preview from the same path as the final prepared PDF. 
Add a "Preview is stale — regenerate before sending" banner that appears when `signatureFields` or `textFillData` are changed after the last preview was generated. Store `lastPreviewGeneratedAt` in the document record and compare to `updatedAt`. The "Send" button should be disabled until a fresh preview has been generated (or explicitly skipped by the agent). - -**Warning signs:** -- The preview endpoint serves `/api/documents/{id}/prepared` without a cache-busting mechanism. -- The agent can modify fields after generating a preview and the preview URL does not change. -- No "stale preview" indicator exists in the UI. - -**Phase to address:** -Filled document preview phase — establish the versioned preview path and staleness indicator before the first preview is rendered. - ---- - -### Pitfall 8: Memory Issues Rendering Large PDFs for Preview on the Server - -**What goes wrong:** -Generating a filled preview requires loading the PDF into memory (via `@cantoo/pdf-lib`), modifying it, and either returning the bytes for streaming or writing to disk. Utah real estate forms (REPC, addendums) can be 15–30 pages and 2–8MB as raw PDFs. Running `PDFDocument.load()` on an 8MB PDF in a Vercel serverless function that has a 256MB memory limit can cause OOM errors under concurrent load. The Vercel function timeout (10s default, 60s max on Pro) can also be exceeded for large PDFs with many embedded fonts. - -**Why it happens:** -Developers test with a small 2-page PDF in development and the function works fine. The function hits the memory wall only when a real Utah standard form (often 20+ pages with embedded images) is processed in production. - -**How to avoid:** -Do not generate the preview inline in a serverless function on every request. Instead: generate the preview once (as a write operation), store the result in the `uploads/` directory or Vercel Blob, and serve it from there. 
The preview generation can be triggered on-demand (agent clicks "Generate Preview") and is idempotent. Set a timeout guard: if `PDFDocument.load()` takes longer than 8 seconds, return a 504 with "Preview temporarily unavailable." Monitor the Vercel function execution time and memory in the dashboard — alert at 70% of the memory limit. - -**Warning signs:** -- Preview is regenerated on every page load (no stored preview file). -- The preview route calls `PDFDocument.load()` within a synchronous request handler. -- Tests only use PDFs smaller than 2MB. - -**Phase to address:** -Filled document preview phase — establish the "generate once, serve cached" pattern from the start. - ---- - -### Pitfall 9: Client Signing Page Confusion — Preview Shows Agent Pre-Fill but Client Signs a Different Document - -**What goes wrong:** -The filled preview shows the document with all text pre-fills applied (client name, property address, price). The client signing page also renders the prepared PDF — which already contains those fills (because `prepare-document.ts` fills AcroForm fields and draws text onto the PDF). But the visual design difference between "this is a preview for review" and "this is the actual document you are signing" is unclear. If the agent generates a stale preview and the client signs a different (more recent) prepared PDF, the client believes they signed what they previewed, but the legal document has different content. - -**How to avoid:** -The client signing page must always serve the **same** prepared PDF that was cryptographically hashed at prepare time. The preview the agent saw must be generated from that exact file — not a re-generation. Store the SHA-256 hash of the prepared PDF at preparation time (same pattern as the existing `pdfHash` for signed PDFs). When serving the client's signing PDF, recompute and verify the hash matches before streaming. This ties the signed document back to the exact bytes the agent previewed. 
- -**Warning signs:** -- The preview is generated by a different code path than `prepare-document.ts` (e.g., a separate PDF rendering library). -- No hash is stored for the prepared PDF, only for the signed PDF. -- The agent can re-prepare after preview generation without the signing link being invalidated. - -**Phase to address:** -Filled document preview phase AND agent signing phase — hash the prepared PDF immediately after writing it (extend the existing `pdfHash` pattern from signed to prepared). - ---- - -### Pitfall 10: Agent Signature Field Handled by Client Signing Page - -**What goes wrong:** -A new `"agent-signature"` field type is added to `FieldPlacer`. The agent applies their saved signature to this field before sending. But `SigningPageClient.tsx` iterates all fields in `signatureFields` and shows a signing prompt for each one. If the agent-signature field is included in the array sent to the client, the client sees a field labeled "Signature" (or unlabeled) that is already visually signed with someone else's signature, and the progress bar counts it as an unsigned field the client must complete. - -**Why it happens:** -The client signing page receives the full `signatureFields` array from the GET `/api/sign/[token]` response. The route currently returns `doc.signatureFields ?? []` without filtering. When agent-signature fields are added to the same array, they are included in the client's field list. - -**Concrete location in codebase:** +`POST /api/sign/[token]` at line 254–263 of the current route unconditionally executes: ```typescript -// /src/app/api/sign/[token]/route.ts, line 88 -signatureFields: doc.signatureFields ?? [], +await db.update(documents).set({ status: 'Signed', signedAt: now, ... }) + .where(eq(documents.id, payload.documentId)); ``` -This sends ALL fields to the client, including any agent-filled fields. +With two signers, Signer A completes and triggers this. The document is now `Signed`. 
Signer B's token is still valid, but when Signer B opens their signing page GET request, it checks `doc.signatureFields` filtered by `isClientVisibleField`. The document's fields are all there — nothing prevents Signer B from completing. Two `signature_submitted` audit events are logged for the same document, two conflicting `_signed.pdf` files may be written, and the agent receives two "document signed" emails. The final PDF hash stored in `documents.pdfHash` is from whichever signer completed last and overwrote the row. + +**Why it happens:** +The single-signer assumption is load-bearing in the POST handler. Completion detection is a single UPDATE, not a query across all tokens for the document. **How to avoid:** -Filter the `signatureFields` array in the signing token GET route: only return fields where `type !== 'agent-signature'` (or more precisely, only return fields the client is expected to sign). Agent-signed fields should be pre-embedded into the `preparedFilePath` PDF during document preparation — by the time the client opens the signing link, the agent's signature is already baked into the prepared PDF as a drawn image. The `signatureFields` array sent to the client should contain only the fields the client needs to provide. +Add a `signerEmail TEXT NOT NULL` column to `signingTokens`. Completion detection becomes: after claiming a token (the atomic UPDATE that prevents double-submission), query `SELECT COUNT(*) FROM signing_tokens WHERE document_id = ? AND used_at IS NULL`. If count reaches zero, all signers have completed — only then trigger final PDF assembly and agent notification. Protect this with a database transaction so the count query and the "mark Signed" update are atomic. Never set `documents.status = 'Signed'` until the zero-remaining-tokens check passes. **Warning signs:** -- The full `signatureFields` array is returned from the signing token GET without filtering by `type`. 
-- Agent-signed fields are stored in the same `signatureFields` JSONB column as client signature fields. -- The client progress bar shows more fields than the client is responsible for signing. +- `POST /api/sign/[token]` sets `status = 'Signed'` without first counting remaining unclaimed tokens. +- Agent receives two notification emails after a two-signer document is tested. +- `documents.signedAt` is overwritten by both signers (last-write-wins). -**Phase to address:** -Agent signing phase — filter the signing response by field type before the first agent-signed document is sent to a client. +**Phase to address:** Multi-signer schema phase — before any send or signing UI is changed, establish the completion detection query. --- -## Technical Debt Patterns +### Pitfall 2: Race Condition — Two Signers Complete Simultaneously, Both Trigger Final PDF Assembly -Shortcuts that seem reasonable but create long-term problems. +**What goes wrong:** +Signer A and Signer B submit within milliseconds of each other (common if they are in the same room). Both claim their respective tokens atomically — that part works. Both then execute the "count remaining unclaimed tokens" check. If that check is not inside the same database transaction as the token claim, both reads may return 0 remaining (after the other's claim propagated), and both handlers proceed to assemble the final merged PDF simultaneously. Two concurrent writes to `{docId}_signed.pdf` corrupt the file (partial PDF bytes interleaved), or the second write silently overwrites the first. 
-| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable | -|----------|-------------------|----------------|-----------------| -| Store saved signature as dataURL in users table | No new file storage code needed | Every user query pulls 15–60KB of base64; dataURL exposed in client props | Never — use file storage from the start | -| Re-use same `_prepared.pdf` path for preview and final prepared doc | No versioning logic needed | Stale previews; no way to prove which prepared PDF the client signed | Never — versioned paths required for legal integrity | -| Return all signatureFields to client (no type filtering) | Simpler route code | Client sees agent-signature fields as required fields to complete | Never for agent-signature type; acceptable for debugging only | -| Prompt OpenAI with entire PDF as one request | Simpler prompt code | Fails silently on documents > ~8 pages; token limit hit without hard error | Acceptable only for prototyping with < 5 page test PDFs | -| Add `type` to SignatureFieldData but don't add a schema migration | Skip Drizzle migration step | Existing rows have `null` type; `signatureFields` JSONB array has mixed null/typed entries; TypeScript union breaks | Never — migrate immediately | -| Generate preview on every page load | No caching logic needed | OOM errors on large PDFs under Vercel memory limit; slow UX | Acceptable only during local development | +**Why it happens:** +The atomic token claim (`UPDATE ... WHERE used_at IS NULL RETURNING`) is a single row update. The subsequent completion check is a separate query. Two handlers can interleave between those two operations. 
+**How to avoid:**
+Use a `completionTriggeredAt TIMESTAMP` column on `documents` with a one-time-set guard:
+```typescript
+const won = await db.update(documents)
+  .set({ completionTriggeredAt: new Date() })
+  .where(and(eq(documents.id, docId), isNull(documents.completionTriggeredAt)))
+  .returning({ id: documents.id });
+if (won.length === 0) return; // another handler already triggered completion
+// proceed to final PDF assembly
+```
+This is the same pattern the existing token claim uses (`UPDATE ... WHERE used_at IS NULL RETURNING`). If zero rows come back, another handler already won the race; skip assembly silently.
+
+**Warning signs:**
+- Two concurrent POST requests for the same document produce two `_signed.pdf` files.
+- The `documents` table has no `completionTriggeredAt` column.
+
**Phase to address:** Multi-signer schema phase — establish this pattern alongside the completion detection fix.

---

### Pitfall 3: Legacy Single-Signer Documents Break When signingTokens Gains signerEmail

**What goes wrong:**
+v1.0 and v1.1 documents have one row in `signingTokens` with no `signerEmail`. Adding `signerEmail NOT NULL` with no default fails outright in Postgres on a non-empty table (existing rows contain null); adding it as nullable or with a default, but without backfilling existing rows, breaks every existing signing link more quietly: the token lookup succeeds, but any code that assumes `token.signerEmail` is populated misroutes or crashes on legacy rows.
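Until a backfill lands, reads can be made tolerant of legacy rows. A minimal sketch of the defensive pattern — `resolveSignerEmail` and `legacyClientEmail` are hypothetical names, not existing code:

```typescript
// Hypothetical helper: a null or empty signerEmail marks a token issued
// before the multi-signer migration; fall back to the document's client email.
type TokenRow = { token: string; signerEmail: string | null };

function resolveSignerEmail(row: TokenRow, legacyClientEmail: string): string {
  return row.signerEmail && row.signerEmail.length > 0
    ? row.signerEmail
    : legacyClientEmail; // legacy single-signer token
}

console.log(resolveSignerEmail({ token: 't1', signerEmail: null }, 'client@example.com'));
// prints client@example.com
```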
-| Integration | Common Mistake | Correct Approach | -|-------------|----------------|------------------| -| OpenAI Vision API | Sending raw PDF bytes — PDFs are not natively supported by vision models | Convert each page to PNG via pdfjs-dist on the server; send page images, not PDF bytes | -| OpenAI structured output | Using `response_format: { type: 'json_object' }` and hoping the schema matches | Use `response_format: { type: 'json_schema', json_schema: { ... } }` with the exact schema, then validate with Zod | -| `@cantoo/pdf-lib` (confirmed import in codebase) | Calling `embedPng()` with a base64 dataURL that includes the `data:image/png;base64,` prefix on systems that strip it | The existing `embed-signature.ts` already handles this correctly — preserve the pattern when adding new embed paths | -| `@cantoo/pdf-lib` flatten | Flattening before drawing rectangles causes AcroForm overlay to appear on top of drawn content | The existing `prepare-document.ts` already handles order correctly (flatten first, then draw) — preserve this order in any new prepare paths | -| Vercel Blob (if migrated from local uploads) | Fetching a Blob URL inside a serverless function on the same Vercel deployment causes a request to the CDN with potential cold-start latency | Use the `@vercel/blob` SDK's `get()` method rather than `fetch(blob.url)` from within API routes | -| Agent signature file serving | Serving the agent's saved signature PNG via a public URL | Gate all signature file access behind the authenticated agent API — never expose with a public Blob URL | +**Why it happens:** +Drizzle migrations add the column in a single ALTER TABLE. There is no Drizzle migration command that backfills legacy data — that requires a separate SQL step in the migration file. + +**How to avoid:** +Add `signerEmail` as `TEXT` (nullable) initially. Backfill existing rows with the client's email via a JOIN at migration time. 
Then add the NOT NULL constraint in a second migration once backfill is confirmed. Alternatively, add `signerEmail TEXT DEFAULT ''` and document that empty string means "legacy single-signer." All code reading `signerEmail` must handle the legacy empty/null case. + +**Warning signs:** +- Drizzle migration adds `signer_email TEXT NOT NULL` in one step with no `DEFAULT` and no backfill SQL. +- A v1.0 document's signing link is not tested after migration. + +**Phase to address:** Multi-signer schema phase — include legacy backfill SQL in the migration script. --- -## Performance Traps +### Pitfall 4: Field-to-Signer Tag Stored in JSONB — Queries Cannot Filter by Signer Efficiently -Patterns that work at small scale but fail as usage grows. +**What goes wrong:** +`signatureFields JSONB` is an array of field objects. Adding `signerEmail` to each field object is the right call for field filtering in the signing page (already done via `isClientVisibleField`). But if the completion detection, status dashboard, or "who has signed" query tries to derive signer list from the JSONB array, it requires a Postgres JSONB containment query (`@>` or `jsonb_array_elements`). These are unindexed by default and slow on large arrays. More critically, if the agent changes a field's `signerEmail` tag after the document has been sent, the JSONB update does not cascade to any `signingTokens` rows — the token was issued for the old email. 
-| Trap | Symptoms | Prevention | When It Breaks | -|------|----------|------------|----------------| -| OpenAI call inline with agent "AI Place Fields" button click | 10–30 second page freeze; API timeout on multi-page PDFs | Trigger AI placement as a background job; poll for completion; show progress bar | Immediately on PDFs > 5 pages | -| PDF preview generation in a synchronous serverless function | Vercel function timeout (60s max Pro); OOM on 8MB PDFs | Generate once and store; serve from storage | On PDFs > 10MB or under concurrent load | -| Storing all signatureFields JSONB on documents table without a size guard | Large JSONB column slows document list queries | Add a field count limit (max 50 fields); if AI places more, require agent review | When AI places fields on 25+ page documents with many fields per page | -| dataURL signature image in `signaturesRef.current` in SigningPageClient | Each re-render serializes 50KB+ per signature into JSON | Already handled correctly in v1.0 (ref, not state) — do not move signature data to state when adding type-based rendering | Would break at > 5 simultaneous signature fields | +**How to avoid:** +The authoritative list of signers and their completion state lives in `signingTokens`, not in the JSONB. `signingTokens.signerEmail` is the source of truth for "who needs to sign." The JSONB field's `signerEmail` is used only at signing-page render time to filter which fields a given signer sees. Once a document is Sent (tokens issued), the JSONB field tags are considered frozen — re-tagging fields on a Sent document is not permitted without voiding the existing tokens. + +**Warning signs:** +- A query tries to derive the recipient list from `signatureFields JSONB` rather than from `signingTokens`. + +**Phase to address:** Multi-signer schema phase — document this invariant in a code comment on `signingTokens`. 
--- -## Security Mistakes +### Pitfall 5: Audit Trail Gap — No Record of Which Signer Completed Which Field -Domain-specific security issues beyond general web security. +**What goes wrong:** +The current `audit_events` table has `eventType: 'signature_submitted'` at the document level. With one signer this is unambiguous. With two signers, two `signature_submitted` events are logged for the same `documentId` with no `signerEmail` on the event. The legal audit trail cannot distinguish "Seller A signed at 14:00" from "Seller B signed at 14:05" — both appear as anonymous "signature submitted" events on the same document. -| Mistake | Risk | Prevention | -|---------|------|------------| -| Agent saved signature served via a predictable or public file path | Any user who can guess the path downloads the agent's legal signature | Store under a UUID path; serve only through `GET /api/agent/signature` which verifies the better-auth session before streaming | -| AI field placement values (pre-fill text) passed to OpenAI without scrubbing | Client PII (name, email, SSN, property address) sent to OpenAI and stored in their logs | Provide only anonymized document structure to the AI (page images without personally identifiable pre-fill values); apply pre-fill values server-side after AI field detection | -| Preview PDF served at a guessable URL (e.g. 
`/api/documents/{id}/preview`) without auth check | Anyone with the document ID can download a prepared document containing client PII | All document file routes must verify the agent session before streaming — apply the same guard as the existing `/api/documents/[id]/download/route.ts` | -| Agent signature dataURL transmitted from client to server in an unguarded API route | Any authenticated user (if multi-agent is ever added) can overwrite the saved signature | The save-signature endpoint must verify the session user matches the signature owner — prepare for this even in solo-agent v1 | -| Signed PDF stale preview served to client after re-preparation | Client signs a document that differs from what agent reviewed and approved | Hash prepared PDF at prepare time; verify hash before serving to client signing page | +**Why this matters:** +Utah e-signature law requires proof of who signed what and when. An undifferentiated audit log is a legal compliance gap (see existing LEGAL-03 compliance requirement in v1.0). + +**How to avoid:** +Add `signerEmail TEXT` to `auditEvents` (nullable, to preserve backward compatibility with v1.0 events). When logging `signature_submitted` in multi-signer mode, include the `signerEmail` from the claimed token row in the event metadata. The `metadata JSONB` column already exists and can carry this without a schema change — use `metadata: { signerEmail: tokenRow.signerEmail }` as a minimum before a proper column is added. + +**Warning signs:** +- Two `signature_submitted` events logged for the same `documentId` with no distinguishing field. + +**Phase to address:** Multi-signer signing flow phase — include signer identity in audit events before the first multi-signer document is tested. --- -## UX Pitfalls +### Pitfall 6: Document Status "Viewed" Conflicts Across Signers -Common user experience mistakes in this domain. 
+**What goes wrong:** +The current GET `/api/sign/[token]` sets `documents.status = 'Viewed'` when any signer opens their link (line 81 of the current route). With two signers, Signer A opens the link → document becomes Viewed. Signer A backs out without signing. Signer B hasn't even opened their link yet. Agent sees "Viewed" status and assumes both signers have engaged. If Signer A then signs, status jumps from Viewed → Signed (via the POST handler), bypassing any intermediate state. The agent has no way to know that Signer B never opened their link. -| Pitfall | User Impact | Better Approach | -|---------|-------------|-----------------| -| Preview opens in a new browser tab as a raw PDF | Agent has no context that this is a preview vs. the final document; no field overlays visible | Display preview in-app with a "PREVIEW — Fields Filled" watermark overlay on each page | -| AI-placed fields shown without a review step | Agent sends a document with misaligned AI fields to a client; client is confused by floating sign boxes | AI placement populates the FieldPlacer UI for agent review — never auto-sends; agent must manually click "Looks good, proceed" | -| "Prepare and Send" button available before the agent has placed any fields | Agent sends a blank document with no signature fields; client has nothing to sign | Disable "Prepare and Send" if `signatureFields` is empty or contains only agent-signature fields (no client fields) | -| Agent saved signature is applied but no visual confirmation is shown | Agent thinks the signature was applied; document arrives unsigned because the apply step silently failed | Show the agent's saved signature PNG in the field placer overlay immediately after apply; require explicit confirmation before the prepare step | -| Preview shows pre-filled text but not field type labels | Agent cannot distinguish a "checkbox" pre-fill from a "text" pre-fill in the visual preview | Show field type badges (small colored labels) on the preview 
overlay, not just the filled content | -| Client signing page shows no progress for non-signature fields (text, checkbox, date) | Client doesn't know they need to fill in text boxes or check checkboxes — sees only signature prompts | The progress bar in `SigningProgressBar.tsx` counts `signatureFields.length` — this must count all client-facing fields, not just signature-type fields | +**How to avoid:** +Per-signer status belongs in `signingTokens`, not in `documents`. Add a `viewedAt TIMESTAMP` column to `signingTokens`. The GET handler sets `signingTokens.viewedAt = NOW()` for the specific token, not `documents.status`. The documents-level status becomes a computed aggregate: `Draft` → `Sent` (any token issued) → `Partially Signed` (some tokens usedAt set) → `Signed` (all tokens usedAt set). Consider adding `Partially Signed` to the `documentStatusEnum`, or compute it in the agent dashboard query. + +**Warning signs:** +- The signing GET handler writes `documents.status = 'Viewed'` instead of `signingTokens.viewedAt = NOW()`. + +**Phase to address:** Multi-signer schema phase — add `viewedAt` to `signingTokens` and derive document status from token states. --- -## "Looks Done But Isn't" Checklist +## Docker/Deployment Pitfalls -Things that appear complete but are missing critical pieces. +### Pitfall 7: NEXT_PUBLIC_BASE_URL Is Baked at Build Time — Wrong URL in Production Container -- [ ] **AI field placement:** Verify the coordinate conversion unit test asserts specific PDF-space x/y values (not just "fields are returned") — eyeball testing will miss Y-axis inversion errors on Utah standard forms. -- [ ] **Expanded field types:** Verify `SigningPageClient.tsx` has a rendering branch for every type in the `SignatureFieldData` type union — not just the new FieldPlacer palette tokens. Check for the default/fallback case. 
-- [ ] **Agent saved signature:** Verify the saved signature is stored as a file path, not a dataURL TEXT column — check the Drizzle schema migration and confirm no `dataUrl` column was added to `users`. -- [ ] **Agent signs first:** Verify that after agent applies their signature, the agent-signature field is embedded into the prepared PDF and removed from the `signatureFields` array that gets sent to the client — not just visually hidden in the FieldPlacer. -- [ ] **Filled preview:** Verify the preview URL changes when fields or text fill values change (cache-busting via timestamp or hash in the path) — open DevTools network tab, modify a field, re-generate preview, confirm a new file is fetched. -- [ ] **Filled preview freshness gate:** Verify the "Send" button is disabled when `lastPreviewGeneratedAt < lastFieldsUpdatedAt` — test by generating a preview, changing a field, and confirming the send button becomes disabled. -- [ ] **OpenAI token limit:** Verify the AI placement works on a real 20-page Utah REPC form, not just a 2-page test PDF — check that page 15+ fields are detected with the same accuracy as page 1. -- [ ] **Schema migration:** Verify that documents created in v1.0 (where `signatureFields` JSONB has entries without a `type` key) are handled gracefully by all v1.1 code paths — add a null-safe fallback for `field.type ?? 'signature'` throughout. +**What goes wrong:** +`send/route.ts` line 35 reads: +```typescript +const baseUrl = process.env.NEXT_PUBLIC_BASE_URL ?? 'http://localhost:3000'; +``` +In Next.js, any variable prefixed `NEXT_PUBLIC_` is substituted at `next build` time — it becomes a string literal in the compiled JavaScript bundle. If the Docker image is built with `NEXT_PUBLIC_BASE_URL=http://localhost:3000` (or not set at all), every signing URL emailed to clients will point to `localhost:3000` regardless of what is set in the container's runtime environment. The client clicks the link and gets "connection refused." 
+ +**This is specific to `NEXT_PUBLIC_*` variables.** Server-only variables (no `NEXT_PUBLIC_` prefix) ARE read at runtime from the container environment. Mixing the two causes precisely the confusion reported in this project. + +**How to avoid:** +For variables that need to be available on the server only (like `BASE_URL` for constructing server-side URLs), remove the `NEXT_PUBLIC_` prefix. `NEXT_PUBLIC_` should only be used for variables that need to reach the browser bundle. The signing URL is constructed in a server-side API route — it does not need `NEXT_PUBLIC_`. Rename to `SIGNING_BASE_URL` (no prefix), read it only in API routes, and inject it into the container environment at runtime via Docker Compose `environment:` block. + +**Warning signs:** +- Signing emails send but clicking the link shows a browser connection error or goes to localhost. +- `NEXT_PUBLIC_BASE_URL` is set in `docker-compose.yml` under `environment:` and the developer assumes this is sufficient — it is not, because the value was already baked in during `docker build`. + +**Phase to address:** Docker deployment phase — rename the variable and audit all `NEXT_PUBLIC_` usages before building the production image. --- -## Recovery Strategies +### Pitfall 8: Uploads Directory Is Lost on Container Restart -When pitfalls occur despite prevention, how to recover. +**What goes wrong:** +All uploaded PDFs, prepared PDFs, and signed PDFs are written to `process.cwd() + '/uploads'`. In the Docker container, `process.cwd()` is the directory where Next.js starts — typically `/app`. The path `/app/uploads` is inside the container's writable layer, which is ephemeral. When the container is stopped and recreated (deployment, crash, `docker compose up --force-recreate`), all PDFs are gone. Signed documents that were legally executed are permanently lost. Clients cannot download their signed copies. The agent loses the audit record. 
-| Pitfall | Recovery Cost | Recovery Steps |
-|---------|---------------|----------------|
-| Client received signing link but signing page crashes on new field types | HIGH | Emergency hotfix: add `field.type ?? 'signature'` fallback in SigningPageClient; deploy; invalidate old token; send new link |
-| AI placed fields are wrong/inverted on first real-form test | LOW | Fix coordinate conversion unit; re-run AI placement for that document; no data migration needed |
-| Agent saved signature stored as dataURL in DB | MEDIUM | Add migration: extract dataURL to file, update path column, nullify dataURL column; existing signed PDFs are unaffected |
-| Preview PDF served stale after field changes | LOW | Add cache-busting query param or timestamp to preview URL; no data changes needed |
-| Agent-signature field appears in client's signing field list | HIGH | Emergency hotfix: filter signatureFields in signing token GET by type; redeploy; affected in-flight signing sessions may need new tokens |
-| Large PDF causes Vercel function OOM during preview generation | MEDIUM | Switch preview to background job + polling; no data migration; existing prepared PDFs are valid |

+**How to avoid:**
+Mount a named Docker volume at `/app/uploads` (or whatever `process.cwd()` resolves to in the container) in `docker-compose.yml`:

```yaml
services:
  app:
    volumes:
      - uploads_data:/app/uploads
volumes:
  uploads_data:
```

Verify the mount path matches `process.cwd()` inside the container — do not assume it is `/app`. Run `docker exec <container> node -e "console.log(process.cwd())"` to confirm. The volume must also be backed up separately; Docker named volumes are not automatically backed up.

**Warning signs:**
- No `volumes:` key appears in `docker-compose.yml` for the app service.
- After a container restart, the agent portal shows documents with no downloadable PDF (the file path in the DB is valid but the file does not exist on disk).
+ +**Phase to address:** Docker deployment phase — establish the volume before any production upload occurs. --- -## Pitfall-to-Phase Mapping +### Pitfall 9: Database Connection String Absent in Container — App Boots but All Queries Fail -How roadmap phases should address these pitfalls. +**What goes wrong:** +`DATABASE_URL` and other secrets (`SIGNING_JWT_SECRET`, `CONTACT_SMTP_HOST`, etc.) are not committed to the repository. In development they are in `.env.local`. In a Docker container, `.env.local` is not automatically copied (`.gitignore` typically excludes it, and `COPY . .` in a Dockerfile may or may not include it depending on `.dockerignore`). If the Docker image is built without the secret baked in (correct practice) but the `docker-compose.yml` does not inject it via `environment:` or `env_file:`, the container starts successfully — `next start` does not validate env vars at startup — but every database query throws "missing connection string" at request time. The agent portal loads its login page (server components that don't query the DB) but crashes on any data operation. 
-| Pitfall | Prevention Phase | Verification | -|---------|------------------|--------------| -| Breaking signing page with new field types (Pitfall 1) | Phase 1: Schema + signing page update | Deploy field type union; confirm signing page renders placeholder for unknown types; load an old v1.0 document with no type field and verify graceful fallback | -| AI coordinate system mismatch (Pitfall 2) | Phase 2: AI integration — coordinate conversion utility | Unit test with a known Utah REPC: assert specific PDF-space x/y for a known field; Y-axis inversion test | -| OpenAI token limits on large PDFs (Pitfall 3) | Phase 2: AI integration — page-by-page pipeline | Test with the longest form Teressa uses (likely 20+ page REPC); verify all pages processed | -| Prompt hallucination and schema incompatibility (Pitfall 4) | Phase 2: AI integration — Zod validation of AI response | Feed an edge-case page (all text, no form fields) and verify AI returns empty array, not hallucinated fields | -| Saved signature as dataURL in DB (Pitfall 5) | Phase 3: Agent saved signature | Confirm Drizzle schema has a path column, not a dataURL column; verify file is stored under UUID path | -| Race condition: agent updates signature mid-signing (Pitfall 6) | Phase 3: Agent saved signature + supersede flow | Confirm "Prepare and Send" on a Sent/Viewed document requires confirmation and invalidates old token | -| Stale preview after field changes (Pitfall 7) | Phase 4: Filled document preview | Modify a field after preview generation; confirm send button disables or preview refreshes | -| OOM on large PDF preview (Pitfall 8) | Phase 4: Filled document preview | Test preview generation on a 20-page REPC; monitor Vercel function memory in dashboard | -| Client signs different doc than agent previewed (Pitfall 9) | Phase 4: Filled document preview | Confirm prepared PDF is hashed at prepare time; verify hash is checked before streaming to client | -| Agent-signature field shown to client (Pitfall 
10) | Phase 3: Agent signing flow | Confirm signing token GET filters `type === 'agent-signature'` fields before returning; test with a document that has both agent and client signature fields | +The `src/lib/db/index.ts` lazy singleton does throw `"DATABASE_URL environment variable is not set"` when first accessed — but this error is silent at startup and only surfaces at first request. + +**How to avoid:** +Create a `.env.production` file (not committed) that is referenced in `docker-compose.yml` via `env_file: .env.production`. Alternatively, use Docker Compose `environment:` blocks with explicit variable names. Validate at container startup by adding a health check endpoint (`/api/health`) that runs `SELECT 1` against the database and returns 200 only when the connection is live. Gate the container's `healthcheck:` on this endpoint so Docker Compose's `depends_on: condition: service_healthy` prevents the app from accepting traffic before the DB is reachable. + +**Warning signs:** +- The login page loads in Docker but the agent portal shows 500 errors on every page. +- `docker logs ` shows "Environment variable DATABASE_URL is not set" at the first request, not at startup. +- The `.env.production` or secrets file is not referenced anywhere in `docker-compose.yml`. + +**Phase to address:** Docker deployment phase — validate all required env vars against a checklist before the first production deploy. + +--- + +### Pitfall 10: PostgreSQL Container and App Container Start in Wrong Order — DB Not Ready + +**What goes wrong:** +`docker compose up` starts all services in parallel by default. The Next.js app container may attempt its first database query before PostgreSQL has accepted connections. Drizzle's `postgres` client (using the `postgres` npm package) throws `ECONNREFUSED` or `ENOTFOUND` on the first query. 
The app container may crash-loop if the error is unhandled at startup, or silently return 500s until the DB is ready if queries are only made at request time. + +**How to avoid:** +Add `depends_on` with `condition: service_healthy` in `docker-compose.yml`. The PostgreSQL service needs a `healthcheck:` using `pg_isready`: + +```yaml +services: + db: + healthcheck: + test: ["CMD-SHELL", "pg_isready -U postgres"] + interval: 5s + timeout: 5s + retries: 5 + app: + depends_on: + db: + condition: service_healthy +``` + +Also run Drizzle migrations as part of app startup (add `drizzle-kit migrate` to the container's `command:` or an entrypoint script) so the schema is applied before the first request. Without this, a fresh deployment against an empty database will fail on every query. + +**Warning signs:** +- `docker-compose.yml` has no `healthcheck:` on the database service. +- `docker-compose.yml` has no `depends_on` on the app service. + +**Phase to address:** Docker deployment phase — write the complete `docker-compose.yml` with health checks before the first production deploy. + +--- + +### Pitfall 11: Neon Connection Pool Exhaustion in Docker + +**What goes wrong:** +`src/lib/db/index.ts` creates a `postgres(url)` client with no explicit `max` parameter. The `postgres` npm package defaults to `max: 10` connections per process. Neon's free tier allows 10 concurrent connections total. One Next.js container with default settings exhausts the entire connection budget. A second container (staging + production running simultaneously, or a restart overlap) causes all new queries to queue indefinitely until connections are freed, manifesting as timeouts on every request. + +Additionally, the current proxy-singleton pattern in `db/index.ts` creates one pool per Node.js process. Next.js in development mode can hot-reload modules, creating multiple pool instances per dev session. 
In production this is not a problem, but it can silently leak connections during CI test runs or development stress tests. + +**Why it happens:** +The `postgres` npm package does not warn when connection limits are exceeded — it silently queues queries. The Neon dashboard shows connection count; the app shows only request timeouts with no clear error. + +**How to avoid:** +Set an explicit `max` connection limit appropriate for the deployment. For a single-container deployment against Neon free tier (10 connection limit), use `postgres(url, { max: 5 })` to leave headroom for migrations, admin queries, and overlap during deployments. For paid Neon tiers, scale accordingly. Add `idle_timeout: 20` (seconds) to release idle connections promptly. Add `connect_timeout: 10` to surface connection failures quickly rather than queuing indefinitely. + +Recommended `db/index.ts` configuration: +```typescript +const client = postgres(url, { + max: 5, // conservative for Neon free tier; increase with paid plan + idle_timeout: 20, // release idle connections within 20s + connect_timeout: 10, // fail fast if Neon is unreachable +}); +``` + +**Warning signs:** +- `postgres(url)` called with no second argument in `db/index.ts`. +- Neon dashboard shows connection count at ceiling during normal single-user usage. +- Requests time out with no database error in logs — only generic "fetch failed" errors. + +**Phase to address:** Docker deployment phase — configure connection pool limits before the first production deploy. + +--- + +### Pitfall 12: @napi-rs/canvas Native Binary — Wrong Platform in Docker Image + +**What goes wrong:** +`@napi-rs/canvas` is declared in `serverExternalPackages` in `next.config.ts`, which tells Next.js to load it as a native Node.js module rather than bundling it. The package ships pre-compiled `.node` binary files for specific platforms (darwin-arm64, linux-x64-gnu, linux-arm64-gnu, etc.). 
When `npm install` runs on an Apple Silicon Mac during development, npm downloads the `darwin-arm64` binary. If the Docker image is built by running `npm install` inside a `node:alpine` container (which is `linux-musl`, not `linux-gnu`), the `linux-x64-musl` binary is selected — and if `@napi-rs/canvas` does not ship a musl build for the installed version, the canvas module fails to load at runtime with `Error: /app/node_modules/@napi-rs/canvas/...node: invalid ELF header`.

Even if the Docker base image is `node:20-slim` (Debian, linux-gnu), building on an ARM host and deploying to an x86 server results in the wrong binary unless the `--platform` flag is used during `docker build`.

**How to avoid:**
Always build the Docker image with an explicit platform target matching the production host:
```bash
docker build --platform linux/amd64 -t app .
```

Use `node:20-slim` (Debian-based, glibc) as the Docker base image — not `node:20-alpine` (musl). Verify the canvas module loads in the container before deploying:
```bash
docker exec <container> node -e "require('@napi-rs/canvas'); console.log('canvas OK')"
```

If developing on ARM and deploying to x86, add `--platform linux/amd64` to the `docker build` command in the deployment runbook and CI pipeline.

**Warning signs:**
- `next.config.ts` lists `@napi-rs/canvas` in `serverExternalPackages`.
- Docker base image is `node:alpine`.
- The build machine architecture differs from the deployment target.
- Runtime error: `invalid ELF header` or `Cannot find module '@napi-rs/canvas'` after a clean image build.

**Phase to address:** Docker deployment phase — verify canvas module compatibility before the first production build.

---

## Email/SMTP Pitfalls

### Pitfall 13: SMTP Env Vars Absent in Container — Root Cause of Reported Email Breakage

**What goes wrong:**
This is the reported issue: email worked in development but broke when deployed to Docker.
The most likely root cause is that `CONTACT_SMTP_HOST`, `CONTACT_SMTP_PORT`, `CONTACT_EMAIL_USER`, `CONTACT_EMAIL_PASS`, and `AGENT_EMAIL` are not present in the container environment. `signing-mailer.tsx` reads these in `createTransporter()` which is called at send time (not at module load) — so the missing env vars do not cause a startup error. The first signing email attempt fails with Nodemailer throwing `connect ECONNREFUSED` (if the host resolves to nothing) or `Invalid login` (if credentials are absent).

**Why it looks like a DNS problem but isn't:**
Docker containers on a bridge network use the host's DNS resolver (or Docker's embedded resolver) and can reach external SMTP servers by hostname without any special configuration. The SMTP server (`CONTACT_SMTP_HOST`) is an external service (e.g., Mailgun, SendGrid, or a personal SMTP relay) — Docker does not change its reachability. The error is an env var injection failure, not DNS.

**Verification steps before attempting the Docker fix:**
1. `docker exec <container> printenv CONTACT_SMTP_HOST` — if empty, the env var is missing.
2. `docker exec <container> node -e "const n = require('nodemailer'); n.createTransport({host: process.env.CONTACT_SMTP_HOST, port: 465, secure: true, auth: {user: process.env.CONTACT_EMAIL_USER, pass: process.env.CONTACT_EMAIL_PASS}}).verify(console.log)"` — tests SMTP connectivity from inside the container.

**How to avoid:**
Include all SMTP variables in the `env_file:` or `environment:` block of the app service in `docker-compose.yml`. Use an `.env.production` file that is manually provisioned on the Docker host (not committed). Consider using Docker secrets (mounted files) for the SMTP password rather than environment variables if the host is shared.

**Warning signs:**
- `docker exec <container> printenv CONTACT_SMTP_HOST` returns empty.
- Signing emails silently fail with no error until the first send attempt.
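A fail-fast guard can surface the missing variables at startup instead of at the first send attempt. A minimal sketch — the helper names are assumptions, not existing code, but the variable names match the list above:

```typescript
// All SMTP-related variables the mailers read. Keeping this list in one
// place means both mailers and the startup check agree on requirements.
const REQUIRED_SMTP_VARS = [
  'CONTACT_SMTP_HOST',
  'CONTACT_SMTP_PORT',
  'CONTACT_EMAIL_USER',
  'CONTACT_EMAIL_PASS',
  'AGENT_EMAIL',
] as const;

// Pure check: which required variables are absent or blank?
function missingSmtpVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_SMTP_VARS.filter((name) => !env[name]?.trim());
}

// Throw early (e.g. from an instrumentation hook or the mailer module's
// top level) instead of failing silently on the first send.
function assertSmtpEnv(env: Record<string, string | undefined> = process.env): void {
  const missing = missingSmtpVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing SMTP env vars: ${missing.join(', ')}`);
  }
}
```

With this in place, a container started without `env_file:` injection crashes immediately with an explicit message in `docker logs`, rather than appearing healthy until the first signing email.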
+ +**Phase to address:** Docker deployment phase — SMTP env var verification is the first check in the deployment runbook. + +--- + +### Pitfall 14: Nodemailer Transporter Created With Mismatched Port and TLS Settings + +**What goes wrong:** +`signing-mailer.tsx` contains: +```typescript +port: Number(process.env.CONTACT_SMTP_PORT ?? 465), +secure: Number(process.env.CONTACT_SMTP_PORT ?? 465) === 465, +``` +`contact-mailer.ts` contains: +```typescript +port: Number(process.env.CONTACT_SMTP_PORT ?? 587), +secure: false, // STARTTLS on port 587 +``` +The two mailers use different defaults for the same env var. If `CONTACT_SMTP_PORT` is not set in the container, the signing mailer assumes port 465 (TLS), but the contact form mailer assumes port 587 (STARTTLS). If the SMTP provider only supports one of these, one mailer will connect and the other will time out. The mismatch is invisible until both code paths are exercised in production. + +**How to avoid:** +Require `CONTACT_SMTP_PORT` explicitly — remove the fallback defaults and add a startup validation check that throws if this variable is missing. Use a single `createSmtpTransporter()` utility function shared by both mailers, not two separate inline `createTransport()` calls with different defaults. Document the required env var values in a `DEPLOYMENT.md` or the `docker-compose.yml` comments. + +**Warning signs:** +- Two separate inline `createTransport()` calls with different `port` defaults for the same env var. +- Only one of the two email paths (signing email vs. contact form) is tested in Docker. + +**Phase to address:** Docker deployment phase — consolidate SMTP transporter creation before the first production email test. + +--- + +### Pitfall 15: Multi-Signer Email Loop Fails Halfway — No Partial-Send Recovery + +**What goes wrong:** +When sending to three signers, the send route will loop: create token 1, email Signer 1, create token 2, email Signer 2, create token 3, email Signer 3. 
If email to Signer 2 fails (SMTP timeout, invalid address), tokens 1 and 3 may still be created in the database but Signer 3 never receives their email. The document is now in an inconsistent state: tokens exist for recipients who were never emailed. Signer 1 signs, completion detection counts 2 remaining unclaimed tokens (Signers 2 and 3 never signed), document never reaches "Signed." + +**How to avoid:** +Create all tokens before sending any emails. Wrap token creation in a transaction — if any token INSERT fails, roll back all tokens and return an error before any emails are sent. Send emails outside the transaction (SMTP is not transactional). If an email send fails, mark that token as `superseded` (add a `supersededAt` column to `signingTokens`) rather than deleting it, and surface the partial-send failure to the agent with a "resend to failed recipients" option. Never leave unclaimed tokens orphaned by partial email failure. + +**Warning signs:** +- The send loop interleaves token creation and email sending (create token 1, send email 1, create token 2, send email 2...) rather than creating all tokens atomically first. + +**Phase to address:** Multi-signer send phase — design the send loop with transactional token creation from the start. + +--- + +## PDF Assembly Pitfalls + +### Pitfall 16: Final PDF Assembly Runs Multiple Times — Duplicate Signed PDFs + +**What goes wrong:** +Completion detection triggers PDF assembly (merging all signer contributions into one final PDF). If the race condition guard (Pitfall 2) is not in place, assembly runs twice. Even with the guard, if the assembly function crashes partway through and the `completionTriggeredAt` was already set, there is no way to retry assembly — the guard prevents re-entry and the document is stuck with no signed PDF. + +**How to avoid:** +Separate the "completion triggered" flag from the "signed PDF ready" flag. 
Add both `completionTriggeredAt TIMESTAMP` (prevents double-triggering) and `signedFilePath TEXT` (set only when PDF is successfully written). If `completionTriggeredAt` is set but `signedFilePath` is null after 60 seconds, an admin retry endpoint can reset `completionTriggeredAt` to null to allow re-triggering. The existing atomic rename pattern (`tmp → final`) in `embed-signature.ts` already prevents partial PDF corruption — preserve this in the multi-signer assembly code. + +**Warning signs:** +- Only a single flag (`completionTriggeredAt`) is used to track both triggering and completion. +- No retry mechanism exists for a stuck assembly. + +**Phase to address:** Multi-signer completion phase — implement idempotent assembly with separate trigger and completion flags. + +--- + +### Pitfall 17: Multi-Signer Final PDF — Which Prepared PDF Is the Base? + +**What goes wrong:** +In the current single-signer flow, `embedSignatureInPdf` reads from `doc.preparedFilePath` (the agent-prepared PDF with text fills and agent signatures already embedded) and writes to `_signed.pdf`. With multiple signers, each signer's signature needs to be embedded sequentially onto the same prepared PDF base. If two handlers run concurrently and both read from `preparedFilePath`, modify it in memory, and write independent output PDFs, the final "merge" step needs a different strategy — you cannot simply append two separately-signed PDFs into one document without losing the shared base. + +**How to avoid:** +The correct architecture for multi-signer PDF assembly: + +1. Each signer's POST handler embeds only that signer's signatures into an intermediate file: `{docId}_partial_{signerEmail_hash}.pdf`. This intermediate file is written atomically (tmp → rename). It is NOT the final document. +2. 
When completion is triggered (all tokens claimed), a single assembly function reads the prepared PDF once, iterates all signers' signature data (from DB or intermediate files), embeds all signatures in one pass, and writes `{docId}_signed.pdf`. +3. The `pdfHash` is computed only from the final assembled PDF, not from any intermediate. + +This avoids the read-modify-write race entirely. Intermediate files are cleaned up after successful final assembly. + +**Warning signs:** +- Each signer's POST handler directly writes to `_signed.pdf` rather than an intermediate file. +- The final assembly step reads from two separately-signed PDF files and tries to merge them. + +**Phase to address:** Multi-signer completion phase — establish the intermediate file pattern before any signing submission code is written. + +--- + +### Pitfall 18: Temp File Accumulation on Failed Assemblies + +**What goes wrong:** +The current code already creates a temp file during date stamping (`preparedAbsPath.datestamped.tmp`) and cleans it up with `unlink().catch(() => {})`. Multi-signer assembly will create intermediate partial files. If the assembly handler crashes between writing intermediates and producing the final PDF, those temp files are never cleaned up. Over time, the `uploads/` directory fills with orphaned intermediate files. On the home Docker server with limited disk, this causes write failures on new documents. + +**How to avoid:** +Name all intermediate and temp files with a recognizable pattern (`*.tmp`, `*_partial_*.pdf`). Add a periodic cleanup job (a Next.js route called by a cron or a simple setInterval in a route handler) that deletes `*.tmp` and `*_partial_*.pdf` files older than 24 hours. Log a warning when cleanup finds orphaned files — this surfaces incomplete assemblies that need investigation. + +**Warning signs:** +- The `uploads/` directory grows unbounded over time. +- Partial files from failed assemblies remain after a document is marked Signed. 
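The selection step of that cleanup job can be kept pure and unit-testable. A sketch, assuming the naming conventions above (`*.tmp`, `{docId}_partial_{signerEmail_hash}.pdf`) and directory entries gathered elsewhere via `fs.promises.readdir` plus `stat`:

```typescript
// Minimal directory-entry shape; in practice populated by listing the
// uploads directory and stat-ing each file.
interface UploadEntry {
  name: string;
  mtimeMs: number; // last-modified time, ms since epoch
}

// Matches the intermediate-file naming convention described above.
const ORPHAN_PATTERN = /(\.tmp|_partial_[^/]*\.pdf)$/;
const MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours

// Pure selection: which files should the cleanup pass delete (and log)?
function orphanedIntermediates(entries: UploadEntry[], nowMs: number): string[] {
  return entries
    .filter((e) => ORPHAN_PATTERN.test(e.name) && nowMs - e.mtimeMs > MAX_AGE_MS)
    .map((e) => e.name);
}
```

Separating "decide what to delete" from "delete it" makes the age threshold and patterns trivially testable, and the deleting caller can log each orphan as the warning signal described above.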
+
+**Phase to address:** Multi-signer completion phase — add cleanup alongside the assembly logic.
+
+---
+
+## Security Pitfalls
+
+### Pitfall 19: Multiple Tokens Per Document — Token Enumeration Attack
+
+**What goes wrong:**
+In the single-signer system, one token is issued per document. An attacker who intercepts or guesses a token can sign one document. With multi-signer, multiple tokens are issued for the same document. If token generation uses a predictable pattern (e.g., sequential IDs, short UUIDs, or low-entropy random values), an attacker who holds one valid token for a document can enumerate sibling tokens for the same document by brute-forcing nearby values.
+
+**Current state:** `createSigningToken` uses `crypto.randomUUID()` for the JTI and `SignJWT` with HS256. UUID v4 provides 122 bits of randomness — sufficient. The risk is theoretical given current implementation but becomes concrete if the JTI generation is ever changed.
+
+**How to avoid:**
+Keep using `crypto.randomUUID()` for JTI. Do not add any sequential or human-readable component to the JTI. Ensure the JWT is verified before the JTI is looked up in the database — `verifySigningToken()` already does this (JWT signature check first, then DB lookup). Add rate limiting on the signing GET and POST endpoints: a cap of roughly 10 requests per IP per minute prevents brute-force attempts. Log and alert on `status: 'invalid'` responses that repeat from the same IP.
+
+**Warning signs:**
+- JTI generation switches from `crypto.randomUUID()` to a sequential or short-UUID pattern.
+- No rate limiting exists on `/api/sign/[token]` GET or POST.
+
+**Phase to address:** Multi-signer send phase — add rate limiting before issuing multiple tokens per document.
+
+---
+
+### Pitfall 20: Token Shared Between Signers — Signer A Uses Signer B's Token
+
+**What goes wrong:**
+With multi-signer, the system issues separate tokens per signer email.
But the signing GET handler at line 90 currently returns ALL client-visible fields (filtered by `isClientVisibleField`), not fields tagged to the specific signer. If Signer A somehow obtains Signer B's token (e.g., email forward, shared email account, phishing), Signer A sees Signer B's fields and can sign them. In real estate, this is equivalent to signing another party's name on a contract — a serious legal issue. + +The signing POST handler (lines 210-213) filters `signableFields` to all `client-signature` and `initials` fields for the entire document — it does not restrict by signer. A cross-token submission would succeed server-side. + +**How to avoid:** +After multi-signer is implemented, the signing GET handler must filter `signatureFields` by `field.signerEmail === tokenRow.signerEmail`. The signing POST handler must verify that the field IDs in the `signatures` request body correspond only to fields tagged to `tokenRow.signerEmail` — reject any submission that includes field IDs not assigned to that signer. This is a server-side enforcement, not a UI concern. + +**Warning signs:** +- The signing GET handler's `signatureFields` filter does not include a `signerEmail` check. +- The signing POST handler's `signableFields` filter does not restrict by `signerEmail`. + +**Phase to address:** Multi-signer signing flow phase — add signer-field binding validation to both GET and POST handlers. + +--- + +### Pitfall 21: Completion Notification Email Sent to Wrong Recipients + +**What goes wrong:** +The current `sendAgentNotificationEmail` sends to `process.env.AGENT_EMAIL`. In multi-signer, the requirement is to send the final merged PDF to all signers AND the agent when completion occurs. If the recipient list is derived from `documents.emailAddresses` (the JSONB array collected at prepare time), and that array is stale (e.g., the agent changed a signer's email between prepare and send), the final PDF goes to the old address. 
+
+A worse variant: if `emailAddresses` contains CC addresses that are NOT signers (e.g., a title company contact), those recipients receive the completed PDF immediately — before the agent has reviewed it. For a solo agent workflow, this is likely acceptable, but it should be explicit.
+
+**How to avoid:**
+Derive the final recipient list from `signingTokens.signerEmail` (the authoritative record of who was actually sent a token), not from `documents.emailAddresses`. Separate "recipients who receive the signing link" from "recipients who receive the completed PDF" explicitly in the data model. The agent should review the final recipient list at send time.
+
+**Warning signs:**
+- The completion handler derives email recipients from `documents.emailAddresses` rather than `signingTokens.signerEmail`.
+
+**Phase to address:** Multi-signer send phase — establish the recipient derivation rule before tokens are issued.
+
+---
+
+### Pitfall 22: Signing Token Issued But Document Re-Prepared — Token Points to Stale PDF
+
+**What goes wrong:**
+v1.1 introduced a guard: Draft-only documents can be AI-prepared (`ai-prepare/route.ts` line 37: `if (doc.status !== 'Draft') return 403`). But `prepare/route.ts` (which calls `preparePdf` and writes `_prepared.pdf`) does not have an equivalent guard — a Sent document can be re-prepared if the agent POSTs to `/api/documents/{id}/prepare` directly. With multi-signer, if any token has been issued (even if no signer has used it), re-preparing the document overwrites `_prepared.pdf` and changes `preparedFilePath`. Signers who have already received their token will open the signing page and load the new prepared PDF — which may have different text fills, field positions, or the agent's new signature — not what was legally sent to them.
+
+**How to avoid:**
+Add a guard to `prepare/route.ts`: if `signingTokens` has any row for this document with `usedAt IS NULL` (any token still outstanding), reject the prepare request with `409 Conflict: "Cannot re-prepare a document with outstanding signing tokens."` If the agent genuinely needs to change the document, they must first void all outstanding tokens (supersede them) and issue new ones.
+
+**Warning signs:**
+- `prepare/route.ts` has no check against the `signingTokens` table before writing `_prepared.pdf`.
+
+**Phase to address:** Multi-signer send phase — add the outstanding-token guard to the prepare route before multi-signer send is implemented.
+
+---
+
+### Pitfall 23: @vercel/blob Is Installed But Not Used — Risk of Accidental Use
+
+**What goes wrong:**
+`package.json` lists `@vercel/blob` as a production dependency. No file in the codebase imports or uses it. The package provides a Vercel-hosted blob storage client that requires `BLOB_READ_WRITE_TOKEN` to be set in the environment. If any future code accidentally imports from `@vercel/blob` instead of using the local filesystem path utilities, it will fail in Docker because no `BLOB_READ_WRITE_TOKEN` exists in a non-Vercel environment; if a token were ever supplied, it would instead route file storage through Vercel's infrastructure rather than the local volume. Either way, signed PDF storage breaks.
+
+**Why it happens:**
+`@vercel/blob` may have been installed during initial scaffolding when Vercel deployment was considered. It was never wired up. Its presence in `package.json` is a footgun.
+
+**How to avoid:**
+Remove `@vercel/blob` from `package.json` and run `npm install` before building the Docker image. If Vercel deployment is ever considered in the future, re-add it intentionally with a clear decision to migrate storage. Until then, its presence is a liability.
+
+**Warning signs:**
+- `@vercel/blob` appears in `package.json` dependencies but `grep -r "@vercel/blob"` finds no usage in `src/`.
+- Any new code imports from `@vercel/blob` without an explicit architectural decision to use it. + +**Phase to address:** Docker deployment phase — remove the unused dependency before building the production image. + +--- + +## Prevention Checklist + +Group by phase for the roadmap planner. + +### Multi-Signer Schema Phase +- [ ] Add `signerEmail TEXT NOT NULL` to `signingTokens` (with backfill migration for v1.1 rows) +- [ ] Add `viewedAt TIMESTAMP` to `signingTokens` +- [ ] Add `completionTriggeredAt TIMESTAMP` to `documents` +- [ ] Add `Partially Signed` to `documentStatusEnum` or compute from token states +- [ ] Freeze `signatureFields` JSONB after tokens are issued (document invariant, enforced in prepare route) +- [ ] Document the invariant: `signingTokens.signerEmail` is the source of truth for recipient list + +### Multi-Signer Send Phase +- [ ] Wrap all token creation in a single DB transaction; send emails after commit +- [ ] Add outstanding-token guard to `prepare/route.ts` (409 if any unclaimed token exists) +- [ ] Derive final PDF recipient list from `signingTokens.signerEmail`, not `emailAddresses` +- [ ] Add rate limiting to signing GET and POST endpoints + +### Multi-Signer Signing Flow Phase +- [ ] Filter `signatureFields` by `field.signerEmail === tokenRow.signerEmail` in signing GET +- [ ] Validate submitted field IDs against signer's assigned fields in signing POST +- [ ] Include `signerEmail` in `signature_submitted` audit event metadata +- [ ] Completion detection: count unclaimed tokens in same transaction as token claim + +### Multi-Signer Completion Phase +- [ ] Race condition guard: `UPDATE documents SET completion_triggered_at = NOW() WHERE completion_triggered_at IS NULL` +- [ ] Assemble final PDF in one pass from prepared PDF base (not by merging two separately-signed files) +- [ ] Set `signedFilePath` only after successful atomic rename of final assembled PDF +- [ ] Compute `pdfHash` only from final assembled PDF +- [ ] Clean up 
intermediate `_partial_*.pdf` files after successful assembly
+- [ ] Add periodic orphaned-temp-file cleanup
+
+### Docker Deployment Phase
+- [ ] Rename `NEXT_PUBLIC_BASE_URL` → `SIGNING_BASE_URL` (server-only var, no NEXT_PUBLIC_ prefix)
+- [ ] Audit all remaining `NEXT_PUBLIC_*` usages — confirm each one genuinely needs browser access
+- [ ] Mount named Docker volume at `process.cwd() + '/uploads'` (verify path inside container first)
+- [ ] Create `.env.production` on Docker host with all required secrets; reference in `docker-compose.yml`
+- [ ] Add `CONTACT_SMTP_PORT` as required env var; remove fallback defaults from both mailers
+- [ ] Consolidate SMTP transporter into a shared `createSmtpTransporter()` utility
+- [ ] Add PostgreSQL `healthcheck` + app `depends_on: condition: service_healthy`
+- [ ] Add Drizzle migration to container startup (before `next start`)
+- [ ] Add `/api/health` endpoint that runs `SELECT 1` + checks `DATABASE_URL` + checks `CONTACT_SMTP_HOST`
+- [ ] Verify SMTP connectivity from inside container before first production deploy
+- [ ] Configure `postgres(url, { max: 5, idle_timeout: 20, connect_timeout: 10 })` for Neon free tier
+- [ ] Build Docker image with `--platform linux/amd64` when deploying to x86_64 Linux
+- [ ] Use `node:20-slim` (Debian glibc) as base image — not `node:alpine` (musl)
+- [ ] Verify `@napi-rs/canvas` loads in container: `node -e "require('@napi-rs/canvas')"`
+- [ ] Remove `@vercel/blob` from `package.json` dependencies
+
+### Verification (Do Not Skip)
+- [ ] Test a two-signer document where both signers submit within 1 second of each other — confirm one PDF, one notification, one `signedAt`
+- [ ] Restart the Docker container and confirm all previously-uploaded PDFs are still accessible
+- [ ] Confirm clicking a signing link emailed from Docker opens the correct production URL (not localhost)
+- [ ] Confirm `docker exec <app-container> printenv CONTACT_SMTP_HOST` returns the expected value
+- [ ] Test a v1.1
(single-signer) document after migration — confirm existing tokens still work
+- [ ] Confirm Neon connection count stays below 7 during normal usage (check Neon dashboard)
+- [ ] Confirm canvas module loads: `docker exec <app-container> node -e "require('@napi-rs/canvas'); console.log('OK')"`
+
+---
+
+## Phase-Specific Warning Summary
+
+| Phase Topic | Likely Pitfall | Mitigation |
+|-------------|---------------|------------|
+| signingTokens schema change | NOT NULL constraint breaks existing token rows | Backfill migration with client email JOIN |
+| Multi-signer send loop | Partial email failure orphans tokens | Transactional token creation, separate from email sends |
+| Completion detection | First signer marks document Signed | Count unclaimed tokens inside transaction before marking |
+| Concurrent completion | Two handlers both run final assembly | `completionTriggeredAt` one-time-set guard |
+| Docker build | NEXT_PUBLIC_BASE_URL baked into bundle | Remove NEXT_PUBLIC_ prefix for server-only URL |
+| Docker volumes | Uploads lost on container recreate | Named volume mounted at uploads path |
+| Docker secrets | SMTP env vars absent in container | env_file in compose, verify with printenv |
+| PostgreSQL startup | App queries before DB is ready | service_healthy depends_on + pg_isready healthcheck |
+| Neon connection pool | Default 10 connections saturates free tier | Set max: 5 with idle_timeout and connect_timeout |
+| Native module in Docker | @napi-rs/canvas wrong platform binary | --platform linux/amd64 + node:20-slim base image |
+| Unused dependency | @vercel/blob accidentally used in new code | Remove from package.json before Docker build |
+| Final PDF assembly | Signer PDFs assembled by merging two separate files | Single-pass assembly from prepared PDF base |
+| Signer identity in audit | Two signature_submitted events indistinguishable | signerEmail in audit event metadata |

---

## Sources

-- Reviewed `src/lib/db/schema.ts` — `SignatureFieldData` has no
`type` field; confirmed by inspection 2026-03-21 -- Reviewed `src/app/sign/[token]/_components/SigningPageClient.tsx` — confirmed all fields open signature modal; no type branching -- Reviewed `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` — confirmed single "Signature" token; `screenToPdfCoords` function confirms Y-axis inversion pattern -- Reviewed `src/lib/signing/embed-signature.ts` — confirms `@cantoo/pdf-lib` import; PNG-only embed -- Reviewed `src/lib/pdf/prepare-document.ts` — confirms AcroForm flatten-first ordering; text stamp fallback -- Reviewed `src/app/api/sign/[token]/route.ts` — confirmed `signatureFields: doc.signatureFields ?? []` sends unfiltered fields to client (line 88) -- Reviewed `src/app/portal/(protected)/documents/[docId]/_components/PreparePanel.tsx` — no guard against re-preparation of Sent/Viewed documents -- [OpenAI Vision API Token Counting](https://platform.openai.com/docs/guides/vision#calculating-costs) — image token costs confirmed; LOW tile = 85 tokens, HIGH tile adds detail tokens per 512px tile -- [OpenAI Structured Output (JSON Schema mode)](https://platform.openai.com/docs/guides/structured-outputs) — `json_schema` mode confirmed as more reliable than `json_object` for typed responses -- [Vercel Serverless Function Limits](https://vercel.com/docs/functions/runtimes/node-js#memory-and-compute) — 256MB default, 1024MB on Pro; 60s max execution on Pro -- `@cantoo/pdf-lib` confirmed as the import used (not `@pdfme/pdf-lib` or `pdf-lib`) — v1.0 codebase uses this fork throughout +- Reviewed `src/lib/db/schema.ts` — confirmed `signingTokens` has no `signerEmail`; `documentStatusEnum` has no partial state; `SignatureFieldData` has no signer tag; 2026-04-03 +- Reviewed `src/app/api/sign/[token]/route.ts` — confirmed completion marks document Signed unconditionally at line 254; confirmed `isClientVisibleField` filter at line 90; confirmed `signableFields` filter does not restrict by signer at lines 210-213 
+- Reviewed `src/app/api/documents/[id]/send/route.ts` — confirmed single token creation, single recipient +- Reviewed `src/app/api/documents/[id]/prepare/route.ts` — confirmed no guard against re-preparation of Sent documents +- Reviewed `src/lib/signing/signing-mailer.tsx` — confirmed `createTransporter()` per send (healthy), confirmed `CONTACT_SMTP_PORT` defaults differ from `contact-mailer.ts` +- Reviewed `src/lib/signing/token.ts` — confirmed `crypto.randomUUID()` JTI generation (sufficient entropy) +- Reviewed `src/lib/signing/embed-signature.ts` — confirmed atomic rename pattern (`tmp → final`) +- Reviewed `src/lib/db/index.ts` — confirmed `postgres(url)` with no `max` parameter; Proxy singleton pattern; lazy initialization +- Reviewed `next.config.ts` — confirmed `serverExternalPackages: ['@napi-rs/canvas']` +- Reviewed `package.json` — confirmed `@vercel/blob` present in dependencies; confirmed `postgres` npm package in use; confirmed `node:` not specified in package engines +- [Next.js Environment Variables — Build-time vs Runtime](https://nextjs.org/docs/app/building-your-application/configuring/environment-variables) — NEXT_PUBLIC_ vars inlined at build time; confirmed in Next.js 15 docs +- [Docker Compose healthcheck + depends_on](https://docs.docker.com/compose/how-tos/startup-order/) — `service_healthy` condition requires explicit healthcheck definition +- [Nodemailer: SMTP port and TLS](https://nodemailer.com/smtp/) — port 465 = implicit TLS (`secure: true`), port 587 = STARTTLS (`secure: false`); mismatch causes connection timeout +- [postgres npm package documentation](https://github.com/porsager/postgres) — default `max: 10` connections per client instance; `idle_timeout` and `connect_timeout` options +- [Neon connection limits](https://neon.tech/docs/introduction/plans) — free tier: 10 concurrent connections; paid tiers increase this +- [@napi-rs/canvas supported platforms](https://github.com/Brooooooklyn/canvas#support-matrix) — no musl 
(Alpine) builds published; requires glibc (Debian/Ubuntu) base image --- -*Pitfalls research for: Teressa Copeland Homes — v1.1 AI field placement, expanded field types, agent signing, filled preview* -*Researched: 2026-03-21* +*Pitfalls research for: Teressa Copeland Homes — v1.2 multi-signer and Docker deployment* +*Researched: 2026-04-03* +*Previous v1.1 pitfalls (AI field placement, expanded field types, agent signing, filled preview) documented in git history — superseded by this file for v1.2 planning. The v1.1 pitfalls are assumed addressed; recovery strategies from that document remain valid if regressions occur.* diff --git a/.planning/research/STACK.md b/.planning/research/STACK.md index 3ea2863..551d3a1 100644 --- a/.planning/research/STACK.md +++ b/.planning/research/STACK.md @@ -1,13 +1,16 @@ -# Stack Research +# Technology Stack -**Domain:** Real estate agent website + PDF document signing web app -**Researched:** 2026-03-21 -**Confidence:** HIGH (versions verified via npm registry; integration issues verified via official GitHub issues) -**Scope:** v1.1 additions only — OpenAI integration, expanded field types, agent signature storage, filled preview +**Project:** teressa-copeland-homes +**Researched (v1.1):** 2026-03-21 | **Updated (v1.2):** 2026-04-03 +**Overall confidence:** HIGH --- -## Existing Stack (Do Not Re-research) +## v1.1 Stack Research (retained — do not re-research) + +**Scope:** OpenAI integration, expanded field types, agent signature storage, filled preview + +### Existing Stack (Do Not Re-research) Already validated and in `package.json`. Do not change these. @@ -22,12 +25,18 @@ Already validated and in `package.json`. Do not change these. 
| `@vercel/blob` | ^2.3.1 | File storage | | Drizzle ORM + `postgres` | ^0.45.1 / ^3.4.8 | Database | | Auth.js (next-auth) | 5.0.0-beta.30 | Authentication | +| `nodemailer` | ^7.0.13 | Transactional email (SMTP) | +| `@react-email/components` | ^1.0.10 | Typed React email templates | +| `@react-email/render` | ^2.0.4 | Server-side renders React email to HTML | +| `@dnd-kit/core` + `@dnd-kit/utilities` | ^6.3.1 / ^3.2.2 | Drag-drop field placement UI | +| `pdfjs-dist` | (bundled in node_modules) | PDF text extraction for AI pipeline — uses `legacy/build/pdf.mjs` to handle Node 20 | +| `@napi-rs/canvas` | ^0.1.97 | Native canvas bindings (Node.js) — used server-side for canvas operations; architecture-specific prebuilt binary | ---- +**Note on `unpdf`:** The v1.1 research recommended `unpdf` as a safer serverless wrapper around PDF.js, but the implemented code uses `pdfjs-dist/legacy/build/pdf.mjs` directly with `GlobalWorkerOptions.workerSrc` pointing to the local worker file. `unpdf` is NOT in `package.json` and was not installed. Do not add it — the existing `pdfjs-dist` integration is working. -## New Stack Additions for v1.1 +### New Stack Additions for v1.1 -### Core New Dependency: OpenAI API +#### Core New Dependency: OpenAI API | Technology | Version | Purpose | Why Recommended | |------------|---------|---------|-----------------| @@ -35,55 +44,47 @@ Already validated and in `package.json`. Do not change these. **No other new core dependencies are needed.** The remaining v1.1 features extend capabilities already in `@cantoo/pdf-lib`, `signature_pad`, and `react-pdf`. ---- - -### Supporting Libraries +#### Supporting Libraries | Library | Version | Purpose | When to Use | |---------|---------|---------|-------------| -| `unpdf` | ^1.4.0 | Server-side PDF text extraction | Use in the AI pipeline API route to extract raw text from PDF pages before sending to OpenAI. Serverless-compatible, wraps PDF.js v5, works in Next.js API routes without native bindings. 
More reliable in serverless than `pdfjs-dist` directly. | +| `unpdf` | NOT INSTALLED — see note above | Originally recommended, not used | Do not add | No other new supporting libraries needed. See "What NOT to Add" below. ---- - -### Development Tools +#### Development Tools No new dev tooling required for v1.1 features. ---- - -## Installation +### Installation ```bash # New dependencies for v1.1 -npm install openai unpdf +npm install openai ``` That is the full installation delta for v1.1. ---- +### Feature-by-Feature Integration Notes -## Feature-by-Feature Integration Notes - -### Feature 1: OpenAI PDF Analysis + Field Placement +#### Feature 1: OpenAI PDF Analysis + Field Placement **Flow:** 1. API route receives document ID 2. Fetch PDF bytes from Vercel Blob (`@vercel/blob` — already installed) -3. Extract text per page using `unpdf`: `getDocumentProxy()` + `extractText()` -4. Call OpenAI `gpt-4o-mini` with extracted text + a manually defined JSON schema +3. Extract text per page using `pdfjs-dist/legacy/build/pdf.mjs`: `getDocument()` + `page.getTextContent()` +4. Call OpenAI `gpt-4.1` with extracted text + a manually defined JSON schema 5. Parse structured response: array of `{ fieldType, label, pageNumber, x, y, width, height, suggestedValue }` 6. Save placement records to DB via Drizzle ORM -**Why `gpt-4o-mini` (not `gpt-4o`):** Sufficient for structured field extraction on real estate forms. Significantly cheaper. The task is extraction from known document templates — not complex reasoning. +**Why `gpt-4.1` (not `gpt-4o`):** The implemented code uses `gpt-4.1` which was released after the v1.1 research was written. Use whatever model is set in the existing `field-placement.ts` implementation. **Why manual JSON schema (not `zodResponseFormat`):** The project uses `zod` v4.3.6. The `zodResponseFormat` helper in `openai/helpers/zod` uses vendored `zod-to-json-schema` that still expects `ZodFirstPartyTypeKind` — removed in Zod v4. 
This is a confirmed open bug as of late 2025. Using `zodResponseFormat` with Zod v4 throws runtime exceptions. Use `response_format: { type: "json_schema", json_schema: { name: "...", strict: true, schema: { ... } } }` directly with plain TypeScript types instead. ```typescript // CORRECT for Zod v4 project — use manual JSON schema, not zodResponseFormat const response = await openai.chat.completions.create({ - model: "gpt-4o-mini", + model: "gpt-4.1", messages: [{ role: "user", content: prompt }], response_format: { type: "json_schema", @@ -121,9 +122,7 @@ const response = await openai.chat.completions.create({ const result = JSON.parse(response.choices[0].message.content!); ``` ---- - -### Feature 2: Expanded Field Types in @cantoo/pdf-lib +#### Feature 2: Expanded Field Types in @cantoo/pdf-lib **No new library needed.** `@cantoo/pdf-lib` v2.6.3 already supports all required field types natively: @@ -142,11 +141,9 @@ checkBox.addToPage(page, { x, y, width: 15, height: 15, borderWidth: 1 }) if (shouldBeChecked) checkBox.check() ``` -**Coordinate system note:** `@cantoo/pdf-lib` uses PDF coordinate space where y=0 is the bottom of the page. If field positions come from `unpdf` / PDF.js (which uses y=0 at top), you must transform: `pdfY = pageHeight - sourceY - fieldHeight`. +**Coordinate system note:** `@cantoo/pdf-lib` uses PDF coordinate space where y=0 is the bottom of the page. If field positions come from `pdfjs-dist` (which uses y=0 at top), you must transform: `pdfY = pageHeight - sourceY - fieldHeight`. ---- - -### Feature 3: Agent Signature Storage +#### Feature 3: Agent Signature Storage **No new library needed.** The project already has `signature_pad` v5.1.3, `@vercel/blob`, and Drizzle ORM. 
@@ -193,9 +190,7 @@ const dims = sigImage.scaleToFit(fieldWidth, fieldHeight) page.drawImage(sigImage, { x: fieldX, y: fieldY, width: dims.width, height: dims.height }) ``` ---- - -### Feature 4: Filled Document Preview +#### Feature 4: Filled Document Preview **No new library needed.** `react-pdf` v10.4.1 is already installed and supports rendering a PDF from an `ArrayBuffer` directly. @@ -216,62 +211,371 @@ const safeCopy = (buf: ArrayBuffer) => { **react-pdf renders the flattened PDF accurately** — all filled text fields, checked checkboxes, and embedded signature images will appear correctly because they are baked into the PDF bytes by `@cantoo/pdf-lib` before rendering. ---- - -## Alternatives Considered +### Alternatives Considered (v1.1) | Recommended | Alternative | Why Not | |-------------|-------------|---------| -| `unpdf` for text extraction | `pdfjs-dist` directly in Node API route | `pdfjs-dist` v5 uses `Promise.withResolvers` requiring Node 22+; the project targets Node 20 LTS. `unpdf` ships a polyfilled serverless build that handles this. | -| `unpdf` for text extraction | `pdf-parse` | `pdf-parse` is unmaintained (last publish 2019). `unpdf` is the community-recommended successor. | +| `pdfjs-dist/legacy/build/pdf.mjs` directly | `unpdf` wrapper | `unpdf` was recommended in research but not actually installed; the legacy build path works correctly with Node 20 LTS. | +| `pdfjs-dist/legacy/build/pdf.mjs` directly | `pdf-parse` | `pdf-parse` is unmaintained (last publish 2019). | | Manual JSON schema for OpenAI | `zodResponseFormat` helper | Broken with Zod v4 — open bug in `openai-node` as of Nov 2025. Manual schema avoids the dependency entirely. | -| `gpt-4o-mini` | `gpt-4o` | Real estate form field extraction is a structured extraction task on templated documents. `gpt-4o-mini` is sufficient and ~15x cheaper. Upgrade to `gpt-4o` only if accuracy on unusual forms is unacceptable. 
| +| `gpt-4.1` | `gpt-4o` | Real estate form field extraction is a structured extraction task on templated documents. Upgrade only if accuracy on unusual forms is unacceptable. | | `page.drawImage()` for agent signature | `PDFSignature` AcroForm field | `@cantoo/pdf-lib` has no `createSignature()` API — `PDFSignature` only reads existing signature fields and provides no image embedding. The correct approach is `embedPng()` + `drawImage()` at the field coordinates. | ---- - -## What NOT to Add +### What NOT to Add (v1.1) | Avoid | Why | Use Instead | |-------|-----|-------------| | `zodResponseFormat` from `openai/helpers/zod` | Broken at runtime with Zod v4.x (throws exceptions). Open bug, no fix merged as of 2026-03-21. | Plain `response_format: { type: "json_schema", ... }` with hand-written schema | | `react-signature-canvas` | Alpha version (1.1.0-alpha.2); project already has `signature_pad` v5 directly — the wrapper adds nothing | `signature_pad` + `useRef` directly | | `@signpdf/placeholder-pdf-lib` | For cryptographic PKCS#7 digital signatures (DocuSign-style). This project needs visual e-signatures (image embedded in PDF), not cryptographic signing. | `@cantoo/pdf-lib` `embedPng()` + `drawImage()` | -| `pdf2json` | Extracts spatial text data; useful for arbitrary document analysis. Overkill here — we only need raw text content to feed OpenAI. | `unpdf` | +| `pdf2json` | Extracts spatial text data; useful for arbitrary document analysis. Overkill here — we only need raw text content to feed OpenAI. | `pdfjs-dist` legacy build | +| `unpdf` | Was in the v1.1 research recommendation but not installed. The existing `pdfjs-dist/legacy/build/pdf.mjs` usage works correctly in Node 20 — do not add `unpdf` retroactively. | `pdfjs-dist` legacy build (already in use) | | `langchain` / Vercel AI SDK | Heavy abstractions for the simple use case of one structured extraction call per document. Adds bundle size and abstraction layers with no benefit here. 
| `openai` SDK directly | | A separate image processing library (`sharp`, `jimp`) | Not needed — signature PNGs from `signature_pad.toDataURL()` are already correctly sized canvas exports. `@cantoo/pdf-lib` handles embedding without pre-processing. | N/A | ---- - -## Version Compatibility +### Version Compatibility (v1.1) | Package | Compatible With | Notes | |---------|-----------------|-------| | `openai@6.32.0` | `zod@4.x` (manual schema only) | Do NOT use `zodResponseFormat` helper — use raw `json_schema` response_format. The helper is broken with Zod v4. | -| `openai@6.32.0` | Node.js 20+ | Requires Node 20 LTS or later. Next.js 16.2 on Vercel uses Node 20 by default. | -| `unpdf@1.4.0` | Node.js 18+ | Bundled PDF.js v5.2.133 with polyfills for `Promise.withResolvers`. Works on Node 20. | +| `openai@6.32.0` | Node.js 20+ | Requires Node 20 LTS or later. | +| `pdfjs-dist` (legacy build) | Node.js 20+ | Uses `legacy/build/pdf.mjs` path which handles `Promise.withResolvers` polyfill issues. Set `GlobalWorkerOptions.workerSrc` to local worker path. | | `@cantoo/pdf-lib@2.6.3` | `react-pdf@10.4.1` | These do not interact at runtime — `@cantoo/pdf-lib` runs server-side, `react-pdf` runs client-side. No conflict. | | `signature_pad@5.1.3` | React 19 | Use as a plain class instantiated in `useEffect` with a `useRef`. No React wrapper needed. 
| ---- - -## Sources +### Sources (v1.1) - [openai npm page](https://www.npmjs.com/package/openai) — v6.32.0 confirmed, Node 20 requirement — HIGH confidence - [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — manual json_schema format confirmed — HIGH confidence - [openai-node Issue #1540](https://github.com/openai/openai-node/issues/1540) — zodResponseFormat broken with Zod v4 — HIGH confidence - [openai-node Issue #1602](https://github.com/openai/openai-node/issues/1602) — zodTextFormat broken with Zod 4 — HIGH confidence -- [openai-node Issue #1709](https://github.com/openai/openai-node/issues/1709) — Zod 4.1.13+ discriminated union break — HIGH confidence - [@cantoo/pdf-lib npm page](https://www.npmjs.com/package/@cantoo/pdf-lib) — v2.6.3, field types confirmed — HIGH confidence -- [pdf-lib.js.org PDFForm docs](https://pdf-lib.js.org/docs/api/classes/pdfform) — createTextField, createCheckBox, drawImage APIs — HIGH confidence -- [unpdf npm page](https://www.npmjs.com/package/unpdf) — v1.4.0, serverless PDF.js build, Node 20 compatible — HIGH confidence -- [unpdf GitHub](https://github.com/unjs/unpdf) — extractText API confirmed — HIGH confidence -- [react-pdf npm page](https://www.npmjs.com/package/react-pdf) — v10.4.1, ArrayBuffer file prop confirmed — HIGH confidence -- [react-pdf ArrayBuffer detach issue #1657](https://github.com/wojtekmaj/react-pdf/issues/1657) — copy workaround confirmed — HIGH confidence - [signature_pad GitHub](https://github.com/szimek/signature_pad) — v5.1.3, toDataURL API — HIGH confidence -- [pdf-lib image embedding JSFiddle](https://jsfiddle.net/Hopding/bcya43ju/5/) — embedPng/drawImage pattern — HIGH confidence +- Code audit: `src/lib/ai/extract-text.ts` — confirms `pdfjs-dist/legacy/build/pdf.mjs` in use, `unpdf` not installed — HIGH confidence --- -*Stack research for: Teressa Copeland Homes — v1.1 Smart Document Preparation additions* -*Researched: 2026-03-21* +## v1.2 Stack Research 
— Multi-Signer + Docker Production + +**Scope:** Multi-signer support, production Docker Compose, SMTP in Docker +**Confidence:** HIGH for Docker patterns and schema approach, HIGH for email env vars (code-verified) + +--- + +### Summary + +Multi-signer support requires **zero new npm packages**. It is a pure schema extension: a new +`document_signers` junction table in the existing Drizzle/PostgreSQL setup, plus a `signerEmail` +field added to the `SignatureFieldData` interface. The existing `signingTokens` table is extended +with a `signerEmail` column. Everything else — token generation, field rendering, email sending, +PDF merging — uses libraries already installed. + +Docker production is a three-stage Dockerfile with Next.js `standalone` mode plus a +`docker-compose.yml` with `env_file` for SMTP credentials. The known "email not working in +Docker" failure mode is almost always environment variables not reaching the container — not a +nodemailer bug. + +**Email provider confirmed:** The project uses `nodemailer` v7.0.13 with SMTP (not Resend, not +SendGrid). Both the contact form (`contact-mailer.ts`) and signing emails (`signing-mailer.tsx`) +use nodemailer with shared env vars `CONTACT_SMTP_HOST`, `CONTACT_SMTP_PORT`, `CONTACT_EMAIL_USER`, +`CONTACT_EMAIL_PASS`. The `sendAgentNotificationEmail` function already exists — it just needs to +be called for the completion event. No email library changes needed. 
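The env-var-to-transport mapping can be sketched as a pure function — illustrative only, not the project's actual `contact-mailer.ts`/`signing-mailer.tsx` code; the `secure` rule mirrors the existing mailer behavior (port 465 → implicit TLS, anything else → STARTTLS):

```typescript
// Illustrative sketch: build nodemailer-style transport options from the
// code-verified env var names. Not the project's actual implementation.
interface SmtpOptions {
  host: string;
  port: number;
  secure: boolean; // implicit TLS (465) vs STARTTLS (587)
  auth: { user: string; pass: string };
}

function smtpOptionsFromEnv(env: Record<string, string | undefined>): SmtpOptions {
  const port = Number(env.CONTACT_SMTP_PORT ?? 587);
  return {
    host: env.CONTACT_SMTP_HOST ?? "",
    port,
    // Existing mailer rule: port 465 => secure: true; any other port => false.
    secure: port === 465,
    auth: {
      user: env.CONTACT_EMAIL_USER ?? "",
      pass: env.CONTACT_EMAIL_PASS ?? "",
    },
  };
}
```

Because the function reads only `process.env`-shaped string values, the same code works unchanged in dev (`.env.local`) and in Docker (`env_file` injection).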
+ +--- + +### New Dependencies for v1.2 + +#### Multi-Signer: None + +| Capability | Already Covered By | +|---|---| +| Per-signer token | `signingTokens` table — extend with `signerEmail TEXT NOT NULL` column | +| Per-signer field filtering | Filter `signatureFields` JSONB by `field.signerEmail` at query time | +| Completion detection | Query `document_signers` WHERE `signedAt IS NULL` | +| Parallel email dispatch | `nodemailer` (already installed) — `Promise.all([sendMail(...), sendMail(...)])` | +| Final PDF merge after all sign | `@cantoo/pdf-lib` (already installed) | +| Agent notification on completion | `sendAgentNotificationEmail()` already implemented in `signing-mailer.tsx` | +| Final PDF to all parties | `nodemailer` + `@react-email/render` (already installed) | + +**Schema additions needed (pure Drizzle migration, no new packages):** + +1. **New table `document_signers`** — one row per (document, signer email). Columns: + `id`, `documentId` (FK → documents), `signerEmail`, `signerName` (optional), + `signedAt` (nullable timestamp), `ipAddress` (captured at signing), `tokenJti` (FK → signingTokens). + +2. **New field on `SignatureFieldData` interface** — `signerEmail?: string`. Fields without + `signerEmail` are agent-only fields (already handled). Fields with `signerEmail` route to + that signer's session. + +3. **Extend `documentStatusEnum`** — add `'PartialSigned'` (some but not all signers complete). + `'Signed'` continues to mean all signers have completed. + +4. **Extend `auditEventTypeEnum`** — add `'all_signers_complete'` for the completion notification + trigger. + +5. **Extend `signingTokens` table** — add `signerEmail text NOT NULL` column so each token is + scoped to one signer and the signing page can filter fields correctly. + +#### Docker: No New Application Packages + +The Docker setup is infrastructure only (Dockerfile + docker-compose.yml). No npm packages are +added to the application. 
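The per-signer field filtering and legacy fallback above can be sketched as a pure helper (shapes are illustrative — the real `SignatureFieldData` lives in `src/lib/db/schema.ts`):

```typescript
// Illustrative shapes only — not the project's actual interface.
interface SignatureFieldData {
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
  type?: string; // absent on v1.0 fields => treated as 'client-signature'
  signerEmail?: string; // absent => legacy single-signer / untagged field
}

// Per-signer field filtering at query time: a signer's session receives only
// fields tagged with their email; a legacy token (no signerEmail) receives
// the untagged fields, preserving existing single-signer documents.
function fieldsForSigner(
  fields: SignatureFieldData[],
  signerEmail: string | null,
): SignatureFieldData[] {
  if (signerEmail === null) {
    // Legacy single-signer token: all untagged client fields apply.
    return fields.filter((f) => f.signerEmail === undefined);
  }
  return fields.filter((f) => f.signerEmail === signerEmail);
}
```

This keeps the migration additive: no existing JSONB rows need rewriting, because absence of the key is itself the legacy signal.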
+

**One optional dev-only Docker image for local email testing:**

| Tool | What It Is | When to Use |
|---|---|---|
| `maildev/maildev:latest` | Lightweight SMTP trap that catches all outbound mail and shows it in a web UI | Add to a `docker-compose.override.yml` for local development only. Never deploy to production. |

---

### Docker Stack

#### Image Versions

| Service | Image | Rationale |
|---|---|---|
| Next.js app | `node:20-alpine` | LTS, small. Do NOT use node:24 — @napi-rs/canvas ships prebuilt `.node` binaries and the build for node:24-alpine may not exist yet. Verify before upgrading. |
| PostgreSQL | `postgres:16-alpine` | Current stable, alpine keeps it small. Pin to `16-alpine` explicitly — never use `postgres:latest` which silently upgrades major versions on `docker pull`. |

#### Dockerfile Pattern — Three-Stage Standalone

Next.js `output: 'standalone'` in `next.config.ts` must be enabled. This generates
`.next/standalone/` with a minimal self-contained Node.js server. Reduces image from ~7 GB
(naive) to ~300 MB (verified across multiple production reports).

**Stage 1 — deps:** `npm ci` (a full install, not `--omit=dev` — `next build` in the builder
stage needs devDependencies such as TypeScript, and the standalone runner stage never sees
them anyway) with cache mounts. This layer is cached until `package-lock.json` changes,
making subsequent builds fast.

**Stage 2 — builder:** Copy deps from Stage 1, copy source, run `next build`. Set
`NEXT_TELEMETRY_DISABLED=1` and `NODE_ENV=production`.

**Stage 3 — runner:** Copy `.next/standalone/`, `.next/static/`, and `public/` from builder
only. Set `HOSTNAME=0.0.0.0` and `PORT=3000`. Run as non-root user. The `uploads/`
directory must be a named Docker volume — never baked into the image.

**@napi-rs/canvas native binding note:** This package includes a compiled `.node` binary for a
specific OS and CPU architecture.
The build stage and runner stage must use the same OS
(both `node:20-alpine`) and the Docker build must run on the same CPU architecture as the
deployment server (arm64 for Apple Silicon servers, amd64 for x86). Use
`docker buildx build --platform linux/arm64` or `linux/amd64` explicitly if building
cross-platform. Cross-architecture builds will produce a binary that fails at runtime.

**Database migrations on startup:** Add an `entrypoint.sh` that runs
`npx drizzle-kit migrate && exec node server.js`. This ensures schema migrations run before
the application accepts traffic on every container start.

#### Docker Compose Structure

```yaml
services:
  app:
    build: .
    restart: unless-stopped
    depends_on:
      db:
        condition: service_healthy
    env_file: .env.production # SMTP creds, AUTH_SECRET, OPENAI_API_KEY etc.
    environment:
      NODE_ENV: production
      DATABASE_URL: postgresql://appuser:${POSTGRES_PASSWORD}@db:5432/tchmesapp
    volumes:
      - uploads:/app/uploads # PDFs written at runtime
    ports:
      - "3000:3000"
    networks:
      - internal

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_USER: appuser
      POSTGRES_DB: tchmesapp
      POSTGRES_PASSWORD_FILE: /run/secrets/postgres_password
    secrets:
      - postgres_password
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U appuser -d tchmesapp"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - internal

volumes:
  pgdata:
  uploads:

networks:
  internal:
    driver: bridge

secrets:
  postgres_password:
    file: ./secrets/postgres_password.txt
```

**Key decisions:**

- `db` is on the `internal` bridge network only — not exposed to the host or internet.
- `app` waits for `db` to pass its health check before starting (`condition: service_healthy`),
  preventing migration failures on cold boot.
- `restart: unless-stopped` survives server reboots without systemd service files.
+- Named volumes for `pgdata` and `uploads` survive container recreation. +- PostgreSQL uses a Docker secret for its password because Postgres natively supports + `POSTGRES_PASSWORD_FILE`. The app reads its `DATABASE_URL` from `.env.production` via + `env_file` — no code change needed. + +--- + +### SMTP / Email in Docker + +#### Actual Environment Variable Names (Code-Verified) + +The signing mailer (`src/lib/signing/signing-mailer.tsx`) reads these exact env vars: + +``` +CONTACT_SMTP_HOST=smtp.gmail.com +CONTACT_SMTP_PORT=587 +CONTACT_EMAIL_USER=teressa@tcopelandhomes.com +CONTACT_EMAIL_PASS=xxxx-xxxx-xxxx-xxxx +AGENT_EMAIL=teressa@tcopelandhomes.com +``` + +These same vars are used by the contact form mailer (`src/lib/contact-mailer.ts`). Both +mailers share the same SMTP transport configuration via the same env var names. + +#### The Actual Problem + +The current "email not working in Docker" bug is almost certainly one of two causes: + +**Cause 1 — Environment variables not passed to the container.** Docker does not inherit host +environment variables. If `CONTACT_SMTP_HOST`, `CONTACT_EMAIL_PASS` etc. are in `.env.local` +on the host, they are invisible inside the container unless explicitly injected via `env_file` +or `environment:` in docker-compose.yml. + +**Cause 2 — DNS resolution failure (EAI_AGAIN).** Docker containers use Docker's internal DNS +resolver (127.0.0.11). This can intermittently fail to resolve external hostnames, producing +`getaddrinfo EAI_AGAIN smtp.gmail.com`. The symptom is email that works locally but silently +fails (or errors) in Docker. + +nodemailer itself is reliable in Docker containers. The multiple open GitHub issues on this +topic all trace back to environment or DNS configuration problems, not library bugs. + +#### Solution: env_file for Application Secrets + +Docker Compose native secrets (non-Swarm) mount plaintext files at `/run/secrets/secret_name`. +Application code must explicitly read those file paths. 
nodemailer reads credentials from +`process.env` string values — not file paths. Rewriting the transporter initialization to read +from `/run/secrets/` would require code changes for no meaningful security gain on a +single-server setup. + +The correct approach for this application is `env_file`: + +1. Create `.env.production` on the server (never commit, add to `.gitignore`): + ``` + CONTACT_SMTP_HOST=smtp.gmail.com + CONTACT_SMTP_PORT=587 + CONTACT_EMAIL_USER=teressa@tcopelandhomes.com + CONTACT_EMAIL_PASS=xxxx-xxxx-xxxx-xxxx + AGENT_EMAIL=teressa@tcopelandhomes.com + AUTH_SECRET= + OPENAI_API_KEY=sk-... + POSTGRES_PASSWORD= + NEXT_PUBLIC_BASE_URL=https://teressacopelandhomes.com + ``` + +2. In `docker-compose.yml`, reference it: + ```yaml + services: + app: + env_file: .env.production + ``` + +3. All variables in that file are injected as `process.env.*` inside the container. + nodemailer reads `process.env.CONTACT_EMAIL_PASS` exactly as in development. Zero code changes. + +**Why not Docker Swarm secrets for SMTP?** Plain Compose secrets have no encryption — they are +just bind-mounted plaintext files. The security profile is identical to a chmod 600 +`.env.production` file. The complexity cost (code that reads from `/run/secrets/`) is not +justified on a single-server home deployment. Use Docker secrets for PostgreSQL password only +because PostgreSQL natively reads from `POSTGRES_PASSWORD_FILE` — no code change required. + +#### DNS Fix for EAI_AGAIN + +If SMTP resolves correctly in dev but fails in Docker, add to the app service: + +```yaml +services: + app: + dns: + - 8.8.8.8 + - 1.1.1.1 + environment: + NODE_OPTIONS: --dns-result-order=ipv4first +``` + +The `dns:` keys bypass Docker's internal resolver for external lookups. `--dns-result-order=ipv4first` +tells Node.js to try IPv4 DNS results before IPv6, which resolves the most common Docker DNS +timeout pattern (IPv6 path unreachable, long timeout before IPv4 fallback). 
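Both root causes can be checked from the host once the stack is up (illustrative commands; they assume the Compose service is named `app`, as in the file above):

```shell
# Cause 1 check: did the SMTP env vars actually reach the container?
# Empty or missing output means env_file injection is not working.
docker compose exec app printenv CONTACT_SMTP_HOST CONTACT_SMTP_PORT CONTACT_EMAIL_USER

# Cause 2 check: can the container resolve the SMTP host?
# An EAI_AGAIN error here confirms the DNS failure mode, not a nodemailer bug.
docker compose exec app node -e "require('dns').promises.lookup('smtp.gmail.com').then(r => console.log(r.address), e => { console.error(e.code); process.exit(1) })"
```

Running both takes seconds and distinguishes the two failure modes before any code or config is changed.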
+ +#### SMTP Provider + +Gmail with an App Password (not the account password) is the recommended choice for a solo +agent at low volume. Requires 2FA enabled on the Google account. The signing mailer already +uses port logic: port 465 → `secure: true`; any other port → `secure: false`. Port 587 with +STARTTLS is more reliable than port 465 implicit TLS in Docker environments — use +`CONTACT_SMTP_PORT=587`. + +--- + +### What NOT to Add (v1.2) + +| Temptation | Why to Avoid | +|---|---| +| Redis / BullMQ for email queuing | Overkill. This app sends at most 5 emails per document. `Promise.all([sendMail(...)])` is sufficient. Redis adds a third container, more ops burden, and more failure modes. | +| Resend / SendGrid / Postmark | Adds a paid external dependency. nodemailer + Gmail App Password is free, already implemented, and reliable when env vars are correctly passed. Switch only if Gmail SMTP becomes a persistent problem. | +| Docker Swarm secrets for SMTP | Requires code changes to read from file paths. No security benefit over a permission-restricted `env_file` on single-server non-Swarm setup. | +| `postgres:latest` image | Will silently upgrade major versions on `docker pull`. Always pin to `postgres:16-alpine`. | +| Node.js 22 or 24 as base image | @napi-rs/canvas ships prebuilt `.node` binaries. Verify the binding exists for the target node/alpine version before upgrading. Node 20 LTS is verified. | +| Sequential signing enforcement | The PROJECT.md specifies parallel signing only ("any order"). Do not add sequencing logic. | +| WebSockets for real-time signing status | Polling the agent dashboard every 30 seconds is sufficient for one agent monitoring a handful of documents. No WebSocket infrastructure needed. | +| Separate migration container | A `depends_on: condition: service_completed_successfully` init container is architecturally cleaner but adds complexity. An `entrypoint.sh` in the same `app` container is simpler and sufficient at this scale. 
| +| HelloSign / DocuSign integration | Explicitly out of scope per PROJECT.md. Custom e-signature is the intentional choice. | +| `unpdf` | Already documented in v1.1 "What NOT to Add" — the existing `pdfjs-dist` legacy build is in use and working. | + +--- + +### OpenSign Architecture Reference + +OpenSign (React + Node.js + MongoDB) implements multi-recipient signing as a signers array +embedded in each document record. Each signer object holds its own status, token reference, and +completed-at timestamp. All signing links are sent simultaneously (parallel) by default. Their +MongoDB document array maps directly to a PostgreSQL `document_signers` junction table in +relational terms. The core insight confirmed by OpenSign's design: multi-signer needs no +specialized packages — it is a data model and routing concern. + +--- + +### Sources (v1.2) + +- Code audit: `src/lib/signing/signing-mailer.tsx` — env var names `CONTACT_SMTP_HOST`, `CONTACT_EMAIL_USER`, `CONTACT_EMAIL_PASS`, `AGENT_EMAIL` confirmed — HIGH confidence +- Code audit: `src/lib/db/schema.ts` — current `SignatureFieldData`, `signingTokens`, `documentStatusEnum`, `auditEventTypeEnum` confirmed — HIGH confidence +- Code audit: `package.json` — nodemailer v7.0.13, @react-email installed, unpdf NOT installed — HIGH confidence +- [Docker Compose: Next.js + PostgreSQL + Redis (Feb 2026)](https://oneuptime.com/blog/post/2026-02-08-how-to-set-up-a-nextjs-postgresql-redis-stack-with-docker-compose/view) — HIGH confidence +- [Docker Official Next.js Containerize Guide](https://docs.docker.com/guides/nextjs/containerize/) — HIGH confidence +- [Docker Compose Secrets — Official Docs](https://docs.docker.com/compose/how-tos/use-secrets/) — HIGH confidence +- [Docker Compose Secrets: What Works, What Doesn't](https://www.bitdoze.com/docker-compose-secrets/) — MEDIUM confidence +- [Docker Compose Secrets: Export /run/secrets to Env Vars (Dec 
2025)](https://phoenixtrap.com/2025/12/22/10-lines-to-better-docker-compose-secrets/) — MEDIUM confidence +- [Nodemailer Docker EAI_AGAIN — Docker Forums](https://forums.docker.com/t/not-able-to-send-email-using-nodemailer-within-docker-container-due-to-eai-again/40649) — HIGH confidence (root cause confirmed) +- [Nodemailer works local, fails in Docker — GitHub Issue #1495](https://github.com/nodemailer/nodemailer/issues/1495) — HIGH confidence (issue confirmed unresolved = env problem, not library) +- [OpenSign GitHub Repository](https://github.com/OpenSignLabs/OpenSign) — HIGH confidence +- [Next.js Standalone Docker Mode — DEV Community (2025)](https://dev.to/angojay/optimizing-nextjs-docker-images-with-standalone-mode-2nnh) — MEDIUM confidence +- [DNS EAI_AGAIN in Docker — Beyond 'It Works on My Machine'](https://dev.to/ameer-pk/beyond-it-works-on-my-machine-solving-docker-networking-dns-bottlenecks-4f3m) — MEDIUM confidence + +--- + +*Last updated: 2026-04-03 — v1.2 multi-signer and Docker production additions; corrected unpdf status, env var names, and added missing packages to existing stack table* diff --git a/.planning/research/SUMMARY.md b/.planning/research/SUMMARY.md index 022efd5..2351b55 100644 --- a/.planning/research/SUMMARY.md +++ b/.planning/research/SUMMARY.md @@ -1,185 +1,206 @@ # Project Research Summary -**Project:** Teressa Copeland Homes — v1.1 Smart Document Preparation -**Domain:** Real estate agent website + PDF document signing portal -**Researched:** 2026-03-21 +**Project:** teressa-copeland-homes +**Domain:** Real estate agent website + AI-assisted document signing portal +**Researched:** 2026-04-03 +**Milestone scope:** v1.1 (AI field placement, expanded field types, agent signature, filled preview) + v1.2 (multi-signer support, Docker production deployment) **Confidence:** HIGH +--- + ## Executive Summary -This is a v1.1 feature expansion of an existing, working Next.js 15 real estate document signing app. 
The v1.0 codebase is already validated — it uses Drizzle ORM, local PostgreSQL, `@cantoo/pdf-lib` for PDF writing, `react-pdf` for client-side rendering, Auth.js v5, and `signature_pad` for canvas signatures. The v1.1 additions are: AI-assisted field placement via GPT-4o-mini, five new field types (text, checkbox, initials, date, agent-signature), agent saved signature with a draw-once-reuse workflow, and a filled document preview before sending. The minimal dependency delta is two new packages: `openai@^6.32.0` and optionally `unpdf@^1.4.0` — though `pdfjs-dist` is already installed as a transitive dependency of `react-pdf` and can serve the server-side text extraction role via its legacy build. +This is a solo real estate agent's custom document signing portal built on Next.js 15 + Drizzle ORM + PostgreSQL (Neon). The core v1.0 feature set — PDF upload, drag-drop field placement, email-link signing, presigned download — is complete. Two milestones of new capability are being planned. The v1.1 milestone adds AI-assisted field detection (GPT-4.1 + PDF text extraction), five new field types (checkbox, initials, date, agent signature, text), a saved agent signature, and a filled-document preview before send. The v1.2 milestone extends single-signer to multi-signer (parallel, any-order) and adds production Docker deployment with SMTP fix. Both milestones require zero new npm packages beyond `openai` (already installed for v1.1) — all capability is extension of the existing stack. -The recommended build order is anchored by a schema-first phase. The `SignatureFieldData` type currently has no `type` discriminant — every field is treated identically as a client signature. Adding new field types without simultaneously updating both the schema AND the client signing page would break any in-flight signing session. The architecture research maps out an explicit 8-step dependency chain. 
For AI field placement, the correct approach uses `pdfjs-dist` for server-side text extraction (not vision), then GPT-4o-mini for semantic label classification — raw vision-based bounding box inference returns accurate coordinates less than 3% of the time. The OpenAI integration must use a manually defined JSON schema for structured output; the `zodResponseFormat` helper is broken with Zod v4 (confirmed open bug). +The most important architectural constraint for both milestones: the current signing flow has load-bearing single-signer assumptions throughout (one token per document, first token claim marks the document Signed, `documents.status` tracks one signer's journey). Multi-signer requires a deliberate schema-first approach — add the `signers` JSONB column, `signerEmail` to `signingTokens`, and a completion detection rewrite before touching any UI. Building the send route or signing UI before the schema is solid will create hard-to-unwind bugs. For Docker, the single non-obvious pitfall is that `NEXT_PUBLIC_BASE_URL` is baked at build time — signing URLs will point to localhost unless the variable is renamed to a server-only name before the production image is built. -The key risk cluster is around the AI coordinate pipeline and signing page integrity. OpenAI returns percentage-based coordinates; `@cantoo/pdf-lib` expects PDF user-space points with a bottom-left origin — a Y-axis inversion that will silently produce wrong field positions without a dedicated conversion utility and unit test. A second risk is that agent-signature fields must be filtered from the `signatureFields` array sent to clients — the exact unguarded line (`/src/app/api/sign/[token]/route.ts` line 88) is identified in pitfalls research. Preview PDFs must use versioned paths separate from the final prepared PDF to maintain legal integrity between what the agent reviewed and what the client signs. 
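A minimal sketch of that conversion utility — the `aiCoordsToPagePdfSpace()` the pitfalls research calls for — assuming the AI returns top-left-origin percentages and the page dimensions are read from the actual PDF:

```typescript
// Convert AI output (percentages measured from the page's TOP-left corner)
// into PDF user-space points (origin at the BOTTOM-left, y increasing upward).
// Field names (xPct, yPct, ...) are illustrative.
interface AiFieldBox {
  xPct: number; // left edge, 0-100, measured from the left
  yPct: number; // top edge, 0-100, measured from the top
  widthPct: number;
  heightPct: number;
}

function aiCoordsToPagePdfSpace(
  box: AiFieldBox,
  pageWidthPt: number, // actual page width in PDF points (612 for US Letter)
  pageHeightPt: number, // actual page height in PDF points (792 for US Letter)
) {
  const width = (box.widthPct / 100) * pageWidthPt;
  const height = (box.heightPct / 100) * pageHeightPt;
  const x = (box.xPct / 100) * pageWidthPt;
  // Y-axis inversion: subtract both the top offset AND the field height so
  // the returned y is the field's BOTTOM edge, which is what pdf-lib expects.
  const y = pageHeightPt - (box.yPct / 100) * pageHeightPt - height;
  return { x, y, width, height };
}
```

A field reported at the very top of a US Letter page (yPct = 0, heightPct = 10) must land at y ≈ 712.8, not y = 0 — exactly the kind of known-value assertion the unit test should make.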
+For v1.1, AI field placement must use the text-extraction approach (pdfjs-dist `getTextContent()` → GPT-4.1 for label classification → coordinates from text bounding boxes). Vision-based coordinate inference from PDF images has under 3% accuracy in published benchmarks and is not production-viable. The hybrid text+AI approach is the pattern used by Apryse, Instafill, and DocuSign Iris. On Utah standard forms (REPC, listing agreements), which have consistent label patterns, accuracy should be 90%+. The fallback — scanned/image-based PDFs — should degrade gracefully to manual drag-drop with a clear agent-facing message. + +--- ## Key Findings ### Recommended Stack -The v1.0 stack is unchanged and validated. See `STACK.md` for full version details. +The existing stack handles every v1.1 and v1.2 requirement without new packages. `openai@6.32.0` (already installed) covers AI field placement. `@cantoo/pdf-lib@2.6.3` handles all five new field types and PDF assembly for both single-signer and multi-signer workflows. `signature_pad@5.1.3` (already installed) handles agent signature capture. `react-pdf@10.4.1` handles filled document preview. Docker deployment requires no npm changes — it is a Dockerfile + docker-compose.yml infrastructure concern. 
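Since `zodResponseFormat` cannot be used with Zod v4, the structured-output schema is written by hand. A sketch of the shape (the schema name and field names such as `xPct`/`yPct` are illustrative, not the project's actual schema):

```typescript
// Hand-written response_format payload for OpenAI structured output — the
// approach this research mandates instead of the broken zodResponseFormat helper.
const fieldPlacementResponseFormat = {
  type: "json_schema" as const,
  json_schema: {
    name: "field_placements",
    strict: true, // strict mode requires additionalProperties: false everywhere
    schema: {
      type: "object",
      properties: {
        fields: {
          type: "array",
          items: {
            type: "object",
            properties: {
              label: { type: "string" },
              fieldType: {
                type: "string",
                enum: ["client-signature", "initials", "date", "checkbox", "text", "agent-signature"],
              },
              page: { type: "integer" },
              xPct: { type: "number" },
              yPct: { type: "number" },
            },
            required: ["label", "fieldType", "page", "xPct", "yPct"],
            additionalProperties: false,
          },
        },
      },
      required: ["fields"],
      additionalProperties: false,
    },
  },
};

// Passed as: client.chat.completions.create({ model, messages,
//   response_format: fieldPlacementResponseFormat })
```

Strict mode rejects any model output that deviates from this shape, so the route handler can `JSON.parse` the response without defensive re-validation of structure.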
-**New dependencies for v1.1:** -- `openai@^6.32.0`: Official SDK, TypeScript-native structured output for GPT-4o-mini — use manual `json_schema` response_format, NOT `zodResponseFormat` (broken with Zod v4, confirmed open GitHub issues #1540, #1602, #1709) -- `pdfjs-dist` legacy build (already installed): Server-side PDF text extraction via `pdfjs-dist/legacy/build/pdf.mjs` — no new dependency needed if using this path +**Core technologies:** +- `openai@6.32.0`: GPT-4.1 structured extraction via manual JSON schema — not `zodResponseFormat`, which is broken with Zod v4 (confirmed open bug) +- `pdfjs-dist` (legacy build, already installed): PDF text extraction with bounding boxes for AI field placement pipeline +- `@cantoo/pdf-lib@2.6.3`: All new field types (text, checkbox, initials, date); agent signature via `embedPng()` + `drawImage()`; coordinate system: y=0 at page bottom (transform from pdfjs-dist y=0-at-top) +- `signature_pad@5.1.3`: Agent signature canvas — use as plain class with `useRef` in React, not a React wrapper +- `react-pdf@10.4.1`: Filled document preview from `ArrayBuffer`; always copy buffer before passing (detachment bug in v7+) +- `@vercel/blob`: File storage — installed but currently unused (dead dependency risk; see Gaps) +- `nodemailer@7.0.13`: SMTP email for signing links and completion notifications +- `node:20-slim` (Debian-based): Docker base image — required for `@napi-rs/canvas` glibc native binary compatibility; do NOT use Alpine (musl libc incompatible) -**Existing stack components covering all v1.1 needs:** -- `@cantoo/pdf-lib@2.6.3`: All five new field types (text, checkbox, initials, date, agent-signature) supported natively via `createTextField`, `createCheckBox`, `drawImage` APIs -- `signature_pad@5.1.3`: Agent signature canvas — use `useRef` + `useEffect` pattern directly; do NOT add `react-signature-canvas` (alpha wrapper) -- `react-pdf@10.4.1`: Filled preview rendering — pass `ArrayBuffer` directly; copy the buffer before 
passing to avoid detachment issue (known bug #1657) -- `@vercel/blob@2.3.1` + Drizzle ORM: Agent signature storage — architecture research recommends TEXT column on `users` table for 2-8KB base64 PNG; no new file storage needed +**Critical version constraint:** `zodResponseFormat` from `openai/helpers/zod` throws runtime exceptions with Zod v4.x. Use `response_format: { type: "json_schema", ... }` with a hand-written schema. ### Expected Features -All v1.1 features are P1 (must-have for launch). Research confirms the full feature set is aligned with industry standard behavior across DocuSign, dotloop, and SkySlope DigiSign. +**Must have (table stakes — all major real estate e-sig tools provide these):** +- Initials field type — every Utah standard form (REPC, listing agreement) has per-page initials lines +- Date field (auto-stamp only, not a calendar picker) — records actual signing timestamp; client cannot type a date +- Checkbox field type — boolean elections throughout Utah REPC and addenda +- Agent signature field type — pre-filled during agent signing flow; read-only for client +- Agent saved signature (draw once, reuse) — DocuSign "Adopted Signature" pattern; re-drawing per document is a daily friction point +- Agent signs before sending (routing order: agent first, client second) — industry convention in real estate +- Filled document preview before send — prevents wrong-version sends; agents expect to see what client sees -**Must have (table stakes):** -- Initials field type — every Utah standard form (REPC, listing agreement, addenda) has per-page initials lines; missing this makes the app unusable for standard Utah workflows -- Date field (auto-stamp, read-only) — "Date Signed" pattern; auto-populated at signing session completion; client never types a date; legally important -- Checkbox field type — Utah REPC uses boolean checkboxes throughout (mediation clauses, contingency elections, disclosure acknowledgments) -- Agent saved signature — draw once, 
reuse across documents; the "Adopted Signature" pattern in every major real estate e-sig tool -- Agent signs first workflow — industry convention: agent at routing order 1, client at routing order 2; confirmed by DocuSign community docs -- Filled document preview with Send gating — prevents the most-cited mistake (sending wrong document version); Send button lives in preview +**Should have (differentiators for this solo-agent, Utah-forms workflow):** +- AI field placement (text extraction + GPT-4.1 label classification) — eliminates manual drag-drop for known Utah forms; 90%+ accuracy on standard forms +- AI pre-fill from client profile data (name, property address, date) — populates obvious fields; agent reviews in PreparePanel +- Property address field on client profile — enables AI pre-fill to be transaction-specific -**Should have (differentiators):** -- AI field placement via gpt-4o-mini + text extraction — eliminates manual drag-drop session; accuracy 90%+ on structured Utah forms with predictable label patterns ("Buyer's Signature", "Date", "Initial Here") -- AI pre-fill from client profile — maps client name, email, property address to text fields; low hallucination risk (structured profile data, not free-text inference) -- Property address field on client profile — enables AI pre-fill to be property-specific; simple schema addition +**Defer to v2+:** +- AI confidence scores surfaced to agent — adds UI complexity; preview step catches gaps instead +- Template save from AI placement — high value but requires template management UI; validate AI accuracy first +- Per-document agent signature redraw — adds decision fatigue; profile settings only is the right UX -**Defer to v1.2+:** -- AI confidence display to agent — adds UI noise; agent can see and correct in preview instead -- Template save from AI placement — high value but requires template management UI; defer until AI accuracy is validated -- Multiple agent signature fields per document — needs UX design; 
defer +**Multi-signer (v1.2 must-haves per PROJECT.md):** +- Parallel signing in any order — no sequencing enforcement +- Per-signer field isolation — each signer sees only fields tagged to their email +- Completion detection after all signers finish — agent notified only when last signer completes +- Final PDF distributed to all parties on completion ### Architecture Approach -The v1.1 architecture is an incremental extension of the existing system — not a rewrite. Seven new files are created (two server-only AI lib files, three API routes, two client components). Eight existing files are modified with targeted additions. The critical architectural constraint: the existing client signing flow (`embed-signature.ts`, signing token route, `SignatureModal.tsx`) must not be altered. Agent-sig and text/checkbox/date fields are baked into the prepared PDF before the client opens the signing link. The client signing page handles only `client-signature` and `initials` field types. - -See `ARCHITECTURE.md` for complete component boundaries, data flow diagrams, and the full 8-step build order. +The system is a clean single-signer flow extended to multi-signer via additive schema changes without breaking existing documents. The key architectural principle: `signingTokens` is the source of truth for signer identity and completion state — never derive this from the `signatureFields` JSONB array. The per-signer PDF accumulation pattern (each signer embeds into the running `signedFilePath`, protected by a Postgres advisory lock) eliminates the need for a separate merge step and works with the existing `embedSignatureInPdf()` function unchanged. **Major components:** -1. `lib/ai/extract-text.ts` + `lib/ai/field-placement.ts` (NEW, server-only) — pdfjs-dist legacy build for text extraction; GPT-4o-mini structured output with manual JSON schema; `server-only` import guard prevents accidental client bundle inclusion -2. 
`POST /api/documents/[id]/ai-prepare` (NEW) — orchestrates extract + AI call + coordinate conversion (percentage to PDF points using actual page dimensions) -3. `GET/PUT /api/agent/signature` (NEW) — stores agent signature as base64 PNG TEXT column on `users` table; always auth-gated -4. `POST /api/documents/[id]/preview` (NEW) — reuses existing `preparePdf` in preview mode; writes to versioned `_preview_{timestamp}.pdf`; streams bytes directly; never overwrites final prepared PDF -5. Extended `FieldPlacer.tsx` palette — five new draggable tokens; existing drag/move/resize/persist mechanics unchanged -6. Extended `prepare-document.ts` — type-aware rendering switch for all six field types; existing `client-signature` path unchanged +1. `SignatureFieldData` JSONB (on `documents`) — field placement data; extended with optional `signerEmail` for multi-signer field routing; `signerEmail` absent = legacy single-signer +2. `signingTokens` table — one row per signer per document; extended with `signerEmail` (nullable) and `viewedAt` columns; source of truth for who needs to sign and who has signed +3. `documents.signers` JSONB column (new) — ordered list of `{ email, name, tokenJti, signedAt }` per document; coexists with legacy `assignedClientId` for backward compatibility +4. `documents.completionTriggeredAt` column (new) — one-time-set guard preventing duplicate final PDF assembly on concurrent last-signer submissions +5. POST `/api/sign/[token]` (modified) — atomic token claim, advisory lock PDF accumulation, completion detection via unclaimed-token count, `document_completed` event + notifications when all done +6. `signing-mailer.tsx` (modified) — multi-signer path: `Promise.all()` over signers; new `sendAllSignersCompletionEmail()` +7. `PreparePanel` + `FieldPlacer` (modified) — signer list entry UI, per-field signer assignment, color-coded markers, send-block validation ### Critical Pitfalls -1. 
**Breaking the signing page with new field types** — `SigningPageClient.tsx` opens the signature modal for every field in `signatureFields` with no type branching. Adding new field types without updating the signing page in the same deployment breaks active signing sessions. Ship schema + signing page filter as one atomic deployment, before any other v1.1 work. +1. **First signer marks document Signed prematurely** — the current POST `/api/sign/[token]` unconditionally sets `documents.status = 'Signed'` on any token claim. With two signers, Signer A's completion fires the agent notification and marks the document complete before Signer B has signed. Fix: replace with a count of unclaimed tokens for the document; only set Signed when count reaches zero. -2. **AI coordinate Y-axis inversion** — AI returns percentages from top-left; `@cantoo/pdf-lib` uses PDF user-space with Y=0 at bottom. Storing AI coordinates without conversion inverts every field position. Write a `aiCoordsToPagePdfSpace()` conversion utility with a unit test asserting known PDF-space x/y values against a real Utah REPC before any OpenAI call is made. +2. **Race condition — two simultaneous last signers both trigger PDF assembly** — atomic token claim does not prevent two concurrent handlers from both detecting zero remaining tokens. Fix: add `completionTriggeredAt TIMESTAMP` to `documents`; use `UPDATE WHERE completionTriggeredAt IS NULL RETURNING` guard — same pattern as the existing token claim. Zero rows returned means another handler already won; skip assembly. -3. **Agent-signature field sent unfiltered to client** — `/src/app/api/sign/[token]/route.ts` line 88 returns `doc.signatureFields ?? []` without type filtering. When `agent-signature` fields are in that array, the client sees them as required unsigned fields. Add type filter before any agent-signed document is sent. +3. 
**`NEXT_PUBLIC_BASE_URL` baked at Docker build time, not runtime** — signing URLs will contain `localhost:3000` in production if this variable is not renamed before building the image. Fix: rename to `SIGNING_BASE_URL` (no `NEXT_PUBLIC_` prefix); inject at container runtime via `env_file:`; the send route is server-side only and never needs a public variable. -4. **Stale preview after field changes** — preview PDF written to a deterministic path gets cached; agent sends a document based on a stale preview. Use versioned preview paths (`{docId}_preview_{timestamp}.pdf`) and disable Send when fields have changed since last preview generation. +4. **`@napi-rs/canvas` native binary incompatible with Alpine Docker images** — `node:alpine` uses musl libc; `@napi-rs/canvas` only ships glibc prebuilt binaries. The module fails with `invalid ELF header` at runtime. Fix: use `node:20-slim` (Debian); build with explicit `--platform linux/amd64` when deploying to x86 from an ARM Mac. -5. **OpenAI token limits on multi-page Utah forms** — Utah standard forms are 10-30 pages; full text extraction fits in ~2,000-8,000 tokens (within gpt-4o-mini's 128k context). Risk: testing only with 2-3 page PDFs in development. Prevention: test AI pipeline with the full Utah REPC (20+ pages) before shipping. +5. **Uploads directory ephemeral in Docker** — all PDFs are written to `process.cwd()/uploads` inside the container's writable layer. Container recreation (deployment, crash) permanently deletes all documents including legally executed signed copies. Fix: mount a named Docker volume at `/app/uploads` in `docker-compose.yml` before the first production upload. +6. **SMTP env vars absent in container** — `CONTACT_SMTP_HOST` etc. exist in `.env.local` on the dev host but are invisible to Docker unless explicitly injected. Nodemailer's `createTransport()` is called at send time, not startup, so the missing vars produce no startup error — only silent email failure at first send.
Fix: `env_file: .env.production` in docker-compose.yml. Verify with `docker exec <container> printenv CONTACT_SMTP_HOST`. +7. **Legacy `signingTokens` rows break if `signerEmail` added as NOT NULL** — existing rows have no signer email. Fix: add column as nullable (`TEXT`); backfill existing rows in the migration using a JOIN to the client email; all signing code handles the null legacy case. +8. **Neon connection pool exhaustion** — `postgres(url)` defaults to 10 connections, which saturates Neon free tier entirely. Fix: `postgres(url, { max: 5, idle_timeout: 20, connect_timeout: 10 })` in `db/index.ts`. --- ## Implications for Roadmap -The architecture research provides an explicit 8-step build order based on hard dependencies. This maps directly to 5 phases. +Based on combined research, a five-phase structure is recommended: ### Phase 1: Complete v1.1 — AI Field Placement and Expanded Field Types -**Rationale:** The single most dangerous change in v1.1 is adding field types to a schema the client signing page does not handle. Any document with mixed field types sent before the signing page is updated is a HIGH-recovery-cost production incident. Must be first, before any other v1.1 work. -**Delivers:** Extended `DocumentField` discriminated union in `schema.ts` with backward-compatible fallback for v1.0 documents (`type ?? 'client-signature'`); two new nullable DB columns (`agentSignatureData` on users, `propertyAddress` on clients); Drizzle migration; updated `SigningPageClient.tsx` and `POST /api/sign/[token]` with type-based field filtering. -**Addresses:** Foundation for all expanded field types; agent-signature client exposure risk -**Avoids:** Pitfall 1 (signing page crash on new field types), Pitfall 10 (agent-sig field shown to client as required unsigned field) -**Research flag:** None needed — Drizzle discriminated union and nullable column additions are well-documented; two-line ALTER TABLE migration.
+**Rationale:** v1.1 work is in progress (MEMORY.md confirms active debugging of AI field classification). This phase finishes in-flight work before adding multi-signer complexity on top of an unstable foundation. +**Delivers:** Working AI field detection pipeline (text extraction + GPT-4.1 classification + post-processing rules), all five field types (checkbox, initials, date, agent signature, text), agent saved signature, filled document preview, agent signs first workflow, property address on client profile, AI pre-fill from client profile. +**Addresses:** All FEATURES.md P1 items. +**Avoids:** Vision-based coordinate inference (use text extraction only); `zodResponseFormat` with Zod v4 (use manual JSON schema). +**Research flag:** No additional research needed — patterns are implemented and being debugged. -### Phase 2: Agent Saved Signature + Agent Signing Workflow +### Phase 2: Multi-Signer Schema Foundation -**Rationale:** Agent signature is a prerequisite for the agent-signs-first workflow, which is a prerequisite for the filled preview (preview only makes sense after agent has signed). Agent signature embed also establishes the PNG embed pattern in `prepare-document.ts` that informs how other field types are handled. -**Delivers:** `GET/PUT /api/agent/signature` routes; `AgentSignaturePanel` component (draw + save + thumbnail); extended `prepare-document.ts` to embed agent-sig PNG at field coordinates; `FieldPlacer` palette token for agent-signature type; supersede-and-resend flow guard preventing re-preparation of sent/viewed documents without user confirmation. 
-**Uses:** `signature_pad@5.1.3` (existing), `@cantoo/pdf-lib@2.6.3` (existing), `users.agentSignatureData TEXT` column (Phase 1) -**Avoids:** Pitfall 5 (signature stored as dataURL in DB is correct — TEXT column is right for 2-8KB), Pitfall 6 (race condition on re-preparation), Pitfall 10 (agent-sig filtered from client fields via Phase 1 foundation) -**Research flag:** None needed — draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are DB column and API route. +**Rationale:** Schema changes are additive and backward-compatible. They must be deployed and validated against production Neon before any multi-signer sending or signing code is written. This phase has no user-visible effect — it is purely infrastructure. +**Delivers:** Migration `drizzle/0010_multi_signer.sql`: `signer_email` on `signingTokens` (nullable), `signers` JSONB on `documents`, `completionTriggeredAt` on `documents`, `viewedAt` on `signingTokens`, three new `auditEventTypeEnum` values. TypeScript additions: `DocumentSigner` interface, `signerEmail` on `SignatureFieldData`, `getSignerEmail()` helper. +**Avoids:** Legacy token breakage (nullable column), race condition setup (completionTriggeredAt column ready), enum ADD VALUE transaction issues (statement-breakpoint between each ALTER). +**Research flag:** Standard Drizzle migration patterns — no additional research needed. -### Phase 3: Expanded Field Types End-to-End +### Phase 3: Multi-Signer Backend (Token Creation, Signing Flow, Completion) -**Rationale:** Phase 1 made the schema and signing page safe. Phase 2 established the PNG embed pattern in `prepare-document.ts`. Now extend the field placement UI and prepare pipeline to handle all five new field types. Completing this phase gives the agent a fully functional field system without any AI dependency. 
-**Delivers:** Five new draggable palette tokens in `FieldPlacer.tsx` (text, checkbox, initials, date, agent-signature); type-aware rendering in `prepare-document.ts` (text stamp, checkbox embed, date auto-stamp, initials placeholder); `propertyAddress` field in `ClientModal` and clients server action; field type coverage from placement through to embedded PDF. -**Addresses:** All P1 table stakes: initials, date, checkbox, text field types -**Avoids:** Pitfall 1 (signing page hardened in Phase 1 before these types can be placed and sent) -**Research flag:** None needed — all APIs are in existing `@cantoo/pdf-lib@2.6.3`. +**Rationale:** Backend-complete before UI. The signing POST rewrite is the highest-complexity change and must be independently testable via API calls before any UI work begins. +**Delivers:** Updated `createSigningToken(docId, signerEmail?)`, signer-aware field filtering in signing GET (null signerEmail = legacy path), accumulator PDF pattern + advisory lock in signing POST, completion detection via unclaimed-token count + `completionTriggeredAt` guard, `document_completed` audit event, `sendAllSignersCompletionEmail()`, updated send route per-signer token loop. All changes preserve the legacy single-signer path. +**Uses:** `pg_advisory_xact_lock(hashtext(documentId))` for concurrent PDF write protection, existing `embedSignatureInPdf()` unchanged, existing `logAuditEvent()` and `sendAgentNotificationEmail()`. +**Avoids:** Premature completion (token count check), race condition (completionTriggeredAt guard), audit trail gap (signerEmail in event metadata), partial-send failure (individual email failure must not void already-created tokens). +**Research flag:** The advisory lock interaction with Drizzle transactions may warrant a focused research pass if the developer is unfamiliar with Postgres advisory locks. 
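The completion-detection sequence above (count unclaimed tokens, then claim `completionTriggeredAt` exactly once) can be sketched as pure logic. The SQL each step maps to is shown in comments; in the real route both steps would run inside the transaction holding `pg_advisory_xact_lock(hashtext(documentId))`. All type and function names here are illustrative, not the project's actual code.

```typescript
// In-memory model of the Phase 3 completion logic; the production version
// issues the SQL shown in comments against Postgres via Drizzle.

interface TokenRow { jti: string; documentId: string; usedAt: Date | null }
interface DocRow { id: string; completionTriggeredAt: Date | null }

// SQL equivalent: SELECT count(*) FROM signing_tokens
//                 WHERE document_id = $1 AND used_at IS NULL
function unclaimedTokenCount(tokens: TokenRow[], documentId: string): number {
  return tokens.filter((t) => t.documentId === documentId && t.usedAt === null).length;
}

// SQL equivalent: UPDATE documents SET completion_triggered_at = now()
//                 WHERE id = $1 AND completion_triggered_at IS NULL RETURNING id
// Zero rows returned means another handler already claimed completion.
function tryClaimCompletion(doc: DocRow): boolean {
  if (doc.completionTriggeredAt !== null) return false;
  doc.completionTriggeredAt = new Date();
  return true;
}

// Handler tail, after this signer's token has been atomically marked used:
function shouldAssembleFinalPdf(tokens: TokenRow[], doc: DocRow): boolean {
  if (unclaimedTokenCount(tokens, doc.id) > 0) return false; // signers remain
  return tryClaimCompletion(doc); // exactly one concurrent handler wins
}
```

The guard makes final assembly idempotent: of two concurrent last-signer handlers, exactly one sees a returned row and proceeds to assemble the PDF and fire notifications.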
-### Phase 4: Filled Document Preview +### Phase 4: Multi-Signer UI (PreparePanel + FieldPlacer) -**Rationale:** Preview depends on the fully extended `preparePdf` from Phase 3 and agent signing from Phase 2. It is a composition of previous phases — build it after those foundations are solid. -**Delivers:** `POST /api/documents/[id]/preview` route; `PreviewModal` component with in-app react-pdf rendering; versioned preview path with staleness detection; Send button disabled when fields changed since last preview; Back-to-edit flow; prepared PDF hashed at prepare time (extend existing `pdfHash` pattern). -**Uses:** Existing `preparePdf` (reused unchanged), `react-pdf@10.4.1` (existing), ArrayBuffer copy pattern for react-pdf detachment bug -**Avoids:** Pitfall 7 (stale preview), Pitfall 8 (OOM — generate-once, serve-cached pattern), Pitfall 9 (client signs different doc than agent previewed — hash verification) -**Research flag:** Deployment target should be confirmed before implementation — the write-to-local-`uploads/` preview pattern fails on Vercel serverless (ephemeral filesystem). If deployed to Vercel, preview must write to Vercel Blob instead. +**Rationale:** UI changes come last — they have no downstream dependencies and are safe to build once the backend is fully operational and tested. +**Delivers:** Multi-signer list entry in PreparePanel (name + email rows, add/remove, replaces single email textarea), `PUT /api/documents/[id]/signers` endpoint, per-field signer email selector in FieldPlacer, color-coded field markers by signer, send-block validation (all client-visible fields must have signerEmail before send is allowed), agent dashboard showing per-signer status. +**Avoids:** Chicken-and-egg ordering problem (signer list is saved to `documents.signers` before FieldPlacer is loaded; enforce in UI). +**Research flag:** Standard React and Next.js API patterns — no additional research needed. 
-### Phase 5: AI Field Placement + Pre-fill +### Phase 5: Docker Production Deployment + SMTP Fix -**Rationale:** AI is the highest-complexity feature and depends on field types being fully placeable (Phase 3) and the FieldPlacer accepting `DocumentField[]` from an external source. Building last means the agent can use manual placement throughout earlier phases. AI placement is an enhancement of the field system, not a replacement. -**Delivers:** `lib/ai/extract-text.ts` (pdfjs-dist legacy build, server-only); `lib/ai/field-placement.ts` (GPT-4o-mini structured output, manual JSON schema, `server-only` guard); `POST /api/documents/[id]/ai-prepare` route with coordinate conversion utility + unit test; "AI Auto-place" button in PreparePanel with loading state and agent review step; AI pre-fill of text fields from client profile data. -**Uses:** `openai@^6.32.0` (new install), pdfjs-dist legacy build (existing), gpt-4o-mini (sufficient for structured label extraction; ~15x cheaper than gpt-4o) -**Avoids:** Pitfall 2 (coordinate mismatch — unit-tested conversion utility against known Utah REPC before shipping), Pitfall 3 (token limits — full-form test required), Pitfall 4 (hallucination — Zod validation of AI response before any field is stored; explicit enum for field types in JSON schema) -**Research flag:** Requires integration test with real 20-page Utah REPC before shipping. Also validate that gpt-4o-mini text extraction accuracy on Utah standard forms (which have predictable label patterns) meets the 90%+ threshold claimed in research. +**Rationale:** Deployment comes after all application features are complete and tested locally. The SMTP fix and `NEXT_PUBLIC_BASE_URL` rename must be verified before multi-signer completion emails are tested in production. 
+**Delivers:** Three-stage Dockerfile (`node:20-slim`, Next.js `output: 'standalone'`), `docker-compose.yml` with named uploads volume + Neon connection via `env_file:`, `NEXT_PUBLIC_BASE_URL` renamed to `SIGNING_BASE_URL`, Neon connection pool limits (`max: 5`) in `db/index.ts`, `/api/health` endpoint, `.dockerignore`, SMTP transporter consolidated to shared utility, deployment runbook. +**Avoids:** All Docker pitfalls: ephemeral uploads (named volume), wrong base image (`node:20-slim` not alpine), build-time URL baking (rename variable), SMTP failure (env_file + verification step), connection exhaustion (explicit max), startup order failure (healthcheck + depends_on if using local Postgres). +**Research flag:** Standard patterns — official Next.js Docker documentation covers the three-stage standalone pattern precisely. ### Phase Ordering Rationale -- Phase 1 is a safety gate — deploy it before any document with new field types can be created or sent -- Phase 2 before Phase 3 because `prepare-document.ts` needs the agent-sig embed pattern established before adding the full type-aware rendering switch -- Phase 3 before Phase 4 because preview calls `preparePdf` — incomplete field type handling in prepare means an incomplete preview -- Phase 5 last because it enhances a complete field system; agents can use manual placement throughout all earlier phases; no blocking dependency -- The agent-signature field filtering (Pitfall 10) is addressed in Phase 1, not Phase 2 — this is deliberate; the signing route must be hardened before the first agent-sig field can be placed and sent +- v1.1 must complete before v1.2 begins: multi-signer adds schema complexity on top of the field placement and signing pipeline; debugging both simultaneously is high-risk. - Schema before backend: multi-signer schema changes are additive and can deploy to production Neon independently; multi-signer backend code cannot run safely until the new schema is deployed.
+- Backend before UI: the signing POST rewrite is the load-bearing change; FieldPlacer signer assignment has no effect until token and completion logic is correct. +- Deployment last: Docker work is independent of application features but should be done with a complete, tested application to avoid conflating feature bugs with deployment bugs. ### Research Flags -**Needs deeper research during planning:** -- **Phase 5 (AI):** The coordinate conversion from percentage to PDF user-space points needs a concrete unit test against a known Utah REPC before implementation. Validate pdfjs-dist legacy build text extraction works correctly in the project's actual Node 20 / Next.js 16.2 environment. -- **Phase 4 (Preview):** Deployment target (Vercel serverless vs. self-hosted container) determines whether preview files can use the local `uploads/` filesystem or must use Vercel Blob. Confirm before writing the preview route. +Phases likely needing `/gsd:research-phase` during planning: +- **Phase 3 (Multi-Signer Backend):** Advisory lock pattern and Drizzle transaction interaction may need a targeted research pass for developers unfamiliar with `pg_advisory_xact_lock`. -**Standard patterns (skip research-phase):** -- **Phase 1 (Schema):** Drizzle discriminated union extension and nullable column additions are well-documented; two-line ALTER TABLE migration. -- **Phase 2 (Agent Signature):** The draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are a DB column and API route. -- **Phase 3 (Field Types):** All field type APIs are in existing `@cantoo/pdf-lib@2.6.3`; no new library research needed. +Phases with standard patterns (skip research-phase): +- **Phase 1 (v1.1):** Already researched and partially implemented. +- **Phase 2 (Schema):** Drizzle migration patterns are routine. +- **Phase 4 (Multi-Signer UI):** Standard React + API patterns. +- **Phase 5 (Docker):** Officially documented Next.js standalone Docker pattern. 
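The Docker decisions above (runtime `env_file:` injection, named uploads volume, health endpoint) can be sketched as a minimal compose file. This is a sketch under stated assumptions, not the project's actual configuration: the `web` service name, `.env.production` filename, and healthcheck command are all illustrative.

```yaml
# Illustrative docker-compose.yml sketch; service and volume names are assumptions.
services:
  web:
    build: .
    ports:
      - "3000:3000"
    env_file:
      - .env.production      # SMTP vars + SIGNING_BASE_URL reach the container at runtime
    volumes:
      - uploads:/app/uploads # named volume: signed PDFs survive container recreation
    healthcheck:
      # node:20-slim ships neither curl nor wget; Node 20's global fetch works instead
      test: ["CMD-SHELL", "node -e \"fetch('http://localhost:3000/api/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))\""]
      interval: 30s
      timeout: 5s
      retries: 3

volumes:
  uploads:
```

With a layout like this, the SMTP verification step from the pitfalls list would be `docker compose exec web printenv CONTACT_SMTP_HOST`, which should print a value before the first production email test.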
+ +--- ## Confidence Assessment | Area | Confidence | Notes | |------|------------|-------| -| Stack | HIGH | All versions verified via npm registry; OpenAI Zod v4 incompatibility confirmed via open GitHub issues #1540, #1602, #1709; pdfjs-dist server-side usage confirmed via actual codebase inspection | -| Features | HIGH for field types and signing flows; MEDIUM for AI field detection accuracy | Field behavior confirmed against DocuSign, dotloop, SkySlope docs; AI coordinate accuracy confirmed via Feb 2025 benchmarks (< 3% pixel accuracy from vision); actual accuracy on Utah forms is untested | -| Architecture | HIGH | Based on actual v1.0 codebase review (not speculative); specific file names, function names, and line numbers cited throughout; build order confirmed by dependency analysis | -| Pitfalls | HIGH | All pitfalls grounded in actual codebase inspection; specific file paths and line numbers identified (e.g., sign route line 88); no speculative claims | +| Stack | HIGH | All decisions code-verified against `package.json` and running source files. Zero speculative library choices. Zod v4 / zodResponseFormat incompatibility confirmed via open GitHub issues. | +| Features | HIGH | Industry analysis of DocuSign, dotloop, SkySlope DigiSign confirms field type behavior and agent-signs-first convention. AI coordinate accuracy from February 2025 published benchmarks. | +| Architecture | HIGH | Multi-signer design based on direct inspection of schema.ts, signing route, send route. Advisory lock pattern is Postgres-standard. OpenSign reference confirms the data model approach. | +| Pitfalls | HIGH | All pitfalls cite specific file paths and code from the v1.1 codebase. NEXT_PUBLIC bake-at-build confirmed against Next.js official docs. No speculative claims. 
| **Overall confidence:** HIGH ### Gaps to Address -- **AI coordinate accuracy on real Utah forms:** Research confirms the text-extraction + label-matching approach is correct, but accuracy on actual Utah REPC and listing agreement forms is untested. Phase 5 must include an integration test with real forms before the feature ships. -- **Preview file lifecycle in production:** The `_preview_{timestamp}.pdf` pattern creates unbounded file growth in `uploads/`. A cleanup strategy (delete previews older than 24 hours, or delete on document send) needs to be decided before Phase 4 implementation. -- **Deployment target for preview writes:** The write-to-disk preview pattern silently fails on Vercel serverless (ephemeral filesystem). Confirm whether the app runs on Vercel serverless or a persistent container before implementing Phase 4. +- **`@vercel/blob` dead dependency:** Installed in `package.json` but not used anywhere in the codebase. Risks accidental use in future code that would silently fail in a non-Vercel Docker deployment. Decision needed before Phase 5: remove the package, or document it as unused and ensure it stays that way. +- **Nodemailer port mismatch between mailers:** `signing-mailer.tsx` defaults to port 465 (implicit TLS) and `contact-mailer.ts` defaults to port 587 (STARTTLS) when `CONTACT_SMTP_PORT` is not set. A shared `createSmtpTransporter()` utility is needed; this should be addressed in Phase 5 before the first production email test. +- **AI accuracy on non-standard forms:** The 90%+ accuracy expectation applies to Utah standard forms with consistent label patterns. The system's behavior on commercial addenda, non-standard addenda, or scanned PDFs is untested. The graceful fallback (manual drag-drop with a clear message) handles the failure case, but real-world accuracy across Teressa's full form library should be validated in Phase 1 before committing to the AI-first workflow. +- **Local Postgres vs. 
Neon in Docker:** The research and architecture assume Neon as the production database (external managed service). If a local `postgres` Docker service is substituted, the `depends_on: service_healthy` pattern documented in ARCHITECTURE.md applies. The current research covers the Neon path only. + +--- ## Sources ### Primary (HIGH confidence) -- `src/lib/db/schema.ts` (actual codebase, inspected 2026-03-21) — `SignatureFieldData` has no `type` field confirmed -- `src/app/api/sign/[token]/route.ts` line 88 (actual codebase) — unfiltered `signatureFields` sent to client confirmed -- `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` (actual codebase) — single "Signature" token; `screenToPdfCoords` Y-inversion pattern confirmed -- [openai npm](https://www.npmjs.com/package/openai) — v6.32.0 confirmed, Node 20 requirement -- [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — manual json_schema format confirmed +- Direct code audit: `src/lib/ai/extract-text.ts`, `src/lib/ai/field-placement.ts`, `src/lib/db/schema.ts`, `src/lib/signing/signing-mailer.tsx`, `src/lib/db/index.ts`, `next.config.ts`, `package.json` — architecture verification and pitfall specifics +- [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — manual JSON schema format confirmed - [openai-node Issue #1540](https://github.com/openai/openai-node/issues/1540) — zodResponseFormat broken with Zod v4 - [openai-node Issue #1602](https://github.com/openai/openai-node/issues/1602) — zodTextFormat broken with Zod v4 -- [openai-node Issue #1709](https://github.com/openai/openai-node/issues/1709) — Zod 4.1.13+ discriminated union break -- [@cantoo/pdf-lib npm](https://www.npmjs.com/package/@cantoo/pdf-lib) — v2.6.3; createTextField, createCheckBox, drawImage APIs confirmed -- [react-pdf ArrayBuffer detach issue #1657](https://github.com/wojtekmaj/react-pdf/issues/1657) — ArrayBuffer copy workaround 
confirmed -- [Vercel Serverless Function Limits](https://vercel.com/docs/functions/runtimes/node-js#memory-and-compute) — 256MB default memory, 60s max execution on Pro -- [Utah Division of Real Estate — State Approved Forms](https://realestate.utah.gov/real-estate/forms/state-approved/) — REPC form structure context +- [@cantoo/pdf-lib npm page](https://www.npmjs.com/package/@cantoo/pdf-lib) — v2.6.3 field type API +- [signature_pad GitHub](https://github.com/szimek/signature_pad) — v5.1.3 canvas API +- [Docker Official Next.js Containerize Guide](https://docs.docker.com/guides/nextjs/containerize/) — three-stage Dockerfile +- [Next.js with-docker official example](https://github.com/vercel/next.js/blob/canary/examples/with-docker/Dockerfile) — standalone mode pattern +- [Next.js env var classification](https://nextjs.org/docs/pages/guides/environment-variables) — NEXT_PUBLIC_ bake-at-build-time behavior +- [Docker Compose Secrets — Official Docs](https://docs.docker.com/compose/how-tos/use-secrets/) — env_file vs secrets decision +- [OpenSign GitHub Repository](https://github.com/OpenSignLabs/OpenSign) — multi-signer reference implementation (signers array in document record) ### Secondary (MEDIUM confidence) -- [Edge AI and Vision Alliance — SAM 2 + GPT-4o (Feb 2025)](https://www.edge-ai-vision.com/2025/02/sam-2-gpt-4o-cascading-foundation-models-via-visual-prompting-part-2/) — GPT-4o returns accurate bounding box coordinates in < 3% of attempts -- [Instafill.ai — Real estate law flat PDF form automation (Feb 2026)](https://blog.instafill.ai/2026/02/18/case-study-real-estate-law-flat-pdf-form-automation/) — hybrid text-extraction + LLM approach confirmed as production pattern -- [DocuSign community — routing order for real estate](https://community.docusign.com/esignature-111/prefill-fields-before-sending-envelope-for-signature-180) — agent order 1, client order 2 confirmed -- [Dotloop support — date auto-stamp 
behavior](https://support.dotloop.com/hc/en-us/articles/217936457-Adding-Signatures-or-Initials-to-Locked-Templates) — date field auto-stamp pattern confirmed -- [DocuSign community — Date Signed field](https://community.docusign.com/esignature-111/am-i-able-to-auto-populate-the-date-field-2271) — read-only auto-populated date confirmed +- [Next.js Standalone Docker Mode — DEV Community (2025)](https://dev.to/angojay/optimizing-nextjs-docker-images-with-standalone-mode-2nnh) — image size benchmarks (~300 MB vs ~7 GB) +- [Docker DNS EAI_AGAIN — dev.to (2025)](https://dev.to/ameer-pk/beyond-it-works-on-my-machine-solving-docker-networking-dns-bottlenecks-4f3m) — DNS resolution behavior in Docker bridge network +- [Docker Compose Secrets: What Works, What Doesn't](https://www.bitdoze.com/docker-compose-secrets/) — plain Compose vs Swarm secrets security comparison +- Published AI bounding box accuracy benchmarks (February 2025) — under 3% accuracy for vision-based coordinate inference from PDF images +- [Nodemailer Docker EAI_AGAIN — Docker Forums](https://forums.docker.com/t/not-able-to-send-email-using-nodemailer-within-docker-container-due-to-eai-again/40649) — root cause confirmed as env var injection, not DNS --- -*Research completed: 2026-03-21* + +*Research completed: 2026-04-03* +*Milestone coverage: v1.1 (AI field placement + expanded field types) + v1.2 (multi-signer + Docker)* *Ready for roadmap: yes*