Project Research Summary

Project: Teressa Copeland Homes, v1.1 Smart Document Preparation
Domain: Real estate agent website + PDF document signing portal
Researched: 2026-03-21
Confidence: HIGH

Executive Summary

This is a v1.1 feature expansion of an existing, working Next.js 15 real estate document signing app. The v1.0 codebase is already validated — it uses Drizzle ORM, local PostgreSQL, @cantoo/pdf-lib for PDF writing, react-pdf for client-side rendering, Auth.js v5, and signature_pad for canvas signatures. The v1.1 additions are: AI-assisted field placement via GPT-4o-mini, five new field types (text, checkbox, initials, date, agent-signature), agent saved signature with a draw-once-reuse workflow, and a filled document preview before sending. The minimal dependency delta is two new packages: openai@^6.32.0 and optionally unpdf@^1.4.0 — though pdfjs-dist is already installed as a transitive dependency of react-pdf and can serve the server-side text extraction role via its legacy build.

The recommended build order is anchored by a schema-first phase. The SignatureFieldData type currently has no type discriminant — every field is treated identically as a client signature. Adding new field types without simultaneously updating both the schema AND the client signing page would break any in-flight signing session. The architecture research maps out an explicit 8-step dependency chain. For AI field placement, the correct approach uses pdfjs-dist for server-side text extraction (not vision), then GPT-4o-mini for semantic label classification — raw vision-based bounding box inference returns accurate coordinates less than 3% of the time. The OpenAI integration must use a manually defined JSON schema for structured output; the zodResponseFormat helper is broken with Zod v4 (confirmed open bug).

The key risk cluster is around the AI coordinate pipeline and signing page integrity. OpenAI returns percentage-based coordinates; @cantoo/pdf-lib expects PDF user-space points with a bottom-left origin — a Y-axis inversion that will silently produce wrong field positions without a dedicated conversion utility and unit test. A second risk is that agent-signature fields must be filtered from the signatureFields array sent to clients — the exact unguarded line (/src/app/api/sign/[token]/route.ts line 88) is identified in pitfalls research. Preview PDFs must use versioned paths separate from the final prepared PDF to maintain legal integrity between what the agent reviewed and what the client signs.
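The Y-axis inversion described above can be sketched as a small pure function. This is a minimal illustration, not the project's utility: the PercentBox and PdfBox shapes and their property names are assumptions, and only the function name aiCoordsToPagePdfSpace comes from the research notes.

```typescript
// Hypothetical input shape: percentages measured from the page's TOP-LEFT corner.
interface PercentBox {
  xPct: number;      // left edge, 0-100
  yPct: number;      // top edge, 0-100, measured downward from the top
  widthPct: number;  // 0-100
  heightPct: number; // 0-100
}

// Hypothetical output shape: PDF user-space points, origin at BOTTOM-LEFT.
interface PdfBox {
  x: number;      // points from the left edge
  y: number;      // points from the bottom edge to the box's LOWER edge
  width: number;
  height: number;
}

function aiCoordsToPagePdfSpace(
  box: PercentBox,
  pageWidthPt: number,
  pageHeightPt: number,
): PdfBox {
  const width = (box.widthPct / 100) * pageWidthPt;
  const height = (box.heightPct / 100) * pageHeightPt;
  const x = (box.xPct / 100) * pageWidthPt;
  // Distance from the top of the page down to the box's top edge.
  const topFromTop = (box.yPct / 100) * pageHeightPt;
  // Flip the axis: the lower edge sits (topFromTop + height) below the page top.
  const y = pageHeightPt - topFromTop - height;
  return { x, y, width, height };
}

// US Letter is 612 x 792 points.
const converted = aiCoordsToPagePdfSpace(
  { xPct: 50, yPct: 25, widthPct: 25, heightPct: 25 },
  612,
  792,
);
// converted is { x: 306, y: 396, width: 153, height: 198 }
```

The page dimensions should come from the actual PDF page rather than a constant, since the research notes call for conversion "using actual page dimensions".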

Key Findings

The v1.0 stack is unchanged and validated. See STACK.md for full version details.

New dependencies for v1.1:

  • openai@^6.32.0: Official SDK, TypeScript-native structured output for GPT-4o-mini — use manual json_schema response_format, NOT zodResponseFormat (broken with Zod v4, confirmed open GitHub issues #1540, #1602, #1709)
  • pdfjs-dist legacy build (already installed): Server-side PDF text extraction via pdfjs-dist/legacy/build/pdf.mjs — no new dependency needed if using this path
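A manually defined json_schema response_format might look like the sketch below. The schema name, the property names (xPct, yPct, and so on), and the 1-based page convention are assumptions for illustration; the six field type names come from this summary. The OpenAI SDK call itself is shown only in a comment so the block stays dependency-free.

```typescript
// Field types named in this summary (five new types plus the v1.0 client signature).
const FIELD_TYPES = [
  "client-signature",
  "initials",
  "text",
  "checkbox",
  "date",
  "agent-signature",
] as const;

// Hand-written JSON Schema. In strict mode every property must be listed in
// `required` and `additionalProperties` must be false.
const fieldPlacementSchema = {
  type: "object",
  additionalProperties: false,
  required: ["fields"],
  properties: {
    fields: {
      type: "array",
      items: {
        type: "object",
        additionalProperties: false,
        required: ["type", "page", "xPct", "yPct", "widthPct", "heightPct"],
        properties: {
          type: { type: "string", enum: [...FIELD_TYPES] },
          page: { type: "integer" }, // assumed 1-based
          xPct: { type: "number" },
          yPct: { type: "number" },
          widthPct: { type: "number" },
          heightPct: { type: "number" },
        },
      },
    },
  },
};

// Value for the Chat Completions `response_format` parameter.
const fieldPlacementResponseFormat = {
  type: "json_schema" as const,
  json_schema: {
    name: "field_placement", // hypothetical name
    strict: true,
    schema: fieldPlacementSchema,
  },
};

// Intended use (not executed here):
//   await client.chat.completions.create({
//     model: "gpt-4o-mini",
//     messages,
//     response_format: fieldPlacementResponseFormat,
//   });
```

Range limits such as 0-100 are deliberately left out of the schema and checked after parsing instead, since schema-level numeric constraints may not be honored in strict mode.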

Existing stack components covering all v1.1 needs:

  • @cantoo/pdf-lib@2.6.3: All five new field types (text, checkbox, initials, date, agent-signature) supported natively via createTextField, createCheckBox, drawImage APIs
  • signature_pad@5.1.3: Agent signature canvas — use useRef<HTMLCanvasElement> + useEffect pattern directly; do NOT add react-signature-canvas (alpha wrapper)
  • react-pdf@10.4.1: Filled preview rendering — pass ArrayBuffer directly; copy the buffer before passing to avoid detachment issue (known bug #1657)
  • @vercel/blob@2.3.1 + Drizzle ORM: Agent signature storage — architecture research recommends TEXT column on users table for 2-8KB base64 PNG; no new file storage needed
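The buffer-copy advice for react-pdf can be illustrated with a one-line helper. The function name copyForViewer is hypothetical; the point is only that the viewer receives its own copy, so the caller's bytes stay usable if the copy is detached (the issue the notes cite as #1657).

```typescript
// Returns an independent copy of the PDF bytes. ArrayBuffer.prototype.slice
// allocates a new buffer, so detaching or mutating the copy leaves the
// original untouched.
function copyForViewer(pdfBytes: ArrayBuffer): ArrayBuffer {
  return pdfBytes.slice(0);
}

// Usage sketch: keep `source` for later re-renders, hand `forViewer` to the viewer.
const source = new ArrayBuffer(4);
new Uint8Array(source).set([0x25, 0x50, 0x44, 0x46]); // "%PDF"
const forViewer = copyForViewer(source);
```

Copying costs one extra allocation per render, which seems acceptable for documents of the size discussed here.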

Expected Features

All v1.1 features are P1 (must-have for launch). Research confirms the full feature set is aligned with industry standard behavior across DocuSign, dotloop, and SkySlope DigiSign.

Must have (table stakes):

  • Initials field type — every Utah standard form (REPC, listing agreement, addenda) has per-page initials lines; missing this makes the app unusable for standard Utah workflows
  • Date field (auto-stamp, read-only) — "Date Signed" pattern; auto-populated at signing session completion; client never types a date; legally important
  • Checkbox field type — Utah REPC uses boolean checkboxes throughout (mediation clauses, contingency elections, disclosure acknowledgments)
  • Agent saved signature — draw once, reuse across documents; the "Adopted Signature" pattern in every major real estate e-sig tool
  • Agent signs first workflow — industry convention: agent at routing order 1, client at routing order 2; confirmed by DocuSign community docs
  • Filled document preview with Send gating — prevents the most-cited mistake (sending wrong document version); Send button lives in preview

Should have (differentiators):

  • AI field placement via gpt-4o-mini + text extraction: eliminates the manual drag-drop session; research claims 90%+ accuracy on structured Utah forms with predictable label patterns ("Buyer's Signature", "Date", "Initial Here"), a figure not yet validated on real forms
  • AI pre-fill from client profile — maps client name, email, property address to text fields; low hallucination risk (structured profile data, not free-text inference)
  • Property address field on client profile — enables AI pre-fill to be property-specific; simple schema addition

Defer to v1.2+:

  • AI confidence display to agent — adds UI noise; agent can see and correct in preview instead
  • Template save from AI placement — high value but requires template management UI; defer until AI accuracy is validated
  • Multiple agent signature fields per document — needs UX design; defer

Architecture Approach

The v1.1 architecture is an incremental extension of the existing system — not a rewrite. Seven new files are created (two server-only AI lib files, three API routes, two client components). Eight existing files are modified with targeted additions. The critical architectural constraint: the existing client signing flow (embed-signature.ts, signing token route, SignatureModal.tsx) must not be altered. Agent-sig and text/checkbox/date fields are baked into the prepared PDF before the client opens the signing link. The client signing page handles only client-signature and initials field types.

See ARCHITECTURE.md for complete component boundaries, data flow diagrams, and the full 8-step build order.

Major components:

  1. lib/ai/extract-text.ts + lib/ai/field-placement.ts (NEW, server-only) — pdfjs-dist legacy build for text extraction; GPT-4o-mini structured output with manual JSON schema; server-only import guard prevents accidental client bundle inclusion
  2. POST /api/documents/[id]/ai-prepare (NEW) — orchestrates extract + AI call + coordinate conversion (percentage to PDF points using actual page dimensions)
  3. GET/PUT /api/agent/signature (NEW) — stores agent signature as base64 PNG TEXT column on users table; always auth-gated
  4. POST /api/documents/[id]/preview (NEW) — reuses existing preparePdf in preview mode; writes to versioned _preview_{timestamp}.pdf; streams bytes directly; never overwrites final prepared PDF
  5. Extended FieldPlacer.tsx palette — five new draggable tokens; existing drag/move/resize/persist mechanics unchanged
  6. Extended prepare-document.ts — type-aware rendering switch for all six field types; existing client-signature path unchanged

Critical Pitfalls

  1. Breaking the signing page with new field types: SigningPageClient.tsx opens the signature modal for every field in signatureFields with no type branching. Adding new field types without updating the signing page in the same deployment breaks active signing sessions. Ship schema + signing page filter as one atomic deployment, before any other v1.1 work.

  2. AI coordinate Y-axis inversion: AI returns percentages from top-left; @cantoo/pdf-lib uses PDF user-space with Y=0 at bottom. Storing AI coordinates without conversion inverts every field position. Write an aiCoordsToPagePdfSpace() conversion utility with a unit test asserting known PDF-space x/y values against a real Utah REPC before any OpenAI call is made.

  3. Agent-signature field sent unfiltered to client: /src/app/api/sign/[token]/route.ts line 88 returns doc.signatureFields ?? [] without type filtering. When agent-signature fields are in that array, the client sees them as required unsigned fields. Add type filter before any agent-signed document is sent.

  4. Stale preview after field changes — preview PDF written to a deterministic path gets cached; agent sends a document based on a stale preview. Use versioned preview paths ({docId}_preview_{timestamp}.pdf) and disable Send when fields have changed since last preview generation.

  5. OpenAI token limits on multi-page Utah forms — Utah standard forms are 10-30 pages; full text extraction fits in ~2,000-8,000 tokens (within gpt-4o-mini's 128k context). Risk: testing only with 2-3 page PDFs in development. Prevention: test AI pipeline with the full Utah REPC (20+ pages) before shipping.
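The type filter called for in pitfalls 1 and 3 could be as small as the sketch below. The StoredField shape and the fieldsForClient name are assumptions; the rule that only client-signature and initials fields reach the client, and the legacy fallback to 'client-signature', follow the architecture notes.

```typescript
type FieldType =
  | "client-signature"
  | "initials"
  | "text"
  | "checkbox"
  | "date"
  | "agent-signature";

// Hypothetical stored shape. `type` is optional because v1.0 rows predate it.
interface StoredField {
  id: string;
  page: number;
  type?: FieldType;
}

// Only these types are interactive on the client signing page.
const CLIENT_FIELD_TYPES: ReadonlySet<FieldType> = new Set<FieldType>([
  "client-signature",
  "initials",
]);

function fieldsForClient(fields: StoredField[] | null | undefined): StoredField[] {
  return (fields ?? []).filter((field) =>
    // v1.0 fields have no discriminant and are treated as client signatures.
    CLIENT_FIELD_TYPES.has(field.type ?? "client-signature"),
  );
}
```

Using an allow-list rather than excluding agent-signature means any field type added later stays hidden from clients until it is explicitly opted in.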

Implications for Roadmap

The architecture research provides an explicit 8-step build order based on hard dependencies. This maps directly to 5 phases.

Phase 1: Schema Foundation + Signing Page Safety

Rationale: The single most dangerous change in v1.1 is adding field types to a schema the client signing page does not handle. Any document with mixed field types sent before the signing page is updated is a HIGH-recovery-cost production incident. Must be first, before any other v1.1 work.
Delivers: Extended DocumentField discriminated union in schema.ts with backward-compatible fallback for v1.0 documents (type ?? 'client-signature'); two new nullable DB columns (agentSignatureData on users, propertyAddress on clients); Drizzle migration; updated SigningPageClient.tsx and POST /api/sign/[token] with type-based field filtering.
Addresses: Foundation for all expanded field types; agent-signature client exposure risk
Avoids: Pitfall 1 (signing page crash on new field types), Pitfall 10 (agent-sig field shown to client as required unsigned field)
Research flag: None needed. Drizzle discriminated union and nullable column additions are well-documented; two-line ALTER TABLE migration.

Phase 2: Agent Saved Signature + Agent Signing Workflow

Rationale: Agent signature is a prerequisite for the agent-signs-first workflow, which is a prerequisite for the filled preview (preview only makes sense after agent has signed). Agent signature embed also establishes the PNG embed pattern in prepare-document.ts that informs how other field types are handled.
Delivers: GET/PUT /api/agent/signature routes; AgentSignaturePanel component (draw + save + thumbnail); extended prepare-document.ts to embed agent-sig PNG at field coordinates; FieldPlacer palette token for agent-signature type; supersede-and-resend flow guard preventing re-preparation of sent/viewed documents without user confirmation.
Uses: signature_pad@5.1.3 (existing), @cantoo/pdf-lib@2.6.3 (existing), users.agentSignatureData TEXT column (Phase 1)
Avoids: Pitfall 5 (signature stored as dataURL in DB is correct; TEXT column is right for 2-8KB), Pitfall 6 (race condition on re-preparation), Pitfall 10 (agent-sig filtered from client fields via Phase 1 foundation)
Research flag: None needed. The draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are DB column and API route.
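Before the PUT route stores a signature in the TEXT column, the payload could be checked with something like the following. This is a sketch under assumptions: the function name and the 64 KiB cap are invented for illustration (the notes only say signatures are typically 2-8KB), and the check validates shape and size only, not that the bytes decode to a real image.

```typescript
const PNG_DATA_URL_PREFIX = "data:image/png;base64,";
const MAX_SIGNATURE_BYTES = 64 * 1024; // assumed cap, well above the 2-8KB expected

type SignatureCheck =
  | { ok: true; bytes: number }
  | { ok: false; reason: string };

function validateAgentSignature(dataUrl: unknown): SignatureCheck {
  if (typeof dataUrl !== "string" || !dataUrl.startsWith(PNG_DATA_URL_PREFIX)) {
    return { ok: false, reason: "expected a PNG data URL" };
  }
  const base64 = dataUrl.slice(PNG_DATA_URL_PREFIX.length);
  const wellFormed =
    base64.length > 0 &&
    base64.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(base64);
  if (!wellFormed) {
    return { ok: false, reason: "malformed base64 payload" };
  }
  // Decoded size: 3 bytes per 4 characters, minus padding.
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  const bytes = (base64.length / 4) * 3 - padding;
  if (bytes > MAX_SIGNATURE_BYTES) {
    return { ok: false, reason: "signature image too large" };
  }
  return { ok: true, bytes };
}
```

Computing the decoded size from the string length avoids allocating a buffer just to reject an oversized upload.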

Phase 3: Expanded Field Types End-to-End

Rationale: Phase 1 made the schema and signing page safe. Phase 2 established the PNG embed pattern in prepare-document.ts. Now extend the field placement UI and prepare pipeline to handle all five new field types. Completing this phase gives the agent a fully functional field system without any AI dependency.
Delivers: Five new draggable palette tokens in FieldPlacer.tsx (text, checkbox, initials, date, agent-signature); type-aware rendering in prepare-document.ts (text stamp, checkbox embed, date auto-stamp, initials placeholder); propertyAddress field in ClientModal and clients server action; field type coverage from placement through to embedded PDF.
Addresses: All P1 table stakes: initials, date, checkbox, text field types
Avoids: Pitfall 1 (signing page hardened in Phase 1 before these types can be placed and sent)
Research flag: None needed. All APIs are in existing @cantoo/pdf-lib@2.6.3.
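The type-aware rendering switch could be organized around an exhaustive switch, sketched here as a function that only names the action to take. The RenderAction labels and the PlacedField shape are hypothetical, and the actual drawing calls into @cantoo/pdf-lib are intentionally left out.

```typescript
type FieldType =
  | "client-signature"
  | "initials"
  | "text"
  | "checkbox"
  | "date"
  | "agent-signature";

// Hypothetical shape; `type` is optional so v1.0 fields still flow through.
interface PlacedField {
  type?: FieldType;
}

// Hypothetical labels for what the prepare step does with each field.
type RenderAction =
  | "unchanged-client-path"
  | "initials-placeholder"
  | "text-stamp"
  | "checkbox-embed"
  | "date-auto-stamp"
  | "agent-png-embed";

function renderActionFor(field: PlacedField): RenderAction {
  const type: FieldType = field.type ?? "client-signature";
  switch (type) {
    case "client-signature":
      return "unchanged-client-path";
    case "initials":
      return "initials-placeholder";
    case "text":
      return "text-stamp";
    case "checkbox":
      return "checkbox-embed";
    case "date":
      return "date-auto-stamp";
    case "agent-signature":
      return "agent-png-embed";
    default: {
      // Compile-time exhaustiveness check: adding a seventh FieldType
      // without a case above makes this assignment a type error.
      const unreachable: never = type;
      throw new Error(`Unhandled field type: ${String(unreachable)}`);
    }
  }
}
```

The never assignment is the useful part: a new field type cannot be added to the union without the compiler pointing at this switch.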

Phase 4: Filled Document Preview

Rationale: Preview depends on the fully extended preparePdf from Phase 3 and agent signing from Phase 2. It is a composition of previous phases, so build it after those foundations are solid.
Delivers: POST /api/documents/[id]/preview route; PreviewModal component with in-app react-pdf rendering; versioned preview path with staleness detection; Send button disabled when fields changed since last preview; Back-to-edit flow; prepared PDF hashed at prepare time (extend existing pdfHash pattern).
Uses: Existing preparePdf (reused unchanged), react-pdf@10.4.1 (existing), ArrayBuffer copy pattern for react-pdf detachment bug
Avoids: Pitfall 7 (stale preview), Pitfall 8 (OOM: generate-once, serve-cached pattern), Pitfall 9 (client signs different doc than agent previewed: hash verification)
Research flag: Deployment target should be confirmed before implementation. The write-to-local-uploads/ preview pattern fails on Vercel serverless (ephemeral filesystem). If deployed to Vercel, preview must write to Vercel Blob instead.
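Versioned preview paths and Send gating can be sketched without any I/O. All names here (PreviewRecord, fingerprintFields, canSend) are hypothetical, and the fingerprint is a plain JSON.stringify stand-in rather than the pdfHash pattern the notes mention; only the {docId}_preview_{timestamp}.pdf naming comes from the research.

```typescript
// Hypothetical record kept alongside each generated preview.
interface PreviewRecord {
  path: string;
  fieldsFingerprint: string;
}

// Stand-in fingerprint. JSON.stringify is sensitive to key order, so the
// field objects must be serialized consistently for this to be reliable.
function fingerprintFields(fields: readonly unknown[]): string {
  return JSON.stringify(fields);
}

// Versioned path so a new preview never overwrites (or is served from the
// cache of) an older one.
function previewPath(docId: string, timestampMs: number): string {
  return `${docId}_preview_${timestampMs}.pdf`;
}

function createPreviewRecord(
  docId: string,
  fields: readonly unknown[],
  timestampMs: number = Date.now(),
): PreviewRecord {
  return {
    path: previewPath(docId, timestampMs),
    fieldsFingerprint: fingerprintFields(fields),
  };
}

// Send is allowed only when a preview exists and was generated from the
// exact field set currently on the document.
function canSend(
  preview: PreviewRecord | null,
  currentFields: readonly unknown[],
): boolean {
  return (
    preview !== null &&
    preview.fieldsFingerprint === fingerprintFields(currentFields)
  );
}
```

A client component could call canSend on every field change to drive the disabled state of the Send button.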

Phase 5: AI Field Placement + Pre-fill

Rationale: AI is the highest-complexity feature and depends on field types being fully placeable (Phase 3) and the FieldPlacer accepting DocumentField[] from an external source. Building last means the agent can use manual placement throughout earlier phases. AI placement is an enhancement of the field system, not a replacement.
Delivers: lib/ai/extract-text.ts (pdfjs-dist legacy build, server-only); lib/ai/field-placement.ts (GPT-4o-mini structured output, manual JSON schema, server-only guard); POST /api/documents/[id]/ai-prepare route with coordinate conversion utility + unit test; "AI Auto-place" button in PreparePanel with loading state and agent review step; AI pre-fill of text fields from client profile data.
Uses: openai@^6.32.0 (new install), pdfjs-dist legacy build (existing), gpt-4o-mini (sufficient for structured label extraction; ~15x cheaper than gpt-4o)
Avoids: Pitfall 2 (coordinate mismatch: unit-tested conversion utility against known Utah REPC before shipping), Pitfall 3 (token limits: full-form test required), Pitfall 4 (hallucination: Zod validation of AI response before any field is stored; explicit enum for field types in JSON schema)
Research flag: Requires integration test with real 20-page Utah REPC before shipping. Also validate that gpt-4o-mini text extraction accuracy on Utah standard forms (which have predictable label patterns) meets the 90%+ threshold claimed in research.
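Validation of the AI response before storage is sketched below with a hand-rolled guard. It stands in for the Zod schema the notes call for, purely to keep the example dependency-free. The AiField shape, the 1-based page numbering, and the choice to drop invalid entries instead of rejecting the whole response are all assumptions.

```typescript
const ALLOWED_TYPES: readonly string[] = [
  "client-signature",
  "initials",
  "text",
  "checkbox",
  "date",
  "agent-signature",
];

// Hypothetical shape of one AI-suggested field (percentages from top-left).
interface AiField {
  type: string;
  page: number; // assumed 1-based
  xPct: number;
  yPct: number;
  widthPct: number;
  heightPct: number;
}

function isPercent(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value) && value >= 0 && value <= 100;
}

function parseAiFields(raw: unknown, pageCount: number): AiField[] {
  if (typeof raw !== "object" || raw === null) return [];
  const list = (raw as { fields?: unknown }).fields;
  if (!Array.isArray(list)) return [];

  const valid: AiField[] = [];
  for (const item of list) {
    if (typeof item !== "object" || item === null) continue;
    const { type, page, xPct, yPct, widthPct, heightPct } = item as Record<string, unknown>;

    if (typeof type !== "string" || !ALLOWED_TYPES.includes(type)) continue;
    if (typeof page !== "number" || !Number.isInteger(page)) continue;
    if (page < 1 || page > pageCount) continue;
    if (!isPercent(xPct) || !isPercent(yPct)) continue;
    if (!isPercent(widthPct) || !isPercent(heightPct)) continue;
    // Reject boxes that spill past the right or bottom page edge.
    if (xPct + widthPct > 100 || yPct + heightPct > 100) continue;

    valid.push({ type, page, xPct, yPct, widthPct, heightPct });
  }
  return valid;
}
```

Because the agent reviews AI placements before sending, silently dropping a malformed suggestion seems a reasonable default; a stricter variant could throw and surface the error instead.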

Phase Ordering Rationale

  • Phase 1 is a safety gate — deploy it before any document with new field types can be created or sent
  • Phase 2 before Phase 3 because prepare-document.ts needs the agent-sig embed pattern established before adding the full type-aware rendering switch
  • Phase 3 before Phase 4 because preview calls preparePdf — incomplete field type handling in prepare means an incomplete preview
  • Phase 5 last because it enhances a complete field system; agents can use manual placement throughout all earlier phases; no blocking dependency
  • The agent-signature field filtering (Pitfall 10) is addressed in Phase 1, not Phase 2 — this is deliberate; the signing route must be hardened before the first agent-sig field can be placed and sent

Research Flags

Needs deeper research during planning:

  • Phase 5 (AI): The coordinate conversion from percentage to PDF user-space points needs a concrete unit test against a known Utah REPC before implementation. Validate pdfjs-dist legacy build text extraction works correctly in the project's actual Node 20 / Next.js 16.2 environment.
  • Phase 4 (Preview): Deployment target (Vercel serverless vs. self-hosted container) determines whether preview files can use the local uploads/ filesystem or must use Vercel Blob. Confirm before writing the preview route.

Standard patterns (skip research-phase):

  • Phase 1 (Schema): Drizzle discriminated union extension and nullable column additions are well-documented; two-line ALTER TABLE migration.
  • Phase 2 (Agent Signature): The draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are a DB column and API route.
  • Phase 3 (Field Types): All field type APIs are in existing @cantoo/pdf-lib@2.6.3; no new library research needed.

Confidence Assessment

| Area | Confidence | Notes |
| --- | --- | --- |
| Stack | HIGH | All versions verified via npm registry; OpenAI Zod v4 incompatibility confirmed via open GitHub issues #1540, #1602, #1709; pdfjs-dist server-side usage confirmed via actual codebase inspection |
| Features | HIGH for field types and signing flows; MEDIUM for AI field detection accuracy | Field behavior confirmed against DocuSign, dotloop, SkySlope docs; AI coordinate accuracy confirmed via Feb 2025 benchmarks (< 3% pixel accuracy from vision); actual accuracy on Utah forms is untested |
| Architecture | HIGH | Based on actual v1.0 codebase review (not speculative); specific file names, function names, and line numbers cited throughout; build order confirmed by dependency analysis |
| Pitfalls | HIGH | All pitfalls grounded in actual codebase inspection; specific file paths and line numbers identified (e.g., sign route line 88); no speculative claims |

Overall confidence: HIGH

Gaps to Address

  • AI coordinate accuracy on real Utah forms: Research confirms the text-extraction + label-matching approach is correct, but accuracy on actual Utah REPC and listing agreement forms is untested. Phase 5 must include an integration test with real forms before the feature ships.
  • Preview file lifecycle in production: The _preview_{timestamp}.pdf pattern creates unbounded file growth in uploads/. A cleanup strategy (delete previews older than 24 hours, or delete on document send) needs to be decided before Phase 4 implementation.
  • Deployment target for preview writes: The write-to-disk preview pattern silently fails on Vercel serverless (ephemeral filesystem). Confirm whether the app runs on Vercel serverless or a persistent container before implementing Phase 4.
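One of the cleanup options raised above (delete previews older than 24 hours) could be driven by a pure selection function like this sketch. It assumes the timestamp embedded in the file name is epoch milliseconds, which the research does not specify, and it only selects names; deletion itself depends on whether files live on local disk or in Vercel Blob.

```typescript
// Matches the {docId}_preview_{timestamp}.pdf naming from the research notes.
const PREVIEW_NAME = /^(.+)_preview_(\d+)\.pdf$/;
const DEFAULT_MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours

// Returns the preview file names whose embedded timestamp (assumed to be
// epoch milliseconds) is older than maxAgeMs. Non-preview files are ignored.
function expiredPreviews(
  fileNames: readonly string[],
  nowMs: number,
  maxAgeMs: number = DEFAULT_MAX_AGE_MS,
): string[] {
  return fileNames.filter((name) => {
    const match = PREVIEW_NAME.exec(name);
    if (match === null) return false;
    const createdMs = Number(match[2]);
    return nowMs - createdMs > maxAgeMs;
  });
}
```

Keeping the selection separate from the delete call makes the age rule easy to unit test and lets the same function serve either storage backend.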


Research completed: 2026-03-21
Ready for roadmap: yes