# Project Research Summary
**Project:** Teressa Copeland Homes — v1.1 Smart Document Preparation
**Domain:** Real estate agent website + PDF document signing portal
**Researched:** 2026-03-21
**Confidence:** HIGH
## Executive Summary
This is a v1.1 feature expansion of an existing, working Next.js 15 real estate document signing app. The v1.0 codebase is already validated — it uses Drizzle ORM, local PostgreSQL, `@cantoo/pdf-lib` for PDF writing, `react-pdf` for client-side rendering, Auth.js v5, and `signature_pad` for canvas signatures. The v1.1 additions are: AI-assisted field placement via GPT-4o-mini, five new field types (text, checkbox, initials, date, agent-signature), agent saved signature with a draw-once-reuse workflow, and a filled document preview before sending. The minimal dependency delta is two new packages: `openai@^6.32.0` and optionally `unpdf@^1.4.0` — though `pdfjs-dist` is already installed as a transitive dependency of `react-pdf` and can serve the server-side text extraction role via its legacy build.
The recommended build order is anchored by a schema-first phase. The `SignatureFieldData` type currently has no `type` discriminant — every field is treated identically as a client signature. Adding new field types without simultaneously updating both the schema AND the client signing page would break any in-flight signing session. The architecture research maps out an explicit 8-step dependency chain. For AI field placement, the correct approach uses `pdfjs-dist` for server-side text extraction (not vision), then GPT-4o-mini for semantic label classification — raw vision-based bounding box inference returns accurate coordinates less than 3% of the time. The OpenAI integration must use a manually defined JSON schema for structured output; the `zodResponseFormat` helper is broken with Zod v4 (confirmed open bug).
The key risk cluster is around the AI coordinate pipeline and signing page integrity. OpenAI returns percentage-based coordinates; `@cantoo/pdf-lib` expects PDF user-space points with a bottom-left origin — a Y-axis inversion that will silently produce wrong field positions without a dedicated conversion utility and unit test. A second risk is that agent-signature fields must be filtered from the `signatureFields` array sent to clients — the exact unguarded line (`/src/app/api/sign/[token]/route.ts` line 88) is identified in pitfalls research. Preview PDFs must use versioned paths separate from the final prepared PDF to maintain legal integrity between what the agent reviewed and what the client signs.
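The Y-axis inversion described above can be illustrated with a minimal sketch. The function name `aiCoordsToPagePdfSpace` comes from the pitfalls research; its signature, the `PercentBox` and `PdfBox` shapes, and the 0-100 percentage range are assumptions made here for illustration, not the project's actual types.

```typescript
// Hypothetical input shape: percentages (0-100) measured from the page's TOP-LEFT corner.
interface PercentBox {
  xPct: number;
  yPct: number;
  widthPct: number;
  heightPct: number;
}

// Hypothetical output shape: PDF user-space points measured from the BOTTOM-LEFT corner.
interface PdfBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

function aiCoordsToPagePdfSpace(
  box: PercentBox,
  pageWidthPt: number,
  pageHeightPt: number,
): PdfBox {
  const width = (box.widthPct / 100) * pageWidthPt;
  const height = (box.heightPct / 100) * pageHeightPt;
  const x = (box.xPct / 100) * pageWidthPt;
  // Distance from the top of the page to the top edge of the box.
  const topFromTop = (box.yPct / 100) * pageHeightPt;
  // PDF y is the box's lower edge measured upward from the page bottom.
  const y = pageHeightPt - topFromTop - height;
  return { x, y, width, height };
}

// US Letter is 612 x 792 points.
const example = aiCoordsToPagePdfSpace(
  { xPct: 50, yPct: 25, widthPct: 25, heightPct: 12.5 },
  612,
  792,
);
// → { x: 306, y: 495, width: 153, height: 99 }
```

Subtracting the box height as well as the top offset matters: forgetting it places the field one box-height too high, which is easy to miss without a unit test against known values.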
## Key Findings
### Recommended Stack
The v1.0 stack is unchanged and validated. See `STACK.md` for full version details.
**New dependencies for v1.1:**
- `openai@^6.32.0`: Official SDK, TypeScript-native structured output for GPT-4o-mini — use manual `json_schema` response_format, NOT `zodResponseFormat` (broken with Zod v4, confirmed open GitHub issues #1540, #1602, #1709)
- `pdfjs-dist` legacy build (already installed): Server-side PDF text extraction via `pdfjs-dist/legacy/build/pdf.mjs` — no new dependency needed if using this path
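As a rough sketch of what a manually defined `json_schema` response format might look like, the object below follows the shape described in the OpenAI Structured Outputs docs (`type`, `json_schema.name`, `json_schema.strict`, `json_schema.schema`). The schema name `field_placement` and every property inside it (`page`, `label`, `xPct`, and so on) are assumptions for illustration; the real schema should mirror whatever `DocumentField` ends up containing.

```typescript
// Assumed list of field types; adjust to match the project's schema.
const FIELD_TYPES = [
  "text",
  "checkbox",
  "initials",
  "date",
  "client-signature",
  "agent-signature",
] as const;

// Plain object, no Zod helper involved. In strict mode every property must be
// listed in `required` and `additionalProperties` must be false.
const fieldPlacementResponseFormat = {
  type: "json_schema" as const,
  json_schema: {
    name: "field_placement", // hypothetical schema name
    strict: true,
    schema: {
      type: "object",
      properties: {
        fields: {
          type: "array",
          items: {
            type: "object",
            properties: {
              page: { type: "integer" },
              type: { type: "string", enum: [...FIELD_TYPES] },
              label: { type: "string" },
              xPct: { type: "number" },
              yPct: { type: "number" },
              widthPct: { type: "number" },
              heightPct: { type: "number" },
            },
            required: ["page", "type", "label", "xPct", "yPct", "widthPct", "heightPct"],
            additionalProperties: false,
          },
        },
      },
      required: ["fields"],
      additionalProperties: false,
    },
  },
};
```

An object like this would be passed as the `response_format` option of a chat completion request. The explicit `enum` keeps the model from inventing field types outside the supported set.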
**Existing stack components covering all v1.1 needs:**
- `@cantoo/pdf-lib@2.6.3`: All five new field types (text, checkbox, initials, date, agent-signature) supported natively via `createTextField`, `createCheckBox`, `drawImage` APIs
- `signature_pad@5.1.3`: Agent signature canvas — use `useRef<HTMLCanvasElement>` + `useEffect` pattern directly; do NOT add `react-signature-canvas` (alpha wrapper)
- `react-pdf@10.4.1`: Filled preview rendering — pass `ArrayBuffer` directly; copy the buffer before passing to avoid detachment issue (known bug #1657)
- `@vercel/blob@2.3.1` + Drizzle ORM: Agent signature storage — architecture research recommends TEXT column on `users` table for 2-8KB base64 PNG; no new file storage needed
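The buffer-copy workaround mentioned for `react-pdf` can be as small as the sketch below. The helper name `copyPdfBuffer` is hypothetical; the only real API used is the standard `ArrayBuffer.prototype.slice`.

```typescript
// Returns an independent copy of the PDF bytes. The copy is what would be
// handed to the viewer, so the original buffer stays usable even if the
// viewer transfers (detaches) the buffer it receives.
function copyPdfBuffer(pdfBytes: ArrayBuffer): ArrayBuffer {
  return pdfBytes.slice(0);
}
```

Copying costs one extra allocation per render, which is negligible for typical form PDFs compared with the cost of a detached-buffer error on re-render.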
### Expected Features
All v1.1 features are P1 (must-have for launch). Research confirms the full feature set is aligned with industry standard behavior across DocuSign, dotloop, and SkySlope DigiSign.
**Must have (table stakes):**
- Initials field type — every Utah standard form (REPC, listing agreement, addenda) has per-page initials lines; missing this makes the app unusable for standard Utah workflows
- Date field (auto-stamp, read-only) — "Date Signed" pattern; auto-populated at signing session completion; client never types a date; legally important
- Checkbox field type — Utah REPC uses boolean checkboxes throughout (mediation clauses, contingency elections, disclosure acknowledgments)
- Agent saved signature — draw once, reuse across documents; the "Adopted Signature" pattern in every major real estate e-sig tool
- Agent-signs-first workflow — industry convention: agent at routing order 1, client at routing order 2; confirmed by DocuSign community docs
- Filled document preview with Send gating — prevents the most-cited mistake (sending wrong document version); Send button lives in preview
**Should have (differentiators):**
- AI field placement via gpt-4o-mini + text extraction — eliminates manual drag-drop session; accuracy 90%+ on structured Utah forms with predictable label patterns ("Buyer's Signature", "Date", "Initial Here")
- AI pre-fill from client profile — maps client name, email, property address to text fields; low hallucination risk (structured profile data, not free-text inference)
- Property address field on client profile — enables AI pre-fill to be property-specific; simple schema addition
**Defer to v1.2+:**
- AI confidence display to agent — adds UI noise; agent can see and correct in preview instead
- Template save from AI placement — high value but requires template management UI; defer until AI accuracy is validated
- Multiple agent signature fields per document — needs UX design; defer
### Architecture Approach
The v1.1 architecture is an incremental extension of the existing system — not a rewrite. Seven new files are created (two server-only AI lib files, three API routes, two client components). Eight existing files are modified with targeted additions. The critical architectural constraint: the existing client signing flow (`embed-signature.ts`, signing token route, `SignatureModal.tsx`) must not be altered. Agent-sig and text/checkbox/date fields are baked into the prepared PDF before the client opens the signing link. The client signing page handles only `client-signature` and `initials` field types.
See `ARCHITECTURE.md` for complete component boundaries, data flow diagrams, and the full 8-step build order.
**Major components:**
1. `lib/ai/extract-text.ts` + `lib/ai/field-placement.ts` (NEW, server-only) — pdfjs-dist legacy build for text extraction; GPT-4o-mini structured output with manual JSON schema; `server-only` import guard prevents accidental client bundle inclusion
2. `POST /api/documents/[id]/ai-prepare` (NEW) — orchestrates extract + AI call + coordinate conversion (percentage to PDF points using actual page dimensions)
3. `GET/PUT /api/agent/signature` (NEW) — stores agent signature as base64 PNG TEXT column on `users` table; always auth-gated
4. `POST /api/documents/[id]/preview` (NEW) — reuses existing `preparePdf` in preview mode; writes to versioned `_preview_{timestamp}.pdf`; streams bytes directly; never overwrites final prepared PDF
5. Extended `FieldPlacer.tsx` palette — five new draggable tokens; existing drag/move/resize/persist mechanics unchanged
6. Extended `prepare-document.ts` — type-aware rendering switch for all six field types; existing `client-signature` path unchanged
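One way to picture the type-aware rendering switch in `prepare-document.ts` is an exhaustive `switch` with a `never` check, so that adding a seventh field type later fails at compile time instead of silently falling through. The `PrepareAction` labels and the function name `prepareActionFor` are hypothetical and only describe what each branch would do; the actual drawing calls are omitted.

```typescript
type FieldType =
  | "client-signature"
  | "initials"
  | "text"
  | "checkbox"
  | "date"
  | "agent-signature";

// Hypothetical labels for what the prepare step does with each field type.
type PrepareAction =
  | "stamp-text"
  | "draw-checkbox"
  | "stamp-date"
  | "embed-agent-signature"
  | "leave-for-client";

function prepareActionFor(type: FieldType): PrepareAction {
  switch (type) {
    case "text":
      return "stamp-text";
    case "checkbox":
      return "draw-checkbox";
    case "date":
      return "stamp-date";
    case "agent-signature":
      return "embed-agent-signature";
    case "client-signature":
    case "initials":
      // Completed later on the client signing page, not baked in at prepare time.
      return "leave-for-client";
    default: {
      // Compile-time exhaustiveness check: unreachable if every type is handled.
      const unhandled: never = type;
      throw new Error(`Unhandled field type: ${String(unhandled)}`);
    }
  }
}
```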
### Critical Pitfalls
1. **Breaking the signing page with new field types** — `SigningPageClient.tsx` opens the signature modal for every field in `signatureFields` with no type branching. Adding new field types without updating the signing page in the same deployment breaks active signing sessions. Ship schema + signing page filter as one atomic deployment, before any other v1.1 work.
2. **AI coordinate Y-axis inversion** — AI returns percentages from top-left; `@cantoo/pdf-lib` uses PDF user-space with Y=0 at bottom. Storing AI coordinates without conversion inverts every field position. Write an `aiCoordsToPagePdfSpace()` conversion utility with a unit test asserting known PDF-space x/y values against a real Utah REPC before any OpenAI call is made.
3. **Agent-signature field sent unfiltered to client** — `/src/app/api/sign/[token]/route.ts` line 88 returns `doc.signatureFields ?? []` without type filtering. When `agent-signature` fields are in that array, the client sees them as required unsigned fields. Add type filter before any agent-signed document is sent.
4. **Stale preview after field changes** — preview PDF written to a deterministic path gets cached; agent sends a document based on a stale preview. Use versioned preview paths (`{docId}_preview_{timestamp}.pdf`) and disable Send when fields have changed since last preview generation.
5. **OpenAI token limits on multi-page Utah forms** — Utah standard forms are 10-30 pages; full text extraction fits in ~2,000-8,000 tokens (within gpt-4o-mini's 128k context). Risk: testing only with 2-3 page PDFs in development. Prevention: test AI pipeline with the full Utah REPC (20+ pages) before shipping.
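For the unfiltered-fields pitfall, the guard could look roughly like the sketch below. The helper name `fieldsForClient` is hypothetical; the set of client-visible types (`client-signature` and `initials`) and the `type ?? 'client-signature'` fallback for v1.0 documents are taken from the research above.

```typescript
// Field types the client signing page is expected to handle.
const CLIENT_FIELD_TYPES: ReadonlySet<string> = new Set(["client-signature", "initials"]);

// Keeps only fields the client should see. Fields stored by v1.0 have no
// `type`, so they are treated as client signatures.
function fieldsForClient<T extends { type?: string }>(
  fields: T[] | null | undefined,
): T[] {
  return (fields ?? []).filter((field) =>
    CLIENT_FIELD_TYPES.has(field.type ?? "client-signature"),
  );
}
```

Using an allow-list rather than excluding `agent-signature` means any future field type stays hidden from the client until it is deliberately added.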
## Implications for Roadmap
The architecture research provides an explicit 8-step build order based on hard dependencies. This maps directly to 5 phases.
### Phase 1: Schema Foundation + Signing Page Safety
**Rationale:** The single most dangerous change in v1.1 is adding field types to a schema the client signing page does not handle. Any document with mixed field types sent before the signing page is updated is a HIGH-recovery-cost production incident. Must be first, before any other v1.1 work.
**Delivers:** Extended `DocumentField` discriminated union in `schema.ts` with backward-compatible fallback for v1.0 documents (`type ?? 'client-signature'`); two new nullable DB columns (`agentSignatureData` on users, `propertyAddress` on clients); Drizzle migration; updated `SigningPageClient.tsx` and `POST /api/sign/[token]` with type-based field filtering.
**Addresses:** Foundation for all expanded field types; agent-signature client exposure risk
**Avoids:** Pitfall 1 (signing page crash on new field types), Pitfall 10 (agent-sig field shown to client as required unsigned field)
**Research flag:** None needed — Drizzle discriminated union and nullable column additions are well-documented; two-line ALTER TABLE migration.
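A minimal sketch of the discriminated union with the backward-compatible fallback might look like this. The `BaseField` geometry properties, the per-type extras (`value`, `checked`), and the `normalizeField` helper are assumptions for illustration; only the `type` discriminant and the `'client-signature'` default come from the research.

```typescript
// Assumed shared geometry; the real SignatureFieldData may differ.
interface BaseField {
  id: string;
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

type DocumentField =
  | (BaseField & { type: "client-signature" })
  | (BaseField & { type: "initials" })
  | (BaseField & { type: "date" })
  | (BaseField & { type: "agent-signature" })
  | (BaseField & { type: "text"; value?: string }) // hypothetical extra
  | (BaseField & { type: "checkbox"; checked?: boolean }); // hypothetical extra

// Rows written by v1.0 have no `type` at all.
type StoredField = DocumentField | (BaseField & { type?: undefined });

// Gives every stored field a discriminant, defaulting legacy rows to
// "client-signature" so existing documents keep their v1.0 behavior.
function normalizeField(raw: StoredField): DocumentField {
  return raw.type === undefined ? { ...raw, type: "client-signature" } : raw;
}
```

Normalizing at the read boundary means the rest of the code can switch on `field.type` without repeating the fallback.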
### Phase 2: Agent Saved Signature + Agent Signing Workflow
**Rationale:** Agent signature is a prerequisite for the agent-signs-first workflow, which is a prerequisite for the filled preview (preview only makes sense after agent has signed). Agent signature embed also establishes the PNG embed pattern in `prepare-document.ts` that informs how other field types are handled.
**Delivers:** `GET/PUT /api/agent/signature` routes; `AgentSignaturePanel` component (draw + save + thumbnail); extended `prepare-document.ts` to embed agent-sig PNG at field coordinates; `FieldPlacer` palette token for agent-signature type; supersede-and-resend flow guard preventing re-preparation of sent/viewed documents without user confirmation.
**Uses:** `signature_pad@5.1.3` (existing), `@cantoo/pdf-lib@2.6.3` (existing), `users.agentSignatureData TEXT` column (Phase 1)
**Avoids:** Pitfall 5 (signature storage: a base64 data URL in a TEXT column is appropriate for a 2-8KB PNG), Pitfall 6 (race condition on re-preparation), Pitfall 10 (agent-sig filtered from client fields via Phase 1 foundation)
**Research flag:** None needed — draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are DB column and API route.
### Phase 3: Expanded Field Types End-to-End
**Rationale:** Phase 1 made the schema and signing page safe. Phase 2 established the PNG embed pattern in `prepare-document.ts`. Now extend the field placement UI and prepare pipeline to handle all five new field types. Completing this phase gives the agent a fully functional field system without any AI dependency.
**Delivers:** Five new draggable palette tokens in `FieldPlacer.tsx` (text, checkbox, initials, date, agent-signature); type-aware rendering in `prepare-document.ts` (text stamp, checkbox embed, date auto-stamp, initials placeholder); `propertyAddress` field in `ClientModal` and clients server action; field type coverage from placement through to embedded PDF.
**Addresses:** All P1 table stakes: initials, date, checkbox, text field types
**Avoids:** Pitfall 1 (signing page hardened in Phase 1 before these types can be placed and sent)
**Research flag:** None needed — all APIs are in existing `@cantoo/pdf-lib@2.6.3`.
### Phase 4: Filled Document Preview
**Rationale:** Preview depends on the fully extended `preparePdf` from Phase 3 and agent signing from Phase 2. It is a composition of previous phases — build it after those foundations are solid.
**Delivers:** `POST /api/documents/[id]/preview` route; `PreviewModal` component with in-app react-pdf rendering; versioned preview path with staleness detection; Send button disabled when fields changed since last preview; Back-to-edit flow; prepared PDF hashed at prepare time (extend existing `pdfHash` pattern).
**Uses:** Existing `preparePdf` (reused unchanged), `react-pdf@10.4.1` (existing), ArrayBuffer copy pattern for react-pdf detachment bug
**Avoids:** Pitfall 7 (stale preview), Pitfall 8 (OOM — generate-once, serve-cached pattern), Pitfall 9 (client signs different doc than agent previewed — hash verification)
**Research flag:** Deployment target should be confirmed before implementation — the write-to-local-`uploads/` preview pattern fails on Vercel serverless (ephemeral filesystem). If deployed to Vercel, preview must write to Vercel Blob instead.
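The versioned preview path and the Send gating can be sketched with two small pure functions. The `{docId}_preview_{timestamp}.pdf` naming comes from the research; the `PreviewState` shape and the names `previewFileName` and `canSend` are hypothetical, and timestamps are assumed to be epoch milliseconds.

```typescript
// Builds the versioned preview file name: {docId}_preview_{timestamp}.pdf
function previewFileName(docId: string, generatedAtMs: number): string {
  return `${docId}_preview_${generatedAtMs}.pdf`;
}

// Hypothetical client-side state tracked by the prepare panel.
interface PreviewState {
  fieldsUpdatedAtMs: number; // last time any field was added, moved, or edited
  previewGeneratedAtMs: number | null; // null until a preview has been generated
}

// Send is enabled only when a preview exists and is at least as new as the
// latest field change.
function canSend(state: PreviewState): boolean {
  return (
    state.previewGeneratedAtMs !== null &&
    state.previewGeneratedAtMs >= state.fieldsUpdatedAtMs
  );
}
```

Timestamp comparison is the simplest staleness signal; comparing a hash of the field array instead would also catch edits that happen within the same millisecond, at the cost of a little more bookkeeping.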
### Phase 5: AI Field Placement + Pre-fill
**Rationale:** AI is the highest-complexity feature and depends on field types being fully placeable (Phase 3) and the FieldPlacer accepting `DocumentField[]` from an external source. Building last means the agent can use manual placement throughout earlier phases. AI placement is an enhancement of the field system, not a replacement.
**Delivers:** `lib/ai/extract-text.ts` (pdfjs-dist legacy build, server-only); `lib/ai/field-placement.ts` (GPT-4o-mini structured output, manual JSON schema, `server-only` guard); `POST /api/documents/[id]/ai-prepare` route with coordinate conversion utility + unit test; "AI Auto-place" button in PreparePanel with loading state and agent review step; AI pre-fill of text fields from client profile data.
**Uses:** `openai@^6.32.0` (new install), pdfjs-dist legacy build (existing), gpt-4o-mini (sufficient for structured label extraction; ~15x cheaper than gpt-4o)
**Avoids:** Pitfall 2 (coordinate mismatch — unit-tested conversion utility against known Utah REPC before shipping), Pitfall 3 (token limits — full-form test required), Pitfall 4 (hallucination — Zod validation of AI response before any field is stored; explicit enum for field types in JSON schema)
**Research flag:** Requires integration test with real 20-page Utah REPC before shipping. Also validate that gpt-4o-mini text extraction accuracy on Utah standard forms (which have predictable label patterns) meets the 90%+ threshold claimed in research.
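To make the hallucination guard concrete, here is a dependency-free sketch of validating the AI response before any field is stored. The research calls for Zod validation; a hand-written type guard is swapped in here only so the example runs on its own. The `AiPlacement` shape, the 0-100 percentage range, and the choice to drop invalid entries rather than reject the whole response are all assumptions.

```typescript
const FIELD_TYPES = [
  "text", "checkbox", "initials", "date", "client-signature", "agent-signature",
] as const;
type FieldType = (typeof FIELD_TYPES)[number];

// Hypothetical shape of one AI-suggested field.
interface AiPlacement {
  page: number;
  type: FieldType;
  xPct: number;
  yPct: number;
  widthPct: number;
  heightPct: number;
}

function isFieldType(value: unknown): value is FieldType {
  return typeof value === "string" && (FIELD_TYPES as readonly string[]).includes(value);
}

// True for finite numbers in [0, 100]; NaN fails both comparisons.
function isPercent(value: unknown): value is number {
  return typeof value === "number" && value >= 0 && value <= 100;
}

function isValidPlacement(raw: unknown, pageCount: number): raw is AiPlacement {
  if (typeof raw !== "object" || raw === null) return false;
  const { page, type, xPct, yPct, widthPct, heightPct } = raw as Record<string, unknown>;
  if (!isFieldType(type)) return false;
  if (typeof page !== "number" || !Number.isInteger(page)) return false;
  if (page < 1 || page > pageCount) return false;
  if (!isPercent(xPct) || !isPercent(yPct)) return false;
  if (!isPercent(widthPct) || !isPercent(heightPct)) return false;
  // The box must have area and stay inside the page.
  return widthPct > 0 && heightPct > 0 && xPct + widthPct <= 100 && yPct + heightPct <= 100;
}

// Keeps only placements that pass every check; anything else is discarded.
function validatePlacements(response: unknown, pageCount: number): AiPlacement[] {
  if (typeof response !== "object" || response === null) return [];
  const fields = (response as { fields?: unknown }).fields;
  if (!Array.isArray(fields)) return [];
  return fields.filter((f): f is AiPlacement => isValidPlacement(f, pageCount));
}
```

Checking `page` against the real page count matters as much as the enum: a field placed on a page that does not exist would otherwise fail later inside the prepare step.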
### Phase Ordering Rationale
- Phase 1 is a safety gate — deploy it before any document with new field types can be created or sent
- Phase 2 before Phase 3 because `prepare-document.ts` needs the agent-sig embed pattern established before adding the full type-aware rendering switch
- Phase 3 before Phase 4 because preview calls `preparePdf` — incomplete field type handling in prepare means an incomplete preview
- Phase 5 last because it enhances a complete field system; agents can use manual placement throughout all earlier phases; no blocking dependency
- The agent-signature field filtering (Pitfall 10) is addressed in Phase 1, not Phase 2 — this is deliberate; the signing route must be hardened before the first agent-sig field can be placed and sent
### Research Flags
**Needs deeper research during planning:**
- **Phase 5 (AI):** The coordinate conversion from percentage to PDF user-space points needs a concrete unit test against a known Utah REPC before implementation. Validate pdfjs-dist legacy build text extraction works correctly in the project's actual Node 20 / Next.js 16.2 environment.
- **Phase 4 (Preview):** Deployment target (Vercel serverless vs. self-hosted container) determines whether preview files can use the local `uploads/` filesystem or must use Vercel Blob. Confirm before writing the preview route.
**Standard patterns (skip research-phase):**
- **Phase 1 (Schema):** Drizzle discriminated union extension and nullable column additions are well-documented; two-line ALTER TABLE migration.
- **Phase 2 (Agent Signature):** The draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are a DB column and API route.
- **Phase 3 (Field Types):** All field type APIs are in existing `@cantoo/pdf-lib@2.6.3`; no new library research needed.
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | All versions verified via npm registry; OpenAI Zod v4 incompatibility confirmed via open GitHub issues #1540, #1602, #1709; pdfjs-dist server-side usage confirmed via actual codebase inspection |
| Features | HIGH for field types and signing flows; MEDIUM for AI field detection accuracy | Field behavior confirmed against DocuSign, dotloop, SkySlope docs; vision-based coordinate inference shown unreliable by Feb 2025 benchmarks (accurate bounding boxes in < 3% of attempts); actual accuracy of the text-extraction approach on Utah forms is untested |
| Architecture | HIGH | Based on actual v1.0 codebase review (not speculative); specific file names, function names, and line numbers cited throughout; build order confirmed by dependency analysis |
| Pitfalls | HIGH | All pitfalls grounded in actual codebase inspection; specific file paths and line numbers identified (e.g., sign route line 88); no speculative claims |
**Overall confidence:** HIGH
### Gaps to Address
- **AI coordinate accuracy on real Utah forms:** Research confirms the text-extraction + label-matching approach is correct, but accuracy on actual Utah REPC and listing agreement forms is untested. Phase 5 must include an integration test with real forms before the feature ships.
- **Preview file lifecycle in production:** The `_preview_{timestamp}.pdf` pattern creates unbounded file growth in `uploads/`. A cleanup strategy (delete previews older than 24 hours, or delete on document send) needs to be decided before Phase 4 implementation.
- **Deployment target for preview writes:** The write-to-disk preview pattern silently fails on Vercel serverless (ephemeral filesystem). Confirm whether the app runs on Vercel serverless or a persistent container before implementing Phase 4.
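For the preview file lifecycle gap, one possible cleanup rule (delete previews older than 24 hours) can be expressed as a pure function over file names. This is only a sketch of the selection logic: the helper name `stalePreviewFiles` is hypothetical, it assumes the timestamp in `{docId}_preview_{timestamp}.pdf` is epoch milliseconds, and the actual deletion (local disk or blob storage) is left out because the deployment target is still undecided.

```typescript
const PREVIEW_NAME = /^(.+)_preview_(\d+)\.pdf$/;
const DEFAULT_MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours

// Returns the preview file names whose embedded timestamp is older than
// maxAgeMs. Files that do not match the preview naming pattern are ignored,
// so final prepared PDFs are never selected.
function stalePreviewFiles(
  fileNames: string[],
  nowMs: number,
  maxAgeMs: number = DEFAULT_MAX_AGE_MS,
): string[] {
  return fileNames.filter((name) => {
    const match = PREVIEW_NAME.exec(name);
    if (match === null) return false;
    const generatedAtMs = Number(match[2]);
    return nowMs - generatedAtMs > maxAgeMs;
  });
}
```

Keeping the selection separate from the deletion makes the rule easy to unit test and lets the same function serve either storage backend.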
## Sources
### Primary (HIGH confidence)
- `src/lib/db/schema.ts` (actual codebase, inspected 2026-03-21) — `SignatureFieldData` has no `type` field confirmed
- `src/app/api/sign/[token]/route.ts` line 88 (actual codebase) — unfiltered `signatureFields` sent to client confirmed
- `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` (actual codebase) — single "Signature" token; `screenToPdfCoords` Y-inversion pattern confirmed
- [openai npm](https://www.npmjs.com/package/openai) — v6.32.0 confirmed, Node 20 requirement
- [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — manual json_schema format confirmed
- [openai-node Issue #1540](https://github.com/openai/openai-node/issues/1540) — zodResponseFormat broken with Zod v4
- [openai-node Issue #1602](https://github.com/openai/openai-node/issues/1602) — zodTextFormat broken with Zod v4
- [openai-node Issue #1709](https://github.com/openai/openai-node/issues/1709) — Zod 4.1.13+ discriminated union break
- [@cantoo/pdf-lib npm](https://www.npmjs.com/package/@cantoo/pdf-lib) — v2.6.3; createTextField, createCheckBox, drawImage APIs confirmed
- [react-pdf ArrayBuffer detach issue #1657](https://github.com/wojtekmaj/react-pdf/issues/1657) — ArrayBuffer copy workaround confirmed
- [Vercel Serverless Function Limits](https://vercel.com/docs/functions/runtimes/node-js#memory-and-compute) — 256MB default memory, 60s max execution on Pro
- [Utah Division of Real Estate — State Approved Forms](https://realestate.utah.gov/real-estate/forms/state-approved/) — REPC form structure context
### Secondary (MEDIUM confidence)
- [Edge AI and Vision Alliance — SAM 2 + GPT-4o (Feb 2025)](https://www.edge-ai-vision.com/2025/02/sam-2-gpt-4o-cascading-foundation-models-via-visual-prompting-part-2/) — GPT-4o returns accurate bounding box coordinates in < 3% of attempts
- [Instafill.ai — Real estate law flat PDF form automation (Feb 2026)](https://blog.instafill.ai/2026/02/18/case-study-real-estate-law-flat-pdf-form-automation/) — hybrid text-extraction + LLM approach confirmed as production pattern
- [DocuSign community — routing order for real estate](https://community.docusign.com/esignature-111/prefill-fields-before-sending-envelope-for-signature-180) — agent order 1, client order 2 confirmed
- [Dotloop support — date auto-stamp behavior](https://support.dotloop.com/hc/en-us/articles/217936457-Adding-Signatures-or-Initials-to-Locked-Templates) — date field auto-stamp pattern confirmed
- [DocuSign community — Date Signed field](https://community.docusign.com/esignature-111/am-i-able-to-auto-populate-the-date-field-2271) — read-only auto-populated date confirmed
---
*Research completed: 2026-03-21*
*Ready for roadmap: yes*