| AI-01 | Agent can click one button to have AI auto-place all field types (text, checkbox, initials, date, agent signature, client signature) on a PDF in correct positions | extractBlanks() extracts blanks with exact PDF user-space coords from pdfjs text layer; classifyFieldsWithAI() sends blank descriptions to GPT-4.1 for type classification; no coordinate conversion needed (pdfjs coords stored directly) |
| AI-02 | AI pre-fills text fields with known values from the client profile (name, property address, date) | classifyFieldsWithAI() returns textFillData keyed by field UUID; merged into DocumentPageClient.textFillData state |
Phase 13 is implemented through Plans 01–03 and is awaiting final E2E verification in Plan 04. Three plans are complete. The architecture evolved significantly from the original research: the system now uses **direct PDF text-layer extraction** (pdfjs-dist `getTextContent()` with transform matrix coordinates) rather than GPT-4o vision. This eliminates the coordinate conversion bug entirely — pdfjs returns coordinates already in PDF user-space (bottom-left origin, points), which is exactly what FieldPlacer stores and renders.
The original vision-based approach (render pages as JPEG → GPT-4o → xPct/yPct → aiCoordsToPagePdfSpace conversion) was attempted and abandoned due to a systematic 30-40% vertical offset that could not be resolved. The text-extraction approach (extract blank positions from pdfjs text layer → GPT-4.1 for type classification only → store raw coordinates) is now working code in the repository.
**Plan 04 is the only remaining task**: run unit tests, fix the TypeScript build, and perform human E2E verification. The `ai-coords.test.ts` file was deleted when the vision approach was abandoned and must NOT be recreated for the new architecture (it tested a function that no longer exists). Plan 04 Task 1 must be updated to reflect the current test infrastructure.
**Primary recommendation:** Plan 04 should confirm TypeScript compiles clean, run `prepare-document.test.ts` (10 tests passing), then proceed directly to human E2E verification. The coordinate bug is resolved by architecture — no code fix required for coordinates.
---
## Current Implementation State
### What Is Already Built (Plans 01–03 Complete)
The following files are implemented and committed:
| File | Status | Purpose |
|------|--------|---------|
| `src/lib/ai/extract-text.ts` | Complete (uncommitted changes) | pdfjs-dist blank extraction via text layer |
| `src/lib/ai/field-placement.ts` | Complete (uncommitted changes) | GPT-4.1 field type classification |
| `src/app/api/documents/[id]/ai-prepare/route.ts` | Complete (uncommitted changes) | POST route orchestrating the pipeline |
| `src/lib/pdf/__tests__/ai-coords.test.ts` | **Deleted** | Was for vision approach; no longer needed |
**The coordinate bug from the vision approach is NOT present in the current architecture.**
The text-extraction approach works because:
- pdfjs `item.transform[4]` = x position in PDF user-space (points, left from page left edge)
- pdfjs `item.transform[5]` = y position in PDF user-space (points, up from page bottom)
- These are stored directly as `field.x` and `field.y` in SignatureFieldData
- FieldPlacer renders stored fields using `pdfToScreenCoords(field.x, field.y, renderedW, renderedH, pageInfo)` which uses `pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf mediaBox dimensions)
- Since both extraction (pdfjs transform matrix) and rendering (react-pdf mediaBox) read from the same PDF mediaBox, they are inherently consistent
**No `aiCoordsToPagePdfSpace` conversion function is needed or present in the current code.**
The `aiCoordsToPagePdfSpace` function and its test (`ai-coords.test.ts`) were created for the vision approach and then deleted when the approach changed. Do not recreate them.
| pdfjs-dist | 5.4.296 (hoisted from react-pdf) | PDF text-layer extraction via `getTextContent()` | Already installed; legacy build works in Node.js with file:// workerSrc |
| openai | ^6.32.0 (installed) | GPT-4.1 structured output for field type classification | Official SDK; installed; manual json_schema required (Zod v4 incompatibility) |
| @napi-rs/canvas | ^0.1.97 (installed, no longer used) | Was for server-side JPEG rendering | Kept in package.json but no longer imported; `serverExternalPackages` still lists it in next.config.ts — leave that entry to avoid breaking changes |
| crypto.randomUUID() | Node built-in | UUID generation for field IDs | Used in classifyFieldsWithAI to assign IDs before textFillData keying |
**What:** Use pdfjs `getTextContent()` to get all text items. Each item has a `transform` matrix where `transform[4]` = x and `transform[5]` = y in PDF user-space (bottom-left origin, points). Find underscore sequences (Strategy 1: pure runs, Strategy 2: embedded runs) and bracket patterns (Strategy 3: single-item `[ ]`, Strategy 4: multi-item `[ … ]`). Group items into lines with ±5pt y-tolerance.
**Why text-layer over vision:** Coordinates are exact (sub-point accuracy). No DPI scaling math, no image rendering, no API image tokens, no coordinate conversion needed.
**pdfjs-dist 5.x workerSrc (critical):** Must use `file://` URL pointing to the worker .mjs file — empty string is falsy and causes PDFWorker to throw before the fake-worker import runs.
**What:** Send a compact text description of each detected blank to GPT-4.1. The description includes blank index, page number, row position metadata (`row=N/T`), and context strings (contextBefore, contextAfter, contextAbove, contextBelow). AI returns only the field type and prefill value — coordinates come from pdfjs directly.
**Schema:** Manual `json_schema` with `strict: true`, all properties in `required`, `additionalProperties: false` at every nesting level. Do NOT use `zodResponseFormat` (broken with Zod v4).
**Note:** The `checkbox` type is in the enum so the AI can classify inline checkboxes — but fields with `fieldType === 'checkbox'` are filtered out (not added to the fields array). They represent selection options, not placed fields.
**The -2pt y offset:** Underscore characters descend slightly below the text baseline. Moving the field bottom edge 2pt below the baseline positions the field box to sit ON the underline visually.
**Why this works without conversion:** pdfjs `transform[5]` is the text baseline Y in the same coordinate space (PDF user-space, bottom-left origin) that FieldPlacer's `pdfToScreenCoords` expects for rendering. `pageInfo.originalHeight` in FieldPlacer comes from `page.view[3]` (react-pdf mediaBox), which is the same value as pdfjs `getViewport({scale:1}).height` for standard PDF pages.
| A | `contextBefore` last word is "date" | → `date` |
| B | `contextBefore` last word is "initials" | → `initials` |
| C | `rowTotal > 1` AND `rowIndex > 1` AND AI classified as signature | → `date` (if last+contextBelow has "(Date)") or `text` |
| D | `rowTotal=2`, `rowIndex=1`, AI classified as signature, contextBelow has "(Address/Phone)"+"(Date)" but NOT "(Seller"/"(Buyer" | → `text` |
**Why needed:** The footer pattern `Seller's Initials [ ] Date ___` and signature block rows `[sig] [address/phone] [date]` are structurally deterministic but the AI sometimes misclassifies them.
### Pattern 5: FieldPlacer Coordinate Rendering Formula
**What:** How stored PDF coordinates are converted back to screen pixels for rendering.
- **DO NOT** recreate `aiCoordsToPagePdfSpace` — that function was for the abandoned vision approach and does not belong in the current architecture
- **DO NOT** recreate `ai-coords.test.ts` — it tested a function that no longer exists; Plan 04 Task 1 must reference `prepare-document.test.ts` instead
- **DO NOT** import from `'pdfjs-dist/legacy/build/pdf.mjs'` with `GlobalWorkerOptions.workerSrc = ''` (empty string) — this is the pdfjs v3+ breaking change; always use the `file://` URL path
- **DO NOT** use `zodResponseFormat` — broken with Zod v4.3.6 (confirmed GitHub issues #1540, #1602, #1709)
- **DO NOT** add `@napi-rs/canvas` imports back to extract-text.ts — the vision approach was abandoned; the `serverExternalPackages` entry in next.config.ts can stay but the import is gone
- **DO NOT** use vision/image-based approach — the text-extraction approach is simpler, cheaper, and coordinate-accurate
- **DO NOT** lock fields after AI placement — agent must be able to edit, move, resize, delete
| PDF text extraction | Custom PDF parser | pdfjs-dist `getTextContent()` | Already installed; handles encoding, multi-page, returns transform matrix with exact coordinates |
| Field type classification | Rule-based regex | GPT-4.1 with manual json_schema | Context analysis (above/below/before/after) is complex; AI handles natural language labels |
| UUID generation for field IDs | Custom ID generator | `crypto.randomUUID()` | Node built-in; same pattern used everywhere in the project |
| Structured AI output | Parse JSON manually | OpenAI `json_schema` response_format with `strict: true` | 100% schema compliance guaranteed; no parse failures |
### Pitfall 1: Plan 04 Task 1 References Deleted Test File
**What goes wrong:** Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts`. That file was deleted when the vision approach was abandoned. Running this command fails immediately.
**Why it happens:** The plan was written for the vision approach. The architecture pivoted after the plan was written.
**How to avoid:** Plan 04 Task 1 must run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts` instead. The `prepare-document.test.ts` file has 10 passing tests covering the Y-flip coordinate formula — these remain valid and relevant.
**Why it happens:** pdfjs-dist 5.x changed fake-worker mode. Empty string is falsy; the PDFWorker getter throws before attempting the dynamic import. The workerSrc must be a valid URL the Node.js dynamic importer can resolve.
**How to avoid:** Use `file://${join(process.cwd(), 'node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs')}`. This is already correct in the current `extract-text.ts`.
**What goes wrong:** A JSON schema with `strict: true` that omits `required` or `additionalProperties: false` on a nested object causes a 400 API error.
**Why it happens:** OpenAI strict mode requires `required` listing ALL properties AND `additionalProperties: false` at EVERY object level — including `items` inside arrays.
**What goes wrong:** GPT-4.1 returns `fieldType: "checkbox"` for inline selection checkboxes (e.g., `[ ] ARE [ ] ARE NOT`). If these are added to `SignatureFieldData[]`, they appear as placed fields that cannot be properly handled by the signing flow.
**Why it happens:** Phase 12.1 wired textFillData to use field UUID as key (STATE.md confirmed locked decision). Label-keyed maps are silently ignored.
**How to avoid:** The route handler creates UUIDs (`crypto.randomUUID()`) BEFORE building textFillData, then uses those same UUIDs as keys. This is already correct in the current code.
**What goes wrong:** A field placed exactly at `blank.y` (text baseline) appears above the underline, not on it. Underscores descend slightly below the baseline.
**How to avoid:** The current code uses `const y = Math.max(0, blank.y - 2)`. The -2pt offset anchors the field bottom edge slightly below the baseline, sitting on the underline visually.
**What goes wrong:** Two `console.log` statements remain in `classifyFieldsWithAI`: one printing the blank count, one printing ALL blank descriptions (can be very verbose for large forms), and one printing all AI classifications.
**Why it happens:** Debugging statements added during development were not removed.
**How to avoid:** Plan 04 should include removing these console.log statements before final sign-off. They do not affect correctness but should not ship in production code.
| GPT-4o vision (render pages as JPEG) | pdfjs text-layer extraction | After Plan 01, during debugging | Eliminates coordinate conversion math and the 30-40% vertical offset bug |
| `aiCoordsToPagePdfSpace()` + `ai-coords.test.ts` | Direct pdfjs coordinates — no conversion | After Plan 01 pivot | Deleted function and test; `prepare-document.test.ts` is the only relevant test |
| GPT-4o (vision model) | GPT-4.1 (text model) | After vision approach abandoned | Cheaper, faster, sufficient for text classification task |
| `GlobalWorkerOptions.workerSrc = ''` | `file://` URL path | pdfjs-dist 5.x requirement | Required for fake-worker mode; empty string is falsy in 5.x |
| `@napi-rs/canvas` for image rendering | Not used (deleted from imports) | After vision approach abandoned | Package still installed but not imported; `serverExternalPackages` entry can remain |
**Deprecated/outdated items that should NOT appear in new code:**
-`import { createCanvas } from '@napi-rs/canvas'` in extract-text.ts — deleted, do not restore
-`aiCoordsToPagePdfSpace()` function — deleted, do not recreate
-`ai-coords.test.ts` — deleted, do not recreate
-`zodResponseFormat` — broken with Zod v4, do not use
-`GlobalWorkerOptions.workerSrc = ''` (empty string) — wrong for pdfjs 5.x, do not use
1.**Debug console.log statements in classifyFieldsWithAI**
- What we know: Three `console.log` calls remain in `field-placement.ts` (blank count, all descriptions, all classifications). These print verbose output to server logs on every AI auto-place request.
- What's unclear: Whether to remove before Plan 04 verification or after.
- Recommendation: Remove before E2E verification so server logs are clean for manual testing. Add to Plan 04 Task 1.
- What we know: `field-placement.ts` uses `model: 'gpt-4.1'`. This is the current model. The project's OPENAI_API_KEY must have access to this model.
- What's unclear: Whether the developer's API key has gpt-4.1 access (it's not universally available to all OpenAI accounts as of 2026).
- Recommendation: If the API call returns a 404 model error, fall back to `gpt-4o` (which was used in an earlier iteration and is broadly available). The prompt and schema work identically for both models.
- What we know: The text-extraction approach extracts blanks accurately based on the 4 detection strategies. Accuracy depends on the quality of the PDF text layer.
- What's unclear: How many blanks are missed or double-detected on the full 20-page Utah REPC. The system prompt is tuned for Utah real estate signature patterns.
- Recommendation: Plan 04 human verification on a real Utah REPC is non-negotiable. Expect 80-95% accuracy — imperfect placement is acceptable as the agent reviews before sending.
- What we know: Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts` — that file is deleted.
- Recommendation: Plan 04 must be updated to run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` and `npx tsc --noEmit` instead.
---
## Environment Availability
| Dependency | Required By | Available | Version | Fallback |
- **Phase gate:** Full suite green + TypeScript clean + human E2E approval before phase complete
### Wave 0 Gaps
None — existing test infrastructure covers coordinate math. No new test files required for Plan 04. The deleted `ai-coords.test.ts` is intentionally not recreated (it tested a deleted function).
-`/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/extract-text.ts` — current implementation, text-layer extraction approach
-`/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/field-placement.ts` — current implementation, GPT-4.1 classification
-`/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/api/documents/[id]/ai-prepare/route.ts` — current route implementation
-`/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` — `pdfToScreenCoords` function (lines 44-54) and drag-end coordinate formula (lines 291-292)
-`/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/PdfViewer.tsx` — `pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf)
-`.planning/phases/13-ai-field-placement-and-pre-fill/.continue-here.md` — complete record of what was attempted, what failed, and current state
-`git log --oneline` — confirmed commit history showing vision→text pivot
- pdfjs-dist transform matrix format — verified by reading pdfjs docs + confirmed by the implementation's usage of `transform[4]`/`transform[5]` for x/y
- react-pdf `page.view` array format — confirmed from PdfViewer.tsx `page.view[0], page.view[2]` / `page.view[1], page.view[3]` usage with comment `// Math.max handles non-standard mediaBox ordering`
- Standard stack: HIGH — all packages confirmed installed and version-verified
- Architecture: HIGH — current code read directly, coordinate system verified analytically and confirmed by test results
- Pitfalls: HIGH — all pitfalls documented from actual bugs encountered during development (per .continue-here.md) or confirmed from code inspection
- Plan 04 gap (Task 1 test command): HIGH — confirmed by checking git status (ai-coords.test.ts deleted), running jest (prepare-document.test.ts passes)