# Phase 13: AI Field Placement and Pre-fill - Research
**Researched:** 2026-04-03 (re-research — complete rewrite)
**Domain:** pdfjs-dist text-layer blank extraction + GPT-4.1 structured output field classification + coordinate system
**Confidence:** HIGH
<phase_requirements>
## Phase Requirements
| ID | Description | Research Support |
|----|-------------|-----------------|
| AI-01 | Agent can click one button to have AI auto-place all field types (text, checkbox, initials, date, agent signature, client signature) on a PDF in correct positions | extractBlanks() extracts blanks with exact PDF user-space coords from pdfjs text layer; classifyFieldsWithAI() sends blank descriptions to GPT-4.1 for type classification; no coordinate conversion needed (pdfjs coords stored directly) |
| AI-02 | AI pre-fills text fields with known values from the client profile (name, property address, date) | classifyFieldsWithAI() returns textFillData keyed by field UUID; merged into DocumentPageClient.textFillData state |
</phase_requirements>
---
## Summary
Phase 13 is implemented through Plans 01-03 and awaits final E2E verification in Plan 04. The architecture evolved significantly from the original research: the system now uses **direct PDF text-layer extraction** (pdfjs-dist `getTextContent()` with transform matrix coordinates) rather than GPT-4o vision. This eliminates the coordinate conversion bug entirely: pdfjs returns coordinates already in PDF user-space (bottom-left origin, points), which is exactly what FieldPlacer stores and renders.
The original vision-based approach (render pages as JPEG → GPT-4o → xPct/yPct → aiCoordsToPagePdfSpace conversion) was attempted and abandoned due to a systematic 30-40% vertical offset that could not be resolved. The text-extraction approach (extract blank positions from pdfjs text layer → GPT-4.1 for type classification only → store raw coordinates) is now working code in the repository.
**Plan 04 is the only remaining task**: run unit tests, fix the TypeScript build, and perform human E2E verification. The `ai-coords.test.ts` file was deleted when the vision approach was abandoned and must NOT be recreated for the new architecture (it tested a function that no longer exists). Plan 04 Task 1 must be updated to reflect the current test infrastructure.
**Primary recommendation:** Plan 04 should confirm TypeScript compiles clean, run `prepare-document.test.ts` (10 tests passing), then proceed directly to human E2E verification. The coordinate bug is resolved by architecture — no code fix required for coordinates.
---
## Current Implementation State
### What Is Already Built (Plans 01-03 Complete)
The following files are implemented; the Status column notes which ones still carry uncommitted changes:
| File | Status | Purpose |
|------|--------|---------|
| `src/lib/ai/extract-text.ts` | Complete (uncommitted changes) | pdfjs-dist blank extraction via text layer |
| `src/lib/ai/field-placement.ts` | Complete (uncommitted changes) | GPT-4.1 field type classification |
| `src/app/api/documents/[id]/ai-prepare/route.ts` | Complete (uncommitted changes) | POST route orchestrating the pipeline |
| `src/lib/pdf/__tests__/ai-coords.test.ts` | **Deleted** | Was for vision approach; no longer needed |
| `src/lib/pdf/__tests__/prepare-document.test.ts` | Complete | 10 tests passing — Y-flip formula |
### Architecture (Current — Text Extraction Based)
```
PDF file
extractBlanks() [extract-text.ts]
→ pdfjs getTextContent() → transform[4]=x, transform[5]=y (PDF user-space, bottom-left origin)
→ 4 detection strategies (pure underscore runs, embedded underscore runs, single-item brackets, multi-item brackets)
→ groupIntoLines() with ±5pt y-tolerance
→ deduplication for Strategy 3+4 overlap
→ returns BlankField[] with {page, x, y, width, contextBefore, contextAfter, contextAbove, contextBelow, rowIndex, rowTotal}
classifyFieldsWithAI() [field-placement.ts]
→ GPT-4.1 receives compact text descriptions (index, page, row=N/T, context strings)
→ returns {index, fieldType, prefillValue} per blank
→ deterministic post-processing (Rules A/B/C/D) overrides AI errors
→ FIELD_HEIGHTS map (type → height in pts)
→ SIZE_LIMITS map (type → {minW, maxW} in pts)
→ stores: { id: UUID, page, x: blank.x, y: blank.y-2, width: clamped, height: by-type }
→ returns { fields: SignatureFieldData[], textFillData: Record<UUID, string> }
POST /api/documents/[id]/ai-prepare [route.ts]
→ writes fields to DB (signatureFields column)
→ returns { fields, textFillData }
DocumentPageClient (React)
→ setAiPlacementKey(k+1) → FieldPlacer re-fetches from DB
→ setTextFillData(prev => { ...prev, ...aiTextFill }) — merge, not replace
```
### Coordinate System — Fully Resolved
**The coordinate bug from the vision approach is NOT present in the current architecture.**
The text-extraction approach works because:
- pdfjs `item.transform[4]` = x position in PDF user-space (points, measured rightward from the page's left edge)
- pdfjs `item.transform[5]` = y position in PDF user-space (points, up from page bottom)
- These are stored directly as `field.x` and `field.y` in SignatureFieldData
- FieldPlacer renders stored fields using `pdfToScreenCoords(field.x, field.y, renderedW, renderedH, pageInfo)` which uses `pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf mediaBox dimensions)
- Since both extraction (pdfjs transform matrix) and rendering (react-pdf mediaBox) read from the same PDF mediaBox, they are inherently consistent
**No `aiCoordsToPagePdfSpace` conversion function is needed or present in the current code.**
The `aiCoordsToPagePdfSpace` function and its test (`ai-coords.test.ts`) were created for the vision approach and then deleted when the approach changed. Do not recreate them.
---
## Standard Stack
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| pdfjs-dist | 5.4.296 (hoisted from react-pdf) | PDF text-layer extraction via `getTextContent()` | Already installed; legacy build works in Node.js with file:// workerSrc |
| openai | ^6.32.0 (installed) | GPT-4.1 structured output for field type classification | Official SDK; installed; manual json_schema required (Zod v4 incompatibility) |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| @napi-rs/canvas | ^0.1.97 (installed, no longer used) | Was for server-side JPEG rendering | Kept in package.json but no longer imported; `serverExternalPackages` still lists it in next.config.ts — leave that entry to avoid breaking changes |
| crypto.randomUUID() | Node built-in | UUID generation for field IDs | Used in classifyFieldsWithAI to assign IDs before textFillData keying |
### Verified Package Versions
```bash
# pdfjs-dist version confirmed:
cat node_modules/pdfjs-dist/package.json | grep '"version"'
# → "version": "5.4.296"
# openai version confirmed:
cat node_modules/openai/package.json | grep '"version"'
```
**No installation needed** — all required packages are already in node_modules.
---
## Architecture Patterns
### Recommended Project Structure (Current)
```
src/
├── lib/
│ ├── ai/
│ │ ├── extract-text.ts # pdfjs-dist blank extraction (4 strategies)
│ │ └── field-placement.ts # GPT-4.1 type classification + post-processing
│ └── pdf/
│ └── __tests__/
│ └── prepare-document.test.ts # 10 tests — Y-flip formula (passing)
└── app/
└── api/
└── documents/
└── [id]/
└── ai-prepare/
└── route.ts # POST handler
```
### Pattern 1: pdfjs-dist Blank Extraction (Text Layer)
**What:** Use pdfjs `getTextContent()` to get all text items. Each item has a `transform` matrix where `transform[4]` = x and `transform[5]` = y in PDF user-space (bottom-left origin, points). Find underscore sequences (Strategy 1: pure runs, Strategy 2: embedded runs) and bracket patterns (Strategy 3: single-item `[ ]`, Strategy 4: multi-item `[ … ]`). Group items into lines with ±5pt y-tolerance.
**Why text-layer over vision:** Coordinates are exact (sub-point accuracy). No DPI scaling math, no image rendering, no API image tokens, no coordinate conversion needed.
**pdfjs-dist 5.x workerSrc (critical):** Must use `file://` URL pointing to the worker .mjs file — empty string is falsy and causes PDFWorker to throw before the fake-worker import runs.
```typescript
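// Source: extract-text.ts (confirmed working)
import { join } from 'node:path';
import { GlobalWorkerOptions } from 'pdfjs-dist/legacy/build/pdf.mjs';
// pdfjs-dist 5.x needs a resolvable file:// URL here; an empty string throws.
GlobalWorkerOptions.workerSrc = `file://${join(process.cwd(), 'node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs')}`;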
```
**Text item transform matrix:** `[scaleX, skewY, skewX, scaleY, translateX, translateY]`
- `transform[4]` = x left edge of item (PDF points from page left)
- `transform[5]` = y baseline of item (PDF points from page bottom)
- `transform[0]` = font size (approximately) when no rotation
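The following is a minimal sketch of how underscore runs inside a single text item could be turned into blank candidates using `transform[4]` and `transform[5]`. It is not the code in `extract-text.ts`: the names `TextItemLike`, `BlankCandidate`, and `findUnderscoreBlanks`, the three-underscore threshold, and the assumption that every character in the item has the same width are all illustrative.

```typescript
// Minimal shape of a pdfjs text item; only the fields this sketch reads.
interface TextItemLike {
  str: string;
  transform: number[]; // [scaleX, skewY, skewX, scaleY, translateX, translateY]
  width: number; // item width in PDF points
}

interface BlankCandidate {
  x: number; // left edge, PDF points from page left
  y: number; // text baseline, PDF points from page bottom
  width: number; // estimated width of the underscore run, PDF points
}

// ASSUMPTION: 3+ consecutive underscores count as a blank, and characters
// are treated as equally wide, which is only an approximation.
function findUnderscoreBlanks(item: TextItemLike): BlankCandidate[] {
  const itemX = item.transform[4] ?? 0;
  const baselineY = item.transform[5] ?? 0;
  const charWidth = item.str.length > 0 ? item.width / item.str.length : 0;
  const blanks: BlankCandidate[] = [];
  const run = /_{3,}/g; // fresh regex per call so lastIndex starts at 0
  let match: RegExpExecArray | null;
  while ((match = run.exec(item.str)) !== null) {
    blanks.push({
      x: itemX + match.index * charWidth,
      y: baselineY,
      width: match[0].length * charWidth,
    });
  }
  return blanks;
}

// "Name: " is 6 characters, followed by 10 underscores; 160pt / 16 chars = 10pt each.
const demoBlanks = findUnderscoreBlanks({
  str: 'Name: __________',
  transform: [10, 0, 0, 10, 72, 700],
  width: 160,
});
```

A pure underscore item (Strategy 1) is the special case where the run starts at index 0, so `x` equals `transform[4]` unchanged.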
### Pattern 2: GPT-4.1 Type Classification (Text-Only, No Vision)
**What:** Send a compact text description of each detected blank to GPT-4.1. The description includes blank index, page number, row position metadata (`row=N/T`), and context strings (contextBefore, contextAfter, contextAbove, contextBelow). AI returns only the field type and prefill value — coordinates come from pdfjs directly.
**Schema:** Manual `json_schema` with `strict: true`, all properties in `required`, `additionalProperties: false` at every nesting level. Do NOT use `zodResponseFormat` (broken with Zod v4).
```typescript
// Source: field-placement.ts (confirmed working)
const CLASSIFICATION_SCHEMA = {
type: 'object',
properties: {
fields: {
type: 'array',
items: {
type: 'object',
properties: {
index: { type: 'integer' },
fieldType: { type: 'string', enum: ['text', 'initials', 'date', 'client-signature', 'agent-signature', 'agent-initials', 'checkbox'] },
prefillValue: { type: 'string' },
},
required: ['index', 'fieldType', 'prefillValue'],
additionalProperties: false,
},
},
},
required: ['fields'],
additionalProperties: false,
} as const;
```
**Note:** The `checkbox` type is in the enum so the AI can classify inline checkboxes — but fields with `fieldType === 'checkbox'` are filtered out (not added to the fields array). They represent selection options, not placed fields.
### Pattern 3: Coordinate Handling — No Conversion Needed
**What:** Blank coordinates from pdfjs text layer are already in PDF user-space. Store them directly.
```typescript
// Source: field-placement.ts (confirmed working)
const y = Math.max(0, blank.y - 2); // -2pt: anchor just below baseline (underscores descend slightly)
fields.push({ id, page: blank.page, x: blank.x, y, width, height, type: fieldType });
```
**The -2pt y offset:** Underscore characters descend slightly below the text baseline. Moving the field bottom edge 2pt below the baseline positions the field box to sit ON the underline visually.
**Why this works without conversion:** pdfjs `transform[5]` is the text baseline Y in the same coordinate space (PDF user-space, bottom-left origin) that FieldPlacer's `pdfToScreenCoords` expects for rendering. `pageInfo.originalHeight` in FieldPlacer comes from `page.view[3]` (react-pdf mediaBox), which is the same value as pdfjs `getViewport({scale:1}).height` for standard PDF pages.
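As a rough illustration of the storage step, the sketch below combines the -2pt baseline offset with the width clamp and per-type height described in the architecture diagram. The numeric values in `FIELD_HEIGHTS` and `SIZE_LIMITS` are placeholders chosen for the example, not the values in `field-placement.ts`, and `toStoredRect` is a hypothetical helper name.

```typescript
type PlacedFieldType =
  | 'text' | 'initials' | 'date'
  | 'client-signature' | 'agent-signature' | 'agent-initials';

interface Blank { page: number; x: number; y: number; width: number; }

// ASSUMPTION: illustrative heights in PDF points, not the project's real map.
const FIELD_HEIGHTS: Record<PlacedFieldType, number> = {
  text: 14, date: 14, initials: 16, 'agent-initials': 16,
  'client-signature': 24, 'agent-signature': 24,
};

// ASSUMPTION: illustrative width limits in PDF points.
const SIZE_LIMITS: Record<PlacedFieldType, { minW: number; maxW: number }> = {
  text: { minW: 40, maxW: 400 }, date: { minW: 50, maxW: 120 },
  initials: { minW: 30, maxW: 60 }, 'agent-initials': { minW: 30, maxW: 60 },
  'client-signature': { minW: 100, maxW: 250 }, 'agent-signature': { minW: 100, maxW: 250 },
};

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// pdfjs coordinates are stored as-is; only y is nudged 2pt below the
// baseline (never below 0) and the width is clamped per field type.
function toStoredRect(blank: Blank, type: PlacedFieldType) {
  const { minW, maxW } = SIZE_LIMITS[type];
  return {
    page: blank.page,
    x: blank.x,
    y: Math.max(0, blank.y - 2),
    width: clamp(blank.width, minW, maxW),
    height: FIELD_HEIGHTS[type],
  };
}

const storedRect = toStoredRect({ page: 1, x: 72, y: 1, width: 500 }, 'initials');
```

With these placeholder limits, the 500pt-wide blank is clamped to 60pt and the y value bottoms out at 0 rather than going negative.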
### Pattern 4: Deterministic Post-Processing Rules
**What:** Four rules that override AI classifications for structurally unambiguous cases.
| Rule | Condition | Override |
|------|-----------|---------|
| A | `contextBefore` last word is "date" | → `date` |
| B | `contextBefore` last word is "initials" | → `initials` |
| C | `rowTotal > 1` AND `rowIndex > 1` AND AI classified as signature | → `date` (if last+contextBelow has "(Date)") or `text` |
| D | `rowTotal=2`, `rowIndex=1`, AI classified as signature, contextBelow has "(Address/Phone)"+"(Date)" but NOT "(Seller"/"(Buyer" | → `text` |
**Why needed:** The footer pattern `Seller's Initials [ ] Date ___` and signature block rows `[sig] [address/phone] [date]` are structurally deterministic but the AI sometimes misclassifies them.
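Rules A and B are simple enough to sketch directly. The version below is an assumption about how a "last word" check might be written, not a copy of the post-processing code: `lastWord` and `applyLabelRules` are hypothetical names, and the word tokenization (lowercase letters only) is a simplification. Rules C and D depend on row metadata and are omitted.

```typescript
type FieldType =
  | 'text' | 'initials' | 'date' | 'client-signature'
  | 'agent-signature' | 'agent-initials' | 'checkbox';

// Returns the last alphabetic word of the label text, lowercased.
// Punctuation such as a trailing colon is ignored.
function lastWord(text: string): string {
  const words = text.toLowerCase().match(/[a-z]+/g);
  if (words === null) return '';
  return words[words.length - 1] ?? '';
}

// Deterministic overrides applied after the AI classification.
function applyLabelRules(aiType: FieldType, contextBefore: string): FieldType {
  const word = lastWord(contextBefore);
  if (word === 'date') return 'date'; // Rule A
  if (word === 'initials') return 'initials'; // Rule B
  return aiType; // no override; keep the AI's answer
}

const ruleA = applyLabelRules('client-signature', 'Date:');
const ruleB = applyLabelRules('text', "Seller's Initials");
const untouched = applyLabelRules('text', 'Buyer name');
```

Keeping the overrides as a pure function of the AI result plus context makes them easy to unit test without calling the model.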
### Pattern 5: FieldPlacer Coordinate Rendering Formula
**What:** How stored PDF coordinates are converted back to screen pixels for rendering.
```typescript
// Source: FieldPlacer.tsx pdfToScreenCoords (lines 44-54)
function pdfToScreenCoords(pdfX, pdfY, renderedW, renderedH, pageInfo) {
const left = (pdfX / pageInfo.originalWidth) * renderedW;
// top is distance from DOM top to BOTTOM EDGE of field
const top = renderedH - (pdfY / pageInfo.originalHeight) * renderedH;
return { left, top };
}
// Rendering (FieldPlacer.tsx line 630):
// top: top - heightPx + canvasOffset.y ← shift up by height to get visual top of field
```
`pageInfo.originalWidth/originalHeight` are from react-pdf's `page.view[2]/page.view[3]` (mediaBox dimensions in PDF points at scale 1.0).
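To make the Y-flip concrete, here is a typed copy of the formula above together with a hypothetical inverse (`screenToPdfCoords`) used only to show that the mapping round-trips. The inverse and the `PageInfo` interface are assumptions for this sketch; the drag-end formula in `FieldPlacer.tsx` may differ in detail.

```typescript
interface PageInfo { originalWidth: number; originalHeight: number; }

// Same arithmetic as the snippet above, with explicit types.
function pdfToScreenCoords(
  pdfX: number, pdfY: number, renderedW: number, renderedH: number, pageInfo: PageInfo,
): { left: number; top: number } {
  const left = (pdfX / pageInfo.originalWidth) * renderedW;
  const top = renderedH - (pdfY / pageInfo.originalHeight) * renderedH;
  return { left, top };
}

// ASSUMPTION: hypothetical inverse, written by solving the two equations above.
function screenToPdfCoords(
  left: number, top: number, renderedW: number, renderedH: number, pageInfo: PageInfo,
): { pdfX: number; pdfY: number } {
  const pdfX = (left / renderedW) * pageInfo.originalWidth;
  const pdfY = ((renderedH - top) / renderedH) * pageInfo.originalHeight;
  return { pdfX, pdfY };
}

// US Letter page (612 x 792 pt) rendered at 2x scale.
const letter: PageInfo = { originalWidth: 612, originalHeight: 792 };
const screenPoint = pdfToScreenCoords(72, 720, 1224, 1584, letter);
const roundTrip = screenToPdfCoords(screenPoint.left, screenPoint.top, 1224, 1584, letter);
```

A point 720pt above the page bottom lands roughly 144px below the top of a 1584px-tall render, which matches the "about 9% from top" figure in the worked example later in this document.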
### Pattern 6: AI Auto-place Route and Client State Update
**What:** POST to `/api/documents/[id]/ai-prepare` → receive `{ fields, textFillData }` → update FieldPlacer state via `aiPlacementKey` increment.
```typescript
// Source: DocumentPageClient.tsx (confirmed working in Plan 03)
async function handleAiAutoPlace() {
const res = await fetch(`/api/documents/${docId}/ai-prepare`, { method: 'POST' });
if (res.ok) {
const { textFillData: aiTextFill } = await res.json();
setTextFillData(prev => ({ ...prev, ...aiTextFill })); // MERGE — preserves manual values
setAiPlacementKey(k => k + 1); // triggers FieldPlacer re-fetch from DB
setPreviewToken(null); // invalidate stale preview
}
}
```
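One detail of the merge is worth spelling out: with object spread, manual values survive only because AI-placed fields get fresh UUID keys. If a key did collide, the AI value would win. The sketch below isolates that behaviour; `mergeTextFill` is a hypothetical helper, not a function in `DocumentPageClient.tsx`.

```typescript
type TextFillData = Record<string, string>;

// Equivalent to the updater passed to setTextFillData above.
function mergeTextFill(prev: TextFillData, aiTextFill: TextFillData): TextFillData {
  return { ...prev, ...aiTextFill };
}

const manual: TextFillData = { 'field-manual-1': 'typed by agent' };
const fromAi: TextFillData = { 'field-ai-1': 'Jane Doe' };

// Distinct keys: both entries are kept.
const merged = mergeTextFill(manual, fromAi);

// Same key on both sides: the later spread (the AI value) overwrites.
const collided = mergeTextFill({ shared: 'manual' }, { shared: 'ai' });
```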
### Anti-Patterns to Avoid
- **DO NOT** recreate `aiCoordsToPagePdfSpace` — that function was for the abandoned vision approach and does not belong in the current architecture
- **DO NOT** recreate `ai-coords.test.ts` — it tested a function that no longer exists; Plan 04 Task 1 must reference `prepare-document.test.ts` instead
- **DO NOT** import from `'pdfjs-dist/legacy/build/pdf.mjs'` with `GlobalWorkerOptions.workerSrc = ''` (empty string); this fails under pdfjs-dist 5.x (see Pitfall 2), so always use the `file://` URL path
- **DO NOT** use `zodResponseFormat` — broken with Zod v4.3.6 (confirmed GitHub issues #1540, #1602, #1709)
- **DO NOT** add `@napi-rs/canvas` imports back to extract-text.ts — the vision approach was abandoned; the `serverExternalPackages` entry in next.config.ts can stay but the import is gone
- **DO NOT** use vision/image-based approach — the text-extraction approach is simpler, cheaper, and coordinate-accurate
- **DO NOT** lock fields after AI placement — agent must be able to edit, move, resize, delete
---
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| PDF text extraction | Custom PDF parser | pdfjs-dist `getTextContent()` | Already installed; handles encoding, multi-page, returns transform matrix with exact coordinates |
| Field type classification | Rule-based regex | GPT-4.1 with manual json_schema | Context analysis (above/below/before/after) is complex; AI handles natural language labels |
| UUID generation for field IDs | Custom ID generator | `crypto.randomUUID()` | Node built-in; same pattern used everywhere in the project |
| Structured AI output | Parse JSON manually | OpenAI `json_schema` response_format with `strict: true` | Responses are constrained to the schema, so ad hoc JSON repair is unnecessary; refusals and truncated responses still need handling |
**Key insight:** The current architecture is correct and complete. Plan 04 is verification-only — no new code is needed.
---
## Common Pitfalls
### Pitfall 1: Plan 04 Task 1 References Deleted Test File
**What goes wrong:** Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts`. That file was deleted when the vision approach was abandoned. Running this command fails immediately.
**Why it happens:** The plan was written for the vision approach. The architecture pivoted after the plan was written.
**How to avoid:** Plan 04 Task 1 must run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts` instead. The `prepare-document.test.ts` file has 10 passing tests covering the Y-flip coordinate formula — these remain valid and relevant.
**Warning signs:** `No tests found for path pattern 'src/lib/pdf/__tests__/ai-coords.test.ts'`
### Pitfall 2: pdfjs-dist 5.x Fake-Worker Requires `file://` URL
**What goes wrong:** Using `GlobalWorkerOptions.workerSrc = ''` (empty string) throws `PDFWorker: workerSrc not set` in Node.js route handlers with pdfjs-dist 5.x.
**Why it happens:** pdfjs-dist 5.x changed fake-worker mode. Empty string is falsy; the PDFWorker getter throws before attempting the dynamic import. The workerSrc must be a valid URL the Node.js dynamic importer can resolve.
**How to avoid:** Use `file://${join(process.cwd(), 'node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs')}`. This is already correct in the current `extract-text.ts`.
**Warning signs:** `Error: Setting up fake worker failed` or `workerSrc not set` in server logs.
### Pitfall 3: OpenAI Strict Mode Schema Cascades to Nested Objects
**What goes wrong:** A JSON schema with `strict: true` that omits `required` or `additionalProperties: false` on a nested object causes a 400 API error.
**Why it happens:** OpenAI strict mode requires `required` listing ALL properties AND `additionalProperties: false` at EVERY object level — including `items` inside arrays.
**How to avoid:** The current `CLASSIFICATION_SCHEMA` in `field-placement.ts` is correct. Do not modify the schema structure.
**Warning signs:** `400 BadRequestError: Invalid schema for response_format`
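If the schema ever does need to change, a small local check can catch the cascade problem before the API does. The walker below is a sketch under assumptions: `SchemaNode` models only the keywords used in this document, and `findStrictViolations` is a hypothetical helper that covers the two rules named above (complete `required`, `additionalProperties: false`), not OpenAI's full strict-mode rule set.

```typescript
interface SchemaNode {
  type?: string;
  properties?: Record<string, SchemaNode>;
  items?: SchemaNode;
  required?: readonly string[];
  additionalProperties?: boolean;
  enum?: readonly string[];
}

// Recursively reports objects that break either strict-mode rule.
function findStrictViolations(node: SchemaNode, path = '$'): string[] {
  const problems: string[] = [];
  if (node.type === 'object') {
    const properties = node.properties ?? {};
    const required = node.required ?? [];
    for (const name of Object.keys(properties)) {
      if (!required.includes(name)) {
        problems.push(`${path}: "${name}" is not listed in required`);
      }
      const child = properties[name];
      if (child !== undefined) {
        problems.push(...findStrictViolations(child, `${path}.${name}`));
      }
    }
    if (node.additionalProperties !== false) {
      problems.push(`${path}: additionalProperties must be false`);
    }
  }
  if (node.type === 'array' && node.items !== undefined) {
    problems.push(...findStrictViolations(node.items, `${path}[]`));
  }
  return problems;
}

const goodSchema: SchemaNode = {
  type: 'object',
  properties: {
    fields: {
      type: 'array',
      items: {
        type: 'object',
        properties: { index: { type: 'integer' }, fieldType: { type: 'string' } },
        required: ['index', 'fieldType'],
        additionalProperties: false,
      },
    },
  },
  required: ['fields'],
  additionalProperties: false,
};

// Nested object forgets one required entry and additionalProperties.
const badSchema: SchemaNode = {
  type: 'object',
  properties: {
    fields: {
      type: 'array',
      items: {
        type: 'object',
        properties: { index: { type: 'integer' }, fieldType: { type: 'string' } },
        required: ['index'],
      },
    },
  },
  required: ['fields'],
  additionalProperties: false,
};

const goodProblems = findStrictViolations(goodSchema);
const badProblems = findStrictViolations(badSchema);
```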
### Pitfall 4: Checkbox Fields Must Be Filtered Out Before Storing
**What goes wrong:** GPT-4.1 returns `fieldType: "checkbox"` for inline selection checkboxes (e.g., `[ ] ARE [ ] ARE NOT`). If these are added to `SignatureFieldData[]`, they appear as placed fields that cannot be properly handled by the signing flow.
**Why it happens:** The AI correctly identifies these as checkboxes (not fill-in blanks). They should be classified but not placed.
**How to avoid:** The current code already filters: `if (result.fieldType === 'checkbox') continue;`. This is correct.
**Warning signs:** Dozens of tiny checkbox-type fields placed throughout the document body.
### Pitfall 5: textFillData Keys Must Be Field UUIDs
**What goes wrong:** textFillData keyed by label string ("clientName") does not match DocumentPageClient's lookup of `textFillData[field.id]`.
**Why it happens:** Phase 12.1 wired textFillData to use field UUID as key (STATE.md confirmed locked decision). Label-keyed maps are silently ignored.
**How to avoid:** The route handler creates UUIDs (`crypto.randomUUID()`) BEFORE building textFillData, then uses those same UUIDs as keys. This is already correct in the current code.
**Warning signs:** Text pre-fill values don't appear in the preview even though AI returned them.
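Pitfalls 4 and 5 both concern the loop that turns AI results into stored fields, so they can be sketched together. This is an illustrative reconstruction, not the project's code: the interfaces are trimmed, `buildFieldsAndFill` is a hypothetical name, and treating an empty `prefillValue` as "no pre-fill" for text fields is an assumption.

```typescript
import { randomUUID } from 'node:crypto';

interface Blank { page: number; x: number; y: number; width: number; }
interface Classification { index: number; fieldType: string; prefillValue: string; }
interface PlacedField { id: string; page: number; x: number; y: number; width: number; type: string; }

function buildFieldsAndFill(blanks: Blank[], results: Classification[]) {
  const fields: PlacedField[] = [];
  const textFillData: Record<string, string> = {};
  for (const result of results) {
    // Pitfall 4: checkboxes are classified but never placed.
    if (result.fieldType === 'checkbox') continue;
    const blank = blanks[result.index];
    if (blank === undefined) continue; // ignore out-of-range indices
    // Pitfall 5: create the UUID first, then reuse it as the textFillData key.
    const id = randomUUID();
    fields.push({ id, page: blank.page, x: blank.x, y: blank.y, width: blank.width, type: result.fieldType });
    // ASSUMPTION: an empty string means the AI had nothing to pre-fill.
    if (result.fieldType === 'text' && result.prefillValue !== '') {
      textFillData[id] = result.prefillValue;
    }
  }
  return { fields, textFillData };
}

const sampleBlanks: Blank[] = [
  { page: 1, x: 72, y: 700, width: 200 },
  { page: 1, x: 72, y: 650, width: 12 },
  { page: 1, x: 72, y: 100, width: 180 },
];
const built = buildFieldsAndFill(sampleBlanks, [
  { index: 0, fieldType: 'text', prefillValue: 'Jane Doe' },
  { index: 1, fieldType: 'checkbox', prefillValue: '' },
  { index: 2, fieldType: 'client-signature', prefillValue: '' },
]);
```

Because the same `id` variable feeds both `fields` and `textFillData`, a lookup of `textFillData[field.id]` on the client finds the value without any label matching.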
### Pitfall 6: Y Coordinate Is Baseline, Not Field Bottom
**What goes wrong:** A field placed exactly at `blank.y` (text baseline) appears above the underline, not on it. Underscores descend slightly below the baseline.
**Why it happens:** PDF text baseline is where capital letters sit. Descenders (underscores, lowercase g/y/p) extend below the baseline.
**How to avoid:** The current code uses `const y = Math.max(0, blank.y - 2)`. The -2pt offset anchors the field bottom edge slightly below the baseline, sitting on the underline visually.
**Warning signs:** Fields appear to float just above the underline instead of sitting on it.
### Pitfall 7: Debug Console.log Statements Are Still in Production Code
**What goes wrong:** Three `console.log` statements remain in `classifyFieldsWithAI`: one printing the blank count, one printing ALL blank descriptions (can be very verbose for large forms), and one printing all AI classifications.
**Why it happens:** Debugging statements added during development were not removed.
**How to avoid:** Plan 04 should include removing these console.log statements before final sign-off. They do not affect correctness but should not ship in production code.
**Warning signs:**
```
[ai-prepare] calling gpt-4.1 with 47 blanks
[ai-prepare] blank descriptions:
[0] page1 before="..." ...
```
---
## Code Examples
### Verified: How pdfjs Text Coordinates Map to Field Positions
```typescript
// pdfjs transform matrix: [scaleX, skewY, skewX, scaleY, translateX, translateY]
// For a text item on a US Letter page (612 × 792 pts):
// transform[4] = x (distance from left edge of page, in points)
// transform[5] = y (distance from BOTTOM of page, in points — PDF user-space)
//
// Example: text near top of page
// transform[5] ≈ 720 (720pt from bottom = ~1" from top on a 792pt page)
//
// FieldPlacer renders this as:
// top = renderedH - (720 / 792) * renderedH = renderedH * 0.091 ≈ 9% from top ✓
//
// Example: footer initials near bottom of page
// transform[5] ≈ 36 (36pt from bottom = 0.5" from bottom)
//
// FieldPlacer renders this as:
// top = renderedH - (36 / 792) * renderedH = renderedH * 0.955 ≈ 95.5% from top ✓
```
### Verified: groupIntoLines Threshold
```typescript
// Source: extract-text.ts groupIntoLines (line 58)
// ±5pt tolerance groups items on the same visual line.
// This handles multi-run items (same underline split across font boundaries)
// AND minor baseline variations in real PDFs.
//
// A 3-blank signature row (sig | addr/phone | date) will group correctly IF
// all three blanks have y-values within ±5pt of each other.
// If NOT grouped (y-drift > 5pt), Rule D in post-processing handles the misclassification.
```
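The grouping behaviour described in those comments can be sketched as follows. This is an assumed implementation, not the function in `extract-text.ts`: it anchors each line on the y-value of its first (topmost) item so that tolerance does not accumulate across a long row, which may or may not match the real code.

```typescript
interface Positioned { x: number; y: number; }

const Y_TOLERANCE_PT = 5;

// Groups items whose baselines lie within `tolerance` points of the line's
// first item, then orders each line left to right.
function groupIntoLines<T extends Positioned>(items: T[], tolerance = Y_TOLERANCE_PT): T[][] {
  // Higher y is nearer the page top in PDF user-space, so sort descending.
  const sorted = [...items].sort((a, b) => b.y - a.y);
  const lines: { anchorY: number; items: T[] }[] = [];
  for (const item of sorted) {
    const current = lines[lines.length - 1];
    if (current !== undefined && Math.abs(current.anchorY - item.y) <= tolerance) {
      current.items.push(item);
    } else {
      lines.push({ anchorY: item.y, items: [item] });
    }
  }
  return lines.map(line => line.items.sort((a, b) => a.x - b.x));
}

// A three-blank signature row with slight baseline drift, plus one lower blank.
const groupedLines = groupIntoLines([
  { x: 300, y: 700 },
  { x: 72, y: 702 },
  { x: 450, y: 698 },
  { x: 72, y: 650 },
]);
```

Under this sketch, `rowTotal` would be the length of a line and `rowIndex` the 1-based position of a blank within it, consistent with how Rules C and D use those values.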
### Verified: Route Handler Security and Error Pattern
```typescript
// Source: ai-prepare/route.ts (confirmed complete)
// Guards in order:
// 1. auth() — session check
// 2. OPENAI_API_KEY existence — 503 if missing
// 3. document lookup — 404 if not found
// 4. filePath check — 422 if no PDF
// 5. status === 'Draft' — 403 if locked
// 6. path traversal check — 403 if escapes UPLOADS_DIR
// 7. try/catch wrapping extractBlanks + classifyFieldsWithAI — 500 with message
// 8. DB write — direct update, no status change
// 9. return { fields, textFillData }
```
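Guard 6 (the path traversal check) is the one most easily written incorrectly, so a sketch may help. The helper name `isInsideUploads` and the example directory are hypothetical, and whether the stored `filePath` is relative or absolute in the real route is not stated here; the sketch handles both on POSIX-style paths.

```typescript
import { resolve, sep } from 'node:path';

// Returns true only when filePath, resolved against uploadsDir, stays
// strictly inside that directory.
function isInsideUploads(uploadsDir: string, filePath: string): boolean {
  const root = resolve(uploadsDir);
  const target = resolve(root, filePath);
  // Appending the separator prevents "/srv/uploads-evil" from matching "/srv/uploads".
  return target.startsWith(root + sep);
}

const uploadsRoot = '/srv/uploads'; // hypothetical UPLOADS_DIR value
const okPath = isInsideUploads(uploadsRoot, 'docs/contract.pdf');
const dotDotPath = isInsideUploads(uploadsRoot, '../secrets.env');
const absolutePath = isInsideUploads(uploadsRoot, '/etc/passwd');
const siblingPath = isInsideUploads(uploadsRoot, '../uploads-evil/x.pdf');
```

In a route handler, a `false` result would map to the 403 response listed in step 6.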
---
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| GPT-4o vision (render pages as JPEG) | pdfjs text-layer extraction | After Plan 01, during debugging | Eliminates coordinate conversion math and the 30-40% vertical offset bug |
| `aiCoordsToPagePdfSpace()` + `ai-coords.test.ts` | Direct pdfjs coordinates — no conversion | After Plan 01 pivot | Deleted function and test; `prepare-document.test.ts` is the only relevant test |
| GPT-4o (vision model) | GPT-4.1 (text model) | After vision approach abandoned | Cheaper, faster, sufficient for text classification task |
| `GlobalWorkerOptions.workerSrc = ''` | `file://` URL path | pdfjs-dist 5.x requirement | Required for fake-worker mode; empty string is falsy in 5.x |
| `zodResponseFormat` helper | Manual `json_schema` | Zod v4 incompatibility (issues #1540, #1602, #1709) | Permanent — project uses Zod v4.3.6 |
| `@napi-rs/canvas` for image rendering | Not used (deleted from imports) | After vision approach abandoned | Package still installed but not imported; `serverExternalPackages` entry can remain |
**Deprecated/outdated items that should NOT appear in new code:**
- `import { createCanvas } from '@napi-rs/canvas'` in extract-text.ts — deleted, do not restore
- `aiCoordsToPagePdfSpace()` function — deleted, do not recreate
- `ai-coords.test.ts` — deleted, do not recreate
- `zodResponseFormat` — broken with Zod v4, do not use
- `GlobalWorkerOptions.workerSrc = ''` (empty string) — wrong for pdfjs 5.x, do not use
---
## Open Questions
1. **Debug console.log statements in classifyFieldsWithAI**
- What we know: Three `console.log` calls remain in `field-placement.ts` (blank count, all descriptions, all classifications). These print verbose output to server logs on every AI auto-place request.
- What's unclear: Whether to remove before Plan 04 verification or after.
- Recommendation: Remove before E2E verification so server logs are clean for manual testing. Add to Plan 04 Task 1.
2. **gpt-4.1 model availability**
- What we know: `field-placement.ts` uses `model: 'gpt-4.1'`. This is the current model. The project's OPENAI_API_KEY must have access to this model.
- What's unclear: Whether the developer's API key has gpt-4.1 access; this has not been verified and may depend on the account tier.
- Recommendation: If the API call returns a 404 model error, fall back to `gpt-4o` (which was used in an earlier iteration and is broadly available). The prompt and schema should carry over unchanged, but re-check classifications after switching models.
3. **Real Utah REPC 20-page accuracy**
- What we know: The text-extraction approach extracts blanks accurately based on the 4 detection strategies. Accuracy depends on the quality of the PDF text layer.
- What's unclear: How many blanks are missed or double-detected on the full 20-page Utah REPC. The system prompt is tuned for Utah real estate signature patterns.
- Recommendation: Plan 04 human verification on a real Utah REPC is non-negotiable. Expect 80-95% accuracy — imperfect placement is acceptable as the agent reviews before sending.
4. **Plan 04 Task 1 command mismatch**
- What we know: Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts` — that file is deleted.
- Recommendation: Plan 04 must be updated to run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` and `npx tsc --noEmit` instead.
---
## Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| pdfjs-dist | extractBlanks() | ✓ | 5.4.296 | — (required, already installed) |
| pdfjs worker file | pdfjs fake-worker | ✓ | `node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs` | — |
| openai SDK | classifyFieldsWithAI() | ✓ | ^6.32.0 | — (required, already installed) |
| OPENAI_API_KEY env var | classifyFieldsWithAI() | Unknown | — | Route returns 503 with message if missing |
| Node.js | All server-side code | ✓ | v23.6.0 | — |
| @napi-rs/canvas | NOT required anymore | ✓ (installed) | ^0.1.97 | N/A — no longer imported |
**Missing dependencies with no fallback:**
- OPENAI_API_KEY in `.env.local` — must be set before Plan 04 verification. Route returns 503 if missing (actionable error, not a crash).
---
## Validation Architecture
### Test Framework
| Property | Value |
|----------|-------|
| Framework | Jest 29.7.0 + ts-jest |
| Config file | package.json `"jest": { "preset": "ts-jest", "testEnvironment": "node" }` |
| Quick run command | `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` |
| Full suite command | `npx jest --no-coverage --verbose` |
### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| AI-01 | Y-flip coordinate formula correct (FieldPlacer screen↔PDF) | unit | `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` | ✅ |
| AI-01 | AI auto-place button appears, fields land on canvas, correct vertical positions | manual E2E | human verification (Plan 04 Task 2) | N/A |
| AI-02 | Text pre-fill values from client profile appear in field values | manual E2E | human verification (Plan 04 Task 2) | N/A |
### Sampling Rate
- **Per task commit:** `npx jest --no-coverage`
- **Per wave merge:** `npx jest --no-coverage && npx tsc --noEmit`
- **Phase gate:** Full suite green + TypeScript clean + human E2E approval before phase complete
### Wave 0 Gaps
None — existing test infrastructure covers coordinate math. No new test files required for Plan 04. The deleted `ai-coords.test.ts` is intentionally not recreated (it tested a deleted function).
---
## Sources
### Primary (HIGH confidence)
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/extract-text.ts` — current implementation, text-layer extraction approach
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/field-placement.ts` — current implementation, GPT-4.1 classification
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/api/documents/[id]/ai-prepare/route.ts` — current route implementation
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx`: `pdfToScreenCoords` function (lines 44-54) and drag-end coordinate formula (lines 291-292)
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/PdfViewer.tsx`: `pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf)
- `.planning/phases/13-ai-field-placement-and-pre-fill/.continue-here.md` — complete record of what was attempted, what failed, and current state
- `git log --oneline` — confirmed commit history showing vision→text pivot
- STATE.md — locked decisions including manual json_schema, pdfjs-dist legacy build, Zod v4 incompatibility
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/pdf/__tests__/prepare-document.test.ts` — 10 tests, all passing
### Secondary (MEDIUM confidence)
- pdfjs-dist transform matrix format — verified by reading pdfjs docs + confirmed by the implementation's usage of `transform[4]`/`transform[5]` for x/y
- react-pdf `page.view` array format — confirmed from PdfViewer.tsx `page.view[0], page.view[2]` / `page.view[1], page.view[3]` usage with comment `// Math.max handles non-standard mediaBox ordering`
### Tertiary (LOW confidence)
- GPT-4.1 model availability for all API keys — not verified; gpt-4.1 may not be available in all tiers
- Real Utah REPC 20-page blank extraction accuracy — not measured; text-layer quality varies by PDF
---
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH — all packages confirmed installed and version-verified
- Architecture: HIGH — current code read directly, coordinate system verified analytically and confirmed by test results
- Pitfalls: HIGH — all pitfalls documented from actual bugs encountered during development (per .continue-here.md) or confirmed from code inspection
- Plan 04 gap (Task 1 test command): HIGH — confirmed by checking git status (ai-coords.test.ts deleted), running jest (prepare-document.test.ts passes)
**Research date:** 2026-04-03
**Valid until:** 2026-05-03 (30 days — stack stable; coordinate math is deterministic; no external dependencies changing)