diff --git a/.planning/phases/13-ai-field-placement-and-pre-fill/13-RESEARCH.md b/.planning/phases/13-ai-field-placement-and-pre-fill/13-RESEARCH.md
index 1329162..d1cc4c9 100644
--- a/.planning/phases/13-ai-field-placement-and-pre-fill/13-RESEARCH.md
+++ b/.planning/phases/13-ai-field-placement-and-pre-fill/13-RESEARCH.md
@@ -1,7 +1,7 @@
# Phase 13: AI Field Placement and Pre-fill - Research
-**Researched:** 2026-03-21
-**Domain:** OpenAI GPT-4o-mini structured outputs + pdfjs-dist server-side text extraction + coordinate conversion
+**Researched:** 2026-04-03 (re-research — complete rewrite)
+**Domain:** pdfjs-dist text-layer blank extraction + GPT-4.1 structured output field classification + coordinate system
**Confidence:** HIGH
@@ -9,21 +9,82 @@
| ID | Description | Research Support |
|----|-------------|-----------------|
-| AI-01 | Agent can click one button to have AI auto-place all field types (text, checkbox, initials, date, agent signature, client signature) on a PDF in correct positions | pdfjs-dist legacy build for text extraction; GPT-4o-mini structured output for field classification; aiCoordsToPagePdfSpace() for coordinate conversion; fields PUT to existing /api/documents/[id]/fields endpoint |
-| AI-02 | AI pre-fills text fields with known values from the client profile (name, property address, date) | Client profile already has name + propertyAddress in DB; textFillData (Record keyed by field UUID) already wired through DocumentPageClient → FieldPlacer; AI route returns both fields array and pre-fill map; agent reviews before committing |
+| AI-01 | Agent can click one button to have AI auto-place all field types (text, checkbox, initials, date, agent signature, client signature) on a PDF in correct positions | extractBlanks() extracts blanks with exact PDF user-space coords from pdfjs text layer; classifyFieldsWithAI() sends blank descriptions to GPT-4.1 for type classification; no coordinate conversion needed (pdfjs coords stored directly) |
+| AI-02 | AI pre-fills text fields with known values from the client profile (name, property address, date) | classifyFieldsWithAI() returns textFillData keyed by field UUID; merged into DocumentPageClient.textFillData state |
---
## Summary
-Phase 13 adds an "AI Auto-place" button that extracts text from the PDF via pdfjs-dist (already installed), sends that text to GPT-4o-mini with a structured output schema asking for field type + page + normalized percentage coordinates, converts those percentages to PDF user-space points (with Y-axis flip), and writes the resulting `SignatureFieldData[]` array to the existing `/api/documents/[id]/fields` PUT endpoint. The agent reviews, adjusts, or deletes any AI-placed field before proceeding.
+Phase 13 is implemented through Plans 01–03 and is awaiting final E2E verification in Plan 04. Three plans are complete. The architecture evolved significantly from the original research: the system now uses **direct PDF text-layer extraction** (pdfjs-dist `getTextContent()` with transform matrix coordinates) rather than GPT-4o vision. This eliminates the coordinate conversion bug entirely — pdfjs returns coordinates already in PDF user-space (bottom-left origin, points), which is exactly what FieldPlacer stores and renders.
-The `openai` npm package (v6.x, latest as of March 2026) is NOT currently installed in the project — it must be added. However, pdfjs-dist 5.4.296 is already present (via react-pdf) and the legacy build at `pdfjs-dist/legacy/build/pdf.mjs` is usable for server-side text extraction. The coordinate conversion formula already exists in FieldPlacer.tsx and `prepare-document.test.ts`; Phase 13 must replicate it as a named utility (`aiCoordsToPagePdfSpace`) with a dedicated unit test.
+The original vision-based approach (render pages as JPEG → GPT-4o → xPct/yPct → aiCoordsToPagePdfSpace conversion) was attempted and abandoned due to a systematic 30-40% vertical offset that could not be resolved. The text-extraction approach (extract blank positions from pdfjs text layer → GPT-4.1 for type classification only → store raw coordinates) is now working code in the repository.
-The decision recorded in STATE.md is authoritative: do NOT use `zodResponseFormat` (broken with Zod v4 that's installed). Use manual `json_schema` response_format with `strict: true`, `additionalProperties: false`, and every property in `required` at every nesting level. GPT-4o-mini supports this natively (confirmed since gpt-4o-mini-2024-07-18).
+**Plan 04 is the only remaining task**: run unit tests, fix the TypeScript build, and perform human E2E verification. The `ai-coords.test.ts` file was deleted when the vision approach was abandoned and must NOT be recreated for the new architecture (it tested a function that no longer exists). Plan 04 Task 1 must be updated to reflect the current test infrastructure.
-**Primary recommendation:** Install `openai` npm package, use `pdfjs-dist/legacy/build/pdf.mjs` for text extraction with `GlobalWorkerOptions.workerSrc = ''` in Node.js route context, send extracted text to GPT-4o-mini with manual JSON schema, convert percentage coordinates to PDF points with Y-axis flip, write fields via existing PUT endpoint, and have the agent review before committing.
+**Primary recommendation:** Plan 04 should confirm TypeScript compiles clean, run `prepare-document.test.ts` (10 tests passing), then proceed directly to human E2E verification. The coordinate bug is resolved by architecture — no code fix required for coordinates.
+
+---
+
+## Current Implementation State
+
+### What Is Already Built (Plans 01–03 Complete)
+
+The following files are implemented and committed:
+
+| File | Status | Purpose |
+|------|--------|---------|
+| `src/lib/ai/extract-text.ts` | Complete (uncommitted changes) | pdfjs-dist blank extraction via text layer |
+| `src/lib/ai/field-placement.ts` | Complete (uncommitted changes) | GPT-4.1 field type classification |
+| `src/app/api/documents/[id]/ai-prepare/route.ts` | Complete (uncommitted changes) | POST route orchestrating the pipeline |
+| `src/lib/pdf/__tests__/ai-coords.test.ts` | **Deleted** | Was for vision approach; no longer needed |
+| `src/lib/pdf/__tests__/prepare-document.test.ts` | Complete | 10 tests passing — Y-flip formula |
+
+### Architecture (Current — Text Extraction Based)
+
+```
+PDF file
+ ↓
+extractBlanks() [extract-text.ts]
+ → pdfjs getTextContent() → transform[4]=x, transform[5]=y (PDF user-space, bottom-left origin)
+ → 4 detection strategies (underscore runs, embedded underscores, bracket items)
+ → groupIntoLines() with ±5pt y-tolerance
+ → deduplication for Strategy 3+4 overlap
+ → returns BlankField[] with {page, x, y, width, contextBefore, contextAfter, contextAbove, contextBelow, rowIndex, rowTotal}
+ ↓
+classifyFieldsWithAI() [field-placement.ts]
+ → GPT-4.1 receives compact text descriptions (index, page, row=N/T, context strings)
+ → returns {index, fieldType, prefillValue} per blank
+ → deterministic post-processing (Rules A/B/C/D) overrides AI errors
+ → FIELD_HEIGHTS map (type → height in pts)
+ → SIZE_LIMITS map (type → {minW, maxW} in pts)
+ → stores: { id: UUID, page, x: blank.x, y: blank.y-2, width: clamped, height: by-type }
+ → returns { fields: SignatureFieldData[], textFillData: Record }
+ ↓
+POST /api/documents/[id]/ai-prepare [route.ts]
+ → writes fields to DB (signatureFields column)
+ → returns { fields, textFillData }
+ ↓
+DocumentPageClient (React)
+ → setAiPlacementKey(k+1) → FieldPlacer re-fetches from DB
+ → setTextFillData(prev => { ...prev, ...aiTextFill }) — merge, not replace
+```
+
+### Coordinate System — Fully Resolved
+
+**The coordinate bug from the vision approach is NOT present in the current architecture.**
+
+The text-extraction approach works because:
+- pdfjs `item.transform[4]` = x position in PDF user-space (points, left from page left edge)
+- pdfjs `item.transform[5]` = y position in PDF user-space (points, up from page bottom)
+- These are stored directly as `field.x` and `field.y` in SignatureFieldData
+- FieldPlacer renders stored fields using `pdfToScreenCoords(field.x, field.y, renderedW, renderedH, pageInfo)` which uses `pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf mediaBox dimensions)
+- Since both extraction (pdfjs transform matrix) and rendering (react-pdf mediaBox) read from the same PDF mediaBox, they are inherently consistent
+
+**No `aiCoordsToPagePdfSpace` conversion function is needed or present in the current code.**
+
+The `aiCoordsToPagePdfSpace` function and its test (`ai-coords.test.ts`) were created for the vision approach and then deleted when the approach changed. Do not recreate them.
---
@@ -33,122 +94,80 @@ The decision recorded in STATE.md is authoritative: do NOT use `zodResponseForma
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
-| openai (npm) | ^6.32.0 (latest Mar 2026) | Official OpenAI TypeScript SDK for Chat Completions API | Not yet installed; must `npm install openai`. Official SDK with strict mode structured outputs |
-| pdfjs-dist | 5.4.296 (already installed) | Server-side PDF text extraction via `getTextContent()` | Already a project dependency via react-pdf. Legacy build works in Node.js route handlers |
+| pdfjs-dist | 5.4.296 (hoisted from react-pdf) | PDF text-layer extraction via `getTextContent()` | Already installed; legacy build works in Node.js with file:// workerSrc |
+| openai | ^6.32.0 (installed) | GPT-4.1 structured output for field type classification | Official SDK; installed; manual json_schema required (Zod v4 incompatibility) |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
-| server-only | NOT installed (not in node_modules) | Build-time guard preventing client-side import of server modules | Use comment guard `// server-only` or explicit `if (typeof window !== 'undefined') throw` — OR install `server-only` package |
-| crypto.randomUUID() | Node built-in | Generate UUIDs for AI-placed field IDs | Already used in FieldPlacer for field IDs; same pattern here |
+| @napi-rs/canvas | ^0.1.97 (installed, no longer used) | Was for server-side JPEG rendering | Kept in package.json but no longer imported; `serverExternalPackages` still lists it in next.config.ts — leave that entry to avoid breaking changes |
+| crypto.randomUUID() | Node built-in | UUID generation for field IDs | Used in classifyFieldsWithAI to assign IDs before textFillData keying |
-### Alternatives Considered
+### Verified Package Versions
-| Instead of | Could Use | Tradeoff |
-|------------|-----------|----------|
-| Manual `json_schema` response_format | `zodResponseFormat` helper | zodResponseFormat is broken with Zod v4 (confirmed issues #1540, #1602, #1709 — this is a locked decision in STATE.md) |
-| pdfjs-dist legacy build | pdf-parse, unpdf | pdfjs-dist is already installed; unpdf would add a dependency; pdf-parse is simpler but less positional data available |
-| GPT-4o-mini | GPT-4o, GPT-4.1 | GPT-4o-mini is cheapest and supports structured outputs with 100% schema compliance; good enough for field classification |
-
-**Installation:**
```bash
-npm install openai
+# pdfjs-dist version confirmed:
+cat node_modules/pdfjs-dist/package.json | grep '"version"'
+# → "version": "5.4.296"
+
+# openai version confirmed:
+cat node_modules/openai/package.json | grep '"version"'
```
+**No installation needed** — all required packages are already in node_modules.
+
---
## Architecture Patterns
-### Recommended Project Structure
+### Recommended Project Structure (Current)
```
src/
├── lib/
│ ├── ai/
-│ │ ├── extract-text.ts # pdfjs-dist server-side text extraction (server-only guard)
-│ │ └── field-placement.ts # GPT-4o-mini call + aiCoordsToPagePdfSpace() (server-only guard)
+│ │ ├── extract-text.ts # pdfjs-dist blank extraction (4 strategies)
+│ │ └── field-placement.ts # GPT-4.1 type classification + post-processing
│ └── pdf/
-│ ├── prepare-document.ts # existing — unchanged
│ └── __tests__/
-│ ├── prepare-document.test.ts # existing — unchanged
-│ └── ai-coords.test.ts # NEW — unit test for aiCoordsToPagePdfSpace()
+│ └── prepare-document.test.ts # 10 tests — Y-flip formula (passing)
└── app/
└── api/
└── documents/
└── [id]/
└── ai-prepare/
- └── route.ts # NEW — POST /api/documents/[id]/ai-prepare
+ └── route.ts # POST handler
```
-### Pattern 1: Server-Side PDF Text Extraction with pdfjs-dist Legacy Build
+### Pattern 1: pdfjs-dist Blank Extraction (Text Layer)
-**What:** Import from `pdfjs-dist/legacy/build/pdf.mjs`, set `GlobalWorkerOptions.workerSrc = ''` (empty string tells pdf.js to fall back to synchronous/fake worker mode in Node.js — this is the documented pattern for server-side use without a browser worker thread), load the PDF bytes, iterate pages calling `getTextContent()`.
+**What:** Use pdfjs `getTextContent()` to get all text items. Each item has a `transform` matrix where `transform[4]` = x and `transform[5]` = y in PDF user-space (bottom-left origin, points). Find underscore sequences (Strategy 1: pure runs, Strategy 2: embedded runs) and bracket patterns (Strategy 3: single-item `[ ]`, Strategy 4: multi-item `[ … ]`). Group items into lines with ±5pt y-tolerance.
-**When to use:** Server-only route handlers (Next.js App Router route.ts files run on Node.js).
+**Why text-layer over vision:** Coordinates are exact (sub-point accuracy). No DPI scaling math, no image rendering, no API image tokens, no coordinate conversion needed.
-**Important:** The project's existing client-side usage (`PdfViewer.tsx`, `PreviewModal.tsx`) sets `workerSrc = new URL('pdfjs-dist/build/pdf.worker.min.mjs', import.meta.url).toString()` — that is browser-only. The server-side module must set `workerSrc = ''` independently. Do NOT share the import or the GlobalWorkerOptions assignment across server and client.
+**pdfjs-dist 5.x workerSrc (critical):** Must use `file://` URL pointing to the worker .mjs file — empty string is falsy and causes PDFWorker to throw before the fake-worker import runs.
```typescript
-// Source: pdfjs-dist docs + STATE.md v1.1 Research decision
-// lib/ai/extract-text.ts — server-only
-
-import { getDocument, GlobalWorkerOptions } from 'pdfjs-dist/legacy/build/pdf.mjs';
-import { readFile } from 'node:fs/promises';
-
-// Empty string = no worker thread (fake/synchronous worker) — required for Node.js server context
-GlobalWorkerOptions.workerSrc = '';
-
-export interface PageText {
- page: number; // 1-indexed
- text: string; // all text items joined with spaces
- width: number; // page width in PDF points (72 DPI)
- height: number; // page height in PDF points (72 DPI)
-}
-
-export async function extractPdfText(filePath: string): Promise {
- const data = new Uint8Array(await readFile(filePath));
- const pdf = await getDocument({ data }).promise;
- const pages: PageText[] = [];
-
- for (let pageNum = 1; pageNum <= pdf.numPages; pageNum++) {
- const page = await pdf.getPage(pageNum);
- const viewport = page.getViewport({ scale: 1.0 });
- const textContent = await page.getTextContent();
- const text = textContent.items
- .filter((item) => 'str' in item)
- .map((item) => (item as { str: string }).str)
- .join(' ');
- pages.push({
- page: pageNum,
- width: viewport.width,
- height: viewport.height,
- text,
- });
- }
- return pages;
-}
+// Source: extract-text.ts (confirmed working)
+import { join } from 'node:path';
+GlobalWorkerOptions.workerSrc = `file://${join(process.cwd(), 'node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs')}`;
```
-### Pattern 2: GPT-4o-mini Structured Output with Manual JSON Schema
+**Text item transform matrix:** `[scaleX, skewY, skewX, scaleY, translateX, translateY]`
+- `transform[4]` = x left edge of item (PDF points from page left)
+- `transform[5]` = y baseline of item (PDF points from page bottom)
+- `transform[0]` = font size (approximately) when no rotation
-**What:** Use the `openai` SDK `chat.completions.create()` with `response_format: { type: 'json_schema', json_schema: { ... strict: true } }`. The schema asks the model to return an array of field placement objects with: `page` (1-indexed integer), `fieldType` (enum of 7 types), `xPct` (0–100 percentage of page width), `yPct` (0–100 percentage of page height, measured from page TOP — AI models think top-left origin), `widthPct`, `heightPct`, and optionally `prefillValue` for text fields.
+### Pattern 2: GPT-4.1 Type Classification (Text-Only, No Vision)
-**Critical JSON Schema rule:** When `strict: true`, ALL properties must be in `required` and ALL objects must have `additionalProperties: false`. Any missing field from `required` causes an API 400 error.
+**What:** Send a compact text description of each detected blank to GPT-4.1. The description includes blank index, page number, row position metadata (`row=N/T`), and context strings (contextBefore, contextAfter, contextAbove, contextBelow). AI returns only the field type and prefill value — coordinates come from pdfjs directly.
-**When to use:** All AI field placement requests.
+**Schema:** Manual `json_schema` with `strict: true`, all properties in `required`, `additionalProperties: false` at every nesting level. Do NOT use `zodResponseFormat` (broken with Zod v4).
```typescript
-// Source: OpenAI docs + STATE.md locked decision (manual json_schema, not zodResponseFormat)
-// lib/ai/field-placement.ts — server-only
-
-import OpenAI from 'openai';
-import type { PageText } from './extract-text';
-import type { SignatureFieldData, SignatureFieldType } from '@/lib/db/schema';
-
-const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
-
-const FIELD_PLACEMENT_SCHEMA = {
+// Source: field-placement.ts (confirmed working)
+const CLASSIFICATION_SCHEMA = {
type: 'object',
properties: {
fields: {
@@ -156,15 +175,11 @@ const FIELD_PLACEMENT_SCHEMA = {
items: {
type: 'object',
properties: {
- page: { type: 'integer' },
- fieldType: { type: 'string', enum: ['text', 'checkbox', 'initials', 'date', 'client-signature', 'agent-signature', 'agent-initials'] },
- xPct: { type: 'number' },
- yPct: { type: 'number' }, // % from page TOP (AI top-left origin)
- widthPct: { type: 'number' },
- heightPct: { type: 'number' },
- prefillValue: { type: 'string' }, // only for text fields; empty string if none
+ index: { type: 'integer' },
+ fieldType: { type: 'string', enum: ['text', 'initials', 'date', 'client-signature', 'agent-signature', 'agent-initials', 'checkbox'] },
+ prefillValue: { type: 'string' },
},
- required: ['page', 'fieldType', 'xPct', 'yPct', 'widthPct', 'heightPct', 'prefillValue'],
+ required: ['index', 'fieldType', 'prefillValue'],
additionalProperties: false,
},
},
@@ -174,119 +189,80 @@ const FIELD_PLACEMENT_SCHEMA = {
} as const;
```
-### Pattern 3: Coordinate Conversion — AI Percentage to PDF User-Space
+**Note:** The `checkbox` type is in the enum so the AI can classify inline checkboxes — but fields with `fieldType === 'checkbox'` are filtered out (not added to the fields array). They represent selection options, not placed fields.
-**What:** AI returns percentage coordinates with top-left origin (AI models think of a page as a grid where (0,0) is top-left). PDF user-space uses bottom-left origin with points (1 pt = 1/72 inch). Two conversions are needed:
-1. Percentage to absolute points: `x = (xPct / 100) * pageWidth`
-2. Y-axis flip: `y = pageHeight - (yPct / 100) * pageHeight - fieldHeight`
- (the stored y is the BOTTOM edge of the field in PDF space)
+### Pattern 3: Coordinate Handling — No Conversion Needed
-**This is the exact same formula already used in FieldPlacer.tsx** for the drag-and-drop coordinate conversion. The new `aiCoordsToPagePdfSpace()` function extracts it as a named utility, verified by a unit test.
+**What:** Blank coordinates from pdfjs text layer are already in PDF user-space. Store them directly.
```typescript
-// Source: FieldPlacer.tsx coordinate math (lines 287-295) — same formula
-// lib/ai/field-placement.ts
-
-export interface AiFieldCoords {
- page: number;
- fieldType: SignatureFieldType;
- xPct: number; // % from left, top-left origin
- yPct: number; // % from top, top-left origin
- widthPct: number;
- heightPct: number;
- prefillValue: string;
-}
-
-/**
- * Convert AI percentage coordinates (top-left origin) to PDF user-space points (bottom-left origin).
- *
- * pageWidth/pageHeight in PDF points (from page.getViewport({ scale: 1.0 })).
- *
- * Formula mirrors FieldPlacer.tsx handleDragEnd (lines 289-291):
- * pdfX = (clampedX / renderedW) * pageInfo.originalWidth
- * pdfY = ((renderedH - (clampedY + fieldHpx)) / renderedH) * pageInfo.originalHeight
- */
-export function aiCoordsToPagePdfSpace(
- coords: AiFieldCoords,
- pageWidth: number,
- pageHeight: number,
-): { x: number; y: number; width: number; height: number } {
- const fieldWidth = (coords.widthPct / 100) * pageWidth;
- const fieldHeight = (coords.heightPct / 100) * pageHeight;
- const screenX = (coords.xPct / 100) * pageWidth;
- const screenY = (coords.yPct / 100) * pageHeight; // screen Y from top
-
- const x = screenX;
- // PDF y = distance from BOTTOM. screenY is from top, so flip:
- // pdfY = pageHeight - screenY - fieldHeight (bottom edge of field)
- const y = pageHeight - screenY - fieldHeight;
-
- return { x, y, width: fieldWidth, height: fieldHeight };
-}
+// Source: field-placement.ts (confirmed working)
+const y = Math.max(0, blank.y - 2); // -2pt: anchor just below baseline (underscores descend slightly)
+fields.push({ id, page: blank.page, x: blank.x, y, width, height, type: fieldType });
```
-### Pattern 4: AI Auto-place API Route
+**The -2pt y offset:** Underscore characters descend slightly below the text baseline. Moving the field bottom edge 2pt below the baseline positions the field box to sit ON the underline visually.
-**What:** POST `/api/documents/[id]/ai-prepare` — server-side route that orchestrates:
-1. Load document + client from DB
-2. Resolve PDF file path
-3. Call `extractPdfText()` for all pages
-4. Call GPT-4o-mini with extracted text and client profile data
-5. Convert AI coords to PDF user-space for each field
-6. Write `SignatureFieldData[]` back to DB via direct DB update (same as fields PUT endpoint pattern)
-7. Return `{ fields, textFillData }` — client updates local state from response
+**Why this works without conversion:** pdfjs `transform[5]` is the text baseline Y in the same coordinate space (PDF user-space, bottom-left origin) that FieldPlacer's `pdfToScreenCoords` expects for rendering. `pageInfo.originalHeight` in FieldPlacer comes from `page.view[3]` (react-pdf mediaBox), which is the same value as pdfjs `getViewport({scale:1}).height` for standard PDF pages.
-**When to use:** Called from the "AI Auto-place" button in PreparePanel.
+### Pattern 4: Deterministic Post-Processing Rules
+
+**What:** Four rules that override AI classifications for structurally unambiguous cases.
+
+| Rule | Condition | Override |
+|------|-----------|---------|
+| A | `contextBefore` last word is "date" | → `date` |
+| B | `contextBefore` last word is "initials" | → `initials` |
+| C | `rowTotal > 1` AND `rowIndex > 1` AND AI classified as signature | → `date` (if last+contextBelow has "(Date)") or `text` |
+| D | `rowTotal=2`, `rowIndex=1`, AI classified as signature, contextBelow has "(Address/Phone)"+"(Date)" but NOT "(Seller"/"(Buyer" | → `text` |
+
+**Why needed:** The footer pattern `Seller's Initials [ ] Date ___` and signature block rows `[sig] [address/phone] [date]` are structurally deterministic but the AI sometimes misclassifies them.
+
+### Pattern 5: FieldPlacer Coordinate Rendering Formula
+
+**What:** How stored PDF coordinates are converted back to screen pixels for rendering.
```typescript
-// Source: pattern matches /api/documents/[id]/prepare/route.ts
-// app/api/documents/[id]/ai-prepare/route.ts
-
-export async function POST(
- req: Request,
- { params }: { params: Promise<{ id: string }> }
-) {
- const session = await auth();
- if (!session?.user?.id) return new Response('Unauthorized', { status: 401 });
-
- const { id } = await params;
- const doc = await db.query.documents.findFirst({
- where: eq(documents.id, id),
- with: { client: true },
- });
- if (!doc) return Response.json({ error: 'Not found' }, { status: 404 });
- if (!doc.filePath) return Response.json({ error: 'No PDF file' }, { status: 422 });
- if (doc.status !== 'Draft') return Response.json({ error: 'Document is locked' }, { status: 403 });
-
- const filePath = path.join(UPLOADS_DIR, doc.filePath);
- const pageTexts = await extractPdfText(filePath);
- const { fields, textFillData } = await classifyFieldsWithAI(pageTexts, doc.client);
-
- // Write fields to DB (same as PUT /fields)
- const [updated] = await db
- .update(documents)
- .set({ signatureFields: fields })
- .where(eq(documents.id, id))
- .returning();
-
- return Response.json({ fields: updated.signatureFields, textFillData });
+// Source: FieldPlacer.tsx pdfToScreenCoords (lines 44-54)
+function pdfToScreenCoords(pdfX, pdfY, renderedW, renderedH, pageInfo) {
+ const left = (pdfX / pageInfo.originalWidth) * renderedW;
+ // top is distance from DOM top to BOTTOM EDGE of field
+ const top = renderedH - (pdfY / pageInfo.originalHeight) * renderedH;
+ return { left, top };
}
+
+// Rendering (FieldPlacer.tsx line 630):
+// top: top - heightPx + canvasOffset.y ← shift up by height to get visual top of field
```
-### Pattern 5: AI Auto-place Button in PreparePanel
+`pageInfo.originalWidth/originalHeight` are from react-pdf's `page.view[2]/page.view[3]` (mediaBox dimensions in PDF points at scale 1.0).
-**What:** Add "AI Auto-place" button to PreparePanel. On click: POST to `/api/documents/[id]/ai-prepare`, receive `{ fields, textFillData }`, update DocumentPageClient state (`setTextFillData`, invalidate preview token). FieldPlacer reloads from DB via its existing `loadFields` useEffect (or receives the updated fields as a prop — both approaches work; recommend refreshing from DB to keep single source of truth).
+### Pattern 6: AI Auto-place Route and Client State Update
-**Recommended approach:** After the AI route returns, trigger a page reload or force FieldPlacer to re-fetch by incrementing a `fieldReloadKey` prop. The simplest approach is to call `router.refresh()` (which is already used in PreparePanel after prepare+send) or to expose a `reload` callback from FieldPlacer.
+**What:** POST to `/api/documents/[id]/ai-prepare` → receive `{ fields, textFillData }` → update FieldPlacer state via `aiPlacementKey` increment.
+
+```typescript
+// Source: DocumentPageClient.tsx (confirmed working in Plan 03)
+async function handleAiAutoPlace() {
+ const res = await fetch(`/api/documents/${docId}/ai-prepare`, { method: 'POST' });
+ if (res.ok) {
+ const { textFillData: aiTextFill } = await res.json();
+ setTextFillData(prev => ({ ...prev, ...aiTextFill })); // MERGE — preserves manual values
+ setAiPlacementKey(k => k + 1); // triggers FieldPlacer re-fetch from DB
+ setPreviewToken(null); // invalidate stale preview
+ }
+}
+```
### Anti-Patterns to Avoid
-- **DO NOT** import pdfjs-dist in client components for server extraction — server-only extraction guard is mandatory (pdfjs-dist in a server route is fine; it just must not be bundled into the client)
-- **DO NOT** use `zodResponseFormat` — broken with Zod v4 (issues #1540, #1602, #1709). This is a locked decision.
-- **DO NOT** use `workerSrc = new URL('...', import.meta.url)` in the server route — `import.meta.url` may not resolve correctly in all Next.js Route Handler contexts and triggers browser worker initialization
-- **DO NOT** use the AI coordinates as-is without conversion — AI returns top-left origin percentages; PDF requires bottom-left origin points
-- **DO NOT** lock fields after AI placement — the agent MUST be able to review, adjust, or delete any AI-placed field (this is a success criterion)
-- **DO NOT** have the AI route set document status to anything other than Draft — only the prepare route should move status to Sent
+- **DO NOT** recreate `aiCoordsToPagePdfSpace` — that function was for the abandoned vision approach and does not belong in the current architecture
+- **DO NOT** recreate `ai-coords.test.ts` — it tested a function that no longer exists; Plan 04 Task 1 must reference `prepare-document.test.ts` instead
+- **DO NOT** import from `'pdfjs-dist/legacy/build/pdf.mjs'` with `GlobalWorkerOptions.workerSrc = ''` (empty string) — this is the pdfjs v3+ breaking change; always use the `file://` URL path
+- **DO NOT** use `zodResponseFormat` — broken with Zod v4.3.6 (confirmed GitHub issues #1540, #1602, #1709)
+- **DO NOT** add `@napi-rs/canvas` imports back to extract-text.ts — the vision approach was abandoned; the `serverExternalPackages` entry in next.config.ts can stay but the import is gone
+- **DO NOT** use vision/image-based approach — the text-extraction approach is simpler, cheaper, and coordinate-accurate
+- **DO NOT** lock fields after AI placement — agent must be able to edit, move, resize, delete
---
@@ -294,155 +270,144 @@ export async function POST(
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
-| PDF text extraction | Custom PDF parser | pdfjs-dist legacy build (`getTextContent`) | Already installed; handles fonts, encodings, multi-page; tested by Mozilla |
-| Structured AI output | Manual JSON parsing + regex fallback | OpenAI `json_schema` response_format with `strict: true` | 100% schema compliance guaranteed via constrained decoding; no parse failures |
-| UUID generation for field IDs | Custom ID generator | `crypto.randomUUID()` | Node built-in; same pattern used in FieldPlacer and schema |
-| Field type enum validation | Custom validation function | TypeScript union type `SignatureFieldType` | Already defined in schema.ts; pass as `enum` array in JSON schema |
+| PDF text extraction | Custom PDF parser | pdfjs-dist `getTextContent()` | Already installed; handles encoding, multi-page, returns transform matrix with exact coordinates |
+| Field type classification | Rule-based regex | GPT-4.1 with manual json_schema | Context analysis (above/below/before/after) is complex; AI handles natural language labels |
+| UUID generation for field IDs | Custom ID generator | `crypto.randomUUID()` | Node built-in; same pattern used everywhere in the project |
+| Structured AI output | Parse JSON manually | OpenAI `json_schema` response_format with `strict: true` | 100% schema compliance guaranteed; no parse failures |
-**Key insight:** Both the PDF extraction and the field write endpoint already exist in the project. Phase 13 is primarily a thin orchestration layer connecting them via an AI call.
+**Key insight:** The current architecture is correct and complete. Plan 04 is verification-only — no new code is needed.
---
## Common Pitfalls
-### Pitfall 1: Y-Axis Inversion Bug
+### Pitfall 1: Plan 04 Task 1 References Deleted Test File
-**What goes wrong:** AI returns `yPct: 10` meaning "10% from the top of the page." Developer naively converts: `y = (10/100) * pageHeight = 79.2 pts`. Stored as y=79.2 in PDF space, which means 79.2 pts from the BOTTOM — so the field appears near the bottom, not 10% from the top.
+**What goes wrong:** Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts`. That file was deleted when the vision approach was abandoned. Running this command fails immediately.
-**Why it happens:** AI models describe positions with top-left origin. PDF user-space uses bottom-left origin.
+**Why it happens:** The plan was written for the vision approach. The architecture pivoted after the plan was written.
-**How to avoid:** Use `aiCoordsToPagePdfSpace()` which applies the flip: `y = pageHeight - screenY - fieldHeight`. The unit test validates this with US Letter dimensions.
+**How to avoid:** Plan 04 Task 1 must run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts` instead. The `prepare-document.test.ts` file has 10 passing tests covering the Y-flip coordinate formula — these remain valid and relevant.
-**Warning signs:** Fields appear at the wrong vertical position; fields meant for the top of a page appear near the bottom.
+**Warning signs:** `No tests found for path pattern 'src/lib/pdf/__tests__/ai-coords.test.ts'`
-### Pitfall 2: OpenAI Strict Mode Schema Requirements
+### Pitfall 2: pdfjs-dist 5.x Fake-Worker Requires `file://` URL
-**What goes wrong:** Passing a JSON schema with `strict: true` that omits a property from `required` or omits `additionalProperties: false` on a nested object. The API returns a 400 error: "Invalid schema for response_format."
+**What goes wrong:** Using `GlobalWorkerOptions.workerSrc = ''` (empty string) throws `PDFWorker: workerSrc not set` in Node.js route handlers with pdfjs-dist 5.x.
-**Why it happens:** OpenAI strict mode enforces that EVERY object at EVERY nesting level has ALL properties listed in `required` AND `additionalProperties: false`. The requirement cascades to nested objects.
+**Why it happens:** pdfjs-dist 5.x changed fake-worker mode. Empty string is falsy; the PDFWorker getter throws before attempting the dynamic import. The workerSrc must be a valid URL the Node.js dynamic importer can resolve.
-**How to avoid:** Include `required` listing all property keys and `additionalProperties: false` on EVERY object, including the items schema inside arrays.
+**How to avoid:** Use `file://${join(process.cwd(), 'node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs')}`. This is already correct in the current `extract-text.ts`.
-**Warning signs:** `400 BadRequestError: Invalid schema for response_format` in server logs.
+**Warning signs:** `Error: Setting up fake worker failed` or `workerSrc not set` in server logs.
-### Pitfall 3: pdfjs-dist Worker in Node.js Context
+### Pitfall 3: OpenAI Strict Mode Schema Cascades to Nested Objects
-**What goes wrong:** Calling `getDocument()` without setting `GlobalWorkerOptions.workerSrc = ''` in Node.js throws: `Error: 'No "GlobalWorkerOptions.workerSrc" specified.'` OR tries to spin up a worker thread and fails.
+**What goes wrong:** A JSON schema with `strict: true` that omits `required` or `additionalProperties: false` on a nested object causes a 400 API error.
-**Why it happens:** pdf.js was designed for browsers. It tries to auto-detect the worker path via `document.currentScript.src`, which is null in Node.js.
+**Why it happens:** OpenAI strict mode requires `required` listing ALL properties AND `additionalProperties: false` at EVERY object level — including `items` inside arrays.
-**How to avoid:** Set `GlobalWorkerOptions.workerSrc = ''` at the top of the server-side extract-text module. This enables "fake worker" (synchronous) mode in Node.js — no worker thread needed for text extraction.
+**How to avoid:** The current `CLASSIFICATION_SCHEMA` in `field-placement.ts` is correct. Do not modify the schema structure.
-**Warning signs:** `No "GlobalWorkerOptions.workerSrc" specified` error in route handler logs.
+**Warning signs:** `400 BadRequestError: Invalid schema for response_format`
-### Pitfall 4: pdfjs-dist TypeScript Import Path
+### Pitfall 4: Checkbox Fields Must Be Filtered Out Before Storing
-**What goes wrong:** Importing from `'pdfjs-dist/legacy/build/pdf.mjs'` without adjusting TypeScript config. The `legacy/build/pdf.d.mts` file contains `export * from "pdfjs-dist"` which re-exports the main types, but the import path itself may trigger `skipLibCheck` issues.
+**What goes wrong:** GPT-4.1 returns `fieldType: "checkbox"` for inline selection checkboxes (e.g., `[ ] ARE [ ] ARE NOT`). If these are added to `SignatureFieldData[]`, they appear as placed fields that cannot be properly handled by the signing flow.
-**Why it happens:** The legacy build has `.d.mts` extension (ESM TypeScript declaration), which some older TypeScript configurations don't automatically pick up.
+**Why it happens:** The AI correctly identifies these as checkboxes (not fill-in blanks). They should be classified but not placed.
-**How to avoid:** Confirmed — the project already has `"transpilePackages": ['react-pdf', 'pdfjs-dist']` in next.config.ts. Import `getDocument` and `GlobalWorkerOptions` from `'pdfjs-dist/legacy/build/pdf.mjs'`. If type errors appear, add `@ts-ignore` or use the main `pdfjs-dist` types (they're the same via the re-export).
+**How to avoid:** The current code already filters: `if (result.fieldType === 'checkbox') continue;`. This is correct.
-**Warning signs:** TypeScript error `Could not find declaration file for 'pdfjs-dist/legacy/build/pdf.mjs'`.
+**Warning signs:** Dozens of tiny checkbox-type fields placed throughout the document body.
-### Pitfall 5: FieldPlacer State Not Reflecting AI-Placed Fields
+### Pitfall 5: textFillData Keys Must Be Field UUIDs
-**What goes wrong:** AI route writes fields to DB successfully. But FieldPlacer on screen still shows the previous (empty) fields because its local state isn't updated.
+**What goes wrong:** textFillData keyed by label string ("clientName") does not match DocumentPageClient's lookup of `textFillData[field.id]`.
-**Why it happens:** FieldPlacer loads fields once on mount via `useEffect`. The AI route write bypasses the React state.
+**Why it happens:** Phase 12.1 wired textFillData to use field UUID as key (STATE.md confirmed locked decision). Label-keyed maps are silently ignored.
-**How to avoid:** After the AI route returns successfully, trigger FieldPlacer to re-fetch from DB. Options:
-- Call `router.refresh()` (Next.js App Router — causes server component re-render, but FieldPlacer re-mounts and calls loadFields)
-- Add a `fieldReloadKey` prop to FieldPlacer that causes its `useEffect` to re-run when incremented
-- Pass the returned `fields` array directly to FieldPlacer via a prop (requires making FieldPlacer accept `initialFields`)
+**How to avoid:** The route handler creates UUIDs (`crypto.randomUUID()`) BEFORE building textFillData, then uses those same UUIDs as keys. This is already correct in the current code.
-**Recommended:** The simplest approach is to expose an `onAiPlacement` callback that calls `router.refresh()` on DocumentPageClient, OR expose a reload callback from FieldPlacer. Given that FieldPlacer already loads from DB on mount, `router.refresh()` is clean.
+**Warning signs:** Text pre-fill values don't appear in the preview even though AI returned them.
-### Pitfall 6: Text Field Pre-fill Requires Field IDs
+### Pitfall 6: Y Coordinate Is Baseline, Not Field Bottom
-**What goes wrong:** AI route returns `textFillData` keyed by "client-name" or label string. But `textFillData` in the project is keyed by **field UUID** (SignatureFieldData.id). A label-keyed map won't match anything in `textFillData[field.id]` lookups.
+**What goes wrong:** A field placed exactly at `blank.y` (text baseline) appears above the underline, not on it. Underscores descend slightly below the baseline.
-**Why it happens:** Phase 12.1 wired textFillData to be keyed by `field.id` (UUID) not by label. This is a confirmed design decision in STATE.md.
+**Why it happens:** PDF text baseline is where capital letters sit. Descenders (underscores, lowercase g/y/p) extend below the baseline.
-**How to avoid:** When building `textFillData` from AI output, use the `id: crypto.randomUUID()` assigned to each AI-placed field in the route handler. The route handler creates the field UUIDs, so it can simultaneously build `{ [fieldId]: prefillValue }` for text fields with prefill values.
+**How to avoid:** The current code uses `const y = Math.max(0, blank.y - 2)`. The -2pt offset anchors the field bottom edge slightly below the baseline, sitting on the underline visually.
-**Warning signs:** Text fill values don't appear in the preview even though AI returned them.
+**Warning signs:** Fields appear to float just above the underline instead of sitting on it.
+
+### Pitfall 7: Debug Console.log Statements Are Still in Production Code
+
+**What goes wrong:** Two `console.log` statements remain in `classifyFieldsWithAI`: one printing the blank count, one printing ALL blank descriptions (can be very verbose for large forms), and one printing all AI classifications.
+
+**Why it happens:** Debugging statements added during development were not removed.
+
+**How to avoid:** Plan 04 should include removing these console.log statements before final sign-off. They do not affect correctness but should not ship in production code.
+
+**Warning signs:**
+```
+[ai-prepare] calling gpt-4.1 with 47 blanks
+[ai-prepare] blank descriptions:
+[0] page1 before="..." ...
+```
---
## Code Examples
-### Full Coordinate Conversion with Unit Test Target
+### Verified: How pdfjs Text Coordinates Map to Field Positions
```typescript
-// Source: FieldPlacer.tsx lines 287-295 (exact formula)
-// Expected unit test cases for aiCoordsToPagePdfSpace:
-
-// US Letter: 612 × 792 pts
-// AI says: text field at xPct=10, yPct=5, widthPct=30, heightPct=5 (top of page)
-// Expected: x=61.2, y=792-(0.05*792)-(0.05*792) = 792 - 39.6 - 39.6 = 712.8
-// i.e., field bottom edge is 712.8 pts from page bottom (near the top)
-
-// Checkbox: AI says xPct=90, yPct=95, widthPct=3, heightPct=3 (near bottom-right)
-// Expected: x = 0.90*612 = 550.8
-// fieldH = 0.03*792 = 23.76
-// y = 792 - (0.95*792) - 23.76 = 792 - 752.4 - 23.76 = 15.84
+// pdfjs transform matrix: [scaleX, skewY, skewX, scaleY, translateX, translateY]
+// For a text item on a US Letter page (612 × 792 pts):
+// transform[4] = x (distance from left edge of page, in points)
+// transform[5] = y (distance from BOTTOM of page, in points — PDF user-space)
+//
+// Example: text near top of page
+// transform[5] ≈ 720 (720pt from bottom = ~1" from top on a 792pt page)
+//
+// FieldPlacer renders this as:
+// top = renderedH - (720 / 792) * renderedH = renderedH * 0.091 ≈ 9% from top ✓
+//
+// Example: footer initials near bottom of page
+// transform[5] ≈ 36 (36pt from bottom = 0.5" from bottom)
+//
+// FieldPlacer renders this as:
+// top = renderedH - (36 / 792) * renderedH = renderedH * 0.955 ≈ 95.5% from top ✓
```
-### Manual JSON Schema for GPT-4o-mini (Complete)
+### Verified: groupIntoLines Threshold
```typescript
-// Source: OpenAI structured outputs docs
-const response = await openai.chat.completions.create({
- model: 'gpt-4o-mini',
- messages: [
- {
- role: 'system',
- content: `You are a real estate document form field extractor.
-Given extracted text from a PDF page (with context about page number and dimensions),
-identify where signature, text, checkbox, initials, and date fields should be placed.
-Return fields as percentage positions (0-100) from the TOP-LEFT of the page.
-Use these field types: text (for typed values), checkbox, initials, date, client-signature, agent-signature, agent-initials.
-For text fields that match the client profile, set prefillValue to the known value. Otherwise use empty string.`,
- },
- {
- role: 'user',
- content: `Client name: ${clientName}\nProperty address: ${propertyAddress}\n\nPDF pages:\n${pagesSummary}`,
- },
- ],
- response_format: {
- type: 'json_schema',
- json_schema: {
- name: 'field_placement',
- strict: true,
- schema: FIELD_PLACEMENT_SCHEMA, // defined above — all fields required, additionalProperties: false
- },
- },
-});
+// Source: extract-text.ts groupIntoLines (line 58)
+// ±5pt tolerance groups items on the same visual line.
+// This handles multi-run items (same underline split across font boundaries)
+// AND minor baseline variations in real PDFs.
+//
+// A 3-blank signature row (sig | addr/phone | date) will group correctly IF
+// all three blanks have y-values within ±5pt of each other.
+// If NOT grouped (y-drift > 5pt), Rule D in post-processing handles the misclassification.
```
-### FieldPlacer Reload After AI Placement (Recommended Pattern)
+### Verified: Route Handler Security and Error Pattern
```typescript
-// In DocumentPageClient.tsx — add aiPlacementKey state
-const [aiPlacementKey, setAiPlacementKey] = useState(0);
-
-// Pass to FieldPlacer via PdfViewerWrapper
-// In FieldPlacer, add to loadFields useEffect dependency array:
-useEffect(() => {
- loadFields();
-}, [docId, aiPlacementKey]); // re-fetch when key increments
-
-// After AI auto-place succeeds in PreparePanel:
-async function handleAiAutoPlace() {
- const res = await fetch(`/api/documents/${docId}/ai-prepare`, { method: 'POST' });
- if (res.ok) {
- const { textFillData: aiTextFill } = await res.json();
- setTextFillData(prev => ({ ...prev, ...aiTextFill }));
- setAiPlacementKey(k => k + 1); // triggers FieldPlacer reload
- setPreviewToken(null);
- }
-}
+// Source: ai-prepare/route.ts (confirmed complete)
+// Guards in order:
+// 1. auth() — session check
+// 2. OPENAI_API_KEY existence — 503 if missing
+// 3. document lookup — 404 if not found
+// 4. filePath check — 422 if no PDF
+// 5. status === 'Draft' — 403 if locked
+// 6. path traversal check — 403 if escapes UPLOADS_DIR
+// 7. try/catch wrapping extractBlanks + classifyFieldsWithAI — 500 with message
+// 8. DB write — direct update, no status change
+// 9. return { fields, textFillData }
```
---
@@ -451,67 +416,125 @@ async function handleAiAutoPlace() {
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
-| AcroForm heuristic field detection (FORMS-V2-01) | GPT-4o-mini text classification | Superseded per REQUIREMENTS.md | AI-based approach handles flat/scanned forms that have no AcroForm fields |
-| `zodResponseFormat` helper | Manual `json_schema` response_format | Broken in Zod v4 (as of 2025) | Must write JSON schema manually; more verbose but reliable |
-| `disableWorker = true` (pdfjs-dist v2 API) | `GlobalWorkerOptions.workerSrc = ''` | pdfjs-dist v3+ | Correct server-side Node.js pattern for current pdfjs-dist 5.x |
-| Positional text extraction via bounding boxes | Text-only extraction with AI classification | Phase 13 approach | Simpler and sufficient for label classification; AI handles the spatial reasoning |
+| GPT-4o vision (render pages as JPEG) | pdfjs text-layer extraction | After Plan 01, during debugging | Eliminates coordinate conversion math and the 30-40% vertical offset bug |
+| `aiCoordsToPagePdfSpace()` + `ai-coords.test.ts` | Direct pdfjs coordinates — no conversion | After Plan 01 pivot | Deleted function and test; `prepare-document.test.ts` is the only relevant test |
+| GPT-4o (vision model) | GPT-4.1 (text model) | After vision approach abandoned | Cheaper, faster, sufficient for text classification task |
+| `GlobalWorkerOptions.workerSrc = ''` | `file://` URL path | pdfjs-dist 5.x requirement | Required for fake-worker mode; empty string is falsy in 5.x |
+| `zodResponseFormat` helper | Manual `json_schema` | Zod v4 incompatibility (issues #1540, #1602, #1709) | Permanent — project uses Zod v4.3.6 |
+| `@napi-rs/canvas` for image rendering | Not used (deleted from imports) | After vision approach abandoned | Package still installed but not imported; `serverExternalPackages` entry can remain |
-**Deprecated/outdated:**
-- `PDFJS.disableWorker = true`: v2 API, removed in v3+. Use `GlobalWorkerOptions.workerSrc = ''` instead.
-- `zodResponseFormat`: Works only with Zod v3. Project uses Zod v4.3.6. Use manual json_schema.
+**Deprecated/outdated items that should NOT appear in new code:**
+- `import { createCanvas } from '@napi-rs/canvas'` in extract-text.ts — deleted, do not restore
+- `aiCoordsToPagePdfSpace()` function — deleted, do not recreate
+- `ai-coords.test.ts` — deleted, do not recreate
+- `zodResponseFormat` — broken with Zod v4, do not use
+- `GlobalWorkerOptions.workerSrc = ''` (empty string) — wrong for pdfjs 5.x, do not use
---
## Open Questions
-1. **OPENAI_API_KEY environment variable**
- - What we know: The project's `.env.local` file exists but its contents are private. The openai SDK reads from `process.env.OPENAI_API_KEY` by default.
- - What's unclear: Whether OPENAI_API_KEY is already set in `.env.local` for the project.
- - Recommendation: The plan should include a step noting that OPENAI_API_KEY must be set in `.env.local`. The route handler should return a clear 503 error if OPENAI_API_KEY is not configured.
+1. **Debug console.log statements in classifyFieldsWithAI**
+ - What we know: Three `console.log` calls remain in `field-placement.ts` (blank count, all descriptions, all classifications). These print verbose output to server logs on every AI auto-place request.
+ - What's unclear: Whether to remove before Plan 04 verification or after.
+ - Recommendation: Remove before E2E verification so server logs are clean for manual testing. Add to Plan 04 Task 1.
-2. **Utah REPC 20-page form field accuracy**
- - What we know: AI coordinate accuracy on real Utah forms is listed as an explicit concern in STATE.md Blockers/Concerns.
- - What's unclear: How well GPT-4o-mini will perform on a dense legal document like the Utah REPC without fine-tuning. Percentage coordinates from text-only extraction (no visual) may be imprecise.
- - Recommendation: Plan 13-04 integration test is non-negotiable. Expect iteration on the system prompt. Consider truncating very long page text to stay within GPT-4o-mini context limits.
+2. **gpt-4.1 model availability**
+ - What we know: `field-placement.ts` uses `model: 'gpt-4.1'`. This is the current model. The project's OPENAI_API_KEY must have access to this model.
+ - What's unclear: Whether the developer's API key has gpt-4.1 access (it's not universally available to all OpenAI accounts as of 2026).
+ - Recommendation: If the API call returns a 404 model error, fall back to `gpt-4o` (which was used in an earlier iteration and is broadly available). The prompt and schema work identically for both models.
-3. **Token limit for 20-page document**
- - What we know: GPT-4o-mini context window is 128K tokens. A 20-page dense legal document could approach 30,000-50,000 tokens of extracted text.
- - What's unclear: Whether sending all 20 pages in one call will stay within limits comfortably.
- - Recommendation: In `classifyFieldsWithAI`, cap page text at ~2000 chars per page (truncate with ellipsis) to stay well under limits. Total text budget: 20 × 2000 = 40,000 chars ≈ ~10,000 tokens — well within 128K. The plan should include this truncation.
+3. **Real Utah REPC 20-page accuracy**
+ - What we know: The text-extraction approach extracts blanks accurately based on the 4 detection strategies. Accuracy depends on the quality of the PDF text layer.
+ - What's unclear: How many blanks are missed or double-detected on the full 20-page Utah REPC. The system prompt is tuned for Utah real estate signature patterns.
+ - Recommendation: Plan 04 human verification on a real Utah REPC is non-negotiable. Expect 80-95% accuracy — imperfect placement is acceptable as the agent reviews before sending.
-4. **FieldPlacer prop vs. DB reload for displaying AI fields**
- - What we know: FieldPlacer currently loads fields via `useEffect([docId])` on mount. After AI placement writes to DB, FieldPlacer needs to re-fetch.
- - What's unclear: Whether adding `aiPlacementKey` prop to FieldPlacer (which threads through PdfViewerWrapper) or using `router.refresh()` is cleaner given the current component hierarchy.
- - Recommendation: Add `aiPlacementKey` prop to FieldPlacer and thread through PdfViewerWrapper. This avoids a full page re-render from `router.refresh()` and gives more surgical control.
+4. **Plan 04 Task 1 command mismatch**
+ - What we know: Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts` — that file is deleted.
+ - Recommendation: Plan 04 must be updated to run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` and `npx tsc --noEmit` instead.
+
+---
+
+## Environment Availability
+
+| Dependency | Required By | Available | Version | Fallback |
+|------------|------------|-----------|---------|----------|
+| pdfjs-dist | extractBlanks() | ✓ | 5.4.296 | — (required, already installed) |
+| pdfjs worker file | pdfjs fake-worker | ✓ | `node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs` | — |
+| openai SDK | classifyFieldsWithAI() | ✓ | ^6.32.0 | — (required, already installed) |
+| OPENAI_API_KEY env var | classifyFieldsWithAI() | Unknown | — | Route returns 503 with message if missing |
+| Node.js | All server-side code | ✓ | v23.6.0 | — |
+| @napi-rs/canvas | NOT required anymore | ✓ (installed) | ^0.1.97 | N/A — no longer imported |
+
+**Missing dependencies with no fallback:**
+- OPENAI_API_KEY in `.env.local` — must be set before Plan 04 verification. Route returns 503 if missing (actionable error, not a crash).
+
+---
+
+## Validation Architecture
+
+### Test Framework
+
+| Property | Value |
+|----------|-------|
+| Framework | Jest 29.7.0 + ts-jest |
+| Config file | package.json `"jest": { "preset": "ts-jest", "testEnvironment": "node" }` |
+| Quick run command | `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` |
+| Full suite command | `npx jest --no-coverage --verbose` |
+
+### Phase Requirements → Test Map
+
+| Req ID | Behavior | Test Type | Automated Command | File Exists? |
+|--------|----------|-----------|-------------------|-------------|
+| AI-01 | Y-flip coordinate formula correct (FieldPlacer screen↔PDF) | unit | `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` | ✅ |
+| AI-01 | AI auto-place button appears, fields land on canvas, correct vertical positions | manual E2E | human verification (Plan 04 Task 2) | N/A |
+| AI-02 | Text pre-fill values from client profile appear in field values | manual E2E | human verification (Plan 04 Task 2) | N/A |
+
+### Sampling Rate
+
+- **Per task commit:** `npx jest --no-coverage`
+- **Per wave merge:** `npx jest --no-coverage && npx tsc --noEmit`
+- **Phase gate:** Full suite green + TypeScript clean + human E2E approval before phase complete
+
+### Wave 0 Gaps
+
+None — existing test infrastructure covers coordinate math. No new test files required for Plan 04. The deleted `ai-coords.test.ts` is intentionally not recreated (it tested a deleted function).
---
## Sources
### Primary (HIGH confidence)
-- pdfjs-dist 5.4.296 package introspection — `node_modules/pdfjs-dist/legacy/build/` directory listing, `package.json` main/types fields
-- Project codebase — `FieldPlacer.tsx` (coordinate formula, lines 287-295), `prepare-document.test.ts` (existing unit test pattern), `schema.ts` (SignatureFieldData types), `PreparePanel.tsx` (UI pattern for async buttons), `prepare/route.ts` (API route pattern)
-- STATE.md decisions — locked: manual json_schema (not zodResponseFormat); pdfjs-dist legacy build; Zod v4 incompatibility confirmed issues #1540 #1602 #1709
-- REQUIREMENTS.md AI-01, AI-02 — exact success criteria
+
+- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/extract-text.ts` — current implementation, text-layer extraction approach
+- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/field-placement.ts` — current implementation, GPT-4.1 classification
+- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/api/documents/[id]/ai-prepare/route.ts` — current route implementation
+- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` — `pdfToScreenCoords` function (lines 44-54) and drag-end coordinate formula (lines 291-292)
+- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/PdfViewer.tsx` — `pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf)
+- `.planning/phases/13-ai-field-placement-and-pre-fill/.continue-here.md` — complete record of what was attempted, what failed, and current state
+- `git log --oneline` — confirmed commit history showing vision→text pivot
+- STATE.md — locked decisions including manual json_schema, pdfjs-dist legacy build, Zod v4 incompatibility
+- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/pdf/__tests__/prepare-document.test.ts` — 10 tests, all passing
### Secondary (MEDIUM confidence)
-- [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — json_schema shape, strict mode requirements (additionalProperties: false, all fields required)
-- [openai npm package](https://www.npmjs.com/package/openai) — version 6.32.0 confirmed as latest March 2026
-- WebSearch results confirming zodResponseFormat broken with Zod v4: GitHub issues #1540, #1602, #1709, #1739 all open/confirmed
-- pdfjs-dist server-side text extraction pattern — multiple WebSearch sources confirm `GlobalWorkerOptions.workerSrc = ''` for Node.js fake-worker mode
+
+- pdfjs-dist transform matrix format — verified by reading pdfjs docs + confirmed by the implementation's usage of `transform[4]`/`transform[5]` for x/y
+- react-pdf `page.view` array format — confirmed from PdfViewer.tsx `page.view[0], page.view[2]` / `page.view[1], page.view[3]` usage with comment `// Math.max handles non-standard mediaBox ordering`
### Tertiary (LOW confidence)
-- AI coordinate accuracy on Utah REPC forms — untested; flagged as open question in STATE.md
-- GPT-4o-mini token usage estimate for 20-page legal document — estimated from typical legal document density, not measured
+
+- GPT-4.1 model availability for all API keys — not verified; gpt-4.1 may not be available in all tiers
+- Real Utah REPC 20-page blank extraction accuracy — not measured; text-layer quality varies by PDF
---
## Metadata
**Confidence breakdown:**
-- Standard stack: HIGH — pdfjs-dist already installed and confirmed v5.4.296; openai v6.32.0 confirmed from npm; project uses Zod v4.3.6 (manual json_schema confirmed required)
-- Architecture: HIGH — coordinate formula confirmed from FieldPlacer.tsx source; API route pattern confirmed from prepare/route.ts and fields/route.ts; field ID keying confirmed from STATE.md Phase 12.1 decisions
-- Pitfalls: HIGH — Y-axis inversion, Zod v4 zodResponseFormat breakage, worker setup, and field ID keying are all confirmed from authoritative sources (code + STATE.md + verified GitHub issues)
+- Standard stack: HIGH — all packages confirmed installed and version-verified
+- Architecture: HIGH — current code read directly, coordinate system verified analytically and confirmed by test results
+- Pitfalls: HIGH — all pitfalls documented from actual bugs encountered during development (per .continue-here.md) or confirmed from code inspection
+- Plan 04 gap (Task 1 test command): HIGH — confirmed by checking git status (ai-coords.test.ts deleted), running jest (prepare-document.test.ts passes)
-**Research date:** 2026-03-21
-**Valid until:** 2026-04-21 (30 days — openai SDK stable; pdfjs-dist stable; Zod v4 issues open but workaround confirmed)
+**Research date:** 2026-04-03
+**Valid until:** 2026-05-03 (30 days — stack stable; coordinate math is deterministic; no external dependencies changing)