docs(13): re-research — text-extraction approach, coordinate system, plan-04 gap analysis

This commit is contained in:
Chandler Copeland
2026-04-03 14:29:08 -06:00
parent 34ee72004b
commit b792971dac

View File

@@ -1,7 +1,7 @@
# Phase 13: AI Field Placement and Pre-fill - Research
**Researched:** 2026-03-21
**Domain:** OpenAI GPT-4o-mini structured outputs + pdfjs-dist server-side text extraction + coordinate conversion
**Researched:** 2026-04-03 (re-research — complete rewrite)
**Domain:** pdfjs-dist text-layer blank extraction + GPT-4.1 structured output field classification + coordinate system
**Confidence:** HIGH
<phase_requirements>
@@ -9,21 +9,82 @@
| ID | Description | Research Support |
|----|-------------|-----------------|
| AI-01 | Agent can click one button to have AI auto-place all field types (text, checkbox, initials, date, agent signature, client signature) on a PDF in correct positions | pdfjs-dist legacy build for text extraction; GPT-4o-mini structured output for field classification; aiCoordsToPagePdfSpace() for coordinate conversion; fields PUT to existing /api/documents/[id]/fields endpoint |
| AI-02 | AI pre-fills text fields with known values from the client profile (name, property address, date) | Client profile already has name + propertyAddress in DB; textFillData (Record<string,string> keyed by field UUID) already wired through DocumentPageClient → FieldPlacer; AI route returns both fields array and pre-fill map; agent reviews before committing |
| AI-01 | Agent can click one button to have AI auto-place all field types (text, checkbox, initials, date, agent signature, client signature) on a PDF in correct positions | extractBlanks() extracts blanks with exact PDF user-space coords from pdfjs text layer; classifyFieldsWithAI() sends blank descriptions to GPT-4.1 for type classification; no coordinate conversion needed (pdfjs coords stored directly) |
| AI-02 | AI pre-fills text fields with known values from the client profile (name, property address, date) | classifyFieldsWithAI() returns textFillData keyed by field UUID; merged into DocumentPageClient.textFillData state |
</phase_requirements>
---
## Summary
Phase 13 adds an "AI Auto-place" button that extracts text from the PDF via pdfjs-dist (already installed), sends that text to GPT-4o-mini with a structured output schema asking for field type + page + normalized percentage coordinates, converts those percentages to PDF user-space points (with Y-axis flip), and writes the resulting `SignatureFieldData[]` array to the existing `/api/documents/[id]/fields` PUT endpoint. The agent reviews, adjusts, or deletes any AI-placed field before proceeding.
Phase 13 is implemented through Plans 0103 and is awaiting final E2E verification in Plan 04. Three plans are complete. The architecture evolved significantly from the original research: the system now uses **direct PDF text-layer extraction** (pdfjs-dist `getTextContent()` with transform matrix coordinates) rather than GPT-4o vision. This eliminates the coordinate conversion bug entirely — pdfjs returns coordinates already in PDF user-space (bottom-left origin, points), which is exactly what FieldPlacer stores and renders.
The `openai` npm package (v6.x, latest as of March 2026) is NOT currently installed in the project — it must be added. However, pdfjs-dist 5.4.296 is already present (via react-pdf) and the legacy build at `pdfjs-dist/legacy/build/pdf.mjs` is usable for server-side text extraction. The coordinate conversion formula already exists in FieldPlacer.tsx and `prepare-document.test.ts`; Phase 13 must replicate it as a named utility (`aiCoordsToPagePdfSpace`) with a dedicated unit test.
The original vision-based approach (render pages as JPEG → GPT-4o → xPct/yPct → aiCoordsToPagePdfSpace conversion) was attempted and abandoned due to a systematic 30-40% vertical offset that could not be resolved. The text-extraction approach (extract blank positions from pdfjs text layer → GPT-4.1 for type classification only → store raw coordinates) is now working code in the repository.
The decision recorded in STATE.md is authoritative: do NOT use `zodResponseFormat` (broken with Zod v4 that's installed). Use manual `json_schema` response_format with `strict: true`, `additionalProperties: false`, and every property in `required` at every nesting level. GPT-4o-mini supports this natively (confirmed since gpt-4o-mini-2024-07-18).
**Plan 04 is the only remaining task**: run unit tests, fix the TypeScript build, and perform human E2E verification. The `ai-coords.test.ts` file was deleted when the vision approach was abandoned and must NOT be recreated for the new architecture (it tested a function that no longer exists). Plan 04 Task 1 must be updated to reflect the current test infrastructure.
**Primary recommendation:** Install `openai` npm package, use `pdfjs-dist/legacy/build/pdf.mjs` for text extraction with `GlobalWorkerOptions.workerSrc = ''` in Node.js route context, send extracted text to GPT-4o-mini with manual JSON schema, convert percentage coordinates to PDF points with Y-axis flip, write fields via existing PUT endpoint, and have the agent review before committing.
**Primary recommendation:** Plan 04 should confirm TypeScript compiles clean, run `prepare-document.test.ts` (10 tests passing), then proceed directly to human E2E verification. The coordinate bug is resolved by architecture — no code fix required for coordinates.
---
## Current Implementation State
### What Is Already Built (Plans 0103 Complete)
The following files are implemented and committed:
| File | Status | Purpose |
|------|--------|---------|
| `src/lib/ai/extract-text.ts` | Complete (uncommitted changes) | pdfjs-dist blank extraction via text layer |
| `src/lib/ai/field-placement.ts` | Complete (uncommitted changes) | GPT-4.1 field type classification |
| `src/app/api/documents/[id]/ai-prepare/route.ts` | Complete (uncommitted changes) | POST route orchestrating the pipeline |
| `src/lib/pdf/__tests__/ai-coords.test.ts` | **Deleted** | Was for vision approach; no longer needed |
| `src/lib/pdf/__tests__/prepare-document.test.ts` | Complete | 10 tests passing — Y-flip formula |
### Architecture (Current — Text Extraction Based)
```
PDF file
extractBlanks() [extract-text.ts]
→ pdfjs getTextContent() → transform[4]=x, transform[5]=y (PDF user-space, bottom-left origin)
→ 4 detection strategies (underscore runs, embedded underscores, bracket items)
→ groupIntoLines() with ±5pt y-tolerance
→ deduplication for Strategy 3+4 overlap
→ returns BlankField[] with {page, x, y, width, contextBefore, contextAfter, contextAbove, contextBelow, rowIndex, rowTotal}
classifyFieldsWithAI() [field-placement.ts]
→ GPT-4.1 receives compact text descriptions (index, page, row=N/T, context strings)
→ returns {index, fieldType, prefillValue} per blank
→ deterministic post-processing (Rules A/B/C/D) overrides AI errors
→ FIELD_HEIGHTS map (type → height in pts)
→ SIZE_LIMITS map (type → {minW, maxW} in pts)
→ stores: { id: UUID, page, x: blank.x, y: blank.y-2, width: clamped, height: by-type }
→ returns { fields: SignatureFieldData[], textFillData: Record<UUID, string> }
POST /api/documents/[id]/ai-prepare [route.ts]
→ writes fields to DB (signatureFields column)
→ returns { fields, textFillData }
DocumentPageClient (React)
→ setAiPlacementKey(k+1) → FieldPlacer re-fetches from DB
→ setTextFillData(prev => { ...prev, ...aiTextFill }) — merge, not replace
```
### Coordinate System — Fully Resolved
**The coordinate bug from the vision approach is NOT present in the current architecture.**
The text-extraction approach works because:
- pdfjs `item.transform[4]` = x position in PDF user-space (points, left from page left edge)
- pdfjs `item.transform[5]` = y position in PDF user-space (points, up from page bottom)
- These are stored directly as `field.x` and `field.y` in SignatureFieldData
- FieldPlacer renders stored fields using `pdfToScreenCoords(field.x, field.y, renderedW, renderedH, pageInfo)` which uses `pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf mediaBox dimensions)
- Since both extraction (pdfjs transform matrix) and rendering (react-pdf mediaBox) read from the same PDF mediaBox, they are inherently consistent
**No `aiCoordsToPagePdfSpace` conversion function is needed or present in the current code.**
The `aiCoordsToPagePdfSpace` function and its test (`ai-coords.test.ts`) were created for the vision approach and then deleted when the approach changed. Do not recreate them.
---
@@ -33,122 +94,80 @@ The decision recorded in STATE.md is authoritative: do NOT use `zodResponseForma
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| openai (npm) | ^6.32.0 (latest Mar 2026) | Official OpenAI TypeScript SDK for Chat Completions API | Not yet installed; must `npm install openai`. Official SDK with strict mode structured outputs |
| pdfjs-dist | 5.4.296 (already installed) | Server-side PDF text extraction via `getTextContent()` | Already a project dependency via react-pdf. Legacy build works in Node.js route handlers |
| pdfjs-dist | 5.4.296 (hoisted from react-pdf) | PDF text-layer extraction via `getTextContent()` | Already installed; legacy build works in Node.js with file:// workerSrc |
| openai | ^6.32.0 (installed) | GPT-4.1 structured output for field type classification | Official SDK; installed; manual json_schema required (Zod v4 incompatibility) |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| server-only | NOT installed (not in node_modules) | Build-time guard preventing client-side import of server modules | Use comment guard `// server-only` or explicit `if (typeof window !== 'undefined') throw` — OR install `server-only` package |
| crypto.randomUUID() | Node built-in | Generate UUIDs for AI-placed field IDs | Already used in FieldPlacer for field IDs; same pattern here |
| @napi-rs/canvas | ^0.1.97 (installed, no longer used) | Was for server-side JPEG rendering | Kept in package.json but no longer imported; `serverExternalPackages` still lists it in next.config.ts — leave that entry to avoid breaking changes |
| crypto.randomUUID() | Node built-in | UUID generation for field IDs | Used in classifyFieldsWithAI to assign IDs before textFillData keying |
### Alternatives Considered
### Verified Package Versions
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| Manual `json_schema` response_format | `zodResponseFormat` helper | zodResponseFormat is broken with Zod v4 (confirmed issues #1540, #1602, #1709 — this is a locked decision in STATE.md) |
| pdfjs-dist legacy build | pdf-parse, unpdf | pdfjs-dist is already installed; unpdf would add a dependency; pdf-parse is simpler but less positional data available |
| GPT-4o-mini | GPT-4o, GPT-4.1 | GPT-4o-mini is cheapest and supports structured outputs with 100% schema compliance; good enough for field classification |
**Installation:**
```bash
npm install openai
# pdfjs-dist version confirmed:
cat node_modules/pdfjs-dist/package.json | grep '"version"'
# → "version": "5.4.296"
# openai version confirmed:
cat node_modules/openai/package.json | grep '"version"'
```
**No installation needed** — all required packages are already in node_modules.
---
## Architecture Patterns
### Recommended Project Structure
### Recommended Project Structure (Current)
```
src/
├── lib/
│ ├── ai/
│ │ ├── extract-text.ts # pdfjs-dist server-side text extraction (server-only guard)
│ │ └── field-placement.ts # GPT-4o-mini call + aiCoordsToPagePdfSpace() (server-only guard)
│ │ ├── extract-text.ts # pdfjs-dist blank extraction (4 strategies)
│ │ └── field-placement.ts # GPT-4.1 type classification + post-processing
│ └── pdf/
│ ├── prepare-document.ts # existing — unchanged
│ └── __tests__/
── prepare-document.test.ts # existing — unchanged
│ └── ai-coords.test.ts # NEW — unit test for aiCoordsToPagePdfSpace()
── prepare-document.test.ts # 10 tests — Y-flip formula (passing)
└── app/
└── api/
└── documents/
└── [id]/
└── ai-prepare/
└── route.ts # NEW — POST /api/documents/[id]/ai-prepare
└── route.ts # POST handler
```
### Pattern 1: Server-Side PDF Text Extraction with pdfjs-dist Legacy Build
### Pattern 1: pdfjs-dist Blank Extraction (Text Layer)
**What:** Import from `pdfjs-dist/legacy/build/pdf.mjs`, set `GlobalWorkerOptions.workerSrc = ''` (empty string tells pdf.js to fall back to synchronous/fake worker mode in Node.js — this is the documented pattern for server-side use without a browser worker thread), load the PDF bytes, iterate pages calling `getTextContent()`.
**What:** Use pdfjs `getTextContent()` to get all text items. Each item has a `transform` matrix where `transform[4]` = x and `transform[5]` = y in PDF user-space (bottom-left origin, points). Find underscore sequences (Strategy 1: pure runs, Strategy 2: embedded runs) and bracket patterns (Strategy 3: single-item `[ ]`, Strategy 4: multi-item `[ … ]`). Group items into lines with ±5pt y-tolerance.
**When to use:** Server-only route handlers (Next.js App Router route.ts files run on Node.js).
**Why text-layer over vision:** Coordinates are exact (sub-point accuracy). No DPI scaling math, no image rendering, no API image tokens, no coordinate conversion needed.
**Important:** The project's existing client-side usage (`PdfViewer.tsx`, `PreviewModal.tsx`) sets `workerSrc = new URL('pdfjs-dist/build/pdf.worker.min.mjs', import.meta.url).toString()` — that is browser-only. The server-side module must set `workerSrc = ''` independently. Do NOT share the import or the GlobalWorkerOptions assignment across server and client.
**pdfjs-dist 5.x workerSrc (critical):** Must use `file://` URL pointing to the worker .mjs file — empty string is falsy and causes PDFWorker to throw before the fake-worker import runs.
```typescript
// Source: pdfjs-dist docs + STATE.md v1.1 Research decision
// lib/ai/extract-text.ts — server-only
import { getDocument, GlobalWorkerOptions } from 'pdfjs-dist/legacy/build/pdf.mjs';
import { readFile } from 'node:fs/promises';
// Empty string = no worker thread (fake/synchronous worker) — required for Node.js server context
GlobalWorkerOptions.workerSrc = '';
export interface PageText {
page: number; // 1-indexed
text: string; // all text items joined with spaces
width: number; // page width in PDF points (72 DPI)
height: number; // page height in PDF points (72 DPI)
}
export async function extractPdfText(filePath: string): Promise<PageText[]> {
const data = new Uint8Array(await readFile(filePath));
const pdf = await getDocument({ data }).promise;
const pages: PageText[] = [];
for (let pageNum = 1; pageNum <= pdf.numPages; pageNum++) {
const page = await pdf.getPage(pageNum);
const viewport = page.getViewport({ scale: 1.0 });
const textContent = await page.getTextContent();
const text = textContent.items
.filter((item) => 'str' in item)
.map((item) => (item as { str: string }).str)
.join(' ');
pages.push({
page: pageNum,
width: viewport.width,
height: viewport.height,
text,
});
}
return pages;
}
// Source: extract-text.ts (confirmed working)
import { join } from 'node:path';
GlobalWorkerOptions.workerSrc = `file://${join(process.cwd(), 'node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs')}`;
```
### Pattern 2: GPT-4o-mini Structured Output with Manual JSON Schema
**Text item transform matrix:** `[scaleX, skewY, skewX, scaleY, translateX, translateY]`
- `transform[4]` = x left edge of item (PDF points from page left)
- `transform[5]` = y baseline of item (PDF points from page bottom)
- `transform[0]` = font size (approximately) when no rotation
**What:** Use the `openai` SDK `chat.completions.create()` with `response_format: { type: 'json_schema', json_schema: { ... strict: true } }`. The schema asks the model to return an array of field placement objects with: `page` (1-indexed integer), `fieldType` (enum of 7 types), `xPct` (0100 percentage of page width), `yPct` (0100 percentage of page height, measured from page TOP — AI models think top-left origin), `widthPct`, `heightPct`, and optionally `prefillValue` for text fields.
### Pattern 2: GPT-4.1 Type Classification (Text-Only, No Vision)
**Critical JSON Schema rule:** When `strict: true`, ALL properties must be in `required` and ALL objects must have `additionalProperties: false`. Any missing field from `required` causes an API 400 error.
**What:** Send a compact text description of each detected blank to GPT-4.1. The description includes blank index, page number, row position metadata (`row=N/T`), and context strings (contextBefore, contextAfter, contextAbove, contextBelow). AI returns only the field type and prefill value — coordinates come from pdfjs directly.
**When to use:** All AI field placement requests.
**Schema:** Manual `json_schema` with `strict: true`, all properties in `required`, `additionalProperties: false` at every nesting level. Do NOT use `zodResponseFormat` (broken with Zod v4).
```typescript
// Source: OpenAI docs + STATE.md locked decision (manual json_schema, not zodResponseFormat)
// lib/ai/field-placement.ts — server-only
import OpenAI from 'openai';
import type { PageText } from './extract-text';
import type { SignatureFieldData, SignatureFieldType } from '@/lib/db/schema';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const FIELD_PLACEMENT_SCHEMA = {
// Source: field-placement.ts (confirmed working)
const CLASSIFICATION_SCHEMA = {
type: 'object',
properties: {
fields: {
@@ -156,15 +175,11 @@ const FIELD_PLACEMENT_SCHEMA = {
items: {
type: 'object',
properties: {
page: { type: 'integer' },
fieldType: { type: 'string', enum: ['text', 'checkbox', 'initials', 'date', 'client-signature', 'agent-signature', 'agent-initials'] },
xPct: { type: 'number' },
yPct: { type: 'number' }, // % from page TOP (AI top-left origin)
widthPct: { type: 'number' },
heightPct: { type: 'number' },
prefillValue: { type: 'string' }, // only for text fields; empty string if none
index: { type: 'integer' },
fieldType: { type: 'string', enum: ['text', 'initials', 'date', 'client-signature', 'agent-signature', 'agent-initials', 'checkbox'] },
prefillValue: { type: 'string' },
},
required: ['page', 'fieldType', 'xPct', 'yPct', 'widthPct', 'heightPct', 'prefillValue'],
required: ['index', 'fieldType', 'prefillValue'],
additionalProperties: false,
},
},
@@ -174,119 +189,80 @@ const FIELD_PLACEMENT_SCHEMA = {
} as const;
```
### Pattern 3: Coordinate Conversion — AI Percentage to PDF User-Space
**Note:** The `checkbox` type is in the enum so the AI can classify inline checkboxes — but fields with `fieldType === 'checkbox'` are filtered out (not added to the fields array). They represent selection options, not placed fields.
**What:** AI returns percentage coordinates with top-left origin (AI models think of a page as a grid where (0,0) is top-left). PDF user-space uses bottom-left origin with points (1 pt = 1/72 inch). Two conversions are needed:
1. Percentage to absolute points: `x = (xPct / 100) * pageWidth`
2. Y-axis flip: `y = pageHeight - (yPct / 100) * pageHeight - fieldHeight`
(the stored y is the BOTTOM edge of the field in PDF space)
### Pattern 3: Coordinate Handling — No Conversion Needed
**This is the exact same formula already used in FieldPlacer.tsx** for the drag-and-drop coordinate conversion. The new `aiCoordsToPagePdfSpace()` function extracts it as a named utility, verified by a unit test.
**What:** Blank coordinates from pdfjs text layer are already in PDF user-space. Store them directly.
```typescript
// Source: FieldPlacer.tsx coordinate math (lines 287-295) — same formula
// lib/ai/field-placement.ts
export interface AiFieldCoords {
page: number;
fieldType: SignatureFieldType;
xPct: number; // % from left, top-left origin
yPct: number; // % from top, top-left origin
widthPct: number;
heightPct: number;
prefillValue: string;
}
/**
* Convert AI percentage coordinates (top-left origin) to PDF user-space points (bottom-left origin).
*
* pageWidth/pageHeight in PDF points (from page.getViewport({ scale: 1.0 })).
*
* Formula mirrors FieldPlacer.tsx handleDragEnd (lines 289-291):
* pdfX = (clampedX / renderedW) * pageInfo.originalWidth
* pdfY = ((renderedH - (clampedY + fieldHpx)) / renderedH) * pageInfo.originalHeight
*/
export function aiCoordsToPagePdfSpace(
coords: AiFieldCoords,
pageWidth: number,
pageHeight: number,
): { x: number; y: number; width: number; height: number } {
const fieldWidth = (coords.widthPct / 100) * pageWidth;
const fieldHeight = (coords.heightPct / 100) * pageHeight;
const screenX = (coords.xPct / 100) * pageWidth;
const screenY = (coords.yPct / 100) * pageHeight; // screen Y from top
const x = screenX;
// PDF y = distance from BOTTOM. screenY is from top, so flip:
// pdfY = pageHeight - screenY - fieldHeight (bottom edge of field)
const y = pageHeight - screenY - fieldHeight;
return { x, y, width: fieldWidth, height: fieldHeight };
}
// Source: field-placement.ts (confirmed working)
const y = Math.max(0, blank.y - 2); // -2pt: anchor just below baseline (underscores descend slightly)
fields.push({ id, page: blank.page, x: blank.x, y, width, height, type: fieldType });
```
### Pattern 4: AI Auto-place API Route
**The -2pt y offset:** Underscore characters descend slightly below the text baseline. Moving the field bottom edge 2pt below the baseline positions the field box to sit ON the underline visually.
**What:** POST `/api/documents/[id]/ai-prepare` — server-side route that orchestrates:
1. Load document + client from DB
2. Resolve PDF file path
3. Call `extractPdfText()` for all pages
4. Call GPT-4o-mini with extracted text and client profile data
5. Convert AI coords to PDF user-space for each field
6. Write `SignatureFieldData[]` back to DB via direct DB update (same as fields PUT endpoint pattern)
7. Return `{ fields, textFillData }` — client updates local state from response
**Why this works without conversion:** pdfjs `transform[5]` is the text baseline Y in the same coordinate space (PDF user-space, bottom-left origin) that FieldPlacer's `pdfToScreenCoords` expects for rendering. `pageInfo.originalHeight` in FieldPlacer comes from `page.view[3]` (react-pdf mediaBox), which is the same value as pdfjs `getViewport({scale:1}).height` for standard PDF pages.
**When to use:** Called from the "AI Auto-place" button in PreparePanel.
### Pattern 4: Deterministic Post-Processing Rules
**What:** Four rules that override AI classifications for structurally unambiguous cases.
| Rule | Condition | Override |
|------|-----------|---------|
| A | `contextBefore` last word is "date" | → `date` |
| B | `contextBefore` last word is "initials" | → `initials` |
| C | `rowTotal > 1` AND `rowIndex > 1` AND AI classified as signature | → `date` (if last+contextBelow has "(Date)") or `text` |
| D | `rowTotal=2`, `rowIndex=1`, AI classified as signature, contextBelow has "(Address/Phone)"+"(Date)" but NOT "(Seller"/"(Buyer" | → `text` |
**Why needed:** The footer pattern `Seller's Initials [ ] Date ___` and signature block rows `[sig] [address/phone] [date]` are structurally deterministic but the AI sometimes misclassifies them.
### Pattern 5: FieldPlacer Coordinate Rendering Formula
**What:** How stored PDF coordinates are converted back to screen pixels for rendering.
```typescript
// Source: pattern matches /api/documents/[id]/prepare/route.ts
// app/api/documents/[id]/ai-prepare/route.ts
export async function POST(
req: Request,
{ params }: { params: Promise<{ id: string }> }
) {
const session = await auth();
if (!session?.user?.id) return new Response('Unauthorized', { status: 401 });
const { id } = await params;
const doc = await db.query.documents.findFirst({
where: eq(documents.id, id),
with: { client: true },
});
if (!doc) return Response.json({ error: 'Not found' }, { status: 404 });
if (!doc.filePath) return Response.json({ error: 'No PDF file' }, { status: 422 });
if (doc.status !== 'Draft') return Response.json({ error: 'Document is locked' }, { status: 403 });
const filePath = path.join(UPLOADS_DIR, doc.filePath);
const pageTexts = await extractPdfText(filePath);
const { fields, textFillData } = await classifyFieldsWithAI(pageTexts, doc.client);
// Write fields to DB (same as PUT /fields)
const [updated] = await db
.update(documents)
.set({ signatureFields: fields })
.where(eq(documents.id, id))
.returning();
return Response.json({ fields: updated.signatureFields, textFillData });
// Source: FieldPlacer.tsx pdfToScreenCoords (lines 44-54)
function pdfToScreenCoords(pdfX, pdfY, renderedW, renderedH, pageInfo) {
const left = (pdfX / pageInfo.originalWidth) * renderedW;
// top is distance from DOM top to BOTTOM EDGE of field
const top = renderedH - (pdfY / pageInfo.originalHeight) * renderedH;
return { left, top };
}
// Rendering (FieldPlacer.tsx line 630):
// top: top - heightPx + canvasOffset.y ← shift up by height to get visual top of field
```
### Pattern 5: AI Auto-place Button in PreparePanel
`pageInfo.originalWidth/originalHeight` are from react-pdf's `page.view[2]/page.view[3]` (mediaBox dimensions in PDF points at scale 1.0).
**What:** Add "AI Auto-place" button to PreparePanel. On click: POST to `/api/documents/[id]/ai-prepare`, receive `{ fields, textFillData }`, update DocumentPageClient state (`setTextFillData`, invalidate preview token). FieldPlacer reloads from DB via its existing `loadFields` useEffect (or receives the updated fields as a prop — both approaches work; recommend refreshing from DB to keep single source of truth).
### Pattern 6: AI Auto-place Route and Client State Update
**Recommended approach:** After the AI route returns, trigger a page reload or force FieldPlacer to re-fetch by incrementing a `fieldReloadKey` prop. The simplest approach is to call `router.refresh()` (which is already used in PreparePanel after prepare+send) or to expose a `reload` callback from FieldPlacer.
**What:** POST to `/api/documents/[id]/ai-prepare` → receive `{ fields, textFillData }` → update FieldPlacer state via `aiPlacementKey` increment.
```typescript
// Source: DocumentPageClient.tsx (confirmed working in Plan 03)
async function handleAiAutoPlace() {
const res = await fetch(`/api/documents/${docId}/ai-prepare`, { method: 'POST' });
if (res.ok) {
const { textFillData: aiTextFill } = await res.json();
setTextFillData(prev => ({ ...prev, ...aiTextFill })); // MERGE — preserves manual values
setAiPlacementKey(k => k + 1); // triggers FieldPlacer re-fetch from DB
setPreviewToken(null); // invalidate stale preview
}
}
```
### Anti-Patterns to Avoid
- **DO NOT** import pdfjs-dist in client components for server extraction — server-only extraction guard is mandatory (pdfjs-dist in a server route is fine; it just must not be bundled into the client)
- **DO NOT** use `zodResponseFormat` — broken with Zod v4 (issues #1540, #1602, #1709). This is a locked decision.
- **DO NOT** use `workerSrc = new URL('...', import.meta.url)` in the server route — `import.meta.url` may not resolve correctly in all Next.js Route Handler contexts and triggers browser worker initialization
- **DO NOT** use the AI coordinates as-is without conversion — AI returns top-left origin percentages; PDF requires bottom-left origin points
- **DO NOT** lock fields after AI placement — the agent MUST be able to review, adjust, or delete any AI-placed field (this is a success criterion)
- **DO NOT** have the AI route set document status to anything other than Draft — only the prepare route should move status to Sent
- **DO NOT** recreate `aiCoordsToPagePdfSpace` — that function was for the abandoned vision approach and does not belong in the current architecture
- **DO NOT** recreate `ai-coords.test.ts` — it tested a function that no longer exists; Plan 04 Task 1 must reference `prepare-document.test.ts` instead
- **DO NOT** import from `'pdfjs-dist/legacy/build/pdf.mjs'` with `GlobalWorkerOptions.workerSrc = ''` (empty string) — this is the pdfjs v3+ breaking change; always use the `file://` URL path
- **DO NOT** use `zodResponseFormat` — broken with Zod v4.3.6 (confirmed GitHub issues #1540, #1602, #1709)
- **DO NOT** add `@napi-rs/canvas` imports back to extract-text.ts — the vision approach was abandoned; the `serverExternalPackages` entry in next.config.ts can stay but the import is gone
- **DO NOT** use vision/image-based approach — the text-extraction approach is simpler, cheaper, and coordinate-accurate
- **DO NOT** lock fields after AI placement — agent must be able to edit, move, resize, delete
---
@@ -294,155 +270,144 @@ export async function POST(
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| PDF text extraction | Custom PDF parser | pdfjs-dist legacy build (`getTextContent`) | Already installed; handles fonts, encodings, multi-page; tested by Mozilla |
| Structured AI output | Manual JSON parsing + regex fallback | OpenAI `json_schema` response_format with `strict: true` | 100% schema compliance guaranteed via constrained decoding; no parse failures |
| UUID generation for field IDs | Custom ID generator | `crypto.randomUUID()` | Node built-in; same pattern used in FieldPlacer and schema |
| Field type enum validation | Custom validation function | TypeScript union type `SignatureFieldType` | Already defined in schema.ts; pass as `enum` array in JSON schema |
| PDF text extraction | Custom PDF parser | pdfjs-dist `getTextContent()` | Already installed; handles encoding, multi-page, returns transform matrix with exact coordinates |
| Field type classification | Rule-based regex | GPT-4.1 with manual json_schema | Context analysis (above/below/before/after) is complex; AI handles natural language labels |
| UUID generation for field IDs | Custom ID generator | `crypto.randomUUID()` | Node built-in; same pattern used everywhere in the project |
| Structured AI output | Parse JSON manually | OpenAI `json_schema` response_format with `strict: true` | 100% schema compliance guaranteed; no parse failures |
**Key insight:** Both the PDF extraction and the field write endpoint already exist in the project. Phase 13 is primarily a thin orchestration layer connecting them via an AI call.
**Key insight:** The current architecture is correct and complete. Plan 04 is verification-only — no new code is needed.
---
## Common Pitfalls
### Pitfall 1: Y-Axis Inversion Bug
### Pitfall 1: Plan 04 Task 1 References Deleted Test File
**What goes wrong:** AI returns `yPct: 10` meaning "10% from the top of the page." Developer naively converts: `y = (10/100) * pageHeight = 79.2 pts`. Stored as y=79.2 in PDF space, which means 79.2 pts from the BOTTOM — so the field appears near the bottom, not 10% from the top.
**What goes wrong:** Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts`. That file was deleted when the vision approach was abandoned. Running this command fails immediately.
**Why it happens:** AI models describe positions with top-left origin. PDF user-space uses bottom-left origin.
**Why it happens:** The plan was written for the vision approach. The architecture pivoted after the plan was written.
**How to avoid:** Use `aiCoordsToPagePdfSpace()` which applies the flip: `y = pageHeight - screenY - fieldHeight`. The unit test validates this with US Letter dimensions.
**How to avoid:** Plan 04 Task 1 must run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts` instead. The `prepare-document.test.ts` file has 10 passing tests covering the Y-flip coordinate formula — these remain valid and relevant.
**Warning signs:** Fields appear at the wrong vertical position; fields meant for the top of a page appear near the bottom.
**Warning signs:** `No tests found for path pattern 'src/lib/pdf/__tests__/ai-coords.test.ts'`
### Pitfall 2: OpenAI Strict Mode Schema Requirements
### Pitfall 2: pdfjs-dist 5.x Fake-Worker Requires `file://` URL
**What goes wrong:** Passing a JSON schema with `strict: true` that omits a property from `required` or omits `additionalProperties: false` on a nested object. The API returns a 400 error: "Invalid schema for response_format."
**What goes wrong:** Using `GlobalWorkerOptions.workerSrc = ''` (empty string) throws `PDFWorker: workerSrc not set` in Node.js route handlers with pdfjs-dist 5.x.
**Why it happens:** OpenAI strict mode enforces that EVERY object at EVERY nesting level has ALL properties listed in `required` AND `additionalProperties: false`. The requirement cascades to nested objects.
**Why it happens:** pdfjs-dist 5.x changed fake-worker mode. Empty string is falsy; the PDFWorker getter throws before attempting the dynamic import. The workerSrc must be a valid URL the Node.js dynamic importer can resolve.
**How to avoid:** Include `required` listing all property keys and `additionalProperties: false` on EVERY object, including the items schema inside arrays.
**How to avoid:** Use `file://${join(process.cwd(), 'node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs')}`. This is already correct in the current `extract-text.ts`.
**Warning signs:** `400 BadRequestError: Invalid schema for response_format` in server logs.
**Warning signs:** `Error: Setting up fake worker failed` or `workerSrc not set` in server logs.
### Pitfall 3: pdfjs-dist Worker in Node.js Context
### Pitfall 3: OpenAI Strict Mode Schema Cascades to Nested Objects
**What goes wrong:** Calling `getDocument()` without setting `GlobalWorkerOptions.workerSrc = ''` in Node.js throws: `Error: 'No "GlobalWorkerOptions.workerSrc" specified.'` OR tries to spin up a worker thread and fails.
**What goes wrong:** A JSON schema with `strict: true` that omits `required` or `additionalProperties: false` on a nested object causes a 400 API error.
**Why it happens:** pdf.js was designed for browsers. It tries to auto-detect the worker path via `document.currentScript.src`, which is null in Node.js.
**Why it happens:** OpenAI strict mode requires `required` listing ALL properties AND `additionalProperties: false` at EVERY object level — including `items` inside arrays.
**How to avoid:** Set `GlobalWorkerOptions.workerSrc = ''` at the top of the server-side extract-text module. This enables "fake worker" (synchronous) mode in Node.js — no worker thread needed for text extraction.
**How to avoid:** The current `CLASSIFICATION_SCHEMA` in `field-placement.ts` is correct. Do not modify the schema structure.
**Warning signs:** `No "GlobalWorkerOptions.workerSrc" specified` error in route handler logs.
**Warning signs:** `400 BadRequestError: Invalid schema for response_format`
### Pitfall 4: pdfjs-dist TypeScript Import Path
### Pitfall 4: Checkbox Fields Must Be Filtered Out Before Storing
**What goes wrong:** Importing from `'pdfjs-dist/legacy/build/pdf.mjs'` without adjusting TypeScript config. The `legacy/build/pdf.d.mts` file contains `export * from "pdfjs-dist"` which re-exports the main types, but the import path itself may trigger `skipLibCheck` issues.
**What goes wrong:** GPT-4.1 returns `fieldType: "checkbox"` for inline selection checkboxes (e.g., `[ ] ARE [ ] ARE NOT`). If these are added to `SignatureFieldData[]`, they appear as placed fields that cannot be properly handled by the signing flow.
**Why it happens:** The legacy build has `.d.mts` extension (ESM TypeScript declaration), which some older TypeScript configurations don't automatically pick up.
**Why it happens:** The AI correctly identifies these as checkboxes (not fill-in blanks). They should be classified but not placed.
**How to avoid:** Confirmed — the project already has `"transpilePackages": ['react-pdf', 'pdfjs-dist']` in next.config.ts. Import `getDocument` and `GlobalWorkerOptions` from `'pdfjs-dist/legacy/build/pdf.mjs'`. If type errors appear, add `@ts-ignore` or use the main `pdfjs-dist` types (they're the same via the re-export).
**How to avoid:** The current code already filters: `if (result.fieldType === 'checkbox') continue;`. This is correct.
**Warning signs:** TypeScript error `Could not find declaration file for 'pdfjs-dist/legacy/build/pdf.mjs'`.
**Warning signs:** Dozens of tiny checkbox-type fields placed throughout the document body.
### Pitfall 5: FieldPlacer State Not Reflecting AI-Placed Fields
### Pitfall 5: textFillData Keys Must Be Field UUIDs
**What goes wrong:** AI route writes fields to DB successfully. But FieldPlacer on screen still shows the previous (empty) fields because its local state isn't updated.
**What goes wrong:** textFillData keyed by label string ("clientName") does not match DocumentPageClient's lookup of `textFillData[field.id]`.
**Why it happens:** FieldPlacer loads fields once on mount via `useEffect`. The AI route write bypasses the React state.
**Why it happens:** Phase 12.1 wired textFillData to use field UUID as key (STATE.md confirmed locked decision). Label-keyed maps are silently ignored.
**How to avoid:** After the AI route returns successfully, trigger FieldPlacer to re-fetch from DB. Options:
- Call `router.refresh()` (Next.js App Router — causes server component re-render, but FieldPlacer re-mounts and calls loadFields)
- Add a `fieldReloadKey` prop to FieldPlacer that causes its `useEffect` to re-run when incremented
- Pass the returned `fields` array directly to FieldPlacer via a prop (requires making FieldPlacer accept `initialFields`)
**How to avoid:** The route handler creates UUIDs (`crypto.randomUUID()`) BEFORE building textFillData, then uses those same UUIDs as keys. This is already correct in the current code.
**Recommended:** The simplest approach is to expose an `onAiPlacement` callback that calls `router.refresh()` on DocumentPageClient, OR expose a reload callback from FieldPlacer. Given that FieldPlacer already loads from DB on mount, `router.refresh()` is clean.
**Warning signs:** Text pre-fill values don't appear in the preview even though AI returned them.
### Pitfall 6: Text Field Pre-fill Requires Field IDs
### Pitfall 6: Y Coordinate Is Baseline, Not Field Bottom
**What goes wrong:** AI route returns `textFillData` keyed by "client-name" or label string. But `textFillData` in the project is keyed by **field UUID** (SignatureFieldData.id). A label-keyed map won't match anything in `textFillData[field.id]` lookups.
**What goes wrong:** A field placed exactly at `blank.y` (text baseline) appears above the underline, not on it. Underscores descend slightly below the baseline.
**Why it happens:** Phase 12.1 wired textFillData to be keyed by `field.id` (UUID) not by label. This is a confirmed design decision in STATE.md.
**Why it happens:** PDF text baseline is where capital letters sit. Descenders (underscores, lowercase g/y/p) extend below the baseline.
**How to avoid:** When building `textFillData` from AI output, use the `id: crypto.randomUUID()` assigned to each AI-placed field in the route handler. The route handler creates the field UUIDs, so it can simultaneously build `{ [fieldId]: prefillValue }` for text fields with prefill values.
**How to avoid:** The current code uses `const y = Math.max(0, blank.y - 2)`. The -2pt offset anchors the field bottom edge slightly below the baseline, sitting on the underline visually.
**Warning signs:** Text fill values don't appear in the preview even though AI returned them.
**Warning signs:** Fields appear to float just above the underline instead of sitting on it.
### Pitfall 7: Debug Console.log Statements Are Still in Production Code
**What goes wrong:** Two `console.log` statements remain in `classifyFieldsWithAI`: one printing the blank count, one printing ALL blank descriptions (can be very verbose for large forms), and one printing all AI classifications.
**Why it happens:** Debugging statements added during development were not removed.
**How to avoid:** Plan 04 should include removing these console.log statements before final sign-off. They do not affect correctness but should not ship in production code.
**Warning signs:**
```
[ai-prepare] calling gpt-4.1 with 47 blanks
[ai-prepare] blank descriptions:
[0] page1 before="..." ...
```
---
## Code Examples
### Full Coordinate Conversion with Unit Test Target
### Verified: How pdfjs Text Coordinates Map to Field Positions
```typescript
// Source: FieldPlacer.tsx lines 287-295 (exact formula)
// Expected unit test cases for aiCoordsToPagePdfSpace:
// US Letter: 612 × 792 pts
// AI says: text field at xPct=10, yPct=5, widthPct=30, heightPct=5 (top of page)
// Expected: x=61.2, y=792-(0.05*792)-(0.05*792) = 792 - 39.6 - 39.6 = 712.8
// i.e., field bottom edge is 712.8 pts from page bottom (near the top)
// Checkbox: AI says xPct=90, yPct=95, widthPct=3, heightPct=3 (near bottom-right)
// Expected: x = 0.90*612 = 550.8
// fieldH = 0.03*792 = 23.76
// y = 792 - (0.95*792) - 23.76 = 792 - 752.4 - 23.76 = 15.84
// pdfjs transform matrix: [scaleX, skewY, skewX, scaleY, translateX, translateY]
// For a text item on a US Letter page (612 × 792 pts):
// transform[4] = x (distance from left edge of page, in points)
// transform[5] = y (distance from BOTTOM of page, in points — PDF user-space)
//
// Example: text near top of page
// transform[5] ≈ 720 (720pt from bottom = ~1" from top on a 792pt page)
//
// FieldPlacer renders this as:
// top = renderedH - (720 / 792) * renderedH = renderedH * 0.091 ≈ 9% from top ✓
//
// Example: footer initials near bottom of page
// transform[5] ≈ 36 (36pt from bottom = 0.5" from bottom)
//
// FieldPlacer renders this as:
// top = renderedH - (36 / 792) * renderedH = renderedH * 0.955 ≈ 95.5% from top ✓
```
### Manual JSON Schema for GPT-4o-mini (Complete)
### Verified: groupIntoLines Threshold
```typescript
// Source: OpenAI structured outputs docs
const response = await openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{
role: 'system',
content: `You are a real estate document form field extractor.
Given extracted text from a PDF page (with context about page number and dimensions),
identify where signature, text, checkbox, initials, and date fields should be placed.
Return fields as percentage positions (0-100) from the TOP-LEFT of the page.
Use these field types: text (for typed values), checkbox, initials, date, client-signature, agent-signature, agent-initials.
For text fields that match the client profile, set prefillValue to the known value. Otherwise use empty string.`,
},
{
role: 'user',
content: `Client name: ${clientName}\nProperty address: ${propertyAddress}\n\nPDF pages:\n${pagesSummary}`,
},
],
response_format: {
type: 'json_schema',
json_schema: {
name: 'field_placement',
strict: true,
schema: FIELD_PLACEMENT_SCHEMA, // defined above — all fields required, additionalProperties: false
},
},
});
// Source: extract-text.ts groupIntoLines (line 58)
// ±5pt tolerance groups items on the same visual line.
// This handles multi-run items (same underline split across font boundaries)
// AND minor baseline variations in real PDFs.
//
// A 3-blank signature row (sig | addr/phone | date) will group correctly IF
// all three blanks have y-values within ±5pt of each other.
// If NOT grouped (y-drift > 5pt), Rule D in post-processing handles the misclassification.
```
### FieldPlacer Reload After AI Placement (Recommended Pattern)
### Verified: Route Handler Security and Error Pattern
```typescript
// In DocumentPageClient.tsx — add aiPlacementKey state
const [aiPlacementKey, setAiPlacementKey] = useState(0);
// Pass to FieldPlacer via PdfViewerWrapper
// In FieldPlacer, add to loadFields useEffect dependency array:
useEffect(() => {
loadFields();
}, [docId, aiPlacementKey]); // re-fetch when key increments
// After AI auto-place succeeds in PreparePanel:
async function handleAiAutoPlace() {
const res = await fetch(`/api/documents/${docId}/ai-prepare`, { method: 'POST' });
if (res.ok) {
const { textFillData: aiTextFill } = await res.json();
setTextFillData(prev => ({ ...prev, ...aiTextFill }));
setAiPlacementKey(k => k + 1); // triggers FieldPlacer reload
setPreviewToken(null);
}
}
// Source: ai-prepare/route.ts (confirmed complete)
// Guards in order:
// 1. auth() — session check
// 2. OPENAI_API_KEY existence — 503 if missing
// 3. document lookup — 404 if not found
// 4. filePath check — 422 if no PDF
// 5. status === 'Draft' — 403 if locked
// 6. path traversal check — 403 if escapes UPLOADS_DIR
// 7. try/catch wrapping extractBlanks + classifyFieldsWithAI — 500 with message
// 8. DB write — direct update, no status change
// 9. return { fields, textFillData }
```
---
@@ -451,67 +416,125 @@ async function handleAiAutoPlace() {
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| AcroForm heuristic field detection (FORMS-V2-01) | GPT-4o-mini text classification | Superseded per REQUIREMENTS.md | AI-based approach handles flat/scanned forms that have no AcroForm fields |
| `zodResponseFormat` helper | Manual `json_schema` response_format | Broken in Zod v4 (as of 2025) | Must write JSON schema manually; more verbose but reliable |
| `disableWorker = true` (pdfjs-dist v2 API) | `GlobalWorkerOptions.workerSrc = ''` | pdfjs-dist v3+ | Correct server-side Node.js pattern for current pdfjs-dist 5.x |
| Positional text extraction via bounding boxes | Text-only extraction with AI classification | Phase 13 approach | Simpler and sufficient for label classification; AI handles the spatial reasoning |
| GPT-4o vision (render pages as JPEG) | pdfjs text-layer extraction | After Plan 01, during debugging | Eliminates coordinate conversion math and the 30-40% vertical offset bug |
| `aiCoordsToPagePdfSpace()` + `ai-coords.test.ts` | Direct pdfjs coordinates — no conversion | After Plan 01 pivot | Deleted function and test; `prepare-document.test.ts` is the only relevant test |
| GPT-4o (vision model) | GPT-4.1 (text model) | After vision approach abandoned | Cheaper, faster, sufficient for text classification task |
| `GlobalWorkerOptions.workerSrc = ''` | `file://` URL path | pdfjs-dist 5.x requirement | Required for fake-worker mode; empty string is falsy in 5.x |
| `zodResponseFormat` helper | Manual `json_schema` | Zod v4 incompatibility (issues #1540, #1602, #1709) | Permanent — project uses Zod v4.3.6 |
| `@napi-rs/canvas` for image rendering | Not used (deleted from imports) | After vision approach abandoned | Package still installed but not imported; `serverExternalPackages` entry can remain |
**Deprecated/outdated:**
- `PDFJS.disableWorker = true`: v2 API, removed in v3+. Use `GlobalWorkerOptions.workerSrc = ''` instead.
- `zodResponseFormat`: Works only with Zod v3. Project uses Zod v4.3.6. Use manual json_schema.
**Deprecated/outdated items that should NOT appear in new code:**
- `import { createCanvas } from '@napi-rs/canvas'` in extract-text.ts — deleted, do not restore
- `aiCoordsToPagePdfSpace()` function — deleted, do not recreate
- `ai-coords.test.ts` — deleted, do not recreate
- `zodResponseFormat` — broken with Zod v4, do not use
- `GlobalWorkerOptions.workerSrc = ''` (empty string) — wrong for pdfjs 5.x, do not use
---
## Open Questions
1. **OPENAI_API_KEY environment variable**
- What we know: The project's `.env.local` file exists but its contents are private. The openai SDK reads from `process.env.OPENAI_API_KEY` by default.
- What's unclear: Whether OPENAI_API_KEY is already set in `.env.local` for the project.
- Recommendation: The plan should include a step noting that OPENAI_API_KEY must be set in `.env.local`. The route handler should return a clear 503 error if OPENAI_API_KEY is not configured.
1. **Debug console.log statements in classifyFieldsWithAI**
- What we know: Three `console.log` calls remain in `field-placement.ts` (blank count, all descriptions, all classifications). These print verbose output to server logs on every AI auto-place request.
- What's unclear: Whether to remove before Plan 04 verification or after.
- Recommendation: Remove before E2E verification so server logs are clean for manual testing. Add to Plan 04 Task 1.
2. **Utah REPC 20-page form field accuracy**
- What we know: AI coordinate accuracy on real Utah forms is listed as an explicit concern in STATE.md Blockers/Concerns.
- What's unclear: How well GPT-4o-mini will perform on a dense legal document like the Utah REPC without fine-tuning. Percentage coordinates from text-only extraction (no visual) may be imprecise.
- Recommendation: Plan 13-04 integration test is non-negotiable. Expect iteration on the system prompt. Consider truncating very long page text to stay within GPT-4o-mini context limits.
2. **gpt-4.1 model availability**
- What we know: `field-placement.ts` uses `model: 'gpt-4.1'`. This is the current model. The project's OPENAI_API_KEY must have access to this model.
- What's unclear: Whether the developer's API key has gpt-4.1 access (it's not universally available to all OpenAI accounts as of 2026).
- Recommendation: If the API call returns a 404 model error, fall back to `gpt-4o` (which was used in an earlier iteration and is broadly available). The prompt and schema work identically for both models.
3. **Token limit for 20-page document**
- What we know: GPT-4o-mini context window is 128K tokens. A 20-page dense legal document could approach 30,000-50,000 tokens of extracted text.
- What's unclear: Whether sending all 20 pages in one call will stay within limits comfortably.
- Recommendation: In `classifyFieldsWithAI`, cap page text at ~2000 chars per page (truncate with ellipsis) to stay well under limits. Total text budget: 20 × 2000 = 40,000 chars ≈ ~10,000 tokens — well within 128K. The plan should include this truncation.
3. **Real Utah REPC 20-page accuracy**
- What we know: The text-extraction approach extracts blanks accurately based on the 4 detection strategies. Accuracy depends on the quality of the PDF text layer.
- What's unclear: How many blanks are missed or double-detected on the full 20-page Utah REPC. The system prompt is tuned for Utah real estate signature patterns.
- Recommendation: Plan 04 human verification on a real Utah REPC is non-negotiable. Expect 80-95% accuracy — imperfect placement is acceptable as the agent reviews before sending.
4. **FieldPlacer prop vs. DB reload for displaying AI fields**
- What we know: FieldPlacer currently loads fields via `useEffect([docId])` on mount. After AI placement writes to DB, FieldPlacer needs to re-fetch.
- What's unclear: Whether adding `aiPlacementKey` prop to FieldPlacer (which threads through PdfViewerWrapper) or using `router.refresh()` is cleaner given the current component hierarchy.
- Recommendation: Add `aiPlacementKey` prop to FieldPlacer and thread through PdfViewerWrapper. This avoids a full page re-render from `router.refresh()` and gives more surgical control.
4. **Plan 04 Task 1 command mismatch**
- What we know: Plan 04 Task 1 runs `npx jest src/lib/pdf/__tests__/ai-coords.test.ts` — that file is deleted.
- Recommendation: Plan 04 must be updated to run `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` and `npx tsc --noEmit` instead.
---
## Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| pdfjs-dist | extractBlanks() | ✓ | 5.4.296 | — (required, already installed) |
| pdfjs worker file | pdfjs fake-worker | ✓ | `node_modules/pdfjs-dist/legacy/build/pdf.worker.mjs` | — |
| openai SDK | classifyFieldsWithAI() | ✓ | ^6.32.0 | — (required, already installed) |
| OPENAI_API_KEY env var | classifyFieldsWithAI() | Unknown | — | Route returns 503 with message if missing |
| Node.js | All server-side code | ✓ | v23.6.0 | — |
| @napi-rs/canvas | NOT required anymore | ✓ (installed) | ^0.1.97 | N/A — no longer imported |
**Missing dependencies with no fallback:**
- OPENAI_API_KEY in `.env.local` — must be set before Plan 04 verification. Route returns 503 if missing (actionable error, not a crash).
---
## Validation Architecture
### Test Framework
| Property | Value |
|----------|-------|
| Framework | Jest 29.7.0 + ts-jest |
| Config file | package.json `"jest": { "preset": "ts-jest", "testEnvironment": "node" }` |
| Quick run command | `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` |
| Full suite command | `npx jest --no-coverage --verbose` |
### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| AI-01 | Y-flip coordinate formula correct (FieldPlacer screen↔PDF) | unit | `npx jest src/lib/pdf/__tests__/prepare-document.test.ts --no-coverage --verbose` | ✅ |
| AI-01 | AI auto-place button appears, fields land on canvas, correct vertical positions | manual E2E | human verification (Plan 04 Task 2) | N/A |
| AI-02 | Text pre-fill values from client profile appear in field values | manual E2E | human verification (Plan 04 Task 2) | N/A |
### Sampling Rate
- **Per task commit:** `npx jest --no-coverage`
- **Per wave merge:** `npx jest --no-coverage && npx tsc --noEmit`
- **Phase gate:** Full suite green + TypeScript clean + human E2E approval before phase complete
### Wave 0 Gaps
None — existing test infrastructure covers coordinate math. No new test files required for Plan 04. The deleted `ai-coords.test.ts` is intentionally not recreated (it tested a deleted function).
---
## Sources
### Primary (HIGH confidence)
- pdfjs-dist 5.4.296 package introspection — `node_modules/pdfjs-dist/legacy/build/` directory listing, `package.json` main/types fields
- Project codebase — `FieldPlacer.tsx` (coordinate formula, lines 287-295), `prepare-document.test.ts` (existing unit test pattern), `schema.ts` (SignatureFieldData types), `PreparePanel.tsx` (UI pattern for async buttons), `prepare/route.ts` (API route pattern)
- STATE.md decisions — locked: manual json_schema (not zodResponseFormat); pdfjs-dist legacy build; Zod v4 incompatibility confirmed issues #1540 #1602 #1709
- REQUIREMENTS.md AI-01, AI-02 — exact success criteria
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/extract-text.ts` — current implementation, text-layer extraction approach
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/ai/field-placement.ts` — current implementation, GPT-4.1 classification
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/api/documents/[id]/ai-prepare/route.ts` — current route implementation
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx``pdfToScreenCoords` function (lines 44-54) and drag-end coordinate formula (lines 291-292)
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/app/portal/(protected)/documents/[docId]/_components/PdfViewer.tsx``pageInfo.originalWidth/originalHeight` from `page.view[2]/page.view[3]` (react-pdf)
- `.planning/phases/13-ai-field-placement-and-pre-fill/.continue-here.md` — complete record of what was attempted, what failed, and current state
- `git log --oneline` — confirmed commit history showing vision→text pivot
- STATE.md — locked decisions including manual json_schema, pdfjs-dist legacy build, Zod v4 incompatibility
- `/Users/ccopeland/temp/red/teressa-copeland-homes/src/lib/pdf/__tests__/prepare-document.test.ts` — 10 tests, all passing
### Secondary (MEDIUM confidence)
- [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — json_schema shape, strict mode requirements (additionalProperties: false, all fields required)
- [openai npm package](https://www.npmjs.com/package/openai) — version 6.32.0 confirmed as latest March 2026
- WebSearch results confirming zodResponseFormat broken with Zod v4: GitHub issues #1540, #1602, #1709, #1739 all open/confirmed
- pdfjs-dist server-side text extraction pattern — multiple WebSearch sources confirm `GlobalWorkerOptions.workerSrc = ''` for Node.js fake-worker mode
- pdfjs-dist transform matrix format — verified by reading pdfjs docs + confirmed by the implementation's usage of `transform[4]`/`transform[5]` for x/y
- react-pdf `page.view` array format — confirmed from PdfViewer.tsx `page.view[0], page.view[2]` / `page.view[1], page.view[3]` usage with comment `// Math.max handles non-standard mediaBox ordering`
### Tertiary (LOW confidence)
- AI coordinate accuracy on Utah REPC forms — untested; flagged as open question in STATE.md
- GPT-4o-mini token usage estimate for 20-page legal document — estimated from typical legal document density, not measured
- GPT-4.1 model availability for all API keys — not verified; gpt-4.1 may not be available in all tiers
- Real Utah REPC 20-page blank extraction accuracy — not measured; text-layer quality varies by PDF
---
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH — pdfjs-dist already installed and confirmed v5.4.296; openai v6.32.0 confirmed from npm; project uses Zod v4.3.6 (manual json_schema confirmed required)
- Architecture: HIGH — coordinate formula confirmed from FieldPlacer.tsx source; API route pattern confirmed from prepare/route.ts and fields/route.ts; field ID keying confirmed from STATE.md Phase 12.1 decisions
- Pitfalls: HIGH — Y-axis inversion, Zod v4 zodResponseFormat breakage, worker setup, and field ID keying are all confirmed from authoritative sources (code + STATE.md + verified GitHub issues)
- Standard stack: HIGH — all packages confirmed installed and version-verified
- Architecture: HIGH — current code read directly, coordinate system verified analytically and confirmed by test results
- Pitfalls: HIGH — all pitfalls documented from actual bugs encountered during development (per .continue-here.md) or confirmed from code inspection
- Plan 04 gap (Task 1 test command): HIGH — confirmed by checking git status (ai-coords.test.ts deleted), running jest (prepare-document.test.ts passes)
**Research date:** 2026-03-21
**Valid until:** 2026-04-21 (30 days — openai SDK stable; pdfjs-dist stable; Zod v4 issues open but workaround confirmed)
**Research date:** 2026-04-03
**Valid until:** 2026-05-03 (30 days — stack stable; coordinate math is deterministic; no external dependencies changing)