Files
red/.planning/phases/13-ai-field-placement-and-pre-fill/13-RESEARCH.md
2026-03-21 16:47:28 -06:00

518 lines
29 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Phase 13: AI Field Placement and Pre-fill - Research
**Researched:** 2026-03-21
**Domain:** OpenAI GPT-4o-mini structured outputs + pdfjs-dist server-side text extraction + coordinate conversion
**Confidence:** HIGH
<phase_requirements>
## Phase Requirements
| ID | Description | Research Support |
|----|-------------|-----------------|
| AI-01 | Agent can click one button to have AI auto-place all field types (text, checkbox, initials, date, agent signature, client signature) on a PDF in correct positions | pdfjs-dist legacy build for text extraction; GPT-4o-mini structured output for field classification; aiCoordsToPagePdfSpace() for coordinate conversion; fields PUT to existing /api/documents/[id]/fields endpoint |
| AI-02 | AI pre-fills text fields with known values from the client profile (name, property address, date) | Client profile already has name + propertyAddress in DB; textFillData (Record<string,string> keyed by field UUID) already wired through DocumentPageClient → FieldPlacer; AI route returns both fields array and pre-fill map; agent reviews before committing |
</phase_requirements>
---
## Summary
Phase 13 adds an "AI Auto-place" button that extracts text from the PDF via pdfjs-dist (already installed), sends that text to GPT-4o-mini with a structured output schema asking for field type + page + normalized percentage coordinates, converts those percentages to PDF user-space points (with Y-axis flip), and writes the resulting `SignatureFieldData[]` array to the existing `/api/documents/[id]/fields` PUT endpoint. The agent reviews, adjusts, or deletes any AI-placed field before proceeding.
The `openai` npm package (v6.x, latest as of March 2026) is NOT currently installed in the project — it must be added. However, pdfjs-dist 5.4.296 is already present (via react-pdf) and the legacy build at `pdfjs-dist/legacy/build/pdf.mjs` is usable for server-side text extraction. The coordinate conversion formula already exists in FieldPlacer.tsx and `prepare-document.test.ts`; Phase 13 must replicate it as a named utility (`aiCoordsToPagePdfSpace`) with a dedicated unit test.
The decision recorded in STATE.md is authoritative: do NOT use `zodResponseFormat` (broken with Zod v4 that's installed). Use manual `json_schema` response_format with `strict: true`, `additionalProperties: false`, and every property in `required` at every nesting level. GPT-4o-mini supports this natively (confirmed since gpt-4o-mini-2024-07-18).
**Primary recommendation:** Install `openai` npm package, use `pdfjs-dist/legacy/build/pdf.mjs` for text extraction with `GlobalWorkerOptions.workerSrc = ''` in Node.js route context, send extracted text to GPT-4o-mini with manual JSON schema, convert percentage coordinates to PDF points with Y-axis flip, write fields via existing PUT endpoint, and have the agent review before committing.
---
## Standard Stack
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| openai (npm) | ^6.32.0 (latest Mar 2026) | Official OpenAI TypeScript SDK for Chat Completions API | Not yet installed; must `npm install openai`. Official SDK with strict mode structured outputs |
| pdfjs-dist | 5.4.296 (already installed) | Server-side PDF text extraction via `getTextContent()` | Already a project dependency via react-pdf. Legacy build works in Node.js route handlers |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| server-only | NOT installed (not in node_modules) | Build-time guard preventing client-side import of server modules | Use comment guard `// server-only` or explicit `if (typeof window !== 'undefined') throw` — OR install `server-only` package |
| crypto.randomUUID() | Node built-in | Generate UUIDs for AI-placed field IDs | Already used in FieldPlacer for field IDs; same pattern here |
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| Manual `json_schema` response_format | `zodResponseFormat` helper | zodResponseFormat is broken with Zod v4 (confirmed issues #1540, #1602, #1709 — this is a locked decision in STATE.md) |
| pdfjs-dist legacy build | pdf-parse, unpdf | pdfjs-dist is already installed; unpdf would add a dependency; pdf-parse is simpler but less positional data available |
| GPT-4o-mini | GPT-4o, GPT-4.1 | GPT-4o-mini is cheapest and supports structured outputs with 100% schema compliance; good enough for field classification |
**Installation:**
```bash
npm install openai
```
---
## Architecture Patterns
### Recommended Project Structure
```
src/
├── lib/
│ ├── ai/
│ │ ├── extract-text.ts # pdfjs-dist server-side text extraction (server-only guard)
│ │ └── field-placement.ts # GPT-4o-mini call + aiCoordsToPagePdfSpace() (server-only guard)
│ └── pdf/
│ ├── prepare-document.ts # existing — unchanged
│ └── __tests__/
│ ├── prepare-document.test.ts # existing — unchanged
│ └── ai-coords.test.ts # NEW — unit test for aiCoordsToPagePdfSpace()
└── app/
└── api/
└── documents/
└── [id]/
└── ai-prepare/
└── route.ts # NEW — POST /api/documents/[id]/ai-prepare
```
### Pattern 1: Server-Side PDF Text Extraction with pdfjs-dist Legacy Build
**What:** Import from `pdfjs-dist/legacy/build/pdf.mjs`, set `GlobalWorkerOptions.workerSrc = ''` (empty string tells pdf.js to fall back to synchronous/fake worker mode in Node.js — this is the documented pattern for server-side use without a browser worker thread), load the PDF bytes, iterate pages calling `getTextContent()`.
**When to use:** Server-only route handlers (Next.js App Router route.ts files run on Node.js).
**Important:** The project's existing client-side usage (`PdfViewer.tsx`, `PreviewModal.tsx`) sets `workerSrc = new URL('pdfjs-dist/build/pdf.worker.min.mjs', import.meta.url).toString()` — that is browser-only. The server-side module must set `workerSrc = ''` independently. Do NOT share the import or the GlobalWorkerOptions assignment across server and client.
```typescript
// Source: pdfjs-dist docs + STATE.md v1.1 Research decision
// lib/ai/extract-text.ts — server-only
import { getDocument, GlobalWorkerOptions } from 'pdfjs-dist/legacy/build/pdf.mjs';
import { readFile } from 'node:fs/promises';
// Empty string = no worker thread (fake/synchronous worker) — required for Node.js server context
GlobalWorkerOptions.workerSrc = '';
export interface PageText {
page: number; // 1-indexed
text: string; // all text items joined with spaces
width: number; // page width in PDF points (72 DPI)
height: number; // page height in PDF points (72 DPI)
}
export async function extractPdfText(filePath: string): Promise<PageText[]> {
const data = new Uint8Array(await readFile(filePath));
const pdf = await getDocument({ data }).promise;
const pages: PageText[] = [];
for (let pageNum = 1; pageNum <= pdf.numPages; pageNum++) {
const page = await pdf.getPage(pageNum);
const viewport = page.getViewport({ scale: 1.0 });
const textContent = await page.getTextContent();
const text = textContent.items
.filter((item) => 'str' in item)
.map((item) => (item as { str: string }).str)
.join(' ');
pages.push({
page: pageNum,
width: viewport.width,
height: viewport.height,
text,
});
}
return pages;
}
```
### Pattern 2: GPT-4o-mini Structured Output with Manual JSON Schema
**What:** Use the `openai` SDK `chat.completions.create()` with `response_format: { type: 'json_schema', json_schema: { ... strict: true } }`. The schema asks the model to return an array of field placement objects with: `page` (1-indexed integer), `fieldType` (enum of 7 types), `xPct` (0100 percentage of page width), `yPct` (0100 percentage of page height, measured from page TOP — AI models think top-left origin), `widthPct`, `heightPct`, and optionally `prefillValue` for text fields.
**Critical JSON Schema rule:** When `strict: true`, ALL properties must be in `required` and ALL objects must have `additionalProperties: false`. Any missing field from `required` causes an API 400 error.
**When to use:** All AI field placement requests.
```typescript
// Source: OpenAI docs + STATE.md locked decision (manual json_schema, not zodResponseFormat)
// lib/ai/field-placement.ts — server-only
import OpenAI from 'openai';
import type { PageText } from './extract-text';
import type { SignatureFieldData, SignatureFieldType } from '@/lib/db/schema';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const FIELD_PLACEMENT_SCHEMA = {
type: 'object',
properties: {
fields: {
type: 'array',
items: {
type: 'object',
properties: {
page: { type: 'integer' },
fieldType: { type: 'string', enum: ['text', 'checkbox', 'initials', 'date', 'client-signature', 'agent-signature', 'agent-initials'] },
xPct: { type: 'number' },
yPct: { type: 'number' }, // % from page TOP (AI top-left origin)
widthPct: { type: 'number' },
heightPct: { type: 'number' },
prefillValue: { type: 'string' }, // only for text fields; empty string if none
},
required: ['page', 'fieldType', 'xPct', 'yPct', 'widthPct', 'heightPct', 'prefillValue'],
additionalProperties: false,
},
},
},
required: ['fields'],
additionalProperties: false,
} as const;
```
### Pattern 3: Coordinate Conversion — AI Percentage to PDF User-Space
**What:** AI returns percentage coordinates with top-left origin (AI models think of a page as a grid where (0,0) is top-left). PDF user-space uses bottom-left origin with points (1 pt = 1/72 inch). Two conversions are needed:
1. Percentage to absolute points: `x = (xPct / 100) * pageWidth`
2. Y-axis flip: `y = pageHeight - (yPct / 100) * pageHeight - fieldHeight`
(the stored y is the BOTTOM edge of the field in PDF space)
**This is the exact same formula already used in FieldPlacer.tsx** for the drag-and-drop coordinate conversion. The new `aiCoordsToPagePdfSpace()` function extracts it as a named utility, verified by a unit test.
```typescript
// Source: FieldPlacer.tsx coordinate math (lines 287-295) — same formula
// lib/ai/field-placement.ts
export interface AiFieldCoords {
page: number;
fieldType: SignatureFieldType;
xPct: number; // % from left, top-left origin
yPct: number; // % from top, top-left origin
widthPct: number;
heightPct: number;
prefillValue: string;
}
/**
* Convert AI percentage coordinates (top-left origin) to PDF user-space points (bottom-left origin).
*
* pageWidth/pageHeight in PDF points (from page.getViewport({ scale: 1.0 })).
*
* Formula mirrors FieldPlacer.tsx handleDragEnd (lines 289-291):
* pdfX = (clampedX / renderedW) * pageInfo.originalWidth
* pdfY = ((renderedH - (clampedY + fieldHpx)) / renderedH) * pageInfo.originalHeight
*/
export function aiCoordsToPagePdfSpace(
coords: AiFieldCoords,
pageWidth: number,
pageHeight: number,
): { x: number; y: number; width: number; height: number } {
const fieldWidth = (coords.widthPct / 100) * pageWidth;
const fieldHeight = (coords.heightPct / 100) * pageHeight;
const screenX = (coords.xPct / 100) * pageWidth;
const screenY = (coords.yPct / 100) * pageHeight; // screen Y from top
const x = screenX;
// PDF y = distance from BOTTOM. screenY is from top, so flip:
// pdfY = pageHeight - screenY - fieldHeight (bottom edge of field)
const y = pageHeight - screenY - fieldHeight;
return { x, y, width: fieldWidth, height: fieldHeight };
}
```
### Pattern 4: AI Auto-place API Route
**What:** POST `/api/documents/[id]/ai-prepare` — server-side route that orchestrates:
1. Load document + client from DB
2. Resolve PDF file path
3. Call `extractPdfText()` for all pages
4. Call GPT-4o-mini with extracted text and client profile data
5. Convert AI coords to PDF user-space for each field
6. Write `SignatureFieldData[]` back to DB via direct DB update (same as fields PUT endpoint pattern)
7. Return `{ fields, textFillData }` — client updates local state from response
**When to use:** Called from the "AI Auto-place" button in PreparePanel.
```typescript
// Source: pattern matches /api/documents/[id]/prepare/route.ts
// app/api/documents/[id]/ai-prepare/route.ts
export async function POST(
req: Request,
{ params }: { params: Promise<{ id: string }> }
) {
const session = await auth();
if (!session?.user?.id) return new Response('Unauthorized', { status: 401 });
const { id } = await params;
const doc = await db.query.documents.findFirst({
where: eq(documents.id, id),
with: { client: true },
});
if (!doc) return Response.json({ error: 'Not found' }, { status: 404 });
if (!doc.filePath) return Response.json({ error: 'No PDF file' }, { status: 422 });
if (doc.status !== 'Draft') return Response.json({ error: 'Document is locked' }, { status: 403 });
const filePath = path.join(UPLOADS_DIR, doc.filePath);
const pageTexts = await extractPdfText(filePath);
const { fields, textFillData } = await classifyFieldsWithAI(pageTexts, doc.client);
// Write fields to DB (same as PUT /fields)
const [updated] = await db
.update(documents)
.set({ signatureFields: fields })
.where(eq(documents.id, id))
.returning();
return Response.json({ fields: updated.signatureFields, textFillData });
}
```
### Pattern 5: AI Auto-place Button in PreparePanel
**What:** Add "AI Auto-place" button to PreparePanel. On click: POST to `/api/documents/[id]/ai-prepare`, receive `{ fields, textFillData }`, update DocumentPageClient state (`setTextFillData`, invalidate preview token). FieldPlacer reloads from DB via its existing `loadFields` useEffect (or receives the updated fields as a prop — both approaches work; recommend refreshing from DB to keep single source of truth).
**Recommended approach:** After the AI route returns, trigger a page reload or force FieldPlacer to re-fetch by incrementing a `fieldReloadKey` prop. The simplest approach is to call `router.refresh()` (which is already used in PreparePanel after prepare+send) or to expose a `reload` callback from FieldPlacer.
### Anti-Patterns to Avoid
- **DO NOT** import pdfjs-dist in client components for server extraction — server-only extraction guard is mandatory (pdfjs-dist in a server route is fine; it just must not be bundled into the client)
- **DO NOT** use `zodResponseFormat` — broken with Zod v4 (issues #1540, #1602, #1709). This is a locked decision.
- **DO NOT** use `workerSrc = new URL('...', import.meta.url)` in the server route — `import.meta.url` may not resolve correctly in all Next.js Route Handler contexts and triggers browser worker initialization
- **DO NOT** use the AI coordinates as-is without conversion — AI returns top-left origin percentages; PDF requires bottom-left origin points
- **DO NOT** lock fields after AI placement — the agent MUST be able to review, adjust, or delete any AI-placed field (this is a success criterion)
- **DO NOT** have the AI route set document status to anything other than Draft — only the prepare route should move status to Sent
---
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| PDF text extraction | Custom PDF parser | pdfjs-dist legacy build (`getTextContent`) | Already installed; handles fonts, encodings, multi-page; tested by Mozilla |
| Structured AI output | Manual JSON parsing + regex fallback | OpenAI `json_schema` response_format with `strict: true` | 100% schema compliance guaranteed via constrained decoding; no parse failures |
| UUID generation for field IDs | Custom ID generator | `crypto.randomUUID()` | Node built-in; same pattern used in FieldPlacer and schema |
| Field type enum validation | Custom validation function | TypeScript union type `SignatureFieldType` | Already defined in schema.ts; pass as `enum` array in JSON schema |
**Key insight:** Both the PDF extraction and the field write endpoint already exist in the project. Phase 13 is primarily a thin orchestration layer connecting them via an AI call.
---
## Common Pitfalls
### Pitfall 1: Y-Axis Inversion Bug
**What goes wrong:** AI returns `yPct: 10` meaning "10% from the top of the page." Developer naively converts: `y = (10/100) * pageHeight = 79.2 pts`. Stored as y=79.2 in PDF space, which means 79.2 pts from the BOTTOM — so the field appears near the bottom, not 10% from the top.
**Why it happens:** AI models describe positions with top-left origin. PDF user-space uses bottom-left origin.
**How to avoid:** Use `aiCoordsToPagePdfSpace()` which applies the flip: `y = pageHeight - screenY - fieldHeight`. The unit test validates this with US Letter dimensions.
**Warning signs:** Fields appear at the wrong vertical position; fields meant for the top of a page appear near the bottom.
### Pitfall 2: OpenAI Strict Mode Schema Requirements
**What goes wrong:** Passing a JSON schema with `strict: true` that omits a property from `required` or omits `additionalProperties: false` on a nested object. The API returns a 400 error: "Invalid schema for response_format."
**Why it happens:** OpenAI strict mode enforces that EVERY object at EVERY nesting level has ALL properties listed in `required` AND `additionalProperties: false`. The requirement cascades to nested objects.
**How to avoid:** Include `required` listing all property keys and `additionalProperties: false` on EVERY object, including the items schema inside arrays.
**Warning signs:** `400 BadRequestError: Invalid schema for response_format` in server logs.
### Pitfall 3: pdfjs-dist Worker in Node.js Context
**What goes wrong:** Calling `getDocument()` without setting `GlobalWorkerOptions.workerSrc = ''` in Node.js throws: `Error: 'No "GlobalWorkerOptions.workerSrc" specified.'` OR tries to spin up a worker thread and fails.
**Why it happens:** pdf.js was designed for browsers. It tries to auto-detect the worker path via `document.currentScript.src`, which is null in Node.js.
**How to avoid:** Set `GlobalWorkerOptions.workerSrc = ''` at the top of the server-side extract-text module. This enables "fake worker" (synchronous) mode in Node.js — no worker thread needed for text extraction.
**Warning signs:** `No "GlobalWorkerOptions.workerSrc" specified` error in route handler logs.
### Pitfall 4: pdfjs-dist TypeScript Import Path
**What goes wrong:** Importing from `'pdfjs-dist/legacy/build/pdf.mjs'` without adjusting TypeScript config. The `legacy/build/pdf.d.mts` file contains `export * from "pdfjs-dist"` which re-exports the main types, but the import path itself may trigger `skipLibCheck` issues.
**Why it happens:** The legacy build has `.d.mts` extension (ESM TypeScript declaration), which some older TypeScript configurations don't automatically pick up.
**How to avoid:** Confirmed — the project already has `"transpilePackages": ['react-pdf', 'pdfjs-dist']` in next.config.ts. Import `getDocument` and `GlobalWorkerOptions` from `'pdfjs-dist/legacy/build/pdf.mjs'`. If type errors appear, add `@ts-ignore` or use the main `pdfjs-dist` types (they're the same via the re-export).
**Warning signs:** TypeScript error `Could not find declaration file for 'pdfjs-dist/legacy/build/pdf.mjs'`.
### Pitfall 5: FieldPlacer State Not Reflecting AI-Placed Fields
**What goes wrong:** AI route writes fields to DB successfully. But FieldPlacer on screen still shows the previous (empty) fields because its local state isn't updated.
**Why it happens:** FieldPlacer loads fields once on mount via `useEffect`. The AI route write bypasses the React state.
**How to avoid:** After the AI route returns successfully, trigger FieldPlacer to re-fetch from DB. Options:
- Call `router.refresh()` (Next.js App Router — causes server component re-render, but FieldPlacer re-mounts and calls loadFields)
- Add a `fieldReloadKey` prop to FieldPlacer that causes its `useEffect` to re-run when incremented
- Pass the returned `fields` array directly to FieldPlacer via a prop (requires making FieldPlacer accept `initialFields`)
**Recommended:** The simplest approach is to expose an `onAiPlacement` callback that calls `router.refresh()` on DocumentPageClient, OR expose a reload callback from FieldPlacer. Given that FieldPlacer already loads from DB on mount, `router.refresh()` is clean.
### Pitfall 6: Text Field Pre-fill Requires Field IDs
**What goes wrong:** AI route returns `textFillData` keyed by "client-name" or label string. But `textFillData` in the project is keyed by **field UUID** (SignatureFieldData.id). A label-keyed map won't match anything in `textFillData[field.id]` lookups.
**Why it happens:** Phase 12.1 wired textFillData to be keyed by `field.id` (UUID) not by label. This is a confirmed design decision in STATE.md.
**How to avoid:** When building `textFillData` from AI output, use the `id: crypto.randomUUID()` assigned to each AI-placed field in the route handler. The route handler creates the field UUIDs, so it can simultaneously build `{ [fieldId]: prefillValue }` for text fields with prefill values.
**Warning signs:** Text fill values don't appear in the preview even though AI returned them.
---
## Code Examples
### Full Coordinate Conversion with Unit Test Target
```typescript
// Source: FieldPlacer.tsx lines 287-295 (exact formula)
// Expected unit test cases for aiCoordsToPagePdfSpace:
// US Letter: 612 × 792 pts
// AI says: text field at xPct=10, yPct=5, widthPct=30, heightPct=5 (top of page)
// Expected: x=61.2, y=792-(0.05*792)-(0.05*792) = 792 - 39.6 - 39.6 = 712.8
// i.e., field bottom edge is 712.8 pts from page bottom (near the top)
// Checkbox: AI says xPct=90, yPct=95, widthPct=3, heightPct=3 (near bottom-right)
// Expected: x = 0.90*612 = 550.8
// fieldH = 0.03*792 = 23.76
// y = 792 - (0.95*792) - 23.76 = 792 - 752.4 - 23.76 = 15.84
```
### Manual JSON Schema for GPT-4o-mini (Complete)
```typescript
// Source: OpenAI structured outputs docs
const response = await openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{
role: 'system',
content: `You are a real estate document form field extractor.
Given extracted text from a PDF page (with context about page number and dimensions),
identify where signature, text, checkbox, initials, and date fields should be placed.
Return fields as percentage positions (0-100) from the TOP-LEFT of the page.
Use these field types: text (for typed values), checkbox, initials, date, client-signature, agent-signature, agent-initials.
For text fields that match the client profile, set prefillValue to the known value. Otherwise use empty string.`,
},
{
role: 'user',
content: `Client name: ${clientName}\nProperty address: ${propertyAddress}\n\nPDF pages:\n${pagesSummary}`,
},
],
response_format: {
type: 'json_schema',
json_schema: {
name: 'field_placement',
strict: true,
schema: FIELD_PLACEMENT_SCHEMA, // defined above — all fields required, additionalProperties: false
},
},
});
```
### FieldPlacer Reload After AI Placement (Recommended Pattern)
```typescript
// In DocumentPageClient.tsx — add aiPlacementKey state
const [aiPlacementKey, setAiPlacementKey] = useState(0);
// Pass to FieldPlacer via PdfViewerWrapper
// In FieldPlacer, add to loadFields useEffect dependency array:
useEffect(() => {
loadFields();
}, [docId, aiPlacementKey]); // re-fetch when key increments
// After AI auto-place succeeds in PreparePanel:
async function handleAiAutoPlace() {
const res = await fetch(`/api/documents/${docId}/ai-prepare`, { method: 'POST' });
if (res.ok) {
const { textFillData: aiTextFill } = await res.json();
setTextFillData(prev => ({ ...prev, ...aiTextFill }));
setAiPlacementKey(k => k + 1); // triggers FieldPlacer reload
setPreviewToken(null);
}
}
```
---
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| AcroForm heuristic field detection (FORMS-V2-01) | GPT-4o-mini text classification | Superseded per REQUIREMENTS.md | AI-based approach handles flat/scanned forms that have no AcroForm fields |
| `zodResponseFormat` helper | Manual `json_schema` response_format | Broken in Zod v4 (as of 2025) | Must write JSON schema manually; more verbose but reliable |
| `disableWorker = true` (pdfjs-dist v2 API) | `GlobalWorkerOptions.workerSrc = ''` | pdfjs-dist v3+ | Correct server-side Node.js pattern for current pdfjs-dist 5.x |
| Positional text extraction via bounding boxes | Text-only extraction with AI classification | Phase 13 approach | Simpler and sufficient for label classification; AI handles the spatial reasoning |
**Deprecated/outdated:**
- `PDFJS.disableWorker = true`: v2 API, removed in v3+. Use `GlobalWorkerOptions.workerSrc = ''` instead.
- `zodResponseFormat`: Works only with Zod v3. Project uses Zod v4.3.6. Use manual json_schema.
---
## Open Questions
1. **OPENAI_API_KEY environment variable**
- What we know: The project's `.env.local` file exists but its contents are private. The openai SDK reads from `process.env.OPENAI_API_KEY` by default.
- What's unclear: Whether OPENAI_API_KEY is already set in `.env.local` for the project.
- Recommendation: The plan should include a step noting that OPENAI_API_KEY must be set in `.env.local`. The route handler should return a clear 503 error if OPENAI_API_KEY is not configured.
2. **Utah REPC 20-page form field accuracy**
- What we know: AI coordinate accuracy on real Utah forms is listed as an explicit concern in STATE.md Blockers/Concerns.
- What's unclear: How well GPT-4o-mini will perform on a dense legal document like the Utah REPC without fine-tuning. Percentage coordinates from text-only extraction (no visual) may be imprecise.
- Recommendation: Plan 13-04 integration test is non-negotiable. Expect iteration on the system prompt. Consider truncating very long page text to stay within GPT-4o-mini context limits.
3. **Token limit for 20-page document**
- What we know: GPT-4o-mini context window is 128K tokens. A 20-page dense legal document could approach 30,000-50,000 tokens of extracted text.
- What's unclear: Whether sending all 20 pages in one call will stay within limits comfortably.
- Recommendation: In `classifyFieldsWithAI`, cap page text at ~2000 chars per page (truncate with ellipsis) to stay well under limits. Total text budget: 20 × 2000 = 40,000 chars ≈ ~10,000 tokens — well within 128K. The plan should include this truncation.
4. **FieldPlacer prop vs. DB reload for displaying AI fields**
- What we know: FieldPlacer currently loads fields via `useEffect([docId])` on mount. After AI placement writes to DB, FieldPlacer needs to re-fetch.
- What's unclear: Whether adding `aiPlacementKey` prop to FieldPlacer (which threads through PdfViewerWrapper) or using `router.refresh()` is cleaner given the current component hierarchy.
- Recommendation: Add `aiPlacementKey` prop to FieldPlacer and thread through PdfViewerWrapper. This avoids a full page re-render from `router.refresh()` and gives more surgical control.
---
## Sources
### Primary (HIGH confidence)
- pdfjs-dist 5.4.296 package introspection — `node_modules/pdfjs-dist/legacy/build/` directory listing, `package.json` main/types fields
- Project codebase — `FieldPlacer.tsx` (coordinate formula, lines 287-295), `prepare-document.test.ts` (existing unit test pattern), `schema.ts` (SignatureFieldData types), `PreparePanel.tsx` (UI pattern for async buttons), `prepare/route.ts` (API route pattern)
- STATE.md decisions — locked: manual json_schema (not zodResponseFormat); pdfjs-dist legacy build; Zod v4 incompatibility confirmed issues #1540 #1602 #1709
- REQUIREMENTS.md AI-01, AI-02 — exact success criteria
### Secondary (MEDIUM confidence)
- [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — json_schema shape, strict mode requirements (additionalProperties: false, all fields required)
- [openai npm package](https://www.npmjs.com/package/openai) — version 6.32.0 confirmed as latest March 2026
- WebSearch results confirming zodResponseFormat broken with Zod v4: GitHub issues #1540, #1602, #1709, #1739 all open/confirmed
- pdfjs-dist server-side text extraction pattern — multiple WebSearch sources confirm `GlobalWorkerOptions.workerSrc = ''` for Node.js fake-worker mode
### Tertiary (LOW confidence)
- AI coordinate accuracy on Utah REPC forms — untested; flagged as open question in STATE.md
- GPT-4o-mini token usage estimate for 20-page legal document — estimated from typical legal document density, not measured
---
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH — pdfjs-dist already installed and confirmed v5.4.296; openai v6.32.0 confirmed from npm; project uses Zod v4.3.6 (manual json_schema confirmed required)
- Architecture: HIGH — coordinate formula confirmed from FieldPlacer.tsx source; API route pattern confirmed from prepare/route.ts and fields/route.ts; field ID keying confirmed from STATE.md Phase 12.1 decisions
- Pitfalls: HIGH — Y-axis inversion, Zod v4 zodResponseFormat breakage, worker setup, and field ID keying are all confirmed from authoritative sources (code + STATE.md + verified GitHub issues)
**Research date:** 2026-03-21
**Valid until:** 2026-04-21 (30 days — openai SDK stable; pdfjs-dist stable; Zod v4 issues open but workaround confirmed)