Files

79 lines
5.4 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
phase: 13-ai-field-placement-and-pre-fill
task: human-verification (Plan 13-04 Task 2)
status: in_progress
last_updated: 2026-03-21T23:56:48.249Z
---
<current_state>
Plan 13-04 checkpoint is open — Task 1 (automated unit tests + tsc) passed, waiting on human E2E verification (Task 2). However the AI field placement itself is still being debugged. The placement is visually wrong: fields land 30-40% too low on the page, specifically missing the real estate signature block (blank line ABOVE label pattern) and clustering near the page footer instead.
We are in a coordinate system debugging loop. The last thing we tried was stamping PAGE N on rendered images and tightening the prompt, but it didn't fix the offset.
</current_state>
<completed_work>
- Plan 13-01 ✓ — `extract-text.ts` (pdfjs-dist fake-worker, @napi-rs/canvas render), `field-placement.ts` (GPT-4o vision, aiCoordsToPagePdfSpace), unit tests (3 passing)
- Plan 13-02 ✓ — `POST /api/documents/[id]/ai-prepare` route (extract → AI → convert → DB write)
- Plan 13-03 ✓ — "AI Auto-place Fields" violet button in PreparePanel, aiPlacementKey prop chain through DocumentPageClient → PdfViewerWrapper → PdfViewer → FieldPlacer
- Plan 13-04 Task 1 ✓ — automated checks pass
- Several bug fixes committed:
- pdfjs workerSrc must be `file://` path not empty string
- PreviewModal needs `dynamic(..., { ssr: false })` to prevent SSR DOMMatrix crash
- @napi-rs/canvas must be in `serverExternalPackages` in next.config.ts
- Upgraded model gpt-4o-mini → gpt-4o
- Removed checkboxes from AI placement
- Y-coordinate clamping to prevent fields below canvas
- Switched from text extraction to vision (render pages as JPEG, send to GPT-4o)
- Added PAGE N red stamp on each rendered image
- Reduced field heights, added 0.5% y-nudge
</completed_work>
<remaining_work>
- **ACTIVE BUG**: AI field placement coordinates are systematically wrong — fields land ~30-40% too low on the page. Fields that should be on the signature block (y≈50%) are appearing at y≈85-90% near the copyright footer. The initials/date at the very bottom of the page DO land correctly, suggesting the bottom area is right but the middle is wrong.
- Next debugging step that was interrupted: read `PdfViewer.tsx` to compare how `pageInfo.originalHeight` is constructed vs what `page.getViewport({ scale: 1.0 }).height` returns server-side. A DPI/scale mismatch between react-pdf and pdfjs server-side would cause a systematic offset.
- After coordinate fix: complete Plan 13-04 Task 2 human verification checkpoint (approved or issues reported)
- After verification: run `node gsd-tools.cjs phase complete 13`, commit docs
</remaining_work>
<decisions_made>
- Use `@napi-rs/canvas` for server-side PDF page rendering (pre-built binaries, no native compilation)
- Must add `@napi-rs/canvas` to `serverExternalPackages` in next.config.ts or Turbopack bundles the .node file and fails
- pdfjs-dist 5.x fake-worker requires a `file://` URL as workerSrc (empty string is falsy, throws)
- Use GPT-4o vision (not text extraction) — text extraction can't identify WHERE the blanks are
- RENDER_SCALE = 1.5 for the page images sent to GPT-4o
- No checkboxes — positions are input-dependent, can't be AI-determined
- Field sizes are type-bounded (text: 40-260×12-16pt, signature: 100-250×16-26pt, etc.)
- Manual `json_schema` response_format (zodResponseFormat broken with Zod v4)
</decisions_made>
<blockers>
- **Coordinate offset bug**: Fields are landing ~30-40% too low. Suspected cause: `pageInfo.originalHeight` in FieldPlacer (from react-pdf) may differ from `page.getViewport({ scale: 1.0 }).height` in the server-side pdfjs extraction. If react-pdf reports originalHeight at 96dpi (1056pt for a Letter page) but server pdfjs reports 792pt, the conversion would be wrong.
- User noted model quality could be improved. GPT-4o is current model. Could try Claude claude-opus-4-6 via Anthropic API as alternative if coordinate fix doesn't solve accuracy.
</blockers>
<context>
The AI field placement uses: render page images server-side → send to GPT-4o vision → get xPct/yPct back → convert to PDF user-space points via aiCoordsToPagePdfSpace → store in DB → FieldPlacer re-fetches and renders.
The conversion formula: `pdfY = pageHeight - (yPct/100 * pageHeight) - fieldHeight` (with 0.5% nudge and clamping).
FieldPlacer renders with: `top = renderedH - (pdfY / pageInfo.originalHeight) * renderedH - heightPx`
If `pageInfo.originalHeight``pageHeight` used in server-side conversion, every field will be offset. This is the prime suspect.
The real estate signature block pattern is: blank underline on one line, label like "(Seller's Signature)" on the line BELOW. The AI needs to place the field ON the blank, not on the label. This is in the prompt but may still be confusing the model.
</context>
<next_action>
1. Read `PdfViewer.tsx` to find where `pageInfo.originalHeight` is set (look for `onPageLoadSuccess` or similar react-pdf callback that provides page dimensions)
2. Compare to `page.getViewport({ scale: 1.0 }).height` in extract-text.ts
3. If they differ (e.g. react-pdf uses devicePixelRatio or a different scale), fix the server-side `pageHeight` used in `aiCoordsToPagePdfSpace` to match what FieldPlacer expects
4. If they match, the bug is elsewhere — add console.log in the route to print actual x/y values for a field and compare to where it visually appears
</next_action>