Files

5.4 KiB
Raw Permalink Blame History

phase, task, status, last_updated
phase task status last_updated
13-ai-field-placement-and-pre-fill human-verification (Plan 13-04 Task 2) in_progress 2026-03-21T23:56:48.249Z

<current_state> Plan 13-04 checkpoint is open — Task 1 (automated unit tests + tsc) passed, waiting on human E2E verification (Task 2). However the AI field placement itself is still being debugged. The placement is visually wrong: fields land 30-40% too low on the page, specifically missing the real estate signature block (blank line ABOVE label pattern) and clustering near the page footer instead.

We are in a coordinate system debugging loop. The last thing we tried was stamping PAGE N on rendered images and tightening the prompt, but it didn't fix the offset. </current_state>

<completed_work>

  • Plan 13-01 ✓ — extract-text.ts (pdfjs-dist fake-worker, @napi-rs/canvas render), field-placement.ts (GPT-4o vision, aiCoordsToPagePdfSpace), unit tests (3 passing)
  • Plan 13-02 ✓ — POST /api/documents/[id]/ai-prepare route (extract → AI → convert → DB write)
  • Plan 13-03 ✓ — "AI Auto-place Fields" violet button in PreparePanel, aiPlacementKey prop chain through DocumentPageClient → PdfViewerWrapper → PdfViewer → FieldPlacer
  • Plan 13-04 Task 1 ✓ — automated checks pass
  • Several bug fixes committed:
    • pdfjs workerSrc must be file:// path not empty string
    • PreviewModal needs dynamic(..., { ssr: false }) to prevent SSR DOMMatrix crash
    • @napi-rs/canvas must be in serverExternalPackages in next.config.ts
    • Upgraded model gpt-4o-mini → gpt-4o
    • Removed checkboxes from AI placement
    • Y-coordinate clamping to prevent fields below canvas
    • Switched from text extraction to vision (render pages as JPEG, send to GPT-4o)
    • Added PAGE N red stamp on each rendered image
    • Reduced field heights, added 0.5% y-nudge </completed_work>

<remaining_work>

  • ACTIVE BUG: AI field placement coordinates are systematically wrong — fields land ~30-40% too low on the page. Fields that should be on the signature block (y≈50%) are appearing at y≈85-90% near the copyright footer. The initials/date at the very bottom of the page DO land correctly, suggesting the bottom area is right but the middle is wrong.
  • Next debugging step that was interrupted: read PdfViewer.tsx to compare how pageInfo.originalHeight is constructed vs what page.getViewport({ scale: 1.0 }).height returns server-side. A DPI/scale mismatch between react-pdf and pdfjs server-side would cause a systematic offset.
  • After coordinate fix: complete Plan 13-04 Task 2 human verification checkpoint (approved or issues reported)
  • After verification: run node gsd-tools.cjs phase complete 13, commit docs

</remaining_work>

<decisions_made>

  • Use @napi-rs/canvas for server-side PDF page rendering (pre-built binaries, no native compilation)
  • Must add @napi-rs/canvas to serverExternalPackages in next.config.ts or Turbopack bundles the .node file and fails
  • pdfjs-dist 5.x fake-worker requires a file:// URL as workerSrc (empty string is falsy, throws)
  • Use GPT-4o vision (not text extraction) — text extraction can't identify WHERE the blanks are
  • RENDER_SCALE = 1.5 for the page images sent to GPT-4o
  • No checkboxes — positions are input-dependent, can't be AI-determined
  • Field sizes are type-bounded (text: 40-260×12-16pt, signature: 100-250×16-26pt, etc.)
  • Manual json_schema response_format (zodResponseFormat broken with Zod v4)

</decisions_made>

  • Coordinate offset bug: Fields are landing ~30-40% too low. Suspected cause: pageInfo.originalHeight in FieldPlacer (from react-pdf) may differ from page.getViewport({ scale: 1.0 }).height in the server-side pdfjs extraction. If react-pdf reports originalHeight at 96dpi (1056pt for a Letter page) but server pdfjs reports 792pt, the conversion would be wrong.
  • User noted model quality could be improved. GPT-4o is current model. Could try Claude claude-opus-4-6 via Anthropic API as alternative if coordinate fix doesn't solve accuracy.
The AI field placement uses: render page images server-side → send to GPT-4o vision → get xPct/yPct back → convert to PDF user-space points via aiCoordsToPagePdfSpace → store in DB → FieldPlacer re-fetches and renders.

The conversion formula: pdfY = pageHeight - (yPct/100 * pageHeight) - fieldHeight (with 0.5% nudge and clamping).

FieldPlacer renders with: top = renderedH - (pdfY / pageInfo.originalHeight) * renderedH - heightPx

If pageInfo.originalHeightpageHeight used in server-side conversion, every field will be offset. This is the prime suspect.

The real estate signature block pattern is: blank underline on one line, label like "(Seller's Signature)" on the line BELOW. The AI needs to place the field ON the blank, not on the label. This is in the prompt but may still be confusing the model.

<next_action>

  1. Read PdfViewer.tsx to find where pageInfo.originalHeight is set (look for onPageLoadSuccess or similar react-pdf callback that provides page dimensions)
  2. Compare to page.getViewport({ scale: 1.0 }).height in extract-text.ts
  3. If they differ (e.g. react-pdf uses devicePixelRatio or a different scale), fix the server-side pageHeight used in aiCoordsToPagePdfSpace to match what FieldPlacer expects
  4. If they match, the bug is elsewhere — add console.log in the route to print actual x/y values for a field and compare to where it visually appears </next_action>