fix(13): use AI-estimated field sizes with type bounds, stricter no-inline-text rule

- Replace fixed 144x36 with AI widthPct/heightPct clamped to per-type min/max
  (signatures 100-250x20-40pt, initials 36-80x16-28pt, date 60-130x14-24pt, text 60-280x14-24pt)
- Prompt: explicit 'no inline body text' rule — if text is part of a sentence, skip it
- Prompt: widthPct should match visual underline width, heightPct kept thin (~2-2.5%)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Chandler Copeland
2026-03-21 17:46:04 -06:00
parent 461abb0dc4
commit 48788dea23
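
The per-type clamping described in the commit message can be sketched in isolation as follows. This is a minimal sketch rather than the repository's code: `FieldType`, `SIZE_LIMITS`, `clamp`, and `clampFieldSize` are hypothetical names, and the NaN fallback is an added assumption (the `Math.max`/`Math.min` pattern alone would pass a NaN through). The bounds are copied from the commit message and are in PDF points.

```typescript
// Hypothetical names; only the numeric bounds come from the commit.
type FieldType =
  | 'client-signature' | 'agent-signature'
  | 'initials' | 'agent-initials'
  | 'date' | 'text' | 'checkbox';

interface SizeLimits { minW: number; maxW: number; minH: number; maxH: number }

const SIZE_LIMITS: Record<FieldType, SizeLimits> = {
  'client-signature': { minW: 100, maxW: 250, minH: 20, maxH: 40 },
  'agent-signature':  { minW: 100, maxW: 250, minH: 20, maxH: 40 },
  'initials':         { minW: 36,  maxW: 80,  minH: 16, maxH: 28 },
  'agent-initials':   { minW: 36,  maxW: 80,  minH: 16, maxH: 28 },
  'date':             { minW: 60,  maxW: 130, minH: 14, maxH: 24 },
  'text':             { minW: 60,  maxW: 280, minH: 14, maxH: 24 },
  'checkbox':         { minW: 16,  maxW: 24,  minH: 16, maxH: 24 },
};

// Clamp v into [min, max]. Assumption: a NaN estimate falls back to min.
function clamp(v: number, min: number, max: number): number {
  if (Number.isNaN(v)) return min;
  return Math.max(min, Math.min(v, max));
}

// Clamp an AI-estimated size (in points) to the bounds for its field type.
// Unknown types use the 'text' bounds, mirroring the fallback in the diff.
function clampFieldSize(
  fieldType: string,
  rawW: number,
  rawH: number,
): { width: number; height: number } {
  const known = Object.prototype.hasOwnProperty.call(SIZE_LIMITS, fieldType);
  const lim = known ? SIZE_LIMITS[fieldType as FieldType] : SIZE_LIMITS.text;
  return {
    width: clamp(rawW, lim.minW, lim.maxW),
    height: clamp(rawH, lim.minH, lim.maxH),
  };
}
```

For example, `clampFieldSize('initials', 200, 5)` yields `{ width: 80, height: 16 }`: the oversized width is cut to the 80 pt maximum and the undersized height is raised to the 16 pt minimum.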


@@ -119,16 +119,19 @@ export async function classifyFieldsWithAI(
 role: 'system',
 content: `You are a real estate document form field extractor. You will receive images of PDF pages. Your job is to identify every location that needs to be filled in.
-WHAT TO PLACE FIELDS ON:
-- Blank underlines: ____________
-- Labeled blanks: "Name: ______", "Address: ______", "Price: $______"
-- Signature lines with labels like "(Seller's Signature)", "(Buyer's Signature)", "(Agent)"
-- Date lines labeled "(Date)" or with a date underline
-- Initials boxes: "[ ]" or "_____ Initials" or small boxes at page bottoms/margins
+WHAT TO PLACE FIELDS ON (only these):
+- Visible blank underlines: ____________ (a horizontal line with nothing on it)
+- Labeled blank lines: "Name: ______", "Address: ______", "Price: $______"
+- Signature lines labeled "(Seller's Signature)", "(Buyer's Signature)", "(Agent)", etc.
+- Date underlines labeled "(Date)" or similar
+- Initials boxes: small "[ ]" or "____" next to "Initials" labels, usually at page bottom margins
-WHAT NOT TO PLACE FIELDS ON:
-- Paragraph body text, instructions, legal boilerplate
-- Headings and section titles
+WHAT NOT TO PLACE FIELDS ON — STRICT:
+- ANY paragraph body text, even if it contains an address, name, or value inline
+- Document title, headings, section numbers
+- Printed values that are already filled in (e.g. a pre-printed address in the document body)
+- Descriptive or instructional text
+- If the text is part of a sentence or clause, do NOT place a field on it
 FIELD TYPES:
 - "client-signature" → buyer or seller/client signature lines
@@ -138,10 +141,12 @@ FIELD TYPES:
 - "date" → any date field
 - "text" → all other blanks (names, addresses, prices, terms, etc.)
-POSITIONING:
+POSITIONING AND SIZING:
 - xPct and yPct are percentages from the TOP-LEFT of that specific page image
-- Place the field AT the blank line, not above or below it
-- For a line like "Buyer's Signature __________ Date _______", place a client-signature at the signature blank's x/y and a date field at the date blank's x/y — they are separate fields on the same line
+- Place the field AT the blank underline — align it to sit on top of the line
+- For a row like "Signature __________ Date _______", create TWO separate fields: one for the signature blank and one for the date blank, each at their own x position
+- widthPct: match the visual width of the underline — short blanks get small widths, long signature lines get wider
+- heightPct: keep fields thin — signature/text ~2.5%, initials/date ~2%
 - Do NOT place checkbox fields
 PREFILL:
@@ -183,10 +188,21 @@ PREFILL:
 const pageWidth = pageInfo?.width ?? 612; // fallback: US Letter
 const pageHeight = pageInfo?.height ?? 792;
-const { x, y } = aiCoordsToPagePdfSpace(aiField, pageWidth, pageHeight);
+const { x, y, width: rawW, height: rawH } = aiCoordsToPagePdfSpace(aiField, pageWidth, pageHeight);
-const width = 144; // pts: 2 inches
-const height = 36; // pts: 0.5 inches
+// Use AI-estimated size, clamped to type-appropriate min/max
+const sizeLimits: Record<SignatureFieldType, { minW: number; maxW: number; minH: number; maxH: number }> = {
+'client-signature': { minW: 100, maxW: 250, minH: 20, maxH: 40 },
+'agent-signature': { minW: 100, maxW: 250, minH: 20, maxH: 40 },
+'initials': { minW: 36, maxW: 80, minH: 16, maxH: 28 },
+'agent-initials': { minW: 36, maxW: 80, minH: 16, maxH: 28 },
+'date': { minW: 60, maxW: 130, minH: 14, maxH: 24 },
+'text': { minW: 60, maxW: 280, minH: 14, maxH: 24 },
+'checkbox': { minW: 16, maxW: 24, minH: 16, maxH: 24 },
+};
+const lim = sizeLimits[aiField.fieldType] ?? sizeLimits['text'];
+const width = Math.max(lim.minW, Math.min(rawW, lim.maxW));
+const height = Math.max(lim.minH, Math.min(rawH, lim.maxH));
 const id = crypto.randomUUID();
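
The body of `aiCoordsToPagePdfSpace` is not part of this diff, so the following is only a guess at what such a conversion might look like, not the repository's implementation. It assumes the four percentage properties named in the prompt (`xPct`, `yPct`, `widthPct`, `heightPct`), that `yPct` marks the top edge of the box measured downward from the top-left of the page image, and that the target is PDF user space with the origin at the bottom-left. `PctBox`, `PdfRect`, and `pctBoxToPdfRect` are hypothetical names.

```typescript
// Hypothetical shapes; only the property names xPct/yPct/widthPct/heightPct
// appear in the prompt text.
interface PctBox { xPct: number; yPct: number; widthPct: number; heightPct: number }
interface PdfRect { x: number; y: number; width: number; height: number }

// Convert a top-left, percentage-based box into a PDF rectangle in points.
// Defaults are US Letter (612 x 792 pt), matching the fallbacks in the diff.
function pctBoxToPdfRect(box: PctBox, pageWidth = 612, pageHeight = 792): PdfRect {
  const width = (box.widthPct / 100) * pageWidth;
  const height = (box.heightPct / 100) * pageHeight;
  const x = (box.xPct / 100) * pageWidth;
  // Assumption: yPct is the box's top edge. PDF y grows upward, so the
  // lower-left corner sits at pageHeight - top - height.
  const y = pageHeight - (box.yPct / 100) * pageHeight - height;
  return { x, y, width, height };
}

// A signature-sized estimate: a quarter of the page wide, 2.5% tall.
const rect = pctBoxToPdfRect({ xPct: 50, yPct: 25, widthPct: 25, heightPct: 2.5 });
console.log(rect);
```

On a 792 pt page a 2.5% height is about 19.8 pt, just under the 20 pt signature minimum in `sizeLimits`, so the clamp in this commit would raise such a field to 20 pt.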