14 KiB
Stack Research
Domain: Real estate agent website + PDF document signing web app Researched: 2026-03-21 Confidence: HIGH (versions verified via npm registry; integration issues verified via official GitHub issues) Scope: v1.1 additions only — OpenAI integration, expanded field types, agent signature storage, filled preview
Existing Stack (Do Not Re-research)
Already validated and in package.json. Do not change these.
| Technology | Version in package.json | Role |
|---|---|---|
| Next.js | 16.2.0 | Full-stack framework |
| React | 19.2.4 | UI |
@cantoo/pdf-lib |
^2.6.3 | PDF modification (server-side) |
react-pdf |
^10.4.1 | In-browser PDF rendering |
signature_pad |
^5.1.3 | Canvas signature drawing |
zod |
^4.3.6 | Schema validation |
@vercel/blob |
^2.3.1 | File storage |
Drizzle ORM + postgres |
^0.45.1 / ^3.4.8 | Database |
| Auth.js (next-auth) | 5.0.0-beta.30 | Authentication |
New Stack Additions for v1.1
Core New Dependency: OpenAI API
| Technology | Version | Purpose | Why Recommended |
|---|---|---|---|
openai |
^6.32.0 | OpenAI API client for GPT calls | Official SDK, current latest, TypeScript-native. Provides client.chat.completions.create() for structured JSON output via manual json_schema response format. Required for AI field placement and pre-fill. |
No other new core dependencies are needed. The remaining v1.1 features extend capabilities already in @cantoo/pdf-lib, signature_pad, and react-pdf.
Supporting Libraries
| Library | Version | Purpose | When to Use |
|---|---|---|---|
unpdf |
^1.4.0 | Server-side PDF text extraction | Use in the AI pipeline API route to extract raw text from PDF pages before sending to OpenAI. Serverless-compatible, wraps PDF.js v5, works in Next.js API routes without native bindings. More reliable in serverless than pdfjs-dist directly. |
No other new supporting libraries needed. See "What NOT to Add" below.
Development Tools
No new dev tooling required for v1.1 features.
Installation
# New dependencies for v1.1
npm install openai unpdf
That is the full installation delta for v1.1.
Feature-by-Feature Integration Notes
Feature 1: OpenAI PDF Analysis + Field Placement
Flow:
- API route receives document ID
- Fetch PDF bytes from Vercel Blob (
@vercel/blob— already installed) - Extract text per page using
unpdf:getDocumentProxy()+extractText() - Call OpenAI
gpt-4o-miniwith extracted text + a manually defined JSON schema - Parse structured response: array of
{ fieldType, label, pageNumber, x, y, width, height, suggestedValue } - Save placement records to DB via Drizzle ORM
Why gpt-4o-mini (not gpt-4o): Sufficient for structured field extraction on real estate forms. Significantly cheaper. The task is extraction from known document templates — not complex reasoning.
Why manual JSON schema (not zodResponseFormat): The project uses zod v4.3.6. The zodResponseFormat helper in openai/helpers/zod uses vendored zod-to-json-schema that still expects ZodFirstPartyTypeKind — removed in Zod v4. This is a confirmed open bug as of late 2025. Using zodResponseFormat with Zod v4 throws runtime exceptions. Use response_format: { type: "json_schema", json_schema: { name: "...", strict: true, schema: { ... } } } directly with plain TypeScript types instead.
// CORRECT for Zod v4 project — use manual JSON schema, not zodResponseFormat
const response = await openai.chat.completions.create({
model: "gpt-4o-mini",
messages: [{ role: "user", content: prompt }],
response_format: {
type: "json_schema",
json_schema: {
name: "field_placements",
strict: true,
schema: {
type: "object",
properties: {
fields: {
type: "array",
items: {
type: "object",
properties: {
fieldType: { type: "string", enum: ["text", "checkbox", "initials", "date", "signature"] },
label: { type: "string" },
pageNumber: { type: "number" },
x: { type: "number" },
y: { type: "number" },
width: { type: "number" },
height: { type: "number" },
suggestedValue: { type: "string" }
},
required: ["fieldType", "label", "pageNumber", "x", "y", "width", "height", "suggestedValue"],
additionalProperties: false
}
}
},
required: ["fields"],
additionalProperties: false
}
}
}
});
const result = JSON.parse(response.choices[0].message.content!);
Feature 2: Expanded Field Types in @cantoo/pdf-lib
No new library needed. @cantoo/pdf-lib v2.6.3 already supports all required field types natively:
| Field Type | @cantoo/pdf-lib API |
|---|---|
| Text | form.createTextField(name) → .addToPage(page, options) → .setText(value) |
| Checkbox | form.createCheckBox(name) → .addToPage(page, options) → .check() / .uncheck() |
| Initials | No dedicated type — use createTextField with width/height appropriate for initials |
| Date | No dedicated type — use createTextField, constrain value format in application logic |
| Agent Signature | Use page.drawImage(embeddedPng, { x, y, width, height }) — see Feature 3 |
Key pattern for checkboxes:
const checkBox = form.createCheckBox('fieldName')
checkBox.addToPage(page, { x, y, width: 15, height: 15, borderWidth: 1 })
if (shouldBeChecked) checkBox.check()
Coordinate system note: @cantoo/pdf-lib uses PDF coordinate space where y=0 is the bottom of the page. If field positions come from unpdf / PDF.js (which uses y=0 at top), you must transform: pdfY = pageHeight - sourceY - fieldHeight.
Feature 3: Agent Signature Storage
No new library needed. The project already has signature_pad v5.1.3, @vercel/blob, and Drizzle ORM.
Architecture:
- Agent draws signature in browser using
signature_pad(already installed) - Call
signaturePad.toDataURL('image/png')to get base64 PNG - POST to API route; server converts base64 →
Uint8Array→ uploads to Vercel Blob at a stable path (e.g.,/agents/{agentId}/signature.png) - Save blob URL to agent record in DB (add
signatureImageUrlcolumn toAgent/Usertable via Drizzle migration) - On "apply agent signature": server fetches blob URL, embeds PNG into PDF using
@cantoo/pdf-lib
signature_pad v5 in React — use useRef on a <canvas> element directly:
import SignaturePad from 'signature_pad'
import { useRef, useEffect } from 'react'
export function SignatureDrawer() {
const canvasRef = useRef<HTMLCanvasElement>(null)
const padRef = useRef<SignaturePad | null>(null)
useEffect(() => {
if (canvasRef.current) {
padRef.current = new SignaturePad(canvasRef.current)
}
return () => padRef.current?.off()
}, [])
const save = () => {
const dataUrl = padRef.current?.toDataURL('image/png')
// POST dataUrl to /api/agent/signature
}
return <canvas ref={canvasRef} width={400} height={150} />
}
Do NOT add react-signature-canvas. It wraps signature_pad at v1.1.0-alpha.2 (alpha status) and the project already has signature_pad directly. Use the raw library with a useRef.
Embedding the saved signature into PDF:
const sigBytes = await fetch(agentSignatureBlobUrl).then(r => r.arrayBuffer())
const sigImage = await pdfDoc.embedPng(new Uint8Array(sigBytes))
const dims = sigImage.scaleToFit(fieldWidth, fieldHeight)
page.drawImage(sigImage, { x: fieldX, y: fieldY, width: dims.width, height: dims.height })
Feature 4: Filled Document Preview
No new library needed. react-pdf v10.4.1 is already installed and supports rendering a PDF from an ArrayBuffer directly.
Architecture:
- Server Action: load original PDF from Vercel Blob, apply all field values (text, checkboxes, embedded signature image) using
@cantoo/pdf-lib, returnpdfDoc.save()bytes - API route returns the bytes as
application/pdf; client receives asArrayBuffer - Pass
ArrayBufferdirectly toreact-pdf's<Document file={arrayBuffer}>— no upload required
Known issue with react-pdf v7+: ArrayBuffer becomes detached after first use. Always copy:
const safeCopy = (buf: ArrayBuffer) => {
const copy = new ArrayBuffer(buf.byteLength)
new Uint8Array(copy).set(new Uint8Array(buf))
return copy
}
<Document file={safeCopy(previewBuffer)}>
react-pdf renders the flattened PDF accurately — all filled text fields, checked checkboxes, and embedded signature images will appear correctly because they are baked into the PDF bytes by @cantoo/pdf-lib before rendering.
Alternatives Considered
| Recommended | Alternative | Why Not |
|---|---|---|
unpdf for text extraction |
pdfjs-dist directly in Node API route |
pdfjs-dist v5 uses Promise.withResolvers requiring Node 22+; the project targets Node 20 LTS. unpdf ships a polyfilled serverless build that handles this. |
unpdf for text extraction |
pdf-parse |
pdf-parse is unmaintained (last publish 2019). unpdf is the community-recommended successor. |
| Manual JSON schema for OpenAI | zodResponseFormat helper |
Broken with Zod v4 — open bug in openai-node as of Nov 2025. Manual schema avoids the dependency entirely. |
gpt-4o-mini |
gpt-4o |
Real estate form field extraction is a structured extraction task on templated documents. gpt-4o-mini is sufficient and ~15x cheaper. Upgrade to gpt-4o only if accuracy on unusual forms is unacceptable. |
page.drawImage() for agent signature |
PDFSignature AcroForm field |
@cantoo/pdf-lib has no createSignature() API — PDFSignature only reads existing signature fields and provides no image embedding. The correct approach is embedPng() + drawImage() at the field coordinates. |
What NOT to Add
| Avoid | Why | Use Instead |
|---|---|---|
zodResponseFormat from openai/helpers/zod |
Broken at runtime with Zod v4.x (throws exceptions). Open bug, no fix merged as of 2026-03-21. | Plain response_format: { type: "json_schema", ... } with hand-written schema |
react-signature-canvas |
Alpha version (1.1.0-alpha.2); project already has signature_pad v5 directly — the wrapper adds nothing |
signature_pad + useRef<HTMLCanvasElement> directly |
@signpdf/placeholder-pdf-lib |
For cryptographic PKCS#7 digital signatures (DocuSign-style). This project needs visual e-signatures (image embedded in PDF), not cryptographic signing. | @cantoo/pdf-lib embedPng() + drawImage() |
pdf2json |
Extracts spatial text data; useful for arbitrary document analysis. Overkill here — we only need raw text content to feed OpenAI. | unpdf |
langchain / Vercel AI SDK |
Heavy abstractions for the simple use case of one structured extraction call per document. Adds bundle size and abstraction layers with no benefit here. | openai SDK directly |
A separate image processing library (sharp, jimp) |
Not needed — signature PNGs from signature_pad.toDataURL() are already correctly sized canvas exports. @cantoo/pdf-lib handles embedding without pre-processing. |
N/A |
Version Compatibility
| Package | Compatible With | Notes |
|---|---|---|
openai@6.32.0 |
zod@4.x (manual schema only) |
Do NOT use zodResponseFormat helper — use raw json_schema response_format. The helper is broken with Zod v4. |
openai@6.32.0 |
Node.js 20+ | Requires Node 20 LTS or later. Next.js 16.2 on Vercel uses Node 20 by default. |
unpdf@1.4.0 |
Node.js 18+ | Bundled PDF.js v5.2.133 with polyfills for Promise.withResolvers. Works on Node 20. |
@cantoo/pdf-lib@2.6.3 |
react-pdf@10.4.1 |
These do not interact at runtime — @cantoo/pdf-lib runs server-side, react-pdf runs client-side. No conflict. |
signature_pad@5.1.3 |
React 19 | Use as a plain class instantiated in useEffect with a useRef<HTMLCanvasElement>. No React wrapper needed. |
Sources
- openai npm page — v6.32.0 confirmed, Node 20 requirement — HIGH confidence
- OpenAI Structured Outputs docs — manual json_schema format confirmed — HIGH confidence
- openai-node Issue #1540 — zodResponseFormat broken with Zod v4 — HIGH confidence
- openai-node Issue #1602 — zodTextFormat broken with Zod 4 — HIGH confidence
- openai-node Issue #1709 — Zod 4.1.13+ discriminated union break — HIGH confidence
- @cantoo/pdf-lib npm page — v2.6.3, field types confirmed — HIGH confidence
- pdf-lib.js.org PDFForm docs — createTextField, createCheckBox, drawImage APIs — HIGH confidence
- unpdf npm page — v1.4.0, serverless PDF.js build, Node 20 compatible — HIGH confidence
- unpdf GitHub — extractText API confirmed — HIGH confidence
- react-pdf npm page — v10.4.1, ArrayBuffer file prop confirmed — HIGH confidence
- react-pdf ArrayBuffer detach issue #1657 — copy workaround confirmed — HIGH confidence
- signature_pad GitHub — v5.1.3, toDataURL API — HIGH confidence
- pdf-lib image embedding JSFiddle — embedPng/drawImage pattern — HIGH confidence
Stack research for: Teressa Copeland Homes — v1.1 Smart Document Preparation additions Researched: 2026-03-21