# Stack Research **Domain:** Real estate agent website + PDF document signing web app **Researched:** 2026-03-21 **Confidence:** HIGH (versions verified via npm registry; integration issues verified via official GitHub issues) **Scope:** v1.1 additions only — OpenAI integration, expanded field types, agent signature storage, filled preview --- ## Existing Stack (Do Not Re-research) Already validated and in `package.json`. Do not change these. | Technology | Version in package.json | Role | |------------|------------------------|------| | Next.js | 16.2.0 | Full-stack framework | | React | 19.2.4 | UI | | `@cantoo/pdf-lib` | ^2.6.3 | PDF modification (server-side) | | `react-pdf` | ^10.4.1 | In-browser PDF rendering | | `signature_pad` | ^5.1.3 | Canvas signature drawing | | `zod` | ^4.3.6 | Schema validation | | `@vercel/blob` | ^2.3.1 | File storage | | Drizzle ORM + `postgres` | ^0.45.1 / ^3.4.8 | Database | | Auth.js (next-auth) | 5.0.0-beta.30 | Authentication | --- ## New Stack Additions for v1.1 ### Core New Dependency: OpenAI API | Technology | Version | Purpose | Why Recommended | |------------|---------|---------|-----------------| | `openai` | ^6.32.0 | OpenAI API client for GPT calls | Official SDK, current latest, TypeScript-native. Provides `client.chat.completions.create()` for structured JSON output via manual `json_schema` response format. Required for AI field placement and pre-fill. | **No other new core dependencies are needed.** The remaining v1.1 features extend capabilities already in `@cantoo/pdf-lib`, `signature_pad`, and `react-pdf`. --- ### Supporting Libraries | Library | Version | Purpose | When to Use | |---------|---------|---------|-------------| | `unpdf` | ^1.4.0 | Server-side PDF text extraction | Use in the AI pipeline API route to extract raw text from PDF pages before sending to OpenAI. Serverless-compatible, wraps PDF.js v5, works in Next.js API routes without native bindings. More reliable in serverless than `pdfjs-dist` directly. | No other new supporting libraries needed. See "What NOT to Add" below. --- ### Development Tools No new dev tooling required for v1.1 features. --- ## Installation ```bash # New dependencies for v1.1 npm install openai unpdf ``` That is the full installation delta for v1.1. --- ## Feature-by-Feature Integration Notes ### Feature 1: OpenAI PDF Analysis + Field Placement **Flow:** 1. API route receives document ID 2. Fetch PDF bytes from Vercel Blob (`@vercel/blob` — already installed) 3. Extract text per page using `unpdf`: `getDocumentProxy()` + `extractText()` 4. Call OpenAI `gpt-4o-mini` with extracted text + a manually defined JSON schema 5. Parse structured response: array of `{ fieldType, label, pageNumber, x, y, width, height, suggestedValue }` 6. Save placement records to DB via Drizzle ORM **Why `gpt-4o-mini` (not `gpt-4o`):** Sufficient for structured field extraction on real estate forms. Significantly cheaper. The task is extraction from known document templates — not complex reasoning. **Why manual JSON schema (not `zodResponseFormat`):** The project uses `zod` v4.3.6. The `zodResponseFormat` helper in `openai/helpers/zod` uses vendored `zod-to-json-schema` that still expects `ZodFirstPartyTypeKind` — removed in Zod v4. This is a confirmed open bug as of late 2025. Using `zodResponseFormat` with Zod v4 throws runtime exceptions. Use `response_format: { type: "json_schema", json_schema: { name: "...", strict: true, schema: { ... } } }` directly with plain TypeScript types instead. ```typescript // CORRECT for Zod v4 project — use manual JSON schema, not zodResponseFormat const response = await openai.chat.completions.create({ model: "gpt-4o-mini", messages: [{ role: "user", content: prompt }], response_format: { type: "json_schema", json_schema: { name: "field_placements", strict: true, schema: { type: "object", properties: { fields: { type: "array", items: { type: "object", properties: { fieldType: { type: "string", enum: ["text", "checkbox", "initials", "date", "signature"] }, label: { type: "string" }, pageNumber: { type: "number" }, x: { type: "number" }, y: { type: "number" }, width: { type: "number" }, height: { type: "number" }, suggestedValue: { type: "string" } }, required: ["fieldType", "label", "pageNumber", "x", "y", "width", "height", "suggestedValue"], additionalProperties: false } } }, required: ["fields"], additionalProperties: false } } } }); const result = JSON.parse(response.choices[0].message.content!); ``` --- ### Feature 2: Expanded Field Types in @cantoo/pdf-lib **No new library needed.** `@cantoo/pdf-lib` v2.6.3 already supports all required field types natively: | Field Type | @cantoo/pdf-lib API | |------------|---------------------| | Text | `form.createTextField(name)` → `.addToPage(page, options)` → `.setText(value)` | | Checkbox | `form.createCheckBox(name)` → `.addToPage(page, options)` → `.check()` / `.uncheck()` | | Initials | No dedicated type — use `createTextField` with width/height appropriate for initials | | Date | No dedicated type — use `createTextField`, constrain value format in application logic | | Agent Signature | Use `page.drawImage(embeddedPng, { x, y, width, height })` — see Feature 3 | **Key pattern for checkboxes:** ```typescript const checkBox = form.createCheckBox('fieldName') checkBox.addToPage(page, { x, y, width: 15, height: 15, borderWidth: 1 }) if (shouldBeChecked) checkBox.check() ``` **Coordinate system note:** `@cantoo/pdf-lib` uses PDF coordinate space where y=0 is the bottom of the page. If field positions come from `unpdf` / PDF.js (which uses y=0 at top), you must transform: `pdfY = pageHeight - sourceY - fieldHeight`. --- ### Feature 3: Agent Signature Storage **No new library needed.** The project already has `signature_pad` v5.1.3, `@vercel/blob`, and Drizzle ORM. **Architecture:** 1. Agent draws signature in browser using `signature_pad` (already installed) 2. Call `signaturePad.toDataURL('image/png')` to get base64 PNG 3. POST to API route; server converts base64 → `Uint8Array` → uploads to Vercel Blob at a stable path (e.g., `/agents/{agentId}/signature.png`) 4. Save blob URL to agent record in DB (add `signatureImageUrl` column to `Agent`/`User` table via Drizzle migration) 5. On "apply agent signature": server fetches blob URL, embeds PNG into PDF using `@cantoo/pdf-lib` **`signature_pad` v5 in React — use `useRef` on a `` element directly:** ```typescript import SignaturePad from 'signature_pad' import { useRef, useEffect } from 'react' export function SignatureDrawer() { const canvasRef = useRef(null) const padRef = useRef(null) useEffect(() => { if (canvasRef.current) { padRef.current = new SignaturePad(canvasRef.current) } return () => padRef.current?.off() }, []) const save = () => { const dataUrl = padRef.current?.toDataURL('image/png') // POST dataUrl to /api/agent/signature } return } ``` **Do NOT add `react-signature-canvas`.** It wraps `signature_pad` at v1.1.0-alpha.2 (alpha status) and the project already has `signature_pad` directly. Use the raw library with a `useRef`. **Embedding the saved signature into PDF:** ```typescript const sigBytes = await fetch(agentSignatureBlobUrl).then(r => r.arrayBuffer()) const sigImage = await pdfDoc.embedPng(new Uint8Array(sigBytes)) const dims = sigImage.scaleToFit(fieldWidth, fieldHeight) page.drawImage(sigImage, { x: fieldX, y: fieldY, width: dims.width, height: dims.height }) ``` --- ### Feature 4: Filled Document Preview **No new library needed.** `react-pdf` v10.4.1 is already installed and supports rendering a PDF from an `ArrayBuffer` directly. **Architecture:** 1. Server Action: load original PDF from Vercel Blob, apply all field values (text, checkboxes, embedded signature image) using `@cantoo/pdf-lib`, return `pdfDoc.save()` bytes 2. API route returns the bytes as `application/pdf`; client receives as `ArrayBuffer` 3. Pass `ArrayBuffer` directly to `react-pdf`'s `` — no upload required **Known issue with react-pdf v7+:** `ArrayBuffer` becomes detached after first use. Always copy: ```typescript const safeCopy = (buf: ArrayBuffer) => { const copy = new ArrayBuffer(buf.byteLength) new Uint8Array(copy).set(new Uint8Array(buf)) return copy } ``` **react-pdf renders the flattened PDF accurately** — all filled text fields, checked checkboxes, and embedded signature images will appear correctly because they are baked into the PDF bytes by `@cantoo/pdf-lib` before rendering. --- ## Alternatives Considered | Recommended | Alternative | Why Not | |-------------|-------------|---------| | `unpdf` for text extraction | `pdfjs-dist` directly in Node API route | `pdfjs-dist` v5 uses `Promise.withResolvers` requiring Node 22+; the project targets Node 20 LTS. `unpdf` ships a polyfilled serverless build that handles this. | | `unpdf` for text extraction | `pdf-parse` | `pdf-parse` is unmaintained (last publish 2019). `unpdf` is the community-recommended successor. | | Manual JSON schema for OpenAI | `zodResponseFormat` helper | Broken with Zod v4 — open bug in `openai-node` as of Nov 2025. Manual schema avoids the dependency entirely. | | `gpt-4o-mini` | `gpt-4o` | Real estate form field extraction is a structured extraction task on templated documents. `gpt-4o-mini` is sufficient and ~15x cheaper. Upgrade to `gpt-4o` only if accuracy on unusual forms is unacceptable. | | `page.drawImage()` for agent signature | `PDFSignature` AcroForm field | `@cantoo/pdf-lib` has no `createSignature()` API — `PDFSignature` only reads existing signature fields and provides no image embedding. The correct approach is `embedPng()` + `drawImage()` at the field coordinates. | --- ## What NOT to Add | Avoid | Why | Use Instead | |-------|-----|-------------| | `zodResponseFormat` from `openai/helpers/zod` | Broken at runtime with Zod v4.x (throws exceptions). Open bug, no fix merged as of 2026-03-21. | Plain `response_format: { type: "json_schema", ... }` with hand-written schema | | `react-signature-canvas` | Alpha version (1.1.0-alpha.2); project already has `signature_pad` v5 directly — the wrapper adds nothing | `signature_pad` + `useRef` directly | | `@signpdf/placeholder-pdf-lib` | For cryptographic PKCS#7 digital signatures (DocuSign-style). This project needs visual e-signatures (image embedded in PDF), not cryptographic signing. | `@cantoo/pdf-lib` `embedPng()` + `drawImage()` | | `pdf2json` | Extracts spatial text data; useful for arbitrary document analysis. Overkill here — we only need raw text content to feed OpenAI. | `unpdf` | | `langchain` / Vercel AI SDK | Heavy abstractions for the simple use case of one structured extraction call per document. Adds bundle size and abstraction layers with no benefit here. | `openai` SDK directly | | A separate image processing library (`sharp`, `jimp`) | Not needed — signature PNGs from `signature_pad.toDataURL()` are already correctly sized canvas exports. `@cantoo/pdf-lib` handles embedding without pre-processing. | N/A | --- ## Version Compatibility | Package | Compatible With | Notes | |---------|-----------------|-------| | `openai@6.32.0` | `zod@4.x` (manual schema only) | Do NOT use `zodResponseFormat` helper — use raw `json_schema` response_format. The helper is broken with Zod v4. | | `openai@6.32.0` | Node.js 20+ | Requires Node 20 LTS or later. Next.js 16.2 on Vercel uses Node 20 by default. | | `unpdf@1.4.0` | Node.js 18+ | Bundled PDF.js v5.2.133 with polyfills for `Promise.withResolvers`. Works on Node 20. | | `@cantoo/pdf-lib@2.6.3` | `react-pdf@10.4.1` | These do not interact at runtime — `@cantoo/pdf-lib` runs server-side, `react-pdf` runs client-side. No conflict. | | `signature_pad@5.1.3` | React 19 | Use as a plain class instantiated in `useEffect` with a `useRef`. No React wrapper needed. | --- ## Sources - [openai npm page](https://www.npmjs.com/package/openai) — v6.32.0 confirmed, Node 20 requirement — HIGH confidence - [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — manual json_schema format confirmed — HIGH confidence - [openai-node Issue #1540](https://github.com/openai/openai-node/issues/1540) — zodResponseFormat broken with Zod v4 — HIGH confidence - [openai-node Issue #1602](https://github.com/openai/openai-node/issues/1602) — zodTextFormat broken with Zod 4 — HIGH confidence - [openai-node Issue #1709](https://github.com/openai/openai-node/issues/1709) — Zod 4.1.13+ discriminated union break — HIGH confidence - [@cantoo/pdf-lib npm page](https://www.npmjs.com/package/@cantoo/pdf-lib) — v2.6.3, field types confirmed — HIGH confidence - [pdf-lib.js.org PDFForm docs](https://pdf-lib.js.org/docs/api/classes/pdfform) — createTextField, createCheckBox, drawImage APIs — HIGH confidence - [unpdf npm page](https://www.npmjs.com/package/unpdf) — v1.4.0, serverless PDF.js build, Node 20 compatible — HIGH confidence - [unpdf GitHub](https://github.com/unjs/unpdf) — extractText API confirmed — HIGH confidence - [react-pdf npm page](https://www.npmjs.com/package/react-pdf) — v10.4.1, ArrayBuffer file prop confirmed — HIGH confidence - [react-pdf ArrayBuffer detach issue #1657](https://github.com/wojtekmaj/react-pdf/issues/1657) — copy workaround confirmed — HIGH confidence - [signature_pad GitHub](https://github.com/szimek/signature_pad) — v5.1.3, toDataURL API — HIGH confidence - [pdf-lib image embedding JSFiddle](https://jsfiddle.net/Hopding/bcya43ju/5/) — embedPng/drawImage pattern — HIGH confidence --- *Stack research for: Teressa Copeland Homes — v1.1 Smart Document Preparation additions* *Researched: 2026-03-21*