docs: complete project research

Chandler Copeland
2026-03-21 11:28:42 -06:00
parent 8c69deeb68
commit e36c6c8ee2
5 changed files with 1436 additions and 1689 deletions


@@ -1,289 +1,259 @@
# Feature Research # Feature Research
**Domain:** Real estate agent website + document signing portal **Domain:** Real estate agent website + document signing portal — v1.1 Smart Document Preparation
**Researched:** 2026-03-19 **Researched:** 2026-03-21
**Confidence:** HIGH — cross-referenced multiple industry sources across both the marketing site and the document workflow sides **Confidence:** HIGH for field types and signing flows (strong industry evidence); MEDIUM for AI field detection accuracy (implementation approach has a critical constraint); HIGH for document preview expectations
---
## Scope Note
This file covers only the v1.1 milestone features. The existing v1.0 features (agent portal, client management, PDF upload/preview, drag-drop field placement, PreparePanel text fill, email-link signing, presigned download) are already built and validated.
**New features under research:**
1. AI-assisted field placement and pre-fill (gpt-4o-mini)
2. Expanded field types: text, checkbox, initials, date, agent signature
3. Agent saved signature + sign-first workflow
4. Filled document preview before sending
---
## Feature Landscape ## Feature Landscape
### Table Stakes (Users Expect These) ### Table Stakes (Users Expect These)
Features users assume exist. Missing these = product feels incomplete. Features a real estate agent would expect from any "smart document prep" upgrade. Missing any of these makes v1.1 feel like a half-measure.
| Feature | Why Expected | Complexity | Notes | | Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------| |---------|--------------|------------|-------|
| Professional hero section with agent photo and bio | Every real estate site has this; first impression of trust and warmth | LOW | Teressa's photo is provided; needs warm/approachable design treatment | | Initials field type | Every Utah standard form (REPC, listing agreement, addenda) has initials lines on every page — initialing page-by-page is a core legal convention | MEDIUM | At signing time: initials block shows a condensed canvas (not full signature); auto-scrolls to each initials field after previous completion; behaves identically to signature field but smaller canvas |
| Mobile-responsive design | 60%+ of real estate searches happen on mobile; non-mobile sites are invisible | MEDIUM | Next.js with Tailwind handles this; must test signing flow on mobile | | Date field type | Date-of-signing is required on nearly every field block; agents expect it to auto-fill at signing moment, not require client to type a date | LOW | Industry standard: date field is a read-only "Date Signed" stamp — system fills it at the moment the signer completes the signature block; client never types a date; dotloop and DocuSign both implement this as locked/auto-populated |
| Active property listings display | Clients expect to see what you're selling; core proof of business | HIGH | WFRMLS/utahrealestate.com integration; no guaranteed public API — may require credential-based scraping | | Checkbox field type | Utah REPC and addenda use boolean checkboxes throughout (mediation clauses, contingency elections, disclosure acknowledgments) | LOW | At signing time: client taps/clicks to check; visual distinction (empty square vs. checked square); must be assigned to a specific signer or it defaults to "no one" and never gets checked (dotloop behavior confirmed) |
| Contact form / call to action | Clients need a way to reach out; missing CTA = lost leads | LOW | Simple form with email delivery; Resend or similar service | | Agent signature field type | Agent needs to counter-sign documents before sending; this is a separate field type from client signature — tied to agent's saved signature, not client's canvas | MEDIUM | Distinct from client signature: pre-filled by agent in the prep flow; client sees it as already signed when they open the document; not a client-facing interactive field |
| Client testimonials / social proof | Trust signal; real estate is relationship-driven | LOW | Static content initially; agent provides testimonial copy | | Agent signs before sending | Industry convention in real estate: the listing agent or buyer's agent signs first to demonstrate they've reviewed and authorized the document, then sends to client for counter-signature | MEDIUM | Sign-first is the dominant pattern; agent signs as Routing Order 1, client signs as Routing Order 2; DocuSign community documentation confirms this as the standard real estate workflow |
| Agent login with secure authentication | Portal requires identity protection; any breach exposes client documents | MEDIUM | Next-Auth or similar; email magic link or password auth | | Filled document preview before send | Agents expect to see the document exactly as the client will see it — with all pre-filled text, agent signature, and field placement visible — before hitting Send | MEDIUM | Preview must show: pre-filled text values, agent signature applied, all field markers overlaid on the PDF; client-facing fields should appear interactive but not be submittable from the preview |
| Document list / client management dashboard | Agent must see all active clients and document statuses at a glance | MEDIUM | CRUD operations on documents and clients; status tracking (draft, sent, signed) |
| PDF rendering in browser | Agent must see the actual document before sending | HIGH | PDF.js or similar; must render Utah standard forms accurately |
| Signature field placement on PDF | Core document workflow; agent must designate where client signs | HIGH | Drag-and-drop UI on PDF canvas; position stored as x/y/page coordinates |
| Email delivery of signing link to client | How client receives the document; no email = no signing | MEDIUM | Unique tokenized URL per document; Resend or Postmark |
| Client signing experience (no account) | Clients will not create accounts; friction here loses signatures | HIGH | Anonymous token-based access; canvas signature capture; mobile-friendly |
| Signed document storage and retrieval | Agent must be able to access signed documents after the fact | MEDIUM | Secure file storage (S3 or similar); associated with client record |
| Audit trail (IP, timestamp, signature image) | Legal requirement under ESIGN Act and UETA for enforceability | HIGH | Must capture: signer IP address, timestamp, user agent, drawn signature image embedded into PDF |
| Tamper-evident record after signing | Legal requirement; document must be provably unmodified after signing | HIGH | PDF hash or cryptographic seal after signature embed; store original + signed versions |
### Differentiators (Competitive Advantage) ### Differentiators (Competitive Advantage)
Features that set the product apart. Not valuable to every agent, but meaningful here. Features that go beyond the baseline — meaningful specifically for this app's single-agent, Utah-forms workflow.
| Feature | Value Proposition | Complexity | Notes | | Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------| |---------|-------------------|------------|-------|
| Custom branded signing experience | DocuSign and HelloSign look like DocuSign and HelloSign; this looks like Teressa's business | MEDIUM | Branded email, branded signing page, Teressa's colors/logo throughout | | AI field placement via gpt-4o-mini | Teressa can click one button and have all signature, initials, date, checkbox, and text fields placed on a Utah standard form automatically, eliminating the manual drag-drop session for known forms | HIGH | **Critical constraint:** gpt-4o does not reliably return accurate pixel coordinates from images (< 3% accuracy in bounding box studies). Correct approach: extract PDF text with positional metadata via pdfjs-dist (pdf-lib creates and modifies PDFs but does not expose text extraction), then use gpt-4o-mini for semantic understanding of field labels ("Buyer's Signature", "Date", "Initial Here") and return normalized page coordinates calculated from the text's own bounding box data, not from vision inference. AI does the label-understanding; the PDF SDK does the coordinate math. |
| No client account required | Zero friction for clients — one link, one click, one signature; competitors require accounts | LOW | Token in URL grants access; no login wall | | AI pre-fill of text fields | Client name, property address, and date fields can be populated from client profile data + property data without manual typing in PreparePanel | MEDIUM | gpt-4o-mini maps known data (client.firstName, client.lastName, client.email, property.address) to identified text fields by label proximity. Low risk of hallucination because input is structured profile data, not free text inference. Model simply matches "Buyer" label to client name. |
| Agent-fills-then-client-signs workflow | Teressa fills property/client details before sending; client only signs — matches how real estate actually works | MEDIUM | Two-phase form flow: agent prep mode vs. client sign mode | | Agent saved signature (draw once, reuse) | Agent draws signature once on first use; it is stored as a PNG data URL and applied to any agent signature field with one click — eliminating per-document re-drawing | MEDIUM | Standard in every major real estate e-sig tool (DocuSign, dotloop, SkySlope); agents expect this; re-drawing every document is a daily friction point. Stored in agent profile. Can be cleared and redrawn. Display a "preview" of the saved signature before applying. |
| Forms library import from utahrealestate.com | Teressa already has access and uses these forms; importing avoids manual re-entry | HIGH | Session-based auth to utahrealestate.com; parse/download available forms; legal consideration: forms may be member-only | | Property address on client profile | Teressa adds the property address to the client record so AI pre-fill knows what to insert into property address fields across all documents for that transaction | LOW | Simple field addition to client profile schema; enables AI pre-fill to be property-specific without Teressa re-typing the address on every document |
| Heuristic signature field detection | Auto-detect likely signature zones on Utah standard forms; reduce agent setup time | HIGH | Pattern matching on PDF text/whitespace; Utah forms have predictable structure; manual override always available |
| Document status tracking (sent/viewed/signed) | Agent knows if client opened the link; can follow up proactively | MEDIUM | Link open tracking via redirect; signed confirmation webhook/callback |
| Hyper-local SEO content structure | Neighborhood guides and local market content rank for Utah-specific searches | LOW | Content structure in Next.js; agent provides copy; builds over time |
| Listings tied to agent brand | Listings display under Teressa's brand, not a third-party portal | HIGH | WFRMLS feed integration; listing detail pages on teressacopelandhomes.com |
### Anti-Features (Commonly Requested, Often Problematic) ### Anti-Features (Commonly Requested, Often Problematic)
Features that seem good but create problems.
| Feature | Why Requested | Why Problematic | Alternative | | Feature | Why Requested | Why Problematic | Alternative |
|---------|---------------|-----------------|-------------| |---------|---------------|-----------------|-------------|
| Client login portal / accounts | "Clients should be able to check document status" | Adds auth complexity, email verification flows, password resets, and security surface; clients rarely return to portals; v1 is one-time signing | Email-based status updates when document is signed; agent dashboard is source of truth | | AI generates pixel-perfect field placement from visual PDF analysis | "Just have AI look at the PDF image and tell me where the fields are" | gpt-4o and Claude 3.5 Sonnet return accurate pixel coordinates less than 3% of the time in bounding box studies (February 2025 research); placing a signature field 200px from the actual signature line produces a broken document | Extract text positions from the PDF's own data structures via pdfjs-dist (pdf-lib does not expose text extraction); use AI for label-matching only; compute coordinates from extracted text bounding boxes |
| Notification/reminder automation for unsigned documents | "I want automatic follow-ups" | Requires scheduling infrastructure (cron/queue), unsubscribe handling, deliverability management; high complexity for low-frequency use | Manual reminder workflow; agent sees unsigned status in dashboard and sends manual follow-up email; add automation only when confirmed as real pain point | | AI auto-fill replaces the agent's PreparePanel review | "AI should fill everything and I just hit Send" | For legal documents, pre-fill values that are wrong create liability; purchase price, closing date, earnest money must be agent-verified before the document goes to a client | AI pre-fills obvious known values (names, address); agent reviews in PreparePanel and edits before sending; preview step is the final check |
| Multi-agent / team support | "What if I hire someone?" | Adds role/permissions model, shared document ownership, audit trails per agent; doubles auth complexity | Solo agent model only in v1; revisit if business grows | | AI confidence thresholds surface to the agent | "Show me which fields AI isn't sure about" | Confidence scoring UX adds noise; solo agent workflow doesn't need a triage interface; if a field is missed, the agent sees it in preview | If AI misses a field, agent adds it manually in the existing drag-drop UI; preview step catches any gaps |
| Native iOS/Android app | "Clients prefer apps" | Separate codebase, app store approval, push notification infrastructure; signing on mobile web is fully viable | Responsive PWA-quality mobile web; signing canvas works in mobile browsers; over 50% of e-signatures now happen on mobile web successfully | | Signature type-in fallback for agent signature | "Let agent type their name as their signature" | Typed signatures have lower perceived legal weight in real estate; clients may question them; canvas signature is the industry standard for agent counter-signatures | Canvas-drawn signature stored and reused; never degrade to typed for agent's own signature |
| Real-time collaboration / live signing sessions | "Sign together on a video call" | WebSocket infrastructure, session coordination, conflict resolution; extreme complexity for rare use case | Async signing via link is industry standard; agents schedule signing calls separately | | Per-document redraw option for agent signature | "Maybe I want to sign slightly differently for important contracts" | Adds decision fatigue at the exact moment agents want speed; the value of saved signature is consistency and one-click application | Store one saved signature; provide a "Re-draw my signature" option in agent profile settings only, not in the per-document flow |
| Mortgage calculator / affordability tools | Common on real estate sites | Third-party data dependency; not core to Teressa's workflow; dilutes site focus | Link to trusted external tools (CFPB mortgage calculator) | | Fully locked preview (no corrections possible) | "The preview is just a review step — agent can't change anything" | If the preview reveals a mistake (wrong address pre-filled, agent signature in wrong position), the agent needs to correct it before sending; a read-only preview that requires going back to square one increases frustration | Preview includes a back-to-edit button; light inline corrections (move a field, fix a pre-filled value) should be possible from or after the preview |
| Full IDX search with saved searches and alerts | "Clients want to browse all listings" | Full IDX integration requires WFRMLS enrollment ($50 + ~$10/month data feed via approved vendor), saved searches need client accounts, and email automation; overkill for a personal agent site | Display Teressa's active listings only; clients searching the broader Utah market use utahrealestate.com, Zillow, or Redfin |
| AI chatbot | "24/7 lead capture" | Adds LLM costs, prompt engineering, hallucination risk for legal/real estate queries; for a solo agent with a personal brand, a chatbot feels impersonal and off-brand | Clear contact form; visible phone number; fast personal email response is the real differentiator | ---
| DocuSign/HelloSign integration | "Just use an existing service" | $25-50+/month recurring for a solo agent; loses brand control; client experience carries third-party branding; 97% of agents already use e-sig but that means DocuSign-fatigue is real | Custom in-house signature capture (already decided); lower cost, full brand control, Teressa-branded experience |
| Blockchain/smart contract signing | Sounds modern | Zero adoption in Utah residential real estate; no tooling the industry accepts; legal standing unclear in Utah courts | Standard ESIGN/UETA-compliant audit trail is legally sufficient and well understood by Utah courts |
| Blog / content marketing hub | SEO traffic over time | Meaningful payoff requires 6-12+ months of consistent publishing; solo agent rarely has bandwidth; abandoned blog hurts credibility | One strong neighborhood page is worth more than a dozen generic posts; defer until cadence is proven |
| In-app PDF editing (content editing, not field placement) | "Fix typos in the form before sending" | Real estate contracts have legally mandated language; editing form content creates liability and REALTORS association compliance issues | Treat PDFs as read-only containers; only add/position signing fields on top; edit the source form in utahrealestate.com if needed |
| SMS / text signing notifications | "Higher open rates than email" | Requires phone number collection, TCPA compliance, SMS provider setup and per-message cost; adds friction to the sending flow | Email-only is sufficient; clients are conditioned to DocuSign-style email delivery; revisit only if open rates prove to be a problem |
## Feature Dependencies ## Feature Dependencies
``` ```
[Agent Login] [v1.0 Client Management]
└──requires──> [Secure Auth (Next-Auth or similar)] └──enables──> [Property Address Field on Client Profile] (new field addition)
└──enables──> [Agent Dashboard] └──enables──> [AI Pre-fill of Text Fields]
└──enables──> [Client Management]
└──enables──> [Document Workflow]
[Document Workflow] [v1.0 PDF Upload + Rendering]
└──requires──> [PDF Rendering in Browser] └──enables──> [AI Field Detection] (needs PDF text extraction)
└──requires──> [Signature Field Placement UI] └──enables──> [Expanded Field Types] (builds on existing field overlay system)
└──requires──> [File Storage (S3)]
└──requires──> [Email Delivery Service]
└──enables──> [Client Signing Link]
└──requires──> [Token-Based Anonymous Access]
└──requires──> [Canvas Signature Capture]
└──requires──> [Audit Trail Capture (IP + timestamp)]
└──enables──> [PDF Signature Embed]
└──enables──> [Signed Document Storage]
[Listings Display] [v1.0 Drag-Drop Field Placement UI]
└──requires──> [WFRMLS/utahrealestate.com Integration] └──extended by──> [Expanded Field Types (checkbox, initials, date, agent sig)]
└──independent of──> [Document Workflow] └──enhanced by──> [AI Field Detection] (pre-populates fields; agent adjusts)
[Forms Library Import] [v1.0 PreparePanel (text fill)]
└──requires──> [utahrealestate.com credential-based session] └──enhanced by──> [AI Pre-fill] (populates known values automatically)
└──enhances──> [Document Workflow] (pre-populated PDFs) └──enables──> [Filled Document Preview] (preview shows PreparePanel values rendered)
[Heuristic Field Detection] [Agent Saved Signature]
└──enhances──> [Signature Field Placement UI] └──requires──> [Agent Profile / Settings storage]
└──independent of──> [Canvas Signature Capture] └──enables──> [Agent Signature Field Type] (one-click apply)
└──enables──> [Agent Signs First Workflow]
[Document Status Tracking] [Agent Signs First Workflow]
└──requires──> [Email Delivery Service] └──requires──> [Agent Saved Signature]
└──enhances──> [Agent Dashboard] └──requires──> [Agent Signature Field Type]
└──enables──> [Filled Document Preview] (preview makes most sense after agent signs)
[Filled Document Preview]
└──requires──> [Agent Signs First Workflow] (or at minimum: PreparePanel fill complete)
└──requires──> [Expanded Field Types] (all field types must render correctly in preview)
└──gates──> [Send to Client] (Send button lives in or after the preview step)
``` ```
### Dependency Notes ### Dependency Notes
- **Document Workflow requires PDF Rendering:** The agent must see the document to place fields. PDF.js is the standard browser-side renderer; server-side PDF manipulation (pdf-lib or PDFKit) is needed for embedding signatures. - **AI Field Detection requires PDF text extraction, not vision:** The correct implementation extracts text with positional bounding boxes from the PDF (via pdfjs-dist `getTextContent()`; pdf-lib creates and modifies PDFs but does not expose text extraction) and passes labeled text positions to gpt-4o-mini for semantic field-type classification. The AI returns the field type plus the extracted text object to anchor to; the coordinate calculation uses the text object's own `transform` data. This avoids the coordinate-from-vision accuracy problem entirely.
- **Client Signing requires Token-Based Access:** The token in the email link is the client's identity. It must be single-use or expiring; no account required.
- **Audit Trail requires Canvas Signature Capture:** The signature image, IP, and timestamp are all captured at the moment of signing; they must be written together atomically.
- **Listings Display is independent of Document Workflow:** These are two separate subsystems that share only the marketing site shell. WFRMLS integration can be built or skipped without affecting document signing.
- **Forms Library Import enhances but does not block Document Workflow:** The agent can manually upload PDFs if import fails. Import is a convenience, not a prerequisite.
- **Heuristic Field Detection enhances Placement UI:** It pre-populates suggested field positions. Agent always has final control. Failure of detection degrades gracefully — agent places fields manually.
## MVP Definition - **Expanded field types extend the existing field overlay system:** The v1.0 signature field is already rendered as an overlay div on the PDF canvas with stored x/y/page/width/height. Adding checkbox, initials, date, and agent signature types means adding field-type discriminants to the same data model and rendering different UI chrome per type. Not a new system — an extension.
### Launch With (v1) - **Agent Saved Signature is a prerequisite for Agent Signs First:** The one-click agent signing workflow only makes sense if the agent's signature is already stored. Without it, the agent would need to draw their signature in the agent-facing flow, which is equivalent to the client signing flow — functional but not the intended UX.
Minimum viable product — what's needed to validate the concept. - **Filled Document Preview gates Send:** The preview is the final checkpoint. Send to Client should only be reachable from the preview screen (or with explicit bypass). This prevents the common mistake of sending the wrong document version.
- [ ] Marketing site with agent photo, bio, contact form, and testimonials — establishes professional presence - **Date fields must be non-editable by the client at signing time:** Date fields auto-populate at the moment the client completes the signing session. This is the DocuSign "Date Signed" pattern — a read-only auto-stamped field, not a calendar picker. The client cannot alter the date. This is legally important (it records when they actually signed, not a date they type in).
- [ ] Active listings pulled from WFRMLS/utahrealestate.com and displayed on site — proves integration works
- [ ] Agent login — gates the document portal
- [ ] Client management: create/edit clients with name and email — minimum data model
- [ ] PDF upload and browser rendering — agent can see the form
- [ ] Signature field placement UI (drag-and-drop on PDF canvas) — core document prep workflow
- [ ] Email delivery of unique signing link to client — how client gets the document
- [ ] Token-based anonymous client signing page — client opens link, sees PDF, draws signature
- [ ] Audit trail capture: IP, timestamp, drawn signature image — legal requirement
- [ ] PDF signature embed and signed document storage — completes the sign-and-store loop
- [ ] Agent dashboard showing document status (draft / sent / signed) — agent knows what's done
### Add After Validation (v1.x) - **Checkbox fields must be signer-assigned to function:** Based on dotloop behavior, checkboxes default to "no one" if not explicitly assigned to a signer role. In this app's model, the agent assigns each checkbox to either the agent role (pre-checked during agent signing) or the client role (checked by client during signing). Unassigned checkboxes that are pre-checked by the agent are treated as acknowledgment fields (not interactive for the client).
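The field-type, date-stamp, and checkbox-assignment notes above can be read as one data-model change. The following is a minimal sketch, assuming a discriminated union over the existing x/y/page/width/height overlay record; every type and function name here (`DocumentField`, `isClientInteractive`, `stampDates`) is hypothetical and not taken from the codebase.

```typescript
type SignerRole = "agent" | "client";

// Geometry shared by every overlay field, mirroring the stored
// x/y/page/width/height described for the v1.0 signature field.
interface BaseField {
  id: string;
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical discriminated union: `type` selects per-type data and UI chrome.
type DocumentField =
  | (BaseField & { type: "signature"; imageDataUrl?: string })
  | (BaseField & { type: "initials"; imageDataUrl?: string })
  | (BaseField & { type: "agent-signature"; imageDataUrl?: string })
  | (BaseField & { type: "date"; signedAt?: string })
  | (BaseField & { type: "checkbox"; signer?: SignerRole; checked: boolean })
  | (BaseField & { type: "text"; value: string });

// Whether the client can act on a field while signing.
// Assumption: text fields are filled by the agent in PreparePanel only.
function isClientInteractive(field: DocumentField): boolean {
  switch (field.type) {
    case "signature":
    case "initials":
      return true;
    case "checkbox":
      // Unassigned or agent-assigned checkboxes are not interactive for the client.
      return field.signer === "client";
    case "date": // auto-stamped, never typed by the client
    case "agent-signature": // already applied by the agent during prep
    case "text":
      return false;
  }
}

// Stamp every date field with the moment signing completed.
// Returns new objects; the input array is not mutated.
function stampDates(fields: DocumentField[], completedAt: Date): DocumentField[] {
  const iso = completedAt.toISOString();
  return fields.map((f) => (f.type === "date" ? { ...f, signedAt: iso } : f));
}
```

Keeping the stamp in a single server-side function means the client never supplies the date value, which matches the read-only "Date Signed" behavior described above.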
Features to add once core is working. ---
- [ ] Forms library import from utahrealestate.com — eliminates manual PDF upload step; add when upload friction is confirmed as a pain point ## v1.1 Feature Definitions
- [ ] Heuristic signature field detection on Utah standard forms — reduces agent prep time; add when field placement is confirmed as the biggest time cost
- [ ] Document status tracking (link opened / viewed) — add when agents ask "did my client open it?"
- [ ] Signed document download for client (PDF link in confirmation email) — add when clients report needing their copy
- [ ] Multiple signature fields per document (initials, date fields, checkboxes) — add when agents hit limits of single-signature flow
### Future Consideration (v2+) ### v1.1 Launch Features (the four areas)
Features to defer until product-market fit is established. - [ ] **Property address on client profile** — add `propertyAddress` field to client schema; display and edit in client detail view; used as AI pre-fill source for property address fields in documents
- [ ] **Expanded field types: checkbox** — checkbox field type in field placement sidebar; renders as empty square overlay; at signing: client taps to check; stores boolean checked state in document record; embeds checkmark into final PDF
- [ ] **Expanded field types: initials** — initials field type; renders as smaller signature canvas at signing time; same capture mechanics as signature (canvas drawing, stored as PNG); labeled "Initial Here" vs "Sign Here"
- [ ] **Expanded field types: date** — date field type; renders as read-only overlay; auto-populates with signing timestamp when client completes signing session; client cannot edit; embed formatted date string into final PDF
- [ ] **Expanded field types: agent signature** — agent signature field type placed by agent during prep; displays agent's saved signature image in preview and in the document sent to client; client sees it as already-signed; not interactive for client
- [ ] **Agent saved signature** — canvas draw interface on agent profile/settings; stores as base64 PNG; persists across sessions; "Re-draw" clears and prompts re-capture; shown as thumbnail in profile settings
- [ ] **Agent signs first workflow** — after field placement and PreparePanel fill, agent applies saved signature to all agent-signature fields; clicking an agent-sig field applies the stored PNG; all agent-sig fields must be filled before Proceed to Preview is enabled
- [ ] **AI field placement** — single "Auto-place Fields" button in the prep flow; PDF text extracted via pdfjs-dist with bounding boxes; payload sent to gpt-4o-mini with label-to-field-type mapping prompt; AI returns field type + text anchor identifier; system computes coordinates from text bounding box; places fields on PDF overlay; agent reviews and adjusts; always falls back gracefully to manual placement if AI returns nothing
- [ ] **AI pre-fill** — after AI field placement, text fields with high-confidence label matches (buyer name, seller name, property address, today's date) are pre-populated from client profile data; agent reviews in PreparePanel; agent can edit any pre-filled value
- [ ] **Filled document preview** — full-page PDF render with all overlays visible: pre-filled text values, agent signature rendered in-place, all client-facing fields shown with their type chrome (empty sig box, unchecked checkbox, "Initial Here" box, date placeholder); Send to Client button lives here; Back to Edit button returns to PreparePanel
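The AI pre-fill item above maps structured profile data onto labeled text fields. As a rough illustration of that mapping step only, here is a rule-based stand-in for the gpt-4o-mini label match. The interfaces, the `prefillTextFields` helper, and the keyword rules are all assumptions made for this sketch; the real matching would come from the model's response.

```typescript
// Hypothetical shapes; the real client schema may differ.
interface ClientProfile {
  firstName: string;
  lastName: string;
  propertyAddress?: string;
}

interface LabeledTextField {
  id: string;
  label: string; // label text found near the field, e.g. "Buyer Name"
  value: string; // empty string means "not filled yet"
}

// Fill empty text fields from profile data; never overwrite a value the
// agent already typed, and leave unknown labels untouched for manual entry.
function prefillTextFields(
  fields: LabeledTextField[],
  client: ClientProfile,
): LabeledTextField[] {
  const fullName = `${client.firstName} ${client.lastName}`.trim();
  return fields.map((field) => {
    if (field.value !== "") return field;
    const label = field.label.toLowerCase();
    if (label.includes("property address") && client.propertyAddress) {
      return { ...field, value: client.propertyAddress };
    }
    if (label.includes("buyer") && label.includes("name")) {
      return { ...field, value: fullName };
    }
    return field;
  });
}
```

Fields such as purchase price or closing date deliberately fall through unfilled, so the agent still enters and verifies them in PreparePanel.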
- [ ] Notification/reminder system for unsigned documents — requires scheduling infrastructure; defer until manual follow-up is confirmed as painful ### Defer to v1.2
- [ ] Bulk document send (same form to multiple clients) — edge case for v1 solo agent use; revisit if volume grows
- [ ] Agent-side annotation / markup on PDFs (comments, notes to client) — nice to have; adds complexity to PDF rendering pipeline
- [ ] Client portal with document history — requires client accounts; out of scope for v1 by design
- [ ] Multi-agent / brokerage support — business model expansion; not current scope
## Feature Prioritization Matrix - [ ] **AI confidence display to agent** — show which fields AI was uncertain about; adds UI complexity; agent can see and correct in preview instead; defer until volume of AI misses proves it necessary
- [ ] **Template save from AI placement** — save AI-generated field layout as a reusable template per form type; high value but requires template management UI; defer until AI placement is validated as accurate
- [ ] **Multiple agent signature fields** — currently one agent sig is the typical pattern; multiple is possible with the field system but needs UX thought (which signature applies where); defer
---
## Feature Prioritization Matrix (v1.1)
| Feature | User Value | Implementation Cost | Priority | | Feature | User Value | Implementation Cost | Priority |
|---------|------------|---------------------|----------| |---------|------------|---------------------|----------|
| Marketing site (photo, bio, CTA) | HIGH | LOW | P1 | | Agent saved signature | HIGH | LOW | P1 |
| Active listings display (WFRMLS) | HIGH | HIGH | P1 | | Initials field type | HIGH | LOW | P1 |
| Agent login / auth | HIGH | MEDIUM | P1 | | Date field (auto-stamp) | HIGH | LOW | P1 |
| Client management (CRUD) | HIGH | LOW | P1 | | Checkbox field type | HIGH | LOW | P1 |
| PDF upload and browser rendering | HIGH | MEDIUM | P1 | | Agent signature field type | HIGH | MEDIUM | P1 |
| Signature field placement UI | HIGH | HIGH | P1 | | Agent signs first workflow | HIGH | MEDIUM | P1 |
| Email delivery of signing link | HIGH | MEDIUM | P1 | | Filled document preview | HIGH | MEDIUM | P1 |
| Token-based anonymous client signing | HIGH | MEDIUM | P1 | | Property address on client profile | HIGH | LOW | P1 |
| Canvas signature capture | HIGH | MEDIUM | P1 | | AI pre-fill from profile data | HIGH | MEDIUM | P1 |
| Audit trail (IP + timestamp + image) | HIGH | MEDIUM | P1 | | AI field placement (gpt-4o-mini + text extraction) | HIGH | HIGH | P1 |
| PDF signature embed | HIGH | HIGH | P1 | | AI field placement template save | MEDIUM | HIGH | P3 |
| Signed document storage | HIGH | MEDIUM | P1 | | AI confidence display to agent | LOW | MEDIUM | P3 |
| Agent dashboard / document status | HIGH | MEDIUM | P1 |
| Forms library import (utahrealestate.com) | MEDIUM | HIGH | P2 |
| Heuristic field detection | MEDIUM | HIGH | P2 |
| Document open/view tracking | MEDIUM | LOW | P2 |
| Signed document email to client | MEDIUM | LOW | P2 |
| Multiple field types (initials, date, checkbox) | MEDIUM | MEDIUM | P2 |
| Neighborhood guides / SEO content | LOW | LOW | P2 |
| Unsigned document reminders | LOW | HIGH | P3 |
| Client portal with document history | LOW | HIGH | P3 |
| Bulk document sending | LOW | MEDIUM | P3 |
| PDF annotation / markup | LOW | HIGH | P3 |
**Priority key:** **Priority key:**
- P1: Must have for launch - P1: Must have for v1.1 launch
- P2: Should have, add when possible - P2: Should have, add when possible
- P3: Nice to have, future consideration - P3: Nice to have, future consideration
## Real Estate Agent Site Specifics ---
### What Makes Real Estate Agent Sites Work ## Behavioral Specifications by Feature
**The hero section is the trust handshake.** Website visitors form an opinion in under one second. For a solo agent like Teressa, the hero must include: a warm, professional photo (not stock), a one-sentence value proposition, and a primary CTA ("See My Listings" or "Get in Touch"). The photo choice matters — action shots or candid poses perform better than formal headshots at conveying approachability. Refresh every 18-24 months. ### (1) AI-Assisted Field Detection
**Bio should lead with client benefit, not credential list.** The most common real estate bio mistake is listing designations and years of experience. The bio's job is lead generation. It should speak directly to the reader's situation, include one specific achievement or number, and close with a call to action. 150-200 words is right — enough to establish trust, short enough to actually be read. Avoid AI-generated copy; it reads as boilerplate and undermines the personal brand. **How it works in production tools:** Apryse, Instafill, and DocuSign Iris (January 2026) all use a hybrid approach: PDF structure parsing to extract text positions, then LLM for semantic classification. Raw vision-based bounding box inference is not production-viable — GPT-4o returns accurate pixel coordinates in under 3% of attempts in published benchmarks.
**Testimonials belong on the homepage, not only on a reviews page.** 88% of consumers trust online reviews as much as personal recommendations. Testimonials with client photos outperform text-only significantly. Recency matters — old testimonials hurt. The most powerful placement is: one or two short pull quotes below the hero, and a fuller testimonials section further down the page. Sourcing from Google or Zillow reviews adds third-party credibility. **Accuracy expectations:** On structured Utah standard forms (REPC, listing agreements, addenda), which have consistent label patterns ("Buyer's Signature", "Seller's Initials", "Date", "Property Address"), label-matching accuracy should be HIGH (90%+) because the label text is predictable. The risk is on non-standard addenda or scanned/image-based PDFs where text extraction fails.
**Listings are proof of work.** Displaying Teressa's active listings is a trust signal as much as a utility. It says: this agent is active, this site is current, this person knows the market. Each listing card should show: property photo, price, address, beds/baths/sqft, and a link to the detail page. WFRMLS covers 97% of Utah listings. Enrollment costs $50 one-time + ~$10/month for the data feed; approved vendors include IDX Broker, Showcase IDX, and Realtyna. **Fallback hierarchy:**
1. PDF has extractable text → text extraction + AI label matching → place fields at text positions (HIGH confidence path)
2. PDF is image-based (scanned) → AI cannot extract text → show message "Auto-placement requires a text-based PDF. Place fields manually." → agent falls back to drag-drop
3. AI returns no fields or low-confidence results → show "AI couldn't detect fields confidently. Review the placement or add fields manually." → pre-filled fields from AI are shown as suggestions with a different visual treatment; agent confirms or removes each one
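The fallback hierarchy above can be sketched as a single decision function. This is a minimal illustration, not the app's actual code: `DetectedField`, `PlacementOutcome`, `choosePlacementPath`, and the `MIN_CONFIDENCE` cutoff are all hypothetical names and values introduced here.

```typescript
// Hypothetical shapes for illustration; none of these names come from the codebase.
type DetectedField = { label: string; confidence: number };

type PlacementOutcome =
  | { path: "auto"; fields: DetectedField[] }
  | { path: "suggestions"; fields: DetectedField[]; message: string }
  | { path: "manual"; message: string };

// Assumed cutoff for "low confidence"; it would need tuning against real forms.
const MIN_CONFIDENCE = 0.8;

function choosePlacementPath(
  extractedTextItems: number,
  aiFields: DetectedField[],
): PlacementOutcome {
  // Path 2: no extractable text means the PDF is image-based (scanned).
  if (extractedTextItems === 0) {
    return {
      path: "manual",
      message: "Auto-placement requires a text-based PDF. Place fields manually.",
    };
  }

  // Path 3: nothing detected, or at least one field below the cutoff.
  const hasLowConfidence = aiFields.some((f) => f.confidence < MIN_CONFIDENCE);
  if (aiFields.length === 0 || hasLowConfidence) {
    return {
      path: "suggestions",
      fields: aiFields,
      message:
        "AI couldn't detect fields confidently. Review the placement or add fields manually.",
    };
  }

  // Path 1: text-based PDF with confident label matches.
  return { path: "auto", fields: aiFields };
}
```

Returning a tagged union lets the UI switch on `path` and render the matching message or suggestion styling without re-deriving the reason.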
**Contact friction kills conversions.** The contact form should ask for three fields: name, email, message. Every additional field reduces submission rate. Phone number should be visible in the site header — real estate clients often prefer to call. A "schedule a call" link (Calendly or equivalent) converts higher than an open-ended message form. **What the agent sees:** A "Detecting fields..." loading state (2-5 seconds), then fields appear on the PDF overlay. A summary count: "AI placed 14 fields — 8 signature, 3 date, 2 initials, 1 checkbox. Review and adjust." Misplaced fields can be dragged or deleted.
**Warm/approachable is the brief; make it specific.** This is not a corporate brokerage site. The design should feel like it belongs specifically to Teressa — her colors, her copy voice, her photo — not Real Estate Website Template #47. Soft palette, generous white space, readable typography, and photography that shows Teressa as a real person create the brand differentiation that generic IDX portal sites can't compete with. **gpt-4o-mini vs gpt-4o:** Use gpt-4o-mini for this task. The task is classification and mapping (semantic understanding of form labels), not complex reasoning. gpt-4o-mini handles this at lower cost (list pricing of roughly $0.15 input / $0.60 output per million tokens vs gpt-4o's $2.50 / $10). At Teressa's volume (solo agent, dozens of documents per month), either is negligible cost.
**Mobile is the primary device.** Over 60-70% of real estate web traffic is mobile. The listings grid, the bio, the contact form, and critically the signing flow must all work flawlessly on a phone. iOS Safari and Android Chrome are the primary targets. Test all touch interactions, font sizes, and image loading on actual devices. ### (2) Expanded Field Types at Signing Time
**Initials:** Renders as a smaller version of the signature canvas ("Initial Here" label instead of "Sign Here"). Client draws their initials. Same capture mechanics as signature (canvas, PNG data URL). Stored separately from signature so audit trail distinguishes "initialed page 3" from "signed on page 8". The industry convention on Utah forms is to initial every page — this field type handles that pattern.
**Checkboxes:** Renders as an empty square (24x24px). Client taps or clicks to toggle. A checkmark renders inside on selection. Toggle is allowed before final submission (client can uncheck and recheck). At submission, the checked/unchecked state is embedded as a drawn checkmark or empty box in the PDF. Checkboxes assigned to agent are pre-checked during agent signing flow and appear pre-checked when client opens the document (read-only from client perspective).
**Date fields:** Auto-stamp pattern only — no calendar picker for the client. The field shows a placeholder ("Date will be filled at signing") when the agent previews. At the moment the client clicks "Complete Signing", all date fields are stamped with the signing timestamp (recorded in UTC) formatted as a local calendar date (e.g., "March 21, 2026"). This matches DocuSign's "Date Signed" field behavior. The client cannot type a date. This is the legally correct approach — it records when they actually signed.
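A minimal sketch of the auto-stamp formatting, assuming the signing instant is captured as a UTC `Date` and rendered in a chosen IANA time zone. The function name `formatSigningDate` and the `America/Denver` zone are assumptions for illustration; the source does not say which zone defines "local".

```typescript
// Hypothetical helper: formats a UTC signing instant as a long-form local date.
function formatSigningDate(signedAtUtc: Date, timeZone: string): string {
  return new Intl.DateTimeFormat("en-US", {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone, // e.g. "America/Denver" for Utah (assumed, not from the source)
  }).format(signedAtUtc);
}

// 03:00 UTC on March 22 is still the evening of March 21 in Utah.
const stamped = formatSigningDate(new Date("2026-03-22T03:00:00Z"), "America/Denver");
console.log(stamped);
```

The zone matters near midnight: the same instant formats as March 22 in UTC, so the stamp should use whichever zone the document treats as local.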
**Agent signature fields:** The agent places these during prep. During agent signing (before preview), clicking an agent-sig field applies the stored PNG. The PNG renders at the field's bounding box dimensions. When the document is sent to the client, these fields are already filled and read-only. The client cannot modify or remove the agent's signature.
### (3) Agent Counter-Signature Flow
**Sign-first is the industry norm:** In real estate, the agent (representing one party) typically signs first to show they've reviewed and authorized the document, then it routes to the client. This is confirmed by DocuSign's recommended routing for real estate: agent at Order 1, client at Order 2. Dotloop's auto-date behavior applies the same timestamp logic to the agent's first signature.
**Saved vs. per-document redraw:** All major tools store a saved signature. DocuSign calls it "Adopted Signature." Dotloop stores it per account. SkySlope DigiSign stores it account-wide. Per-document redraw exists in some tools as an option but agents rarely use it. The saved pattern is the standard because: (a) agents sign dozens of documents per transaction, (b) consistency across documents matters for authenticity, (c) re-drawing adds 30-60 seconds per document that adds up across a week.
**Implementation pattern for this app:** Agent Profile page has a "My Signature" section with a canvas draw interface. On first use (when an agent tries to apply a signature to a field), if no signature is saved, the app prompts them to draw and save one. After saving, applying is one click. "Re-draw Signature" in profile settings clears and prompts a new canvas session.
**What happens if no agent signature fields exist in the document:** If the agent places no agent-sig fields, the "Agent Signing" step is skipped and the flow goes directly from PreparePanel to Preview. The workflow should not require agent signature — it should be optional based on whether agent-sig fields were placed.
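The optional agent-signing step could be decided with a check like the one below. This is a sketch under assumptions: the `FieldType` union, the `PlacedField` shape, and the step names are hypothetical and not taken from the codebase.

```typescript
// Hypothetical field and step types for illustration only.
type FieldType =
  | "signature"
  | "initials"
  | "date"
  | "checkbox"
  | "text"
  | "agent-signature";

type PlacedField = { id: string; type: FieldType };
type PrepStep = "agent-signing" | "preview";

// The agent-signing step appears only when at least one agent-sig field was placed.
function nextStepAfterPrepare(fields: PlacedField[]): PrepStep {
  const needsAgentSignature = fields.some((f) => f.type === "agent-signature");
  return needsAgentSignature ? "agent-signing" : "preview";
}
```

Deriving the step from the placed fields, rather than from a separate setting, keeps the workflow consistent with what is actually on the document.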
### (4) Document Preview Before Sending
**What users expect to see:** The industry standard preview (DocuSign, BoldSign, Adobe Sign) shows the document exactly as the recipient will see it: all filled fields rendered in-place, all empty fields visible with their interaction chrome (highlighted boxes for client-interactive fields). The agent is not in "edit mode" — they're seeing the client's view.
**What users expect to be able to do:** The most common expectation is a Back button to return to editing. Users do not expect to edit inline in the preview, but they do expect: (a) ability to scroll through all pages, (b) see a "Document is ready to send" or "X fields still need agent action" status, (c) confirm the send action from the preview. A "Download Preview" option (PDF download of the current state) is valued but not table stakes.
**Common pre-send mistake preview prevents:** Sending the wrong version of a document is the most-cited error in the industry. Agents also commonly send documents with unfilled agent signature fields (forgot to sign). The preview step catches both: the agent can see if their signature is missing from any agent-sig fields, and the document version is explicitly what they reviewed.
**Preview in the context of this app:** The preview should render using the same PDF.js rendering pipeline as the existing prep view, but with all overlay fields in "signed/filled" visual state rather than "editable" state. Pre-filled text values appear as rendered text on the PDF. Agent signature fields show the saved signature PNG. Client-interactive fields (signature, initials, checkbox) show their placeholder chrome (highlighted box, "Sign Here" label) so the agent can see what the client will see. Date fields show "—" or "Auto-fill on signing."
--- ---
## Document Signing UX Specifics ## Competitor Feature Analysis (v1.1 Scope)
### What Makes E-Signature Flows Work | Feature | DocuSign | Dotloop | SkySlope DigiSign | This App (v1.1) |
|---------|----------|---------|-------------------|-----------------|
**No client account creation — ever.** This is the single most important UX decision for the signing flow. The client clicks a link in an email and signs. Every additional step (create account, set password, verify email separately) is an abandonment driver. The unique signing URL is the authentication. This is how DocuSign, HelloSign, and every mainstream e-sig platform actually works for signers. | AI field placement | DocuSign Iris (Jan 2026) — AI auto-places via agreement analysis | No AI placement — template-based only | Smart Suite auto-audit, not field placement | gpt-4o-mini + pdfjs text extraction; Utah-specific label patterns |
| Initials field | Yes — condensed signature canvas | Yes — same as signature but labeled "Initial" | Yes | Canvas draw, labeled "Initial Here"; same capture as signature |
**Show where to sign immediately.** When the client opens the signing link, the first thing they should see is the document with the first unsigned field highlighted — not an explainer screen, not a terms acceptance wall. Signature fields should be visually distinct (yellow or blue highlighted box is the industry convention). Auto-scroll to the next unsigned field after each completion. | Date field behavior | "Date Signed" — auto-stamp, locked, client cannot edit | Auto-date/time stamp on signature and initials | Add time/date to signatures automatically | Auto-stamp at signing session completion; client cannot type a date |
| Checkbox field | Yes — toggle checkbox | Yes — binary check | Yes | Tap/click toggle; embed checkmark or empty box into PDF |
**Show progress through the document.** Clients signing a multi-page real estate contract don't know how long it takes. "Field 3 of 7" or a page indicator sets expectations and reduces abandonment. When clients can see they're almost done, they finish. Studies show streamlined signing UIs boost completion rates by up to 30%. | Agent saved signature | Yes — "Adopted Signature" stored account-wide | Yes — stored per account | Yes | Canvas-drawn, stored in agent profile as PNG |
| Agent signs first | Yes — Routing Order 1 agent, Order 2 client | Yes — sequential signing order supported | Yes | Agent signing step before preview; agent-sig fields pre-filled before client sees doc |
**Canvas signing must work well on mobile.** Over 50% of business signatures now happen on mobile. The canvas element must: be large enough to sign on a phone screen, respond correctly to touch events on iOS Safari (which handles touch differently than Chrome), provide a clear "Clear" button to redo, and produce a legible result. Always offer "type your name" as a fallback — not everyone is comfortable drawing on a phone. | Pre-send preview | Yes — "Preview" step shows filled document | Yes — document preview in loop before sharing | Yes | Full PDF render with all overlays in filled state; Send button lives in preview |
| AI pre-fill from contact data | Yes — Rooms for Real Estate prefills from transaction data | Yes — via transaction record | Yes — via transaction file | Pre-fill from client profile (name, email, property address) |
**The confirmation screen closes the loop.** After the last field is signed and submitted, the client needs: (1) an explicit success message, (2) confirmation that a copy is coming by email, and (3) optionally a download button. This is the moment the client feels done — ambiguity here creates follow-up calls.
**Agent-side field placement must be fast.** If placing fields takes more than 2-3 minutes per document, it becomes a daily friction point. Best UX: drag field types from a sidebar onto the PDF; resize with handles; delete with the Delete key or a trash icon. Auto-scroll across pages. Support at minimum: signature, initials, date (auto-populated at signing time), text input, checkbox. A "save this field layout as a template" option eliminates the most common repetitive task — Utah standard forms don't change.
**Status dashboard answers the one question agents ask daily: "Did they sign yet?"** A simple list of documents with status badges (Draft, Sent, Viewed, Signed) and the last-activity timestamp answers this without digging into individual records. A one-click "Resend Link" or "Send Reminder" button belongs here — not buried in a document detail view. Best practice reminder cadence: Day 1 (initial send), Day 3, Day 5, Day 7 (escalate to a call).
**Audit trail is quiet but legally essential.** The agent should never need to think about this during normal operation. But every signed document must silently record: signing timestamp, client IP address, email address used, and the drawn signature image embedded into the PDF. This data protects Teressa in any dispute. The signed document itself plus a basic certificate are what Utah courts understand and accept under ESIGN/UETA.
**Sub-3-second load time for the signing page.** Slow-loading signing pages erode trust — the client wonders if the link is legitimate. Optimize the PDF viewer initialization, use lazy loading for pages beyond the first, and keep the JavaScript bundle lean. A 3-second or faster load is the industry target.
--- ---
## Competitor Feature Analysis
| Feature | DocuSign Rooms | HelloSign / Dropbox Sign | Lone Wolf Authentisign | Our Approach |
|---------|---------------|--------------------------|------------------------|--------------|
| E-signature capture | Canvas + typed + uploaded | Canvas + typed | Canvas | Canvas-drawn only (v1); matches real estate norm |
| Audit trail | Full Certificate of Completion (IP, timestamp, actions) | Audit Report (IP, timestamp) | Basic trail | IP + timestamp + signature image; ESIGN/UETA compliant |
| Client account required | Yes | Yes | Yes | No — anonymous token link; differentiator |
| Branding | DocuSign branding | Dropbox branding | Lone Wolf branding | Teressa's brand throughout |
| Forms library | Generic; no Utah MLS integration | Generic | MLS-connected | utahrealestate.com import (v1.x); Utah-specific |
| Monthly cost | $25-50+/month | $15-25+/month | MLS membership fee | $0 incremental (custom built) |
| Agent-fills workflow | Yes, via templates | Yes, via templates | Yes | Yes — explicit two-phase prep + sign flow |
| MLS listing integration | No | No | Partial | WFRMLS/utahrealestate.com (full site integration) |
| Field detection | Template-based | Template-based | Template-based | Heuristic auto-detect on Utah standard forms (v1.x) |
## Sources ## Sources
**Real estate agent website features and best practices:** **AI field detection and coordinate accuracy:**
- [10 Must-Have Features for Real Estate Agent Websites — EuroDNS](https://www.eurodns.com/blog/10-must-have-features-for-high-performing-real-estate-agent-websites) - [Auto-detect PDF Form Fields with Smart Data Extraction — Apryse](https://apryse.com/blog/auto-detect-pdf-form-fields-with-smart-data-extraction)
- [15 Real Estate Web Design Features That Actually Drive Sales In 2025 — AltaStreet](https://www.altastreet.com/15-real-estate-web-design-features-that-actually-drive-sales-in-2025/) - [SAM 2 + GPT-4o Cascading Foundation Models — Edge AI and Vision Alliance (Feb 2025)](https://www.edge-ai-vision.com/2025/02/sam-2-gpt-4o-cascading-foundation-models-via-visual-prompting-part-2/) — confirms GPT-4o returns accurate bounding box coordinates in fewer than 3% of attempts
- [Real Estate Agent Website Design Trends 2025 — Fix8Media](https://www.fix8media.com/real-estate-agent-website-design-trends-2025) - [GPT-4o Model: Image Coordinate Recognition — OpenAI Community](https://community.openai.com/t/gpt-4o-model-image-coordinate-recognition/907625)
- [Real Estate Website Design Best Practices — AgentFire](https://agentfire.com/blog/real-estate-website-design-best-practices/) - [Case Study: Real Estate Law Flat PDF Form Automation — Instafill.ai (Feb 2026)](https://blog.instafill.ai/2026/02/18/case-study-real-estate-law-flat-pdf-form-automation/)
- [20 Real Estate Agent Bio Examples and Tips — InboundREM](https://inboundrem.com/real-estate-bio/) - [Intelligent Document Processing for Real Estate — Syntora](https://syntora.io/solutions/ai-document-processing-for-real_estate)
- [Real Estate Bio Examples: Copy-Paste Templates 2026 — Propphy](https://www.propphy.com/blog/real-estate-bio-examples-templates-2025) - [Using Confidence Scoring to Reduce Risk in AI-Driven Decisions — Multimodal.dev](https://www.multimodal.dev/post/using-confidence-scoring-to-reduce-risk-in-ai-driven-decisions)
- [How to Create Effective Real Estate Testimonials — Carrot](https://carrot.com/blog/effective-real-estate-testimonial/) - [Understanding Accuracy and Confidence in Azure Document Intelligence — Microsoft Learn](https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept/accuracy-confidence?view=doc-intel-4.0.0)
- [15 Real Estate Landing Page Best Practices — Landingi](https://landingi.com/landing-page/real-estate-best-practices/)
**WFRMLS / IDX integration:** **Field types and signing behavior:**
- [IDX Broker for Wasatch Front Regional MLS (WFRMLS)](https://www.idxbroker.com/mls/wasatch-front-regional-mls-wfrmls) - [Adding Signatures or Initials to Locked Templates — Dotloop Support](https://support.dotloop.com/hc/en-us/articles/217936457-Adding-Signatures-or-Initials-to-Locked-Templates)
- [Add Wasatch Front Regional MLS to Your Site — Showcase IDX](https://showcaseidx.com/mls-coverage/wasatch-front-regional-mls-wfrmls/) - [Understanding Signing — Dotloop Support](https://support.dotloop.com/hc/en-us/articles/202790063-How-clients-sign)
- [How to Connect WordPress to WFRMLS — Realtyna](https://realtyna.com/blog/how-to-connect-wordpress-website-to-wasatch-front-regional-mls-utah-wfrmls-with-organic-idx-mls-integration/) - [Am I able to auto-populate the date field? — DocuSign Community](https://community.docusign.com/esignature-111/am-i-able-to-auto-populate-the-date-field-2271)
- [IDX MLS Listings for WFRMLS of Utah — ProAgent Websites](https://www.proagentwebsites.com/wfr.html)
**Real estate document signing workflows:**
- [eSignature Document Workflow Software for Real Estate — SigniFlow](https://www.signiflow.com/esignature-document-workflow-software-for-the-real-estate-sector-signiflow/)
- [DocuSign for Real Estate & PandaDoc — PandaDoc](https://www.pandadoc.com/blog/docusign-real-estate-pandadoc-realtors/)
- [5 Real Estate Automation Tools Every Agent Needs in 2025 — RealOffice360](https://realoffice360.com/crm-blog-for-realtors/real-estate-workflow-automation-tools-2025)
- [DigiSign by SkySlope — E-Signature for Realtors](https://skyslope.com/products-services/digisign/) - [DigiSign by SkySlope — E-Signature for Realtors](https://skyslope.com/products-services/digisign/)
- [We Tested 15 Electronic Signature Tools That Streamline Real Estate Workflows — SignWell](https://www.signwell.com/resources/electronic-signature-for-real-estate/)
- [E-Sign Real Estate Contracts: The Agent's Guide — Market Leader](https://www.marketleader.com/blog/esign-real-estate-contracts/)
- [7 Document Collection Automation Tips for Real Estate Agents — UseCollect](https://www.usecollect.com/blog/7-document-collection-automation-tips-for-real-estate-agents/)
- [Lone Wolf Authentisign — Real estate's leading eSignature solution](https://www.lwolf.com/operate/esignature)
- [Electronic Signatures for Real Estate: 2026 Guide — DocuPilot](https://www.docupilot.com/blog/electronic-signature-for-real-estate) - [Electronic Signatures for Real Estate: 2026 Guide — DocuPilot](https://www.docupilot.com/blog/electronic-signature-for-real-estate)
- [Redefining Real Estate: The Document Automation Transformation — Experlogix](https://www.experlogix.com/blog/redefining-real-estate-the-document-automation-transformation)
**E-signature UX and PDF form field management:** **Agent counter-signature and routing order:**
- [Embedded Signing Experience Best Practices — eSignGlobal](https://www.esignglobal.com/blog/best-practices-embedded-signing-user-experience-ux) - [Prefill fields before sending envelope for signature — DocuSign Community](https://community.docusign.com/esignature-111/prefill-fields-before-sending-envelope-for-signature-180) — confirms agent at routing order 1, client at routing order 2 as real estate standard
- [eSignature UX: Digital Excellence — Emudhra](https://emudhra.com/en-us/blog/user-experience-in-esignatures-designing-for-digital-excellence) - [Mastering Sequential Multi-Signer Workflows — Docomotion](https://www.docomotion.com/mastering-sequential-multi-signer-workflows-for-salesforce-real-estate/)
- [Best Electronic Signature Software for Real Estate in 2026 — SignEasy](https://signeasy.com/blog/business/electronic-signature-software-for-real-estate)
- [Lone Wolf Authentisign — Real estate's leading eSignature solution](https://www.lwolf.com/operate/esignature)
**Document preview and prefill:**
- [Prefill agreement fields before sending — Adobe Acrobat Sign](https://helpx.adobe.com/sign/authoring/prefill.html)
- [Simplify Document Signing with Prefilled Fields — BoldSign](https://boldsign.com/blogs/how-to-prefill-form-details-before-sending-a-document-for-signature/)
- [DocuSign Rooms for Real Estate](https://www.docusign.com/products/rooms-for-real-estate)
- [How E-Signatures Transform Real Estate Transactions in 2026 — Legitt AI](https://legittai.com/blog/e-signatures-transform-real-estate-2026)
- [Top eSignature Trends to Watch in 2026 — BlueInk](https://www.blueink.com/blog/top-esignature-trends-2026) - [Top eSignature Trends to Watch in 2026 — BlueInk](https://www.blueink.com/blog/top-esignature-trends-2026)
- [Best Practices for Building Your E-Signature Workflow — OneSpan](https://www.onespan.com/blog/best-practices-building-your-e-signature-workflow)
- [Top 13 E-Signature Mistakes Businesses Must Avoid — WeSignature](https://wesignature.com/blog/top-13-e-signature-mistakes-businesses-make/)
- [Best Practices for eSignatures: Legally Binding and Secure — Acronis](https://www.acronis.com/en/blog/posts/best-practices-for-e-signature/)
- [E-Signature Software Development In 8 Easy Steps — USM Systems](https://usmsystems.com/e-signature-software-development-in-8-easy-steps/)
- [An Overview of DocHub's PDF Editing, Annotation and Signing Tools — DocHub](https://helpdesk.dochub.com/hc/en-us/articles/360019037293-An-overview-of-DocHub-s-PDF-editing-annotation-signing-tools)
- [PDF Form Fields Supported in Document Engine — Nutrient](https://www.nutrient.io/guides/document-engine/forms/introduction-to-forms/form-fields/)
- [US Electronic Signature Laws (ESIGN/UETA) — DocuSign](https://www.docusign.com/products/electronic-signature/learn/esign-act-ueta)
**Utah-specific:** **Utah REPC form specifics:**
- [Utah Division of Real Estate — State Approved Forms](https://realestate.utah.gov/real-estate/forms/state-approved/) - [Utah Division of Real Estate — State Approved Forms](https://realestate.utah.gov/real-estate/forms/state-approved/)
- [Utah Association of REALTORS Forms](https://utahrealtors.com/) - [Utah Real Estate Purchase Contract (REPC) — Utah Division of Commerce](https://commerce.utah.gov/wp-content/uploads/2023/03/purchase-contract.pdf)
--- ---
*Feature research for: Teressa Copeland Homes — real estate marketing site + document signing portal* *Feature research for: Teressa Copeland Homes — v1.1 Smart Document Preparation*
*Researched: 2026-03-19* *Researched: 2026-03-21*

View File

@@ -1,320 +1,242 @@
# Pitfalls Research # Pitfalls Research
**Domain:** Real estate broker web app with custom e-signature and document signing (Utah/WFRMLS) **Domain:** Real estate broker web app — v1.1 additions: AI field placement, expanded field types, agent saved signature, filled document preview
**Researched:** 2026-03-19 **Researched:** 2026-03-21
**Confidence:** HIGH **Confidence:** HIGH (all pitfalls grounded in the actual v1.0 codebase reviewed; no speculative claims)
---
## Context: What v1.1 Is Adding to the Existing System
The v1.0 codebase has been reviewed. Key facts that shape every pitfall below:
- `SignatureFieldData` (schema.ts) has **no `type` field** — it stores only `{ id, page, x, y, width, height }`. Every field is treated as a signature.
- `FieldPlacer.tsx` has **one draggable token** labeled "Signature" — no other field types exist in the palette.
- `SigningPageClient.tsx` **iterates `signatureFields`** and opens the signature modal for every field. It has no concept of field type.
- `embed-signature.ts` **only draws PNG images** — no logic for text, checkboxes, or dates.
- `prepare-document.ts` uses `@cantoo/pdf-lib` (confirmed import), fills AcroForm text fields and draws blue rectangles for signature placeholders. It does not handle the new field types.
- Prepared PDF paths are stored as relative local filesystem paths (not Vercel Blob URLs). The signing route builds absolute paths from these.
- Agent saved signature: no infrastructure exists yet. The v1.0 `SignatureModal` checks `localStorage` for a saved signature — that is the only "save" mechanism today, and it is per-browser only.
--- ---
## Critical Pitfalls ## Critical Pitfalls
### Pitfall 1: Custom E-Signature Has No Tamper-Evident PDF Hash ### Pitfall 1: Breaking the Signing Page by Adding Field Types Without Type Discrimination
**What goes wrong:** **What goes wrong:**
The signed PDF is stored, but there is no cryptographic hash (SHA-256 or similar) computed at the moment of signing and embedded in or stored alongside the document. If the PDF is ever challenged in court, there is no way to prove the document was not modified after signing. The signature image becomes legally just "a drawing on a page." `SignatureFieldData` has no `type` field. `SigningPageClient.tsx` opens the signature-draw modal for every field in `signatureFields`. When new field types (text, checkbox, initials, date, agent-signature) are stored in that same array with only coordinates, the client signing page either (a) shows a signature canvas for a checkbox field, or (b) crashes with a runtime error when it encounters a field type it doesn't handle, blocking the entire signing page.
**Why it happens:** **Why it happens:**
Developers focus on capturing the signature canvas image and embedding it into the PDF. The hash/integrity step feels like an edge case until a dispute occurs years later. Utah courts and ESIGN/UETA require the system to prove document integrity, not just that a signature image exists. The schema change is made on the agent side first (adding a `type` discriminant to `SignatureFieldData` and new field types to `FieldPlacer`), but the signing page is not updated in the same commit. Even one deployed document with mixed field types — sent before the signing page update — will be broken for that client.
**How to avoid:** **How to avoid:**
After embedding the signature image into the PDF and before storing the final file: compute a SHA-256 hash of the complete signed PDF bytes. Store this hash in the database record alongside the document. Optionally, embed the hash and a certificate-of-completion JSON blob as a PDF metadata/attachment. On any challenge, recompute the hash against the stored file and compare. Add `type` to `SignatureFieldData` as a string literal union **before** any field placement UI changes ship. Make the signing page's field renderer branch on `type` defensively: unknown types default to a placeholder ("not required") rather than throwing. Ship both changes atomically — schema migration, `FieldPlacer` update, and `SigningPageClient` update must be deployed together. Never have a deployed state where the schema supports types the signing page doesn't handle.
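One way to make the signing page branch defensively on `type` is sketched below. The coordinate properties mirror the `{ id, page, x, y, width, height }` shape described earlier. The `type` values, `resolveRenderKind`, and the rule that a missing `type` means a legacy v1.0 signature field are assumptions for illustration.

```typescript
// Assumed set of field types; the exact literals are not confirmed by the source.
const KNOWN_TYPES = [
  "signature",
  "initials",
  "date",
  "checkbox",
  "text",
  "agent-signature",
] as const;

type FieldType = (typeof KNOWN_TYPES)[number];

interface SignatureFieldData {
  id: string;
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
  // Optional and loosely typed so legacy rows and newer, unknown types both parse.
  type?: string;
}

type RenderKind = FieldType | "placeholder";

function resolveRenderKind(field: SignatureFieldData): RenderKind {
  // Assumption: fields stored before the schema change are all signatures.
  if (field.type === undefined) return "signature";
  // Unknown types fall back to a placeholder instead of throwing.
  const known = (KNOWN_TYPES as readonly string[]).indexOf(field.type) !== -1;
  return known ? (field.type as FieldType) : "placeholder";
}
```

With this shape, an older signing page that meets a newer field type shows a placeholder rather than crashing.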
**Warning signs:** **Warning signs:**
- The signing record only stores the signature image URL and a timestamp, with no document fingerprint. - `SignatureFieldData` in `schema.ts` gains a `type` property but `SigningPageClient.tsx` still iterates fields without branching on it.
- No "certificate of completion" is generated alongside the signed PDF. - The FieldPlacer palette has more tokens than the signing page has rendering branches.
- The audit log references document events but not the final document hash. - A document is sent before the signing page is updated to handle the new types.
**Phase to address:** **Phase to address:**
Document signing backend — before any client is sent a signing link. Phase 1 of v1.1 (schema and signing page update) — must be the first change, before any AI or UI work touches field types.
--- ---
### Pitfall 2: Audit Trail is Incomplete and Would Fail Court Challenge ### Pitfall 2: AI Coordinate System Mismatch — OpenAI Returns CSS-Space Percentages, pdf-lib Expects PDF Points
**What goes wrong:** **What goes wrong:**
The system logs that a document was signed and stores an IP address. But if challenged, the opposing party argues the signer's identity cannot be verified, the viewing timestamp is missing, or the sequence of events (sent → opened → signed) cannot be reconstructed. Courts have rejected e-signatures precisely because the audit trail only showed the signature event, not the full ceremony. The OpenAI response for field placement will return bounding boxes in one of several formats: percentage of page (0–1 or 0–100), pixel coordinates at an assumed render resolution, or CSS-style top-left origin. The existing `SignatureFieldData` schema stores **PDF user space coordinates** (bottom-left origin, points). When the AI output is stored without conversion, every AI-placed field appears at the wrong position — often inverted on the Y axis. The mismatch is not obvious during development if you test with PDFs where fields land approximately near the correct area.
**Why it happens:** **Why it happens:**
Developers build the "happy path" (sign button clicked → PDF stored). Pre-signing events (email sent, link opened, document viewed) are not logged because they seem unimportant until litigation. Federal district courts have held that detailed e-signature audit logs satisfy authentication requirements; gaps in the log create exploitable weaknesses. The current `FieldPlacer.tsx` already has a correct `screenToPdfCoords` function for converting drag events. But that function takes rendered pixel dimensions as input. When AI output arrives as a JSON payload, developers mistakenly store the raw AI coordinates directly into the database without passing them through the same conversion. The sign-on-screen overlay in `SigningPageClient.tsx` then applies `getFieldOverlayStyle()` which expects PDF-space coords, producing the wrong position.
**Concrete example from the codebase:**
`screenToPdfCoords` in `FieldPlacer.tsx` computes:
```
pdfY = ((renderedH - screenY) / renderedH) * pageInfo.originalHeight
```
If the AI returns a y_min as fraction of page height from the top (0 = top), storing it directly as `field.y` means the field appears at the bottom of the page instead of the top, because PDF Y=0 is the bottom.
**How to avoid:** **How to avoid:**
Log every event in the signing ceremony as a separate, timestamped, server-side record: (1) document prepared, (2) signing email sent with link hash, (3) link opened (IP, user agent, timestamp), (4) document viewed/scrolled, (5) signature canvas drawn/submitted, (6) final PDF hash computed and stored. All events must be stored server-side; never trust client-reported timestamps. Include: signer email, IP address, user-agent string, timezone offset, and event type for every record. Define a canonical AI output format contract before building the prompt. Use normalized coordinates (0-1 fractions from top-left) in the AI JSON response, then convert server-side using a single `aiCoordsToPagePdfSpace(norm_x, norm_y, norm_w, norm_h, pageWidthPts, pageHeightPts)` utility. This utility mirrors the existing `screenToPdfCoords` logic. Unit-test it against a known Utah purchase agreement with known field positions before shipping.
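A minimal sketch of the `aiCoordsToPagePdfSpace` utility named above, assuming the AI returns a box as 0-1 fractions measured from the top-left of the page, and assuming the stored `x`/`y` refer to the bottom-left corner of the field in PDF points. The `PdfRect` shape is hypothetical; the real `SignatureFieldData` in `schema.ts` may carry different property names.

```typescript
// Hypothetical result shape; adjust to match SignatureFieldData in schema.ts.
interface PdfRect {
  x: number;      // left edge, PDF points from the left of the page
  y: number;      // bottom edge, PDF points from the bottom of the page
  width: number;  // PDF points
  height: number; // PDF points
}

// Converts a normalized, top-left-origin box (0-1 fractions) into
// PDF user space (bottom-left origin, points).
function aiCoordsToPagePdfSpace(
  normX: number,
  normY: number,
  normW: number,
  normH: number,
  pageWidthPts: number,
  pageHeightPts: number,
): PdfRect {
  const width = normW * pageWidthPts;
  const height = normH * pageHeightPts;
  const x = normX * pageWidthPts;
  // The box's bottom edge sits at (normY + normH) from the top, so flip it
  // against the page height to measure from the bottom instead.
  const y = pageHeightPts - (normY + normH) * pageHeightPts;
  return { x, y, width, height };
}
```

On a US Letter page (612 x 792 points), a box that starts halfway down the page and is a quarter of the page tall ends up with its bottom edge 198 points above the bottom of the page, which is the kind of assertion the unit test should make instead of visual eyeballing.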
**Warning signs:** **Warning signs:**
- Audit records are only written on the POST to submit the signature, not on GET of the signing page. - AI-placed fields appear clustered at the bottom or top of the page regardless of document content.
- Timestamps come from the client's browser clock. - The AI integration test uses visual eyeballing rather than coordinate assertions.
- "Email sent" is logged in the email provider's dashboard but not in the app's own database. - The conversion function is not covered by the existing test suite (`prepare-document.test.ts`).
**Phase to address:** **Phase to address:**
Email-link signing flow implementation; auditing must be wired in before first signing ceremony, not added later. AI field placement phase — write the coordinate conversion utility and its test before the OpenAI API call is made.
--- ---
### Pitfall 3: Signing Link is Replayable and Has No One-Time Enforcement ### Pitfall 3: OpenAI Token Limits on Large Utah Real Estate PDFs
**What goes wrong:** **What goes wrong:**
A signing link is emailed to the client. The link contains a token (e.g., a UUID or JWT). After the client signs, the link still works: anyone who intercepts, forwards, or finds the link in an inbox can re-open the signing page, potentially submitting a second signature or viewing the partially signed document. Utah standard real estate forms (REPC, listing agreements, buyer representation agreements) are 10-30 pages. Sending the raw PDF bytes or a base64-encoded PDF to GPT-4o-mini will immediately hit the 128k context window limit for multi-page forms, or produce truncated/hallucinated field detection when the document is silently cut off mid-content. GPT-4o-mini's vision context limit is further constrained by image tokens: a single PDF page rendered at 72 DPI costs roughly 1,700 tokens; a 20-page document at standard resolution consumes ~34,000 tokens before any prompt text.
**Why it happens:** **Why it happens:**
Developers treat the signing token as a read-only session key. Invalidating it requires server-side state, which feels at odds with stateless JWT approaches. The edge case of link forwarding or second-device access feels rare. Developers prototype with short test PDFs (2-3 pages) where the approach works, then discover it fails on production forms. The failure mode is not a hard error: the API returns a response, but field positions are wrong or missing because the model never saw the later pages.
**How to avoid:** **How to avoid:**
Signing tokens must be stored in the database with a `used_at` timestamp column. On every request to the signing page, check: (1) token exists, (2) token is not expired (recommend 72-hour TTL for real estate — clients may not check email immediately, but 15-minute windows used in authentication magic links are too short for document signing), (3) `used_at` is null. On signature submission success, set `used_at` immediately before returning the success response. Use a database transaction to prevent race conditions. If the link is accessed after use, display "This document has already been signed" with a link to contact the agent — never re-render the signing canvas. Page-by-page processing: render each PDF page to a base64 PNG (using `pdfjs-dist` or `sharp` on the server), send each page image in a separate API call, then merge the field results. Cap input image resolution to 1024px wide (sufficient for field detection). Set a token budget guard before each API call and log when pages approach the limit. Use structured output (JSON mode) so partial responses fail loudly rather than silently returning incomplete data.
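The merge step and the token budget guard of the page-by-page pipeline could look roughly like this. It is a sketch only: the `DetectedField` shape, the function names, and the per-page token constant (taken from the rough 1,700-token estimate in this pitfall) are assumptions, and the actual OpenAI call and page rendering are deliberately left out.

```typescript
// Hypothetical per-page result shape returned by one vision call.
interface DetectedField {
  type: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

interface PagedField extends DetectedField {
  page: number; // 1-based page number
}

// Assumption: rough image-token cost of one rendered page.
const ASSUMED_TOKENS_PER_PAGE_IMAGE = 1700;

// Guard to run before each API call: prompt text plus one page image
// must fit inside the budget chosen for a single request.
function withinTokenBudget(promptTokens: number, budget: number): boolean {
  return promptTokens + ASSUMED_TOKENS_PER_PAGE_IMAGE <= budget;
}

// Merges one result array per page (index 0 = page 1) into a flat list,
// tagging every field with the page it was detected on.
function mergePageResults(perPage: DetectedField[][]): PagedField[] {
  const merged: PagedField[] = [];
  for (let i = 0; i < perPage.length; i++) {
    for (const field of perPage[i]) {
      merged.push({
        type: field.type,
        x: field.x,
        y: field.y,
        width: field.width,
        height: field.height,
        page: i + 1,
      });
    }
  }
  return merged;
}
```

Keeping the page number out of the model's response and assigning it server-side means a hallucinated page index cannot move a field to the wrong page.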
**Warning signs:** **Warning signs:**
- Signing token is a stateless JWT with no server-side revocation mechanism. - AI analysis is tested with only a 2-page or 3-page sample PDF.
- Loading the signing URL twice shows the signature canvas both times. - The implementation sends the entire PDF to OpenAI in a single request.
- No `signed_at` or `used_at` column exists on the signing link record. - Field detection success rate degrades noticeably on page 8+.
**Phase to address:** **Phase to address:**
Email-link signing flow — token generation and validation logic must include one-time enforcement from the start. AI integration phase — establish the page-by-page pipeline pattern before testing with real Utah forms.
--- ---
### Pitfall 4: PDF Coordinate System Mismatch Places Signatures in Wrong Positions ### Pitfall 4: Prompt Design — AI Hallucinates Fields That Don't Exist or Misses Required Fields
**What goes wrong:** **What goes wrong:**
Signature fields are placed on the PDF using coordinates from the browser's click/drag UI (origin: top-left, Y increases downward). When the signature image is embedded into the PDF using pdf-lib or a similar library (origin: bottom-left, Y increases upward), the signature appears in the wrong position — often mirrored vertically or offset significantly. This gets worse with rotated PDF pages (common in scanned real estate forms). Without a carefully constrained prompt, GPT-4o-mini will "helpfully" infer field locations that don't exist in the PDF (e.g., detecting a printed date as a fillable date field) or will use inconsistent field type names that don't match the application's `type` enum (`"text_input"` instead of `"text"`, `"check_box"` instead of `"checkbox"`). This produces spurious fields in the agent's document and breaks the downstream field type renderer.
**Why it happens:** **Why it happens:**
PDF's coordinate system is inherited from PostScript and is the opposite of every web/canvas coordinate system. Developers test with a simple unrotated letter-size PDF and it appears to work, then discover the mismatch when processing actual Utah real estate forms that may be rotated or have non-standard page sizes. The default behavior of vision models is to be helpful and infer structure. Without explicit constraints (exact allowed types, instructions to return empty array when no fields exist, max field count), the output is non-deterministic and schema-incompatible.
**How to avoid:** **How to avoid:**
Always convert click/placement coordinates from viewer space to PDF space using the library's own conversion APIs. In PDF.js: use `viewport.convertToPdfPoint(x, y)`. In pdf-lib: the page's height must be subtracted from Y coordinates when mapping from top-left screen space. Write a coordinate conversion unit test against a known PDF page size: place a signature at the visual center of the page and assert the embedded result matches `(pageWidth/2, pageHeight/2)` in PDF points. For rotated pages, read and apply the page's `Rotate` key before computing placement — a page rotated 90 degrees requires a full matrix transform, not just a Y-flip. Use OpenAI's structured output (JSON schema mode) with an explicit enum for field types matching the application's type discriminant exactly. Include a negative instruction: "Only detect fields that have an explicit visual placeholder (blank line, box, checkbox square) — do not infer fields from printed text labels." Include a `confidence` score per field so the agent UI can filter low-confidence placements. Validate the response JSON against a Zod schema server-side before storing — reject the entire AI response if any field has an invalid type.
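The "reject the entire AI response if any field has an invalid type" rule can be illustrated without any dependency. The sketch below is a hand-rolled stand-in for the Zod schema recommended above, not a replacement for it; the allowed type list and the `AiField` shape are assumptions and should be aligned with the application's real `type` enum.

```typescript
// Assumption: the field types the AI is allowed to emit.
const ALLOWED_FIELD_TYPES = ["text", "checkbox", "initials", "date", "signature"] as const;
type FieldType = (typeof ALLOWED_FIELD_TYPES)[number];

// Hypothetical minimal shape of one AI-detected field.
interface AiField {
  type: FieldType;
  page: number;       // 1-based
  confidence: number; // 0-1
}

function isAllowedType(value: unknown): value is FieldType {
  return typeof value === "string" &&
    (ALLOWED_FIELD_TYPES as readonly string[]).indexOf(value) >= 0;
}

// Returns the validated fields, or null if ANY entry is malformed.
// An empty array is a valid answer ("no fields on this page").
function parseAiFields(raw: unknown): AiField[] | null {
  if (!Array.isArray(raw)) return null;
  const fields: AiField[] = [];
  for (const item of raw) {
    if (typeof item !== "object" || item === null) return null;
    const rec = item as Record<string, unknown>;
    const type = rec.type;
    const page = rec.page;
    const confidence = rec.confidence;
    if (!isAllowedType(type)) return null;
    if (typeof page !== "number" || page < 1 || Math.floor(page) !== page) return null;
    if (typeof confidence !== "number" || confidence < 0 || confidence > 1) return null;
    fields.push({ type, page, confidence });
  }
  return fields;
}
```

A response containing `"text_input"` instead of `"text"` is rejected as a whole, so a drifting prompt fails loudly instead of leaving unrenderable fields in the database.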
**Warning signs:** **Warning signs:**
- Signature placement is tested only on a single blank PDF created in the app, not on actual Utah real estate forms. - The prompt asks the model to "detect all form fields" without specifying what counts as a field.
- The placement code contains a raw Y subtraction without checking for page rotation. - The response is stored directly in the database without Zod validation.
- Signatures look correct in the browser preview but misaligned in the downloaded PDF. - The agent sees unexpected fields on pages with no visual placeholders.
**Phase to address:** **Phase to address:**
PDF field detection and signature placement — before any user-facing drag-and-drop placement UI is built. AI integration phase — validate prompt output against Zod before the first real Utah form is tested.
--- ---
### Pitfall 5: Scraping utahrealestate.com Forms Library Violates Terms of Service ### Pitfall 5: Agent Saved Signature Stored as Raw DataURL — Database Bloat and Serving Risk
**What goes wrong:** **What goes wrong:**
The app authenticates to utahrealestate.com with Teressa's credentials and scrapes the forms library (PDFs of purchase agreements, listing agreements, etc.) using HTTP requests or a headless browser. This appears to work initially, then breaks when the site updates its session handling, adds bot detection, or changes URL structures. More critically, it violates the platform's Terms of Service and the WFRMLS data licensing agreement, which restricts data use to authorized products only. A canvas signature exported as `toDataURL('image/png')` produces a base64-encoded PNG string. A typical signature on a 400x150 canvas is 15-60KB as base64. If this is stored directly in the database (e.g., a `TEXT` column in the `users` table), every query that fetches the user row will carry 15-60KB of base64 data it may not need. More critically, if the dataURL is ever sent to the client to pre-populate a form field, it exposes the full signature as a downloadable string in page source.
**Why it happens:**
utahrealestate.com has partnered with SkySlope Forms for its digital forms solution. There is no documented public API for the forms library. Using Teressa's credentials to download PDFs looks like a clean solution — it's just automating what she does manually. The ToS implication is easy to overlook.
**How to avoid:** **How to avoid:**
Do not scrape the utahrealestate.com forms library programmatically. Instead: (1) Identify which forms Teressa uses most frequently (purchase agreement, listing agreement, buyer rep agreement, addendums). (2) Download those PDFs manually once and store them as static assets in the app's own file storage. Utah Division of Real Estate also publishes state-approved forms publicly at commerce.utah.gov — these are explicitly public domain and safe to embed. (3) For any forms that must come from the MLS, treat the upload step as a manual agent action: Teressa uploads the PDF herself from her downloads, then the app processes it. This also means the app is not fragile to site changes. Store the signature as a file (Vercel Blob or the existing `uploads/` directory), and store only the file path/URL in the database. On the signing page and preview, serve the signature through an authenticated API route that streams the file bytes — never expose the raw dataURL to the client page. Alternatively, convert the dataURL to a `Uint8Array` immediately on the server (for PDF embedding only) and discard the string — only the file path goes to the DB.
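For the saved-signature storage pattern, the "convert the dataURL to a `Uint8Array` immediately on the server" step might be sketched as follows. The function name is hypothetical, it only accepts base64 PNG data URLs (what `toDataURL('image/png')` produces), and it relies on the global `atob`, which is assumed to be available in the server runtime.

```typescript
// Decodes a base64 PNG data URL into raw bytes so that only a file
// (and its path) needs to be persisted, never the data URL string.
function dataUrlToPngBytes(dataUrl: string): Uint8Array {
  const prefix = "data:image/png;base64,";
  if (dataUrl.indexOf(prefix) !== 0) {
    throw new Error("Expected a base64-encoded PNG data URL");
  }
  // atob yields a binary string: one character per decoded byte.
  const binary = atob(dataUrl.slice(prefix.length));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

The returned bytes can be written to the signature file and later embedded into the PDF; the original string can be dropped as soon as this function returns.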
**Warning signs:** **Warning signs:**
- The app stores Teressa's utahrealestate.com password in its own database or environment variables to automate login. - A `savedSignatureDataUrl TEXT` column is added to the `users` table.
- The forms-fetching code contains hardcoded URLs pointing to utahrealestate.com paths. - The agent dashboard page fetches the user row and passes `savedSignatureDataUrl` to a React component prop.
- A site redesign or session expiry would break the core import flow. - The signature appears in the React devtools component tree as a base64 string.
**Phase to address:** **Phase to address:**
Forms import feature — the architecture decision must be made before any scraping code is written. Agent saved signature phase — establish the storage pattern (file + path, not dataURL + column) before any signature saving UI is built.
--- ---
### Pitfall 6: IDX Listings Displayed Without Required Broker Attribution and Disclaimers ### Pitfall 6: Race Condition — Agent Updates Saved Signature While Client Is Mid-Signing
**What goes wrong:** **What goes wrong:**
Listings from WFRMLS are displayed on the public site without the required listing broker attribution (the co-listing brokerage name, not just Teressa's), without the MLS-required disclaimer text, or with data that has been modified (e.g., description truncated, price formatted differently). NAR IDX policy violations can result in fines up to $15,000 or loss of MLS access — which would effectively end Teressa's ability to operate. The agent draws a new saved signature and saves it while a client has the signing page open. The signing page has already loaded the signing request data (including `signatureFields`). When the agent applies their new saved signature to an agent-signature field and re-prepares the document, there are now two versions of the prepared PDF on disk: the one the client is looking at and the newly generated one. If the client submits their signature concurrently with the agent's re-preparation, `embedSignatureInPdf()` may read a partially-written prepared PDF (before the atomic rename completes) or the document may be marked "Sent" again after already being in "Viewed" state, breaking the audit trail.
**Why it happens:** **Why it happens:**
IDX compliance rules feel like fine print. Developers display the data that looks good and skip the attribution fields because they seem redundant on a solo agent site. The disclaimer is added as a one-time footer and then forgotten when listing detail pages are added later. The existing prepare flow in `PreparePanel.tsx` allows re-preparation of Draft documents. Once agent signing is added, the agent can re-run preparation on a "Sent" or "Viewed" document to swap their signature, creating a mutable prepared PDF while a client session is active.
**How to avoid:** **How to avoid:**
Every listing display page (card and detail) must render: (1) listing broker/office name from the `ListOfficeName` field, not Teressa's brokerage; (2) the WFRMLS-required disclaimer text verbatim on every page where IDX data appears; (3) a "Last Updated" timestamp pulled from the feed, not the app's cache write time; (4) no modified listing descriptions or prices. Implement a compliance checklist as a code review gate before any listing-display page ships. Confirm the exact required disclaimer text with WFRMLS directly — it changes with NAR policy updates (2024 settlement, 2025 seller options policy). Lock prepared documents once the first signing link is sent. Gate the agent re-prepare action behind a confirmation: "Resending will invalidate the existing signing link — the client will receive a new email." On confirmation, atomically: (1) mark the old signing token as `usedAt = now()` with reason "superseded", (2) delete the old prepared PDF (or rename to `_prepared_v1.pdf`), (3) generate a new prepared PDF, (4) issue a new signing token, (5) send a new email. This prevents mid-session clobber. The existing `embedSignatureInPdf` already uses atomic rename (`tmp → final`) which prevents partial-read corruption — preserve this.
**Warning signs:** **Warning signs:**
- Listing detail pages render only the property data, with no co-brokerage attribution field. - Agent can click "Prepare and Send" on a document with status "Sent" without any confirmation dialog.
- The IDX disclaimer appears only on the search results page, not on individual listing detail pages. - The prepared PDF path is deterministic and overwritten in place (e.g. always `{docId}_prepared.pdf`).
- The app omits the buyer's agent compensation disclosure field added as a NAR 2024 settlement requirement. - No "superseded" state exists in the `signingTokens` table.
**Phase to address:** **Phase to address:**
Public listings feature — compliance requirements must be treated as acceptance criteria, not post-launch polish. Agent signing phase — implement the supersede-and-resend flow before any agent signature is applied to a sent document.
--- ---
### Pitfall 7: Stale Listings Show Off-Market Properties as Active ### Pitfall 7: Filled Preview Is Served From the Same Path as the Prepared PDF — Stale Preview After Field Changes
**What goes wrong:** **What goes wrong:**
The app caches WFRMLS listing data in a database and refreshes it on a slow schedule (e.g., once per day via a cron job, or only when a user loads the page). A property that went under contract or sold yesterday still shows as active on the site. Clients contact Teressa about a property that is no longer available, damaging her professional credibility. WFRMLS policy expects off-market listings to be removed within 24 hours. The agent makes changes to field placement or pre-fill values after generating a preview. The preview file on disk is now stale. The preview URL is cached by the browser (or a CDN). The agent sees the old preview and believes the document is correct, then sends it to the client. The client receives a document with the old pre-fill values, not the updated ones.
**Why it happens:** **Why it happens:**
Frequent API polling adds infrastructure complexity. A daily batch job feels sufficient. The RESO OData API returns a `ModificationTimestamp` field that allows delta syncing, but developers often do full re-fetches instead, which is slow and rate-limited. The existing `prepare-document.ts` writes to a deterministic path: `{docId}_prepared.pdf`. If the preview is served from the same path, any browser cache of that URL shows the old version. The agent has no visual indication that the preview is stale.
**How to avoid:** **How to avoid:**
Implement delta sync: query the WFRMLS RESO API with a `$filter` on `ModificationTimestamp gt [last_sync_time]` to fetch only changed listings since the last run. Schedule this to run hourly via a reliable background job (not WP-Cron or in-process timers — use a proper scheduler like Vercel Cron or a dedicated worker). When a listing's `StandardStatus` transitions to Closed, Withdrawn, or Expired, remove it from the public display immediately, not at next full refresh. Display a "Last Updated" timestamp on each listing page so both users and MLS auditors can verify freshness. Generate preview PDFs to a separate path with a timestamp or version suffix: `{docId}_preview_{timestamp}.pdf`. Never serve the preview from the same path as the final prepared PDF. Add a "Preview is stale — regenerate before sending" banner that appears when `signatureFields` or `textFillData` are changed after the last preview was generated. Store `lastPreviewGeneratedAt` in the document record and compare to `updatedAt`. The "Send" button should be disabled until a fresh preview has been generated (or explicitly skipped by the agent).
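The versioned preview path and the staleness comparison described for the filled preview could be sketched like this. The `PreviewState` shape and both function names are assumptions; the sketch also assumes that generating a preview does not itself bump `updatedAt`, otherwise every preview would immediately look stale.

```typescript
// Hypothetical subset of the document record relevant to previews.
interface PreviewState {
  updatedAt: Date;                     // last change to fields or text fill data
  lastPreviewGeneratedAt: Date | null; // null if no preview exists yet
}

// A preview is stale if none exists or the document changed after it was made.
function isPreviewStale(doc: PreviewState): boolean {
  if (doc.lastPreviewGeneratedAt === null) return true;
  return doc.lastPreviewGeneratedAt.getTime() < doc.updatedAt.getTime();
}

// Builds a versioned path so a regenerated preview never reuses a cached URL.
function previewPath(docId: string, generatedAt: Date): string {
  return `${docId}_preview_${generatedAt.getTime()}.pdf`;
}
```

The "Send" button can then be disabled whenever `isPreviewStale` returns true, unless the agent explicitly skips the preview.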
**Warning signs:** **Warning signs:**
- The sync job runs once per day. - The preview endpoint serves `/api/documents/{id}/prepared` without a cache-busting mechanism.
- The sync is triggered by a user page load rather than a scheduled job. - The agent can modify fields after generating a preview and the preview URL does not change.
- There is no `ModificationTimestamp` filter in the API query — all listings are re-fetched every run. - No "stale preview" indicator exists in the UI.
- The app has no process for immediately hiding a listing when its status changes to off-market.
**Phase to address:** **Phase to address:**
Listings sync infrastructure — before the public site goes live with real listing data. Filled document preview phase — establish the versioned preview path and staleness indicator before the first preview is rendered.
--- ---
### Pitfall 8: PDF Form Field Detection Relies on Heuristics That Fail on Non-Standard Forms ### Pitfall 8: Memory Issues Rendering Large PDFs for Preview on the Server
**What goes wrong:** **What goes wrong:**
The app attempts to auto-detect signature fields in uploaded Utah real estate PDFs using text-layer heuristics (searching for the word "Signature" or "_____" underscores). This works on forms with searchable text layers, fails silently on scanned PDFs (no text layer), and places fields incorrectly on multi-column forms where the heuristic matches the wrong block. Generating a filled preview requires loading the PDF into memory (via `@cantoo/pdf-lib`), modifying it, and either returning the bytes for streaming or writing to disk. Utah real estate forms (REPC, addendums) can be 15-30 pages and 2-8MB as raw PDFs. Running `PDFDocument.load()` on an 8MB PDF in a Vercel serverless function that has a 256MB memory limit can cause OOM errors under concurrent load. The Vercel function timeout (10s default, 60s max on Pro) can also be exceeded for large PDFs with many embedded fonts.
**Why it happens:** **Why it happens:**
Auto-detection feels like the right UX goal — the agent should not have to manually place every field. PDF form field detection is non-trivial: some forms have AcroForm fields, some have visual annotation markers only, and some are flat scans. pdf-lib does not expose field dimensions/coordinates natively (a known open issue since 2020), so custom heuristics are required. Developers test with a small 2-page PDF in development and the function works fine. The function hits the memory wall only when a real Utah standard form (often 20+ pages with embedded images) is processed in production.
**How to avoid:** **How to avoid:**
Treat auto-detection as a "best effort starting point," not a reliable system. The flow should be: (1) attempt AcroForm field parsing first — if the PDF has embedded AcroForm fields, use them; (2) if not, present a manual drag-to-place UI where Teressa visually positions signature and date fields on a PDF.js-rendered preview; (3) never send a document to a client without Teressa having confirmed field placements. This manual confirmation step is also a legal safeguard — it proves the agent reviewed the document before sending. Store field placements in PDF coordinates (bottom-left origin, points) in the database, not screen coordinates. Do not generate the preview inline in a serverless function on every request. Instead: generate the preview once (as a write operation), store the result in the `uploads/` directory or Vercel Blob, and serve it from there. The preview generation can be triggered on-demand (agent clicks "Generate Preview") and is idempotent. Set a timeout guard: if `PDFDocument.load()` takes longer than 8 seconds, return a 504 with "Preview temporarily unavailable." Monitor the Vercel function execution time and memory in the dashboard — alert at 70% of the memory limit.
**Warning signs:** **Warning signs:**
- The heuristic search for signature fields is the only detection method, with no fallback UI. - Preview is regenerated on every page load (no stored preview file).
- Field placements are stored in screen pixels without converting to PDF coordinate space. - The preview route calls `PDFDocument.load()` within a synchronous request handler.
- The app has never been tested with a scanned (non-OCR'd) PDF. - Tests only use PDFs smaller than 2MB.
**Phase to address:** **Phase to address:**
PDF field detection and form preparation UI: design the manual fallback first, then add auto-detection as an enhancement. Filled document preview phase: establish the "generate once, serve cached" pattern from the start.
--- ---
### Pitfall 9: Font Not Embedded in PDF Before Flattening Causes Text to Disappear ### Pitfall 9: Client Signing Page Confusion — Preview Shows Agent Pre-Fill but Client Signs a Different Document
**What goes wrong:** **What goes wrong:**
Teressa fills in text fields (property address, client name, price) on the PDF. When the form is flattened (fields baked into the page content stream), the text either disappears, renders as boxes, or uses a substitute font that looks wrong. This happens when the font referenced by the AcroForm field is not embedded in the PDF file and is not available in the server-side processing environment. The filled preview shows the document with all text pre-fills applied (client name, property address, price). The client signing page also renders the prepared PDF — which already contains those fills (because `prepare-document.ts` fills AcroForm fields and draws text onto the PDF). But the visual design difference between "this is a preview for review" and "this is the actual document you are signing" is unclear. If the agent generates a stale preview and the client signs a different (more recent) prepared PDF, the client believes they signed what they previewed, but the legal document has different content.
**Why it happens:**
AcroForm fields reference fonts by name (e.g., "Helvetica"). On a desktop PDF viewer, these fonts are available system-wide and render correctly. On a headless server-side Node.js environment (e.g., Vercel serverless function), no system fonts are installed. When the field is flattened, the text cannot be rendered because the font is absent.
**How to avoid:** **How to avoid:**
Before flattening any filled form, ensure all fonts referenced by form fields are embedded in the PDF. With pdf-lib, embed the font explicitly: load a standard font (e.g., StandardFonts.Helvetica from pdf-lib's built-in set) and set it on each form field before calling `form.flatten()`. For forms that use custom fonts, embed the TTF/OTF font file directly. Test flattening in the exact serverless environment where it will run (not locally on a Mac with system fonts installed). The client signing page must always serve the **same** prepared PDF that was cryptographically hashed at prepare time. The preview the agent saw must be generated from that exact file — not a re-generation. Store the SHA-256 hash of the prepared PDF at preparation time (same pattern as the existing `pdfHash` for signed PDFs). When serving the client's signing PDF, recompute and verify the hash matches before streaming. This ties the signed document back to the exact bytes the agent previewed.
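Hashing the prepared PDF at preparation time and re-checking it before streaming might look like the sketch below. It uses Node's built-in `crypto` module; the function names are hypothetical, and where the hash is stored (for example, a column next to the existing `pdfHash`) is left as an assumption.

```typescript
import { createHash } from "node:crypto";

// Computes the lowercase hex SHA-256 digest of the PDF bytes.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Recomputes the digest of the file about to be served and compares it
// with the hash recorded when the document was prepared.
function matchesPreparedHash(bytes: Uint8Array, storedHash: string): boolean {
  return sha256Hex(bytes) === storedHash.toLowerCase();
}
```

If `matchesPreparedHash` returns false, the signing route should refuse to stream the file, since the client would otherwise sign bytes the agent never previewed.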
**Warning signs:** **Warning signs:**
- PDF flattening is tested only on a local development machine. - The preview is generated by a different code path than `prepare-document.ts` (e.g., a separate PDF rendering library).
- The serverless function environment has no system fonts installed and this has not been verified. - No hash is stored for the prepared PDF, only for the signed PDF.
- Filled form text is visible in the browser PDF preview (which uses browser fonts) but blank in the downloaded flattened PDF. - The agent can re-prepare after preview generation without the signing link being invalidated.
**Phase to address:** **Phase to address:**
PDF form filling and flattening — must be validated in the production environment, not just locally. Filled document preview phase AND agent signing phase — hash the prepared PDF immediately after writing it (extend the existing `pdfHash` pattern from signed to prepared).
--- ---
### Pitfall 10: Mobile Signing UX Makes Canvas Signature Unusable ### Pitfall 10: Agent Signature Field Handled by Client Signing Page
**What goes wrong:** **What goes wrong:**
Clients open the signing link on a mobile phone (the majority of email opens are on mobile). The signature canvas scrolls the page instead of capturing the drawing gesture. The canvas is too small to draw a legible signature. The "Submit" button is off-screen. The client gives up and calls Teressa to ask how to sign, creating friction that defeats the purpose of the email-link flow. A new `"agent-signature"` field type is added to `FieldPlacer`. The agent applies their saved signature to this field before sending. But `SigningPageClient.tsx` iterates all fields in `signatureFields` and shows a signing prompt for each one. If the agent-signature field is included in the array sent to the client, the client sees a field labeled "Signature" (or unlabeled) that is already visually signed with someone else's signature, and the progress bar counts it as an unsigned field the client must complete.
**Why it happens:** **Why it happens:**
The signing page is designed and tested on desktop. The signature canvas uses a standard `<canvas>` element with touch event listeners added, but the browser's default touch behavior (page scroll) intercepts the gesture before the canvas can capture it. `touch-action: none` is not applied. The client signing page receives the full `signatureFields` array from the GET `/api/sign/[token]` response. The route currently returns `doc.signatureFields ?? []` without filtering. When agent-signature fields are added to the same array, they are included in the client's field list.
**Concrete location in codebase:**
```typescript
// /src/app/api/sign/[token]/route.ts, line 88
signatureFields: doc.signatureFields ?? [],
```
This sends ALL fields to the client, including any agent-filled fields.
**How to avoid:** **How to avoid:**
Apply `touch-action: none` on the signature canvas element to prevent the browser from intercepting touch gestures. Set the canvas to fill the full viewport width on mobile (100vw minus padding). Implement pinch-zoom prevention on the signing page only (meta viewport `user-scalable=no`) so accidental zoom does not distort the canvas. Test the complete signing flow on iOS Safari and Android Chrome — these are the two browsers with the most idiosyncratic touch handling. Consider using the `signature_pad` npm library (szimek/signature_pad) which handles touch normalization across devices. Display a "Draw your signature here" placeholder inside the canvas that disappears on first touch. Filter the `signatureFields` array in the signing token GET route: only return fields where `type !== 'agent-signature'` (or more precisely, only return fields the client is expected to sign). Agent-signed fields should be pre-embedded into the `preparedFilePath` PDF during document preparation — by the time the client opens the signing link, the agent's signature is already baked into the prepared PDF as a drawn image. The `signatureFields` array sent to the client should contain only the fields the client needs to provide.
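The filter on the signing token GET route can be a one-line helper. This is a sketch under the assumption that legacy v1.0 fields carry no `type` property and should still reach the client; the helper name is hypothetical.

```typescript
// Returns only the fields the client must complete. Agent-signature fields
// are assumed to be baked into the prepared PDF already, so they are
// removed from the payload sent to the signing page.
function fieldsForClient<T extends { type?: string }>(fields: T[]): T[] {
  return fields.filter((field) => field.type !== "agent-signature");
}
```

In the route, `signatureFields: doc.signatureFields ?? []` would become `signatureFields: fieldsForClient(doc.signatureFields ?? [])`, which also keeps the client progress bar count limited to fields the client actually signs.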
**Warning signs:** **Warning signs:**
- The signing page has only been tested on desktop Chrome. - The full `signatureFields` array is returned from the signing token GET without filtering by `type`.
- The canvas element has no `touch-action` CSS property set. - Agent-signed fields are stored in the same `signatureFields` JSONB column as client signature fields.
- A physical phone test shows the page scrolling instead of capturing strokes. - The client progress bar shows more fields than the client is responsible for signing.
**Phase to address:** **Phase to address:**
Client-facing signing UI — mobile must be a primary test target from the first implementation, not a responsive afterthought. Agent signing phase — filter the signing response by field type before the first agent-signed document is sent to a client.
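
The filtering described above can be sketched as a small pure helper. This is a minimal sketch, not the codebase's implementation: the `SignatureFieldData` shape shown here, the `id` and `page` keys, and the helper name `clientFacingFields` are assumptions for illustration. Entries saved in v1.0 carry no `type`, so they are treated as client signatures.

```typescript
// Hypothetical field shape; the real SignatureFieldData may carry more keys.
type FieldType =
  | "signature"
  | "initials"
  | "text"
  | "checkbox"
  | "date"
  | "agent-signature";

interface SignatureFieldData {
  id: string;
  page: number;
  type?: FieldType | null; // v1.0 rows have no type key
}

// Returns only the fields the client is expected to complete.
export function clientFacingFields(
  fields: SignatureFieldData[] | null | undefined,
): SignatureFieldData[] {
  return (fields ?? []).filter(
    // Missing type falls back to "signature" so v1.0 documents keep working.
    (field) => (field.type ?? "signature") !== "agent-signature",
  );
}
```

In the signing token GET route, the response line would then read `signatureFields: clientFacingFields(doc.signatureFields)` instead of returning the raw array.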
---
### Pitfall 11: iOS Safari Canvas Self-Clearing and Vertical-Line Bug
**What goes wrong:**
On iOS 15+, the signature canvas occasionally wipes itself mid-draw. On iOS 13, drawing vertical lines registers as a single point rather than a stroke — so signatures with vertical elements look like dots. These bugs are silent: the app appears to work, but the exported signature is illegible or empty, and that illegible image gets embedded in a legal document.
**Why it happens:**
iOS 15 introduced a canvas self-clearing bug triggered by the Safari URL bar resizing the viewport while the user is drawing. The iOS 13 vertical stroke bug is a platform-level touch event regression. These do not reproduce in Chrome DevTools mobile emulation or on the iOS simulator — they require physical devices to discover.
**How to avoid:**
- Use `signature_pad` version >= 1.0.6 (szimek/signature_pad) or `react-signature-canvas` >= 1.0.6 which includes the iOS 15 workaround
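- `preserveDrawingBuffer: true` is a WebGL context attribute and has no effect on a 2D canvas; to survive viewport resize events, keep the recorded strokes (for example `signature_pad`'s `toData()`) and redraw them with `fromData()` after the canvas is resized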
- Before accepting the submitted signature, validate that the canvas contains a meaningful number of non-background pixels — reject and prompt re-draw if the image is effectively blank
- Export the canvas at 2x resolution (set `canvas.width` and `canvas.height` to 2x the CSS display size, then scale the 2D context by 2) so Retina displays produce a sharp embedded image
- Test signing on physical iOS 13, iOS 15, and latest iOS devices — not in simulator
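
The blank-signature check in the list above can be sketched as a pixel count over the RGBA data returned by `ctx.getImageData(...).data`. This is a minimal sketch under stated assumptions: the function names, the near-white cutoff of 240, the alpha cutoff of 16, and the default minimum of 150 ink pixels are illustrative values, not tuned thresholds.

```typescript
// Counts "ink" pixels in RGBA data (4 bytes per pixel).
// A pixel counts as ink when it is neither transparent nor near-white,
// so the check works for transparent and white canvas backgrounds.
export function countInkPixels(rgba: Uint8ClampedArray): number {
  let ink = 0;
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    const r = rgba[i];
    const g = rgba[i + 1];
    const b = rgba[i + 2];
    const a = rgba[i + 3];
    const transparent = a < 16; // assumed cutoff
    const nearWhite = r > 240 && g > 240 && b > 240; // assumed cutoff
    if (!transparent && !nearWhite) ink++;
  }
  return ink;
}

// True when the image has too little ink to be a plausible signature.
export function isSignatureBlank(
  rgba: Uint8ClampedArray,
  minInkPixels = 150, // assumed threshold; tune against real signatures
): boolean {
  return countInkPixels(rgba) < minInkPixels;
}
```

The same check should also run on the server against the decoded PNG, since a client-side check alone can be skipped.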
**Warning signs:**
- Signed PDFs coming back from real clients have blank or near-blank signature images
- The signature canvas library has not been updated in over a year
- The canvas initialization has no resize handling that preserves or redraws existing strokes
**Phase to address:**
Client-facing signing UI — physical device testing required before any real client document is sent.
---
### Pitfall 12: Signed PDFs Are Accessible via Guessable URLs (IDOR)
**What goes wrong:**
An authenticated user (any agent or even a client who has signed their own document) modifies the document ID in the download URL — `/api/documents/1234/download` becomes `/api/documents/1233/download` — and successfully downloads another client's signed document. This is an Insecure Direct Object Reference (IDOR), consistently in the OWASP Top 10, and is a serious privacy violation in a real estate context where documents contain personal financial information.
**Why it happens:**
The download route checks that the user is authenticated (has a valid session cookie) but does not verify that the authenticated user owns the requested document. Sequential integer IDs make enumeration trivial. This pattern is easy to miss because it requires testing with two separate accounts — single-account testing never reveals it.
**How to avoid:**
- Use UUID v4 (not sequential integers) for all document IDs in URLs and API routes
- In every API route handler that returns a document or its metadata, query the database to confirm `document.agent_id === session.agent_id` (or the equivalent ownership check) before streaming or returning the file
- Do not rely solely on Next.js middleware for this check — middleware can be bypassed via CVE-2025-29927 (see Pitfall 13); put the ownership check inside the route handler itself
- Store signed PDFs in a private object storage bucket (S3, Supabase Storage with RLS, Cloudflare R2) and generate short-lived pre-signed URLs (15 minutes or less) for downloads — never serve from a static public URL
- Log every document download with the authenticated user's ID and timestamp for audit purposes
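
The ownership check above can be sketched as a pure decision function that the route handler calls after loading the session and the document row. This is a minimal sketch: the `Session` and `DocumentRow` shapes, the `storageKey` field, and the function names are hypothetical, and the choice to answer 404 instead of 403 for someone else's document is an assumption made so the response does not confirm that the ID exists.

```typescript
interface Session {
  agentId: string;
}

interface DocumentRow {
  id: string;
  agentId: string;
  storageKey: string; // hypothetical key in private object storage
}

type DownloadDecision =
  | { status: 200; storageKey: string }
  | { status: 401 | 404 };

const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Reject malformed IDs before touching the database.
export function isUuidV4(id: string): boolean {
  return UUID_V4.test(id);
}

// Decide the response from the session and the looked-up row.
export function decideDownload(
  session: Session | null,
  doc: DocumentRow | null,
): DownloadDecision {
  if (!session) return { status: 401 };
  // Missing and not-owned look identical to the caller.
  if (!doc || doc.agentId !== session.agentId) return { status: 404 };
  return { status: 200, storageKey: doc.storageKey };
}
```

On a 200 decision the handler would generate a short-lived pre-signed URL for `storageKey` and write the audit log entry.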
**Warning signs:**
- Document download URLs contain sequential integers (e.g., `/documents/47`)
- The download route handler does not perform a database ownership query before responding
- PDFs are served from a publicly accessible storage bucket URL
**Phase to address:**
Document storage and download API — UUID identifiers and ownership checks must be in place before any real client documents are uploaded.
---
### Pitfall 13: Next.js Middleware as the Only Authorization Gate (CVE-2025-29927)
**What goes wrong:**
Authorization for sensitive routes (admin portal, document downloads, signing status) is implemented exclusively in Next.js middleware. An attacker adds the `x-middleware-subrequest` header to their HTTP request and bypasses middleware entirely, gaining unauthenticated access to protected routes. This is CVE-2025-29927, disclosed in March 2025, affecting all Next.js versions before 14.2.25 and 15.2.3.
**Why it happens:**
Next.js middleware feels like the natural place to handle auth redirects. Developers add `middleware.ts` that checks for a session cookie and redirects unauthenticated users. This pattern is documented in the Next.js docs and is widely used — but the vulnerability demonstrated that middleware-only protection is not sufficient.
**How to avoid:**
- Update Next.js to >= 14.2.25 or >= 15.2.3 immediately
- Treat middleware as a first-layer UX redirect, not as a security enforcement layer
- Add an explicit session/authorization check inside every API route handler and every server component that renders sensitive data — do not assume middleware has already validated the request
- If deployed behind a reverse proxy (Nginx, Cloudflare, AWS ALB), configure it to strip the `x-middleware-subrequest` header from incoming requests
- The rule: "Middleware redirects unauthenticated users; route handlers enforce authorization" — both layers must exist
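
The two-layer rule above can be sketched as a wrapper that re-checks the session inside the handler itself. This is a minimal sketch using simplified request and response shapes rather than the real Next.js types; `withSession`, `MinimalRequest`, and `MinimalResponse` are hypothetical names, and a real route handler would be asynchronous.

```typescript
interface Session {
  agentId: string;
}

interface MinimalRequest {
  headers: Record<string, string | undefined>;
}

interface MinimalResponse {
  status: number;
  body: unknown;
}

// Wraps a handler so it only runs with a verified session.
export function withSession(
  getSession: (req: MinimalRequest) => Session | null,
  handler: (req: MinimalRequest, session: Session) => MinimalResponse,
): (req: MinimalRequest) => MinimalResponse {
  return (req) => {
    // Do not assume middleware ran: on unpatched versions a crafted
    // x-middleware-subrequest header can skip it entirely.
    const session = getSession(req);
    if (!session) return { status: 401, body: { error: "Unauthorized" } };
    return handler(req, session);
  };
}
```

Because the check lives in the handler, a request that bypasses middleware still receives a 401.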
**Warning signs:**
- The `middleware.ts` file is the only place where session validation appears in the codebase
- API routes do not individually check `getSession()` or `getToken()` before returning data
- The Next.js version is older than 14.2.25
**Phase to address:**
API route implementation — every route handler that touches client or document data must include an in-handler auth check regardless of middleware state.
---
### Pitfall 14: Credential-Based utahrealestate.com Scraping Has CFAA Exposure
**What goes wrong:**
Using Teressa's utahrealestate.com credentials to automate form downloads is not just a Terms of Service question — it has Computer Fraud and Abuse Act (CFAA) exposure. A 2024 federal jury verdict (Ryanair vs. Booking.com, Delaware) found a scraping company violated the CFAA and established intent to defraud by accessing data behind a login wall, even using a third-party's credentials. The court further found that using techniques to avoid detection (rotating sessions, changing user-agent strings) separately supported the "intent to defraud" element.
**Why it happens:**
Developers treat "using your own credentials" as inherently authorized. But CFAA analysis asks whether the automated programmatic use exceeds the authorized use defined in the ToS, not just whether credentials are valid. The 2022 hiQ v. LinkedIn ruling protecting public data scraping explicitly does not extend to data behind authentication barriers.
**How to avoid:**
- Do not write automated credential-based scraping code for utahrealestate.com under any circumstances without a written API or data agreement from the platform
- The correct architecture: Teressa manually downloads needed form PDFs from utahrealestate.com and uploads them into the app's admin portal. This is a one-time-per-form-update action, not per-client.
- For state-approved real estate forms, the Utah Division of Real Estate publishes them at commerce.utah.gov/realestate — these are public domain and safe to bundle directly in the app
- If programmatic access is genuinely needed, contact utahrealestate.com to request a formal data agreement or API key before writing any code
**Warning signs:**
- Any code in the repository that performs an automated login to utahrealestate.com
- Teressa's MLS password is stored anywhere in the application's configuration or database
- The app has hardcoded selectors or URLs pointing to utahrealestate.com form library paths
**Phase to address:**
Forms import architecture — this decision must be made and documented before any integration code is written.
--- ---
@@ -324,16 +246,12 @@ Shortcuts that seem reasonable but create long-term problems.
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable | | Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|----------|-------------------|----------------|-----------------| |----------|-------------------|----------------|-----------------|
| Storing Teressa's utahrealestate.com credentials in `.env` for scraping | Avoids building a manual upload flow | ToS violation; breaks when site changes; security liability if env is leaked | Never — use manual PDF upload instead | | Store saved signature as dataURL in users table | No new file storage code needed | Every user query pulls 15–60KB of base64; dataURL exposed in client props | Never — use file storage from the start |
| Using client-side timestamp for signing event | Simpler code, no server roundtrip | Timestamps are falsifiable; legally worthless as audit evidence | Never for legal audit trail | | Re-use same `_prepared.pdf` path for preview and final prepared doc | No versioning logic needed | Stale previews; no way to prove which prepared PDF the client signed | Never — versioned paths required for legal integrity |
| Skipping PDF coordinate conversion unit tests | Faster initial development | Silent placement bugs on real-world forms; discovered only when a client signs in the wrong place | Never — write the test before building the UI | | Return all signatureFields to client (no type filtering) | Simpler route code | Client sees agent-signature fields as required fields to complete | Never for agent-signature type; acceptable for debugging only |
| Stateless JWT as signing token with no DB revocation | Simpler auth, no DB lookup on link access | Tokens cannot be invalidated; replay attacks are undetectable | Never for one-time signing links | | Prompt OpenAI with entire PDF as one request | Simpler prompt code | Fails silently on documents > ~8 pages; token limit hit without hard error | Acceptable only for prototyping with < 5 page test PDFs |
| Hardcoding IDX disclaimer text in a component | Quick to ship | Disclaimer text changes with NAR policy; requires code deploy to update | Acceptable only if a content management hook is added in the next phase | | Add `type` to SignatureFieldData but don't add a schema migration | Skip Drizzle migration step | Existing rows have `null` type; `signatureFields` JSONB array has mixed null/typed entries; TypeScript union breaks | Never — migrate immediately |
| Full listing re-fetch instead of delta sync | Simpler initial implementation | Rate limit exhaustion; slow sync; stale off-market listings | Acceptable in Phase 1 dev/test only, must be replaced before public launch | | Generate preview on every page load | No caching logic needed | OOM errors on large PDFs under Vercel memory limit; slow UX | Acceptable only during local development |
| Skipping form flattening (storing unflattened PDFs with live AcroForm fields) | Faster PDF processing | Signed documents can be edited after the fact in any PDF editor; legally indefensible | Never for final signed documents |
| Using sequential integer IDs for document URLs | Simpler schema | IDOR vulnerability — any user can enumerate other clients' documents by changing the number in the URL | Never — use UUID v4 |
| Putting all auth logic in Next.js middleware only | Less code per route | CVE-2025-29927 bypass allows unauthenticated access by adding a single HTTP header | Never — always add in-handler auth checks |
| Skipping physical iOS device testing for signature canvas | Faster QA iteration | iOS 13 vertical stroke bug and iOS 15 canvas self-clearing are invisible in simulator | Never for any client-facing signing release |
--- ---
@@ -343,14 +261,12 @@ Common mistakes when connecting to external services.
| Integration | Common Mistake | Correct Approach | | Integration | Common Mistake | Correct Approach |
|-------------|----------------|------------------| |-------------|----------------|------------------|
| WFRMLS RESO API | Using `$filter` with wrong field name casing (e.g., `modificationtimestamp` instead of `ModificationTimestamp`) — all field names are case-sensitive | Always copy field names exactly from the RESO metadata endpoint response; do not guess or lowercase | | OpenAI Vision API | Sending raw PDF bytes — PDFs are not natively supported by vision models | Convert each page to PNG via pdfjs-dist on the server; send page images, not PDF bytes |
| WFRMLS RESO API | Expecting sold/closed price data in IDX feed | Utah is a non-disclosure state; closed price data is not available in IDX feeds; design UI without sale price history | | OpenAI structured output | Using `response_format: { type: 'json_object' }` and hoping the schema matches | Use `response_format: { type: 'json_schema', json_schema: { ... } }` with the exact schema, then validate with Zod |
| WFRMLS RESO API | Assuming instant vendor credential provisioning | Vendor approval requires contract signing, background check, and compliance review; allow 2–4 weeks lead time | | `@cantoo/pdf-lib` (confirmed import in codebase) | Calling `embedPng()` with a base64 dataURL that includes the `data:image/png;base64,` prefix on systems that strip it | The existing `embed-signature.ts` already handles this correctly — preserve the pattern when adding new embed paths |
| utahrealestate.com forms | Automating PDF downloads with Teressa's session cookies | Terms of Service prohibit automated data extraction; use manual upload or state-published public forms | | `@cantoo/pdf-lib` flatten | Drawing rectangles before flattening causes the AcroForm overlay to appear on top of drawn content | The existing `prepare-document.ts` already handles order correctly (flatten first, then draw) — preserve this order in any new prepare paths |
| Email delivery (signing links) | Relying on `localhost` or unverified sender domain for transactional email | Signing emails go to spam; SPF/DKIM/DMARC must be configured on teressacopelandhomes.com before first client send | | Vercel Blob (if migrated from local uploads) | Fetching a Blob URL inside a serverless function on the same Vercel deployment causes a request to the CDN with potential cold-start latency | Use the `@vercel/blob` SDK's `get()` method rather than `fetch(blob.url)` from within API routes |
| pdf-lib form flattening | Calling `form.flatten()` without first embedding fonts | Text fields render blank or with substitute fonts on server-side; embed fonts explicitly before flattening | | Agent signature file serving | Serving the agent's saved signature PNG via a public URL | Gate all signature file access behind the authenticated agent API — never expose with a public Blob URL |
| pdf-lib signature image placement | Using screen/canvas Y coordinate directly as the pdf-lib Y value | PDF origin is bottom-left; screen origin is top-left — always apply `pdfY = page.getHeight() - (canvasY / viewportScale)` |
| Document download API | Checking authentication only ("is logged in?") rather than authorization ("does this user own this document?") | Perform an ownership database query inside the route handler; use UUID document IDs; serve from private storage with short-lived signed URLs |
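
The Y-axis conversion in the table above can be sketched for a whole field rectangle. This is a minimal sketch under stated assumptions: the page is unrotated, its crop box starts at the origin, and the type and function names are hypothetical. Because pdf-lib positions a drawn image by its lower-left corner, the sketch subtracts the rectangle's bottom edge rather than its top edge.

```typescript
// Field rectangle in canvas pixels, origin at the top-left of the page.
interface CanvasRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Field rectangle in PDF points, origin at the bottom-left of the page.
interface PdfRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

export function canvasRectToPdf(
  rect: CanvasRect,
  pageHeightPts: number, // e.g. page.getHeight()
  viewportScale: number, // canvas pixels per PDF point
): PdfRect {
  return {
    x: rect.x / viewportScale,
    // Flip the axis using the bottom edge of the box (y + height).
    y: pageHeightPts - (rect.y + rect.height) / viewportScale,
    width: rect.width / viewportScale,
    height: rect.height / viewportScale,
  };
}
```

For a US Letter page (792 points tall) rendered at scale 1.5, a box at canvas position (150, 300) sized 300 by 75 maps to PDF position (100, 542) sized 200 by 50. A unit test asserting values like these catches an inverted axis that eyeball testing can miss.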
--- ---
@@ -360,10 +276,10 @@ Patterns that work at small scale but fail as usage grows.
| Trap | Symptoms | Prevention | When It Breaks | | Trap | Symptoms | Prevention | When It Breaks |
|------|----------|------------|----------------| |------|----------|------------|----------------|
| Full WFRMLS listing re-fetch on every sync | Slow sync jobs; API rate limit errors; listings temporarily unavailable during fetch | Implement delta sync using `ModificationTimestamp` filter | At ~100+ listings or hourly sync schedule | | OpenAI call inline with agent "AI Place Fields" button click | 10–30 second page freeze; API timeout on multi-page PDFs | Trigger AI placement as a background job; poll for completion; show progress bar | Immediately on PDFs > 5 pages |
| Generating signed PDFs synchronously in an API route | Signing page times out; Vercel/Next.js 10s function timeout exceeded | Move PDF generation to a background job or streaming response | On PDFs larger than ~2MB or with complex form filling | | PDF preview generation in a synchronous serverless function | Vercel function timeout (60s max Pro); OOM on 8MB PDFs | Generate once and store; serve from storage | On PDFs > 10MB or under concurrent load |
| Storing signed PDFs in the Next.js app's local filesystem | PDFs lost on serverless deployment; no persistence across function instances | Store all documents in S3-compatible object storage (e.g., AWS S3, Cloudflare R2) from day one | On first serverless deployment | | Storing all signatureFields JSONB on documents table without a size guard | Large JSONB column slows document list queries | Add a field count limit (max 50 fields); if AI places more, require agent review | When AI places fields on 25+ page documents with many fields per page |
| Loading full-page PDF.js bundle on every page of the site | Slow initial page load on the public marketing site | Code-split the PDF viewer — load it only on the agent dashboard and signing page routes | Immediately — affects all visitors | | dataURL signature image in `signaturesRef.current` in SigningPageClient | Each re-render serializes 50KB+ per signature into JSON | Already handled correctly in v1.0 (ref, not state) — do not move signature data to state when adding type-based rendering | Would break at > 5 simultaneous signature fields |
--- ---
@@ -373,15 +289,11 @@ Domain-specific security issues beyond general web security.
| Mistake | Risk | Prevention | | Mistake | Risk | Prevention |
|---------|------|------------| |---------|------|------------|
| Signing link token is not hashed at rest in the database | Database breach exposes all active signing tokens; attacker can sign documents on behalf of clients | Store HMAC-SHA256(token) in DB; compare hash on request, never the raw token | | Agent saved signature served via a predictable or public file path | Any user who can guess the path downloads the agent's legal signature | Store under a UUID path; serve only through `GET /api/agent/signature` which verifies the better-auth session before streaming |
| No rate limit on signing link generation | Attacker floods Teressa's client email inboxes with signing emails from the app | Rate limit signing link creation per document and per agent session (max 3 resends per document) | | AI field placement values (pre-fill text) passed to OpenAI without scrubbing | Client PII (name, email, SSN, property address) sent to OpenAI and stored in their logs | Provide only anonymized document structure to the AI (page images without personally identifiable pre-fill values); apply pre-fill values server-side after AI field detection |
| Signed PDFs accessible via guessable URL (e.g., `/documents/123`) | Sequential ID enumeration lets anyone download any signed document | Use unguessable UUIDs for document storage paths and signed URLs with short TTL for downloads | | Preview PDF served at a guessable URL (e.g. `/api/documents/{id}/preview`) without auth check | Anyone with the document ID can download a prepared document containing client PII | All document file routes must verify the agent session before streaming — apply the same guard as the existing `/api/documents/[id]/download/route.ts` |
| Agent session not invalidated on agent logout | If Teressa leaves a browser tab open on a shared computer, her entire client document library is accessible | Implement server-side session invalidation on logout; do not rely on client-side cookie deletion alone | | Agent signature dataURL transmitted from client to server in an unguarded API route | Any authenticated user (if multi-agent is ever added) can overwrite the saved signature | The save-signature endpoint must verify the session user matches the signature owner — prepare for this even in solo-agent v1 |
| Signature canvas accepts 0-stroke "signatures" | Client submits a blank canvas as their signature; the signed document has no visible signature mark | Validate that the canvas has at least a minimum number of non-white pixels / stroke events before accepting submission | | Signed PDF stale preview served to client after re-preparation | Client signs a document that differs from what agent reviewed and approved | Hash prepared PDF at prepare time; verify hash before serving to client signing page |
| No CSRF protection on the signature submission endpoint | Cross-site request forgery could submit a forged signature | The signing token in the URL/body acts as an implicit CSRF token only if validated server-side on every request; add `SameSite=Strict` to any session cookies |
| Signed PDFs at predictable/guessable URLs (sequential integer IDs) | Any authenticated user can enumerate and download all signed documents; real estate documents contain SSNs, financial info, home addresses | Use UUID v4 for document IDs; store in private bucket; issue short-lived pre-signed download URLs per-request |
| Next.js middleware as sole authorization layer | CVE-2025-29927 allows middleware bypass with a crafted HTTP header; attacker accesses any protected route without authentication | Upgrade Next.js to >= 14.2.25; add ownership check in every API route handler independent of middleware |
| Blank canvas accepted as a valid signature | Empty or near-empty PNG gets embedded in the signed document as the client's legally binding signature | Server-side validation: reject submissions where the canvas image has fewer than a minimum number of non-background pixels or fewer than a minimum number of recorded stroke events |
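
Storing only a hash of each signing token, as the first row of the table above recommends, can be sketched with Node's built-in `crypto` module. This is a minimal sketch: the function names are hypothetical, and how the HMAC secret is loaded (for example from an environment variable) is left out as an assumption.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

// Generates the raw token that goes into the emailed signing link.
export function newSigningToken(): string {
  return randomBytes(32).toString("base64url");
}

// Computes the value that is stored in the database instead of the token.
export function hashToken(token: string, secret: string): string {
  return createHmac("sha256", secret).update(token).digest("hex");
}

// Compares a presented token against the stored hash in constant time.
export function tokenMatches(
  presented: string,
  storedHash: string,
  secret: string,
): boolean {
  const expected = Buffer.from(hashToken(presented, secret), "hex");
  const stored = Buffer.from(storedHash, "hex");
  // timingSafeEqual throws on unequal lengths, so check length first.
  return expected.length === stored.length && timingSafeEqual(expected, stored);
}
```

With this arrangement a database leak exposes only hashes, which cannot be replayed as signing links without the secret.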
--- ---
@@ -391,12 +303,12 @@ Common user experience mistakes in this domain.
| Pitfall | User Impact | Better Approach | | Pitfall | User Impact | Better Approach |
|---------|-------------|-----------------| |---------|-------------|-----------------|
| Showing "Invalid token" when a signing link is expired | Client calls Teressa confused; does not know what to do next | Show "This signing link has expired. Contact Teressa Copeland to request a new one." with her phone/email pre-filled | | Preview opens in a new browser tab as a raw PDF | Agent has no context that this is a preview vs. the final document; no field overlays visible | Display preview in-app with a "PREVIEW — Fields Filled" watermark overlay on each page |
| Showing "Invalid token" when a signing link has already been used | Client thinks something is broken; may attempt to sign again | Show "You have already signed this document. Teressa has your signed copy." with a reassuring message | | AI-placed fields shown without a review step | Agent sends a document with misaligned AI fields to a client; client is confused by floating sign boxes | AI placement populates the FieldPlacer UI for agent review — never auto-sends; agent must manually click "Looks good, proceed" |
| Forcing client to scroll through the entire document before the "Sign" button appears | Clients on mobile give up before reaching the signature field | Provide a sticky "Jump to Signature" button; do not hide the sign action behind mandatory scrolling | | "Prepare and Send" button available before the agent has placed any fields | Agent sends a blank document with no signature fields; client has nothing to sign | Disable "Prepare and Send" if `signatureFields` is empty or contains only agent-signature fields (no client fields) |
| No confirmation screen after signing | Client does not know if the submission worked | Show a clear "Document Signed Successfully" confirmation with the document name and a timestamp; optionally send a confirmation email | | Agent saved signature is applied but no visual confirmation is shown | Agent thinks the signature was applied; document arrives unsigned because the apply step silently failed | Show the agent's saved signature PNG in the field placer overlay immediately after apply; require explicit confirmation before the prepare step |
| Agent has no way to see which documents are awaiting signature vs. completed | Teressa loses track of unsigned documents | Dashboard must show document status at a glance: Draft / Sent / Signed / Archived — never mix them in a flat list | | Preview shows pre-filled text but not field type labels | Agent cannot distinguish a "checkbox" pre-fill from a "text" pre-fill in the visual preview | Show field type badges (small colored labels) on the preview overlay, not just the filled content |
| Client-facing signing page has Teressa's agent branding (logo, nav) mixed with client-action UI | Client is confused about whether they are on Teressa's site or a signing service | Keep the signing page minimal: document title, Teressa's name as sender, the document, the signature canvas, and a submit button — no site navigation | | Client signing page shows no progress for non-signature fields (text, checkbox, date) | Client doesn't know they need to fill in text boxes or check checkboxes — sees only signature prompts | The progress bar in `SigningProgressBar.tsx` counts `signatureFields.length` — this must count all client-facing fields, not just signature-type fields |
--- ---
@@ -404,20 +316,14 @@ Common user experience mistakes in this domain.
Things that appear complete but are missing critical pieces. Things that appear complete but are missing critical pieces.
- [ ] **E-signature capture:** Signing canvas works in Chrome desktop — verify it captures correctly on iOS Safari and Android Chrome with touch gestures (not scroll events). - [ ] **AI field placement:** Verify the coordinate conversion unit test asserts specific PDF-space x/y values (not just "fields are returned") — eyeball testing will miss Y-axis inversion errors on Utah standard forms.
- [ ] **PDF signing:** Signature appears in the browser PDF preview — verify the position is correct in the actual downloaded PDF file opened in Adobe Reader or Preview. - [ ] **Expanded field types:** Verify `SigningPageClient.tsx` has a rendering branch for every type in the `SignatureFieldData` type union — not just the new FieldPlacer palette tokens. Check for the default/fallback case.
- [ ] **Audit trail:** Events are logged — verify the log includes: email sent, link opened, document viewed, signature submitted, final PDF hash, all with server-side timestamps. - [ ] **Agent saved signature:** Verify the saved signature is stored as a file path, not a dataURL TEXT column — check the Drizzle schema migration and confirm no `dataUrl` column was added to `users`.
- [ ] **Signing link security:** Link opens the signing page — verify the link cannot be used a second time after signing and expires after 72 hours even if unused. - [ ] **Agent signs first:** Verify that after agent applies their signature, the agent-signature field is embedded into the prepared PDF and removed from the `signatureFields` array that gets sent to the client — not just visually hidden in the FieldPlacer.
- [ ] **Font flattening:** Form fields fill correctly locally — verify the filled and flattened PDF looks correct when generated in the production serverless environment (not just on a Mac with system fonts). - [ ] **Filled preview:** Verify the preview URL changes when fields or text fill values change (cache-busting via timestamp or hash in the path) — open DevTools network tab, modify a field, re-generate preview, confirm a new file is fetched.
- [ ] **IDX compliance:** Listings display on the public site — verify every listing card and detail page has: listing broker attribution, MLS disclaimer text, last updated timestamp, and buyer's agent compensation field per 2024 NAR settlement. - [ ] **Filled preview freshness gate:** Verify the "Send" button is disabled when `lastPreviewGeneratedAt < lastFieldsUpdatedAt` — test by generating a preview, changing a field, and confirming the send button becomes disabled.
- [ ] **Stale listings:** Sync job runs — verify the job uses `ModificationTimestamp` delta filtering and removes off-market listings within 1 hour of status change, not at next full refresh. - [ ] **OpenAI token limit:** Verify the AI placement works on a real 20-page Utah REPC form, not just a 2-page test PDF — check that page 15+ fields are detected with the same accuracy as page 1.
- [ ] **PDF coordinate placement:** Signature field drag-and-drop works on a blank PDF — verify placement coordinates are correct on an actual Utah real estate purchase agreement form, including any rotated pages. - [ ] **Schema migration:** Verify that documents created in v1.0 (where `signatureFields` JSONB has entries without a `type` key) are handled gracefully by all v1.1 code paths — add a null-safe fallback for `field.type ?? 'signature'` throughout.
- [ ] **Document storage:** PDFs save — verify signed PDFs are stored in persistent object storage (S3/R2), not the local filesystem, and that URLs survive a fresh deployment.
- [ ] **Document IDOR:** Document download works for the document owner — verify with a second test account that changing the document ID in the URL returns a 403 or 404, not the other account's document.
- [ ] **iOS canvas bugs:** Signing canvas works on Chrome desktop — verify on a physical iPhone (iOS 13 and iOS 15+) in Safari that vertical strokes register correctly and the canvas does not clear itself during drawing.
- [ ] **Blank signature rejection:** Submitting a signature works — verify that clicking Submit on an untouched blank canvas returns a validation error rather than producing a signed document with an empty signature image.
- [ ] **Next.js auth:** Route protection appears to work via middleware — verify that every sensitive API route handler contains its own session/ownership check independent of middleware by testing with a crafted request that bypasses middleware.
- [ ] **CFAA compliance:** Forms import works — verify there is no code in the repository that performs automated login to utahrealestate.com; all forms must enter the system via manual agent upload.
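
The preview freshness gate from the checklist above can be sketched as a single comparison. This is a minimal sketch: `PreparationState` and `isPreviewFresh` are hypothetical names, and it assumes both timestamps are recorded on the server.

```typescript
interface PreparationState {
  lastFieldsUpdatedAt: Date;
  lastPreviewGeneratedAt: Date | null; // null until a preview exists
}

// True when the latest preview reflects the latest field changes.
export function isPreviewFresh(state: PreparationState): boolean {
  if (state.lastPreviewGeneratedAt === null) return false;
  return (
    state.lastPreviewGeneratedAt.getTime() >=
    state.lastFieldsUpdatedAt.getTime()
  );
}
```

The "Send" button would be disabled whenever `isPreviewFresh` returns false, and the send endpoint should repeat the same check on the server.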
--- ---
@@ -427,13 +333,12 @@ When pitfalls occur despite prevention, how to recover.
| Pitfall | Recovery Cost | Recovery Steps | | Pitfall | Recovery Cost | Recovery Steps |
|---------|---------------|----------------| |---------|---------------|----------------|
| Signed PDF lacks document hash; legal challenge | HIGH | Reconstruct signing ceremony evidence from audit log events, IP records, email delivery logs; engage legal counsel; this cannot be fully recovered — prevention is the only answer | | Client received signing link but signing page crashes on new field types | HIGH | Emergency hotfix: add `field.type ?? 'signature'` fallback in SigningPageClient; deploy; invalidate old token; send new link |
| utahrealestate.com scraping blocked or ToS violation notice | MEDIUM | Immediately disable scraper; switch to manual PDF upload flow; apologize to MLS; a manually-uploaded forms library was the right design from the start | | AI placed fields are wrong/inverted on first real-form test | LOW | Fix coordinate conversion unit; re-run AI placement for that document; no data migration needed |
| IDX compliance violation discovered (missing attribution) | MEDIUM | Fix attribution immediately across all listing pages; contact WFRMLS compliance to self-report before they audit; document the fix with timestamps | | Agent saved signature stored as dataURL in DB | MEDIUM | Add migration: extract dataURL to file, update path column, nullify dataURL column; existing signed PDFs are unaffected |
| Signing link replay attack (duplicate signature submitted) | MEDIUM | Add `used_at` validation to token check; mark the earlier duplicate submission as invalid in the audit log; notify affected signer via email | | Preview PDF served stale after field changes | LOW | Add cache-busting query param or timestamp to preview URL; no data changes needed |
| Signatures misplaced in PDFs (coordinate bug) | MEDIUM | Identify all affected documents; mark them as invalid in the system; Teressa must re-send for re-signing; fix coordinate conversion and add the regression test | | Agent-signature field appears in client's signing field list | HIGH | Emergency hotfix: filter signatureFields in signing token GET by type; redeploy; affected in-flight signing sessions may need new tokens |
| PDF font flattening failure (blank text in signed PDFs) | LOW-MEDIUM | Embed standard fonts explicitly in the flattening step; regenerate affected documents; re-send for signing if clients have not yet signed | | Large PDF causes Vercel function OOM during preview generation | MEDIUM | Switch preview to background job + polling; no data migration; existing prepared PDFs are valid |
| Stale listing shown to client who made an offer on unavailable property | LOW | Implement hourly delta sync immediately; add a "verify availability" disclaimer to listing pages in the short term |
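
The two HIGH-cost recoveries in the table (the `field.type ?? 'signature'` fallback and filtering agent-signature fields out of the client's field list) could be sketched together as below. This is a minimal illustration, not the project's code: `StoredField`, `resolveFieldType`, and `fieldsForClient` are hypothetical names, and the real `SignatureFieldData` shape in `schema.ts` may differ.

```typescript
// Hypothetical field shape; the real SignatureFieldData may carry more columns.
type FieldType =
  | "signature"
  | "agent-signature"
  | "initials"
  | "text"
  | "checkbox"
  | "date";

interface StoredField {
  id: string;
  page: number;
  type?: FieldType; // absent on documents prepared under v1.0
}

// A missing type is treated as a client signature, matching v1.0 behavior.
function resolveFieldType(field: StoredField): FieldType {
  return field.type ?? "signature";
}

// Return only the fields the client should see on the signing page.
function fieldsForClient(fields: StoredField[]): StoredField[] {
  return fields.filter((f) => resolveFieldType(f) !== "agent-signature");
}
```

Applying the filter on the server, before the signing token response is built, keeps agent-only fields out of the payload entirely rather than hiding them in the UI.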
--- ---
@@ -443,68 +348,34 @@ How roadmap phases should address these pitfalls.
| Pitfall | Prevention Phase | Verification | | Pitfall | Prevention Phase | Verification |
|---------|------------------|--------------| |---------|------------------|--------------|
| No tamper-evident PDF hash | Document signing backend | Compute and store SHA-256 hash on every signed PDF; verify by recomputing and comparing | | Breaking signing page with new field types (Pitfall 1) | Phase 1: Schema + signing page update | Deploy field type union; confirm signing page renders placeholder for unknown types; load an old v1.0 document with no type field and verify graceful fallback |
| Incomplete audit trail | Email-link signing flow | Trace a full test signing ceremony and confirm all 6 event types appear in the audit log | | AI coordinate system mismatch (Pitfall 2) | Phase 2: AI integration — coordinate conversion utility | Unit test with a known Utah REPC: assert specific PDF-space x/y for a known field; Y-axis inversion test |
| Replayable signing link | Email-link signing flow | After signing, reload the signing URL and confirm it returns "already signed" — not the canvas | | OpenAI token limits on large PDFs (Pitfall 3) | Phase 2: AI integration — page-by-page pipeline | Test with the longest form Teressa uses (likely 20+ page REPC); verify all pages processed |
| PDF coordinate mismatch | PDF field placement UI | Place a test signature on a known position in an actual Utah purchase agreement; open in Acrobat and verify visual position | | Prompt hallucination and schema incompatibility (Pitfall 4) | Phase 2: AI integration — Zod validation of AI response | Feed an edge-case page (all text, no form fields) and verify AI returns empty array, not hallucinated fields |
| utahrealestate.com forms scraping | Forms import architecture | No scraping code exists in the codebase; PDF source is manual upload or public state forms only | | Saved signature as dataURL in DB (Pitfall 5) | Phase 3: Agent saved signature | Confirm Drizzle schema has a path column, not a dataURL column; verify file is stored under UUID path |
| IDX display without required attribution | Public listings feature | Audit every listing page template against NAR IDX policy checklist before launch | | Race condition: agent updates signature mid-signing (Pitfall 6) | Phase 3: Agent saved signature + supersede flow | Confirm "Prepare and Send" on a Sent/Viewed document requires confirmation and invalidates old token |
| Stale off-market listings | Listings sync infrastructure | Manually mark a test listing off-market in a dev MLS feed; verify it disappears from the site within 1 hour | | Stale preview after field changes (Pitfall 7) | Phase 4: Filled document preview | Modify a field after preview generation; confirm send button disables or preview refreshes |
| PDF heuristic field detection failure | PDF form preparation UI | Test against a scanned (non-OCR) Utah real estate form; verify the manual placement fallback UI appears | | OOM on large PDF preview (Pitfall 8) | Phase 4: Filled document preview | Test preview generation on a 20-page REPC; monitor Vercel function memory in dashboard |
| Font not embedded before flattening | PDF form filling | Flatten a filled form in the production serverless environment; open the result in Acrobat and confirm all text is visible | | Client signs different doc than agent previewed (Pitfall 9) | Phase 4: Filled document preview | Confirm prepared PDF is hashed at prepare time; verify hash is checked before streaming to client |
| Mobile canvas unusable | Client signing UI | Complete a full signing flow on iOS Safari and Android Chrome on physical devices | | Agent-signature field shown to client (Pitfall 10) | Phase 3: Agent signing flow | Confirm signing token GET filters `type === 'agent-signature'` fields before returning; test with a document that has both agent and client signature fields |
| Missing MLS compliance disclaimers | Public listings feature | Legal/compliance checklist as a PR gate for any listing-display component |
| iOS canvas self-clearing / vertical stroke bug | Client signing UI | Physical device testing on iOS 13 and iOS 15+ before any client-facing release |
| Signed PDF IDOR (sequential IDs, no ownership check) | Document storage architecture | Penetration test with two accounts: confirm ID enumeration returns 403, not another user's document |
| Next.js middleware-only auth (CVE-2025-29927) | API route implementation | Code review gate: every sensitive route handler must contain an explicit ownership/session check |
| CFAA exposure from credential-based scraping | Forms import architecture design | Architecture decision document: no automated login code for utahrealestate.com; manual upload only |
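
For the stale-preview check (Pitfall 7), one possible approach is to fingerprint the field state at preview time and compare it again before sending. The sketch below is an assumption about how that could look; `PlacedField`, `fieldsFingerprint`, and `isPreviewStale` are hypothetical names and not part of the reviewed codebase.

```typescript
// Hypothetical shape of a placed field as stored with the document.
interface PlacedField {
  id: string;
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
  value?: string;
}

// Order-independent fingerprint of everything that affects the rendered preview.
function fieldsFingerprint(fields: PlacedField[]): string {
  const sorted = [...fields].sort((a, b) => a.id.localeCompare(b.id));
  return JSON.stringify(
    sorted.map((f) => [f.id, f.page, f.x, f.y, f.width, f.height, f.value ?? ""]),
  );
}

// True when the fields changed after the preview was generated.
function isPreviewStale(previewFingerprint: string, current: PlacedField[]): boolean {
  return previewFingerprint !== fieldsFingerprint(current);
}
```

The send button would stay disabled while `isPreviewStale` returns true, which is the behavior the verification column asks for.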
--- ---
## Sources ## Sources
- [ESIGN Act & UETA Compliance — UnicornForms](https://www.unicornforms.com/blog/esign-ueta-compliance) - Reviewed `src/lib/db/schema.ts``SignatureFieldData` has no `type` field; confirmed by inspection 2026-03-21
- [E-Signature Audit Trail Schema — Anvil Engineering](https://www.useanvil.com/blog/engineering/e-signature-audit-trail-schema-events-json-checklist/) - Reviewed `src/app/sign/[token]/_components/SigningPageClient.tsx` — confirmed all fields open signature modal; no type branching
- [Using E-Signatures in Court — Fenwick](https://www.fenwick.com/insights/publications/using-e-signatures-in-court-the-value-of-an-audit-trail) - Reviewed `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` — confirmed single "Signature" token; `screenToPdfCoords` function confirms Y-axis inversion pattern
- [E-Signature Software with Audit Trail — Zignt](https://zignt.com/blog/e-signature-software-with-audit-trail) - Reviewed `src/lib/signing/embed-signature.ts` — confirms `@cantoo/pdf-lib` import; PNG-only embed
- [Utah Electronic Signature Act Explained — Signable](https://www.signable.co.uk/electronic-signatures-legally-binding-utah/) - Reviewed `src/lib/pdf/prepare-document.ts` — confirms AcroForm flatten-first ordering; text stamp fallback
- [UtahRealEstate.com Data Services — Vendor FAQ](https://vendor.utahrealestate.com/faq) - Reviewed `src/app/api/sign/[token]/route.ts` — confirmed `signatureFields: doc.signatureFields ?? []` sends unfiltered fields to client (line 88)
- [UtahRealEstate.com RESO Web API Docs](https://vendor.utahrealestate.com/webapi/docs/tuts/endpoints) - Reviewed `src/app/portal/(protected)/documents/[docId]/_components/PreparePanel.tsx` — no guard against re-preparation of Sent/Viewed documents
- [WFRMLS IDX — Utah MLS at RESO](https://www.reso.org/web-api-examples/mls/utah-mls/) - [OpenAI Vision API Token Counting](https://platform.openai.com/docs/guides/vision#calculating-costs) — image token costs confirmed; LOW tile = 85 tokens, HIGH tile adds detail tokens per 512px tile
- [UtahRealEstate Partners with SkySlope Forms](https://blog.utahrealestate.com/index.php/2022/01/31/utahrealestate-comskyslope/) - [OpenAI Structured Output (JSON Schema mode)](https://platform.openai.com/docs/guides/structured-outputs) — `json_schema` mode confirmed as more reliable than `json_object` for typed responses
- [IDX Integration Best Practices — Real Estate 7](https://contempothemes.com/idx-integration-best-practices-for-mls-rules/) - [Vercel Serverless Function Limits](https://vercel.com/docs/functions/runtimes/node-js#memory-and-compute) — 256MB default, 1024MB on Pro; 60s max execution on Pro
- [NAR IDX Policy Statement 7.58](https://www.nar.realtor/handbook-on-multiple-listing-policy/advertising-print-and-electronic-section-1-internet-data-exchange-idx-policy-policy-statement-7-58) - `@cantoo/pdf-lib` confirmed as the import used (not `@pdfme/pdf-lib` or `pdf-lib`) — v1.0 codebase uses this fork throughout
- [Summary of 2025 MLS Changes — NAR](https://www.nar.realtor/about-nar/policies/summary-of-2025-mls-changes)
- [PDF Page Coordinates — pdfscripting.com](https://www.pdfscripting.com/public/PDF-Page-Coordinates.cfm)
- [PDF Coordinate Systems — Apryse](https://apryse.com/blog/pdf-coordinates-and-pdf-processing)
- [Critical Bug: PDF Annotation Positioning Mismatch — sign-pdf GitHub issue](https://github.com/mattsilv/sign-pdf/issues/1)
- [pdf-lib Field Coordinates Feature Request — GitHub issue #602](https://github.com/Hopding/pdf-lib/issues/602)
- [Magic Link Security Best Practices — Deepak Gupta](https://guptadeepak.com/mastering-magic-link-security-a-deep-dive-for-developers/)
- [Passwordless Magic Links: UX and Security Checklist — AppMaster](https://appmaster.io/blog/passwordless-magic-links-ux-security-checklist)
- [Magic Link Security — BayTech Consulting](https://www.baytechconsulting.com/blog/magic-links-ux-security-and-growth-impacts-for-saas-platforms-2025)
- [Signature Pad Library — szimek/signature_pad](https://github.com/szimek/signature_pad)
- [MLS Listing Data Freshness and Cache — MLSImport](https://mlsimport.com/fix-outdated-listings-on-your-wordpress-real-estate-site/)
- [Utah DRE State-Approved Forms — commerce.utah.gov](https://commerce.utah.gov/realestate/real-estate/forms/state-approved/)
- [Canvas clears itself on iOS 15 (react-signature-canvas Issue #65) — GitHub](https://github.com/agilgur5/react-signature-canvas/issues/65)
- [Drawing broken on iOS 13 (signature_pad Issue #455) — GitHub](https://github.com/szimek/signature_pad/issues/455)
- [Free drawing on iOS disables page scrolling (fabric.js Issue #3756) — GitHub](https://github.com/fabricjs/fabric.js/issues/3756)
- [Insecure Direct Object Reference Prevention — OWASP Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Insecure_Direct_Object_Reference_Prevention_Cheat_Sheet.html)
- [IDOR in Next.js / JavaScript Applications — nodejs-security.com](https://www.nodejs-security.com/blog/insecure-direct-object-reference-idor-javascript-applications)
- [CVE-2025-29927: Next.js Middleware Authorization Bypass — ProjectDiscovery](https://projectdiscovery.io/blog/nextjs-middleware-authorization-bypass)
- [Critical Next.js Vulnerability: Authorization Bypass in Middleware — Jit.io](https://www.jit.io/resources/app-security/critical-nextjs-vulnerability-authorization-bypass-in-middleware)
- [U.S. Court Rules Against Online Travel Booking Company in Web-Scraping Case (2024 CFAA jury verdict) — Alston Bird](https://www.alstonprivacy.com/u-s-court-rules-against-online-travel-booking-company-in-web-scraping-case/)
- [Web Scraping, website terms and the CFAA: hiQ affirmed — White & Case](https://www.whitecase.com/insight-our-thinking/web-scraping-website-terms-and-cfaa-hiqs-preliminary-injunction-affirmed-again)
- [The Computer Fraud and Abuse Act and Third-Party Web Scrapers — Finnegan](https://www.finnegan.com/en/insights/articles/the-computer-fraud-and-abuse-act-and-third-party-web-scrapers.html)
- [JWT Security Best Practices — Curity](https://curity.io/resources/learn/jwt-best-practices/)
- [JSON Web Token Attacks — PortSwigger Web Security Academy](https://portswigger.net/web-security/jwt)
- [PDF Form Flattening — DynamicPDF](https://www.dynamicpdf.com/docs/dotnet/dynamic-pdf-form-flattening)
- [Flatten Your PDFs for Court Filings — eFilingHelp](https://www.efilinghelp.com/electronic-filing/flatten-pdfs/)
- [E-Signatures in Real Estate Transactions: a Deep Dive — Gomez Law, APC](https://gomezlawla.com/blog/e-signatures-in-real-estate-transactions-a-deep-dive/)
- [Enforceability of Electronic Agreements in Real Estate — Arnall Golden Gregory LLP](https://www.agg.com/news-insights/publications/enforceability-of-electronic-agreements-in-real-estate-transactions-06-30-2016/)
- [Utah Code § 46-4-201: Legal recognition of electronic signatures — Justia](https://law.justia.com/codes/utah/title-46/chapter-4/part-2/section-201/)
- [Utah Code § 46-4-203: Attribution and effect of electronic signature — Justia](https://law.justia.com/codes/utah/2023/title-46/chapter-4/part-2/section-203/)
--- ---
*Pitfalls research for: Real estate broker web app — custom e-signature, WFRMLS/IDX integration, PDF document signing* *Pitfalls research for: Teressa Copeland Homes — v1.1 AI field placement, expanded field types, agent signing, filled preview*
*Researched: 2026-03-19* *Researched: 2026-03-21*

View File

@@ -1,41 +1,39 @@
# Stack Research # Stack Research
**Domain:** Real estate agent website + PDF document signing web app **Domain:** Real estate agent website + PDF document signing web app
**Researched:** 2026-03-19 **Researched:** 2026-03-21
**Confidence:** HIGH (versions verified via npm/GitHub; integration strategies based on official docs) **Confidence:** HIGH (versions verified via npm registry; integration issues verified via official GitHub issues)
**Scope:** v1.1 additions only — OpenAI integration, expanded field types, agent signature storage, filled preview
--- ---
## Recommended Stack ## Existing Stack (Do Not Re-research)
### Core Technologies Already validated and in `package.json`. Do not change these.
| Technology | Version in package.json | Role |
|------------|------------------------|------|
| Next.js | 16.2.0 | Full-stack framework |
| React | 19.2.4 | UI |
| `@cantoo/pdf-lib` | ^2.6.3 | PDF modification (server-side) |
| `react-pdf` | ^10.4.1 | In-browser PDF rendering |
| `signature_pad` | ^5.1.3 | Canvas signature drawing |
| `zod` | ^4.3.6 | Schema validation |
| `@vercel/blob` | ^2.3.1 | File storage |
| Drizzle ORM + `postgres` | ^0.45.1 / ^3.4.8 | Database |
| Auth.js (next-auth) | 5.0.0-beta.30 | Authentication |
---
## New Stack Additions for v1.1
### Core New Dependency: OpenAI API
| Technology | Version | Purpose | Why Recommended | | Technology | Version | Purpose | Why Recommended |
|------------|---------|---------|-----------------| |------------|---------|---------|-----------------|
| Next.js | 15.5.x (use 15, not 16 — see note) | Full-stack framework | App Router + Server Components + API routes in one repo; Vercel-native; 15.5 is current LTS-equivalent with stable Node.js middleware | | `openai` | ^6.32.0 | OpenAI API client for GPT calls | Official SDK, current latest, TypeScript-native. Provides `client.chat.completions.create()` for structured JSON output via manual `json_schema` response format. Required for AI field placement and pre-fill. |
| React | 19.x | UI | Ships with Next.js 15; Server Components, `use()`, form actions all stable |
| TypeScript | 5.x | Type safety | Required; Prisma, Auth.js, pdf-lib all ship full types |
| PostgreSQL (Neon) | latest (PG 16) | Data persistence | Serverless-compatible, scales to zero, free tier generous, branches per PR — perfect for solo agent on Vercel |
| Prisma ORM | 6.x (6.19+) | Database access | Best-in-class DX for TypeScript; schema migrations; works with Neon via `@neondatabase/serverless` driver |
**Note on Next.js version:** Next.js 16 is now available but introduced several breaking changes (sync APIs fully removed, `middleware` renamed to `proxy`, edge runtime dropped from proxy). Use **Next.js 15.5.x** for this project to avoid churn — it has stable Node.js middleware, stable Turbopack builds in beta, and typed routes. Upgrade to 16 after it stabilizes (6-12 months). **No other new core dependencies are needed.** The remaining v1.1 features extend capabilities already in `@cantoo/pdf-lib`, `signature_pad`, and `react-pdf`.
---
### PDF Processing Libraries
Three separate libraries serve three distinct roles. You need all three.
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| `pdfjs-dist` | 5.5.x (5.5.207 current) | Render PDFs in browser; detect existing AcroForm fields | Use on the client side to display the imported PDF and to iterate over existing form field annotations so you can find where signature areas already exist |
| `pdf-lib` | 1.17.1 (original) OR `@pdfme/pdf-lib` 5.5.x (actively maintained fork) | Modify PDFs server-side: fill text fields, embed signature images, flatten | Use on the server (API route / Server Action) to fill form fields, embed the canvas signature PNG, and produce the final signed PDF. Use the `@pdfme/pdf-lib` fork — the original has been unmaintained for 4 years |
| `react-pdf` (wojtekmaj) | latest (wraps pdfjs-dist) | React component wrapper for PDF display | Use in the client-side signing UI to render the document page-by-page with minimal setup; falls back to `pdfjs-dist` for custom rendering needs |
**Critical distinction:**
- `pdfjs-dist` = rendering/viewing only — cannot modify PDFs
- `pdf-lib` / `@pdfme/pdf-lib` = modifying/creating PDFs — cannot render to screen
- You must use both together for this workflow
--- ---
@@ -43,384 +41,237 @@ Three separate libraries serve three distinct roles. You need all three.
| Library | Version | Purpose | When to Use | | Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------| |---------|---------|---------|-------------|
| `better-auth` | 1.5.x (1.5.5 current) | Agent authentication | Single-agent portal; credentials auth (email+password) built in; rate limiting and session management out of the box; no per-MAU cost; first-class Next.js App Router support | | `unpdf` | ^1.4.0 | Server-side PDF text extraction | Use in the AI pipeline API route to extract raw text from PDF pages before sending to OpenAI. Serverless-compatible, wraps PDF.js v5, works in Next.js API routes without native bindings. More reliable in serverless than `pdfjs-dist` directly. |
| `resend` + `@react-email/components` | 6.9.x / latest | Email delivery for signing links | Resend is the modern Nodemailer replacement; generous free tier (3k/mo), React-based email templates, dead simple API route integration |
| `signature_pad` | 5.1.3 | Canvas-based signature drawing | Core signature capture library; HTML5 canvas, Bezier curve smoothing, works on touch/mouse; use directly (not the React wrapper) so you control the ref and can export PNG | No other new supporting libraries needed. See "What NOT to Add" below.
| `react-signature-canvas` | 1.1.0-alpha.2 | React wrapper around signature_pad | Optional convenience wrapper if you prefer JSX integration; note: alpha version — prefer using `signature_pad` directly with a `useRef` |
| `@vercel/blob` | latest | PDF file storage | Zero-config for Vercel deploys; S3-backed (99.999999999% durability); PDFs are not large enough to hit egress cost issues for a solo agent; avoid vendor lock-in concern by abstracting behind a `storage.ts` service module |
| `playwright` | 1.58.x | utahrealestate.com forms library scraping; credentials-based login | Only option for sites requiring JS execution + session cookies; multi-browser, auto-wait, built-in proxy support |
| `zod` | 3.x | Request/form validation | Used with Server Actions and API routes; integrates with Prisma and better-auth |
| `nanoid` | 5.x | Generating secure signing link tokens | Cryptographically secure, URL-safe, short IDs for signing request URLs |
| `sharp` | 0.33.x | Image optimization if needed | Only needed if resizing/converting signature images before PDF embedding |
--- ---
### Development Tools ### Development Tools
| Tool | Purpose | Notes | No new dev tooling required for v1.1 features.
|------|---------|-------|
| `tailwindcss` v4 | Styling | v4 released 2025; CSS-native config, no `tailwind.config.js` required; significantly faster |
| `shadcn/ui` | Component library | Copies components into your repo (not a dependency); works with Tailwind v4 and React 19; perfect for the agent portal UI |
| ESLint 9 + `@typescript-eslint` | Linting | Next.js 15 ships ESLint 9 support |
| Prettier | Formatting | Standard |
| `prisma studio` | Database GUI | Built into Prisma; `npx prisma studio` |
| `@next/bundle-analyzer` | Bundle analysis | Experimental in Next.js 16.1 but available as standalone package for 15 |
| Turbopack | Dev server | Enabled by default in Next.js 15 (`next dev --turbo`); production builds still use Webpack in 15.x |
--- ---
## Installation ## Installation
```bash ```bash
# Core framework # New dependencies for v1.1
npm install next@^15.5 react@^19 react-dom@^19 npm install openai unpdf
# Database
npm install prisma @prisma/client @neondatabase/serverless
# Authentication
npm install better-auth
# PDF processing
npm install @pdfme/pdf-lib pdfjs-dist react-pdf
# E-signature (canvas)
npm install signature_pad
# Email
npm install resend @react-email/components react-email
# File storage
npm install @vercel/blob
# Scraping (for utahrealestate.com forms + listings)
npm install playwright
# Utilities
npm install zod nanoid
# Dev dependencies
npm install -D typescript @types/node @types/react @types/react-dom
npm install -D tailwindcss @tailwindcss/postcss postcss
npm install -D eslint eslint-config-next @typescript-eslint/eslint-plugin prettier
npx prisma init
npx playwright install chromium
``` ```
That is the full installation delta for v1.1.
---
## Feature-by-Feature Integration Notes
### Feature 1: OpenAI PDF Analysis + Field Placement
**Flow:**
1. API route receives document ID
2. Fetch PDF bytes from Vercel Blob (`@vercel/blob` — already installed)
3. Extract text per page using `unpdf`: `getDocumentProxy()` + `extractText()`
4. Call OpenAI `gpt-4o-mini` with extracted text + a manually defined JSON schema
5. Parse structured response: array of `{ fieldType, label, pageNumber, x, y, width, height, suggestedValue }`
6. Save placement records to DB via Drizzle ORM
**Why `gpt-4o-mini` (not `gpt-4o`):** Sufficient for structured field extraction on real estate forms. Significantly cheaper. The task is extraction from known document templates — not complex reasoning.
**Why manual JSON schema (not `zodResponseFormat`):** The project uses `zod` v4.3.6. The `zodResponseFormat` helper in `openai/helpers/zod` uses vendored `zod-to-json-schema` that still expects `ZodFirstPartyTypeKind` — removed in Zod v4. This is a confirmed open bug as of late 2025. Using `zodResponseFormat` with Zod v4 throws runtime exceptions. Use `response_format: { type: "json_schema", json_schema: { name: "...", strict: true, schema: { ... } } }` directly with plain TypeScript types instead.
```typescript
// CORRECT for Zod v4 projects: use a manual JSON schema, not zodResponseFormat
const response = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: prompt }],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "field_placements",
      strict: true,
      schema: {
        type: "object",
        properties: {
          fields: {
            type: "array",
            items: {
              type: "object",
              properties: {
                fieldType: { type: "string", enum: ["text", "checkbox", "initials", "date", "signature"] },
                label: { type: "string" },
                pageNumber: { type: "number" },
                x: { type: "number" },
                y: { type: "number" },
                width: { type: "number" },
                height: { type: "number" },
                suggestedValue: { type: "string" }
              },
              required: ["fieldType", "label", "pageNumber", "x", "y", "width", "height", "suggestedValue"],
              additionalProperties: false
            }
          }
        },
        required: ["fields"],
        additionalProperties: false
      }
    }
  }
});

// content is typed string | null (for example on a refusal), so check before parsing
const content = response.choices[0].message.content;
if (content === null) throw new Error("OpenAI returned no content");
const result = JSON.parse(content);
```
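
Even with `strict: true`, the parsed result is untyped and may contain values that are valid JSON but unusable as placements (a page number beyond the document, a zero-width box). A minimal validation sketch without any Zod dependency might look like the following. The names `AiField`, `isAiField`, and `parseFieldPlacements` are hypothetical, and treating `pageNumber` as 1-indexed is an assumption the prompt would have to enforce.

```typescript
const FIELD_TYPES = ["text", "checkbox", "initials", "date", "signature"] as const;
type AiFieldType = (typeof FIELD_TYPES)[number];

interface AiField {
  fieldType: AiFieldType;
  label: string;
  pageNumber: number; // assumed 1-indexed
  x: number;
  y: number;
  width: number;
  height: number;
  suggestedValue: string;
}

// Runtime check that an unknown value has the shape the JSON schema describes.
function isAiField(v: unknown): v is AiField {
  if (typeof v !== "object" || v === null) return false;
  const f = v as Record<string, unknown>;
  const numericKeys = ["pageNumber", "x", "y", "width", "height"];
  return (
    typeof f.fieldType === "string" &&
    (FIELD_TYPES as readonly string[]).indexOf(f.fieldType) !== -1 &&
    typeof f.label === "string" &&
    typeof f.suggestedValue === "string" &&
    numericKeys.every((k) => typeof f[k] === "number" && isFinite(f[k] as number))
  );
}

// Parse the model output and drop anything that cannot be placed on a real page.
// JSON.parse throws on malformed input; callers should catch that separately.
function parseFieldPlacements(raw: string, pageCount: number): AiField[] {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) return [];
  const fields = (parsed as { fields?: unknown }).fields;
  if (!Array.isArray(fields)) return [];
  return fields
    .filter(isAiField)
    .filter(
      (f) =>
        Math.floor(f.pageNumber) === f.pageNumber &&
        f.pageNumber >= 1 &&
        f.pageNumber <= pageCount &&
        f.width > 0 &&
        f.height > 0,
    );
}
```

Dropping invalid entries silently is one design choice; logging them would make hallucinated placements easier to spot during testing.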
---
### Feature 2: Expanded Field Types in @cantoo/pdf-lib
**No new library needed.** `@cantoo/pdf-lib` v2.6.3 already supports all required field types natively:
| Field Type | @cantoo/pdf-lib API |
|------------|---------------------|
| Text | `form.createTextField(name)` → `.addToPage(page, options)` → `.setText(value)` |
| Checkbox | `form.createCheckBox(name)` → `.addToPage(page, options)` → `.check()` / `.uncheck()` |
| Initials | No dedicated type — use `createTextField` with width/height appropriate for initials |
| Date | No dedicated type — use `createTextField`, constrain value format in application logic |
| Agent Signature | Use `page.drawImage(embeddedPng, { x, y, width, height })` — see Feature 3 |
**Key pattern for checkboxes:**
```typescript
const checkBox = form.createCheckBox('fieldName')
checkBox.addToPage(page, { x, y, width: 15, height: 15, borderWidth: 1 })
if (shouldBeChecked) checkBox.check()
```
**Coordinate system note:** `@cantoo/pdf-lib` uses PDF coordinate space where y=0 is the bottom of the page. If field positions come from `unpdf` / PDF.js (which uses y=0 at top), you must transform: `pdfY = pageHeight - sourceY - fieldHeight`.
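
The transform in the note above can be written as a small helper. This is a sketch only: `Box` and `toPdfSpace` are hypothetical names, and it assumes the source box is expressed in PDF points with a top-left origin and an unrotated page.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Convert a box measured from the top-left corner of the page into
// PDF space, where y=0 is the bottom edge. x, width and height are unchanged.
function toPdfSpace(box: Box, pageHeight: number): Box {
  return { ...box, y: pageHeight - box.y - box.height };
}
```

For a US Letter page (792 points tall), a 20-point-high field whose top edge sits 100 points from the top ends up with its bottom edge at y = 672.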
---
### Feature 3: Agent Signature Storage
**No new library needed.** The project already has `signature_pad` v5.1.3, `@vercel/blob`, and Drizzle ORM.
**Architecture:**
1. Agent draws signature in browser using `signature_pad` (already installed)
2. Call `signaturePad.toDataURL('image/png')` to get base64 PNG
3. POST to API route; server converts base64 → `Uint8Array` → uploads to Vercel Blob at a stable path (e.g., `/agents/{agentId}/signature.png`)
4. Save blob URL to agent record in DB (add `signatureImageUrl` column to `Agent`/`User` table via Drizzle migration)
5. On "apply agent signature": server fetches blob URL, embeds PNG into PDF using `@cantoo/pdf-lib`
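
Step 3 (base64 → `Uint8Array`) could be sketched as below. The helper name `pngDataUrlToBytes` is hypothetical, and the PNG signature check is an added safeguard that the source does not describe; it simply rejects payloads that are not PNG before anything is uploaded.

```typescript
const PNG_PREFIX = "data:image/png;base64,";
// First eight bytes of every PNG file.
const PNG_MAGIC = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Decode the output of signaturePad.toDataURL('image/png') into raw bytes.
function pngDataUrlToBytes(dataUrl: string): Uint8Array {
  if (dataUrl.indexOf(PNG_PREFIX) !== 0) {
    throw new Error("Expected a base64 PNG data URL");
  }
  const binary = atob(dataUrl.slice(PNG_PREFIX.length));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  // Reject payloads that claim to be PNG but do not start with the PNG signature.
  if (!PNG_MAGIC.every((b, i) => bytes[i] === b)) {
    throw new Error("Payload is not a PNG image");
  }
  return bytes;
}
```

The returned bytes can be passed to the blob upload and later to `embedPng()`, so the data URL itself never needs to be stored.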
**`signature_pad` v5 in React — use `useRef` on a `<canvas>` element directly:**
```typescript
'use client' // hooks require a client component in the Next.js App Router

import SignaturePad from 'signature_pad'
import { useRef, useEffect } from 'react'

export function SignatureDrawer() {
  const canvasRef = useRef<HTMLCanvasElement>(null)
  const padRef = useRef<SignaturePad | null>(null)

  useEffect(() => {
    if (canvasRef.current) {
      padRef.current = new SignaturePad(canvasRef.current)
    }
    // Detach the pad's pointer/touch listeners on unmount
    return () => padRef.current?.off()
  }, [])

  const save = () => {
    // Refuse to save an untouched canvas
    if (!padRef.current || padRef.current.isEmpty()) return
    const dataUrl = padRef.current.toDataURL('image/png')
    // POST dataUrl to /api/agent/signature
  }

  return <canvas ref={canvasRef} width={400} height={150} />
}
```
**Do NOT add `react-signature-canvas`.** It wraps `signature_pad` at v1.1.0-alpha.2 (alpha status) and the project already has `signature_pad` directly. Use the raw library with a `useRef`.
**Embedding the saved signature into PDF:**
```typescript
const sigBytes = await fetch(agentSignatureBlobUrl).then(r => r.arrayBuffer())
const sigImage = await pdfDoc.embedPng(new Uint8Array(sigBytes))
const dims = sigImage.scaleToFit(fieldWidth, fieldHeight)
page.drawImage(sigImage, { x: fieldX, y: fieldY, width: dims.width, height: dims.height })
```
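
The snippet above draws the scaled image at the field's origin, so a signature narrower or shorter than the field sits in its bottom-left corner. If centering is preferred, the placement could be computed as in this sketch. `fitCentered` is a hypothetical helper; its scaling is intended to mirror what `scaleToFit` returns (uniform scale, aspect ratio preserved), which is an assumption worth checking against the library.

```typescript
interface Size {
  width: number;
  height: number;
}

interface Rect extends Size {
  x: number;
  y: number;
}

// Scale an image uniformly to fit inside a field and center it there.
function fitCentered(image: Size, field: Rect): Rect {
  const scale = Math.min(field.width / image.width, field.height / image.height);
  const width = image.width * scale;
  const height = image.height * scale;
  return {
    x: field.x + (field.width - width) / 2,
    y: field.y + (field.height - height) / 2,
    width,
    height,
  };
}
```

The result can be passed straight to `page.drawImage(sigImage, fitCentered(sigImage, fieldRect))`, assuming `fieldRect` is already in PDF space.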
---
### Feature 4: Filled Document Preview
**No new library needed.** `react-pdf` v10.4.1 is already installed and supports rendering a PDF from an `ArrayBuffer` directly.
**Architecture:**
1. Server Action: load original PDF from Vercel Blob, apply all field values (text, checkboxes, embedded signature image) using `@cantoo/pdf-lib`, return `pdfDoc.save()` bytes
2. API route returns the bytes as `application/pdf`; client receives as `ArrayBuffer`
3. Pass `ArrayBuffer` directly to `react-pdf`'s `<Document file={arrayBuffer}>` — no upload required
**Known issue with react-pdf v7+:** `ArrayBuffer` becomes detached after first use. Always copy:
```typescript
const safeCopy = (buf: ArrayBuffer) => {
  const copy = new ArrayBuffer(buf.byteLength)
  new Uint8Array(copy).set(new Uint8Array(buf))
  return copy
}
<Document file={safeCopy(previewBuffer)}>
```
**react-pdf renders the flattened PDF accurately** — all filled text fields, checked checkboxes, and embedded signature images will appear correctly because they are baked into the PDF bytes by `@cantoo/pdf-lib` before rendering.
--- ---
## Alternatives Considered ## Alternatives Considered
| Recommended | Alternative | Why Not | | Recommended | Alternative | Why Not |
|-------------|-------------|---------| |-------------|-------------|---------|
| `better-auth` | Clerk | Clerk costs per MAU; overkill for one agent; hosted user data is unnecessary complexity for a single known user | | `unpdf` for text extraction | `pdfjs-dist` directly in Node API route | `pdfjs-dist` v5 uses `Promise.withResolvers` requiring Node 22+; the project targets Node 20 LTS. `unpdf` ships a polyfilled serverless build that handles this. |
| `better-auth` | Auth.js v5 (NextAuth) | Auth.js v5 remains in beta; better-auth has the same open-source no-lock-in benefit but with built-in rate limiting and MFA. Notably, Auth.js is merging with better-auth team | | `unpdf` for text extraction | `pdf-parse` | `pdf-parse` is unmaintained (last publish 2019). `unpdf` is the community-recommended successor. |
| `@pdfme/pdf-lib` | original `pdf-lib` | Original pdf-lib (v1.17.1) last published 4 years ago; `@pdfme/pdf-lib` fork is actively maintained with bug fixes | | Manual JSON schema for OpenAI | `zodResponseFormat` helper | Broken with Zod v4 — open bug in `openai-node` as of Nov 2025. Manual schema avoids the dependency entirely. |
| `resend` | Nodemailer + SMTP | Nodemailer requires managing an SMTP server or credentials; Resend has better deliverability, a dashboard, and React template support | | `gpt-4o-mini` | `gpt-4o` | Real estate form field extraction is a structured extraction task on templated documents. `gpt-4o-mini` is sufficient and ~15x cheaper. Upgrade to `gpt-4o` only if accuracy on unusual forms is unacceptable. |
| Neon PostgreSQL | PlanetScale / Supabase | PlanetScale killed its free tier; Supabase is excellent but heavier than needed; Neon's Vercel integration and branching per PR is the best solo developer experience | | `page.drawImage()` for agent signature | `PDFSignature` AcroForm field | `@cantoo/pdf-lib` has no `createSignature()` API — `PDFSignature` only reads existing signature fields and provides no image embedding. The correct approach is `embedPng()` + `drawImage()` at the field coordinates. |
| Neon PostgreSQL | SQLite (local) | SQLite doesn't work on Vercel serverless; fine for local dev but you'd need to swap before deploying — adds friction. Neon's free tier makes this unnecessary |
| `playwright` | Puppeteer | Playwright has better auto-wait, multi-browser, and built-in proxy support; more actively maintained for 2025 use cases |
| `playwright` | Cheerio | utahrealestate.com requires authentication (login session) and renders content with JavaScript; Cheerio only parses static HTML |
| `@vercel/blob` | AWS S3 | S3 requires IAM setup, bucket policies, and CORS config; Vercel Blob is zero-config for Vercel deploys and S3-backed under the hood. Abstract it behind a service module so you can swap later |
| `signature_pad` | react-signature-canvas | The React wrapper is at `1.1.0-alpha.2` — alpha status; better to use the underlying `signature_pad@5.1.3` directly with a `useRef` canvas hook |
--- ---
## What NOT to Use ## What NOT to Add
| Avoid | Why | Use Instead | | Avoid | Why | Use Instead |
|-------|-----|-------------| |-------|-----|-------------|
| DocuSign / HelloSign | Monthly subscription cost for a solo agent; defeats the purpose of a custom tool | Custom `signature_pad` canvas capture embedded via `@pdfme/pdf-lib` | | `zodResponseFormat` from `openai/helpers/zod` | Broken at runtime with Zod v4.x (throws exceptions). Open bug, no fix merged as of 2026-03-21. | Plain `response_format: { type: "json_schema", ... }` with hand-written schema |
| Apryse WebViewer / Nutrient SDK | Enterprise pricing ($$$); overkill; hides the PDF internals you need to control | `pdfjs-dist` (render) + `@pdfme/pdf-lib` (modify) | | `react-signature-canvas` | Alpha version (1.1.0-alpha.2); project already has `signature_pad` v5 directly — the wrapper adds nothing | `signature_pad` + `useRef<HTMLCanvasElement>` directly |
| `jspdf` | Creates PDFs from scratch via HTML canvas; cannot modify existing PDFs | `@pdfme/pdf-lib` for modification | | `@signpdf/placeholder-pdf-lib` | For cryptographic PKCS#7 digital signatures (DocuSign-style). This project needs visual e-signatures (image embedded in PDF), not cryptographic signing. | `@cantoo/pdf-lib` `embedPng()` + `drawImage()` |
| `pdfmake` | Same limitation as jspdf; generates new PDFs, can't edit existing form fields | `@pdfme/pdf-lib` | | `pdf2json` | Extracts spatial text data; useful for arbitrary document analysis. Overkill here — we only need raw text content to feed OpenAI. | `unpdf` |
| Firebase / Supabase Storage | Additional vendor; Vercel Blob is already in the stack | `@vercel/blob` | | `langchain` / Vercel AI SDK | Heavy abstractions for the simple use case of one structured extraction call per document. Adds bundle size and abstraction layers with no benefit here. | `openai` SDK directly |
| Clerk | Per-MAU pricing; vendor-hosted user data; one agent user doesn't need this complexity | `better-auth` | | A separate image processing library (`sharp`, `jimp`) | Not needed — signature PNGs from `signature_pad.toDataURL()` are already correctly sized canvas exports. `@cantoo/pdf-lib` handles embedding without pre-processing. | N/A |
| Next.js 16 (right now) | Too many breaking changes (async APIs fully enforced, middleware renamed); ecosystem compatibility issues | Next.js 15.5.x |
| WebSockets for signing | Overkill; the signing flow is a one-shot action, not a live collaboration session | Server Actions + polling or simple page refresh |
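
The table above recommends a hand-written `json_schema` response format in place of `zodResponseFormat`. A minimal sketch of what that could look like follows. The schema name `field_placement`, the property names, and the `parseProposedFields` helper are all hypothetical and chosen only for illustration; the research does not specify the actual schema.

```typescript
// Field types named in this research; "client-signature" is the pre-existing type.
const FIELD_TYPES = [
  "text", "checkbox", "initials", "date", "agent-signature", "client-signature",
] as const;
type FieldType = (typeof FIELD_TYPES)[number];

// Hypothetical shape of one AI-proposed field (names are illustrative only).
interface ProposedField {
  label: string;
  fieldType: FieldType;
  page: number;
}

// Hand-written schema intended to be passed as `response_format`,
// so no Zod helper is involved.
export const fieldPlacementFormat = {
  type: "json_schema",
  json_schema: {
    name: "field_placement",
    strict: true,
    schema: {
      type: "object",
      properties: {
        fields: {
          type: "array",
          items: {
            type: "object",
            properties: {
              label: { type: "string" },
              fieldType: { type: "string", enum: FIELD_TYPES },
              page: { type: "integer" },
            },
            // Strict mode expects every property listed and no extras.
            required: ["label", "fieldType", "page"],
            additionalProperties: false,
          },
        },
      },
      required: ["fields"],
      additionalProperties: false,
    },
  },
} as const;

// Defensive runtime check of the model's JSON text before it is used.
export function parseProposedFields(json: string): ProposedField[] {
  const data: unknown = JSON.parse(json);
  const fields = (data as { fields?: unknown }).fields;
  if (!Array.isArray(fields)) throw new Error("Expected a `fields` array");
  return fields.map((f) => {
    const { label, fieldType, page } = f as Record<string, unknown>;
    if (
      typeof label !== "string" ||
      !Number.isInteger(page) ||
      !FIELD_TYPES.includes(fieldType as FieldType)
    ) {
      throw new Error("Malformed field entry");
    }
    return { label, fieldType: fieldType as FieldType, page: page as number };
  });
}
```

Keeping a small runtime check alongside the schema is a design choice, not a requirement: it guards the rest of the pipeline if the response shape ever drifts.
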
--- ---
## PDF Processing Strategy ## Version Compatibility
### Overview | Package | Compatible With | Notes |
|---------|-----------------|-------|
The workflow has five distinct stages, each using different tools: | `openai@6.32.0` | `zod@4.x` (manual schema only) | Do NOT use `zodResponseFormat` helper — use raw `json_schema` response_format. The helper is broken with Zod v4. |
| `openai@6.32.0` | Node.js 20+ | Requires Node 20 LTS or later. Next.js 16.2 on Vercel uses Node 20 by default. |
``` | `unpdf@1.4.0` | Node.js 18+ | Bundled PDF.js v5.2.133 with polyfills for `Promise.withResolvers`. Works on Node 20. |
[Import PDF] → [Detect fields] → [Add signature areas] → [Fill + sign] → [Store signed PDF] | `@cantoo/pdf-lib@2.6.3` | `react-pdf@10.4.1` | These do not interact at runtime — `@cantoo/pdf-lib` runs server-side, `react-pdf` runs client-side. No conflict. |
Playwright pdfjs-dist @pdfme/pdf-lib @pdfme/pdf-lib @vercel/blob | `signature_pad@5.1.3` | React 19 | Use as a plain class instantiated in `useEffect` with a `useRef<HTMLCanvasElement>`. No React wrapper needed. |
```
### Stage 1: Importing PDFs from utahrealestate.com
The forms library on utahrealestate.com requires agent login. Use **Playwright** on the server (a Next.js API route or background job) to:
1. Authenticate with the agent's saved credentials (stored encrypted in the DB)
2. Navigate to the forms library
3. Download the target PDF as a Buffer
4. Store the original PDF in Vercel Blob under a UUID-based path
5. Record the document in the database with its Blob URL, filename, and source metadata
Playwright should run in a separate process or as a scheduled job (Vercel Cron), not inline with a user request, because browser startup is slow.
### Stage 2: Detecting Existing Form Fields
Once the PDF is stored, use **`pdfjs-dist`** in a server-side script (Node.js API route) to:
1. Load the PDF from Blob storage
2. Iterate over each page's annotations
3. Find `Widget` annotations (AcroForm fields: text inputs, checkboxes, signature fields)
4. Record each field's name, type, and bounding box (x, y, width, height, page number)
5. Store this field map in the database (`DocumentField` table)
Utah real estate forms from WFRMLS are typically standard AcroForm PDFs with pre-defined fields. Most will have existing form fields you can fill directly.
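
The field-mapping step in Stage 2 can be sketched as a pure function. This assumes annotation objects shaped like the ones `pdfjs-dist` returns from a page's annotation list (`subtype`, `fieldType`, `fieldName`, `rect`); the `AnnotationLike` and `DetectedField` types and the `toDetectedFields` name are hypothetical, and loading the PDF itself is left out.

```typescript
// Assumed subset of a PDF.js annotation object; only the keys used here.
interface AnnotationLike {
  subtype: string;
  fieldType?: string; // AcroForm type code, e.g. "Tx", "Btn", "Sig"
  fieldName?: string;
  checkBox?: boolean;
  rect: [number, number, number, number]; // [x1, y1, x2, y2] in PDF points
}

// Mirrors the columns listed for the DocumentField table.
interface DetectedField {
  fieldName: string;
  fieldType: "text" | "checkbox" | "signature";
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
}

export function toDetectedFields(
  annotations: AnnotationLike[],
  page: number,
): DetectedField[] {
  const out: DetectedField[] = [];
  for (const a of annotations) {
    // Only named Widget annotations are form fields.
    if (a.subtype !== "Widget" || !a.fieldName) continue;

    let fieldType: DetectedField["fieldType"];
    if (a.fieldType === "Tx") fieldType = "text";
    else if (a.fieldType === "Sig") fieldType = "signature";
    else if (a.fieldType === "Btn" && a.checkBox) fieldType = "checkbox";
    else continue; // radio buttons, dropdowns, etc. are skipped in this sketch

    // Normalise the rectangle so width and height are never negative.
    const [x1, y1, x2, y2] = a.rect;
    out.push({
      fieldName: a.fieldName,
      fieldType,
      page,
      x: Math.min(x1, x2),
      y: Math.min(y1, y2),
      width: Math.abs(x2 - x1),
      height: Math.abs(y2 - y1),
    });
  }
  return out;
}
```
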
### Stage 3: Adding Signature Areas
For pages that lack signature fields (or when the agent wants to add a new signature area):
- Use **`@pdfme/pdf-lib`** server-side to add a new AcroForm signature annotation at a specified bounding box
- Alternatively, track "signature zones" as metadata in your database (coordinates + page) and overlay them on the rendering side — this avoids PDF modification until signing time
The simpler approach: store signature zones as coordinate records in the DB, render them as highlighted overlay boxes in the browser using `react-pdf`, and embed the actual signature image at those coordinates only at signing time.
### Stage 4: Filling Text Fields + Embedding Signature
When the agent fills out a document form (pre-filling client info, property address, etc.):
1. Send field values to a Server Action
2. Server Action loads the PDF from Blob
3. Use **`@pdfme/pdf-lib`** to:
- Get the AcroForm from the PDF
- Set text field values: `form.getTextField('BuyerName').setText(value)`
- Embed the signature PNG: convert the canvas `toDataURL()` PNG to a `Uint8Array`, embed as `PDFImage`, draw it at the signature zone coordinates
- Optionally flatten the form (make fields non-editable) before final storage
4. Save the modified PDF bytes back to Blob as a new file (preserve the unsigned original)
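
Step 3 above converts the canvas `toDataURL()` output to a `Uint8Array` before embedding. A minimal sketch of that conversion, assuming a Node.js server environment where `Buffer` is available; the function name `pngDataUrlToBytes` is hypothetical, and the embedding call itself is not shown.

```typescript
// The first eight bytes of every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

export function pngDataUrlToBytes(dataUrl: string): Uint8Array {
  const prefix = "data:image/png;base64,";
  if (!dataUrl.startsWith(prefix)) {
    throw new Error("Expected a base64-encoded PNG data URL");
  }
  // Decode the base64 payload that follows the prefix.
  const bytes = Uint8Array.from(Buffer.from(dataUrl.slice(prefix.length), "base64"));

  // Reject payloads that are not actually PNG data before they reach the PDF step.
  const looksLikePng =
    bytes.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((b, i) => bytes[i] === b);
  if (!looksLikePng) throw new Error("Payload is not a PNG image");

  return bytes;
}
```

Validating the PNG signature server-side is a cheap guard, since the data URL arrives from the browser and should be treated as untrusted input.
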
### Stage 5: Storing Signed PDFs
Store three versions in Vercel Blob:
- `/documents/{id}/original.pdf` — the untouched import
- `/documents/{id}/prepared.pdf` — fields filled, ready to sign
- `/documents/{id}/signed.pdf` — final document with embedded signature
All three paths recorded in the `Document` table. Serve signed PDFs via signed Blob URLs with short expiry (1 hour) to prevent unauthorized access.
---
## E-Signature Legal Requirements
Under the ESIGN Act (federal) and UETA (adopted by 49 states, including Utah), an electronic signature is legally valid when it demonstrates: **intent to sign**, **consent to transact electronically**, **association of the signature with the record**, and **reliable record retention**.
### What to Capture at Signing Time
Store all of the following in an `AuditEvent` table linked to the `SigningRequest`:
| Data Point | How to Capture | Legal Purpose |
|------------|---------------|---------------|
| Signer's IP address | `request.headers.get('x-forwarded-for')` in API route | Attribution — links signature to a network location |
| Timestamp (UTC) | `new Date().toISOString()` server-side | Proves the signature occurred at a specific time |
| User-Agent string | `request.headers.get('user-agent')` | Device/browser fingerprint |
| Consent acknowledgment | Require checkbox click: "I agree to sign electronically" | Explicit ESIGN/UETA consent requirement |
| Document hash (pre-sign) | `crypto.subtle.digest('SHA-256', pdfBytes)` | Proves document was not altered before signing |
| Document hash (post-sign) | Same, after embedding signature | Proves final document integrity |
| Signing link token | The `nanoid`-generated token used to access the signing page | Ties the signer to the specific invitation |
| Email used for invitation | From the `SigningRequest` record | Identity association |
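
Two of the rows above, the document hash and the signer IP, can be sketched with small helpers. Note that this sketch swaps in Node's synchronous `createHash` from `node:crypto` rather than the `crypto.subtle.digest` call shown in the table; both produce SHA-256, and the choice here is only for brevity. The helper names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the PDF bytes, suitable for storing in audit metadata.
export function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// `x-forwarded-for` may hold a comma-separated proxy chain;
// this sketch takes the first entry as the client address.
export function clientIp(xForwardedFor: string | null): string {
  return xForwardedFor?.split(",")[0]?.trim() || "unknown";
}
```

Hashing once before and once after embedding the signature gives the two values the table calls for.
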
### Signing Audit Trail Schema (minimal)
```prisma
// Prisma model
model AuditEvent {
  id               String         @id @default(cuid())
  signingRequestId String
  signingRequest   SigningRequest @relation(fields: [signingRequestId], references: [id])
  eventType        String         // "viewed" | "consent_given" | "signed" | "downloaded"
  ipAddress        String
  userAgent        String
  timestamp        DateTime       @default(now())
  metadata         Json?          // document hashes, page count, etc.
}
```
### Signing Page Flow
1. Client opens the email link containing a `nanoid` token
2. Server validates the token (not expired, not already used)
3. Record `"viewed"` audit event (IP, timestamp, UA)
4. Client sees a consent banner: "By signing, you agree to execute this document electronically under the ESIGN Act."
5. Client checks consent checkbox — record `"consent_given"` audit event
6. Client draws signature on canvas
7. On submit: POST to API route with canvas PNG data URL
8. Server records `"signed"` audit event, embeds signature into PDF, stores signed PDF
9. Mark signing request as complete; email confirmation to both agent and client
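
Step 2 of the flow above (token not expired, not already used) could be sketched as a pure check. The `SigningRequestLike` type is a hypothetical subset of the `SigningRequest` record, and the lookup by token is assumed to have happened already.

```typescript
// Hypothetical subset of a SigningRequest row, already looked up by token.
interface SigningRequestLike {
  expiresAt: Date;
  signedAt: Date | null;
}

type TokenCheck =
  | { ok: true }
  | { ok: false; reason: "not_found" | "expired" | "already_used" };

export function checkSigningRequest(
  request: SigningRequestLike | null,
  now: Date = new Date(),
): TokenCheck {
  if (!request) return { ok: false, reason: "not_found" };
  // A recorded signature time means the link has already been consumed.
  if (request.signedAt !== null) return { ok: false, reason: "already_used" };
  if (request.expiresAt.getTime() <= now.getTime()) {
    return { ok: false, reason: "expired" };
  }
  return { ok: true };
}
```

This check alone does not prevent two simultaneous submissions; marking the request as signed would still need to happen atomically in the database at step 8.
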
### What This Does NOT Cover
This custom implementation is sufficient for standard real estate transactions in Utah under UETA. However:
- It does NOT provide notarization (RON — Remote Online Notarization is a separate regulated process)
- It does NOT provide RFC 3161 trusted timestamps (requires a TSA — unnecessary for most residential RE transactions)
- For purchase agreements and disclosures (standard Utah REPC forms), this level of e-signature is accepted by WFRMLS and Utah law
---
## utahrealestate.com Integration Strategy
### Two Separate Use Cases
**Use Case 1: Forms Library (PDF import)**
The forms library requires agent login. There is no documented public API for downloading forms. Strategy:
1. Store the agent's utahrealestate.com credentials encrypted in the database (use AES-256-GCM encryption with a server-side secret key; do not use `bcrypt`, which is a one-way hash, because the plaintext must be recoverable to log in)
2. Use Playwright to authenticate: navigate to the login page, fill credentials, submit, wait for session cookies
3. Navigate to the forms section, find the desired form by name/category, download the PDF
4. This is fragile (subject to site redesigns) but unavoidable without official API access
5. Implement with a health check: if Playwright fails to find expected elements, send the agent an alert email via Resend
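
The reversible credential encryption from step 1 might look like the following sketch, using AES-256-GCM from Node's built-in `node:crypto`. The `iv.tag.ciphertext` storage format and the function names are assumptions made for illustration; key management (where the 32-byte key lives) is out of scope here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypts with a fresh 12-byte IV and returns "iv.tag.ciphertext" in base64.
export function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encryptSecret; throws if the key is wrong or the payload was altered.
export function decryptSecret(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload
    .split(".")
    .map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM's authentication tag means a tampered database value fails loudly at decryption time instead of yielding garbage credentials.
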
**Use Case 2: Listings Display (MLS data)**
WFRMLS provides an official RESO OData API at `https://resoapi.utahrealestate.com/reso/odata/`. Access requires:
1. Apply for licensed data access at [vendor.utahrealestate.com](https://vendor.utahrealestate.com) ($50 IDX enrollment fee)
2. Obtain a Bearer Token (OAuth-based)
3. Query the `Property` resource using OData syntax
Example query (fetch recent active listings):
```
GET https://resoapi.utahrealestate.com/reso/odata/Property
?$filter=StandardStatus eq 'Active'
&$orderby=ModificationTimestamp desc
&$top=20
Authorization: Bearer <token>
```
4. Cache results in the database or in-memory (listings don't change by the second) using Next.js `unstable_cache` or a simple `revalidate` tag
5. Display on the public marketing site — no authentication required to view
**Do not scrape the listing pages directly.** WFRMLS terms of service prohibit unauthorized data extraction, and the official API path is straightforward for a licensed agent.
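
The example query above can be assembled programmatically. A minimal sketch, assuming the endpoint and field names shown in the query; the `buildListingsUrl` and `authHeaders` helpers are hypothetical, and actual token acquisition is not shown.

```typescript
const RESO_PROPERTY_URL = "https://resoapi.utahrealestate.com/reso/odata/Property";

// Builds the active-listings query; OData option names keep their literal "$".
export function buildListingsUrl(top: number = 20): string {
  const options: Array<[string, string]> = [
    ["$filter", "StandardStatus eq 'Active'"],
    ["$orderby", "ModificationTimestamp desc"],
    ["$top", String(top)],
  ];
  const query = options
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join("&");
  return `${RESO_PROPERTY_URL}?${query}`;
}

// Header object for the Bearer token obtained from the vendor portal.
export function authHeaders(token: string): Record<string, string> {
  return { Authorization: `Bearer ${token}` };
}
```

The values are percent-encoded with `encodeURIComponent` so spaces become `%20` rather than `+`.
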
### Playwright Implementation Notes
Run Playwright scraping jobs via:
- **Vercel Cron** for scheduled form sync (daily refresh of available forms list)
- **On-demand API route** when the agent requests a specific form download
- Use `playwright-core` + a managed browser service (Browserless.io free tier, or self-hosted Chromium on a small VPS) for Vercel compatibility — Vercel serverless functions cannot run a full Playwright browser due to size limits
Alternatively, if the Vercel function size is a concern, extract the Playwright logic into a separate lightweight service (a small Express app on a $5/month VPS, or a Railway.app container) and call it from your Next.js API routes.
---
## MLS/WFRMLS Listings Display
Once you have the RESO OData token, the listings page on the public marketing site is straightforward:
1. **Server Component** fetches listings from the RESO API (or from a cached DB table)
2. Display property cards with photo, price, address, beds/baths
3. Photos: WFRMLS media URLs are served directly; use `next/image` with the MLS domain whitelisted in `next.config.ts`
4. Detail page: dynamic route `/listings/[mlsNumber]` with `generateStaticParams` for ISR (revalidate every hour)
5. No client-side JavaScript needed for browsing — pure Server Components + Suspense
---
## Database Schema (Key Tables)
```prisma
model User {
  id        String     @id @default(cuid())
  email     String     @unique
  // better-auth manages password hashing
  createdAt DateTime   @default(now())
  clients   Client[]
  documents Document[]
}

model Client {
  id              String           @id @default(cuid())
  agentId         String
  agent           User             @relation(fields: [agentId], references: [id])
  name            String
  email           String
  phone           String?
  createdAt       DateTime         @default(now())
  signingRequests SigningRequest[]
}

model Document {
  id              String           @id @default(cuid())
  agentId         String
  agent           User             @relation(fields: [agentId], references: [id])
  title           String
  originalBlobUrl String
  preparedBlobUrl String?
  signedBlobUrl   String?
  status          String           // "draft" | "sent" | "signed"
  fields          DocumentField[]
  signingRequests SigningRequest[]
  createdAt       DateTime         @default(now())
}

model DocumentField {
  id         String   @id @default(cuid())
  documentId String
  document   Document @relation(fields: [documentId], references: [id])
  fieldName  String
  fieldType  String   // "text" | "checkbox" | "signature"
  page       Int
  x          Float
  y          Float
  width      Float
  height     Float
  value      String?
}

model SigningRequest {
  id          String       @id @default(cuid())
  documentId  String
  document    Document     @relation(fields: [documentId], references: [id])
  clientId    String
  client      Client       @relation(fields: [clientId], references: [id])
  token       String       @unique // nanoid for the signing URL
  expiresAt   DateTime
  signedAt    DateTime?
  status      String       // "pending" | "signed" | "expired"
  auditEvents AuditEvent[]
  createdAt   DateTime     @default(now())
}

model AuditEvent {
  id               String         @id @default(cuid())
  signingRequestId String
  signingRequest   SigningRequest @relation(fields: [signingRequestId], references: [id])
  eventType        String         // "viewed" | "consent_given" | "signed"
  ipAddress        String
  userAgent        String
  timestamp        DateTime       @default(now())
  metadata         Json?
}
```
--- ---
## Sources ## Sources
- [Next.js 15.5 Release Notes](https://nextjs.org/blog/next-15-5) — HIGH confidence - [openai npm page](https://www.npmjs.com/package/openai) — v6.32.0 confirmed, Node 20 requirement — HIGH confidence
- [Next.js 16 Upgrade Guide](https://nextjs.org/docs/app/guides/upgrading/version-16) — HIGH confidence (confirms 16 is available but breaking) - [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — manual json_schema format confirmed — HIGH confidence
- [pdf-lib npm page](https://www.npmjs.com/package/pdf-lib) — HIGH confidence (v1.17.1, unmaintained 4 years) - [openai-node Issue #1540](https://github.com/openai/openai-node/issues/1540) — zodResponseFormat broken with Zod v4 — HIGH confidence
- [@pdfme/pdf-lib npm page](https://www.npmjs.com/package/@pdfme/pdf-lib) — HIGH confidence (v5.5.8, actively maintained fork) - [openai-node Issue #1602](https://github.com/openai/openai-node/issues/1602) — zodTextFormat broken with Zod 4 — HIGH confidence
- [pdfjs-dist npm / libraries.io](https://libraries.io/npm/pdfjs-dist) — HIGH confidence (v5.5.207 current) - [openai-node Issue #1709](https://github.com/openai/openai-node/issues/1709) — Zod 4.1.13+ discriminated union break — HIGH confidence
- [JavaScript PDF Libraries Comparison 2025 (Nutrient)](https://www.nutrient.io/blog/javascript-pdf-libraries/) — HIGH confidence - [@cantoo/pdf-lib npm page](https://www.npmjs.com/package/@cantoo/pdf-lib) — v2.6.3, field types confirmed — HIGH confidence
- [signature_pad npm](https://www.npmjs.com/package/signature_pad) — HIGH confidence (v5.1.3) - [pdf-lib.js.org PDFForm docs](https://pdf-lib.js.org/docs/api/classes/pdfform) — createTextField, createCheckBox, drawImage APIs — HIGH confidence
- [react-signature-canvas npm](https://www.npmjs.com/package/react-signature-canvas) — HIGH confidence (v1.1.0-alpha.2, alpha status noted) - [unpdf npm page](https://www.npmjs.com/package/unpdf) — v1.4.0, serverless PDF.js build, Node 20 compatible — HIGH confidence
- [better-auth npm](https://www.npmjs.com/package/better-auth) — HIGH confidence (v1.5.5) - [unpdf GitHub](https://github.com/unjs/unpdf) — extractText API confirmed — HIGH confidence
- [Auth.js joins better-auth discussion](https://github.com/nextauthjs/next-auth/discussions/13252) — HIGH confidence - [react-pdf npm page](https://www.npmjs.com/package/react-pdf) — v10.4.1, ArrayBuffer file prop confirmed — HIGH confidence
- [better-auth Next.js integration](https://better-auth.com/docs/integrations/next) — HIGH confidence - [react-pdf ArrayBuffer detach issue #1657](https://github.com/wojtekmaj/react-pdf/issues/1657) — copy workaround confirmed — HIGH confidence
- [resend npm](https://www.npmjs.com/package/resend) — HIGH confidence (v6.9.4) - [signature_pad GitHub](https://github.com/szimek/signature_pad) — v5.1.3, toDataURL API — HIGH confidence
- [Prisma 6.19.0 announcement](https://www.prisma.io/blog/announcing-prisma-6-19-0) — HIGH confidence - [pdf-lib image embedding JSFiddle](https://jsfiddle.net/Hopding/bcya43ju/5/) — embedPng/drawImage pattern — HIGH confidence
- [Neon + Vercel integration](https://vercel.com/marketplace/neon) — HIGH confidence
- [playwright npm](https://www.npmjs.com/package/playwright) — HIGH confidence (v1.58.2)
- [UtahRealEstate.com Vendor Data Services](https://vendor.utahrealestate.com/) — HIGH confidence (official RESO OData API confirmed)
- [RESO OData endpoints for WFRMLS](https://vendor.utahrealestate.com/webapi/docs/tuts/endpoints) — HIGH confidence
- [ESIGN/UETA audit trail requirements (Anvil Engineering)](https://www.useanvil.com/blog/engineering/e-signature-audit-trail-schema-events-json-checklist/) — HIGH confidence
- [E-signature legal requirements (BlueNotary)](https://bluenotaryonline.com/electronic-signature-legal-requirements/) — HIGH confidence
- [Vercel Blob documentation](https://vercel.com/docs/vercel-blob) — HIGH confidence
- [Playwright vs Puppeteer 2025 (BrowserStack)](https://www.browserstack.com/guide/playwright-vs-puppeteer) — HIGH confidence
- [NextAuth vs Clerk vs better-auth comparison (supastarter)](https://supastarter.dev/blog/better-auth-vs-nextauth-vs-clerk) — MEDIUM confidence (third-party analysis)
--- ---
*Stack research for: Teressa Copeland Homes — real estate agent website + document signing* *Stack research for: Teressa Copeland Homes — v1.1 Smart Document Preparation additions*
*Researched: 2026-03-19* *Researched: 2026-03-21*


@@ -1,251 +1,185 @@
# Project Research Summary # Project Research Summary
**Project:** Teressa Copeland Homes **Project:** Teressa Copeland Homes — v1.1 Smart Document Preparation
**Domain:** Real estate agent marketing site + custom PDF document signing portal (Utah/WFRMLS) **Domain:** Real estate agent website + PDF document signing portal
**Researched:** 2026-03-19 **Researched:** 2026-03-21
**Confidence:** HIGH **Confidence:** HIGH
## Executive Summary ## Executive Summary
This is a dual-product build: a public real estate marketing site for a solo Utah agent, and a private document-signing portal that replaces per-month third-party tools (DocuSign, HelloSign) with a fully branded, custom implementation. Research across all four domains converges on a single-repo Next.js 15 application deployed to Vercel with a Neon PostgreSQL database and Vercel Blob storage for PDFs. The stack is unambiguous: `pdfjs-dist` for browser PDF rendering, `@pdfme/pdf-lib` (the maintained fork) for server-side PDF modification, `signature_pad` for canvas signature capture, `better-auth` for agent authentication, and `resend` for email delivery of signing links. This combination is well-documented, actively maintained, and sized exactly right for a solo-agent workflow — no enterprise licensing, no per-user cost, and full brand control throughout. This is a v1.1 feature expansion of an existing, working Next.js 15 real estate document signing app. The v1.0 codebase is already validated — it uses Drizzle ORM, local PostgreSQL, `@cantoo/pdf-lib` for PDF writing, `react-pdf` for client-side rendering, Auth.js v5, and `signature_pad` for canvas signatures. The v1.1 additions are: AI-assisted field placement via GPT-4o-mini, five new field types (text, checkbox, initials, date, agent-signature), agent saved signature with a draw-once-reuse workflow, and a filled document preview before sending. The minimal dependency delta is two new packages: `openai@^6.32.0` and optionally `unpdf@^1.4.0` — though `pdfjs-dist` is already installed as a transitive dependency of `react-pdf` and can serve the server-side text extraction role via its legacy build.
The recommended architecture cleanly separates two subsystems that share only the Next.js shell: the public marketing site (listings, bio, contact) and the protected agent portal (clients, documents, signing requests). These can be built in parallel after the foundation is in place and are only loosely coupled — the listings page has no dependency on the signing flow, and vice versa. The five-stage PDF pipeline (import → parse fields → fill text → send for signing → embed signature) maps directly to a sequential build order where each stage unblocks the next. The architecture research prescribes a 7-phase build order that aligns exactly with this dependency chain. The recommended build order is anchored by a schema-first phase. The `SignatureFieldData` type currently has no `type` discriminant — every field is treated identically as a client signature. Adding new field types without simultaneously updating both the schema AND the client signing page would break any in-flight signing session. The architecture research maps out an explicit 8-step dependency chain. For AI field placement, the correct approach uses `pdfjs-dist` for server-side text extraction (not vision), then GPT-4o-mini for semantic label classification — raw vision-based bounding box inference returns accurate coordinates less than 3% of the time. The OpenAI integration must use a manually defined JSON schema for structured output; the `zodResponseFormat` helper is broken with Zod v4 (confirmed open bug).
The most serious risks in this domain are legal, not technical. A custom e-signature implementation that lacks a tamper-evident PDF hash, has an incomplete audit trail, or allows replayable signing links is legally indefensible under ESIGN/UETA regardless of how well the rest of the app works. Pitfalls research is unambiguous: audit logging and one-time token enforcement must be built in from the first signing ceremony; they cannot be retrofitted. A second cluster of risk sits in the WFRMLS integration: scraping the utahrealestate.com forms library violates Terms of Service (the platform partners with SkySlope; no public API exists for forms), and displaying IDX listing data without required broker attribution and NAR-mandated disclaimer text risks fines and loss of MLS access. Both risks are avoidable with correct design decisions made before code is written. The key risk cluster is around the AI coordinate pipeline and signing page integrity. OpenAI returns percentage-based coordinates; `@cantoo/pdf-lib` expects PDF user-space points with a bottom-left origin. The resulting Y-axis inversion will silently produce wrong field positions without a dedicated conversion utility and unit test. A second risk is that agent-signature fields must be filtered from the `signatureFields` array sent to clients; the exact unguarded line (`/src/app/api/sign/[token]/route.ts` line 88) is identified in pitfalls research. Preview PDFs must use versioned paths separate from the final prepared PDF to maintain legal integrity between what the agent reviewed and what the client signs.
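
The coordinate conversion utility mentioned above could be sketched as follows. This assumes the AI returns percentages in the 0 to 100 range measured from the top-left corner of the page, which the research does not state explicitly; the type and function names are hypothetical.

```typescript
// Assumed AI output: percentages of page size, origin at the top-left corner.
interface PercentBox {
  xPct: number;
  yPct: number;
  widthPct: number;
  heightPct: number;
}

// PDF user space: points, origin at the bottom-left corner.
interface PdfBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

export function percentToPdfPoints(
  box: PercentBox,
  pageWidth: number,
  pageHeight: number,
): PdfBox {
  const width = (box.widthPct / 100) * pageWidth;
  const height = (box.heightPct / 100) * pageHeight;
  const x = (box.xPct / 100) * pageWidth;
  const yFromTop = (box.yPct / 100) * pageHeight;
  // Flip the Y axis: PDF y is the distance from the page bottom
  // to the bottom edge of the box.
  const y = pageHeight - yFromTop - height;
  return { x, y, width, height };
}
```

For a US Letter page (612 by 792 points), a box starting 25% down the page and 50% tall ends at 75% from the top, so its bottom edge sits 198 points above the page bottom.
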
## Key Findings ## Key Findings
### Recommended Stack ### Recommended Stack
The stack is cohesive and all components are current-stable. Next.js 15.5.x (not 16 — too many breaking changes just landed) with React 19 provides the full-stack framework, App Router route groups for clean public/agent separation, and Server Components for the listings page without client-side JavaScript. Neon PostgreSQL with Prisma 6.x is the data layer — serverless-compatible, branches per PR, and generous free tier. Authentication uses `better-auth` 1.5.x (Auth.js/NextAuth is merging with better-auth and remains in beta; Clerk has per-MAU pricing that makes no sense for a single agent). PDF processing requires three distinct libraries that serve non-overlapping roles: `pdfjs-dist` for browser rendering and field detection (cannot modify PDFs), `@pdfme/pdf-lib` for server-side modification and signature embedding (cannot render to screen), and `signature_pad` for canvas capture. The original `pdf-lib` is unmaintained (4 years since last publish) — use the `@pdfme/pdf-lib` fork exclusively. The v1.0 stack is unchanged and validated. See `STACK.md` for full version details.
**Core technologies:** **New dependencies for v1.1:**
- **Next.js 15.5.x**: Full-stack framework — App Router, Server Components, API routes in one repo; Vercel-native; skip v16 until ecosystem stabilizes - `openai@^6.32.0`: Official SDK, TypeScript-native structured output for GPT-4o-mini — use manual `json_schema` response_format, NOT `zodResponseFormat` (broken with Zod v4, confirmed open GitHub issues #1540, #1602, #1709)
- **React 19**: Ships with Next.js 15; Server Components and form actions stable - `pdfjs-dist` legacy build (already installed): Server-side PDF text extraction via `pdfjs-dist/legacy/build/pdf.mjs` — no new dependency needed if using this path
- **TypeScript 5.x**: Required; all major libraries ship full types
- **Neon PostgreSQL + Prisma 6.x**: Serverless-compatible DB; best-in-class DX for TypeScript; Vercel Marketplace integration
- **better-auth 1.5.x**: No per-MAU cost; built-in rate limiting; first-class Next.js App Router support
- **pdfjs-dist 5.5.x**: Browser-side PDF rendering and AcroForm field detection only
- **@pdfme/pdf-lib 5.5.x**: Server-side PDF modification — fill fields, embed signature PNG, flatten; actively maintained fork
- **signature_pad 5.1.3**: Canvas signature capture; Bezier smoothing; touch-normalized; use directly (not the alpha React wrapper)
- **resend + @react-email/components**: Transactional email for signing links; 3k/month free tier; React email templates
- **@vercel/blob**: Zero-config PDF storage for Vercel; S3-backed; abstract behind `storage.ts` to allow future swap
- **Playwright**: Required for utahrealestate.com forms (if used); run via separate service or Browserless.io — not inline in Vercel serverless
- **Tailwind CSS v4 + shadcn/ui**: Styling; Tailwind v4 is CSS-native, no config file required
**Critical version note:** Do NOT use `next@16` — breaking changes (async APIs, middleware renamed to proxy, edge runtime dropped) make it incompatible with the ecosystem today. Use `next@15.5.x` and revisit in 6-12 months. **Existing stack components covering all v1.1 needs:**
- `@cantoo/pdf-lib@2.6.3`: All five new field types (text, checkbox, initials, date, agent-signature) supported natively via `createTextField`, `createCheckBox`, `drawImage` APIs
- `signature_pad@5.1.3`: Agent signature canvas — use `useRef<HTMLCanvasElement>` + `useEffect` pattern directly; do NOT add `react-signature-canvas` (alpha wrapper)
- `react-pdf@10.4.1`: Filled preview rendering — pass `ArrayBuffer` directly; copy the buffer before passing to avoid detachment issue (known bug #1657)
- `@vercel/blob@2.3.1` + Drizzle ORM: Agent signature storage — architecture research recommends TEXT column on `users` table for 2-8KB base64 PNG; no new file storage needed
### Expected Features ### Expected Features
Research identifies a clear MVP boundary. The signing portal is the novel differentiator — all the marketing site features are table stakes for any professional real estate presence. The key UX insight from feature research: no client account creation, ever. The signing link token is the client's identity. Every friction point between "email received" and "signature captured" is an abandonment driver. All v1.1 features are P1 (must-have for launch). Research confirms the full feature set is aligned with industry standard behavior across DocuSign, dotloop, and SkySlope DigiSign.
**Must have for v1 launch (P1):** **Must have (table stakes):**
- Marketing site: hero with agent photo, bio, contact form, testimonials - Initials field type — every Utah standard form (REPC, listing agreement, addenda) has per-page initials lines; missing this makes the app unusable for standard Utah workflows
- Active listings display from WFRMLS (with full IDX compliance: broker attribution, disclaimer text, last-updated timestamp) - Date field (auto-stamp, read-only) — "Date Signed" pattern; auto-populated at signing session completion; client never types a date; legally important
- Agent login (better-auth, credentials) - Checkbox field type — Utah REPC uses boolean checkboxes throughout (mediation clauses, contingency elections, disclosure acknowledgments)
- Client management: create/view clients with name and email - Agent saved signature — draw once, reuse across documents; the "Adopted Signature" pattern in every major real estate e-sig tool
- PDF upload and browser rendering (pdfjs-dist + react-pdf) - Agent signs first workflow — industry convention: agent at routing order 1, client at routing order 2; confirmed by DocuSign community docs
- Signature field placement UI: drag-and-drop on PDF canvas with coordinate storage in PDF user space - Filled document preview with Send gating — prevents the most-cited mistake (sending wrong document version); Send button lives in preview
- Email delivery of unique, tokenized signing link (resend)
- Token-based anonymous client signing page: no account, no login
- Canvas signature capture (signature_pad) — mobile-first, touch-action:none, iOS Safari tested
- Audit trail: IP, timestamp, user-agent, consent acknowledgment — all server-side timestamps
- PDF signature embed + form flatten with fonts embedded before flattening (@pdfme/pdf-lib)
- Tamper-evident PDF hash: SHA-256 of final signed PDF bytes stored in DB
- One-time signing token enforcement: used_at column, DB transaction, 72-hour TTL
- Signed document storage (Vercel Blob, private ACLs, presigned URLs only)
- Agent dashboard: document status at a glance (Draft / Sent / Viewed / Signed)
**Should have, add after v1 (P2):** **Should have (differentiators):**
- Forms library import from utahrealestate.com — add only as manual agent upload, not scraping - AI field placement via gpt-4o-mini + text extraction — eliminates manual drag-drop session; accuracy 90%+ on structured Utah forms with predictable label patterns ("Buyer's Signature", "Date", "Initial Here")
- Heuristic AcroForm field detection on Utah standard forms — manual placement fallback always present - AI pre-fill from client profile — maps client name, email, property address to text fields; low hallucination risk (structured profile data, not free-text inference)
- Document view/open tracking (link opened audit event) - Property address field on client profile — enables AI pre-fill to be property-specific; simple schema addition
- Signed document confirmation email to client
- Multiple field types: initials, auto-date, checkbox, text inputs
- Neighborhood guide / SEO content pages
**Defer to v2+ (P3):** **Defer to v1.2+:**
- Automated unsigned-document reminder system (requires scheduling infrastructure) - AI confidence display to agent — adds UI noise; agent can see and correct in preview instead
- Client portal with document history (requires client accounts — explicitly an anti-feature in v1) - Template save from AI placement — high value but requires template management UI; defer until AI accuracy is validated
- Multi-agent / brokerage support (role/permissions model doubles auth complexity) - Multiple agent signature fields per document — needs UX design; defer
- Bulk document send
- Native mobile app (responsive web signing works; 50%+ of e-signatures happen on mobile web)
**Anti-features to avoid entirely:**
- Scraping utahrealestate.com forms library (ToS violation; use manual upload)
- DocuSign/HelloSign integration (monthly cost, third-party branding — defeats the purpose)
- In-app PDF content editing (real estate contracts have legally mandated language; editing creates liability)
- WebSockets for signing (one-shot action, not a live session)
- Client account creation for signing (friction kills completion rates)
### Architecture Approach ### Architecture Approach
The system is a single Next.js monorepo with two distinct route groups: `(public)` for the unauthenticated marketing site and `(agent)` for the protected portal. These share the Prisma/Neon data layer and Vercel Blob storage but have no UI coupling. The PDF pipeline is entirely server-side (API routes / Server Actions) except for browser rendering and signature capture, which require Client Components loaded with `dynamic(() => import(...), { ssr: false })`. Authentication uses three defense-in-depth layers — edge middleware, layout Server Component, and per-route handler — because the CVE-2025-29927 middleware bypass (disclosed March 2025) demonstrated that middleware alone is not sufficient. The signing flow uses JWT tokens (HMAC-SHA256) stored server-side with one-time enforcement via a `used_at` column, providing both stateless verification and replay protection. The v1.1 architecture is an incremental extension of the existing system — not a rewrite. Seven new files are created (two server-only AI lib files, three API routes, two client components). Eight existing files are modified with targeted additions. The critical architectural constraint: the existing client signing flow (`embed-signature.ts`, signing token route, `SignatureModal.tsx`) must not be altered. Agent-sig and text/checkbox/date fields are baked into the prepared PDF before the client opens the signing link. The client signing page handles only `client-signature` and `initials` field types.
**Major components and responsibilities:** See `ARCHITECTURE.md` for complete component boundaries, data flow diagrams, and the full 8-step build order.
1. **`middleware.ts`** — Edge auth gate for `/agent/*`; redirects to login; NOT the only auth layer (CVE-2025-29927 requires defense in depth) **Major components:**
2. **`(agent)/layout.tsx`** — Server Component second auth layer; calls `verifySession()` on every render 1. `lib/ai/extract-text.ts` + `lib/ai/field-placement.ts` (NEW, server-only) — pdfjs-dist legacy build for text extraction; GPT-4o-mini structured output with manual JSON schema; `server-only` import guard prevents accidental client bundle inclusion
3. **`lib/pdf/parse.ts`** — pdfjs-dist server-side: extracts AcroForm field names, types, and bounding boxes from PDF bytes 2. `POST /api/documents/[id]/ai-prepare` (NEW) — orchestrates extract + AI call + coordinate conversion (percentage to PDF points using actual page dimensions)
4. **`lib/pdf/fill.ts`** — @pdfme/pdf-lib server-side: writes agent-supplied text into named form fields; embeds fonts before flattening 3. `GET/PUT /api/agent/signature` (NEW) — stores agent signature as base64 PNG TEXT column on `users` table; always auth-gated
5. **`lib/pdf/sign.ts`** — @pdfme/pdf-lib server-side: embeds signature PNG at stored coordinates; flattens and seals the form; computes SHA-256 hash 4. `POST /api/documents/[id]/preview` (NEW) — reuses existing `preparePdf` in preview mode; writes to versioned `_preview_{timestamp}.pdf`; streams bytes directly; never overwrites final prepared PDF
6. **`components/agent/PDFFieldMapper.tsx`** — Client Component: renders PDF page via canvas; drag-to-define signature zones; converts viewport coordinates to PDF user space (origin bottom-left) before storing 5. Extended `FieldPlacer.tsx` palette — five new draggable tokens; existing drag/move/resize/persist mechanics unchanged
7. **`components/sign/SignatureCanvas.tsx`**Client Component: signature_pad with touch-action:none; exports PNG for submission 6. Extended `prepare-document.ts`type-aware rendering switch for all six field types; existing `client-signature` path unchanged
8. **`app/sign/[token]/page.tsx`** — Public signing page: validates JWT, streams prepared PDF, renders field overlays, submits canvas PNG
9. **`lib/wfrmls/client.ts`** — WFRMLS RESO OData client with ISR revalidate=3600 for listings
10. **`lib/storage/s3.ts`** — Vercel Blob abstraction: upload, download, presigned URL (5-minute TTL); PDFs never served with public ACLs
**Database schema highlights:** `User``Client``SigningRequest``SignatureAuditLog` chain. `Document` stores the raw PDF template (reusable across multiple `SigningRequest`s). `SigningRequest` holds `fieldValues`, `signatureFields` (coordinates in PDF space), `preparedS3Key`, `signedS3Key`, `token`, `tokenJti`, `usedAt`, and status enum (DRAFT/SENT/VIEWED/SIGNED/EXPIRED/CANCELLED). `SignatureAuditLog` is append-only and records every ceremony event with server-side timestamps.
**PDF coordinate conversion is mandatory:** Browser canvas coordinates (origin top-left, Y down) must be converted to PDF user space (origin bottom-left, Y up) using `viewport.convertToPdfPoint(x, y)` from pdfjs-dist before storage. This conversion must be unit-tested against actual Utah real estate forms before the field placement UI ships.
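
A minimal sketch of that top-left to bottom-left flip for a rectangular field, assuming the input is expressed as percentages of the page measured from the top-left corner and the page is unrotated. `PercentBox`, `PdfBox`, and `percentBoxToPdfSpace` are hypothetical names, not taken from the codebase; for raw canvas clicks, pdfjs-dist's `viewport.convertToPdfPoint(x, y)` covers the same conversion.

```typescript
// Hypothetical input shape: percentages (0-100) measured from the top-left corner.
interface PercentBox {
  xPct: number;      // left edge
  yPct: number;      // top edge
  widthPct: number;
  heightPct: number;
}

// Output in PDF user space: points, origin at bottom-left, Y increasing upward.
interface PdfBox {
  x: number;      // left edge
  y: number;      // bottom edge
  width: number;
  height: number;
}

function percentBoxToPdfSpace(
  box: PercentBox,
  pageWidthPt: number,
  pageHeightPt: number,
): PdfBox {
  const width = (box.widthPct / 100) * pageWidthPt;
  const height = (box.heightPct / 100) * pageHeightPt;
  const x = (box.xPct / 100) * pageWidthPt;
  // Distance of the box's top edge from the top of the page.
  const topFromTop = (box.yPct / 100) * pageHeightPt;
  // Flip the axis and move from the top edge to the bottom edge of the box.
  const y = pageHeightPt - topFromTop - height;
  return { x, y, width, height };
}

// Usage: a box near the bottom of a 612 x 792 pt (US Letter) page.
const exampleBox = percentBoxToPdfSpace(
  { xPct: 10, yPct: 90, widthPct: 25, heightPct: 5 },
  612,
  792,
);
console.log(exampleBox);
```

Page dimensions should be read from the actual PDF page rather than hard-coded, and rotated pages need separate handling that this sketch does not attempt.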
### Critical Pitfalls ### Critical Pitfalls
1. **No tamper-evident PDF hash** — Compute SHA-256 of the complete signed PDF bytes immediately after embedding the signature, before storing. Store the hash in `SigningRequest`. This is the difference between a legally defensible document and "a drawing on a page." 1. **Breaking the signing page with new field types**`SigningPageClient.tsx` opens the signature modal for every field in `signatureFields` with no type branching. Adding new field types without updating the signing page in the same deployment breaks active signing sessions. Ship schema + signing page filter as one atomic deployment, before any other v1.1 work.
2. **Incomplete audit trail** — Log six ceremony events server-side: document prepared, email sent, link opened (with IP/UA/timestamp), document viewed, signature submitted, final PDF hash computed. Timestamps must be server-side — client-reported timestamps are legally worthless. This must be wired in before the first signing ceremony, not added later. 2. **AI coordinate Y-axis inversion** — AI returns percentages from top-left; `@cantoo/pdf-lib` uses PDF user-space with Y=0 at bottom. Storing AI coordinates without conversion inverts every field position. Write an `aiCoordsToPagePdfSpace()` conversion utility with a unit test asserting known PDF-space x/y values against a real Utah REPC before any OpenAI call is made.
3. **Replayable signing token**Signing tokens must have a `used_at` column set atomically on successful submission (DB transaction to prevent race conditions). After use, the link must return "already signed" — never the canvas. 72-hour TTL is appropriate for real estate (clients don't check email instantly; 15-minute magic-link windows are too short). 3. **Agent-signature field sent unfiltered to client**`/src/app/api/sign/[token]/route.ts` line 88 returns `doc.signatureFields ?? []` without type filtering. When `agent-signature` fields are in that array, the client sees them as required unsigned fields. Add type filter before any agent-signed document is sent.
4. **PDF coordinate system mismatch** — PDF user space uses bottom-left origin with Y increasing upward; browser canvas uses top-left with Y downward. Embedding a signature without the conversion inverts the Y position. Write a coordinate conversion unit test against a real Utah purchase agreement form before building the drag-and-drop UI. Also handle page rotation (`Rotate` key in PDF) — Utah forms may be rotated 90 degrees. 4. **Stale preview after field changes** — preview PDF written to a deterministic path gets cached; agent sends a document based on a stale preview. Use versioned preview paths (`{docId}_preview_{timestamp}.pdf`) and disable Send when fields have changed since last preview generation.
5. **utahrealestate.com forms scraping violates ToS** — The platform partners with SkySlope and has no public forms API. Storing Teressa's credentials to automate downloads violates the WFRMLS data licensing agreement. Use manual PDF upload as the document source. State-approved Utah DRE forms at commerce.utah.gov are public domain and can be embedded directly. 5. **OpenAI token limits on multi-page Utah forms** — Utah standard forms are 10-30 pages; full text extraction fits in ~2,000-8,000 tokens (within gpt-4o-mini's 128k context). Risk: testing only with 2-3 page PDFs in development. Prevention: test AI pipeline with the full Utah REPC (20+ pages) before shipping.
6. **IDX compliance violations** — Every listing page (card and detail) must display: listing broker/office name (`ListOfficeName`), WFRMLS disclaimer text verbatim, last-updated timestamp from the feed, and buyer's agent compensation disclosure per 2024 NAR settlement. Missing any of these risks fines up to $15,000 and loss of MLS access. Treat these as acceptance criteria, not post-launch polish.
7. **PDF font flattening failure** — AcroForm fields reference fonts by name (e.g., "Helvetica"). On Vercel serverless, no system fonts are installed. Calling `form.flatten()` without first embedding fonts produces blank text fields in the downloaded PDF. Explicitly embed standard fonts from pdf-lib's built-in set on every form field before flattening. Validate in production environment, not just local Mac with system fonts.
8. **Mobile canvas scrolling instead of signing** — Without `touch-action: none` on the canvas element, iOS Safari and Android Chrome intercept touch gestures as page scroll. The client tries to sign and the page scrolls instead. This must be tested on physical devices before any client is sent a signing link.
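
The signing-token rules listed among the pitfalls above (one-time use via `used_at`, 72-hour TTL) reduce to a small decision function. This is a sketch only: `SigningTokenRow` and `tokenStatus` are hypothetical names, and in a real route the check and the `used_at` update would need to run inside a single DB transaction to avoid the race condition noted above.

```typescript
// 72-hour TTL, as stated in the pitfalls list.
const TOKEN_TTL_MS = 72 * 60 * 60 * 1000;

// Hypothetical row shape; the real column names may differ.
interface SigningTokenRow {
  issuedAt: Date;
  usedAt: Date | null; // set once, on successful submission
}

type TokenStatus = "valid" | "already-signed" | "expired";

function tokenStatus(row: SigningTokenRow, now: Date): TokenStatus {
  // A used token always wins: the link must never show the canvas again.
  if (row.usedAt !== null) return "already-signed";
  if (now.getTime() - row.issuedAt.getTime() > TOKEN_TTL_MS) return "expired";
  return "valid";
}
```

Checking `usedAt` before the TTL means a client who reopens an old link after signing sees "already signed" rather than a misleading "expired" message.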
## Implications for Roadmap ## Implications for Roadmap
Based on the dependency chain identified in architecture research and the feature priority matrix from feature research, the natural build order is 7 phases: The architecture research provides an explicit 8-step build order based on hard dependencies. This maps directly to 5 phases.
### Phase 1: Foundation ### Phase 1: Schema Foundation + Signing Page Safety
**Rationale:** Everything else depends on this. Database schema, auth system, and storage infrastructure must exist before any feature can be built. Auth has a well-documented security pattern (three-layer defense-in-depth due to CVE-2025-29927) that must be established upfront — retrofitting auth is expensive.
**Delivers:** Working Next.js project, Prisma schema + Neon DB, Vercel Blob bucket, better-auth credentials login, middleware + layout guard, agent can log in and reach a blank dashboard.
**Addresses:** Agent login (P1 feature), tamper-resistant auth architecture.
**Avoids:** Middleware-only auth bypass (CVE-2025-29927 defense-in-depth established from day one).
**Research flag:** None needed — well-documented patterns for all components.
### Phase 2: Public Marketing Site **Rationale:** The single most dangerous change in v1.1 is adding field types to a schema the client signing page does not handle. Any document with mixed field types sent before the signing page is updated is a HIGH-recovery-cost production incident. Must be first, before any other v1.1 work.
**Rationale:** Independent of the document workflow; provides immediate business value; unblocks IDX integration testing early. Can be built by a different person in parallel with Phase 3 if needed. **Delivers:** Extended `DocumentField` discriminated union in `schema.ts` with backward-compatible fallback for v1.0 documents (`type ?? 'client-signature'`); two new nullable DB columns (`agentSignatureData` on users, `propertyAddress` on clients); Drizzle migration; updated `SigningPageClient.tsx` and `POST /api/sign/[token]` with type-based field filtering.
**Delivers:** Public-facing site with hero, bio, contact form, and WFRMLS listings display with full IDX compliance (broker attribution, disclaimer, last-updated, NAR 2024 compensation disclosure). **Addresses:** Foundation for all expanded field types; agent-signature client exposure risk
**Addresses:** Hero/bio/contact (P1), listings display (P1), IDX compliance (legal requirement), delta-sync listings refresh. **Avoids:** Pitfall 1 (signing page crash on new field types), Pitfall 10 (agent-sig field shown to client as required unsigned field)
**Avoids:** Stale off-market listings (hourly delta sync with `ModificationTimestamp` filter); IDX attribution violations (treated as acceptance criteria, not polish). **Research flag:** None needed — Drizzle discriminated union and nullable column additions are well-documented; two-line ALTER TABLE migration.
**Research flag:** WFRMLS RESO API vendor enrollment takes 2-4 weeks — start this process immediately, in parallel with Phase 1. Do not block Phase 2 on this; build with mock data while approval is pending.
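
The type-based filtering and the `type ?? 'client-signature'` fallback that Phase 1 delivers could look roughly like the sketch below. `StoredField` and `fieldsForClient` are hypothetical names, and the string literals for the text, checkbox, and date types are assumed; only `client-signature`, `agent-signature`, and `initials` are named as identifiers in this document.

```typescript
type FieldType =
  | "client-signature"
  | "initials"
  | "text"
  | "checkbox"
  | "date"
  | "agent-signature";

// Hypothetical stored shape; v1.0 rows have no `type` property at all.
interface StoredField {
  id: string;
  page: number;
  type?: FieldType;
}

// The client signing page handles only these two types.
const CLIENT_FIELD_TYPES: ReadonlySet<FieldType> = new Set<FieldType>([
  "client-signature",
  "initials",
]);

// Backward-compatible fallback: untyped v1.0 fields are client signatures.
function resolveFieldType(field: StoredField): FieldType {
  return field.type ?? "client-signature";
}

// Everything else (agent-signature, text, checkbox, date) is baked into the
// prepared PDF and never reaches the client as an interactive field.
function fieldsForClient(fields: StoredField[] | null | undefined): StoredField[] {
  return (fields ?? []).filter((f) => CLIENT_FIELD_TYPES.has(resolveFieldType(f)));
}
```

Using an allow-list rather than excluding `agent-signature` alone means a field type added later stays hidden from clients until the signing page explicitly supports it.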
### Phase 3: Agent Portal Shell ### Phase 2: Agent Saved Signature + Agent Signing Workflow
**Rationale:** Client management and the agent dashboard are prerequisites for the document workflow. These are straightforward CRUD operations with standard patterns.
**Delivers:** Agent dashboard (skeleton), client list + create/edit, agent login page polish, navigation shell.
**Addresses:** Client management (P1), document status dashboard (P1).
**Avoids:** No pitfalls specific to this phase — standard CRUD patterns.
**Research flag:** None needed — well-documented patterns.
### Phase 4: PDF Ingest and Storage **Rationale:** Agent signature is a prerequisite for the agent-signs-first workflow, which is a prerequisite for the filled preview (preview only makes sense after agent has signed). Agent signature embed also establishes the PNG embed pattern in `prepare-document.ts` that informs how other field types are handled.
**Rationale:** Must exist before field mapping UI can be built. The storage abstraction and PDF parsing pipeline are the lowest layer of the document workflow. **Delivers:** `GET/PUT /api/agent/signature` routes; `AgentSignaturePanel` component (draw + save + thumbnail); extended `prepare-document.ts` to embed agent-sig PNG at field coordinates; `FieldPlacer` palette token for agent-signature type; supersede-and-resend flow guard preventing re-preparation of sent/viewed documents without user confirmation.
**Delivers:** PDF upload (manual agent upload — no scraping), Vercel Blob storage pipeline (original/prepared/signed versions), pdfjs-dist AcroForm field extraction, document create + detail pages. **Uses:** `signature_pad@5.1.3` (existing), `@cantoo/pdf-lib@2.6.3` (existing), `users.agentSignatureData TEXT` column (Phase 1)
**Addresses:** PDF upload + rendering (P1), signed document storage (P1). **Avoids:** Pitfall 5 (signature stored as dataURL in DB is correct — TEXT column is right for 2-8KB), Pitfall 6 (race condition on re-preparation), Pitfall 10 (agent-sig filtered from client fields via Phase 1 foundation)
**Avoids:** Local filesystem storage (serverless ephemeral filesystem); utahrealestate.com scraping (manual upload only, no credentials stored, no headless browser automation in core app); PDF coordinate detection tested with actual Utah forms. **Research flag:** None needed — draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are DB column and API route.
**Research flag:** May need deeper research on pdfjs-dist Node.js legacy build for server-side parsing — uses `pdfjs-dist/legacy/build/pdf.mjs` without a worker, which is distinct from the browser build.
### Phase 5: PDF Fill and Field Mapping ### Phase 3: Expanded Field Types End-to-End
**Rationale:** Depends on Phase 4 (storage + parse infrastructure). The field mapper UI and the fill API are the agent's core document preparation workflow.
**Delivers:** PDFFieldMapper.tsx (drag-to-place signature zones on PDF canvas), coordinate conversion from viewport space to PDF user space with unit tests, @pdfme/pdf-lib fill API (text fields, font embedding before flatten), document editor form.
**Addresses:** Signature field placement UI (P1), agent-fills-then-client-signs workflow (differentiator).
**Avoids:** PDF coordinate mismatch (unit-tested against actual Utah purchase agreement form before UI ships); font flattening failure (fonts embedded explicitly; tested in production serverless environment); heuristic-only detection (manual placement fallback is the primary flow; auto-detect is an enhancement).
**Research flag:** Research the specific pdfjs-dist `viewport.convertToPdfPoint()` API and pdf-lib `StandardFonts` embedding before implementation — these are narrow but critical APIs.
### Phase 6: Signing Flow — End to End **Rationale:** Phase 1 made the schema and signing page safe. Phase 2 established the PNG embed pattern in `prepare-document.ts`. Now extend the field placement UI and prepare pipeline to handle all five new field types. Completing this phase gives the agent a fully functional field system without any AI dependency.
**Rationale:** The highest-complexity phase; depends on all previous phases. Contains all the legally critical components. Should be built as a complete vertical slice in one phase to ensure the audit trail is woven through from start to finish. **Delivers:** Five new draggable palette tokens in `FieldPlacer.tsx` (text, checkbox, initials, date, agent-signature); type-aware rendering in `prepare-document.ts` (text stamp, checkbox embed, date auto-stamp, initials placeholder); `propertyAddress` field in `ClientModal` and clients server action; field type coverage from placement through to embedded PDF.
**Delivers:** JWT token generation (HMAC-SHA256, 72-hour TTL, one-time enforcement with `used_at` DB column), resend email delivery, `/sign/[token]` public signing page, SignatureCanvas.tsx (mobile-first, touch-action:none, iOS Safari + Android Chrome tested on physical devices), @pdfme/pdf-lib signature PNG embed, SHA-256 hash of final signed PDF, complete SignatureAuditLog (6 ceremony events with server-side timestamps), one-time token invalidation with race-condition-safe DB transaction, "already signed" page for expired/used tokens. **Addresses:** All P1 table stakes: initials, date, checkbox, text field types
**Addresses:** Email delivery (P1), client signing page (P1), canvas capture (P1), audit trail (P1), signed document storage (P1), tamper-evident hash (P1). **Avoids:** Pitfall 1 (signing page hardened in Phase 1 before these types can be placed and sent)
**Avoids:** Replayable signing tokens; incomplete audit trail; mobile canvas scroll-instead-of-sign; blank-canvas submission; font flattening blank text; unsigned PDF served publicly. **Research flag:** None needed — all APIs are in existing `@cantoo/pdf-lib@2.6.3`.
**Research flag:** This phase warrants a `/gsd:research-phase` — the intersection of JWT one-time enforcement, PDF hash storage, ESIGN/UETA audit requirements, and mobile touch handling has enough edge cases that a focused spike before implementation reduces rework risk.
### Phase 7: Audit Trail, Status Tracking, and Download ### Phase 4: Filled Document Preview
**Rationale:** Completes the agent-facing visibility layer. The underlying data is already being written in Phase 6; this phase surfaces it in the UI.
**Delivers:** Document status tracking in agent dashboard (Draft/Sent/Viewed/Signed with last-activity timestamp), presigned Vercel Blob URLs for agent PDF download (5-minute TTL, authenticated route only), confirmation screen for client after signing, optional signed document email to client. **Rationale:** Preview depends on the fully extended `preparePdf` from Phase 3 and agent signing from Phase 2. It is a composition of previous phases — build it after those foundations are solid.
**Addresses:** Agent dashboard status (P1 polish), signed document retrieval (P1), document view tracking (P2). **Delivers:** `POST /api/documents/[id]/preview` route; `PreviewModal` component with in-app react-pdf rendering; versioned preview path with staleness detection; Send button disabled when fields changed since last preview; Back-to-edit flow; prepared PDF hashed at prepare time (extend existing `pdfHash` pattern).
**Avoids:** Signed PDFs accessible via guessable URL (all downloads gated behind authenticated presigned URLs). **Uses:** Existing `preparePdf` (reused unchanged), `react-pdf@10.4.1` (existing), ArrayBuffer copy pattern for react-pdf detachment bug
**Research flag:** None needed — well-documented patterns; the audit data infrastructure is already in place from Phase 6. **Avoids:** Pitfall 7 (stale preview), Pitfall 8 (OOM — generate-once, serve-cached pattern), Pitfall 9 (client signs different doc than agent previewed — hash verification)
**Research flag:** Deployment target should be confirmed before implementation — the write-to-local-`uploads/` preview pattern fails on Vercel serverless (ephemeral filesystem). If deployed to Vercel, preview must write to Vercel Blob instead.
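
A rough sketch of the versioned preview path and the Send gating described for this phase. It assumes staleness is detected by comparing two timestamps, which this document does not specify; `PreviewState`, `previewFileName`, and `canSend` are hypothetical names.

```typescript
// Assumption: the app tracks when fields last changed and when the most
// recent preview was generated, both as epoch milliseconds.
interface PreviewState {
  fieldsUpdatedAtMs: number;
  previewGeneratedAtMs: number | null; // null until a preview exists
}

// Versioned name following the `{docId}_preview_{timestamp}.pdf` pattern,
// so a new preview never reuses a path that may have been cached.
function previewFileName(docId: string, generatedAtMs: number): string {
  return `${docId}_preview_${generatedAtMs}.pdf`;
}

// Send is enabled only when a preview exists and no field has changed
// since it was generated.
function canSend(state: PreviewState): boolean {
  return (
    state.previewGeneratedAtMs !== null &&
    state.previewGeneratedAtMs >= state.fieldsUpdatedAtMs
  );
}
```

Where the file is written (local `uploads/` or Vercel Blob) is independent of this logic, so the gating can be built before the deployment target is confirmed.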
### Phase 5: AI Field Placement + Pre-fill
**Rationale:** AI is the highest-complexity feature and depends on field types being fully placeable (Phase 3) and the FieldPlacer accepting `DocumentField[]` from an external source. Building last means the agent can use manual placement throughout earlier phases. AI placement is an enhancement of the field system, not a replacement.
**Delivers:** `lib/ai/extract-text.ts` (pdfjs-dist legacy build, server-only); `lib/ai/field-placement.ts` (GPT-4o-mini structured output, manual JSON schema, `server-only` guard); `POST /api/documents/[id]/ai-prepare` route with coordinate conversion utility + unit test; "AI Auto-place" button in PreparePanel with loading state and agent review step; AI pre-fill of text fields from client profile data.
**Uses:** `openai@^6.32.0` (new install), pdfjs-dist legacy build (existing), gpt-4o-mini (sufficient for structured label extraction; ~15x cheaper than gpt-4o)
**Avoids:** Pitfall 2 (coordinate mismatch — unit-tested conversion utility against known Utah REPC before shipping), Pitfall 3 (token limits — full-form test required), Pitfall 4 (hallucination — Zod validation of AI response before any field is stored; explicit enum for field types in JSON schema)
**Research flag:** Requires integration test with real 20-page Utah REPC before shipping. Also validate that gpt-4o-mini text extraction accuracy on Utah standard forms (which have predictable label patterns) meets the 90%+ threshold claimed in research.
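
The hallucination guard for this phase (an explicit enum in the JSON schema, plus validation before any field is stored) might be sketched as follows. To stay dependency-free, the sketch uses a hand-rolled validator where the plan calls for Zod. The response shape, the percentage property names, and `parseAiFields` are all assumptions, and the schema is written on the assumption that strict structured output requires `additionalProperties: false` with every property listed in `required`.

```typescript
const FIELD_TYPES = [
  "client-signature", "initials", "text", "checkbox", "date", "agent-signature",
] as const;
type FieldType = (typeof FIELD_TYPES)[number];

// Hypothetical manual JSON schema for the model's response.
const fieldPlacementSchema = {
  type: "object",
  properties: {
    fields: {
      type: "array",
      items: {
        type: "object",
        properties: {
          type: { type: "string", enum: [...FIELD_TYPES] },
          page: { type: "integer" },
          xPct: { type: "number" },
          yPct: { type: "number" },
          widthPct: { type: "number" },
          heightPct: { type: "number" },
        },
        required: ["type", "page", "xPct", "yPct", "widthPct", "heightPct"],
        additionalProperties: false,
      },
    },
  },
  required: ["fields"],
  additionalProperties: false,
};

interface AiField {
  type: FieldType;
  page: number; // 1-based
  xPct: number;
  yPct: number;
  widthPct: number;
  heightPct: number;
}

function isFieldType(v: string): v is FieldType {
  return (FIELD_TYPES as readonly string[]).includes(v);
}

function isPercent(v: unknown): v is number {
  return typeof v === "number" && Number.isFinite(v) && v >= 0 && v <= 100;
}

// Drops any field the model invented or placed outside the page, rather
// than trusting the response because it parsed as JSON.
function parseAiFields(raw: unknown, pageCount: number): AiField[] {
  if (typeof raw !== "object" || raw === null) return [];
  const fields = (raw as { fields?: unknown }).fields;
  if (!Array.isArray(fields)) return [];
  const out: AiField[] = [];
  for (const f of fields) {
    if (typeof f !== "object" || f === null) continue;
    const { type, page, xPct, yPct, widthPct, heightPct } = f as Record<string, unknown>;
    if (typeof type !== "string" || !isFieldType(type)) continue;
    if (typeof page !== "number" || !Number.isInteger(page) || page < 1 || page > pageCount) continue;
    if (!isPercent(xPct) || !isPercent(yPct) || !isPercent(widthPct) || !isPercent(heightPct)) continue;
    if (xPct + widthPct > 100 || yPct + heightPct > 100) continue; // box spills off the page
    out.push({ type, page, xPct, yPct, widthPct, heightPct });
  }
  return out;
}
```

Validating after the call is still worthwhile with a strict schema, because the schema constrains shape and enum values but not whether a page number or box position makes sense for the actual document.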
### Phase Ordering Rationale ### Phase Ordering Rationale
- **Foundation first (Phase 1):** Auth, DB, and storage have zero optional dependencies. Every subsequent phase builds on these. - Phase 1 is a safety gate — deploy it before any document with new field types can be created or sent
- **Public site early (Phase 2):** Independent of the signing workflow; provides immediate value and allows WFRMLS API approval process to complete while portal phases are underway. The 2-4 week vendor enrollment timeline makes early start critical. - Phase 2 before Phase 3 because `prepare-document.ts` needs the agent-sig embed pattern established before adding the full type-aware rendering switch
- **Portal CRUD before PDF (Phase 3):** Clients and documents are the entities that the PDF pipeline operates on. Establishing the data model and basic CRUD before building the complex pipeline reduces schema churn. - Phase 3 before Phase 4 because preview calls `preparePdf` — incomplete field type handling in prepare means an incomplete preview
- **Storage before fill before sign (Phases 4-6):** Each phase adds one layer of the PDF pipeline. Building them in dependency order means each phase can ship independently and be tested in isolation. - Phase 5 last because it enhances a complete field system; agents can use manual placement throughout all earlier phases; no blocking dependency
- **Audit/download last (Phase 7):** The underlying data is written by Phase 6. Phase 7 is surfacing and visibility only — no new data model needed. - The agent-signature field filtering (Pitfall 10) is addressed in Phase 1, not Phase 2 — this is deliberate; the signing route must be hardened before the first agent-sig field can be placed and sent
- **Scraping architecture decision made before Phase 4:** The decision to use manual upload (no utahrealestate.com scraping) must be explicit in Phase 4 so no scraping infrastructure is ever built.
### Research Flags ### Research Flags
**Needs `/gsd:research-phase` during planning:** **Needs deeper research during planning:**
- **Phase 6 (Signing Flow):** Legal compliance intersection (ESIGN/UETA audit requirements + JWT one-time enforcement + PDF hash + mobile touch) has enough edge cases and gotchas that a pre-implementation research spike is warranted. - **Phase 5 (AI):** The coordinate conversion from percentage to PDF user-space points needs a concrete unit test against a known Utah REPC before implementation. Validate pdfjs-dist legacy build text extraction works correctly in the project's actual Node 20 / Next.js 16.2 environment.
- **Phase 2 (Listings):** WFRMLS vendor enrollment process and exact IDX compliance requirements (disclaimer text, 2024 NAR settlement fields) should be confirmed with WFRMLS directly before the listings UI is built — the requirements change with NAR policy updates. - **Phase 4 (Preview):** Deployment target (Vercel serverless vs. self-hosted container) determines whether preview files can use the local `uploads/` filesystem or must use Vercel Blob. Confirm before writing the preview route.
**Standard patterns (skip research-phase):** **Standard patterns (skip research-phase):**
- **Phase 1 (Foundation):** Next.js + Prisma + better-auth + Vercel Blob are all well-documented with official guides. - **Phase 1 (Schema):** Drizzle discriminated union extension and nullable column additions are well-documented; two-line ALTER TABLE migration.
- **Phase 3 (Portal Shell):** Standard Next.js CRUD patterns; no novel integration. - **Phase 2 (Agent Signature):** The draw-save-reuse pattern is identical to v1.0 client signature; only new pieces are a DB column and API route.
- **Phase 7 (Audit/Status):** The data model is established; surfacing it is standard UI work. - **Phase 3 (Field Types):** All field type APIs are in existing `@cantoo/pdf-lib@2.6.3`; no new library research needed.
## Confidence Assessment ## Confidence Assessment
| Area | Confidence | Notes | | Area | Confidence | Notes |
|------|------------|-------| |------|------------|-------|
| Stack | HIGH | All library versions verified via npm. next@15.5 vs 16 verified against official upgrade guide. pdf-lib unmaintained status confirmed (npm publish date). @pdfme/pdf-lib fork activity confirmed. better-auth + NextAuth merge confirmed via GitHub discussion. | | Stack | HIGH | All versions verified via npm registry; OpenAI Zod v4 incompatibility confirmed via open GitHub issues #1540, #1602, #1709; pdfjs-dist server-side usage confirmed via actual codebase inspection |
| Features | HIGH | Feature list cross-referenced against multiple industry sources (DocuSign, HelloSign, SkySlope/Authentisign competitor analysis). ESIGN/UETA requirements sourced from legal and engineering references. Utah-specific requirements sourced from Utah DRE and WFRMLS. | | Features | HIGH for field types and signing flows; MEDIUM for AI field detection accuracy | Field behavior confirmed against DocuSign, dotloop, SkySlope docs; AI coordinate accuracy confirmed via Feb 2025 benchmarks (< 3% pixel accuracy from vision); actual accuracy on Utah forms is untested |
| Architecture | HIGH | CVE-2025-29927 middleware bypass is a real, documented vulnerability (Vercel postmortem + ProjectDiscovery analysis). PDF coordinate system documented by multiple sources. Build order derived from dependency analysis, not assumption. | | Architecture | HIGH | Based on actual v1.0 codebase review (not speculative); specific file names, function names, and line numbers cited throughout; build order confirmed by dependency analysis |
| Pitfalls | HIGH | Pitfalls sourced from court cases (e-signature audit trail failures), WFRMLS vendor FAQ (ToS prohibition on scraping), NAR IDX policy statement 7.58 (attribution requirements), and confirmed GitHub issues (pdf-lib coordinate bugs, font flattening). | | Pitfalls | HIGH | All pitfalls grounded in actual codebase inspection; specific file paths and line numbers identified (e.g., sign route line 88); no speculative claims |
**Overall confidence:** HIGH across all research areas. Sources are primary (official docs, official policies, CVE disclosures) with consistent cross-referencing. The main area of uncertainty is not technical but logistical: WFRMLS vendor enrollment timeline (2-4 weeks, outside developer control). **Overall confidence:** HIGH
### Gaps to Address ### Gaps to Address
- **WFRMLS vendor enrollment timeline:** The RESO OData API requires a vendor contract, background check, and compliance review — 2-4 weeks. Start this process on day one. Build the listings page with mock data or a dev-environment token while waiting. Do not block Phase 2 on this. - **AI coordinate accuracy on real Utah forms:** Research confirms the text-extraction + label-matching approach is correct, but accuracy on actual Utah REPC and listing agreement forms is untested. Phase 5 must include an integration test with real forms before the feature ships.
- **Preview file lifecycle in production:** The `_preview_{timestamp}.pdf` pattern creates unbounded file growth in `uploads/`. A cleanup strategy (delete previews older than 24 hours, or delete on document send) needs to be decided before Phase 4 implementation.
- **Exact IDX disclaimer text:** WFRMLS provides required disclaimer text that must appear on every listing page. This text changes with NAR policy updates (2024 settlement changed required fields). Obtain the current required text directly from WFRMLS before the listings feature ships — do not copy from another agent's site. - **Deployment target for preview writes:** The write-to-disk preview pattern silently fails on Vercel serverless (ephemeral filesystem). Confirm whether the app runs on Vercel serverless or a persistent container before implementing Phase 4.
- **Which Utah standard forms Teressa uses most frequently:** The manual upload workflow (replacing planned scraping) needs a curated set of base forms pre-loaded. Teressa should identify the 5-10 forms she uses in 90% of transactions so they can be manually uploaded and stored as reusable document templates in the app. This is a product/content decision, not a technical one.
- **Playwright deployment strategy (if forms import is added in v1.x):** Vercel serverless functions cannot run a full Playwright browser due to size limits. If the forms library import feature is added after v1, the Playwright scraping job must run on a separate service (Railway, Render, or a $5/month VPS with Browserless.io). This architecture decision must be made before any scraping code is written — and the ToS decision must be re-evaluated at that time.
- **SPF/DKIM/DMARC for teressacopelandhomes.com:** Signing link emails sent from an unverified sender domain will go to spam. DNS records must be configured before any signing link is sent to a real client. This is a DNS/infrastructure task, not a code task — it must be in the Phase 6 acceptance criteria.
## Sources ## Sources
### Primary (HIGH confidence) ### Primary (HIGH confidence)
- `src/lib/db/schema.ts` (actual codebase, inspected 2026-03-21) — `SignatureFieldData` has no `type` field confirmed
- [Next.js 15.5 Release Notes](https://nextjs.org/blog/next-15-5) — framework version and stable features - `src/app/api/sign/[token]/route.ts` line 88 (actual codebase) — unfiltered `signatureFields` sent to client confirmed
- [Next.js 16 Upgrade Guide](https://nextjs.org/docs/app/guides/upgrading/version-16) — breaking changes confirming 15 is correct choice - `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` (actual codebase) — single "Signature" token; `screenToPdfCoords` Y-inversion pattern confirmed
- [CVE-2025-29927 — Next.js Middleware Bypass (Vercel)](https://nextjs.org/blog/cve-2025-29927) — middleware auth bypass requiring defense-in-depth - [openai npm](https://www.npmjs.com/package/openai) — v6.32.0 confirmed, Node 20 requirement
- [@pdfme/pdf-lib npm page](https://www.npmjs.com/package/@pdfme/pdf-lib) — v5.5.8, active maintenance confirmed - [OpenAI Structured Outputs docs](https://platform.openai.com/docs/guides/structured-outputs) — manual json_schema format confirmed
- [pdfjs-dist npm](https://www.npmjs.com/package/pdfjs-dist) — v5.5.207 current - [openai-node Issue #1540](https://github.com/openai/openai-node/issues/1540) — zodResponseFormat broken with Zod v4
- [signature_pad npm](https://www.npmjs.com/package/signature_pad) — v5.1.3 - [openai-node Issue #1602](https://github.com/openai/openai-node/issues/1602) — zodTextFormat broken with Zod v4
- [better-auth npm](https://www.npmjs.com/package/better-auth) — v1.5.5; Auth.js merge confirmed - [openai-node Issue #1709](https://github.com/openai/openai-node/issues/1709) — Zod 4.1.13+ discriminated union break
- [UtahRealEstate.com Vendor Data Services](https://vendor.utahrealestate.com/) — RESO OData API and forms ToS confirmed - [@cantoo/pdf-lib npm](https://www.npmjs.com/package/@cantoo/pdf-lib) — v2.6.3; createTextField, createCheckBox, drawImage APIs confirmed
- [WFRMLS RESO OData API Examples](https://www.reso.org/web-api-examples/mls/utah-mls/) — API field names and query syntax - [react-pdf ArrayBuffer detach issue #1657](https://github.com/wojtekmaj/react-pdf/issues/1657) — ArrayBuffer copy workaround confirmed
- [NAR IDX Policy Statement 7.58](https://www.nar.realtor/handbook-on-multiple-listing-policy/advertising-print-and-electronic-section-1-internet-data-exchange-idx-policy-policy-statement-7-58) — listing attribution requirements - [Vercel Serverless Function Limits](https://vercel.com/docs/functions/runtimes/node-js#memory-and-compute) — 256MB default memory, 60s max execution on Pro
- [ESIGN/UETA Audit Trail Schema — Anvil Engineering](https://www.useanvil.com/blog/engineering/e-signature-audit-trail-schema-events-json-checklist/) — legal audit requirements - [Utah Division of Real Estate — State Approved Forms](https://realestate.utah.gov/real-estate/forms/state-approved/) — REPC form structure context
- [pdf-lib Field Coordinates Issue — GitHub #602](https://github.com/Hopding/pdf-lib/issues/602) — coordinate API gap confirmed
- [Utah DRE State-Approved Forms](https://commerce.utah.gov/realestate/real-estate/forms/state-approved/) — public domain forms source
- [Vercel Blob documentation](https://vercel.com/docs/vercel-blob) — storage strategy confirmed
- [Prisma 6.19.0 announcement](https://www.prisma.io/blog/announcing-prisma-6-19-0) — version confirmed
- [Neon + Vercel integration](https://vercel.com/marketplace/neon) — serverless DB strategy
### Secondary (MEDIUM confidence) ### Secondary (MEDIUM confidence)
- [Edge AI and Vision Alliance — SAM 2 + GPT-4o (Feb 2025)](https://www.edge-ai-vision.com/2025/02/sam-2-gpt-4o-cascading-foundation-models-via-visual-prompting-part-2/) — GPT-4o returns accurate bounding box coordinates in < 3% of attempts
- [NextAuth vs Clerk vs better-auth comparison (supastarter)](https://supastarter.dev/blog/better-auth-vs-nextauth-vs-clerk) — auth selection rationale (third-party analysis, but findings align with primary sources) - [Instafill.ai — Real estate law flat PDF form automation (Feb 2026)](https://blog.instafill.ai/2026/02/18/case-study-real-estate-law-flat-pdf-form-automation/) — hybrid text-extraction + LLM approach confirmed as production pattern
- [JavaScript PDF Libraries Comparison 2025 (Nutrient)](https://www.nutrient.io/blog/javascript-pdf-libraries/) — PDF library selection (vendor-written but technically accurate) - [DocuSign community — routing order for real estate](https://community.docusign.com/esignature-111/prefill-fields-before-sending-envelope-for-signature-180) — agent order 1, client order 2 confirmed
- [MLS Listing Data Freshness — MLSImport](https://mlsimport.com/fix-outdated-listings-on-your-wordpress-real-estate-site/) — delta sync approach - [Dotloop support — date auto-stamp behavior](https://support.dotloop.com/hc/en-us/articles/217936457-Adding-Signatures-or-Initials-to-Locked-Templates) — date field auto-stamp pattern confirmed
- [Playwright vs Puppeteer 2025 (BrowserStack)](https://www.browserstack.com/guide/playwright-vs-puppeteer) — scraping tool selection - [DocuSign community — Date Signed field](https://community.docusign.com/esignature-111/am-i-able-to-auto-populate-the-date-field-2271) — read-only auto-populated date confirmed
### Tertiary (LOW confidence — validate during implementation)
- [E-Signature UX Best Practices (various vendors)](https://www.esignglobal.com/blog/best-practices-embedded-signing-user-experience-ux) — UX recommendations sourced from vendor blogs; general patterns are sound but specific metrics need validation against real user behavior
- WFRMLS exact required disclaimer text — must be obtained directly from WFRMLS; text changes with NAR policy and cannot be reliably sourced from third parties
--- ---
*Research completed: 2026-03-19* *Research completed: 2026-03-21*
*Ready for roadmap: yes* *Ready for roadmap: yes*