docs: complete project research
# Pitfalls Research
**Domain:** Real estate broker web app — v1.1 additions (AI field placement, expanded field types, agent saved signature, filled document preview) and v1.2 additions (multi-signer support, Docker production deployment)

**Researched:** 2026-03-21 (v1.1); updated 2026-04-03 (v1.2)

**Confidence:** HIGH — all pitfalls grounded in direct review of the v1.0 and v1.1 codebases; no speculative claims. Source code line references included throughout.

---

## Context: What v1.1 and v1.2 Add to the Existing System

The v1.0 and v1.1 codebases have been reviewed in full. Key facts that make every pitfall below concrete:

Facts from the v1.0 codebase (these ground the v1.1 pitfalls):

- `SignatureFieldData` (schema.ts) has **no `type` field** — it stores only `{ id, page, x, y, width, height }`. Every field is treated as a signature.
- `FieldPlacer.tsx` has **one draggable token**, labeled "Signature" — no other field types exist in the palette.
- `SigningPageClient.tsx` **iterates `signatureFields`** and opens the signature modal for every field. It has no concept of field type.
- `embed-signature.ts` **only draws PNG images** — no logic for text, checkboxes, or dates.
- `prepare-document.ts` uses `@cantoo/pdf-lib` (confirmed import), fills AcroForm text fields, and draws blue rectangles for signature placeholders. It does not handle the new field types.
- Prepared PDF paths are stored as relative local filesystem paths (not Vercel Blob URLs). The signing route builds absolute paths from them.
- Agent saved signature: no infrastructure exists yet. The v1.0 `SignatureModal` checks `localStorage` for a saved signature — that is the only "save" mechanism today, and it is per-browser only.

Facts from the v1.1 codebase (these ground the v1.2 pitfalls):

- `signingTokens` has one row per document and no `signerEmail` column. One token = one signer = the current architecture.
- `SignatureFieldData` (schema.ts) stores `{ id, page, x, y, width, height, type? }` — no `signerEmail` field. All fields belong to the single signer.
- `send/route.ts` calls `createSigningToken(doc.id)` once and emails `client.email`. Multi-signer needs iteration.
- The `documents.status` enum is `Draft | Sent | Viewed | Signed`. No per-signer completion state exists.
- `POST /api/sign/[token]` marks `documents.status = 'Signed'` when its one token is claimed. With multiple signers, the first signer to complete triggers this transition prematurely.
- PDF files live at `process.cwd() + '/uploads'` — a local filesystem path. Docker containers have ephemeral filesystems by default.
- `NEXT_PUBLIC_BASE_URL` is used to construct signing URLs. Variables prefixed `NEXT_PUBLIC_` are inlined at build time in Next.js, not resolved at container startup.
- The Nodemailer transporter in `signing-mailer.tsx` calls `createTransporter()` per send — a healthy pattern — but it reads `CONTACT_SMTP_HOST` at call time, which only works if the env var is present in the container.
- `src/lib/db/index.ts` uses `postgres(url)` with no explicit `max` connection limit. The `postgres` npm package defaults to 10 connections per instance; Neon's free tier allows 10 concurrent connections total, so one container saturates the entire budget.
- `next.config.ts` declares `serverExternalPackages: ['@napi-rs/canvas']`. This native binary must be present in the Docker image. The package ships platform-specific `.node` files selected by npm at install time; if the image is built on ARM (Apple Silicon) and run on x86_64 Linux, the wrong binary is included.
- `package.json` lists `@vercel/blob` as a production dependency, but it is not used anywhere in the codebase. Its presence invites accidental use in future code that would break in a non-Vercel Docker deployment.

---

## Summary

Eight risk areas for v1.2:

1. **Multi-signer completion detection** — the current "first signer marks Signed" pattern will falsely complete documents.
2. **Docker filesystem and env vars** — Next.js bakes `NEXT_PUBLIC_*` at build time; the container loses uploads unless a volume is mounted; `DATABASE_URL` and SMTP secrets are silently absent from the container.
3. **SMTP in Docker** — not a DNS problem for external SMTP services; env var injection failure is the confirmed root cause of the reported email breakage.
4. **PDF assembly on partial completion** — the final merged PDF must be produced exactly once, after all signers complete, without race conditions.
5. **Token security** — multiple tokens per document open surfaces that a single-token system didn't have.
6. **Neon connection pool exhaustion** — the `postgres` npm client's default of 10 connections saturates Neon's free-tier limit in a single container.
7. **`@napi-rs/canvas` native binary** — cross-platform Docker builds break this native module without explicit platform targeting.
8. **`@vercel/blob` dead dependency** — installed but unused; its presence risks accidental use in code that would silently fail outside Vercel.

---

## Critical Pitfalls (v1.1)

### Pitfall 1: Breaking the Signing Page by Adding Field Types Without Type Discrimination

**What goes wrong:**
`SignatureFieldData` has no `type` field, and `SigningPageClient.tsx` opens the signature-draw modal for every field in `signatureFields`. When new field types (text, checkbox, initials, date, agent-signature) are stored in that same array with only coordinates, the client signing page either (a) shows a signature canvas for a checkbox field, or (b) crashes with a runtime error when it encounters a field type it doesn't handle, blocking the entire signing page.

**Why it happens:**

The schema change is made on the agent side first (adding a `type` discriminant to `SignatureFieldData` and new field types to `FieldPlacer`), but the signing page is not updated in the same commit. Even one document with mixed field types — sent before the signing page update — will be broken for that client.

**How to avoid:**

Add `type` to `SignatureFieldData` as a string-literal union **before** any field placement UI changes ship. Make the signing page's field renderer branch on `type` defensively: unknown types default to a placeholder ("not required") rather than throwing. Ship the changes atomically — schema migration, `FieldPlacer` update, and `SigningPageClient` update must be deployed together. Never have a deployed state where the schema supports types the signing page doesn't handle.

**Warning signs:**

- `SignatureFieldData` in `schema.ts` gains a `type` property but `SigningPageClient.tsx` still iterates fields without branching on it.
- The `FieldPlacer` palette has more tokens than the signing page has rendering branches.
- A document is sent before the signing page is updated to handle the new types.

**Phase to address:**

Phase 1 of v1.1 (schema and signing page update) — must be the first change, before any AI or UI work touches field types.

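The defensive renderer branch can be sketched as follows — the type names, field shape, and return labels here are illustrative assumptions, not the actual v1.1 components:

```typescript
// Sketch only: shows the defensive discriminated-union pattern for the
// signing page. Names are hypothetical, not the real v1.1 code.
type FieldType = "signature" | "text" | "checkbox" | "initials" | "date" | "agent-signature";

interface SignatureFieldData {
  id: string;
  page: number;
  x: number;
  y: number;
  width: number;
  height: number;
  type?: FieldType; // optional: legacy v1.0 rows have no type
}

// Decide what the signing page should render for a field. Unknown or
// missing types fall back to a safe default instead of throwing.
function rendererFor(
  field: SignatureFieldData,
): "signature-modal" | "text-input" | "checkbox" | "date-picker" | "not-required" {
  switch (field.type ?? "signature") { // legacy rows are treated as signatures
    case "signature":
    case "initials":
      return "signature-modal";
    case "text":
      return "text-input";
    case "checkbox":
      return "checkbox";
    case "date":
      return "date-picker";
    default:
      // agent-signature, or any future type this page doesn't handle yet
      return "not-required";
  }
}
```

The key property is the `default` branch: a field type the signing page has never seen degrades to "not required" instead of blocking the whole page.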
---

### Pitfall 2: AI Coordinate System Mismatch — OpenAI Returns CSS-Space Percentages, pdf-lib Expects PDF Points

**What goes wrong:**

The OpenAI response for field placement can return bounding boxes in any of several formats: percentage of page (0–1 or 0–100), pixel coordinates at an assumed render resolution, or CSS-style top-left origin. The existing `SignatureFieldData` schema stores **PDF user-space coordinates** (bottom-left origin, points). If the AI output is stored without conversion, every AI-placed field appears at the wrong position — often inverted on the Y axis. The mismatch is not obvious during development if you test with PDFs where fields happen to land near the correct area.

**Why it happens:**

The current `FieldPlacer.tsx` already has a correct `screenToPdfCoords` function for converting drag events, but that function takes rendered pixel dimensions as input. When AI output arrives as a JSON payload, it is easy to store the raw AI coordinates directly in the database without passing them through the same conversion. The sign-on-screen overlay in `SigningPageClient.tsx` then applies `getFieldOverlayStyle()`, which expects PDF-space coords, producing the wrong position.

**Concrete example from the codebase:**

`screenToPdfCoords` in `FieldPlacer.tsx` computes:

```
pdfY = ((renderedH - screenY) / renderedH) * pageInfo.originalHeight
```

If the AI returns a y_min as fraction of page height from the top (0 = top), storing it directly as `field.y` means the field appears at the bottom of the page instead of the top, because PDF Y=0 is the bottom.

**How to avoid:**

Define a canonical AI output-format contract before building the prompt. Use normalized coordinates (0–1 fractions from the top-left) in the AI JSON response, then convert server-side using a single `aiCoordsToPagePdfSpace(norm_x, norm_y, norm_w, norm_h, pageWidthPts, pageHeightPts)` utility. This utility mirrors the existing `screenToPdfCoords` logic. Unit-test it against a known Utah purchase agreement with known field positions before shipping.

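The conversion might look like this — a minimal sketch assuming the AI contract uses top-left-origin 0–1 fractions (the parameter names and exact contract are assumptions):

```typescript
// Sketch of the proposed utility. Input: normalized 0–1 fractions with a
// TOP-LEFT origin, as requested from the model. Output: PDF user space
// (points, BOTTOM-LEFT origin), matching SignatureFieldData.
function aiCoordsToPagePdfSpace(
  normX: number, // left edge, 0–1 from the page's left
  normY: number, // top edge, 0–1 from the page's TOP
  normW: number,
  normH: number,
  pageWidthPts: number,
  pageHeightPts: number,
) {
  return {
    x: normX * pageWidthPts,
    // Flip the Y axis: PDF y is the BOTTOM of the box measured from the
    // page bottom, so subtract both the top offset and the box height.
    y: pageHeightPts - (normY + normH) * pageHeightPts,
    width: normW * pageWidthPts,
    height: normH * pageHeightPts,
  };
}
```

On a US-Letter page (612×792 pt), a box with `normY = 0.1` and `normH = 0.05` should land at `y ≈ 673.2` — a fixed-point assertion like that is exactly the kind of check the unit test should make.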
**Warning signs:**

- AI-placed fields appear clustered at the bottom or top of the page regardless of document content.
- The AI integration test uses visual eyeballing rather than coordinate assertions.
- The conversion function is not covered by the existing test suite (`prepare-document.test.ts`).

**Phase to address:**

AI field placement phase — write the coordinate conversion utility and its test before the OpenAI API call is made.

---

### Pitfall 3: OpenAI Token Limits on Large Utah Real Estate PDFs

**What goes wrong:**

Utah standard real estate forms (REPC, listing agreements, buyer representation agreements) run 10–30 pages. Sending the raw PDF bytes or a base64-encoded PDF to GPT-4o-mini will immediately hit the 128k context window for multi-page forms, or produce truncated/hallucinated field detection when the document is silently cut off mid-content. GPT-4o-mini's vision context is further constrained by image tokens — a single PDF page rendered at 72 DPI costs roughly 1,700 tokens, so a 20-page document at standard resolution consumes ~34,000 tokens before any prompt text.

**Why it happens:**

Developers prototype with short test PDFs (2–3 pages) where the approach works, then discover it fails on production forms. The failure mode is not a hard error — the API returns a response, but field positions are wrong or missing because the model never saw the later pages.

**How to avoid:**

Process page by page: render each PDF page to a base64 PNG (using `pdfjs-dist` or `sharp` on the server), send each page image in a separate API call, then merge the field results. Cap input image resolution at 1024px wide (sufficient for field detection). Set a token-budget guard before each API call and log when pages approach the limit. Use structured output (JSON mode) so partial responses fail loudly rather than silently returning incomplete data.

**Warning signs:**

- AI analysis is tested with only a 2-page or 3-page sample PDF.
- The implementation sends the entire PDF to OpenAI in a single request.
- Field detection success rate degrades noticeably from page 8 onward.

**Phase to address:**

AI integration phase — establish the page-by-page pipeline pattern before testing with real Utah forms.

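The per-page pipeline reduces to a loop over rendered page images with the OpenAI call injected — `detectFieldsOnPage` and the `DetectedField` shape below are assumptions for illustration, standing in for the real API call:

```typescript
// Sketch of the page-by-page merge step. detectFieldsOnPage stands in for
// one OpenAI request per rendered page image; names are hypothetical.
interface DetectedField {
  type: string;
  normX: number;
  normY: number;
  normW: number;
  normH: number;
  confidence: number;
}

async function detectAllFields(
  pageImages: string[], // one base64 PNG per page, rendered server-side
  detectFieldsOnPage: (pngBase64: string) => Promise<DetectedField[]>,
): Promise<Array<DetectedField & { page: number }>> {
  const all: Array<DetectedField & { page: number }> = [];
  for (let i = 0; i < pageImages.length; i++) {
    // One API call per page keeps every request far below the context limit.
    const fields = await detectFieldsOnPage(pageImages[i]);
    // Tag each result with its 1-based page number before merging.
    for (const f of fields) all.push({ ...f, page: i + 1 });
  }
  return all;
}
```

Injecting the detector also makes the merge logic unit-testable with a stub, without any network call.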
---

### Pitfall 4: Prompt Design — AI Hallucinates Fields That Don't Exist or Misses Required Fields

**What goes wrong:**

Without a carefully constrained prompt, GPT-4o-mini will "helpfully" infer field locations that don't exist in the PDF (e.g., detecting a printed date as a fillable date field) or will use inconsistent field type names that don't match the application's `type` enum (`"text_input"` instead of `"text"`, `"check_box"` instead of `"checkbox"`). This produces spurious fields in the agent's document and breaks the downstream field-type renderer.

**Why it happens:**

The default behavior of vision models is to be helpful and infer structure. Without explicit constraints (exact allowed types, instructions to return an empty array when no fields exist, a max field count), the output is non-deterministic and schema-incompatible.

**How to avoid:**

Use OpenAI's structured output (JSON schema mode) with an explicit enum for field types matching the application's type discriminant exactly. Include a negative instruction: "Only detect fields that have an explicit visual placeholder (blank line, box, checkbox square) — do not infer fields from printed text labels." Include a `confidence` score per field so the agent UI can filter low-confidence placements. Validate the response JSON against a Zod schema server-side before storing — reject the entire AI response if any field has an invalid type.

**Warning signs:**

- The prompt asks the model to "detect all form fields" without specifying what counts as a field.
- The response is stored directly in the database without Zod validation.
- The agent sees unexpected fields on pages with no visual placeholders.

**Phase to address:**

AI integration phase — validate prompt output against Zod before the first real Utah form is tested.

---

### Pitfall 5: Agent Saved Signature Stored as Raw DataURL — Database Bloat and Serving Risk

**What goes wrong:**

A canvas signature exported with `toDataURL('image/png')` produces a base64-encoded PNG string. A typical signature on a 400x150 canvas is 15–60KB as base64. If this is stored directly in the database (e.g., a `TEXT` column in the `users` table), every query that fetches the user row carries 15–60KB of base64 data it may not need. More critically, if the dataURL is ever sent to the client to pre-populate a form field, it exposes the full signature as a downloadable string in the page source.

**How to avoid:**

Store the signature as a file (Vercel Blob or the existing `uploads/` directory) and store only the file path/URL in the database. On the signing page and preview, serve the signature through an authenticated API route that streams the file bytes — never expose the raw dataURL to the client page. Alternatively, convert the dataURL to a `Uint8Array` immediately on the server (for PDF embedding only) and discard the string — only the file path goes to the DB.

**Warning signs:**

- A `savedSignatureDataUrl TEXT` column is added to the `users` table.
- The agent dashboard page fetches the user row and passes `savedSignatureDataUrl` to a React component prop.
- The signature appears in the React devtools component tree as a base64 string.

**Phase to address:**

Agent saved signature phase — establish the storage pattern (file + path, not dataURL + column) before any signature-saving UI is built.

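The server-side conversion can be sketched as follows, assuming Node's `Buffer` and a PNG dataURL (the function name is illustrative):

```typescript
// Sketch: convert a canvas dataURL to raw PNG bytes on the server, so the
// bytes can be written to uploads/ (or Blob storage) and only a file path
// ever reaches the database.
function dataUrlToPngBytes(dataUrl: string): Uint8Array {
  const prefix = "data:image/png;base64,";
  if (!dataUrl.startsWith(prefix)) {
    throw new Error("Expected a PNG dataURL");
  }
  // Decode the base64 payload; the result can be passed to embedPng()
  // for PDF embedding and then discarded along with the string.
  return new Uint8Array(Buffer.from(dataUrl.slice(prefix.length), "base64"));
}
```
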
---

### Pitfall 6: Race Condition — Agent Updates Saved Signature While Client Is Mid-Signing

**What goes wrong:**

The agent draws and saves a new signature while a client has the signing page open. The signing page has already loaded the signing-request data (including `signatureFields`). When the agent applies the new saved signature to an agent-signature field and re-prepares the document, there are now two versions of the prepared PDF on disk: the one the client is looking at and the newly generated one. If the client submits their signature concurrently with the agent's re-preparation, `embedSignatureInPdf()` may read a partially written prepared PDF (before the atomic rename completes), or the document may be marked "Sent" again after already being in the "Viewed" state, breaking the audit trail.

**Why it happens:**

The existing prepare flow in `PreparePanel.tsx` allows re-preparation of Draft documents. Once agent signing is added, the agent can re-run preparation on a "Sent" or "Viewed" document to swap their signature, creating a mutable prepared PDF while a client session is active.

**How to avoid:**

Lock prepared documents once the first signing link is sent. Gate the agent re-prepare action behind a confirmation: "Resending will invalidate the existing signing link — the client will receive a new email." On confirmation, atomically: (1) mark the old signing token as `usedAt = now()` with reason "superseded", (2) delete the old prepared PDF (or rename it to `_prepared_v1.pdf`), (3) generate a new prepared PDF, (4) issue a new signing token, (5) send a new email. This prevents a mid-session clobber. The existing `embedSignatureInPdf` already uses an atomic rename (`tmp → final`), which prevents partial-read corruption — preserve this.

**Warning signs:**

- The agent can click "Prepare and Send" on a document with status "Sent" without any confirmation dialog.
- The prepared PDF path is deterministic and overwritten in place (e.g., always `{docId}_prepared.pdf`).
- No "superseded" state exists in the `signingTokens` table.

**Phase to address:**

Agent signing phase — implement the supersede-and-resend flow before any agent signature is applied to a sent document.

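The write-then-rename pattern attributed above to `embedSignatureInPdf` looks roughly like this when carried into any new prepare path — paths and the function name are illustrative, not the actual implementation:

```typescript
import { promises as fs } from "node:fs";
import path from "node:path";

// Sketch of the tmp → final atomic-write pattern: a concurrent reader can
// never observe a half-written PDF, because rename() replaces the target
// atomically on the same filesystem.
async function writePdfAtomically(finalPath: string, bytes: Uint8Array): Promise<void> {
  const tmpPath = path.join(
    path.dirname(finalPath),
    `.${path.basename(finalPath)}.tmp`,
  );
  await fs.writeFile(tmpPath, bytes); // write the full file under a temp name
  await fs.rename(tmpPath, finalPath); // then atomically swap it into place
}
```

The atomicity guarantee holds only when the temp file and the final path are on the same filesystem — worth keeping in mind when the Docker deployment mounts `uploads/` as a volume.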
---

### Pitfall 7: Filled Preview Is Served From the Same Path as the Prepared PDF — Stale Preview After Field Changes

**What goes wrong:**

The agent changes field placement or pre-fill values after generating a preview. The preview file on disk is now stale, and the preview URL is cached by the browser (or a CDN). The agent sees the old preview, believes the document is correct, and sends it to the client. The client receives a document with the old pre-fill values, not the updated ones.

**Why it happens:**

The existing `prepare-document.ts` writes to a deterministic path: `{docId}_prepared.pdf`. If the preview is served from the same path, any browser cache of that URL shows the old version, and the agent has no visual indication that the preview is stale.

**How to avoid:**

Generate preview PDFs to a separate path with a timestamp or version suffix: `{docId}_preview_{timestamp}.pdf`. Never serve the preview from the same path as the final prepared PDF. Add a "Preview is stale — regenerate before sending" banner that appears when `signatureFields` or `textFillData` change after the last preview was generated. Store `lastPreviewGeneratedAt` in the document record and compare it to `updatedAt`. The "Send" button should be disabled until a fresh preview has been generated (or explicitly skipped by the agent).

**Warning signs:**

- The preview endpoint serves `/api/documents/{id}/prepared` without a cache-busting mechanism.
- The agent can modify fields after generating a preview and the preview URL does not change.
- No "stale preview" indicator exists in the UI.

**Phase to address:**

Filled document preview phase — establish the versioned preview path and staleness indicator before the first preview is rendered.

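The staleness rule reduces to a timestamp comparison — a minimal sketch using the suggested `lastPreviewGeneratedAt` column (the function name is illustrative):

```typescript
// Sketch of the staleness rule: a preview is stale when the document
// changed after the preview was generated, or no preview exists yet.
// Column names follow the suggestion in the text above.
function isPreviewStale(
  updatedAt: Date,
  lastPreviewGeneratedAt: Date | null,
): boolean {
  if (lastPreviewGeneratedAt === null) return true; // no preview yet
  return updatedAt.getTime() > lastPreviewGeneratedAt.getTime();
}
```

The same predicate can drive both the banner and the disabled state of the "Send" button, so the two can never disagree.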
---

### Pitfall 8: Memory Issues Rendering Large PDFs for Preview on the Server

**What goes wrong:**

Generating a filled preview requires loading the PDF into memory (via `@cantoo/pdf-lib`), modifying it, and either returning the bytes for streaming or writing them to disk. Utah real estate forms (REPC, addendums) can be 15–30 pages and 2–8MB as raw PDFs. Running `PDFDocument.load()` on an 8MB PDF in a Vercel serverless function with a 256MB memory limit can cause OOM errors under concurrent load. The Vercel function timeout (10s default, 60s max on Pro) can also be exceeded for large PDFs with many embedded fonts.

**Why it happens:**

Developers test with a small 2-page PDF in development and the function works fine. It hits the memory wall only when a real Utah standard form (often 20+ pages with embedded images) is processed in production.

**How to avoid:**

Do not generate the preview inline in a serverless function on every request. Instead, generate the preview once (as a write operation), store the result in the `uploads/` directory or Vercel Blob, and serve it from there. Preview generation can be triggered on demand (agent clicks "Generate Preview") and is idempotent. Set a timeout guard: if `PDFDocument.load()` takes longer than 8 seconds, return a 504 with "Preview temporarily unavailable." Monitor the Vercel function execution time and memory in the dashboard — alert at 70% of the memory limit.

**Warning signs:**

- The preview is regenerated on every page load (no stored preview file).
- The preview route calls `PDFDocument.load()` within a synchronous request handler.
- Tests only use PDFs smaller than 2MB.

**Phase to address:**

Filled document preview phase — establish the "generate once, serve cached" pattern from the start.

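The timeout guard can be sketched with `Promise.race` — `withTimeout` is a hypothetical helper, and the route would map the thrown error to a 504 response:

```typescript
// Sketch of an 8-second guard around a slow PDF load. The slow promise
// (e.g. PDFDocument.load()) keeps running, but the route stops waiting
// and can return 504 "Preview temporarily unavailable".
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error("Preview temporarily unavailable")),
      ms,
    );
  });
  // Clear the timer on either outcome so it doesn't keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}
```

Usage would look like `await withTimeout(PDFDocument.load(bytes), 8000)` inside the preview-generation route.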
---

### Pitfall 9: Client Signing Page Confusion — Preview Shows Agent Pre-Fill but Client Signs a Different Document

**What goes wrong:**

The filled preview shows the document with all text pre-fills applied (client name, property address, price). The client signing page also renders the prepared PDF — which already contains those fills, because `prepare-document.ts` fills AcroForm fields and draws text onto the PDF. But the visual distinction between "this is a preview for review" and "this is the actual document you are signing" is unclear. If the agent generates a stale preview and the client signs a different (more recent) prepared PDF, the client believes they signed what they previewed, but the legal document has different content.

**How to avoid:**

The client signing page must always serve the **same** prepared PDF that was cryptographically hashed at prepare time, and the preview the agent saw must be generated from that exact file — not a re-generation. Store the SHA-256 hash of the prepared PDF at preparation time (the same pattern as the existing `pdfHash` for signed PDFs). When serving the client's signing PDF, recompute the hash and verify it matches before streaming. This ties the signed document back to the exact bytes the agent previewed.

**Warning signs:**

- The preview is generated by a different code path than `prepare-document.ts` (e.g., a separate PDF rendering library).
- No hash is stored for the prepared PDF, only for the signed PDF.
- The agent can re-prepare after preview generation without the signing link being invalidated.

**Phase to address:**

Filled document preview phase AND agent signing phase — hash the prepared PDF immediately after writing it (extend the existing `pdfHash` pattern from signed to prepared).

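Extending the `pdfHash` pattern to prepared PDFs might look like this — the function names are illustrative, and Node's built-in `crypto` module is assumed:

```typescript
import { createHash } from "node:crypto";

// Sketch: compute SHA-256 at prepare time, store the hex digest alongside
// the document row, and verify the same bytes before streaming to the client.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Serve the file only if the bytes on disk still match the hash recorded
// when the document was prepared.
function verifyPreparedPdf(bytes: Uint8Array, storedHash: string): boolean {
  return sha256Hex(bytes) === storedHash;
}
```

A mismatch here means the prepared PDF changed after the hash was recorded — the route should refuse to stream and surface an error rather than serve bytes the agent never previewed.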
---

### Pitfall 10: Agent Signature Field Handled by Client Signing Page

**What goes wrong:**

A new `"agent-signature"` field type is added to `FieldPlacer`, and the agent applies their saved signature to this field before sending. But `SigningPageClient.tsx` iterates all fields in `signatureFields` and shows a signing prompt for each one. If the agent-signature field is included in the array sent to the client, the client sees a field labeled "Signature" (or unlabeled) that is already visually signed with someone else's signature, and the progress bar counts it as an unsigned field the client must complete.

**Why it happens:**

The client signing page receives the full `signatureFields` array from the GET `/api/sign/[token]` response. The route currently returns `doc.signatureFields ?? []` without filtering. When agent-signature fields are added to the same array, they are included in the client's field list.

**Concrete location in codebase:**

```typescript
// /src/app/api/sign/[token]/route.ts, line 88
signatureFields: doc.signatureFields ?? [],
```

This sends ALL fields to the client, including any agent-filled fields.

**How to avoid:**

Filter the `signatureFields` array in the signing-token GET route: only return fields where `type !== 'agent-signature'` (or, more precisely, only the fields the client is expected to sign). Agent-signed fields should be pre-embedded into the `preparedFilePath` PDF during document preparation — by the time the client opens the signing link, the agent's signature is already baked into the prepared PDF as a drawn image. The `signatureFields` array sent to the client should contain only the fields the client needs to provide.

**Warning signs:**

- The full `signatureFields` array is returned from the signing-token GET without filtering by `type`.
- Agent-signed fields are stored in the same `signatureFields` JSONB column as client signature fields.
- The client progress bar shows more fields than the client is responsible for signing.

**Phase to address:**

Agent signing phase — filter the signing response by field type before the first agent-signed document is sent to a client.

---

## Multi-Signer Pitfalls (v1.2)

### Pitfall 1: First Signer Marks Document "Signed" — Completion Fires Prematurely

**What goes wrong:**

`POST /api/sign/[token]` at line 254–263 of the current route unconditionally executes:

```typescript
await db.update(documents).set({ status: 'Signed', signedAt: now, ... })
  .where(eq(documents.id, payload.documentId));
```

With two signers, Signer A completes and triggers this, so the document is now `Signed`. Signer B's token is still valid: when Signer B opens their signing page, the GET request returns `doc.signatureFields` (filtered by `isClientVisibleField`), the fields are all there, and nothing prevents Signer B from completing. Two `signature_submitted` audit events are logged for the same document, two conflicting `_signed.pdf` files may be written, and the agent receives two "document signed" emails. The final PDF hash stored in `documents.pdfHash` comes from whichever signer completed last and overwrote the row.

**Why it happens:**

The single-signer assumption is load-bearing in the POST handler. Completion detection is a single UPDATE, not a query across all tokens for the document.

**How to avoid:**

Add a `signerEmail TEXT NOT NULL` column to `signingTokens`. Completion detection becomes: after claiming a token (the atomic UPDATE that prevents double submission), query `SELECT COUNT(*) FROM signing_tokens WHERE document_id = ? AND used_at IS NULL`. If the count reaches zero, all signers have completed — only then trigger final PDF assembly and agent notification. Protect this with a database transaction so the count query and the "mark Signed" update are atomic. Never set `documents.status = 'Signed'` until the zero-remaining-tokens check passes.

**Warning signs:**

- `POST /api/sign/[token]` sets `status = 'Signed'` without first counting remaining unclaimed tokens.
- The agent receives two notification emails after a two-signer document is tested.
- `documents.signedAt` is overwritten by both signers (last-write-wins).

**Phase to address:** Multi-signer schema phase — before any send or signing UI is changed, establish the completion detection query.

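The zero-remaining-tokens rule can be factored into a pure helper for unit testing apart from the database — the `TokenRow` shape is an assumption mirroring the proposed `signingTokens` columns; in production the same rule runs as the `COUNT(*)` inside the token-claim transaction:

```typescript
// Sketch: the completion rule as a pure function. Row shape is hypothetical,
// mirroring the proposed signerEmail / usedAt columns on signingTokens.
interface TokenRow {
  signerEmail: string;
  usedAt: Date | null;
}

// A document is complete only when it has signers at all and every
// signer's token has been claimed (usedAt set).
function allSignersComplete(tokens: TokenRow[]): boolean {
  return tokens.length > 0 && tokens.every((t) => t.usedAt !== null);
}
```

Guarding against the empty array matters: a document whose tokens were all superseded and deleted must not read as "complete".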
---

### Pitfall 2: Race Condition — Two Signers Complete Simultaneously, Both Trigger Final PDF Assembly

**What goes wrong:**

Signer A and Signer B submit within milliseconds of each other (common if they are in the same room). Both claim their respective tokens atomically — that part works. Both then run the "count remaining unclaimed tokens" check. If that check is not inside the same database transaction as the token claim, both reads may return 0 remaining (after the other's claim propagated), and both handlers proceed to assemble the final merged PDF simultaneously. Two concurrent writes to `{docId}_signed.pdf` corrupt the file (partial PDF bytes interleaved), or the second write silently overwrites the first.

**Why it happens:**

The atomic token claim (`UPDATE ... WHERE used_at IS NULL RETURNING`) is a single-row update. The subsequent completion check is a separate query. Two handlers can interleave between those two operations.

**How to avoid:**

Use a `completionTriggeredAt TIMESTAMP` column on `documents` with a one-time-set guard:

```typescript
const won = await db.update(documents)
  .set({ completionTriggeredAt: new Date() })
  .where(and(eq(documents.id, docId), isNull(documents.completionTriggeredAt)))
  .returning({ id: documents.id });
if (won.length === 0) return; // another handler already triggered completion
// proceed to final PDF assembly
```

This is the same pattern the existing token claim uses (`UPDATE ... WHERE used_at IS NULL RETURNING`). If 0 rows are returned, another handler already won the race; skip assembly silently.

**Warning signs:**

- Two concurrent POST requests for the same document produce two `_signed.pdf` files.
- The `documents` table has no `completionTriggeredAt` column.

**Phase to address:** Multi-signer schema phase — establish this pattern alongside the completion detection fix.

---

## Technical Debt Patterns (v1.1)

Shortcuts that seem reasonable but create long-term problems.

| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|----------|-------------------|----------------|-----------------|
| Store saved signature as dataURL in `users` table | No new file-storage code needed | Every user query pulls 15–60KB of base64; dataURL exposed in client props | Never — use file storage from the start |
| Re-use the same `_prepared.pdf` path for preview and final prepared doc | No versioning logic needed | Stale previews; no way to prove which prepared PDF the client signed | Never — versioned paths are required for legal integrity |
| Return all `signatureFields` to the client (no type filtering) | Simpler route code | Client sees agent-signature fields as required fields to complete | Never for the agent-signature type; acceptable for debugging only |
| Prompt OpenAI with the entire PDF as one request | Simpler prompt code | Fails silently on documents > ~8 pages; token limit hit without a hard error | Acceptable only for prototyping with < 5-page test PDFs |
| Add `type` to `SignatureFieldData` without a schema migration | Skip the Drizzle migration step | Existing rows have `null` type; the `signatureFields` JSONB array has mixed null/typed entries; the TypeScript union breaks | Never — migrate immediately |
| Generate the preview on every page load | No caching logic needed | OOM errors on large PDFs under the Vercel memory limit; slow UX | Acceptable only during local development |

---
|
||||
|
||||
## Integration Gotchas

Common mistakes when connecting to external services.

| Integration | Common Mistake | Correct Approach |
|-------------|----------------|------------------|
| OpenAI Vision API | Sending raw PDF bytes — PDFs are not natively supported by vision models | Convert each page to PNG via pdfjs-dist on the server; send page images, not PDF bytes |
| OpenAI structured output | Using `response_format: { type: 'json_object' }` and hoping the schema matches | Use `response_format: { type: 'json_schema', json_schema: { ... } }` with the exact schema, then validate with Zod |
| `@cantoo/pdf-lib` (confirmed import in codebase) | Calling `embedPng()` with a base64 dataURL that includes the `data:image/png;base64,` prefix on systems that strip it | The existing `embed-signature.ts` already handles this correctly — preserve the pattern when adding new embed paths |
| `@cantoo/pdf-lib` flatten | Flattening before drawing rectangles causes the AcroForm overlay to appear on top of drawn content | The existing `prepare-document.ts` already handles the order correctly (flatten first, then draw) — preserve this order in any new prepare paths |
| Vercel Blob (if migrated from local uploads) | Fetching a Blob URL inside a serverless function on the same Vercel deployment causes a request to the CDN with potential cold-start latency | Use the `@vercel/blob` SDK's `get()` method rather than `fetch(blob.url)` from within API routes |
| Agent signature file serving | Serving the agent's saved signature PNG via a public URL | Gate all signature file access behind the authenticated agent API — never expose it via a public Blob URL |

### Pitfall 3: Legacy Single-Signer Documents Break When signingTokens Gains signerEmail

**What goes wrong:**

v1.0 and v1.1 documents have one row in `signingTokens` with no `signerEmail`. When the multi-signer schema adds `signerEmail NOT NULL` to `signingTokens`, all existing token rows become invalid (null violates NOT NULL). If the column is added without a migration that backfills existing rows, all existing signing links stop working: the token lookup succeeds, but any code reading `token.signerEmail` throws a null dereference.

**Why it happens:**

Drizzle migrations add the column in a single ALTER TABLE. There is no Drizzle migration command that backfills legacy data — that requires a separate SQL step in the migration file.

**How to avoid:**

Add `signerEmail` as `TEXT` (nullable) initially. Backfill existing rows with the client's email via a JOIN at migration time. Then add the NOT NULL constraint in a second migration once the backfill is confirmed. Alternatively, add `signerEmail TEXT DEFAULT ''` and document that the empty string means "legacy single-signer." All code reading `signerEmail` must handle the legacy empty/null case.
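Until the backfill lands, any code that reads `signerEmail` needs the legacy fallback. A minimal sketch (the helper name `resolveSignerEmail` and the row shapes are illustrative assumptions, not existing code):

```typescript
// Sketch only: token row shape follows the v1.2 schema described above.
interface TokenRow {
  signerEmail: string | null; // null or '' on legacy v1.0/v1.1 rows
}

function resolveSignerEmail(token: TokenRow, documentClientEmail: string): string {
  // Legacy single-signer rows predate signerEmail: fall back to the
  // document's client email instead of dereferencing null.
  if (!token.signerEmail) return documentClientEmail;
  return token.signerEmail;
}
```

Falling back to the document's client email mirrors the backfill rule: a legacy document had exactly one signer, the client.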
**Warning signs:**

- A Drizzle migration adds `signer_email TEXT NOT NULL` in one step with no `DEFAULT` and no backfill SQL.
- A v1.0 document's signing link is not tested after migration.

**Phase to address:** Multi-signer schema phase — include legacy backfill SQL in the migration script.

---
## Performance Traps

Patterns that work at small scale but fail as usage grows.

| Trap | Symptoms | Prevention | When It Breaks |
|------|----------|------------|----------------|
| OpenAI call inline with the agent's "AI Place Fields" button click | 10–30 second page freeze; API timeout on multi-page PDFs | Trigger AI placement as a background job; poll for completion; show a progress bar | Immediately on PDFs > 5 pages |
| PDF preview generation in a synchronous serverless function | Vercel function timeout (60s max on Pro); OOM on 8MB PDFs | Generate once and store; serve from storage | On PDFs > 10MB or under concurrent load |
| Storing all signatureFields JSONB on the documents table without a size guard | Large JSONB column slows document list queries | Add a field count limit (max 50 fields); if AI places more, require agent review | When AI places fields on 25+ page documents with many fields per page |
| dataURL signature image in `signaturesRef.current` in SigningPageClient | Each re-render serializes 50KB+ per signature into JSON | Already handled correctly in v1.0 (ref, not state) — do not move signature data to state when adding type-based rendering | Would break at > 5 simultaneous signature fields |

### Pitfall 4: Field-to-Signer Tag Stored in JSONB — Queries Cannot Filter by Signer Efficiently

**What goes wrong:**

`signatureFields JSONB` is an array of field objects. Adding `signerEmail` to each field object is the right call for field filtering on the signing page (already done via `isClientVisibleField`). But if completion detection, the status dashboard, or a "who has signed" query tries to derive the signer list from the JSONB array, it requires a Postgres JSONB containment query (`@>` or `jsonb_array_elements`). These are unindexed by default and slow on large arrays. More critically, if the agent changes a field's `signerEmail` tag after the document has been sent, the JSONB update does not cascade to any `signingTokens` rows — the token was issued for the old email.

**How to avoid:**

The authoritative list of signers and their completion state lives in `signingTokens`, not in the JSONB. `signingTokens.signerEmail` is the source of truth for "who needs to sign." The JSONB field's `signerEmail` is used only at signing-page render time to filter which fields a given signer sees. Once a document is Sent (tokens issued), the JSONB field tags are considered frozen — re-tagging fields on a Sent document is not permitted without voiding the existing tokens.
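The render-time filtering can stay a pure function over the JSONB array. A sketch under assumptions (the field shape and helper name are illustrative; the codebase's own `isClientVisibleField` may differ):

```typescript
// Assumed field shape: legacy v1.0 fields lack `type` and `signerEmail`.
interface PlacedField {
  id: string;
  type?: string;               // absent on legacy fields; treated as 'signature'
  signerEmail?: string | null; // absent on legacy single-signer fields
}

// Fields a given signer should see: their own fields plus untagged legacy
// fields, excluding agent-signature fields (never shown to clients).
function fieldsForSigner(fields: PlacedField[], signerEmail: string): PlacedField[] {
  return fields.filter((f) => {
    if ((f.type ?? 'signature') === 'agent-signature') return false;
    return !f.signerEmail || f.signerEmail === signerEmail;
  });
}
```

Keeping this a pure filter over the already-fetched JSONB avoids the unindexed containment query entirely.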
**Warning signs:**
|
||||
- A query tries to derive the recipient list from `signatureFields JSONB` rather than from `signingTokens`.
|
||||
|
||||
**Phase to address:** Multi-signer schema phase — document this invariant in a code comment on `signingTokens`.
|
||||
|
||||
---
|
||||
|
||||
## Security Mistakes

Domain-specific security issues beyond general web security.

| Mistake | Risk | Prevention |
|---------|------|------------|
| Agent saved signature served via a predictable or public file path | Any user who can guess the path downloads the agent's legal signature | Store it under a UUID path; serve it only through `GET /api/agent/signature`, which verifies the better-auth session before streaming |
| AI field placement values (pre-fill text) passed to OpenAI without scrubbing | Client PII (name, email, SSN, property address) is sent to OpenAI and stored in their logs | Provide only anonymized document structure to the AI (page images without personally identifiable pre-fill values); apply pre-fill values server-side after AI field detection |
| Preview PDF served at a guessable URL (e.g. `/api/documents/{id}/preview`) without an auth check | Anyone with the document ID can download a prepared document containing client PII | All document file routes must verify the agent session before streaming — apply the same guard as the existing `/api/documents/[id]/download/route.ts` |
| Agent signature dataURL transmitted from client to server in an unguarded API route | Any authenticated user (if multi-agent is ever added) can overwrite the saved signature | The save-signature endpoint must verify the session user matches the signature owner — prepare for this even in solo-agent v1 |
| Stale prepared-PDF preview served to the client after re-preparation | Client signs a document that differs from what the agent reviewed and approved | Hash the prepared PDF at prepare time; verify the hash before serving it to the client signing page |

### Pitfall 5: Audit Trail Gap — No Record of Which Signer Completed Which Field

**What goes wrong:**

The current `audit_events` table has `eventType: 'signature_submitted'` at the document level. With one signer this is unambiguous. With two signers, two `signature_submitted` events are logged for the same `documentId` with no `signerEmail` on the event. The legal audit trail cannot distinguish "Seller A signed at 14:00" from "Seller B signed at 14:05" — both appear as anonymous "signature submitted" events on the same document.

**Why this matters:**

Utah e-signature law requires proof of who signed what and when. An undifferentiated audit log is a legal compliance gap (see the existing LEGAL-03 compliance requirement in v1.0).

**How to avoid:**

Add `signerEmail TEXT` to `auditEvents` (nullable, to preserve backward compatibility with v1.0 events). When logging `signature_submitted` in multi-signer mode, include the `signerEmail` from the claimed token row in the event metadata. The `metadata JSONB` column already exists and can carry this without a schema change — use `metadata: { signerEmail: tokenRow.signerEmail }` as a minimum before a proper column is added.
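The interim metadata approach can be isolated in one helper so every logging call site stays consistent. A sketch (the helper name and row shape are assumptions, not existing code):

```typescript
// Sketch only: column names follow the auditEvents schema described above.
interface ClaimedToken {
  documentId: string;
  signerEmail: string | null; // null on legacy single-signer tokens
}

function buildSignatureAuditEvent(token: ClaimedToken) {
  return {
    documentId: token.documentId,
    eventType: 'signature_submitted' as const,
    // Carried in the existing metadata JSONB until a dedicated column lands.
    metadata: { signerEmail: token.signerEmail ?? 'legacy-single-signer' },
  };
}
```

Using a sentinel for legacy tokens keeps the audit log queryable without nulls while v1.0 rows still exist.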
**Warning signs:**

- Two `signature_submitted` events logged for the same `documentId` with no distinguishing field.

**Phase to address:** Multi-signer signing flow phase — include signer identity in audit events before the first multi-signer document is tested.

---
## UX Pitfalls

Common user experience mistakes in this domain.

| Pitfall | User Impact | Better Approach |
|---------|-------------|-----------------|
| Preview opens in a new browser tab as a raw PDF | Agent has no context that this is a preview vs. the final document; no field overlays are visible | Display the preview in-app with a "PREVIEW — Fields Filled" watermark overlay on each page |
| AI-placed fields shown without a review step | Agent sends a document with misaligned AI fields to a client; the client is confused by floating sign boxes | AI placement populates the FieldPlacer UI for agent review — never auto-sends; the agent must manually click "Looks good, proceed" |
| "Prepare and Send" button available before the agent has placed any fields | Agent sends a blank document with no signature fields; the client has nothing to sign | Disable "Prepare and Send" if `signatureFields` is empty or contains only agent-signature fields (no client fields) |
| Agent saved signature is applied but no visual confirmation is shown | Agent thinks the signature was applied; the document arrives unsigned because the apply step silently failed | Show the agent's saved signature PNG in the field placer overlay immediately after apply; require explicit confirmation before the prepare step |
| Preview shows pre-filled text but not field type labels | Agent cannot distinguish a "checkbox" pre-fill from a "text" pre-fill in the visual preview | Show field type badges (small colored labels) on the preview overlay, not just the filled content |
| Client signing page shows no progress for non-signature fields (text, checkbox, date) | Client doesn't know they need to fill in text boxes or check checkboxes — they see only signature prompts | The progress bar in `SigningProgressBar.tsx` counts `signatureFields.length` — it must count all client-facing fields, not just signature-type fields |

### Pitfall 6: Document Status "Viewed" Conflicts Across Signers

**What goes wrong:**

The current GET `/api/sign/[token]` sets `documents.status = 'Viewed'` when any signer opens their link (line 81 of the current route). With two signers, Signer A opens the link → the document becomes Viewed. Signer A backs out without signing. Signer B hasn't even opened their link yet. The agent sees "Viewed" status and assumes both signers have engaged. If Signer A then signs, the status jumps from Viewed → Signed (via the POST handler), bypassing any intermediate state. The agent has no way to know that Signer B never opened their link.

**How to avoid:**

Per-signer status belongs in `signingTokens`, not in `documents`. Add a `viewedAt TIMESTAMP` column to `signingTokens`. The GET handler sets `signingTokens.viewedAt = NOW()` for the specific token, not `documents.status`. The document-level status becomes a computed aggregate: `Draft` → `Sent` (any token issued) → `Partially Signed` (some tokens have `usedAt` set) → `Signed` (all tokens have `usedAt` set). Consider adding `Partially Signed` to the `documentStatusEnum`, or compute it in the agent dashboard query.
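The computed aggregate is cheap to derive from token rows alone. A sketch (the helper name is an assumption; the `usedAt`-based completion check follows the schema described above):

```typescript
// Sketch only: one token row per signer, usedAt set when signing completes.
interface TokenState {
  usedAt: Date | null;
}

type DocStatus = 'Draft' | 'Sent' | 'Partially Signed' | 'Signed';

function deriveDocumentStatus(tokens: TokenState[]): DocStatus {
  if (tokens.length === 0) return 'Draft'; // no tokens issued yet
  const signed = tokens.filter((t) => t.usedAt !== null).length;
  if (signed === 0) return 'Sent';
  if (signed < tokens.length) return 'Partially Signed';
  return 'Signed';
}
```

Computing the status instead of storing it removes the write conflict entirely: no handler ever races to update `documents.status`.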
**Warning signs:**

- The signing GET handler writes `documents.status = 'Viewed'` instead of `signingTokens.viewedAt = NOW()`.

**Phase to address:** Multi-signer schema phase — add `viewedAt` to `signingTokens` and derive document status from token states.

---
## "Looks Done But Isn't" Checklist

Things that appear complete but are missing critical pieces.

- [ ] **AI field placement:** Verify the coordinate conversion unit test asserts specific PDF-space x/y values (not just "fields are returned") — eyeball testing will miss Y-axis inversion errors on Utah standard forms.
- [ ] **Expanded field types:** Verify `SigningPageClient.tsx` has a rendering branch for every type in the `SignatureFieldData` type union — not just the new FieldPlacer palette tokens. Check for the default/fallback case.
- [ ] **Agent saved signature:** Verify the saved signature is stored as a file path, not a dataURL TEXT column — check the Drizzle schema migration and confirm no `dataUrl` column was added to `users`.
- [ ] **Agent signs first:** Verify that after the agent applies their signature, the agent-signature field is embedded into the prepared PDF and removed from the `signatureFields` array that gets sent to the client — not just visually hidden in the FieldPlacer.
- [ ] **Filled preview:** Verify the preview URL changes when fields or text fill values change (cache-busting via a timestamp or hash in the path) — open the DevTools network tab, modify a field, regenerate the preview, and confirm a new file is fetched.
- [ ] **Filled preview freshness gate:** Verify the "Send" button is disabled when `lastPreviewGeneratedAt < lastFieldsUpdatedAt` — test by generating a preview, changing a field, and confirming the send button becomes disabled.
- [ ] **OpenAI token limit:** Verify AI placement works on a real 20-page Utah REPC form, not just a 2-page test PDF — check that page 15+ fields are detected with the same accuracy as page 1.
- [ ] **Schema migration:** Verify that documents created in v1.0 (where the `signatureFields` JSONB has entries without a `type` key) are handled gracefully by all v1.1 code paths — add a null-safe fallback of `field.type ?? 'signature'` throughout.

---

## Docker/Deployment Pitfalls

### Pitfall 7: NEXT_PUBLIC_BASE_URL Is Baked at Build Time — Wrong URL in Production Container

**What goes wrong:**

`send/route.ts` line 35 reads:

```typescript
const baseUrl = process.env.NEXT_PUBLIC_BASE_URL ?? 'http://localhost:3000';
```

In Next.js, any variable prefixed `NEXT_PUBLIC_` is substituted at `next build` time — it becomes a string literal in the compiled JavaScript bundle. If the Docker image is built with `NEXT_PUBLIC_BASE_URL=http://localhost:3000` (or not set at all), every signing URL emailed to clients will point to `localhost:3000` regardless of what is set in the container's runtime environment. The client clicks the link and gets "connection refused."

**This is specific to `NEXT_PUBLIC_*` variables.** Server-only variables (no `NEXT_PUBLIC_` prefix) ARE read at runtime from the container environment. Mixing the two causes precisely the confusion reported in this project.

**How to avoid:**

For variables that need to be available on the server only (like a base URL for constructing server-side links), remove the `NEXT_PUBLIC_` prefix. `NEXT_PUBLIC_` should be used only for variables that must reach the browser bundle. The signing URL is constructed in a server-side API route — it does not need `NEXT_PUBLIC_`. Rename it to `SIGNING_BASE_URL` (no prefix), read it only in API routes, and inject it into the container environment at runtime via the Docker Compose `environment:` block.
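A server-only read can also fail fast instead of silently defaulting to localhost. A sketch, assuming the `SIGNING_BASE_URL` rename (the helper name and normalization behavior are illustrative, not existing code):

```typescript
// Sketch only: call with process.env inside an API route. Because the name
// has no NEXT_PUBLIC_ prefix, the value comes from the runtime container
// environment rather than being baked in at `next build` time.
function getSigningBaseUrl(env: Record<string, string | undefined>): string {
  const url = env.SIGNING_BASE_URL;
  // Fail fast at send time rather than emailing a localhost link.
  if (!url) throw new Error('SIGNING_BASE_URL is not set in the container environment');
  return url.replace(/\/$/, ''); // normalize: no trailing slash
}
```

Throwing here surfaces the misconfiguration on the first send attempt with a clear message, instead of a working-looking email that links to `localhost:3000`.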
**Warning signs:**

- Signing emails send, but clicking the link shows a browser connection error or goes to localhost.
- `NEXT_PUBLIC_BASE_URL` is set in `docker-compose.yml` under `environment:` and the developer assumes this is sufficient — it is not, because the value was already baked in during `docker build`.

**Phase to address:** Docker deployment phase — rename the variable and audit all `NEXT_PUBLIC_` usages before building the production image.

---
## Recovery Strategies

When pitfalls occur despite prevention, how to recover.

| Pitfall | Recovery Cost | Recovery Steps |
|---------|---------------|----------------|
| Client received a signing link but the signing page crashes on new field types | HIGH | Emergency hotfix: add a `field.type ?? 'signature'` fallback in SigningPageClient; deploy; invalidate the old token; send a new link |
| AI-placed fields are wrong/inverted on the first real-form test | LOW | Fix the coordinate conversion unit; re-run AI placement for that document; no data migration needed |
| Agent saved signature stored as a dataURL in the DB | MEDIUM | Add a migration: extract the dataURL to a file, update the path column, nullify the dataURL column; existing signed PDFs are unaffected |
| Preview PDF served stale after field changes | LOW | Add a cache-busting query param or timestamp to the preview URL; no data changes needed |
| Agent-signature field appears in the client's signing field list | HIGH | Emergency hotfix: filter signatureFields by type in the signing token GET; redeploy; affected in-flight signing sessions may need new tokens |
| Large PDF causes Vercel function OOM during preview generation | MEDIUM | Switch preview to a background job + polling; no data migration; existing prepared PDFs are valid |

### Pitfall 8: Uploads Directory Is Lost on Container Restart

**What goes wrong:**

All uploaded PDFs, prepared PDFs, and signed PDFs are written to `process.cwd() + '/uploads'`. In the Docker container, `process.cwd()` is the directory where Next.js starts — typically `/app`. The path `/app/uploads` is inside the container's writable layer, which is ephemeral. When the container is stopped and recreated (deployment, crash, `docker compose up --force-recreate`), all PDFs are gone. Signed documents that were legally executed are permanently lost. Clients cannot download their signed copies. The agent loses the audit record.

**How to avoid:**

Mount a named Docker volume at `/app/uploads` (or wherever `process.cwd()` resolves inside the container) in `docker-compose.yml`:

```yaml
services:
  app:
    volumes:
      - uploads_data:/app/uploads
volumes:
  uploads_data:
```

Verify the mount path matches `process.cwd()` inside the container — do not assume it is `/app`. Run `docker exec <container> node -e "console.log(process.cwd())"` to confirm. The volume must also be backed up separately; Docker named volumes are not automatically backed up.

**Warning signs:**

- No `volumes:` key appears in `docker-compose.yml` for the app service.
- After a container restart, the agent portal shows documents with no downloadable PDF (the file path in the DB is valid but the file does not exist on disk).

**Phase to address:** Docker deployment phase — establish the volume before any production upload occurs.

---
## Pitfall-to-Phase Mapping

How roadmap phases should address these pitfalls.

| Pitfall | Prevention Phase | Verification |
|---------|------------------|--------------|
| Breaking the signing page with new field types (Pitfall 1) | Phase 1: Schema + signing page update | Deploy the field type union; confirm the signing page renders a placeholder for unknown types; load an old v1.0 document with no type field and verify graceful fallback |
| AI coordinate system mismatch (Pitfall 2) | Phase 2: AI integration — coordinate conversion utility | Unit test with a known Utah REPC: assert specific PDF-space x/y for a known field; Y-axis inversion test |
| OpenAI token limits on large PDFs (Pitfall 3) | Phase 2: AI integration — page-by-page pipeline | Test with the longest form Teressa uses (likely a 20+ page REPC); verify all pages are processed |
| Prompt hallucination and schema incompatibility (Pitfall 4) | Phase 2: AI integration — Zod validation of the AI response | Feed an edge-case page (all text, no form fields) and verify the AI returns an empty array, not hallucinated fields |
| Saved signature as a dataURL in the DB (Pitfall 5) | Phase 3: Agent saved signature | Confirm the Drizzle schema has a path column, not a dataURL column; verify the file is stored under a UUID path |
| Race condition: agent updates signature mid-signing (Pitfall 6) | Phase 3: Agent saved signature + supersede flow | Confirm "Prepare and Send" on a Sent/Viewed document requires confirmation and invalidates the old token |
| Stale preview after field changes (Pitfall 7) | Phase 4: Filled document preview | Modify a field after preview generation; confirm the send button disables or the preview refreshes |
| OOM on large PDF preview (Pitfall 8) | Phase 4: Filled document preview | Test preview generation on a 20-page REPC; monitor Vercel function memory in the dashboard |
| Client signs a different doc than the agent previewed (Pitfall 9) | Phase 4: Filled document preview | Confirm the prepared PDF is hashed at prepare time; verify the hash is checked before streaming to the client |
| Agent-signature field shown to the client (Pitfall 10) | Phase 3: Agent signing flow | Confirm the signing token GET filters `type === 'agent-signature'` fields before returning; test with a document that has both agent and client signature fields |

### Pitfall 9: Database Connection String Absent in Container — App Boots but All Queries Fail

**What goes wrong:**

`DATABASE_URL` and other secrets (`SIGNING_JWT_SECRET`, `CONTACT_SMTP_HOST`, etc.) are not committed to the repository. In development they live in `.env.local`. In a Docker container, `.env.local` is not automatically copied (`.gitignore` typically excludes it, and `COPY . .` in a Dockerfile may or may not include it depending on `.dockerignore`). If the Docker image is built without the secret baked in (correct practice) but `docker-compose.yml` does not inject it via `environment:` or `env_file:`, the container starts successfully — `next start` does not validate env vars at startup — but every database query throws "missing connection string" at request time. The agent portal loads its login page (server components that don't query the DB) but crashes on any data operation.

The `src/lib/db/index.ts` lazy singleton does throw `"DATABASE_URL environment variable is not set"` when first accessed — but this error is silent at startup and only surfaces on the first request.

**How to avoid:**

Create a `.env.production` file (not committed) that is referenced in `docker-compose.yml` via `env_file: .env.production`. Alternatively, use Docker Compose `environment:` blocks with explicit variable names. Validate at container startup by adding a health check endpoint (`/api/health`) that runs `SELECT 1` against the database and returns 200 only when the connection is live. Gate the container's `healthcheck:` on this endpoint so Docker Compose's `depends_on: condition: service_healthy` prevents the app from accepting traffic before the DB is reachable.
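A startup guard makes the missing-variable case loud instead of deferred to the first request. A sketch (the helper name and exact variable list are assumptions based on the secrets named above; call it from an entrypoint or startup hook and exit if anything is missing):

```typescript
// Sketch only: mirrors the env vars this document identifies as required.
const REQUIRED_ENV = [
  'DATABASE_URL',
  'SIGNING_JWT_SECRET',
  'CONTACT_SMTP_HOST',
  'CONTACT_EMAIL_USER',
  'CONTACT_EMAIL_PASS',
] as const;

// Returns the names that are unset or empty, so the caller can log them
// all at once and exit, rather than failing one variable at a time.
function findMissingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_ENV,
): string[] {
  return required.filter((name) => !env[name]);
}
```

Pair this with the `/api/health` endpoint: the env check catches injection failures at boot, and the `SELECT 1` probe catches an unreachable database after boot.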
**Warning signs:**

- The login page loads in Docker but the agent portal shows 500 errors on every page.
- `docker logs <container>` shows "Environment variable DATABASE_URL is not set" at the first request, not at startup.
- The `.env.production` or secrets file is not referenced anywhere in `docker-compose.yml`.

**Phase to address:** Docker deployment phase — validate all required env vars against a checklist before the first production deploy.

---
### Pitfall 10: PostgreSQL Container and App Container Start in the Wrong Order — DB Not Ready

**What goes wrong:**

`docker compose up` starts all services in parallel by default. The Next.js app container may attempt its first database query before PostgreSQL has accepted connections. Drizzle's `postgres` client (the `postgres` npm package) throws `ECONNREFUSED` or `ENOTFOUND` on the first query. The app container may crash-loop if the error is unhandled at startup, or silently return 500s until the DB is ready if queries are only made at request time.

**How to avoid:**

Add `depends_on` with `condition: service_healthy` in `docker-compose.yml`. The PostgreSQL service needs a `healthcheck:` using `pg_isready`:

```yaml
services:
  db:
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
  app:
    depends_on:
      db:
        condition: service_healthy
```

Also run Drizzle migrations as part of app startup (add `drizzle-kit migrate` to the container's `command:` or an entrypoint script) so the schema is applied before the first request. Without this, a fresh deployment against an empty database will fail on every query.

**Warning signs:**

- `docker-compose.yml` has no `healthcheck:` on the database service.
- `docker-compose.yml` has no `depends_on` on the app service.

**Phase to address:** Docker deployment phase — write the complete `docker-compose.yml` with health checks before the first production deploy.

---
### Pitfall 11: Neon Connection Pool Exhaustion in Docker

**What goes wrong:**

`src/lib/db/index.ts` creates a `postgres(url)` client with no explicit `max` parameter. The `postgres` npm package defaults to `max: 10` connections per process. Neon's free tier allows 10 concurrent connections total. One Next.js container with default settings exhausts the entire connection budget. A second container (staging and production running simultaneously, or a restart overlap) causes all new queries to queue indefinitely until connections are freed, manifesting as timeouts on every request.

Additionally, the current proxy-singleton pattern in `db/index.ts` creates one pool per Node.js process. Next.js in development mode can hot-reload modules, creating multiple pool instances per dev session. In production this is not a problem, but it can silently leak connections during CI test runs or development stress tests.

**Why it happens:**

The `postgres` npm package does not warn when connection limits are exceeded — it silently queues queries. The Neon dashboard shows the connection count; the app shows only request timeouts with no clear error.

**How to avoid:**

Set an explicit `max` connection limit appropriate for the deployment. For a single-container deployment against the Neon free tier (10-connection limit), use `postgres(url, { max: 5 })` to leave headroom for migrations, admin queries, and overlap during deployments. For paid Neon tiers, scale accordingly. Add `idle_timeout: 20` (seconds) to release idle connections promptly. Add `connect_timeout: 10` to surface connection failures quickly rather than queuing indefinitely.

Recommended `db/index.ts` configuration:

```typescript
const client = postgres(url, {
  max: 5,              // conservative for the Neon free tier; increase with a paid plan
  idle_timeout: 20,    // release idle connections within 20s
  connect_timeout: 10, // fail fast if Neon is unreachable
});
```

**Warning signs:**

- `postgres(url)` called with no second argument in `db/index.ts`.
- The Neon dashboard shows the connection count at its ceiling during normal single-user usage.
- Requests time out with no database error in the logs — only generic "fetch failed" errors.

**Phase to address:** Docker deployment phase — configure connection pool limits before the first production deploy.

---
### Pitfall 12: @napi-rs/canvas Native Binary — Wrong Platform in Docker Image

**What goes wrong:**

`@napi-rs/canvas` is declared in `serverExternalPackages` in `next.config.ts`, which tells Next.js to load it as a native Node.js module rather than bundling it. The package ships pre-compiled `.node` binaries for specific platforms (darwin-arm64, linux-x64-gnu, linux-arm64-gnu, etc.). When `npm install` runs on an Apple Silicon Mac during development, npm downloads the `darwin-arm64` binary. If the Docker image is built by running `npm install` inside a `node:alpine` container (which is `linux-musl`, not `linux-gnu`), the `linux-x64-musl` binary is selected — but `@napi-rs/canvas` does not publish musl builds. The canvas module then fails to load at runtime with `Error: /app/node_modules/@napi-rs/canvas/...node: invalid ELF header`.

Even if the Docker base image is `node:20-slim` (Debian, linux-gnu), building on an ARM host and deploying to an x86 server results in the wrong binary unless the `--platform` flag is used during `docker build`.

**How to avoid:**

Always build the Docker image with an explicit platform target matching the production host:

```bash
docker build --platform linux/amd64 -t app .
```

Use `node:20-slim` (Debian-based, glibc) as the Docker base image — not `node:20-alpine` (musl). Verify the canvas module loads in the container before deploying:

```bash
docker exec <container> node -e "require('@napi-rs/canvas'); console.log('canvas OK')"
```

If developing on ARM and deploying to x86, add `--platform linux/amd64` to the `docker build` command in the deployment runbook and CI pipeline.

**Warning signs:**

- `next.config.ts` lists `@napi-rs/canvas` in `serverExternalPackages`.
- The Docker base image is `node:alpine`.
- The build machine architecture differs from the deployment target.
- Runtime error: `invalid ELF header` or `Cannot find module '@napi-rs/canvas'` after a clean image build.

**Phase to address:** Docker deployment phase — verify canvas module compatibility before the first production build.

---
## Email/SMTP Pitfalls
|
||||
|
||||
### Pitfall 13: SMTP Env Vars Absent in Container — Root Cause of Reported Email Breakage
|
||||
|
||||
**What goes wrong:**
|
||||
This is the reported issue: email worked in development but broke when deployed to Docker. The most likely root cause is that `CONTACT_SMTP_HOST`, `CONTACT_SMTP_PORT`, `CONTACT_EMAIL_USER`, `CONTACT_EMAIL_PASS`, and `AGENT_EMAIL` are not present in the container environment. `signing-mailer.tsx` reads these in `createTransporter()` which is called at send time (not at module load) — so the missing env vars do not cause a startup error. The first signing email attempt fails with Nodemailer throwing `connect ECONNREFUSED` (if host resolves to nothing) or `Invalid login` (if credentials are absent).
|
||||
|
||||
**Why it looks like a DNS problem but isn't:**
|
||||
Docker containers on a bridge network use the host's DNS resolver (or Docker's embedded resolver) and can reach external SMTP servers by hostname without any special configuration. The SMTP server (`CONTACT_SMTP_HOST`) is an external service (e.g., Mailgun, SendGrid, or a personal SMTP relay) — Docker does not change its reachability. The error is env var injection failure, not DNS.
|
||||
|
||||
**Verification steps before attempting the Docker fix:**
|
||||
1. `docker exec <container> printenv CONTACT_SMTP_HOST` — if empty, the env var is missing.
|
||||
2. `docker exec <container> node -e "const n = require('nodemailer'); n.createTransport({host: process.env.CONTACT_SMTP_HOST, port: 465, secure: true, auth: {user: process.env.CONTACT_EMAIL_USER, pass: process.env.CONTACT_EMAIL_PASS}}).verify(console.log)"` — tests SMTP connectivity from inside the container.
|
||||
|
||||
**How to avoid:**
|
||||
Include all SMTP variables in the `env_file:` or `environment:` block of the app service in `docker-compose.yml`. Use an `.env.production` file that is manually provisioned on the Docker host (not committed). Consider using Docker secrets (mounted files) for the SMTP password rather than environment variables if the host is shared.
|
||||
|
||||
**Warning signs:**
|
||||
- `docker exec <container> printenv CONTACT_SMTP_HOST` returns empty.
|
||||
- Signing emails silently fail with no error until first send attempt.
|
||||
|
||||
**Phase to address:** Docker deployment phase — SMTP env var verification is the first check in the deployment runbook.
|
||||
|
||||
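The `env_file:` wiring described above can be sketched as a compose fragment. The service name and file layout here are assumptions, not taken from the repo:

```yaml
# docker-compose.yml (sketch) — service name "app" is an assumption
services:
  app:
    build: .
    env_file:
      - .env.production   # provisioned manually on the Docker host, never committed
    # Alternative if the host is shared: mount the SMTP password as a Docker
    # secret (a file) instead of an env var, and read it at startup:
    # secrets:
    #   - smtp_password
```

Verify after `docker compose up` with the `printenv` check listed in the verification steps.
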
---

### Pitfall 14: Nodemailer Transporter Created With Mismatched Port and TLS Settings

**What goes wrong:**

`signing-mailer.tsx` contains:

```typescript
port: Number(process.env.CONTACT_SMTP_PORT ?? 465),
secure: Number(process.env.CONTACT_SMTP_PORT ?? 465) === 465,
```

`contact-mailer.ts` contains:

```typescript
port: Number(process.env.CONTACT_SMTP_PORT ?? 587),
secure: false, // STARTTLS on port 587
```

The two mailers use different defaults for the same env var. If `CONTACT_SMTP_PORT` is not set in the container, the signing mailer assumes port 465 (TLS), but the contact form mailer assumes port 587 (STARTTLS). If the SMTP provider only supports one of these, one mailer will connect and the other will time out. The mismatch is invisible until both code paths are exercised in production.

**How to avoid:**

Require `CONTACT_SMTP_PORT` explicitly — remove the fallback defaults and add a startup validation check that throws if this variable is missing. Use a single `createSmtpTransporter()` utility function shared by both mailers, not two separate inline `createTransport()` calls with different defaults. Document the required env var values in a `DEPLOYMENT.md` or the `docker-compose.yml` comments.

**Warning signs:**

- Two separate inline `createTransport()` calls with different `port` defaults for the same env var.
- Only one of the two email paths (signing email vs. contact form) is tested in Docker.

**Phase to address:** Docker deployment phase — consolidate SMTP transporter creation before the first production email test.

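A minimal sketch of the shared configuration utility suggested above. `resolveSmtpConfig` is a hypothetical helper name; the env var names are the real ones used by both mailers. It throws on a missing `CONTACT_SMTP_PORT` instead of falling back to divergent defaults:

```typescript
// Sketch: one SMTP config resolver shared by both mailers, so port/TLS
// defaults cannot diverge. resolveSmtpConfig is a hypothetical name.
interface SmtpConfig {
  host: string;
  port: number;
  secure: boolean; // true = implicit TLS (465); false = STARTTLS (587/25)
  auth: { user: string; pass: string };
}

function resolveSmtpConfig(env: Record<string, string | undefined>): SmtpConfig {
  const required = [
    "CONTACT_SMTP_HOST",
    "CONTACT_SMTP_PORT",
    "CONTACT_EMAIL_USER",
    "CONTACT_EMAIL_PASS",
  ];
  for (const name of required) {
    if (!env[name]) throw new Error(`Missing required SMTP env var: ${name}`);
  }
  const port = Number(env.CONTACT_SMTP_PORT);
  if (!Number.isInteger(port)) {
    throw new Error(`CONTACT_SMTP_PORT is not a number: ${env.CONTACT_SMTP_PORT}`);
  }
  return {
    host: env.CONTACT_SMTP_HOST!,
    port,
    secure: port === 465, // Nodemailer convention: 465 implicit TLS, otherwise STARTTLS
    auth: { user: env.CONTACT_EMAIL_USER!, pass: env.CONTACT_EMAIL_PASS! },
  };
}
```

Both mailers would then call `nodemailer.createTransport(resolveSmtpConfig(process.env))`, making a port/TLS mismatch impossible by construction, and a missing variable fails loudly at the first call rather than producing a silent timeout.
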
---

### Pitfall 15: Multi-Signer Email Loop Fails Halfway — No Partial-Send Recovery

**What goes wrong:**

When sending to three signers, the send route loops: create token 1, email Signer 1, create token 2, email Signer 2, create token 3, email Signer 3. If the email to Signer 2 fails (SMTP timeout, invalid address) and the loop aborts, tokens 1 and 2 exist in the database but Signer 3 never receives an email; if the error is swallowed per-iteration, Signer 2 holds a token they were never sent. Either way the document is in an inconsistent state: tokens exist for recipients who were never emailed. Signer 1 signs, completion detection still counts unclaimed tokens for the signers who never received their links, and the document never reaches "Signed."

**How to avoid:**

Create all tokens before sending any emails. Wrap token creation in a transaction — if any token INSERT fails, roll back all tokens and return an error before any emails are sent. Send emails outside the transaction (SMTP is not transactional). If an email send fails, mark that token as `superseded` (add a `supersededAt` column to `signingTokens`) rather than deleting it, and surface the partial-send failure to the agent with a "resend to failed recipients" option. Never leave unclaimed tokens orphaned by partial email failure.

**Warning signs:**

- The send loop interleaves token creation and email sending (create token 1, send email 1, create token 2, send email 2...) rather than creating all tokens atomically first.

**Phase to address:** Multi-signer send phase — design the send loop with transactional token creation from the start.

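The create-all-tokens-then-send pattern can be sketched with stand-in interfaces. In the real code `createAll` would be a single Drizzle transaction; every name here is hypothetical:

```typescript
// Sketch: tokens are created atomically first (phase 1), then emails go out
// (phase 2). A failed send supersedes its token instead of orphaning it.
interface TokenStore {
  // Creates one token row per signer in one transaction; throws (rolling back
  // all inserts) if any row fails. Returns the tokens in signer order.
  createAll(docId: string, signerEmails: string[]): Promise<string[]>;
  markSuperseded(docId: string, signerEmail: string): Promise<void>;
}
interface Mailer {
  send(signerEmail: string, token: string): Promise<void>;
}

async function sendToSigners(
  store: TokenStore,
  mailer: Mailer,
  docId: string,
  signerEmails: string[],
): Promise<{ sent: string[]; failed: string[] }> {
  // Phase 1: all tokens exist before any email is attempted.
  const tokens = await store.createAll(docId, signerEmails);

  // Phase 2: emails outside the transaction; per-recipient failures are
  // recorded and surfaced for a targeted "resend to failed recipients".
  const sent: string[] = [];
  const failed: string[] = [];
  for (let i = 0; i < signerEmails.length; i++) {
    try {
      await mailer.send(signerEmails[i], tokens[i]);
      sent.push(signerEmails[i]);
    } catch {
      await store.markSuperseded(docId, signerEmails[i]);
      failed.push(signerEmails[i]);
    }
  }
  return { sent, failed };
}
```

A non-empty `failed` list is what the agent UI would render as the partial-send warning; no token is ever left both unclaimed and unsent.
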
---

## PDF Assembly Pitfalls

### Pitfall 16: Final PDF Assembly Runs Multiple Times — Duplicate Signed PDFs

**What goes wrong:**

Completion detection triggers PDF assembly (merging all signer contributions into one final PDF). If the race condition guard (Pitfall 2) is not in place, assembly runs twice. Even with the guard, if the assembly function crashes partway through and the `completionTriggeredAt` was already set, there is no way to retry assembly — the guard prevents re-entry and the document is stuck with no signed PDF.

**How to avoid:**

Separate the "completion triggered" flag from the "signed PDF ready" flag. Add both `completionTriggeredAt TIMESTAMP` (prevents double-triggering) and `signedFilePath TEXT` (set only when PDF is successfully written). If `completionTriggeredAt` is set but `signedFilePath` is null after 60 seconds, an admin retry endpoint can reset `completionTriggeredAt` to null to allow re-triggering. The existing atomic rename pattern (`tmp → final`) in `embed-signature.ts` already prevents partial PDF corruption — preserve this in the multi-signer assembly code.

**Warning signs:**

- Only a single flag (`completionTriggeredAt`) is used to track both triggering and completion.
- No retry mechanism exists for a stuck assembly.

**Phase to address:** Multi-signer completion phase — implement idempotent assembly with separate trigger and completion flags.

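The two-flag design can be sketched with an in-memory compare-and-set standing in for the conditional SQL `UPDATE documents SET completion_triggered_at = NOW() WHERE completion_triggered_at IS NULL`. Function names are hypothetical:

```typescript
// Sketch of the separate trigger/completion flags. The in-memory row and
// tryClaimCompletion stand in for a single conditional UPDATE in Postgres.
interface DocRow {
  completionTriggeredAt: Date | null; // set once: prevents double-triggering
  signedFilePath: string | null;      // set only after the final PDF is written
}

function tryClaimCompletion(row: DocRow): boolean {
  // Mirrors the conditional UPDATE: only the first caller wins the claim.
  if (row.completionTriggeredAt !== null) return false;
  row.completionTriggeredAt = new Date();
  return true;
}

function resetStuckAssembly(row: DocRow, timeoutMs: number, now: Date): boolean {
  // Admin retry path: triggered, but no signed PDF after the timeout.
  if (row.completionTriggeredAt === null || row.signedFilePath !== null) return false;
  if (now.getTime() - row.completionTriggeredAt.getTime() < timeoutMs) return false;
  row.completionTriggeredAt = null; // allow a fresh claim
  return true;
}
```

Assembly runs only inside a successful `tryClaimCompletion`; `signedFilePath` is set after the atomic rename, and the reset path unsticks a crashed assembly without ever allowing two concurrent ones.
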
---

### Pitfall 17: Multi-Signer Final PDF — Which Prepared PDF Is the Base?

**What goes wrong:**

In the current single-signer flow, `embedSignatureInPdf` reads from `doc.preparedFilePath` (the agent-prepared PDF with text fills and agent signatures already embedded) and writes to `_signed.pdf`. With multiple signers, each signer's signature needs to be embedded sequentially onto the same prepared PDF base. If two handlers run concurrently and both read from `preparedFilePath`, modify it in memory, and write independent output PDFs, the final "merge" step needs a different strategy — you cannot simply append two separately-signed PDFs into one document without losing the shared base.

**How to avoid:**

The correct architecture for multi-signer PDF assembly:

1. Each signer's POST handler embeds only that signer's signatures into an intermediate file: `{docId}_partial_{signerEmail_hash}.pdf`. This intermediate file is written atomically (tmp → rename). It is NOT the final document.
2. When completion is triggered (all tokens claimed), a single assembly function reads the prepared PDF once, iterates all signers' signature data (from DB or intermediate files), embeds all signatures in one pass, and writes `{docId}_signed.pdf`.
3. The `pdfHash` is computed only from the final assembled PDF, not from any intermediate.

This avoids the read-modify-write race entirely. Intermediate files are cleaned up after successful final assembly.

**Warning signs:**

- Each signer's POST handler directly writes to `_signed.pdf` rather than an intermediate file.
- The final assembly step reads from two separately-signed PDF files and tries to merge them.

**Phase to address:** Multi-signer completion phase — establish the intermediate file pattern before any signing submission code is written.

---

### Pitfall 18: Temp File Accumulation on Failed Assemblies

**What goes wrong:**

The current code already creates a temp file during date stamping (`preparedAbsPath.datestamped.tmp`) and cleans it up with `unlink().catch(() => {})`. Multi-signer assembly will create intermediate partial files. If the assembly handler crashes between writing intermediates and producing the final PDF, those temp files are never cleaned up. Over time, the `uploads/` directory fills with orphaned intermediate files. On the home Docker server with limited disk, this causes write failures on new documents.

**How to avoid:**

Name all intermediate and temp files with a recognizable pattern (`*.tmp`, `*_partial_*.pdf`). Add a periodic cleanup job (a Next.js route called by a cron or a simple setInterval in a route handler) that deletes `*.tmp` and `*_partial_*.pdf` files older than 24 hours. Log a warning when cleanup finds orphaned files — this surfaces incomplete assemblies that need investigation.

**Warning signs:**

- The `uploads/` directory grows unbounded over time.
- Partial files from failed assemblies remain after a document is marked Signed.

**Phase to address:** Multi-signer completion phase — add cleanup alongside the assembly logic.

---

## Security Pitfalls

### Pitfall 19: Multiple Tokens Per Document — Token Enumeration Attack

**What goes wrong:**

In the single-signer system, one token is issued per document. An attacker who intercepts or guesses a token can sign one document. With multi-signer, multiple tokens are issued for the same document. If token generation uses a predictable pattern (e.g., sequential IDs, short UUIDs, or low-entropy random values), an attacker who holds one valid token for a document can enumerate sibling tokens for the same document by brute-forcing nearby values.

**Current state:** `createSigningToken` uses `crypto.randomUUID()` for the JTI and `SignJWT` with HS256. UUID v4 provides 122 bits of randomness — sufficient. The risk is theoretical given current implementation but becomes concrete if the JTI generation is ever changed.

**How to avoid:**

Keep using `crypto.randomUUID()` for JTI. Do not add any sequential or human-readable component to the JTI. Ensure the JWT is verified before the JTI is looked up in the database — `verifySigningToken()` already does this (JWT signature check first, then DB lookup). Add rate limiting on the signing GET and POST endpoints: `MAX 10 requests per IP per minute` prevents brute force. Log and alert on `status: 'invalid'` responses that repeat from the same IP.

**Warning signs:**

- JTI generation switches from `crypto.randomUUID()` to a sequential or short-UUID pattern.
- No rate limiting exists on `/api/sign/[token]` GET or POST.

**Phase to address:** Multi-signer send phase — add rate limiting before issuing multiple tokens per document.

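A minimal in-memory sketch of the per-IP limit, adequate for a single-container deployment (the names and the fixed-window strategy are assumptions; a reverse proxy or middleware package could do this instead):

```typescript
// Sketch: fixed-window rate limiter for the signing GET/POST handlers.
// 10 requests per IP per minute, matching the limit stated above.
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 10;
const hits = new Map<string, { windowStart: number; count: number }>();

function allowRequest(ip: string, now: number = Date.now()): boolean {
  const entry = hits.get(ip);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    // First request in a fresh window for this IP.
    hits.set(ip, { windowStart: now, count: 1 });
    return true;
  }
  entry.count++;
  return entry.count <= MAX_REQUESTS; // the 11th request in a window is rejected
}
```

The route handler would call `allowRequest(ip)` before token verification and return 429 on `false`; repeated rejections from one IP are exactly the signal the text says to log and alert on.
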
---

### Pitfall 20: Token Shared Between Signers — Signer A Uses Signer B's Token

**What goes wrong:**

With multi-signer, the system issues separate tokens per signer email. But the signing GET handler at line 90 currently returns ALL client-visible fields (filtered by `isClientVisibleField`), not fields tagged to the specific signer. If Signer A somehow obtains Signer B's token (e.g., email forward, shared email account, phishing), Signer A sees Signer B's fields and can sign them. In real estate, this is equivalent to signing another party's name on a contract — a serious legal issue.

The signing POST handler (lines 210-213) filters `signableFields` to all `client-signature` and `initials` fields for the entire document — it does not restrict by signer. A cross-token submission would succeed server-side.

**How to avoid:**

After multi-signer is implemented, the signing GET handler must filter `signatureFields` by `field.signerEmail === tokenRow.signerEmail`. The signing POST handler must verify that the field IDs in the `signatures` request body correspond only to fields tagged to `tokenRow.signerEmail` — reject any submission that includes field IDs not assigned to that signer. This is a server-side enforcement, not a UI concern.

**Warning signs:**

- The signing GET handler's `signatureFields` filter does not include a `signerEmail` check.
- The signing POST handler's `signableFields` filter does not restrict by `signerEmail`.

**Phase to address:** Multi-signer signing flow phase — add signer-field binding validation to both GET and POST handlers.

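The GET and POST checks can be sketched as pure functions, assuming `SignatureFieldData` gains a `signerEmail` tag in v1.2 (it has no such field today; all names below are illustrative):

```typescript
// Sketch of server-side signer-to-field binding checks.
interface SignerField {
  id: string;
  signerEmail: string; // assumed new v1.2 tag on SignatureFieldData
  type: string;
}

// GET handler: return only the token holder's fields.
function fieldsForSigner(fields: SignerField[], tokenSignerEmail: string): SignerField[] {
  return fields.filter((f) => f.signerEmail === tokenSignerEmail);
}

// POST handler: reject any submission containing a field not assigned
// to the signer the token was issued to.
function validateSubmission(
  fields: SignerField[],
  tokenSignerEmail: string,
  submittedFieldIds: string[],
): { ok: boolean; rejectedIds: string[] } {
  const allowed = new Set(fieldsForSigner(fields, tokenSignerEmail).map((f) => f.id));
  const rejectedIds = submittedFieldIds.filter((id) => !allowed.has(id));
  return { ok: rejectedIds.length === 0, rejectedIds };
}
```

Rejecting on any unassigned field ID (rather than silently dropping it) makes a cross-token submission a hard 403 with an auditable reason.
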
---

### Pitfall 21: Completion Notification Email Sent to Wrong Recipients

**What goes wrong:**

The current `sendAgentNotificationEmail` sends to `process.env.AGENT_EMAIL`. In multi-signer, the requirement is to send the final merged PDF to all signers AND the agent when completion occurs. If the recipient list is derived from `documents.emailAddresses` (the JSONB array collected at prepare time), and that array is stale (e.g., the agent changed a signer's email between prepare and send), the final PDF goes to the old address.

A worse variant: if `emailAddresses` contains CC addresses that are NOT signers (e.g., a title company contact), those recipients receive the completed PDF immediately — before the agent has reviewed it. For a solo agent workflow, this is likely acceptable, but it should be explicit.

**How to avoid:**

Derive the final recipient list from `signingTokens.signerEmail` (the authoritative record of who was actually sent a token), not from `documents.emailAddresses`. Separate "recipients who receive the signing link" from "recipients who receive the completed PDF" explicitly in the data model. The agent should review the final recipient list at send time.

**Warning signs:**

- The completion handler derives email recipients from `documents.emailAddresses` rather than `signingTokens.signerEmail`.

**Phase to address:** Multi-signer send phase — establish the recipient derivation rule before tokens are issued.

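The derivation rule can be sketched as a small pure function (the row shape and function name are hypothetical): the completed PDF goes to the emails that were actually issued tokens, plus the agent, never to the prepare-time `emailAddresses` array.

```typescript
// Sketch: completion recipients derive from signingTokens.signerEmail.
interface TokenRow {
  signerEmail: string;
}

function completionRecipients(tokens: TokenRow[], agentEmail: string): string[] {
  // Case-insensitive dedupe: the same signer may appear in multiple token rows
  // (e.g., after a supersede-and-reissue).
  const unique = new Set(tokens.map((t) => t.signerEmail.toLowerCase()));
  unique.add(agentEmail.toLowerCase());
  return [...unique];
}
```

Any CC-style recipients would be an explicit, separate list rather than a side effect of what was collected at prepare time.
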
---

### Pitfall 22: Signing Token Issued But Document Re-Prepared — Token Points to Stale PDF

**What goes wrong:**

v1.1 introduced a guard: Draft-only documents can be AI-prepared (`ai-prepare/route.ts` line 37: `if (doc.status !== 'Draft') return 403`). But `prepare/route.ts` (which calls `preparePdf` and writes `_prepared.pdf`) has no equivalent guard — a Sent document can be re-prepared if the agent POSTs to `/api/documents/{id}/prepare` directly. With multi-signer, if any token has been issued (even if no signer has used it), re-preparing the document overwrites `_prepared.pdf` and changes `preparedFilePath`. Signers who have already received their token will open the signing page and load the new prepared PDF — which may have different text fills, field positions, or the agent's new signature — not what was legally sent to them.

**How to avoid:**

Add a guard to `prepare/route.ts`: if `signingTokens` has any row for this document with `usedAt IS NULL` (any token still outstanding), reject the prepare request with `409 Conflict: "Cannot re-prepare a document with outstanding signing tokens."` If the agent genuinely needs to change the document, they must first void all outstanding tokens (supersede them) and issue new ones.

**Warning signs:**

- `prepare/route.ts` has no check against the `signingTokens` table before writing `_prepared.pdf`.

**Phase to address:** Multi-signer send phase — add the outstanding-token guard to the prepare route before multi-signer send is implemented.

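The guard can be sketched as follows. `countOutstandingTokens` is a stand-in for the real `signingTokens` query (roughly `SELECT count(*) FROM signing_tokens WHERE document_id = $1 AND used_at IS NULL`); the function name and shapes are hypothetical:

```typescript
// Sketch: outstanding-token guard for prepare/route.ts. Returns the HTTP
// status the route should respond with before calling preparePdf.
async function guardReprepare(
  countOutstandingTokens: (docId: string) => Promise<number>,
  docId: string,
): Promise<{ status: number; message?: string }> {
  const outstanding = await countOutstandingTokens(docId);
  if (outstanding > 0) {
    return {
      status: 409,
      message: "Cannot re-prepare a document with outstanding signing tokens.",
    };
  }
  return { status: 200 }; // safe to proceed with preparePdf
}
```

Running the count and the subsequent prepare inside one transaction (or rechecking after the write lock) closes the small race where a token is issued between the check and the overwrite.
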
---

### Pitfall 23: @vercel/blob Is Installed But Not Used — Risk of Accidental Use

**What goes wrong:**

`package.json` lists `@vercel/blob` as a production dependency. No file in the codebase imports or uses it. The package provides a Vercel-hosted blob storage client that requires `BLOB_READ_WRITE_TOKEN` to be set in the environment. If any future code accidentally imports from `@vercel/blob` instead of using the local filesystem path utilities, it will fail at runtime in Docker because no `BLOB_READ_WRITE_TOKEN` exists in a non-Vercel environment, and even with a token configured it would route file storage through Vercel's infrastructure rather than the local volume, breaking signed PDF storage entirely.

**Why it happens:**

`@vercel/blob` may have been installed during initial scaffolding when Vercel deployment was considered. It was never wired up. Its presence in `package.json` is a footgun.

**How to avoid:**

Remove `@vercel/blob` from `package.json` and run `npm install` before building the Docker image. If Vercel deployment is ever considered in the future, re-add it intentionally with a clear decision to migrate storage. Until then, its presence is a liability.

**Warning signs:**

- `@vercel/blob` appears in `package.json` dependencies but `grep -r "@vercel/blob"` finds no usage in `src/`.
- Any new code imports from `@vercel/blob` without an explicit architectural decision to use it.

**Phase to address:** Docker deployment phase — remove the unused dependency before building the production image.

---

## Prevention Checklist

Grouped by phase for the roadmap planner.

### Multi-Signer Schema Phase

- [ ] Add `signerEmail TEXT NOT NULL` to `signingTokens` (with backfill migration for v1.1 rows)
- [ ] Add `viewedAt TIMESTAMP` to `signingTokens`
- [ ] Add `completionTriggeredAt TIMESTAMP` to `documents`
- [ ] Add `Partially Signed` to `documentStatusEnum` or compute from token states
- [ ] Freeze `signatureFields` JSONB after tokens are issued (document invariant, enforced in prepare route)
- [ ] Document the invariant: `signingTokens.signerEmail` is the source of truth for recipient list

### Multi-Signer Send Phase

- [ ] Wrap all token creation in a single DB transaction; send emails after commit
- [ ] Add outstanding-token guard to `prepare/route.ts` (409 if any unclaimed token exists)
- [ ] Derive final PDF recipient list from `signingTokens.signerEmail`, not `emailAddresses`
- [ ] Add rate limiting to signing GET and POST endpoints

### Multi-Signer Signing Flow Phase

- [ ] Filter `signatureFields` by `field.signerEmail === tokenRow.signerEmail` in signing GET
- [ ] Validate submitted field IDs against signer's assigned fields in signing POST
- [ ] Include `signerEmail` in `signature_submitted` audit event metadata
- [ ] Completion detection: count unclaimed tokens in same transaction as token claim

### Multi-Signer Completion Phase

- [ ] Race condition guard: `UPDATE documents SET completion_triggered_at = NOW() WHERE completion_triggered_at IS NULL`
- [ ] Assemble final PDF in one pass from prepared PDF base (not by merging two separately-signed files)
- [ ] Set `signedFilePath` only after successful atomic rename of final assembled PDF
- [ ] Compute `pdfHash` only from final assembled PDF
- [ ] Clean up intermediate `_partial_*.pdf` files after successful assembly
- [ ] Add periodic orphaned-temp-file cleanup

### Docker Deployment Phase

- [ ] Rename `NEXT_PUBLIC_BASE_URL` → `SIGNING_BASE_URL` (server-only var, no NEXT_PUBLIC_ prefix)
- [ ] Audit all remaining `NEXT_PUBLIC_*` usages — confirm each one genuinely needs browser access
- [ ] Mount named Docker volume at `process.cwd() + '/uploads'` (verify path inside container first)
- [ ] Create `.env.production` on Docker host with all required secrets; reference in `docker-compose.yml`
- [ ] Add `CONTACT_SMTP_PORT` as required env var; remove fallback defaults from both mailers
- [ ] Consolidate SMTP transporter into a shared `createSmtpTransporter()` utility
- [ ] Add PostgreSQL `healthcheck` + app `depends_on: condition: service_healthy`
- [ ] Add Drizzle migration to container startup (before `next start`)
- [ ] Add `/api/health` endpoint that runs `SELECT 1` + checks `DATABASE_URL` + checks `CONTACT_SMTP_HOST`
- [ ] Verify SMTP connectivity from inside container before first production deploy
- [ ] Configure `postgres(url, { max: 5, idle_timeout: 20, connect_timeout: 10 })` for Neon free tier
- [ ] Build Docker image with `--platform linux/amd64` when deploying to x86_64 Linux
- [ ] Use `node:20-slim` (Debian glibc) as base image — not `node:alpine` (musl)
- [ ] Verify `@napi-rs/canvas` loads in container: `node -e "require('@napi-rs/canvas')"`
- [ ] Remove `@vercel/blob` from `package.json` dependencies

### Verification (Do Not Skip)

- [ ] Test a two-signer document where both signers submit within 1 second of each other — confirm one PDF, one notification, one `signedAt`
- [ ] Restart the Docker container and confirm all previously-uploaded PDFs are still accessible
- [ ] Confirm clicking a signing link emailed from Docker opens the correct production URL (not localhost)
- [ ] Confirm `docker exec <container> printenv CONTACT_SMTP_HOST` returns the expected value
- [ ] Test a v1.1 (single-signer) document after migration — confirm existing tokens still work
- [ ] Confirm Neon connection count stays below 7 during normal usage (check Neon dashboard)
- [ ] Confirm canvas module loads: `docker exec <container> node -e "require('@napi-rs/canvas'); console.log('OK')"`

---

## Phase-Specific Warning Summary

| Phase Topic | Likely Pitfall | Mitigation |
|-------------|---------------|------------|
| signingTokens schema change | NOT NULL constraint breaks existing token rows | Backfill migration with client email JOIN |
| Multi-signer send loop | Partial email failure orphans tokens | Transactional token creation, separate from email sends |
| Completion detection | First signer marks document Signed | Count unclaimed tokens inside transaction before marking |
| Concurrent completion | Two handlers both run final assembly | `completionTriggeredAt` one-time-set guard |
| Docker build | NEXT_PUBLIC_BASE_URL baked into bundle | Remove NEXT_PUBLIC_ prefix for server-only URL |
| Docker volumes | Uploads lost on container recreate | Named volume mounted at uploads path |
| Docker secrets | SMTP env vars absent in container | env_file in compose, verify with printenv |
| PostgreSQL startup | App queries before DB is ready | service_healthy depends_on + pg_isready healthcheck |
| Neon connection pool | Default 10 connections saturates free tier | Set max: 5 with idle_timeout and connect_timeout |
| Native module in Docker | @napi-rs/canvas wrong platform binary | --platform linux/amd64 + node:20-slim base image |
| Unused dependency | @vercel/blob accidentally used in new code | Remove from package.json before Docker build |
| Final PDF assembly | Signer PDFs assembled by merging two separate files | Single-pass assembly from prepared PDF base |
| Signer identity in audit | Two signature_submitted events indistinguishable | signerEmail in audit event metadata |

---

## Sources

- Reviewed `src/lib/db/schema.ts` — `SignatureFieldData` has no `type` field; confirmed by inspection 2026-03-21
- Reviewed `src/app/sign/[token]/_components/SigningPageClient.tsx` — confirmed all fields open signature modal; no type branching
- Reviewed `src/app/portal/(protected)/documents/[docId]/_components/FieldPlacer.tsx` — confirmed single "Signature" token; `screenToPdfCoords` function confirms Y-axis inversion pattern
- Reviewed `src/lib/signing/embed-signature.ts` — confirms `@cantoo/pdf-lib` import; PNG-only embed
- Reviewed `src/lib/pdf/prepare-document.ts` — confirms AcroForm flatten-first ordering; text stamp fallback
- Reviewed `src/app/api/sign/[token]/route.ts` — confirmed `signatureFields: doc.signatureFields ?? []` sends unfiltered fields to client (line 88)
- Reviewed `src/app/portal/(protected)/documents/[docId]/_components/PreparePanel.tsx` — no guard against re-preparation of Sent/Viewed documents
- [OpenAI Vision API Token Counting](https://platform.openai.com/docs/guides/vision#calculating-costs) — image token costs confirmed; LOW tile = 85 tokens, HIGH tile adds detail tokens per 512px tile
- [OpenAI Structured Output (JSON Schema mode)](https://platform.openai.com/docs/guides/structured-outputs) — `json_schema` mode confirmed as more reliable than `json_object` for typed responses
- [Vercel Serverless Function Limits](https://vercel.com/docs/functions/runtimes/node-js#memory-and-compute) — 256MB default, 1024MB on Pro; 60s max execution on Pro
- `@cantoo/pdf-lib` confirmed as the import used (not `@pdfme/pdf-lib` or `pdf-lib`) — v1.0 codebase uses this fork throughout
- Reviewed `src/lib/db/schema.ts` — confirmed `signingTokens` has no `signerEmail`; `documentStatusEnum` has no partial state; `SignatureFieldData` has no signer tag; 2026-04-03
- Reviewed `src/app/api/sign/[token]/route.ts` — confirmed completion marks document Signed unconditionally at line 254; confirmed `isClientVisibleField` filter at line 90; confirmed `signableFields` filter does not restrict by signer at lines 210-213
- Reviewed `src/app/api/documents/[id]/send/route.ts` — confirmed single token creation, single recipient
- Reviewed `src/app/api/documents/[id]/prepare/route.ts` — confirmed no guard against re-preparation of Sent documents
- Reviewed `src/lib/signing/signing-mailer.tsx` — confirmed `createTransporter()` per send (healthy), confirmed `CONTACT_SMTP_PORT` defaults differ from `contact-mailer.ts`
- Reviewed `src/lib/signing/token.ts` — confirmed `crypto.randomUUID()` JTI generation (sufficient entropy)
- Reviewed `src/lib/signing/embed-signature.ts` — confirmed atomic rename pattern (`tmp → final`)
- Reviewed `src/lib/db/index.ts` — confirmed `postgres(url)` with no `max` parameter; Proxy singleton pattern; lazy initialization
- Reviewed `next.config.ts` — confirmed `serverExternalPackages: ['@napi-rs/canvas']`
- Reviewed `package.json` — confirmed `@vercel/blob` present in dependencies; confirmed `postgres` npm package in use; confirmed `node:` not specified in package engines
- [Next.js Environment Variables — Build-time vs Runtime](https://nextjs.org/docs/app/building-your-application/configuring/environment-variables) — NEXT_PUBLIC_ vars inlined at build time; confirmed in Next.js 15 docs
- [Docker Compose healthcheck + depends_on](https://docs.docker.com/compose/how-tos/startup-order/) — `service_healthy` condition requires explicit healthcheck definition
- [Nodemailer: SMTP port and TLS](https://nodemailer.com/smtp/) — port 465 = implicit TLS (`secure: true`), port 587 = STARTTLS (`secure: false`); mismatch causes connection timeout
- [postgres npm package documentation](https://github.com/porsager/postgres) — default `max: 10` connections per client instance; `idle_timeout` and `connect_timeout` options
- [Neon connection limits](https://neon.tech/docs/introduction/plans) — free tier: 10 concurrent connections; paid tiers increase this
- [@napi-rs/canvas supported platforms](https://github.com/Brooooooklyn/canvas#support-matrix) — no musl (Alpine) builds published; requires glibc (Debian/Ubuntu) base image

---

*Pitfalls research for: Teressa Copeland Homes — v1.2 multi-signer and Docker deployment*

*Researched: 2026-04-03*

*Previous v1.1 pitfalls (AI field placement, expanded field types, agent signing, filled preview) documented in git history — superseded by this file for v1.2 planning. The v1.1 pitfalls are assumed addressed; recovery strategies from that document remain valid if regressions occur.*