# Phase 4: PDF Ingest - Context

**Gathered:** 2026-03-19
**Status:** Ready for planning

## Phase Boundary

Agent can pull PDF forms from a SkySlope/URE forms library (scraped or API), assign them to clients, view them rendered in the browser, and store them safely on the local filesystem/Docker volume. Filling fields, signatures, and sending are separate phases.

**Note:** The original roadmap said "manual upload — no utahrealestate.com scraping" but the user wants SkySlope/URE integration as the primary template source. The researcher should investigate whether SkySlope has a public API or if scraping is required.

## Implementation Decisions

### Forms library source

- Primary source: SkySlope / URE Legacy Forms Library from utahrealestate.com (MLS Forms + URE Legacy Forms Library)
- Research needed: Does SkySlope have a public API? If not, scraping is the route
- Forms library should sync at least monthly (automated or manual re-sync path)
- File picker upload is a backup for custom/non-standard forms not in the library

### Upload entry point

- Agent uploads from the client's profile page via an "Add Document" button
- Clicking "Add Document" opens a modal showing the forms library list with search
- Search filters the list by name as the agent types
- File picker option is also in the modal for custom PDFs (rare case)

### Document naming and metadata

- Modal pre-fills document name from the template name (e.g., "Purchase Agreement")
- Agent can edit the name before submitting (e.g., "123 Main St Purchase Agreement")
- For custom file picker uploads: pre-fill name from filename, agent edits
- No extra metadata fields: name only. Status auto-sets to Draft on creation.
- Multiple instances of the same template allowed per client (agent renames to distinguish)

### File storage

- When agent adds a template to a client, a copy is saved to a client-specific folder: `uploads/clients/{clientId}/`
- This ensures Phase 5 field mapping works per-document without affecting the template
- Soft delete: document record hidden from UI, file kept on disk (not deleted)

### PDF viewer (document detail page)

- Minimal chrome: PDF fills the page, document name + client name shown above
- Controls: page navigation (prev/next), zoom in/out, download button
- Back link to the client's profile page (matches Phase 3 pattern)
- Render method: Claude's discretion (researcher should evaluate PDF.js vs browser embed; PDF.js preferred if it sets up Phase 5 field overlay cleanly)

### Post-upload flow

- After successful upload: stay on client profile, new document appears in the documents list
- Progress indicator shown inside the modal while file is being saved

### Claude's Discretion

- PDF rendering library choice (PDF.js vs iframe/embed)
- Exact storage path conventions within `uploads/clients/{id}/`
- Error handling for failed uploads or missing templates
- Forms library sync mechanism implementation details

## Specific Ideas

- "I get all my documents from utahrealestate.com under the Forms section": specifically the SkySlope Forms dropdown, which shows MLS Forms and URE Legacy Forms Library
- The forms list in the modal should be searchable, since SkySlope likely has many forms
- The file picker is a fallback for rare edge cases, not the primary workflow

## Deferred Ideas

- None. Discussion stayed within phase scope.

---

*Phase: 04-pdf-ingest*
*Context gathered: 2026-03-19*
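
### Illustrative sketch: file storage decisions

The File storage decisions above (per-client copy, Draft status on creation, soft delete) could be sketched roughly as follows. This is a minimal Python sketch for illustration only, not the project's implementation. The `Document` record, the function names, and the `{uuid}.pdf` file naming are all hypothetical; the exact path convention inside `uploads/clients/{clientId}/` remains at Claude's discretion.

```python
import shutil
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class Document:
    """Hypothetical document record; field names are assumptions."""
    id: str
    client_id: str
    name: str
    file_path: Path
    status: str = "Draft"                   # status auto-sets to Draft on creation
    deleted_at: Optional[datetime] = None   # set on soft delete; file stays on disk


def add_document(template: Path, client_id: str, name: str, uploads_root: Path) -> Document:
    """Copy a template PDF into uploads/clients/{clientId}/ and return its record."""
    doc_id = uuid.uuid4().hex
    client_dir = uploads_root / "clients" / client_id
    client_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming: one unique file per copy, so the same template
    # can be added to a client several times without collisions.
    dest = client_dir / f"{doc_id}.pdf"
    shutil.copyfile(template, dest)  # the template itself is never modified
    return Document(id=doc_id, client_id=client_id, name=name, file_path=dest)


def soft_delete(doc: Document) -> None:
    """Mark the record as deleted; the file on disk is left untouched."""
    doc.deleted_at = datetime.now(timezone.utc)


def visible(docs: list[Document]) -> list[Document]:
    """Return only the records the UI should show."""
    return [d for d in docs if d.deleted_at is None]
```

The sketch assumes `client_id` is a trusted, server-generated identifier; a real implementation would validate it before building a filesystem path.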