# Phase 4: PDF Ingest - Context

Gathered: 2026-03-19
Status: Ready for planning
## Phase Boundary

Agent can pull PDF forms from a SkySlope/URE forms library (scraped or API), assign them to clients, view them rendered in the browser, and store them safely on the local filesystem/Docker volume. Filling fields, signatures, and sending are separate phases.
Note: The original roadmap said "manual upload — no utahrealestate.com scraping" but the user wants SkySlope/URE integration as the primary template source. The researcher should investigate whether SkySlope has a public API or if scraping is required.
## Implementation Decisions

Forms library source
- Primary source: SkySlope / URE Legacy Forms Library from utahrealestate.com (MLS Forms + URE Legacy Forms Library)
- Research needed: Does SkySlope have a public API? If not, scraping is the route
- Forms library should sync at least monthly (automated or manual re-sync path)
- File picker upload is a backup for custom/non-standard forms not in the library
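
The "at least monthly" sync rule could be checked with a small helper like the one below. This is only a sketch: `isSyncDue`, `SYNC_INTERVAL_DAYS`, and the 30-day reading of "monthly" are assumptions for illustration, and how the last sync time is stored is left open.

```typescript
// Hypothetical helper: decides whether the forms library needs a re-sync.
// The 30-day interval is an assumption standing in for "at least monthly".
const SYNC_INTERVAL_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function isSyncDue(lastSyncedAt: Date | null, now: Date): boolean {
  // A library that has never been synced is always due.
  if (lastSyncedAt === null) {
    return true;
  }
  const elapsedDays = (now.getTime() - lastSyncedAt.getTime()) / MS_PER_DAY;
  return elapsedDays >= SYNC_INTERVAL_DAYS;
}
```

The same check can back either an automated job or a "re-sync now" prompt in the UI, since it does not care how the sync itself is triggered.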
Upload entry point
- Agent uploads from the client's profile page via an "Add Document" button
- Clicking "Add Document" opens a modal showing the forms library list with search
- Search filters the list by name as the agent types
- File picker option is also in the modal for custom PDFs (rare case)
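
The search-as-you-type behavior in the modal could be as simple as a case-insensitive substring match on the form name. A minimal sketch, assuming a hypothetical `FormTemplate` shape and `filterForms` helper (neither is defined by this phase yet):

```typescript
// Hypothetical shape of one entry in the forms library list.
interface FormTemplate {
  id: string;
  name: string;
}

// Returns the forms whose name contains the query, ignoring case.
// An empty or whitespace-only query returns the full list unchanged.
function filterForms(forms: FormTemplate[], query: string): FormTemplate[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") {
    return forms;
  }
  return forms.filter((form) => form.name.toLowerCase().indexOf(needle) !== -1);
}
```

Filtering on the client side keeps the modal responsive as long as the synced library fits comfortably in memory; a server-side search would only be needed if the list turns out to be very large.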
Document naming and metadata
- Modal pre-fills document name from the template name (e.g., "Purchase Agreement")
- Agent can edit the name before submitting (e.g., "123 Main St Purchase Agreement")
- For custom file picker uploads: pre-fill name from filename, agent edits
- No extra metadata fields — name only. Status auto-sets to Draft on creation.
- Multiple instances of the same template allowed per client (agent renames to distinguish)
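
The naming rules above can be sketched as two small functions: one derives the pre-filled name for a custom upload, the other builds the new record with the agent's edit applied. The names `nameFromFilename`, `createDocument`, and `NewDocument` are hypothetical and only illustrate the rules.

```typescript
// Hypothetical record created when a document is added to a client.
interface NewDocument {
  name: string;
  status: "Draft"; // status auto-sets to Draft on creation
}

// Pre-fill for custom uploads: the filename without a trailing ".pdf".
function nameFromFilename(filename: string): string {
  return filename.replace(/\.pdf$/i, "");
}

// `prefill` is the template name or the filename-derived name.
// If the agent typed a non-empty name, it wins over the pre-fill.
function createDocument(prefill: string, editedName?: string): NewDocument {
  const edited = editedName === undefined ? "" : editedName.trim();
  return {
    name: edited !== "" ? edited : prefill,
    status: "Draft",
  };
}
```

For example, `createDocument("Purchase Agreement", "123 Main St Purchase Agreement")` keeps the agent's edited name, while `createDocument("Purchase Agreement")` falls back to the template name.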
File storage
- When agent adds a template to a client, a copy is saved to a client-specific folder: uploads/clients/{clientId}/
- This ensures Phase 5 field mapping works per-document without affecting the template
- Soft delete: document record hidden from UI, file kept on disk (not deleted)
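
One way to picture the per-client copy and the soft-delete rule is shown below. It is a sketch only: the `{documentId}.pdf` filename, the `deletedAt` field, and all function names are assumptions, since the exact path conventions inside uploads/clients/{clientId}/ are still open.

```typescript
// Hypothetical document record; `deletedAt` marks a soft delete.
interface DocumentRecord {
  id: string;
  clientId: string;
  filePath: string;
  deletedAt: Date | null;
}

// Builds the path of a client's copy. Naming the file after the document id
// is an assumption; only the uploads/clients/{clientId}/ folder is decided.
function clientDocumentPath(clientId: string, documentId: string): string {
  const safeId = /^[A-Za-z0-9_-]+$/;
  if (!safeId.test(clientId) || !safeId.test(documentId)) {
    // Reject ids that could escape the client folder (e.g. "../").
    throw new Error("ids may only contain letters, digits, '_' and '-'");
  }
  return `uploads/clients/${clientId}/${documentId}.pdf`;
}

// Soft delete: only the record changes; the file on disk is not touched.
function softDelete(doc: DocumentRecord, now: Date): DocumentRecord {
  return { ...doc, deletedAt: now };
}

// The UI lists only records that have not been soft-deleted.
function visibleDocuments(docs: DocumentRecord[]): DocumentRecord[] {
  return docs.filter((doc) => doc.deletedAt === null);
}
```

Because each document gets its own file, two instances of the same template for one client never share a path, which is what per-document field mapping needs.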
PDF viewer (document detail page)
- Minimal chrome: PDF fills the page, document name + client name shown above
- Controls: page navigation (prev/next), zoom in/out, download button
- Back link to the client's profile page (matches Phase 3 pattern)
- Render method: Claude's discretion (researcher should evaluate PDF.js vs browser embed — PDF.js preferred if it sets up Phase 5 field overlay cleanly)
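
Whichever renderer is chosen, the prev/next and zoom controls reduce to a small piece of state. The sketch below models that state independently of PDF.js or an embed; `ViewerState`, `reduceViewer`, and the zoom limits and step are hypothetical values picked for illustration.

```typescript
// Hypothetical viewer state: 1-based page number and a zoom factor.
interface ViewerState {
  page: number;
  pageCount: number;
  zoom: number;
}

type ViewerAction = "prev" | "next" | "zoomIn" | "zoomOut";

// Assumed limits; the real values are a design choice.
const MIN_ZOOM = 0.5;
const MAX_ZOOM = 3;
const ZOOM_STEP = 0.25;

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Applies one control click and keeps page and zoom within bounds.
function reduceViewer(state: ViewerState, action: ViewerAction): ViewerState {
  switch (action) {
    case "prev":
      return { ...state, page: clamp(state.page - 1, 1, state.pageCount) };
    case "next":
      return { ...state, page: clamp(state.page + 1, 1, state.pageCount) };
    case "zoomIn":
      return { ...state, zoom: clamp(state.zoom + ZOOM_STEP, MIN_ZOOM, MAX_ZOOM) };
    case "zoomOut":
      return { ...state, zoom: clamp(state.zoom - ZOOM_STEP, MIN_ZOOM, MAX_ZOOM) };
  }
}
```

Keeping this logic separate from the renderer means the same state could later drive a Phase 5 field overlay, which would need to know the current page and zoom.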
Post-upload flow
- After successful upload: stay on client profile, new document appears in the documents list
- Progress indicator shown inside the modal while file is being saved
Claude's Discretion
- PDF rendering library choice (PDF.js vs iframe/embed)
- Exact storage path conventions within uploads/clients/{id}/
- Error handling for failed uploads or missing templates
- Forms library sync mechanism implementation details
## Specific Ideas

- "I get all my documents from utahrealestate.com under the Forms section" (specifically the SkySlope Forms dropdown, which shows MLS Forms and URE Legacy Forms Library)
- The forms list in the modal should be searchable, since SkySlope likely has many forms
- The file picker is a fallback for rare edge cases, not the primary workflow

## Deferred Ideas

- None. Discussion stayed within phase scope.

Phase: 04-pdf-ingest
Context gathered: 2026-03-19