- Encode filePath with encodeURIComponent to handle spaces/special chars
- Add page-level error handling in PdfViewerPage for better diagnostics
- Show PDF URL in error message to aid debugging
pdfjs-dist 5.6.205 was installed separately from react-pdf's bundled 5.4.296.
Change workerSrc from local pdfjs-dist import to unpkg CDN with dynamic version.
Uses pdfjs.version from react-pdf to ensure API/worker version match.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
react-resizable-panels v4 interprets numeric minSize/maxSize as PIXELS, not percentages.
Change minSize={15} → minSize='15%' and maxSize={60} → maxSize='60%'.
Add defaultLayout on Group for explicit 30/70 initial split.
Add min-h-0 to grid container and shrink-0 to separator.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Required for react-pdf v10 in test environment.
Refactor polyfills to use named variables to avoid TS class name conflicts.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
ResponsePanel now calls processCitations() on answer text before rendering.
Custom ReactMarkdown 'a' component adds target="_blank" to all citation links.
Adds tests for citation link rendering and unmatched citation fallback.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace numeric [1] labels with [filename, page N] format in context chunks.
Update LLM prompt to instruct inline citation using bracket labels.
Enables traceable source references in generated answers.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
ResponsePanel and ChunkList View PDF links now open /pdf-viewer page instead of raw PDF download.
Update ChunkList test mock from getChunkPdfUrl to getPdfViewerUrl.
Add DOMMatrix polyfill to test setup for react-pdf/jsdom compatibility.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace fixed CSS Grid with react-resizable-panels v4 (Group/Panel/Separator).
Upper panel (video + query) defaults to 30%, lower panel (response) to 70%.
Draggable divider with hover/active state via data-separator attributes.
Add ResizeObserver and DOMRect polyfills to test setup for jsdom compatibility.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace KeywordsDisplay test with ExtractedQuestionsDisplay test. Update e2e mock data for extracted_questions. Fix QueryInput to show submitted question inline with submit button.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Delete KeywordsDisplay (blue pills) and create ExtractedQuestionsDisplay (numbered list). Rename keywords to extracted_questions in types and LTTPage.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Update prompt assertion in decomposer test and field assertions in query endpoint tests to match extracted_questions rename.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Change QueryDecomposer prompt to generate 2-5 sub-questions instead of keywords. Rename API field from keywords to extracted_questions across models, service, and router.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Mark sub-phases 2.1 (Remove Upload) and 2.2 (Question Display) as done.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Show submitted question as italic text below the input area after clicking submit. Clears when user starts typing a new question.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add page_number and chunk_file_path to SourceMetadata model and query router
- Add GET /chunks/{file_path}/pdf endpoint with path traversal protection
- Add View PDF links in ResponsePanel source cards and ChunkList component
- Update TypeScript types and API helper for chunk PDF URLs
- Add backend tests (5) and frontend ChunkList tests (7)
- Update enhancement plan: all 3 features complete
Delete document endpoint now removes associated chunk PDF files from document_chunk/ before ChromaDB deletion. Delete chunk endpoint removes individual chunk PDF. Missing files logged as warnings, not errors.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
PDF uploads now use parse_pdf_by_page() -> chunk_pages() -> extract page PDFs -> enhanced metadata with page_number, chunk_file_path, and document_id. Same-filename replacement deletes old chunks and PDFs before re-ingest. DOCX/TXT keep original flat flow with document_id added. RAGService.ingest_document() accepts optional document_id parameter.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
New pdf_extractor.py with extract_page_as_pdf() and extract_pages_as_pdf() for extracting individual PDF pages as separate files. Adds document_chunk_path setting to config and document_chunk/ to .gitignore.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Enhance extract_metadata() with three new optional fields for page-aware chunking support. Validates list length mismatches. Fully backward compatible — existing callers unaffected.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Add chunk_pages() to TokenChunkingStrategy: one chunk per page with 200-token overlap from adjacent pages. Uses original page text for main content, decoded tokens for overlap. Never splits a page regardless of size.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Add parse_pdf_by_page() that returns List[Tuple[int, str]] with 1-indexed page numbers. Pages with no extractable text are skipped. Follows same error handling as existing parse_pdf().
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
ChromaDB 1.5.8 calls embed_query() during collection.query(), but the wrapper only implemented __call__ (used by collection.add()). Added embed_query() as alias and refactored to shared _embed() method.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
When uploading files, the backend passes them through NamedTemporaryFile, causing os.path.basename to return temp names like 'tmp90i7xqa8.pdf'. Added original_filename parameter to extract_metadata() so the actual upload filename is stored.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Sub-phase 1.5.3: Full RAG Database page with document listing, expandable chunk viewer, delete with confirmation, and document upload. Adds TypeScript types, API functions, TanStack Query hooks (useQuery + useMutation with cache invalidation), and three new components (DocumentList, ChunkList, DocumentUpload).
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Mark sub-phase 1.5.2 (backend CRUD) as complete. Update acceptance criteria, risk mitigations, and test plan.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
ChromaDB 1.5.8 requires embedding functions to implement the name() method from the EmbeddingFunction protocol. Without this, collection.get() fails with AttributeError.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Add 4 REST endpoints for RAG database management: GET /documents, GET /documents/{id}/chunks, DELETE /documents/{id}, DELETE /chunks/{id}. Register documents router in main.py. 8 unit tests covering all CRUD operations.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add react-router-dom with NavBar component (LTT + RAG Database tabs)
- Extract AppContent into LTTPage, add RAGDatabasePage placeholder
- Refactor App.tsx to BrowserRouter + Routes layout
- Switch ResponsePanel to react-markdown for rich formatting
- Fix ResponsePanel test for markdown rendering
- Update RAG prompt to cite source name instead of number
- Save Phase 1 enhancement plan (.plans/phase1_enhancement_plan.md)
- Log extra_body contents before sending to LLM
- Log full LLM response object for debugging
- Changed extra_body format to OpenRouter reasoning format
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus \u003cclio-agent@sisyphuslabs.ai\u003e
The LLM (Qwen3.5 via OpenRouter) returns JSON wrapped in markdown code blocks:
```json
["project manager", "limits", ...]
```
But the code was trying to parse this directly with json.loads(), causing:
- QueryDecomposer to return empty keywords
- RelevanceFilter to fail with "Expecting value: line 1 column 1"
Changes:
- Added _extract_json_from_markdown() helper function to both modules
- Strips markdown code block markers (```json and ```) before JSON parsing
- Added unit tests for markdown code block handling
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus \u003cclio-agent@sisyphuslabs.ai\u003e
- Changed grid layout from grid-rows-[1fr_auto] to grid-rows-[30%_1fr]
- Top section (video placeholder + query input) now fixed at 30% viewport height
- Bottom response panel scrolls independently when content is long
- Added overflow-hidden to video container to prevent overflow issues
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- LLMClient.complete() now accepts step_name parameter to identify processing step
- Logs prompt preview (first 100 + last 100 chars) at INFO level
- Logs processing time in milliseconds with token usage stats
- Updated QueryDecomposer, RelevanceFilter, and RAGService to pass step names
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Log to console only (must use backend/app/log/)
- Commit log files to git (must be .gitignored)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>