Three bugs caused 'Chunk by Question' to silently produce token chunks:
1. QuestionChunkingStrategy.chunk_pages() had a broken event-loop check
that always skipped LLM structure detection in FastAPI's async context.
Fixed by making chunk_pages() async and removing the is_running() guard.
2. get_chunking_strategy() factory never passed an LLMClient to
QuestionChunkingStrategy. Fixed by creating LLMClient in the factory
with graceful fallback to regex-only when config is incomplete.
3. rag.list_documents() and list_chunks() didn't extract strategy_type
or Q&A fields from ChromaDB metadata, so the frontend always showed
chunking_strategy='token' and null Q&A fields. Fixed by reading
these fields from ChromaDB and routing them through the API.
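
The first two fixes can be sketched roughly as below. This is a minimal illustration, not the project's code: only the names QuestionChunkingStrategy, chunk_pages(), get_chunking_strategy() and LLMClient come from this commit. The detect_structure() method, the _regex_split() helper, the "Q<number>." heading pattern and the config keys are assumptions made for the example.

```python
import asyncio
import re
from typing import Optional


class LLMClient:
    """Hypothetical stand-in for the real LLM client."""

    def __init__(self, api_key: str, model: str) -> None:
        self.api_key = api_key
        self.model = model

    async def detect_structure(self, text: str) -> list[str]:
        # Assumed method: a real client would call the model here.
        # This stub just splits on blank lines.
        await asyncio.sleep(0)
        return [part.strip() for part in text.split("\n\n") if part.strip()]


class QuestionChunkingStrategy:
    def __init__(self, llm_client: Optional[LLMClient] = None) -> None:
        self.llm_client = llm_client

    async def chunk_pages(self, pages: list[str]) -> list[str]:
        text = "\n\n".join(pages)
        if self.llm_client is not None:
            # Awaited directly. Because the method is itself async, there
            # is no need to inspect the event loop with is_running().
            return await self.llm_client.detect_structure(text)
        return self._regex_split(text)

    @staticmethod
    def _regex_split(text: str) -> list[str]:
        # Regex-only fallback: start a new chunk at each line that begins
        # with "Q<number>." or "Q<number>:" (assumed heading format).
        parts = re.split(r"(?m)^(?=Q\d+[.:])", text)
        return [part.strip() for part in parts if part.strip()]


def get_chunking_strategy(config: dict) -> QuestionChunkingStrategy:
    # Assumed config keys. When either is missing, no client is created
    # and the strategy falls back to regex-only chunking.
    api_key = config.get("llm_api_key")
    model = config.get("llm_model")
    client = LLMClient(api_key, model) if api_key and model else None
    return QuestionChunkingStrategy(llm_client=client)
```

With an empty config, get_chunking_strategy({}) returns a strategy whose llm_client is None, so chunk_pages() takes the regex path instead of failing.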
Also: TokenChunkingStrategy.chunk_pages() is now async for consistency
with the question strategy, and the ingest router awaits it.
Tests updated: sync tests call it via asyncio.run(), and chunk_pages mocks are async.
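
The test pattern described above might look like the following sketch. The ingest() coroutine is a hypothetical stand-in for the ingest router, and the test name and chunk values are invented for the example; AsyncMock and asyncio.run() are standard library.

```python
import asyncio
from unittest.mock import AsyncMock


async def ingest(strategy, pages):
    # Hypothetical stand-in for the ingest router: it must await
    # chunk_pages() now that the method is a coroutine.
    chunks = await strategy.chunk_pages(pages)
    return len(chunks)


def test_ingest_awaits_chunk_pages():
    # Attributes of an AsyncMock are themselves AsyncMocks, so
    # strategy.chunk_pages(...) returns an awaitable.
    strategy = AsyncMock()
    strategy.chunk_pages.return_value = ["chunk-1", "chunk-2"]

    # A synchronous test drives the coroutine with asyncio.run().
    count = asyncio.run(ingest(strategy, ["page text"]))

    assert count == 2
    strategy.chunk_pages.assert_awaited_once_with(["page text"])
```

Using assert_awaited_once_with() rather than assert_called_once_with() also catches a caller that forgets the await.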
- Add page_number and chunk_file_path to SourceMetadata model and query router
- Add GET /chunks/{file_path}/pdf endpoint with path traversal protection
- Add View PDF links in ResponsePanel source cards and ChunkList component
- Update TypeScript types and API helper for chunk PDF URLs
- Add backend tests (5) and frontend ChunkList tests (7)
- Update enhancement plan: all 3 features complete
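
One common way to implement path traversal protection for an endpoint like GET /chunks/{file_path}/pdf is to resolve the requested path and check that it stays inside the chunk directory. The sketch below illustrates that idea only; the function name resolve_chunk_pdf, the base directory argument and the .pdf suffix check are assumptions, not the endpoint's actual code. It needs Python 3.9+ for Path.is_relative_to().

```python
from pathlib import Path


def resolve_chunk_pdf(file_path: str, base_dir: Path) -> Path:
    """Return the resolved PDF path, or raise ValueError if it is unsafe."""
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below runs against the real target path.
    candidate = (base / file_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the chunk directory")
    if candidate.suffix.lower() != ".pdf":
        raise ValueError("not a PDF file")
    return candidate
```

A request for "../secret.pdf" or an absolute path such as "/etc/passwd" resolves outside the base directory and is rejected; a router would typically turn the ValueError into a 4xx response.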
The delete-document endpoint now removes the associated chunk PDF files from document_chunk/ before the ChromaDB deletion. The delete-chunk endpoint removes the individual chunk PDF. Missing files are logged as warnings, not errors.
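
The "warn, don't fail" behavior for missing files can be sketched as follows. The helper name remove_chunk_pdfs and the log message are hypothetical; the real endpoints may structure this differently.

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def remove_chunk_pdfs(paths: list[Path]) -> int:
    """Delete each chunk PDF and return how many files were removed."""
    removed = 0
    for path in paths:
        try:
            path.unlink()
            removed += 1
        except FileNotFoundError:
            # A missing file is not treated as a failure: log it and
            # carry on so the database deletion can still proceed.
            logger.warning("Chunk PDF already missing: %s", path)
    return removed
```

Catching FileNotFoundError around unlink() avoids a separate exists() check, which could race with another deletion.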
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Add 4 REST endpoints for RAG database management: GET /documents, GET /documents/{id}/chunks, DELETE /documents/{id}, DELETE /chunks/{id}. Register the documents router in main.py. Add 8 unit tests covering all CRUD operations.