Three bugs caused 'Chunk by Question' to silently produce token chunks:
1. QuestionChunkingStrategy.chunk_pages() had a broken event-loop check
that always skipped LLM structure detection in FastAPI's async context.
Fixed by making chunk_pages() async and removing the is_running() guard.
2. get_chunking_strategy() factory never passed an LLMClient to
QuestionChunkingStrategy. Fixed by creating LLMClient in the factory
with graceful fallback to regex-only when config is incomplete.
3. rag.list_documents() and list_chunks() didn't extract strategy_type
or Q&A fields from ChromaDB metadata, so the frontend always showed
chunking_strategy='token' and null Q&A fields. Fixed by reading
these fields from ChromaDB and routing them through the API.
Also: TokenChunkingStrategy.chunk_pages() made async for consistency
with the question strategy; ingest router updated to await it.
Tests updated (asyncio.run() for sync tests, async mock chunk_pages).
Add retrieve_per_subquestion() that queries ChromaDB independently per sub-question instead of joining all sub-qs into one query string. Add filter_per_subquestion() that evaluates each chunk against its own originating sub-question in a single LLM call with a redesigned grouped prompt. Add generate_response_per_subquestion() that produces markdown sections per sub-question with grouped sources and {context_sections} template support. All existing methods preserved for backward compatibility.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Replace numeric [1] labels with [filename, page N] format in context chunks.
Update LLM prompt to instruct inline citation using bracket labels.
Enables traceable source references in generated answers.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
PDF uploads now use parse_pdf_by_page() -> chunk_pages() -> extract page PDFs -> enhanced metadata with page_number, chunk_file_path, and document_id. Same-filename replacement deletes old chunks and PDFs before re-ingest. DOCX/TXT keep original flat flow with document_id added. RAGService.ingest_document() accepts optional document_id parameter.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add react-router-dom with NavBar component (LTT + RAG Database tabs)
- Extract AppContent into LTTPage, add RAGDatabasePage placeholder
- Refactor App.tsx to BrowserRouter + Routes layout
- Switch ResponsePanel to react-markdown for rich formatting
- Fix ResponsePanel test for markdown rendering
- Update RAG prompt to cite source name instead of number
- Save Phase 1 enhancement plan (.plans/phase1_enhancement_plan.md)
- LLMClient.complete() now accepts step_name parameter to identify processing step
- Logs prompt preview (first 100 + last 100 chars) at INFO level
- Logs processing time in milliseconds with token usage stats
- Updated QueryDecomposer, RelevanceFilter, and RAGService to pass step names
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>