diff --git a/.plans/phase1_backend_plan.md b/.plans/phase1_backend_plan.md
index 4bef6ad..0f6b206 100644
--- a/.plans/phase1_backend_plan.md
+++ b/.plans/phase1_backend_plan.md
@@ -3,7 +3,7 @@
 **Source**: `development_plan.md`
 **Scope**: FastAPI backend for text-based RAG Q&A
 **Estimated Duration**: 3-4 days
-**Status**: Draft
+**Status**: In Progress (Phase 1.1 ✅, Phase 1.2 ✅, Phase 1.3 pending)

 ---

@@ -18,13 +18,13 @@ Build a complete FastAPI backend that:

 ## Acceptance Criteria

-- [ ] `POST /api/v1/ingest` accepts DOCX and PDF, parses content, chunks at 1000/200, embeds, stores in ChromaDB with filename/upload_date/content_summary
+- [x] `POST /api/v1/ingest` accepts DOCX and PDF, parses content, chunks at 1000/200, embeds, stores in ChromaDB with filename/upload_date/content_summary
 - [ ] `POST /api/v1/query` accepts natural language question, returns JSON with: `keywords`, `answer` (bullet points), `sources` (array of metadata objects)
 - [ ] Query pipeline executes 3 LLM calls: decomposition → relevance filter → response generation
-- [ ] All LLM/ASR configuration reads from `.env` (OpenRouter for dev)
-- [ ] ChromaDB persists to `chroma_db/` directory
-- [ ] Chunking strategy is abstracted (interface/class) for future replacement
-- [ ] All unit tests pass (`pytest app/test/test_phase1_*.py -v`)
+- [x] All LLM/ASR configuration reads from `.env` (OpenRouter for dev)
+- [x] ChromaDB persists to `chroma_db/` directory
+- [x] Chunking strategy is abstracted (interface/class) for future replacement
+- [x] All unit tests pass (`pytest app/test/test_phase1_*.py -v`)
 - [ ] All acceptance tests pass (`pytest app/test/acceptance/ -v -m acceptance`)

 ---
@@ -71,6 +71,11 @@ Build a complete FastAPI backend that:

 **Commit**: "feat: Phase 1.1 project setup with config, database, and models"

+**Status**: ✅ Complete
+**Tests**: 5 passed (2 config, 3 database)
+
+---
+
 ### Phase 1.2: Ingestion Pipeline

 **Test files to write first**:
@@ -110,6 +115,12 @@ Build a complete FastAPI backend that:

 **Commit**: "feat: Phase 1.2 ingestion pipeline with chunking and metadata"

+**Status**: ✅ Complete
+**Tests**: 20 passed, 2 skipped (python-docx not installed in test env)
+**Coverage**: chunking (4), metadata (3), parsers (5), RAGService (6), ingest endpoint (4)
+
+---
+
 ### Phase 1.3: Query Pipeline (3-Step)

 **Test files to write first**:
@@ -117,39 +128,39 @@ Build a complete FastAPI backend that:
 - `test_phase1_rag_service.py` — Test retrieval and response generation
 - `test_phase1_query.py` — Test full pipeline with mocked LLM calls

-**Task 1.3.1**: LLM client
-- `services/llm_client.py`: `LLMClient` class
+**Task 1.3.1**: LLM client — ✅ Done in Phase 1.1
+- `services/llm_client.py`: `LLMClient` class — Implemented
 - Constructor takes config from `Settings`
 - Method: `complete(prompt: str, temperature: float = 0.7) -> str`
 - Use httpx with OpenAI-compatible API format
 - Handle errors gracefully

 **Task 1.3.2**: Query decomposition
-- `services/query_decomposer.py`: `QueryDecomposer` class
+- `services/query_decomposer.py`: `QueryDecomposer` class — 🔄 Pending
 - Prompt template: "Given question: '{question}', extract key search keywords as JSON array"
 - Method: `decompose(question: str) -> list[str]`
 - Parse LLM JSON response into list of keywords

-**Task 1.3.3**: Retrieval from ChromaDB
-- `services/rag.py`: Add `retrieve(query_keywords: list[str], n_results: int = 10)`
+**Task 1.3.3**: Retrieval from ChromaDB — ✅ Done in Phase 1.2
+- `services/rag.py`: `retrieve(query_keywords: list[str], n_results: int = 10)` — Implemented
 - Join keywords with space for query text
 - Return list of `(chunk_text, metadata, distance)` tuples

 **Task 1.3.4**: Relevance filtering
-- `services/relevance_filter.py`: `RelevanceFilter` class
+- `services/relevance_filter.py`: `RelevanceFilter` class — 🔄 Pending
 - Prompt: "Given question '{question}' and these document chunks, rate each 0-10 for relevance. Return JSON array of scores."
 - Input: list of chunks
 - Output: filtered list of (chunk, metadata) with score > threshold (e.g., 7)
 - Batch all chunks in single LLM call

-**Task 1.3.5**: Response generation
-- `services/rag.py`: Add `generate_response(question: str, chunks: list, metadata: list) -> str`
+**Task 1.3.5**: Response generation — ✅ Done in Phase 1.2
+- `services/rag.py`: `generate_response(question: str, chunks: list, metadata: list) -> str` — Implemented
 - Prompt: "Answer question using ONLY these document chunks. Format as bullet points. Cite sources."
 - Include chunk content and metadata in context
 - Enforce bullet-point format via prompt

 **Task 1.3.6**: Query endpoint
-- `routers/query.py`: `POST /api/v1/query`
+- `routers/query.py`: `POST /api/v1/query` — 🔄 Pending
 - Full pipeline orchestration:
 1. Call `query_decomposer.decompose()` → get keywords
 2. Call `rag.retrieve()` → get chunks
@@ -189,21 +200,22 @@ Build a complete FastAPI backend that:

 ---

-## New Services Required
+## Services Status

-| Service | File | Responsibility |
-|---------|------|----------------|
-| Config | `core/config.py` | `.env` loading, Settings class |
-| Database | `core/database.py` | ChromaDB persistent client |
-| LLM Client | `services/llm_client.py` | OpenAI-compatible API wrapper |
-| Query Decomposer | `services/query_decomposer.py` | Extract keywords from question |
-| Relevance Filter | `services/relevance_filter.py` | Batch score chunk relevance |
-| RAG Service | `services/rag.py` | Embedding, retrieval, response generation |
-| Document Parser | `utils/document_parser.py` | Router to DOCX/PDF parsers |
-| DOCX Parser | `utils/parsers/docx_parser.py` | Extract text from DOCX |
-| PDF Parser | `utils/parsers/pdf_parser.py` | Extract text from PDF |
-| Chunking | `utils/chunking.py` | Token-based chunking with overlap |
-| Metadata | `utils/metadata.py` | Extract file metadata |
+| Service | File | Status | Responsibility |
+|---------|------|--------|----------------|
+| Config | `core/config.py` | ✅ Complete | `.env` loading, Settings class |
+| Database | `core/database.py` | ✅ Complete | ChromaDB persistent client |
+| LLM Client | `services/llm_client.py` | ✅ Complete | OpenAI-compatible API wrapper |
+| Query Decomposer | `services/query_decomposer.py` | 🔄 Pending | Extract keywords from question |
+| Relevance Filter | `services/relevance_filter.py` | 🔄 Pending | Batch score chunk relevance |
+| RAG Service | `services/rag.py` | ✅ Complete | Embedding, retrieval, response generation |
+| Ingest Router | `routers/ingest.py` | ✅ Complete | POST /api/v1/ingest endpoint |
+| Query Router | `routers/query.py` | 🔄 Pending | POST /api/v1/query endpoint |
+| DOCX Parser | `utils/docx_parser.py` | ✅ Complete | Extract text from DOCX |
+| PDF Parser | `utils/pdf_parser.py` | ✅ Complete | Extract text from PDF |
+| Chunking | `utils/chunking.py` | ✅ Complete | Token-based chunking with overlap |
+| Metadata | `utils/metadata.py` | ✅ Complete | Extract file metadata |

 ---
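
Note on the two services still marked Pending (Tasks 1.3.2 and 1.3.4): the sketch below illustrates one way their JSON handling might look. It is not the project's implementation. The function names `decompose` and `filter_relevant`, the `complete` callable standing in for `LLMClient.complete`, and the behaviour on malformed LLM output are all assumptions made for illustration. Only the prompt wording and the example threshold of 7 come from the plan.

```python
import json
from typing import Callable

# Hypothetical stand-in for LLMClient.complete(prompt) -> str.
CompleteFn = Callable[[str], str]


def decompose(question: str, complete: CompleteFn) -> list[str]:
    """Ask the LLM for search keywords and parse its JSON array reply."""
    prompt = (
        f"Given question: '{question}', "
        "extract key search keywords as JSON array"
    )
    try:
        parsed = json.loads(complete(prompt))
    except json.JSONDecodeError:
        # Assumption: fall back to the raw question when the reply is not JSON.
        return [question]
    if not isinstance(parsed, list):
        return [question]
    return [str(keyword) for keyword in parsed]


def filter_relevant(
    question: str,
    chunks: list[tuple[str, dict]],
    complete: CompleteFn,
    threshold: float = 7,
) -> list[tuple[str, dict]]:
    """Score all (chunk_text, metadata) pairs in one LLM call; keep score > threshold."""
    if not chunks:
        return []
    numbered = "\n".join(f"[{i}] {text}" for i, (text, _) in enumerate(chunks))
    prompt = (
        f"Given question '{question}' and these document chunks, "
        "rate each 0-10 for relevance. Return JSON array of scores.\n"
        f"{numbered}"
    )
    scores = json.loads(complete(prompt))
    # Assumption: a reply that does not line up with the input is an error.
    if not isinstance(scores, list) or len(scores) != len(chunks):
        raise ValueError("expected one relevance score per chunk")
    return [chunk for chunk, score in zip(chunks, scores) if score > threshold]
```

Passing the LLM call in as a plain callable keeps both functions testable with a mocked reply, which matches the plan's intent to test the full pipeline with mocked LLM calls.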