legco_ai_assistant/backend/app/utils
Woody 2656f9ca08 refactor(test): rewrite tests to comply with integration-first rules
Replace mocked DB/internal-services with real ChromaDB/SQLite via tmp_path.
Only mock truly external APIs (LLM, embedding for deterministic vectors).

13 test files rewritten (314 pass, 0 fail):
- Route tests: use TestClient + real ChromaDB, seed test data
- Service tests: use real PersistentClient/SQLite instances
- Pipeline tests: TestClient hits SSE /query endpoint, verify history
- Converted unittest.TestCase to pytest where applicable
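
A minimal sketch of the integration-first pattern described above: the test opens a real SQLite file under pytest's `tmp_path` instead of mocking the database. The helper `open_history_db`, the `history` table, and its columns are hypothetical names for illustration; they are not taken from this repository.

```python
import sqlite3
from pathlib import Path


def open_history_db(db_path: Path) -> sqlite3.Connection:
    """Open (or create) a real SQLite file. Hypothetical helper, not repo code."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "id INTEGER PRIMARY KEY, question TEXT NOT NULL)"
    )
    return conn


def test_history_round_trip(tmp_path: Path) -> None:
    # tmp_path is pytest's built-in per-test temporary directory fixture,
    # so each test gets its own isolated on-disk database.
    conn = open_history_db(tmp_path / "history.db")
    try:
        conn.execute("INSERT INTO history (question) VALUES (?)", ("What is LegCo?",))
        conn.commit()
        rows = conn.execute("SELECT question FROM history").fetchall()
        assert rows == [("What is LegCo?",)]
    finally:
        conn.close()
```

Because the database lives in a per-test directory, tests stay independent without any mocking of the storage layer; only external services would still need a stub.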

Plus: fix metadata.py to filter None values from ChromaDB metadata
(pre-existing bug caught by real-DB ingestion tests)
2026-04-27 11:46:58 +08:00
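
For illustration, the `metadata.py` fix mentioned in the commit message could look roughly like the sketch below. The function name `drop_none_values` is an assumption; the actual implementation in `metadata.py` is not shown in this listing. The idea is only to strip keys whose value is `None` before the metadata dict is handed to ChromaDB.

```python
from typing import Any, Dict


def drop_none_values(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of metadata without None-valued keys.

    Hypothetical helper: falsy but valid values (0, False, "") are kept,
    only None is filtered out.
    """
    return {key: value for key, value in metadata.items() if value is not None}


if __name__ == "__main__":
    raw = {"title": "Paper", "page": 3, "author": None}
    print(drop_none_values(raw))
```

Checking `is not None` rather than truthiness matters here: a page number of `0` or an empty string would otherwise be dropped along with the missing fields.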
__init__.py feat: Phase 1.1 project setup with config, database, and models 2026-04-22 16:13:52 +08:00
chunking.py feat(backend): add page-aware chunking with adjacent-page overlap 2026-04-24 10:30:18 +08:00
docx_parser.py refactor(backend): update document parsers for DOCX and PDF 2026-04-23 13:27:08 +08:00
metadata.py refactor(test): rewrite tests to comply with integration-first rules 2026-04-27 11:46:58 +08:00
pdf_extractor.py feat(backend): add PDF page extractor and chunk PDF storage config 2026-04-24 10:52:57 +08:00
pdf_parser.py feat(backend): add page-aware PDF parsing with per-page text extraction 2026-04-24 10:30:04 +08:00