Woody d94abaac77 feat: Phase 1.2 ingestion pipeline with chunking and metadata
- Add document parsers (DOCX, PDF) with lazy imports
- Add TokenChunkingStrategy with ABC for future replacement
- Add metadata extraction (filename, upload_date, content_summary)
- Add RAGService for ChromaDB ingestion/retrieval/response generation
- Add POST /api/v1/ingest endpoint with file validation
- Test-first: 20 passed, 2 skipped (python-docx not installed)
2026-04-22 16:49:52 +08:00
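
The commit message mentions a TokenChunkingStrategy built on an ABC so it can be swapped out later. A minimal sketch of that pattern might look like the following. Everything except the name TokenChunkingStrategy is an assumption made for illustration: the base-class name ChunkingStrategy, the chunk() method, the chunk_size and overlap parameters, and the use of whitespace splitting in place of a real tokenizer are not taken from the repository.

```python
from abc import ABC, abstractmethod


class ChunkingStrategy(ABC):
    """Hypothetical base class; the repository's actual interface may differ."""

    @abstractmethod
    def chunk(self, text: str) -> list[str]:
        """Split text into a list of chunks."""


class TokenChunkingStrategy(ChunkingStrategy):
    """Fixed-size token windows with overlap.

    Assumption: tokens are whitespace-separated words. A real pipeline
    would more likely count tokens with a model-specific tokenizer.
    """

    def __init__(self, chunk_size: int = 200, overlap: int = 20) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= overlap < chunk_size:
            raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
        self.chunk_size = chunk_size
        self.overlap = overlap

    def chunk(self, text: str) -> list[str]:
        tokens = text.split()
        step = self.chunk_size - self.overlap  # how far each window advances
        chunks: list[str] = []
        for start in range(0, len(tokens), step):
            chunks.append(" ".join(tokens[start:start + self.chunk_size]))
            # Stop once a window has reached the end of the token list,
            # so no trailing chunk made only of overlap is emitted.
            if start + self.chunk_size >= len(tokens):
                break
        return chunks
```

Because callers would depend only on the abstract chunk() method, a different strategy (sentence-based, for example) could be substituted without touching the ingestion code, which seems to be the intent behind "for future replacement".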
.plans docs: add PDF support alongside DOCX in all plans 2026-04-22 15:59:55 +08:00
backend feat: Phase 1.2 ingestion pipeline with chunking and metadata 2026-04-22 16:49:52 +08:00
.env.txt init: project setup with AGENTS.md, test structure, and plan directory 2026-04-22 15:22:29 +08:00
.gitignore chore: add .gitignore with Python, Node, env, and ChromaDB exclusions 2026-04-22 15:57:04 +08:00
AGENTS.md docs: add test-first and Phase X.Y sub-phase naming to AGENTS.md and plans 2026-04-22 15:54:34 +08:00
development_plan.md docs: add PDF support alongside DOCX in all plans 2026-04-22 15:59:55 +08:00