legco_ai_assistant

Woody b2dd385443 feat(backend): refactor ingest pipeline for page-aware chunking with PDF generation	2026-04-24 10:53:17 +08:00

PDF uploads now use parse_pdf_by_page() -> chunk_pages() -> extract page PDFs -> enhanced metadata with page_number, chunk_file_path, and document_id.
Same-filename replacement deletes old chunks and PDFs before re-ingest.
DOCX/TXT keep original flat flow with document_id added.
RAGService.ingest_document() accepts optional document_id parameter.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

.plans	docs: update enhancement plan with sub-phase 1.5.4 completion status	2026-04-24 10:30:55 +08:00
backend	feat(backend): refactor ingest pipeline for page-aware chunking with PDF generation	2026-04-24 10:53:17 +08:00
frontend	feat(frontend): add RAG Database management page with document CRUD UI	2026-04-24 09:41:56 +08:00
test materials	test: add sample documents for manual testing	2026-04-23 13:28:13 +08:00
.env.txt	init: project setup with AGENTS.md, test structure, and plan directory	2026-04-22 15:22:29 +08:00
.gitignore	feat(backend): add PDF page extractor and chunk PDF storage config	2026-04-24 10:52:57 +08:00
AGENTS.md	docs: add logging anti-patterns to AGENTS.md	2026-04-23 14:10:09 +08:00
development_plan.md	docs: update development plans with Phase 1 completion status	2026-04-23 13:27:52 +08:00
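
The latest commit message describes a page-aware ingest flow: chunks stay within one page, each chunk carries page_number, chunk_file_path, and document_id, and re-uploading the same filename replaces the old chunks. The sketch below illustrates that idea only; it is not the repository's code. The `Chunk` and `InMemoryStore` types, the `chunk_pdfs/<document_id>/page_<n>.pdf` path layout, and the `max_chars` parameter are assumptions made for illustration. The `chunk_pages()` shown here shares only its name with the function mentioned in the commit, and its signature is a guess. Page text is passed in as plain strings, standing in for whatever parse_pdf_by_page() returns.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    document_id: str
    page_number: int      # 1-based page the chunk came from
    chunk_file_path: str  # hypothetical location of the single-page PDF


def chunk_pages(pages, document_id, pdf_dir="chunk_pdfs", max_chars=200):
    """Split each page's text into chunks that never cross a page boundary."""
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        # Every chunk from a page points at that page's extracted PDF.
        page_pdf = f"{pdf_dir}/{document_id}/page_{page_number}.pdf"
        for start in range(0, len(text), max_chars):
            piece = text[start:start + max_chars].strip()
            if piece:
                chunks.append(Chunk(piece, document_id, page_number, page_pdf))
    return chunks


@dataclass
class InMemoryStore:
    """Stand-in for a vector store, keyed by uploaded filename."""
    chunks_by_filename: dict = field(default_factory=dict)

    def ingest(self, filename, pages, document_id=None):
        # Same-filename replacement: drop the old chunks before re-ingest.
        self.chunks_by_filename.pop(filename, None)
        # The caller may supply a document_id; otherwise one is generated.
        document_id = document_id or uuid.uuid4().hex
        chunks = chunk_pages(pages, document_id)
        self.chunks_by_filename[filename] = chunks
        return chunks
```

For example, `InMemoryStore().ingest("report.pdf", ["page one text", "page two text"], document_id="doc1")` yields two chunks whose page_number values are 1 and 2. A real pipeline would also write the per-page PDF files and delete them on replacement, which this sketch leaves out.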