legco_ai_assistant/backend/app/test
Woody 0995c685fa feat(backend): add page-aware chunking with adjacent-page overlap
Add chunk_pages() to TokenChunkingStrategy: one chunk per page with 200-token overlap from adjacent pages. Uses original page text for main content, decoded tokens for overlap. Never splits a page regardless of size.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-24 10:30:18 +08:00
..
acceptance test(backend): update Phase 1 test suite 2026-04-23 13:27:40 +08:00
conftest.py test(backend): update Phase 1 test suite 2026-04-23 13:27:40 +08:00
test_phase1_chunking.py feat: Phase 1.2 ingestion pipeline with chunking and metadata 2026-04-22 16:49:52 +08:00
test_phase1_config.py feat: Phase 1.1 project setup with config, database, and models 2026-04-22 16:13:52 +08:00
test_phase1_database.py feat: Phase 1.1 project setup with config, database, and models 2026-04-22 16:13:52 +08:00
test_phase1_documents_router.py feat(backend): add documents CRUD endpoints and tests 2026-04-23 19:02:28 +08:00
test_phase1_ingest.py feat: Phase 1.2 ingestion pipeline with chunking and metadata 2026-04-22 16:49:52 +08:00
test_phase1_llm_client.py test(backend): update unit tests for LLM monitoring changes 2026-04-23 14:52:41 +08:00
test_phase1_metadata.py feat: Phase 1.2 ingestion pipeline with chunking and metadata 2026-04-22 16:49:52 +08:00
test_phase1_page_aware_chunking.py feat(backend): add page-aware chunking with adjacent-page overlap 2026-04-24 10:30:18 +08:00
test_phase1_parsers.py feat: Phase 1.2 ingestion pipeline with chunking and metadata 2026-04-22 16:49:52 +08:00
test_phase1_pdf_parser_pages.py feat(backend): add page-aware PDF parsing with per-page text extraction 2026-04-24 10:30:04 +08:00
test_phase1_query.py test(backend): update Phase 1 test suite 2026-04-23 13:27:40 +08:00
test_phase1_query_decomposer.py fix(backend): extract JSON from markdown code blocks in LLM responses 2026-04-23 16:28:07 +08:00
test_phase1_rag_service.py test(backend): update Phase 1 test suite 2026-04-23 13:27:40 +08:00
test_phase1_relevance_filter.py fix(backend): extract JSON from markdown code blocks in LLM responses 2026-04-23 16:28:07 +08:00
test_phase2_asr_client.py init: project setup with AGENTS.md, test structure, and plan directory 2026-04-22 15:22:29 +08:00
test_phase2_video_upload.py init: project setup with AGENTS.md, test structure, and plan directory 2026-04-22 15:22:29 +08:00
test_phase2_ws_asr.py init: project setup with AGENTS.md, test structure, and plan directory 2026-04-22 15:22:29 +08:00