legco_ai_assistant/backend/app
Woody 0995c685fa feat(backend): add page-aware chunking with adjacent-page overlap
Add chunk_pages() to TokenChunkingStrategy: one chunk per page with 200-token overlap from adjacent pages. Uses original page text for main content, decoded tokens for overlap. Never splits a page regardless of size.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-24 10:30:18 +08:00
core fix(backend): add embed_query method to EmbeddingFunctionWrapper for ChromaDB query 2026-04-24 10:15:08 +08:00
models feat(backend): add documents CRUD service methods and Pydantic schemas 2026-04-23 19:02:07 +08:00
routers fix(backend): preserve original filename in chunk metadata instead of temp file name 2026-04-24 10:14:58 +08:00
services feat(backend): add documents CRUD service methods and Pydantic schemas 2026-04-23 19:02:07 +08:00
test feat(backend): add page-aware chunking with adjacent-page overlap 2026-04-24 10:30:18 +08:00
utils feat(backend): add page-aware chunking with adjacent-page overlap 2026-04-24 10:30:18 +08:00
__init__.py feat: Phase 1.1 project setup with config, database, and models 2026-04-22 16:13:52 +08:00
main.py feat(backend): add documents CRUD endpoints and tests 2026-04-23 19:02:28 +08:00