legco_ai_assistant/backend
Woody 73c1789698 fix: Q&A chunking always fell back to token — LLM never called, missing API fields
Three bugs caused 'Chunk by Question' to silently produce token chunks:

1. QuestionChunkingStrategy.chunk_pages() had a broken event-loop check
   that always skipped LLM structure detection in FastAPI's async context.
   Fixed by making chunk_pages() async and removing the is_running() guard.

2. get_chunking_strategy() factory never passed an LLMClient to
   QuestionChunkingStrategy. Fixed by creating LLMClient in the factory
   with graceful fallback to regex-only when config is incomplete.

3. rag.list_documents() and list_chunks() didn't extract strategy_type
   or Q&A fields from ChromaDB metadata, so the frontend always showed
   chunking_strategy='token' and null Q&A fields. Fixed by reading
   these fields from ChromaDB and routing them through the API.
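
A minimal sketch of bugs 1 and 2, not the project's actual code. FakeLLMClient, detect_structure(), chunk_with_guard(), chunk_async() and make_llm() are hypothetical names invented for illustration, and the "Q:" splitting is an assumed stand-in for the real structure detection. It shows why a running-loop guard always takes the fallback path inside an async handler, and how an async method with an optional client avoids that.

```python
import asyncio
import re
from typing import Optional


class FakeLLMClient:
    """Hypothetical stand-in for the project's LLMClient."""

    async def detect_structure(self, text: str) -> list:
        # Pretend the model split the text into question blocks.
        return [part.strip() for part in text.split("Q:") if part.strip()]


def chunk_with_guard(text: str, llm: FakeLLMClient) -> tuple:
    """Sync method with a guard of the kind the commit describes as broken."""
    try:
        loop_running = asyncio.get_running_loop().is_running()
    except RuntimeError:
        loop_running = False
    if loop_running:
        # Inside an async web handler a loop is always running,
        # so the LLM path below is never reached.
        return "token", [text]
    return "question-llm", asyncio.run(llm.detect_structure(text))


async def chunk_async(text: str, llm: Optional[FakeLLMClient]) -> tuple:
    """Async method: awaits the LLM directly, regex-only when no client exists."""
    if llm is None:
        parts = [p.strip() for p in re.split(r"Q:", text) if p.strip()]
        return "question-regex", parts
    return "question-llm", await llm.detect_structure(text)


def make_llm(api_key: Optional[str]) -> Optional[FakeLLMClient]:
    """Factory-style helper: returns None when config is incomplete."""
    return FakeLLMClient() if api_key else None


async def handler() -> tuple:
    text = "Q: one Q: two"
    guarded = chunk_with_guard(text, FakeLLMClient())
    fixed = await chunk_async(text, make_llm("key"))
    no_config = await chunk_async(text, make_llm(None))
    return guarded, fixed, no_config


guarded, fixed, no_config = asyncio.run(handler())
print(guarded[0], fixed[0], no_config[0])
```

Called from inside handler(), the guarded version reports the token fallback even though a client was supplied, while the async version reaches the LLM path and only drops to regex when the factory returns no client.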

Also: TokenChunkingStrategy.chunk_pages() made async for consistency
with the question strategy; ingest router updated to await it.
Tests updated (asyncio.run() for sync tests, async mock chunk_pages).
2026-05-15 14:46:45 +08:00
app fix: Q&A chunking always fell back to token — LLM never called, missing API fields 2026-05-15 14:46:45 +08:00
uploads chore: add .gitignore with Python, Node, env, and ChromaDB exclusions 2026-04-22 15:57:04 +08:00
.env.example feat: Sub-Phase 8.0 — config & enums for Q&A-pair chunking strategy 2026-05-15 12:01:28 +08:00
pytest.ini chore(backend): update config, env template, and pytest settings 2026-04-23 13:26:08 +08:00
requirements.txt revert: remove Phase 3 YouTube proxy — all 7 sub-phases 2026-05-09 21:07:21 +08:00