legco_ai_assistant/frontend
Woody 14423c773a feat: Sub-Phases 8.1-8.4 — Q&A-pair chunking strategy
8.1 — Core algorithm (test-first):
- qa_chunking.py: preprocess_text, build_structure_detection_prompt,
  parse_llm_structure_response, Section dataclass, split_chinese_qa,
  split_english_qa, build_chunks_from_sections with recursive size split
- QuestionChunkingStrategy in chunking.py with _chunk_metadata tracking
- get_chunking_strategy() factory function
- table_extraction.py: vision LLM extraction, heuristic text fallback,
  disk cache, inject_tables_into_answer
- 18/18 tests pass (LLM parse, regex fast-pass, multi-page, ABC contract,
  size limit, chunk building, preprocess)
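
The "recursive size split" mentioned for build_chunks_from_sections could look roughly like the sketch below. This is an illustration only: the function name `recursive_split`, the separator order, and the character-based limit are assumptions, not the code in qa_chunking.py.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    max_chars: int,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Split text into pieces of at most max_chars (hypothetical helper).

    Coarse separators are tried first, so paragraphs stay together
    when they fit and only oversized pieces are split more finely.
    """
    if len(text) <= max_chars:
        return [text] if text else []
    for sep in separators:
        if sep not in text:
            continue
        chunks: List[str] = []
        current = ""
        # Greedily pack neighbouring parts while they fit the limit.
        for part in text.split(sep):
            candidate = part if not current else current + sep + part
            if len(candidate) <= max_chars:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = part
        if current:
            chunks.append(current)
        # A single part can still be too long; it no longer contains
        # `sep`, so the recursive call falls through to a finer one.
        result: List[str] = []
        for chunk in chunks:
            result.extend(recursive_split(chunk, max_chars, separators))
        return result
    # No separator left: fall back to a hard cut.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Under these assumptions, a long answer keeps its paragraph boundaries where possible and is cut mid-word only when no separator is present.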

8.2 — Metadata enrichment:
- extract_metadata() accepts strategy_type + chunk_metadata params
- Q&A fields (question_id, question_index, section_heading, etc.)
  merged into ChromaDB metadata entries
- DocumentInfo.chunking_strategy + ChunkInfo Q&A fields in models
- 6/6 metadata tests pass
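
A minimal sketch of the merge step in 8.2, assuming a flat metadata dict per chunk. The helper name `merge_chunk_metadata`, its signature, and the `QA_FIELDS` tuple are hypothetical; only the field names and the strategy_type / chunk_metadata parameters come from the notes above. The scalar-only filter is a defensive assumption about what the vector store accepts as metadata values.

```python
from typing import Any, Dict, Optional

# Subset of the Q&A fields named in the commit notes (the "etc." is not expanded).
QA_FIELDS = ("question_id", "question_index", "section_heading")


def merge_chunk_metadata(
    base: Dict[str, Any],
    strategy_type: str = "token",
    chunk_metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a copy of base with strategy and Q&A fields merged in (hypothetical helper)."""
    merged = dict(base)
    merged["chunking_strategy"] = strategy_type
    if strategy_type == "question" and chunk_metadata:
        for key in QA_FIELDS:
            value = chunk_metadata.get(key)
            # Keep only scalar values; None and nested structures are skipped.
            if isinstance(value, (str, int, float, bool)):
                merged[key] = value
    return merged
```

With this shape, token-strategy chunks carry only the base fields plus `chunking_strategy`, so existing entries stay compatible.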

8.3 — Ingest API integration:
- POST /api/v1/ingest accepts ?strategy=token|question
- Strategy validated against VALID_CHUNKING_STRATEGIES
- Factory creates the matching chunker; _chunk_metadata passed to extract_metadata
- 6/6 ingest integration tests pass, zero regressions on existing tests
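
The validation and factory flow in 8.3 might be sketched as follows, independent of any web framework. The names VALID_CHUNKING_STRATEGIES, QuestionChunkingStrategy, get_chunking_strategy, and _chunk_metadata appear in the notes above; everything else (the `ChunkingStrategy` base class, `TokenChunkingStrategy`, `validate_strategy`, the default of "token", and both `chunk` bodies) is an assumption made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Type

VALID_CHUNKING_STRATEGIES = ("token", "question")


class ChunkingStrategy(ABC):
    """Assumed common contract for all chunkers."""

    @abstractmethod
    def chunk(self, text: str) -> List[str]:
        ...


class TokenChunkingStrategy(ChunkingStrategy):
    def __init__(self, max_words: int = 200) -> None:
        self.max_words = max_words

    def chunk(self, text: str) -> List[str]:
        # Stand-in: fixed-size word windows instead of real tokenization.
        words = text.split()
        return [
            " ".join(words[i:i + self.max_words])
            for i in range(0, len(words), self.max_words)
        ]


class QuestionChunkingStrategy(ChunkingStrategy):
    def __init__(self) -> None:
        self._chunk_metadata: List[Dict[str, int]] = []

    def chunk(self, text: str) -> List[str]:
        # Stand-in: one chunk per blank-line-separated block.
        blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
        self._chunk_metadata = [{"question_index": i} for i in range(len(blocks))]
        return blocks


def validate_strategy(raw: Optional[str]) -> str:
    """Normalize the ?strategy= value; a missing value defaults to "token" (assumption)."""
    name = (raw or "token").strip().lower()
    if name not in VALID_CHUNKING_STRATEGIES:
        raise ValueError(
            f"Unknown strategy {raw!r}; expected one of {VALID_CHUNKING_STRATEGIES}"
        )
    return name


def get_chunking_strategy(name: Optional[str]) -> ChunkingStrategy:
    registry: Dict[str, Type[ChunkingStrategy]] = {
        "token": TokenChunkingStrategy,
        "question": QuestionChunkingStrategy,
    }
    return registry[validate_strategy(name)]()
```

In an ingest handler, the ValueError would typically be translated into a 4xx response, and the chunker's `_chunk_metadata` handed to extract_metadata after `chunk()` has run.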

8.4 — Frontend strategy selector:
- Radio button selector (Token / Question) on RAG Database page
- Strategy passed to ingest mutation via api.ts
- DocumentList: strategy badge (gray/blue)
- ChunkList: Q&A display with question_id, question_text, page range, table badge
- tsc --noEmit clean, vite build successful
2026-05-15 12:44:04 +08:00
src feat: Sub-Phases 8.1-8.4 — Q&A-pair chunking strategy 2026-05-15 12:44:04 +08:00
.pnpmrc feat: Phase 4 — System Audio & Listen Mic capture into ASR → RAG 2026-05-14 22:55:06 +08:00
index.html feat(frontend): Phase 1.1 project scaffold with Vite, Tailwind, and API client 2026-04-23 10:57:20 +08:00
package-lock.json feat: Phase 4 — System Audio & Listen Mic capture into ASR → RAG 2026-05-14 22:55:06 +08:00
package.json feat: Phase 7.1 — highlight prompt template + sequential citation [N] + highlightTerms parser 2026-05-15 10:46:55 +08:00
pnpm-lock.yaml feat: Phase 7.1 — highlight prompt template + sequential citation [N] + highlightTerms parser 2026-05-15 10:46:55 +08:00
pnpm-workspace.yaml feat: Phase 4 — System Audio & Listen Mic capture into ASR → RAG 2026-05-14 22:55:06 +08:00
postcss.config.cjs feat(frontend): Phase 1.1 project scaffold with Vite, Tailwind, and API client 2026-04-23 10:57:20 +08:00
tailwind.config.cjs feat(frontend): Phase 1.1 project scaffold with Vite, Tailwind, and API client 2026-04-23 10:57:20 +08:00
tsconfig.json feat(frontend): Phase 1.1 project scaffold with Vite, Tailwind, and API client 2026-04-23 10:57:20 +08:00
vite.config.ts fix: Phase 2 ASR pipeline — 9 bugs resolved, Full Transcript works end-to-end 2026-05-06 18:26:17 +08:00