LegCo documents use multiple formats (問/答 markers, Q1/Q2 numbering,
section headings like '(1) 住戶的安置補償', 發言要點 bullet points,
and pure table pages). Regex alone cannot reliably classify all these.
Changes:
- Primary detection: LLM call identifies ALL section types in one pass
(qa, narrative, speaking_notes, table, toc, heading_only)
- Regex: downgraded to optional fast-pass optimization for known patterns
- Architecture diagram, algorithm detail, risks, and test plan all updated
- Single model handles structure detection + table extraction + verification
- New risk: vLLM may not support Qwen3.5-35B-A3B vision API depending on version
- Dependencies: added vLLM compatibility note with smoke test snippet
- Heuristic fallback (Option B) works regardless of OpenRouter or vLLM
- qa_vision_enabled toggle provides escape hatch