legco_ai_assistant/backend/app/services
Woody 787c6b1692 fix: vLLM highlight batch failure — replace guided_json with response_format + add debug logging
Root cause: guided_json removed in vLLM v0.12.0, and the two-attempt
loop (structured_outputs → guided_json) merged chat_template_kwargs
into the extra_body, potentially causing param conflicts.

Changes:
- llm_client.py: Replace _complete_structured_vllm() with two-tier
  approach — response_format (Tier 1, v0.6.4+) then structured_outputs
  (Tier 2, v0.8+). Remove dead guided_json path. Add _strip_markdown_fence().

- chunk_highlight_service.py: Add complete() fallback as defense-in-depth
  when structured output fails. Strip markdown fences before parsing.

- chunks.py: Add request/response logging at router level.

- chunk_highlight_service.py: Add full logging chain — entry, ChromaDB
  fetch, LLM call, fallback, cache results, exit.

- ResponsePanel.tsx: Add console logging for request payload, response
  status/errors/timing. Handle status=failed explicitly (was silently
  ignored). Track round-trip timing via performance.now().
2026-05-15 11:08:36 +08:00
..
__init__.py feat: Phase 1.1 project setup with config, database, and models 2026-04-22 16:13:52 +08:00
asr_client.py fix: Phase 2 ASR pipeline — 9 bugs resolved, Full Transcript works end-to-end 2026-05-06 18:26:17 +08:00
chunk_highlight_service.py fix: vLLM highlight batch failure — replace guided_json with response_format + add debug logging 2026-05-15 11:08:36 +08:00
embedding_client.py feat(backend): add embedding client and update LLM client 2026-04-23 13:26:43 +08:00
highlight_cache.py feat: add SQLite highlight cache service (Phase 5.4.3) 2026-04-29 09:26:20 +08:00
history_service.py feat: track highlight generation prompt, response, and timing in history (Phase 5.5) 2026-04-29 11:18:21 +08:00
llm_client.py fix: vLLM highlight batch failure — replace guided_json with response_format + add debug logging 2026-05-15 11:08:36 +08:00
llm_client_dp.py feat: inject Pydantic JSON schema into Deepseek prompt (Phase 6) 2026-05-04 15:17:24 +08:00
prompt_service.py feat: configurable SubQuestions via Step 1.2 system prompt page 2026-05-04 17:22:14 +08:00
query_decomposer.py feat: configurable SubQuestions via Step 1.2 system prompt page 2026-05-04 17:22:14 +08:00
rag.py feat: structured LLM output for decompose + citation fuzzy matching (Phase 5) 2026-04-28 15:39:17 +08:00
relevance_filter.py fix: wrap filter chunks in XML tags for clearer LLM input 2026-04-30 13:59:03 +08:00
video_service.py feat: Phase 2.3 ASR proxy + full transcript and 2.4 frontend hooks 2026-05-06 13:41:24 +08:00