legco_ai_assistant

Commit Graph

Author	SHA1	Message	Date
Woody	3ab6fd102a	fix: use vLLM-native guided_json for structured output vLLM servers support JSON schema enforcement via extra_body (guided_json or structured_outputs), not OpenAI's response_format protocol. LangChain's with_structured_output(method='json_schema') sends response_format which vLLM ignores, causing NoneType not iterable parsing errors. - vLLM path: direct OpenAI SDK call with extra_body={guided_json|structured_outputs} - OpenRouter path: unchanged with_structured_output(method='json_schema') - Try new 'structured_outputs' format first, fall back to legacy 'guided_json' - Update _SEED_DECOMPOSE with explicit JSON array instruction - Add diagnostic logging: exc_info=True, schema preview, prompt template preview - Add logging in _parse_legacy_json for fallback failure debugging	2026-04-29 16:49:14 +08:00
Woody	2aca18d30e	docs: add vLLM structured output fix plan - Diagnose: vLLM ignores OpenAI-native response_format, causing NoneType error - Diagnose: legacy fallback prompt lacks JSON instruction → empty questions - Plan: use vLLM-native guided_json via extra_body instead of with_structured_output - Plan: update _SEED_DECOMPOSE with JSON format instruction - Plan: add diagnostic logging (exc_info, method, schema preview) wip: temporary function_calling switch for vLLM (to be replaced by guided_json)	2026-04-29 16:42:23 +08:00
Woody	cbb958d75d	fix: vLLM chat_template_kwargs breaks LangChain structured output vLLM's chat_template_kwargs leaked into LangChain's AsyncCompletions.parse() via _get_langchain_model's model_kwargs, causing structured decomposition to fail on vLLM backends. Skip vLLM-specific params when building the LangChain model — only provider-agnostic params (OpenAI reasoning) pass through.	2026-04-29 16:07:44 +08:00
Woody	48e15f8232	feat(llm): log structured LLM response and extra_body Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 16:50:26 +08:00
Woody	095f013739	feat(llm): pass extra_body via model_kwargs in LangChain Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 16:42:49 +08:00
Woody	f2115ae563	feat: structured LLM output for decompose + citation fuzzy matching (Phase 5) Phase 5.1 — Structured LLM output for query decomposition: - Add SubQuestions Pydantic model with sub_question, keywords, rationale - Add LLMClient.complete_structured() using langchain with_structured_output - Update QueryDecomposer with structured output path + legacy json.loads fallback - Update SQLite seed templates: add subq+citation labeling requirement - Add tests: structured output, subquestions model validation, logging Phase 5.2 — Citation format alignment and fallback links: - Add document_id to SourceMetadata (backend + frontend types) - Rewrite citationParser.ts with fuzzy matching and fallback document links - Add RAGDatabasePage auto-expand from ?document= URL param - Tighten generate_per_subq seed prompt: 'Copy exact bracket labels shown' - Add citation parser tests for fuzzy match and fallback link scenarios - Defer: DOCX/TXT PDF generation → Phase 5.3 (fallback links sufficient)	2026-04-28 15:39:17 +08:00
Woody	711be3dfde	feat(llm): add VLLM_ENGINE env flag for provider-specific extra_body format	2026-04-28 13:30:27 +08:00
Woody	029a0e490f	debug(backend): add LLM request/response logging for OpenRouter debugging - Log extra_body contents before sending to LLM - Log full LLM response object for debugging - Changed extra_body format to OpenRouter reasoning format Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 16:28:43 +08:00
Woody	f5cfe44183	feat(backend): add LLM monitoring with step names, timing, and prompt logging - LLMClient.complete() now accepts step_name parameter to identify processing step - Logs prompt preview (first 100 + last 100 chars) at INFO level - Logs processing time in milliseconds with token usage stats - Updated QueryDecomposer, RelevanceFilter, and RAGService to pass step names Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 14:51:57 +08:00
Woody	74cb8b83d5	feat(backend): migrate LLM client to OpenAI SDK with thinking control - Replace httpx with openai.AsyncOpenAI - Add llm_enable_thinking config (default False) - Add _build_extra_body() for Qwen3.5 thinking mode control - Use chat_template_kwargs for vLLM/SGLang compatibility Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 14:10:26 +08:00
Woody	38f4c70762	feat(backend): add embedding client and update LLM client Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 13:26:43 +08:00
Woody	d94abaac77	feat: Phase 1.2 ingestion pipeline with chunking and metadata - Add document parsers (DOCX, PDF) with lazy imports - Add TokenChunkingStrategy with ABC for future replacement - Add metadata extraction (filename, upload_date, content_summary) - Add RAGService for ChromaDB ingestion/retrieval/response generation - Add POST /api/v1/ingest endpoint with file validation - Test-first: 20 passed, 2 skipped (python-docx not installed)	2026-04-22 16:49:52 +08:00

12 Commits
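
Illustrative note on 3ab6fd102a: the "try 'structured_outputs' first, fall back to legacy 'guided_json'" behaviour described in that commit could look roughly like the sketch below. This is not the repository's code. The names `SUB_QUESTIONS_SCHEMA`, `build_extra_body_candidates`, `complete_with_fallback`, and the `send` callable are hypothetical, the schema is only loosely modelled on the SubQuestions fields listed in f2115ae563, and the `{"structured_outputs": {"json": ...}}` payload shape is an assumption about newer vLLM releases that should be checked against the deployed server version.

```python
import json
from typing import Any, Callable

# Hypothetical schema, loosely modelled on the SubQuestions fields
# (sub_question, keywords, rationale) named in commit f2115ae563.
SUB_QUESTIONS_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "questions": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "sub_question": {"type": "string"},
                    "keywords": {"type": "array", "items": {"type": "string"}},
                    "rationale": {"type": "string"},
                },
                "required": ["sub_question"],
            },
        }
    },
    "required": ["questions"],
}


def build_extra_body_candidates(
    schema: dict[str, Any],
) -> list[tuple[str, dict[str, Any]]]:
    """Return (format name, extra_body) pairs in the order they are tried."""
    return [
        # Assumed shape of the newer vLLM format; verify for your version.
        ("structured_outputs", {"structured_outputs": {"json": schema}}),
        # Legacy vLLM format.
        ("guided_json", {"guided_json": schema}),
    ]


def complete_with_fallback(
    send: Callable[[dict[str, Any]], str],
    schema: dict[str, Any],
) -> tuple[str, Any]:
    """Try each extra_body format in turn and return the first parsed result.

    `send` is a hypothetical callable that performs the request with the
    given extra_body and returns the raw message content as a string.
    """
    last_error: Exception | None = None
    for name, extra_body in build_extra_body_candidates(schema):
        try:
            raw = send(extra_body)
            return name, json.loads(raw)
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
    raise RuntimeError("all structured-output formats failed") from last_error
```

With the OpenAI Python SDK, `send` could wrap a call such as `client.chat.completions.create(model=..., messages=..., extra_body=extra_body)` and return `response.choices[0].message.content`. Injecting the callable keeps the fallback order testable without a running vLLM server.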