vLLM's chat_template_kwargs leaked into LangChain's AsyncCompletions.parse() via _get_langchain_model's model_kwargs, causing structured decomposition to fail on vLLM backends. Skip vLLM-specific params when building the LangChain model, so that only provider-agnostic params (OpenAI reasoning) pass through.

| Name |
|---|
| core |
| models |
| routers |
| services |
| test |
| utils |
| __init__.py |
| main.py |
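
The fix described in the commit message can be sketched as a small filtering step applied before the kwargs reach the LangChain model. This is a minimal illustration, not the repository's actual code: the helper name `strip_vllm_params`, the `VLLM_ONLY_KEYS` set, and the sample parameter values are all hypothetical. Only the `chat_template_kwargs` key and the idea of letting OpenAI reasoning params through come from the commit message.

```python
from typing import Any

# Assumption: keys understood only by vLLM's server. The commit message
# names chat_template_kwargs; any other entries would be project-specific.
VLLM_ONLY_KEYS = frozenset({"chat_template_kwargs"})


def strip_vllm_params(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params without vLLM-specific entries.

    Hypothetical helper: the result is what would be handed to the
    LangChain model as model_kwargs, so that vLLM-only keys never reach
    the structured-output parse call.
    """
    return {k: v for k, v in params.items() if k not in VLLM_ONLY_KEYS}


# Illustrative input mixing a vLLM-only key with a provider-agnostic one.
request_params = {
    "chat_template_kwargs": {"enable_thinking": False},
    "reasoning": {"effort": "low"},
}

model_kwargs = strip_vllm_params(request_params)
```

A denylist is used here because the commit message speaks of skipping vLLM-specific params. An allowlist of known provider-agnostic keys would be the stricter alternative if new backend-specific keys are expected to appear.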