diff --git a/README.md b/README.md
index d1cb121..3bd91d9 100644
--- a/README.md
+++ b/README.md
@@ -59,11 +59,20 @@ All configurable via `backend/.env`:
 | `LLM_BASE_URL` | `https://openrouter.ai/api/v1` | LLM API endpoint |
 | `LLM_API_KEY` | — | API key for LLM provider |
 | `LLM_MODEL_NAME` | `qwen/qwen3.5-35b-a3b` | Chat model |
+| `LLM_TIMEOUT` | `60.0` | LLM request timeout in seconds |
+| `LLM_ENABLE_THINKING` | `false` | Enable LLM thinking/reasoning tokens |
+| `VLLM_ENGINE` | `false` | Use vLLM-format `extra_body` instead of OpenRouter |
 | `EMBEDDING_MODEL` | `qwen/qwen3-embedding-4b` | Embedding model |
+| `EMBEDDING_BASE_URL` | `https://openrouter.ai/api/v1` | Embedding API endpoint |
 | `EMBEDDING_API_KEY` | — | API key for embeddings (falls back to `LLM_API_KEY`) |
+| `CHROMA_DB_PATH` | `./chroma_db` | ChromaDB persistent storage |
+| `CHUNK_SIZE` | `1000` | Token chunk size |
+| `CHUNK_OVERLAP` | `200` | Token chunk overlap |
 | `RETRIEVAL_N_RESULTS` | `10` | Chunks per sub-question |
 | `RELEVANCE_THRESHOLD` | `7.0` | Min relevance score (0-10) |
-| `LLM_TIMEOUT` | `60.0` | LLM request timeout in seconds |
+| `PROMPTS_DB_PATH` | `./data/prompts.db` | Prompt templates SQLite |
+| `HISTORY_DB_PATH` | `./data/history.db` | Query history SQLite |
+| `CORS_ORIGINS` | `["http://localhost:5173","http://localhost:3000"]` | Allowed CORS origins |
 
 ### Production: Nginx Reverse Proxy
 
@@ -156,6 +165,8 @@ docker run -d --name legco_test -p 8888:8000 \
   -e LLM_API_KEY=your_key_here \
   -e LLM_MODEL_NAME=qwen/qwen3.6-35b-a3b \
   -e LLM_TIMEOUT=60.0 \
+  -e LLM_ENABLE_THINKING=false \
+  -e VLLM_ENGINE=false \
   -e EMBEDDING_MODEL=qwen/qwen3-embedding-4b \
   -e EMBEDDING_BASE_URL=https://openrouter.ai/api/v1 \
   -e EMBEDDING_API_KEY=your_key_here \
diff --git a/backend/.env.example b/backend/.env.example
index 72c025b..4725b33 100644
--- a/backend/.env.example
+++ b/backend/.env.example
@@ -2,6 +2,8 @@ LLM_BASE_URL=https://openrouter.ai/api/v1
 LLM_API_KEY=your_openrouter_key_here
 LLM_MODEL_NAME=qwen/qwen3.5-35b-a3b
 LLM_TIMEOUT=60.0
+LLM_ENABLE_THINKING=false
+VLLM_ENGINE=false
 EMBEDDING_MODEL=qwen/qwen3-embedding-4b
 EMBEDDING_BASE_URL=https://openrouter.ai/api/v1