3.0 KiB
3.0 KiB
LegCo Reranker
RAG-powered document Q&A app. Upload PDFs, ask questions in Cantonese, get bullet-point answers with citations.
Quick Start (Dev)
# Backend
cd backend
cp .env.example .env # edit .env with your LLM API key
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
# Frontend
cd frontend
npm install
npm run dev
Backend → http://localhost:8000 | Frontend → http://localhost:5173
Deploy with Docker
Prerequisites
- Docker 24+ and Docker Compose v2
- OpenRouter API key (or compatible LLM provider)
Setup
# 1. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your API keys and model names
# 2. Build and start
docker compose up -d --build
# 3. Check health
curl http://localhost:8000/health
The app is served at http://localhost:8000 — both the API and the frontend UI.
Volumes
| Volume | Purpose |
|---|---|
chroma_data |
ChromaDB vector store (persistent) |
chunk_data |
Extracted PDF page files |
sqlite_data |
Prompt templates and query history |
Environment Variables
All configurable via backend/.env:
| Variable | Default | Description |
|---|---|---|
LLM_BASE_URL |
https://openrouter.ai/api/v1 |
LLM API endpoint |
LLM_API_KEY |
— | API key for LLM provider |
LLM_MODEL_NAME |
qwen/qwen3.5-35b-a3b |
Chat model |
EMBEDDING_MODEL |
qwen/qwen3-embedding-4b |
Embedding model |
EMBEDDING_API_KEY |
— | API key for embeddings (falls back to LLM_API_KEY) |
RETRIEVAL_N_RESULTS |
10 |
Chunks per sub-question |
RELEVANCE_THRESHOLD |
7.0 |
Min relevance score (0-10) |
LLM_TIMEOUT |
60.0 |
LLM request timeout in seconds |
Production: Nginx Reverse Proxy
# Include nginx.conf in your site config
# Key settings:
# - client_max_body_size 350M (allow large PDF uploads)
# - proxy_read_timeout 300s (LLM calls can take minutes)
# Install nginx
sudo apt install nginx
# Copy config
sudo cp nginx.conf /etc/nginx/sites-available/legco
sudo ln -s /etc/nginx/sites-available/legco /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
Stopping
docker compose down
Updating
git pull
docker compose up -d --build
Architecture
User → Nginx (80) → Uvicorn (8000)
├── FastAPI API (/api/v1/*)
└── Static Frontend (/*)
└── React 18 + Vite + Tailwind
RAG Pipeline (Per-Sub-Question)
User Question
→ [LLM] Decompose into 2-5 sub-questions
→ [ChromaDB] Retrieve 10 chunks per sub-question
→ [LLM] Score all chunks against their own sub-question (single call)
→ [LLM] Generate markdown response per sub-question
→ SSE stream with per-sub-question sources
Notes
- PDF upload limit: 300MB
- Desktop only (not mobile-optimized)
- No authentication (public demo)
- All LLM calls routed through configurable base URL