# LegCo Reranker

RAG-powered document Q&A app. Upload PDFs, ask questions in Cantonese, get bullet-point answers with citations.

## Quick Start (Dev)

```bash
# Backend (terminal 1, from the repo root)
cd backend
cp .env.example .env  # edit .env with your LLM API key
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Frontend (terminal 2, from the repo root)
cd frontend
npm install
npm run dev
```

Backend → `http://localhost:8000` | Frontend → `http://localhost:5173`

## Deploy with Docker

### Prerequisites

- Docker 24+ and Docker Compose v2
- OpenRouter API key (or compatible LLM provider)

### Setup

```bash
# 1. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your API keys and model names

# 2. Build and start
docker compose up -d --build

# 3. Check health
curl http://localhost:8000/health
```

The app is served at `http://localhost:8000`, which hosts both the API and the frontend UI.

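
The containers can take a moment to come up after `docker compose up`, so a single `curl` may fail at first. The health check can be scripted as a polling loop; this is a minimal sketch using only the Python standard library. It assumes `/health` answers with HTTP 200 once the backend is ready (the README only shows the endpoint, not its response), and the names `fetch_status` and `wait_for_health` are illustrative, not part of the project.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503 while starting).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_for_health(url="http://localhost:8000/health", attempts=30, delay=2.0,
                    fetch=fetch_status, sleep=time.sleep) -> bool:
    """Poll `url` until it returns 200 (assumed to mean healthy) or attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_health()` after step 2 returns `True` as soon as the endpoint responds with 200, and `False` if it never does within the attempt budget. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server.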
### Volumes

| Volume | Purpose |
|--------|---------|
| `chroma_data` | ChromaDB vector store (persistent) |
| `chunk_data` | Extracted PDF page files |
| `sqlite_data` | Prompt templates and query history |

### Environment Variables

All configurable via `backend/.env`:

| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_BASE_URL` | `https://openrouter.ai/api/v1` | LLM API endpoint |
| `LLM_API_KEY` | (none) | API key for LLM provider |
| `LLM_MODEL_NAME` | `qwen/qwen3.5-35b-a3b` | Chat model |
| `EMBEDDING_MODEL` | `qwen/qwen3-embedding-4b` | Embedding model |
| `EMBEDDING_API_KEY` | (none) | API key for embeddings (falls back to `LLM_API_KEY`) |
| `RETRIEVAL_N_RESULTS` | `10` | Chunks per sub-question |
| `RELEVANCE_THRESHOLD` | `7.0` | Minimum relevance score (0-10) |
| `LLM_TIMEOUT` | `60.0` | LLM request timeout in seconds |

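
To make the defaults and the `EMBEDDING_API_KEY` fallback concrete, here is a minimal sketch of how these variables could be resolved. It is not the backend's actual settings code: the `Settings` class and `load_settings` function are hypothetical, and the sketch reads an already-populated environment mapping rather than parsing `backend/.env` itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the variables in the table above."""
    llm_base_url: str
    llm_api_key: Optional[str]
    llm_model_name: str
    embedding_model: str
    embedding_api_key: Optional[str]
    retrieval_n_results: int
    relevance_threshold: float
    llm_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve each variable, applying the documented default when it is unset."""
    llm_api_key = env.get("LLM_API_KEY") or None
    return Settings(
        llm_base_url=env.get("LLM_BASE_URL", "https://openrouter.ai/api/v1"),
        llm_api_key=llm_api_key,
        llm_model_name=env.get("LLM_MODEL_NAME", "qwen/qwen3.5-35b-a3b"),
        embedding_model=env.get("EMBEDDING_MODEL", "qwen/qwen3-embedding-4b"),
        # Documented fallback: reuse the LLM key when no embedding key is set.
        embedding_api_key=env.get("EMBEDDING_API_KEY") or llm_api_key,
        retrieval_n_results=int(env.get("RETRIEVAL_N_RESULTS", "10")),
        relevance_threshold=float(env.get("RELEVANCE_THRESHOLD", "7.0")),
        llm_timeout=float(env.get("LLM_TIMEOUT", "60.0")),
    )
```

With only `LLM_API_KEY` set, `load_settings().embedding_api_key` equals the LLM key; setting `EMBEDDING_API_KEY` as well overrides it.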
### Production: Nginx Reverse Proxy

```nginx
# Include nginx.conf in your site config
# Key settings:
#   - client_max_body_size 350M  (allow large PDF uploads)
#   - proxy_read_timeout 300s    (LLM calls can take minutes)
```

```bash
# Install nginx
sudo apt install nginx

# Copy config
sudo cp nginx.conf /etc/nginx/sites-available/legco
sudo ln -s /etc/nginx/sites-available/legco /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```

### Stopping

```bash
docker compose down
```

### Updating

```bash
git pull
docker compose up -d --build
```

## Architecture

```
User → Nginx (80) → Uvicorn (8000)
                      ├── FastAPI API (/api/v1/*)
                      └── Static Frontend (/*)
                            └── React 18 + Vite + Tailwind
```

### RAG Pipeline (Per-Sub-Question)

```
User Question
  → [LLM] Decompose into 2-5 sub-questions
  → [ChromaDB] Retrieve 10 chunks per sub-question
  → [LLM] Score all chunks against their own sub-question (single call)
  → [LLM] Generate markdown response per sub-question
  → SSE stream with per-sub-question sources
```

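
The flow above can be sketched in Python as follows. This is an illustration of the control flow only, not the project's code: the LLM and ChromaDB steps are passed in as plain callables, and the names `run_pipeline`, `to_sse`, the chunk shape, and the event payload are all assumptions. It also assumes that a chunk is kept when its score is at or above `RELEVANCE_THRESHOLD`.

```python
import json
from typing import Callable, Dict, List

Chunk = Dict[str, object]  # hypothetical shape, e.g. {"id": ..., "text": ...}


def run_pipeline(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[Chunk]],
    score: Callable[[Dict[str, List[Chunk]]], Dict[str, List[float]]],
    generate: Callable[[str, List[Chunk]], str],
    n_results: int = 10,
    threshold: float = 7.0,
) -> List[dict]:
    # 1. Split the user question into sub-questions.
    sub_questions = decompose(question)

    # 2. Retrieve candidate chunks for each sub-question independently.
    candidates = {sq: retrieve(sq, n_results) for sq in sub_questions}

    # 3. Score every chunk against its own sub-question in one call.
    scores = score(candidates)

    results = []
    for sq in sub_questions:
        # Keep chunks scoring at or above the threshold (assumed comparison).
        kept = [c for c, s in zip(candidates[sq], scores[sq]) if s >= threshold]
        # 4. Generate one markdown answer per sub-question from its kept chunks.
        results.append({"sub_question": sq, "answer": generate(sq, kept), "sources": kept})
    return results


def to_sse(event: dict) -> str:
    """Format one payload as a Server-Sent Events message (a `data:` line plus a blank line)."""
    # ensure_ascii=False keeps Cantonese text readable on the wire.
    return f"data: {json.dumps(event, ensure_ascii=False)}\n\n"
```

Each item returned by `run_pipeline` carries its own `sources`, which is what lets the stream attach citations per sub-question; a server would emit `to_sse(item)` for each one.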
## Notes

- PDF upload limit: 300 MB
- Desktop only (not mobile-optimized)
- No authentication (public demo)
- All LLM calls are routed through a configurable base URL (`LLM_BASE_URL`)