feat(deploy): add Dockerfile, compose, nginx config, and README

Multi-stage Dockerfile: Node builds frontend, Python serves both API
and static files. docker-compose.yml with named volumes for ChromaDB,
chunks, and SQLite data. nginx.conf as reverse proxy with 350M upload
limit and 300s LLM proxy timeout. README with dev setup, deploy steps,
env vars table, and architecture diagram.

Backend main.py: add catch-all route that serves static files from
frontend/dist in production. The route is registered only when dist/ exists.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Woody 2026-04-27 17:17:53 +08:00
parent d444c99c23
commit 4ad9deeccb
5 changed files with 221 additions and 0 deletions

Dockerfile (new file, 42 additions)

@@ -0,0 +1,42 @@
# Stage 1: Build frontend
FROM node:20-alpine AS frontend-build
WORKDIR /app/frontend
COPY frontend/package.json frontend/package-lock.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build
# Stage 2: Production runtime
FROM python:3.11-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    tini \
    && rm -rf /var/lib/apt/lists/*
# Install Python dependencies
COPY backend/requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Copy backend source
COPY backend/ ./
# Copy built frontend from stage 1
COPY --from=frontend-build /app/frontend/dist ./frontend/dist
# Create data directories
RUN mkdir -p /app/chroma_db /app/document_chunk /app/data /app/app/log
# Expose port
EXPOSE 8000
# Use tini as init to handle signals properly
ENTRYPOINT ["tini", "--"]
# Start uvicorn
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

README.md (new file, 125 additions)

@@ -0,0 +1,125 @@
# LegCo Reranker
RAG-powered document Q&A app. Upload PDFs, ask questions in Cantonese, get bullet-point answers with citations.
## Quick Start (Dev)
```bash
# Backend (terminal 1)
cd backend
cp .env.example .env # edit .env with your LLM API key
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
# Frontend (terminal 2, from the repository root)
cd frontend
npm install
npm run dev
```
Backend → `http://localhost:8000` | Frontend → `http://localhost:5173`
## Deploy with Docker
### Prerequisites
- Docker 24+ and Docker Compose v2
- OpenRouter API key (or compatible LLM provider)
### Setup
```bash
# 1. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your API keys and model names
# 2. Build and start
docker compose up -d --build
# 3. Check health
curl http://localhost:8000/health
```
The app is served at `http://localhost:8000` — both the API and the frontend UI.
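
The `curl` check can also be scripted, for example to hold a deploy script until the container is ready. The following is a minimal sketch and not part of the repository: `wait_for_health` and its parameters are hypothetical names, and it assumes `/health` answers with HTTP 200 once the app is up.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url="http://localhost:8000/health", attempts=10, delay=3.0,
                    opener=urllib.request.urlopen, sleep=time.sleep):
    """Return True once `url` answers with HTTP 200, False after `attempts` tries.

    `opener` and `sleep` are injectable so the loop can be exercised without a
    running server (hypothetical helper, not part of the project).
    """
    for attempt in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # container still starting, or port not yet bound
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_for_health()` right after `docker compose up -d --build` would poll for up to roughly half a minute with the defaults shown here.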
### Volumes
| Volume | Purpose |
|--------|---------|
| `chroma_data` | ChromaDB vector store (persistent) |
| `chunk_data` | Extracted PDF page files |
| `sqlite_data` | Prompt templates and query history |
### Environment Variables
All configurable via `backend/.env`:
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_BASE_URL` | `https://openrouter.ai/api/v1` | LLM API endpoint |
| `LLM_API_KEY` | — | API key for LLM provider |
| `LLM_MODEL_NAME` | `qwen/qwen3.5-35b-a3b` | Chat model |
| `EMBEDDING_MODEL` | `qwen/qwen3-embedding-4b` | Embedding model |
| `EMBEDDING_API_KEY` | — | API key for embeddings (falls back to `LLM_API_KEY`) |
| `RETRIEVAL_N_RESULTS` | `10` | Chunks per sub-question |
| `RELEVANCE_THRESHOLD` | `7.0` | Min relevance score (0-10) |
| `LLM_TIMEOUT` | `60.0` | LLM request timeout in seconds |
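
To illustrate how these variables and their defaults fit together, here is a minimal sketch using only the standard library. It is not the project's actual `get_settings` implementation: the `Settings` dataclass and `load_settings` helper are hypothetical, and only the variable names, defaults, and the `EMBEDDING_API_KEY` fallback are taken from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    llm_base_url: str
    llm_api_key: str
    llm_model_name: str
    embedding_model: str
    embedding_api_key: str
    retrieval_n_results: int
    relevance_threshold: float
    llm_timeout: float


def load_settings(env=None):
    """Build Settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    llm_api_key = env.get("LLM_API_KEY", "")
    return Settings(
        llm_base_url=env.get("LLM_BASE_URL", "https://openrouter.ai/api/v1"),
        llm_api_key=llm_api_key,
        llm_model_name=env.get("LLM_MODEL_NAME", "qwen/qwen3.5-35b-a3b"),
        embedding_model=env.get("EMBEDDING_MODEL", "qwen/qwen3-embedding-4b"),
        # An unset or empty EMBEDDING_API_KEY falls back to LLM_API_KEY.
        embedding_api_key=env.get("EMBEDDING_API_KEY") or llm_api_key,
        retrieval_n_results=int(env.get("RETRIEVAL_N_RESULTS", "10")),
        relevance_threshold=float(env.get("RELEVANCE_THRESHOLD", "7.0")),
        llm_timeout=float(env.get("LLM_TIMEOUT", "60.0")),
    )
```

With only `LLM_API_KEY` set, the same key is used for embeddings and every other field takes its documented default.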
### Production: Nginx Reverse Proxy
```nginx
# Include nginx.conf in your site config
# Key settings:
# - client_max_body_size 350M (allow large PDF uploads)
# - proxy_read_timeout 300s (LLM calls can take minutes)
```
```bash
# Install nginx
sudo apt install nginx
# Copy config
sudo cp nginx.conf /etc/nginx/sites-available/legco
sudo ln -s /etc/nginx/sites-available/legco /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
```
### Stopping
```bash
docker compose down
```
### Updating
```bash
git pull
docker compose up -d --build
```
## Architecture
```
User → Nginx (80) → Uvicorn (8000)
                      ├── FastAPI API (/api/v1/*)
                      └── Static Frontend (/*)
                            └── React 18 + Vite + Tailwind
```
### RAG Pipeline (Per-Sub-Question)
```
User Question
→ [LLM] Decompose into 2-5 sub-questions
→ [ChromaDB] Retrieve 10 chunks per sub-question
→ [LLM] Score all chunks against their own sub-question (single call)
→ [LLM] Generate markdown response per sub-question
→ SSE stream with per-sub-question sources
```
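
The control flow of the steps above can be sketched in a few lines. This is an illustrative outline rather than the backend's code: `run_pipeline` and the four callables passed into it are hypothetical stand-ins for the LLM and ChromaDB calls, and dropping chunks that score below `RELEVANCE_THRESHOLD` (using `>=`) before generation is an assumption about where the threshold applies.

```python
from typing import Callable, Dict, Iterator, List


def run_pipeline(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[str]],
    score: Callable[[List[str], List[List[str]]], List[List[float]]],
    generate: Callable[[str, List[str]], str],
    n_results: int = 10,
    threshold: float = 7.0,
) -> Iterator[Dict[str, object]]:
    """Yield one result per sub-question, in order (hypothetical sketch)."""
    # 1. Split the user question into sub-questions.
    sub_questions = decompose(question)
    # 2. Retrieve candidate chunks separately for each sub-question.
    candidates = [retrieve(sq, n_results) for sq in sub_questions]
    # 3. Score every chunk against its own sub-question in a single call.
    scores = score(sub_questions, candidates)
    # 4. Keep the relevant chunks and generate an answer per sub-question.
    for sq, chunks, chunk_scores in zip(sub_questions, candidates, scores):
        kept = [c for c, s in zip(chunks, chunk_scores) if s >= threshold]
        yield {"sub_question": sq, "answer": generate(sq, kept), "sources": kept}
```

Writing it as a generator mirrors the streaming step: each yielded dictionary could be serialized as one SSE event carrying that sub-question's answer and sources.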
## Notes
- PDF upload limit: 300MB
- Desktop only (not mobile-optimized)
- No authentication (public demo)
- All LLM calls routed through configurable base URL

backend/app/main.py (modified, 10 additions)

@@ -5,6 +5,7 @@ from pathlib import Path
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import FileResponse
from app.routers import ingest, query, documents, prompts, history
from app.core.config import get_settings
@@ -68,3 +69,12 @@ _history_conn.close()
@app.get("/health")
def health_check():
    return {"status": "ok"}

_frontend_dist = Path(__file__).resolve().parent.parent / "frontend" / "dist"
if _frontend_dist.is_dir():
    @app.get("/{full_path:path}")
    async def serve_frontend(full_path: str):
        path = (_frontend_dist / (full_path or "index.html")).resolve()
        if path.is_relative_to(_frontend_dist) and path.is_file():  # never serve files outside dist/
            return FileResponse(str(path))
        return FileResponse(str(_frontend_dist / "index.html"))

docker-compose.yml (new file, 26 additions)

@@ -0,0 +1,26 @@
version: "3.8"
services:
  app:
    build: .
    container_name: legco_reranker
    ports:
      - "8000:8000"
    env_file:
      - ./backend/.env
    volumes:
      - chroma_data:/app/chroma_db
      - chunk_data:/app/document_chunk
      - sqlite_data:/app/data
    restart: unless-stopped
    healthcheck:  # python:3.11-slim ships without curl, so probe with the standard library
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
volumes:
  chroma_data:
  chunk_data:
  sqlite_data:

nginx.conf (new file, 18 additions)

@@ -0,0 +1,18 @@
server {
    listen 80;
    server_name _;
    client_max_body_size 350M;
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
    }
}