feat(deploy): add Dockerfile, compose, nginx config, and README

Multi-stage Dockerfile: Node builds the frontend, Python serves both the API and static files.
docker-compose.yml with named volumes for ChromaDB, chunks, and SQLite data.
nginx.conf as reverse proxy with 350M upload limit and 300s LLM proxy timeout.
README with dev setup, deploy steps, env vars table, and architecture diagram.
Backend main.py: add catch-all route to serve frontend/dist/ static files in production. Only activates when dist/ exists.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-27 17:17:53 +08:00 · 2026-04-27 17:17:53 +08:00 · 4ad9deeccb
parent d444c99c23
commit 4ad9deeccb
5 changed files with 221 additions and 0 deletions
--- a/Dockerfile
+++ b/Dockerfile
@@ -0,0 +1,42 @@
+# Stage 1: Build frontend
+FROM node:20-alpine AS frontend-build
+
+WORKDIR /app/frontend
+
+COPY frontend/package.json frontend/package-lock.json ./
+RUN npm ci
+
+COPY frontend/ ./
+RUN npm run build
+
+# Stage 2: Production runtime
+FROM python:3.11-slim
+
+WORKDIR /app
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    tini \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install Python dependencies
+COPY backend/requirements.txt ./
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy backend source
+COPY backend/ ./
+
+# Copy built frontend from stage 1
+COPY --from=frontend-build /app/frontend/dist ./frontend/dist
+
+# Create data directories
+RUN mkdir -p /app/chroma_db /app/document_chunk /app/data /app/app/log
+
+# Expose port
+EXPOSE 8000
+
+# Use tini as init to handle signals properly
+ENTRYPOINT ["tini", "--"]
+
+# Start uvicorn
+CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
--- a/README.md
+++ b/README.md
@@ -0,0 +1,125 @@
+# LegCo Reranker
+
+RAG-powered document Q&A app. Upload PDFs, ask questions in Cantonese, get bullet-point answers with citations.
+
+## Quick Start (Dev)
+
+```bash
+# Backend
+cd backend
+cp .env.example .env    # edit .env with your LLM API key
+pip install -r requirements.txt
+uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
+
+# Frontend (in a second terminal, from the repo root)
+cd frontend
+npm install
+npm run dev
+```
+
+Backend → `http://localhost:8000` | Frontend → `http://localhost:5173`
+
+## Deploy with Docker
+
+### Prerequisites
+
+- Docker 24+ and Docker Compose v2
+- OpenRouter API key (or compatible LLM provider)
+
+### Setup
+
+```bash
+# 1. Configure environment
+cp backend/.env.example backend/.env
+# Edit backend/.env with your API keys and model names
+
+# 2. Build and start
+docker compose up -d --build
+
+# 3. Check health
+curl http://localhost:8000/health
+```
+
+The app is served at `http://localhost:8000` — both the API and the frontend UI.
+
+### Volumes
+
+| Volume | Purpose |
+|--------|---------|
+| `chroma_data` | ChromaDB vector store (persistent) |
+| `chunk_data` | Extracted PDF page files |
+| `sqlite_data` | Prompt templates and query history |
+
+### Environment Variables
+
+All configurable via `backend/.env`:
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `LLM_BASE_URL` | `https://openrouter.ai/api/v1` | LLM API endpoint |
+| `LLM_API_KEY` | — | API key for LLM provider |
+| `LLM_MODEL_NAME` | `qwen/qwen3.5-35b-a3b` | Chat model |
+| `EMBEDDING_MODEL` | `qwen/qwen3-embedding-4b` | Embedding model |
+| `EMBEDDING_API_KEY` | — | API key for embeddings (falls back to `LLM_API_KEY`) |
+| `RETRIEVAL_N_RESULTS` | `10` | Chunks per sub-question |
+| `RELEVANCE_THRESHOLD` | `7.0` | Min relevance score (0-10) |
+| `LLM_TIMEOUT` | `60.0` | LLM request timeout in seconds |
+
+### Production: Nginx Reverse Proxy
+
+```nginx
+# Include nginx.conf in your site config
+# Key settings:
+# - client_max_body_size 350M   (allow large PDF uploads)
+# - proxy_read_timeout 300s     (LLM calls can take minutes)
+```
+
+```bash
+# Install nginx
+sudo apt install nginx
+
+# Copy config
+sudo cp nginx.conf /etc/nginx/sites-available/legco
+sudo ln -s /etc/nginx/sites-available/legco /etc/nginx/sites-enabled/
+sudo nginx -t && sudo systemctl reload nginx
+```
+
+### Stopping
+
+```bash
+docker compose down
+```
+
+### Updating
+
+```bash
+git pull
+docker compose up -d --build
+```
+
+## Architecture
+
+```
+User → Nginx (80) → Uvicorn (8000)
+                         ├── FastAPI API (/api/v1/*)
+                         └── Static Frontend (/*)
+                              └── React 18 + Vite + Tailwind
+```
+
+### RAG Pipeline (Per-Sub-Question)
+
+```
+User Question
+  → [LLM] Decompose into 2-5 sub-questions
+  → [ChromaDB] Retrieve 10 chunks per sub-question
+  → [LLM] Score all chunks against their own sub-question (single call)
+  → [LLM] Generate markdown response per sub-question
+  → SSE stream with per-sub-question sources
+```
+
+## Notes
+
+- PDF upload limit: 300MB
+- Desktop only (not mobile-optimized)
+- No authentication (public demo)
+- All LLM calls routed through configurable base URL
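
Review note: the README's pipeline and its `RELEVANCE_THRESHOLD` setting imply a filtering step between scoring and generation. The sketch below shows one way that step could look. `ScoredChunk` and `filter_relevant` are hypothetical names that do not appear in this commit, and treating the threshold as inclusive is an assumption based on the "Min relevance score" wording; the real backend may structure this differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredChunk:
    """Hypothetical record: one retrieved chunk with its LLM relevance score (0-10)."""
    sub_question: str
    text: str
    score: float


def filter_relevant(chunks, threshold=7.0):
    """Group chunks by sub-question, keeping those scoring at or above the threshold."""
    kept = {}
    for chunk in chunks:
        if chunk.score >= threshold:  # inclusive minimum (assumption)
            kept.setdefault(chunk.sub_question, []).append(chunk)
    # Within each sub-question, put the highest-scoring chunks first.
    for group in kept.values():
        group.sort(key=lambda c: c.score, reverse=True)
    return kept
```

Sub-questions whose chunks all fall below the threshold are simply absent from the result, so a caller would need to decide how to answer those.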
--- a/backend/app/main.py
+++ b/backend/app/main.py
@@ -5,6 +5,7 @@ from pathlib import Path

 from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
+from fastapi.responses import FileResponse

 from app.routers import ingest, query, documents, prompts, history
 from app.core.config import get_settings
@@ -68,3 +69,12 @@ _history_conn.close()
 @app.get("/health")
 def health_check():
     return {"status": "ok"}
+
+_frontend_dist = Path(__file__).resolve().parent.parent / "frontend" / "dist"
+if _frontend_dist.is_dir():
+    @app.get("/{full_path:path}")
+    async def serve_frontend(full_path: str):
+        path = (_frontend_dist / (full_path or "index.html")).resolve()  # collapse ".." segments
+        if path.is_relative_to(_frontend_dist) and path.is_file():  # never serve files outside dist/
+            return FileResponse(str(path))
+        return FileResponse(str(_frontend_dist / "index.html"))  # SPA fallback for client-side routes
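
Review note: the catch-all route maps a request path to a file under `dist/` and falls back to `index.html` for anything else. The sketch below expresses that lookup as a pure function so it can be tested without FastAPI. `resolve_frontend_path` is a hypothetical helper, not part of this commit, and the containment check is a defensive measure against `..` segments in the request path. It assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_frontend_path(dist: Path, full_path: str) -> Path:
    """Return the file to serve for a request path, falling back to index.html."""
    dist = dist.resolve()
    index = dist / "index.html"
    # resolve() collapses ".." segments and follows symlinks.
    candidate = (dist / (full_path or "index.html")).resolve()
    # Serve the file only if it exists and still lives under dist/.
    if candidate.is_relative_to(dist) and candidate.is_file():
        return candidate
    return index
```

Keeping the lookup separate from the route handler means the handler only has to wrap the returned path in a `FileResponse`.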
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -0,0 +1,26 @@
+version: "3.8"
+
+services:
+  app:
+    build: .
+    container_name: legco_reranker
+    ports:
+      - "8000:8000"
+    env_file:
+      - ./backend/.env
+    volumes:
+      - chroma_data:/app/chroma_db
+      - chunk_data:/app/document_chunk
+      - sqlite_data:/app/data
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health', timeout=3)"]
+      interval: 30s
+      timeout: 5s
+      retries: 3
+      start_period: 10s
+
+volumes:
+  chroma_data:
+  chunk_data:
+  sqlite_data:
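
Review note: the compose healthcheck polls `/health` and treats any failure as unhealthy. The sketch below performs the same check with the Python standard library only, which may be handy in images that do not ship `curl` (the `python:3.11-slim` base does not include it by default). `probe` is a hypothetical name, not part of this commit.

```python
import http.client
import urllib.request


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses, so this covers
        # refused connections, timeouts, and non-2xx responses.
        return False
```

A healthcheck command could call it and exit non-zero on `False`, for example `sys.exit(0 if probe("http://localhost:8000/health") else 1)`.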
--- a/nginx.conf
+++ b/nginx.conf
@@ -0,0 +1,18 @@
+server {
+    listen 80;
+    server_name _;
+
+    client_max_body_size 350M;
+
+    location / {
+        proxy_pass http://127.0.0.1:8000;
+        proxy_http_version 1.1;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection "upgrade";
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto $scheme;
+        proxy_read_timeout 300s;
+    }
+}