
Woody 4ad9deeccb feat(deploy): add Dockerfile, compose, nginx config, and README Multi-stage Dockerfile: Node builds frontend, Python serves both API and static files. docker-compose.yml with named volumes for ChromaDB, chunks, and SQLite data. nginx.conf as reverse proxy with 350M upload limit and 300s LLM proxy timeout. README with dev setup, deploy steps, env vars table, and architecture diagram. Backend main.py: add catch-all route to serve frontend/dist/static files in production. Only activates when dist/ exists. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>		2026-04-27 17:17:53 +08:00
.plans	feat(prompts): integrate filter_per_subq with PromptService, fix seed bugs, restructure UI	2026-04-27 11:14:27 +08:00
backend	feat(deploy): add Dockerfile, compose, nginx config, and README	2026-04-27 17:17:53 +08:00
frontend	fix(relevance): tolerate LLM score count mismatches via padding instead of discarding	2026-04-27 14:31:18 +08:00
.env.txt	init: project setup with AGENTS.md, test structure, and plan directory	2026-04-22 15:22:29 +08:00
.gitignore	feat(db): Phase 3.1 — SQLite infrastructure (prompts.db + history.db)	2026-04-25 20:29:29 +08:00
AGENTS.md	docs: update testing rules to integration-first approach	2026-04-27 10:55:43 +08:00
Dockerfile	feat(deploy): add Dockerfile, compose, nginx config, and README	2026-04-27 17:17:53 +08:00
README.md	feat(deploy): add Dockerfile, compose, nginx config, and README	2026-04-27 17:17:53 +08:00
development_plan.md	docs: update development plans with Phase 1 completion status	2026-04-23 13:27:52 +08:00
docker-compose.yml	feat(deploy): add Dockerfile, compose, nginx config, and README	2026-04-27 17:17:53 +08:00
nginx.conf	feat(deploy): add Dockerfile, compose, nginx config, and README	2026-04-27 17:17:53 +08:00

README.md

LegCo Reranker

RAG-powered document Q&A app. Upload PDFs, ask questions in Cantonese, get bullet-point answers with citations.

Quick Start (Dev)

# Backend
cd backend
cp .env.example .env    # edit .env with your LLM API key
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Frontend (in a second terminal, from the repository root)
cd frontend
npm install
npm run dev

Backend → http://localhost:8000 | Frontend → http://localhost:5173

Deploy with Docker

Prerequisites

Docker 24+ and Docker Compose v2
OpenRouter API key (or compatible LLM provider)

Setup

# 1. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your API keys and model names

# 2. Build and start
docker compose up -d --build

# 3. Check health
curl http://localhost:8000/health

Both the API and the frontend UI are served at http://localhost:8000.
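The container can take a few seconds to start, so a single curl right after `docker compose up` may fail. A minimal sketch of a polling check, assuming only that `/health` answers with HTTP 200 once the app is ready (the function names, attempt count, and delay are illustrative, not part of the project):

```python
import time
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:8000/health", timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


def wait_until_healthy(check=is_healthy, attempts: int = 30, delay: float = 2.0) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` after step 2 returns `True` as soon as the health endpoint responds, or `False` after roughly a minute with the defaults above.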

Volumes

Volume	Purpose
`chroma_data`	ChromaDB vector store (persistent)
`chunk_data`	Extracted PDF page files
`sqlite_data`	Prompt templates and query history

Environment Variables

All variables are configurable via backend/.env:

Variable	Default	Description
`LLM_BASE_URL`	`https://openrouter.ai/api/v1`	LLM API endpoint
`LLM_API_KEY`	—	API key for LLM provider
`LLM_MODEL_NAME`	`qwen/qwen3.5-35b-a3b`	Chat model
`EMBEDDING_MODEL`	`qwen/qwen3-embedding-4b`	Embedding model
`EMBEDDING_API_KEY`	—	API key for embeddings (falls back to `LLM_API_KEY`)
`RETRIEVAL_N_RESULTS`	`10`	Chunks per sub-question
`RELEVANCE_THRESHOLD`	`7.0`	Minimum relevance score (0-10) a chunk needs to be kept
`LLM_TIMEOUT`	`60.0`	LLM request timeout in seconds
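The defaults and the `EMBEDDING_API_KEY` fallback in the table can be sketched as a small settings loader. This is an illustration under assumptions: the `Settings` class and `load_settings` function are hypothetical names, the backend's actual loading code is not shown in this README, and the sketch reads variables that are already in the process environment rather than parsing `backend/.env` itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    llm_base_url: str
    llm_api_key: Optional[str]
    llm_model_name: str
    embedding_model: str
    embedding_api_key: Optional[str]
    retrieval_n_results: int
    relevance_threshold: float
    llm_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, applying the documented defaults."""
    llm_api_key = env.get("LLM_API_KEY") or None
    return Settings(
        llm_base_url=env.get("LLM_BASE_URL", "https://openrouter.ai/api/v1"),
        llm_api_key=llm_api_key,
        llm_model_name=env.get("LLM_MODEL_NAME", "qwen/qwen3.5-35b-a3b"),
        embedding_model=env.get("EMBEDDING_MODEL", "qwen/qwen3-embedding-4b"),
        # Falls back to the LLM key when no separate embedding key is set.
        embedding_api_key=env.get("EMBEDDING_API_KEY") or llm_api_key,
        retrieval_n_results=int(env.get("RETRIEVAL_N_RESULTS", "10")),
        relevance_threshold=float(env.get("RELEVANCE_THRESHOLD", "7.0")),
        llm_timeout=float(env.get("LLM_TIMEOUT", "60.0")),
    )
```

Passing a plain dict instead of `os.environ` makes the fallback easy to check: `load_settings({"LLM_API_KEY": "k"}).embedding_api_key` evaluates to `"k"`.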

Production: Nginx Reverse Proxy

# Include nginx.conf in your site config
# Key settings:
# - client_max_body_size 350M   (allow large PDF uploads)
# - proxy_read_timeout 300s     (LLM calls can take minutes)

# Install nginx
sudo apt install nginx

# Copy config
sudo cp nginx.conf /etc/nginx/sites-available/legco
sudo ln -s /etc/nginx/sites-available/legco /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

Stopping

docker compose down

Updating

git pull
docker compose up -d --build

Architecture

User → Nginx (80) → Uvicorn (8000)
                         ├── FastAPI API (/api/v1/*)
                         └── Static Frontend (/*)
                              └── React 18 + Vite + Tailwind
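The static branch of the diagram only exists when the frontend has been built. A framework-free sketch of the lookup a catch-all route might perform is below. It is not the code in `main.py`: the function name, the `index.html` fallback for client-side routes, and the rule that unmatched `api/` paths are declined are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def resolve_static(dist_dir: Path, request_path: str) -> Optional[Path]:
    """Map a request path to a file under dist_dir, or None to decline (404)."""
    if not dist_dir.is_dir():
        return None  # No built frontend (dev mode): the catch-all stays inactive.

    rel = request_path.lstrip("/")
    if rel == "api" or rel.startswith("api/"):
        return None  # Unknown API paths should not fall through to the UI.

    root = dist_dir.resolve()
    candidate = (root / rel).resolve()
    # Reject paths that escape the dist directory (e.g. "../secret").
    if candidate != root and root not in candidate.parents:
        return None

    if candidate.is_file():
        return candidate

    # Anything else is treated as a client-side route and gets the app shell.
    index = root / "index.html"
    return index if index.is_file() else None
```

In a FastAPI app, a route such as `/{full_path:path}` registered after the API routers could call this helper and return the result with `FileResponse`, raising a 404 when it returns `None`.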

RAG Pipeline (Per-Sub-Question)

User Question
  → [LLM] Decompose into 2-5 sub-questions
  → [ChromaDB] Retrieve 10 chunks per sub-question (RETRIEVAL_N_RESULTS)
  → [LLM] Score all chunks against their own sub-question (single call)
  → [LLM] Generate markdown response per sub-question
  → SSE stream with per-sub-question sources
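The flow above can be sketched with the LLM and ChromaDB calls replaced by plain callables. Everything here is illustrative: the function names, the JSON payload shape, and the choice to pad missing scores with `0.0` are assumptions (the repository history only says that score count mismatches are tolerated via padding), and the real backend may structure these steps differently.

```python
import json
from typing import Callable, Iterator, List, Tuple


def pad_scores(scores: List[float], expected: int, fill: float = 0.0) -> List[float]:
    """Force the score list to `expected` entries: drop extras, pad gaps with `fill`."""
    return (list(scores) + [fill] * expected)[:expected]


def run_pipeline(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[str]],
    score: Callable[[List[Tuple[str, str]]], List[float]],
    generate: Callable[[str, List[str]], str],
    n_results: int = 10,
    threshold: float = 7.0,
) -> Iterator[str]:
    """Yield one SSE-framed event per sub-question."""
    sub_questions = decompose(question)
    chunks_per_sq = [retrieve(sq, n_results) for sq in sub_questions]

    # Flatten to (sub-question index, chunk) so one scoring call covers everything.
    pairs = [(i, chunk) for i, chunks in enumerate(chunks_per_sq) for chunk in chunks]
    raw_scores = score([(sub_questions[i], chunk) for i, chunk in pairs])
    scores = pad_scores(raw_scores, len(pairs))

    # Keep only chunks that reach the threshold for their own sub-question.
    kept: List[List[str]] = [[] for _ in sub_questions]
    for (i, chunk), value in zip(pairs, scores):
        if value >= threshold:
            kept[i].append(chunk)

    for i, sq in enumerate(sub_questions):
        payload = {"sub_question": sq, "answer": generate(sq, kept[i]), "sources": kept[i]}
        # SSE framing: a "data:" line followed by a blank line.
        yield f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
```

Passing the stages in as callables keeps the sketch testable without an LLM key or a vector store; stub functions returning fixed lists are enough to exercise the scoring and filtering logic.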

Notes

PDF upload limit: 300MB (nginx.conf accepts request bodies up to 350M)
Desktop only (not mobile-optimized)
No authentication (public demo)
All LLM calls routed through configurable base URL