Woody 05af86f5d2 fix(docker): set relative API base URL and pin numpy for ChromaDB compat 2026-04-27 19:15:16 +08:00
.plans feat(prompts): integrate filter_per_subq with PromptService, fix seed bugs, restructure UI 2026-04-27 11:14:27 +08:00
backend fix(docker): set relative API base URL and pin numpy for ChromaDB compat 2026-04-27 19:15:16 +08:00
frontend fix(relevance): tolerate LLM score count mismatches via padding instead of discarding 2026-04-27 14:31:18 +08:00
.env.txt init: project setup with AGENTS.md, test structure, and plan directory 2026-04-22 15:22:29 +08:00
.gitignore feat(db): Phase 3.1 — SQLite infrastructure (prompts.db + history.db) 2026-04-25 20:29:29 +08:00
AGENTS.md docs: update testing rules to integration-first approach 2026-04-27 10:55:43 +08:00
Dockerfile fix(docker): set relative API base URL and pin numpy for ChromaDB compat 2026-04-27 19:15:16 +08:00
README.md feat(deploy): add Dockerfile, compose, nginx config, and README 2026-04-27 17:17:53 +08:00
development_plan.md docs: update development plans with Phase 1 completion status 2026-04-23 13:27:52 +08:00
docker-compose.yml feat(deploy): add Dockerfile, compose, nginx config, and README 2026-04-27 17:17:53 +08:00
nginx.conf feat(deploy): add Dockerfile, compose, nginx config, and README 2026-04-27 17:17:53 +08:00

README.md

LegCo Reranker

RAG-powered document Q&A app. Upload PDFs, ask questions in Cantonese, get bullet-point answers with citations.

Quick Start (Dev)

# Backend
cd backend
cp .env.example .env    # edit .env with your LLM API key
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Frontend (in a second terminal, from the repo root)
cd frontend
npm install
npm run dev

Backend → http://localhost:8000 | Frontend → http://localhost:5173

Deploy with Docker

Prerequisites

  • Docker 24+ and Docker Compose v2
  • OpenRouter API key (or compatible LLM provider)

Setup

# 1. Configure environment
cp backend/.env.example backend/.env
# Edit backend/.env with your API keys and model names

# 2. Build and start
docker compose up -d --build

# 3. Check health
curl http://localhost:8000/health

The app is served at http://localhost:8000, which hosts both the API and the frontend UI.
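Containers can take a few seconds to start, so a single curl right after `docker compose up` may fail. A minimal sketch of a polling check follows, assuming only that `/health` answers with HTTP 200 once the app is ready. The function name `wait_for_health` and its parameters are hypothetical and not part of this repository.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url="http://localhost:8000/health", attempts=30, delay=2.0,
                    opener=urllib.request.urlopen, sleep=time.sleep):
    """Return True once `url` answers with HTTP 200, False after `attempts` tries.

    `opener` and `sleep` are injectable so the loop can be exercised
    without a running container (hypothetical helper, not project code).
    """
    for _ in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused or HTTP error: the app is probably still starting.
            pass
        sleep(delay)
    return False
```

Calling `wait_for_health()` after step 2 blocks for up to about a minute with the defaults shown and returns whether the endpoint became reachable.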

Volumes

| Volume | Purpose |
| --- | --- |
| chroma_data | ChromaDB vector store (persistent) |
| chunk_data | Extracted PDF page files |
| sqlite_data | Prompt templates and query history |

Environment Variables

All configurable via backend/.env:

| Variable | Default | Description |
| --- | --- | --- |
| LLM_BASE_URL | https://openrouter.ai/api/v1 | LLM API endpoint |
| LLM_API_KEY | | API key for LLM provider |
| LLM_MODEL_NAME | qwen/qwen3.5-35b-a3b | Chat model |
| EMBEDDING_MODEL | qwen/qwen3-embedding-4b | Embedding model |
| EMBEDDING_API_KEY | | API key for embeddings (falls back to LLM_API_KEY) |
| RETRIEVAL_N_RESULTS | 10 | Chunks per sub-question |
| RELEVANCE_THRESHOLD | 7.0 | Min relevance score (0-10) |
| LLM_TIMEOUT | 60.0 | LLM request timeout in seconds |
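To make the defaults and the `EMBEDDING_API_KEY` fallback concrete, here is a minimal sketch of how these variables could be read. The `Settings` dataclass and `load_settings` function are hypothetical illustrations; the backend's actual configuration code may be organized differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    llm_base_url: str
    llm_api_key: str
    llm_model_name: str
    embedding_model: str
    embedding_api_key: str
    retrieval_n_results: int
    relevance_threshold: float
    llm_timeout: float


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ).

    Defaults mirror the table above; this helper is an assumption,
    not the project's real loader.
    """
    env = os.environ if env is None else env
    llm_api_key = env.get("LLM_API_KEY", "")
    return Settings(
        llm_base_url=env.get("LLM_BASE_URL", "https://openrouter.ai/api/v1"),
        llm_api_key=llm_api_key,
        llm_model_name=env.get("LLM_MODEL_NAME", "qwen/qwen3.5-35b-a3b"),
        embedding_model=env.get("EMBEDDING_MODEL", "qwen/qwen3-embedding-4b"),
        # An unset or empty EMBEDDING_API_KEY falls back to LLM_API_KEY.
        embedding_api_key=env.get("EMBEDDING_API_KEY") or llm_api_key,
        retrieval_n_results=int(env.get("RETRIEVAL_N_RESULTS", "10")),
        relevance_threshold=float(env.get("RELEVANCE_THRESHOLD", "7.0")),
        llm_timeout=float(env.get("LLM_TIMEOUT", "60.0")),
    )
```

Passing a plain dict, for example `load_settings({"LLM_API_KEY": "sk-test"})`, makes it easy to check a configuration without touching the real environment.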

Production: Nginx Reverse Proxy

# Include nginx.conf in your site config
# Key settings:
# - client_max_body_size 350M   (allow large PDF uploads)
# - proxy_read_timeout 300s     (LLM calls can take minutes)

# Install nginx
sudo apt install nginx

# Copy config
sudo cp nginx.conf /etc/nginx/sites-available/legco
sudo ln -s /etc/nginx/sites-available/legco /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

Stopping

docker compose down

Updating

git pull
docker compose up -d --build

Architecture

User → Nginx (80) → Uvicorn (8000)
                         ├── FastAPI API (/api/v1/*)
                         └── Static Frontend (/*)
                              └── React 18 + Vite + Tailwind

RAG Pipeline (Per-Sub-Question)

User Question
  → [LLM] Decompose into 2-5 sub-questions
  → [ChromaDB] Retrieve 10 chunks per sub-question (RETRIEVAL_N_RESULTS)
  → [LLM] Score all chunks against their own sub-question (single call)
  → [LLM] Generate markdown response per sub-question
  → SSE stream with per-sub-question sources
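The flow above can be sketched as a small generator. This is an illustration under stated assumptions, not the backend's implementation: the four callables (`decompose`, `retrieve`, `score_all`, `generate`) are hypothetical stand-ins for the LLM and ChromaDB calls, and keeping chunks whose score is at least `threshold` is an assumed reading of "min relevance score".

```python
from typing import Callable, Dict, Iterator, List


def run_pipeline(
    question: str,
    decompose: Callable[[str], List[str]],
    retrieve: Callable[[str, int], List[str]],
    score_all: Callable[[List[List[str]]], List[List[float]]],
    generate: Callable[[str, List[str]], str],
    n_results: int = 10,
    threshold: float = 7.0,
) -> Iterator[Dict[str, object]]:
    """Yield one result per sub-question, in order (hypothetical sketch)."""
    # 1. Decompose the question into sub-questions.
    sub_questions = decompose(question)
    # 2. Retrieve candidate chunks for each sub-question.
    candidates = [retrieve(sq, n_results) for sq in sub_questions]
    # 3. Score every chunk against its own sub-question in a single call.
    scores = score_all(candidates)
    for sq, chunks, chunk_scores in zip(sub_questions, candidates, scores):
        # Keep only chunks that meet the relevance threshold (assumed >=).
        kept = [c for c, s in zip(chunks, chunk_scores) if s >= threshold]
        # 4. Generate an answer per sub-question; each yielded dict could be
        #    serialized as one SSE event together with its sources.
        yield {"sub_question": sq, "answer": generate(sq, kept), "sources": kept}
```

Because the steps are injected as plain callables, the control flow can be tried with stub functions before wiring in a real LLM client or vector store.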

Notes

  • PDF upload limit: 300 MB
  • Desktop only (not mobile-optimized)
  • No authentication (public demo)
  • All LLM calls routed through configurable base URL