# RAG Video Q&A: Project Knowledge Base

**Generated:** 2026-04-22
**Source:** development_plan.md
**Status:** Greenfield (no code yet)

---

## OVERVIEW

RAG-powered Video Q&A web app. Phase 1: text → ChromaDB retrieval → bullet-point answer. Phase 2: video upload → real-time ASR → auto/manual RAG query. FastAPI backend + React 18 (Vite) frontend.

## STRUCTURE

```
app/
├── backend/                 # FastAPI (Python)
│   ├── app/
│   │   ├── main.py
│   │   ├── routers/         # query.py, ingest.py, video.py, ws_asr.py
│   │   ├── services/        # rag.py, llm_client.py, asr_client.py, video_service.py
│   │   ├── models/          # Pydantic schemas
│   │   ├── core/            # config.py, database.py
│   │   └── utils/           # chunking.py, metadata_extraction.py
│   ├── uploads/             # video storage (max 300MB)
│   ├── requirements.txt
│   └── .env.example
├── frontend/                # React 18 + TS + Vite
│   ├── src/
│   │   ├── components/      # shadcn/ui + custom
│   │   ├── pages/
│   │   ├── lib/
│   │   │   └── api.ts       # API client (TanStack Query)
│   │   └── App.tsx
│   ├── package.json
│   └── vite.config.ts
├── chroma_db/               # Persistent vector store
├── Dockerfile
├── docker-compose.yml
├── nginx.conf
└── deploy.sh
```

## WHERE TO LOOK

| Task | Location | Notes |
|------|----------|-------|
| API routes | `backend/app/routers/` | Versioned `/api/v1/...` |
| Business logic | `backend/app/services/` | RAG, LLM, ASR, video |
| Schemas | `backend/app/models/` | Pydantic request/response |
| Config | `backend/app/core/config.py` | `.env` driven |
| DB init | `backend/app/core/database.py` | ChromaDB persistent |
| Frontend API | `frontend/src/lib/api.ts` | TanStack Query |
| UI components | `frontend/src/components/` | shadcn/ui + Tailwind |

## CODE MAP

*Greenfield: no code yet. See `development_plan.md` for the full specification.*

## CONVENTIONS

- **Backend**: `snake_case` files; routers thin, services thick; `.env` for all LLM/ASR config
- **Frontend**: PascalCase components; `lib/api.ts` single API client; TanStack Query for server state
- **API**: Path versioning `/api/v1/`; WebSocket at `/ws/asr/{video_id}`
- **RAG**: Strict prompt that answers ONLY from retrieved context; bullet-point format
- **Metadata**: Every doc chunk must have `filename`, `upload_date`, `content_summary`
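
The RAG and Metadata conventions above can be sketched together as a prompt builder. This is a minimal, standard-library-only illustration and not the project's implementation: `RetrievedChunk`, `build_strict_prompt`, and the exact prompt wording are hypothetical names and text chosen for the example.

```python
from dataclasses import dataclass

# Metadata keys every chunk must carry, per the conventions above.
REQUIRED_METADATA = ("filename", "upload_date", "content_summary")


@dataclass(frozen=True)
class RetrievedChunk:
    """Hypothetical container for one retrieved chunk and its metadata."""
    text: str
    metadata: dict


def build_strict_prompt(question: str, chunks: list[RetrievedChunk]) -> str:
    """Assemble a prompt that confines the model to the retrieved context."""
    context_lines = []
    for index, chunk in enumerate(chunks, start=1):
        # Reject chunks that cannot be attributed to a source.
        missing = [key for key in REQUIRED_METADATA if key not in chunk.metadata]
        if missing:
            raise ValueError(f"chunk {index} is missing metadata: {missing}")
        source = f"{chunk.metadata['filename']} ({chunk.metadata['upload_date']})"
        context_lines.append(f"[{index}] source: {source}\n{chunk.text}")

    context = "\n\n".join(context_lines) if context_lines else "(no context retrieved)"
    return (
        "Answer ONLY from the context below. "
        "If the context does not contain the answer, say so.\n"
        "Format the answer as bullet points and cite the source of each point.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Failing on missing metadata at prompt-build time is one possible design choice; validating at ingestion time instead would catch the problem earlier.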

## ANTI-PATTERNS (THIS PROJECT)

- Hardcoding LLM URLs/keys: always read them from `.env`
- Business logic in routers: it belongs in `services/`
- Non-persistent ChromaDB: must use the `chroma_db/` directory
- LLM answers that go beyond the retrieved context: the strict RAG prompt is enforced
- Plain-text responses: always bullet points with source metadata
- Missing document metadata: breaks source attribution
- Adding authentication: public demo only
- Mobile-first design: desktop only at this stage
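
To illustrate the first anti-pattern, here is a hedged sketch of reading LLM settings from the environment instead of hardcoding them. The variable names `LLM_BASE_URL`, `LLM_API_KEY`, and `LLM_MODEL`, and the `LLMSettings` class, are assumptions made for this example; the plan does not name them.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical settings object; the field names are illustrative."""
    base_url: str
    api_key: str
    model: str


def load_llm_settings(env: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read LLM settings from the environment and fail fast if any are missing."""
    names = ("LLM_BASE_URL", "LLM_API_KEY", "LLM_MODEL")  # assumed variable names
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return LLMSettings(
        base_url=env["LLM_BASE_URL"],
        api_key=env["LLM_API_KEY"],
        model=env["LLM_MODEL"],
    )
```

The values in `.env` still have to reach the process environment, for example through the `env_file` option in `docker-compose.yml`. With this shape, moving between providers would only change `.env` values, not code.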

## UNIQUE STYLES

- **Dual ASR trigger**: automatic (on transcript update) + manual "Ask from Video" button
- **Layout**: Top-Left video player | Top-Right transcript + input | Bottom RAG response
- **Provider switching**: same codebase runs dev (OpenRouter/Alibaba Cloud) and prod (local vLLM)
- **Video limit**: 300MB max, MP4 + common formats
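
The video limit above could be enforced with a small check like the following sketch. The allow-list beyond `.mp4` is an assumption (the plan only says "common formats"), as is reading 300MB as 300 MiB; `validate_video` is a hypothetical helper name.

```python
from pathlib import PurePath

MAX_VIDEO_BYTES = 300 * 1024 * 1024  # 300MB limit, read here as 300 MiB (assumption)
# Assumed allow-list: the plan only names MP4 explicitly.
ALLOWED_EXTENSIONS = {".mp4", ".mov", ".webm", ".mkv"}


def validate_video(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the upload breaks the size or format rule."""
    extension = PurePath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported video format: {extension or '(none)'}")
    if size_bytes <= 0:
        raise ValueError("empty upload")
    if size_bytes > MAX_VIDEO_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; limit is {MAX_VIDEO_BYTES}")
```

Checking the extension is only a first filter; it does not prove the file is a playable video.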

## TESTING

**Backend test directory**: `backend/app/test/`

**Naming convention** (pytest, flat structure, phase-prefixed):

```
test_phase<N>_<module_or_feature>.py
```

End-to-end tests use `test_integration_phase<N>.py` instead.

**Examples**:

- `test_phase1_ingest.py`: Document upload & ChromaDB ingestion
- `test_phase1_query.py`: RAG query endpoint
- `test_phase1_rag_service.py`: RAG retrieval + strict prompt logic
- `test_phase1_llm_client.py`: LLM client (mocked provider)
- `test_phase1_chunking.py`: Document chunking utils
- `test_phase1_metadata.py`: Metadata extraction
- `test_phase2_video_upload.py`: Video upload (300MB limit, format validation)
- `test_phase2_asr_client.py`: ASR transcription client
- `test_phase2_ws_asr.py`: WebSocket audio streaming
- `test_phase2_query_from_video.py`: Auto/manual trigger from transcript
- `test_integration_phase1.py`: End-to-end text → RAG → answer
- `test_integration_phase2.py`: End-to-end video → ASR → RAG → answer
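
The naming convention can be checked mechanically. The sketch below is one possible reading of it, expressed as two regular expressions: one for phase-prefixed test files and one for the integration variant seen in the examples. The helper name `phase_of` is hypothetical.

```python
import re

# test_phase<N>_<module_or_feature>.py, plus the integration variant from the examples.
UNIT_TEST_NAME = re.compile(r"test_phase(\d+)_[a-z0-9]+(?:_[a-z0-9]+)*\.py")
INTEGRATION_TEST_NAME = re.compile(r"test_integration_phase(\d+)\.py")


def phase_of(filename: str) -> int:
    """Return the phase number encoded in a test filename, or raise ValueError."""
    match = UNIT_TEST_NAME.fullmatch(filename) or INTEGRATION_TEST_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a valid test filename: {filename}")
    return int(match.group(1))
```

A check like this could run in CI over `backend/app/test/` to catch misnamed files early.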

**Rules**:

- Use `pytest` + `pytest-asyncio` for async tests
- Mock all external LLM/ASR calls (do not hit live APIs in tests)
- Use `tmp_path` fixture for ChromaDB test instances
- Each test file must have a module-level docstring describing coverage

## COMMANDS

```bash
# Dev: run each command in its own terminal, from the project root
(cd backend && uvicorn app.main:app --reload --port 8000)   # backend
(cd frontend && npm run dev)                                 # frontend

# Test (backend)
(cd backend && pytest app/test/ -v)

# Prod
docker-compose up -d
./deploy.sh
```

## PLAN STORAGE

**All development plans** (including sub-plans, debug plans, and task breakdowns) **must be stored in `.plans/`**.

```
.plans/
├── development_plan.md        # Main development plan (root-level)
├── phase1_backend_plan.md     # Phase 1 backend tasks
├── phase1_frontend_plan.md    # Phase 1 frontend tasks
├── phase2_backend_plan.md     # Phase 2 backend tasks
├── phase2_frontend_plan.md    # Phase 2 frontend tasks
├── debug_<date>_<issue>.md    # Debug/diagnosis logs
└── _template.md               # Plan template (optional)
```

**Rules**:

- Name format: `<purpose>_<optional_date>.md` (snake_case)
- Use `debug_` prefix for troubleshooting logs
- Root `development_plan.md` stays at root as canonical source
- Sub-plans reference root plan, never duplicate it

## NOTES

- No routing library specified: a single-page app is likely sufficient
- No client state library specified: `useState`/`useReducer` + TanStack Query
- WebSocket client not specified: may need to expand `lib/api.ts`
- shadcn/ui components are copied into the repo, not imported as an npm package
- Alibaba Cloud reference: https://modelstudio.console.alibabacloud.com/ap-southeast-1?switchAgent=101503&tab=doc&productCode=p_efm&switchUserType=3#/doc/?type=model&url=2989727