legco_ai_assistant

Commit Graph

Author	SHA1	Message	Date
Woody	09b5ea7d64	refactor: remove dead _merge_stash, add Phase 3 YouTube proxy plan - Remove _merge_stash (dead code since delta-based ASR refactor) - Replace TestMergeStash with TestTextFieldFormatting (53/53 Phase 2 tests pass) - Mark phase2_enhancement_use_text_field as Complete - Add Phase 3 YouTube live stream proxy implementation plan - README updates	2026-05-09 15:14:01 +08:00
Woody	78d1f8cc91	feat: delta-based ASR transcript — use text field, utterance boundaries, stash on pause Replace full_text responses with character-level deltas computed from DashScope's monotonically-growing 'text' field. Stash-only events (empty text) are skipped; trailing stash chars sent alongside deltas and appended on pause to complete final sentences. Backend: - Delta = text[len(prev_text):] — simple suffix diff, no merge logic - Track item_id for utterance boundaries, prepend space separator - Send stash alongside delta for frontend pause handler Frontend: - Accumulate deltas locally (transcriptRef += msg.delta) - Store lastStashRef from each message - On pause: append stash to text, fire onFinalTranscript Plan: .plans/phase2_enhancement_delta_sse.md updated to Complete	2026-05-07 11:26:19 +08:00
Woody	cb0ac07786	fix: text accumulation — stashes are sliding windows, merge via overlap detection DashScope stashes are ~7-char rolling windows, not cumulative. Each partial event replaces the previous. Completed events rarely sent. This caused text to jump/replace during streaming and disappear on pause. Backend: - Add _merge_stash() — finds overlapping suffix between successive stashes and appends only new characters, reconstructing full utterance from partials - format_transcription_event returns raw stash for read_events to merge - read_events maintains partial_buffer via _merge_stash, clears on completed - Guard against empty/whitespace-only stashes Frontend: - transcriptRef + onFinalTranscriptRef avoid stale closures in pause handler - stopStreaming fires onFinalTranscript(currentText) before clearing partial - Removed blind setPartialTranscript('') that erased text on pause Tests: 16/16 ws_protocol tests pass, frontend tests unchanged Plan: Updated phase2_implementation_plan.md to Complete with 11-bug log	2026-05-06 20:06:39 +08:00
Woody	fcb9ec1f6c	fix: Phase 2 ASR pipeline — 9 bugs resolved, Full Transcript works end-to-end - Vite proxy: forward /api and /ws to backend port 8000 - WebSocket URL: use backend host, not Vite HMR port - LTTPage: callback ref replaces useRef (video element always null before) - ws_asr: pass DashScope API key to OmniRealtimeConversation - asr_client: fix data_url MIME type (audio/wav), omit extra_body when auto - useFullTranscript: use absolute URL prefix for fetch - QueryInput: add value prop for external Full Transcript injection - QueryInput: fix displayValue || logic (partialText '' overrode question) - ffmpeg: install static binary for audio extraction - Integration tests: 7 tests (upload→transcribe flow) - Acceptance tests: real DashScope tests (skippable) - Structured logging: ws_asr.py + video.py	2026-05-06 18:26:17 +08:00
Woody	a4e067822b	feat: Phase 2.3 ASR proxy + full transcript and 2.4 frontend hooks - Backend: DashScope WebSocket proxy (/ws/asr/{video_id}), DashScopeCallback sync-to-async bridge, ffmpeg audio extraction, POST /video/{id}/transcribe - Frontend: useVideoASR hook (auto on play), useFullTranscript hook, QueryInput partialText prop, VideoUploadResponse types, uploadVideo API - Tests: 41 backend + 26 frontend = 67 new tests, all passing	2026-05-06 13:41:24 +08:00
Woody	9934749d2b	feat: Phase 2.1 config + infrastructure and 2.2 video upload backend - Add DashScope ASR and video upload config fields to Settings - Create Pydantic models (video.py, asr.py) - Create VideoService with validation, save, serve, delete - Create ASR client stub with float32_to_s16le utility - Implement POST /api/v1/video/upload with streaming validation - Implement GET /api/v1/video/{video_id} with FileResponse - Create WebSocket ASR endpoint stub - Register new routers in main.py - Update .env.example and requirements.txt - Add reference examples for DashScope integration - 8 tests passing (3 config + 5 video upload)	2026-05-06 13:08:19 +08:00
Woody	df62283f58	feat: inject Pydantic JSON schema into Deepseek prompt (Phase 6) Follows Deepseek JSON Output guide: the prompt now includes the word 'json' and a format example derived from the Pydantic model schema. Added _pydantic_to_json_instruction() helper that builds a human-readable schema description with EXAMPLE JSON OUTPUT. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 15:17:24 +08:00
Woody	226f4ed700	test: update integration mocks for dual-client architecture (Phase 6) Added complete_structured() to mock classes, split response lists between LLMClientDP (decompose) and LLMClient (filter+generate), and patched both clients in all integration tests. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:59:23 +08:00
Woody	849beb4d4e	feat: add LLMClientDP for Deepseek decompose (Phase 6) Uses Deepseek's json_object response_format (not json_schema, which Deepseek does not support). Always disables thinking mode. Includes unit tests (12) and acceptance tests (5). Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:58:53 +08:00
Woody	41f59b396f	feat: track highlight generation prompt, response, and timing in history (Phase 5.5) - Add 3 columns to query_history: highlight_prompt, highlight_response, highlight_time_ms - HistoryService.update_highlights() updates existing row after batch LLM call - ChunkHighlightService measures timing, captures prompt and structured JSON response - SSE completed event includes history_id for frontend to pass back - Frontend captures historyId, passes as ?history_id= query param in batch POST - Highlight time tracked separately (excluded from total_time_ms) - All 153 tests pass (108 backend + 45 frontend)	2026-04-29 11:18:21 +08:00
Woody	a56f8f69e2	feat: add highlight batch and GET endpoints (Phase 5.4.5) - POST /api/v1/v2/highlights/batch: compute and cache highlights for cited chunks - GET /api/v1/v2/highlights: serve cached highlighted HTML pages - chunks.py router registered in main.py - Dynamic DB path computation (prompts.db -> highlights.db), no Settings changes - 7 endpoint tests: POST 200/422, GET 200/404, mock service verification	2026-04-29 09:26:50 +08:00
Woody	c6d4a38013	feat: add LLM-based batch highlight service and HTML rendering (Phase 5.4.4) - ChunkHighlightService.compute_highlights_batch(): single LLM call across all cited chunks, grouped by sub-question, with structured output - render_highlight_html(): self-contained HTML page with yellow-highlighted relevant sentences, LLM reason annotations, and View Original PDF footer - Per-target error isolation, ChromaDB miss handling, graceful degradation - 14 tests: 7 batch service + 7 HTML rendering	2026-04-29 09:26:33 +08:00
Woody	bdbc8ea1a0	feat: add SQLite highlight cache service (Phase 5.4.3) - highlight_cache.py: HighlightCache class with get/set_highlight and compute_cache_key (sha256 hash of document_id|chunk_index|sub_question) - INSERT OR REPLACE semantics, idempotent table creation - 13 tests covering round-trip, overwrite, missing keys, determinism	2026-04-29 09:26:20 +08:00
Woody	b11d31e2d1	feat: add sentence splitter and highlight data models (Phase 5.4.1-5.4.2) - sentence_splitter.py: regex-based sentence splitting for English + Chinese punctuation - highlight.py: 6 Pydantic models (ChunkHighlightTarget, HighlightBatchRequest, RelevantSentence, ChunkHighlights, HighlightBatchResult, HighlightBatchResponse) - 43 tests: 13 sentence splitter + 30 model validation	2026-04-29 09:26:06 +08:00
Woody	25b26c9b48	feat(ingest): generate per-chunk PDFs for DOCX/TXT documents (Phase 5.3) DOCX and TXT ingestion now produces chunk_file_path + per-chunk PDF files matching the PDF ingestion flow. Uses reportlab to render chunk text as simple PDFs with automatic text wrapping. - Add reportlab==4.2.5 to requirements.txt - New utils/text_to_pdf.py: generate_text_pdf() renders chunk text as PDF - Ingest router DOCX/TXT branches: generate chunk_N.pdf per chunk, store in chunk_file_paths - Graceful degradation: chunk_file_path stays None if PDF generation fails - Update test_phase1_ingest_page_aware.py assertions: DOCX chunks now HAVE chunk_file_path - New test_phase5_docx_pdf_generation.py: 5 tests (DOCX PDF gen, TXT PDF gen, PDF regression, file count, graceful degradation) - 361 backend tests pass (4 pre-existing embedding failures unrelated) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 17:32:22 +08:00
Woody	f2115ae563	feat: structured LLM output for decompose + citation fuzzy matching (Phase 5) Phase 5.1 — Structured LLM output for query decomposition: - Add SubQuestions Pydantic model with sub_question, keywords, rationale - Add LLMClient.complete_structured() using langchain with_structured_output - Update QueryDecomposer with structured output path + legacy json.loads fallback - Update SQLite seed templates: add subq+citation labeling requirement - Add tests: structured output, subquestions model validation, logging Phase 5.2 — Citation format alignment and fallback links: - Add document_id to SourceMetadata (backend + frontend types) - Rewrite citationParser.ts with fuzzy matching and fallback document links - Add RAGDatabasePage auto-expand from ?document= URL param - Tighten generate_per_subq seed prompt: 'Copy exact bracket labels shown' - Add citation parser tests for fuzzy match and fallback link scenarios - Defer: DOCX/TXT PDF generation → Phase 5.3 (fallback links sufficient)	2026-04-28 15:39:17 +08:00
Woody	23796d6a0c	feat(prompts): add JSON export/import for profile prompt configurations	2026-04-27 19:44:35 +08:00
Woody	a7a22f1494	fix(relevance): tolerate LLM score count mismatches via padding instead of discarding The per-sub-question filter was all-or-nothing: if the LLM returned 9 scores for 10 chunks (common with qwen3.5-35b), every chunk was discarded and the user got 'no relevant information found'. Now: fewer scores → pad with 0.0; more scores → truncate. Changed from error→warning since this is recoverable. Also improve LTT page UI: sources collapsed by default in per-sub-q sections, and the 'Your question' text now shows the full question instead of being truncated. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-27 14:31:18 +08:00
Woody	2656f9ca08	refactor(test): rewrite tests to comply with integration-first rules Replace mocked DB/internal-services with real ChromaDB/SQLite via tmp_path. Only mock truly external APIs (LLM, embedding for deterministic vectors). 13 test files rewritten (314 pass, 0 fail): - Route tests: use TestClient + real ChromaDB, seed test data - Service tests: use real PersistentClient/SQLite instances - Pipeline tests: TestClient hits SSE /query endpoint, verify history - Converted unittest.TestCase to pytest where applicable Plus: fix metadata.py to filter None values from ChromaDB metadata (pre-existing bug caught by real-DB ingestion tests)	2026-04-27 11:46:58 +08:00
Woody	3b868a0133	feat(prompts): integrate filter_per_subq with PromptService, fix seed bugs, restructure UI Break the hardcoded per-sub-q filter prompt into 3 editable PromptService templates (filter_intro, filter_section, filter_outro) with placeholders for the for-loop iteration pattern. Refactor RelevanceFilter._build_per_subq_prompt() to compose them at runtime, falling back to built-in defaults when PromptService is unavailable. Fix two latent bugs from Package 4: - generate_per_subq was called by rag.py but never added to _VALID_STEPS or DB seed (would ValueError at runtime) - _SEED_GENERATE placeholder mismatch: flat generate_response() expects {question}/{context} but Package 4 changed it to {context_sections}. Restored flat template; generate_per_subq now holds {context_sections}. Add database backfill migration in seed_default_profiles() to INSERT OR IGNORE missing steps into existing profile rows, ensuring all 7 steps exist on restart. Restructure System Prompts UI: remove unused flat filter/generate steps, replace with Step 2.1-2.3 (filter_intro/section/outro) and Step 3 (generate_per_subq). Update PlaceholderDocs with {context_sections}, {subq_idx}, {subq_question}. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-27 11:14:27 +08:00
Woody	3f50f81bfe	test(backend): extend existing tests for per-sub-q methods and templates Add 6 tests for retrieve_per_subquestion and generate_response_per_subquestion to Phase 1 rag service tests. Add 4 tests for filter_per_subquestion to Phase 1 relevance filter tests. Add 2 tests for new {context_sections} generate template to Phase 3 prompt injection tests. Add TestPerSubQPipelineHistory class with 3 per-sub-q pipeline simulation tests to Phase 3 integration tests. Add generate_per_subq template seed to conftest mock_prompt_service fixture. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:29:27 +08:00
Woody	201bddecf0	test(backend): add Phase 4 integration and acceptance tests 5 integration tests simulating full per-sub-question pipeline with mocked services covering 2-sub-q, empty decomposition fallback, single sub-q, all-filtered, and partial retrieval. 2 acceptance tests (manual run) for real LLM verification of per-sub-question organized answers with grouped sources and ## Sub-question headers. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:29:09 +08:00
Woody	dd98fa0b65	test(backend): add Phase 4 unit tests for generate, format, history, prompts 9 tests for generate_response_per_subquestion() and answer format validation covering multi-sub-q, empty, prompt construction, and markdown format. 8 tests for new history XML/JSON formats (sources as list-of-lists, <sub_q> wrappers in XML) and new {context_sections} prompt template. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:28:58 +08:00
Woody	ab6ec28de6	test(backend): add Phase 4 unit tests for retrieval and filtering 10 tests for retrieve_per_subquestion() covering multi-sub-q, empty, single, call counting, n_results passthrough, and empty results. 14 tests for filter_per_subquestion() covering basic filtering, threshold behavior, JSON parsing edge cases, markdown extraction, LLM exceptions, and format helpers. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:28:45 +08:00
Woody	0ecae11bf8	feat(db): update history schema and generate prompt template for Package 4 Add chunks_retrieved_per_subq_count and chunks_filtered_per_subq_count columns to query_history table with safe ALTER TABLE migration. Replace generate template {question}/{context} placeholders with {context_sections} for per-sub-question organized context sections. Update Phase 3 test assertions to match new template and schema shapes. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:28:28 +08:00
Woody	475306f2b1	feat(history): Phase 3.5 — Query History backend (service, API, timing, XML capture)	2026-04-25 22:59:53 +08:00
Woody	e49a68b0bd	feat(prompts): Phase 3.2 — Prompt Backend (CRUD service, REST API, 33 tests) - PromptService (services/prompt_service.py): full CRUD for 3 profiles A/B/C with seed template reset, validation, and sqlite3.Row access - REST API (routers/prompts.py): 6 endpoints on /api/v1/prompts - Pydantic models (models/prompts.py): 6 schemas - DI wiring (dependencies.py): get_prompt_service() - App registration (main.py): prompts router - Mock fixture (conftest.py): mock_prompt_service - Tests: test_phase3_prompt_service.py (22) + test_phase3_prompts_router.py (11) - 162/166 total pass, 4 skipped, 0 fail	2026-04-25 21:11:17 +08:00
Woody	f4b404f27d	feat(db): Phase 3.1 — SQLite infrastructure (prompts.db + history.db) - Add sqlite_db.py with dual-DB connection factories (WAL mode, foreign keys) - init_prompts_db() creates system_prompt_profiles + system_prompts tables - init_history_db() creates query_history table + created_at index - seed_default_profiles() inserts 3 profiles (A/B/C) x 3 steps each - All 3 profiles start with identical seed templates; Profile A active - Add prompts_db_path + history_db_path to config (./data/ default) - Startup init in main.py creates data/ dir, inits both DBs, seeds profiles - Add PROMPTS_DB_PATH + HISTORY_DB_PATH to .env.example - Add data/ to .gitignore - 17 new tests in test_phase3_sqlite_db.py (all passing)	2026-04-25 20:29:29 +08:00
Woody	51640201f3	test(backend): update query tests for sub-question generation (sub-phase 2.3) Update prompt assertion in decomposer test and field assertions in query endpoint tests to match extracted_questions rename. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 16:24:10 +08:00
Woody	d49756f374	feat: add chunk PDF serving endpoint and frontend clickable source links (1.5.6) - Add page_number and chunk_file_path to SourceMetadata model and query router - Add GET /chunks/{file_path}/pdf endpoint with path traversal protection - Add View PDF links in ResponsePanel source cards and ChunkList component - Update TypeScript types and API helper for chunk PDF URLs - Add backend tests (5) and frontend ChunkList tests (7) - Update enhancement plan: all 3 features complete	2026-04-24 11:49:39 +08:00
Woody	b2dd385443	feat(backend): refactor ingest pipeline for page-aware chunking with PDF generation PDF uploads now use parse_pdf_by_page() -> chunk_pages() -> extract page PDFs -> enhanced metadata with page_number, chunk_file_path, and document_id. Same-filename replacement deletes old chunks and PDFs before re-ingest. DOCX/TXT keep original flat flow with document_id added. RAGService.ingest_document() accepts optional document_id parameter. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:53:17 +08:00
Woody	8c84062996	feat(backend): add PDF page extractor and chunk PDF storage config New pdf_extractor.py with extract_page_as_pdf() and extract_pages_as_pdf() for extracting individual PDF pages as separate files. Adds document_chunk_path setting to config and document_chunk/ to .gitignore. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:52:57 +08:00
Woody	b97264c66a	feat(backend): add page_number, chunk_file_path, document_id to chunk metadata Enhance extract_metadata() with three new optional fields for page-aware chunking support. Validates list length mismatches. Fully backward compatible — existing callers unaffected. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:30:40 +08:00
Woody	0995c685fa	feat(backend): add page-aware chunking with adjacent-page overlap Add chunk_pages() to TokenChunkingStrategy: one chunk per page with 200-token overlap from adjacent pages. Uses original page text for main content, decoded tokens for overlap. Never splits a page regardless of size. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:30:18 +08:00
Woody	f4fa577fb0	feat(backend): add page-aware PDF parsing with per-page text extraction Add parse_pdf_by_page() that returns List[Tuple[int, str]] with 1-indexed page numbers. Pages with no extractable text are skipped. Follows same error handling as existing parse_pdf(). Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:30:04 +08:00
Woody	f21085b3df	feat(backend): add documents CRUD endpoints and tests Add 4 REST endpoints for RAG database management: GET /documents, GET /documents/{id}/chunks, DELETE /documents/{id}, DELETE /chunks/{id}. Register documents router in main.py. 8 unit tests covering all CRUD operations. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 19:02:28 +08:00
Woody	33b960f786	fix(backend): extract JSON from markdown code blocks in LLM responses The LLM (Qwen3.5 via OpenRouter) returns JSON wrapped in markdown code blocks: ```json ["project manager", "limits", ...] ``` But the code was trying to parse this directly with json.loads(), causing: - QueryDecomposer to return empty keywords - RelevanceFilter to fail with "Expecting value: line 1 column 1" Changes: - Added _extract_json_from_markdown() helper function to both modules - Strips markdown code block markers (```json and ```) before JSON parsing - Added unit tests for markdown code block handling Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 16:28:07 +08:00
Woody	be5e75e67c	test(backend): update unit tests for LLM monitoring changes - Fixed MockLLMClient to accept step_name parameter - Updated test mocks for OpenAI SDK structure Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 14:52:41 +08:00
Woody	351950f512	test(backend): update Phase 1 test suite Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 13:27:40 +08:00
Woody	7493b3aaf6	feat: Phase 1.4 acceptance tests, error handling, and polish - Implement acceptance tests for ingest (real ChromaDB) and query (real LLM) - Full 3-step RAG pipeline verified: decompose → retrieve → filter → generate - Add logging to ingest and query routers - Improve error handling: empty doc detection, proper HTTPException re-raising - Add .txt file support to ingest endpoint - Fix query router: strip distance from retrieve tuples before relevance filter - Update plan: Phase 1 backend complete (all acceptance criteria met) - Tests: 41 unit passed, 5 acceptance passed (real OpenRouter calls)	2026-04-22 17:45:50 +08:00
Woody	181f4eca5b	feat: Phase 1.3 query pipeline with decomposition, relevance filter, and response - Add QueryDecomposer: extracts keywords from question via LLM JSON response - Add RelevanceFilter: batch scores chunks 0-10, filters by threshold - Add POST /api/v1/query endpoint with full 3-step pipeline: 1. QueryDecomposer.decompose() → keywords 2. RAGService.retrieve() → chunks from ChromaDB 3. RelevanceFilter.filter() → score and filter chunks 4. RAGService.generate_response() → bullet-point answer - Fix SourceMetadata.upload_date type from datetime to str for flexibility - Test-first: 13 new tests pass (5 decomposer, 5 relevance filter, 3 query endpoint) - All Phase 1 tests: 41 passed, 2 skipped	2026-04-22 17:19:21 +08:00
Woody	d94abaac77	feat: Phase 1.2 ingestion pipeline with chunking and metadata - Add document parsers (DOCX, PDF) with lazy imports - Add TokenChunkingStrategy with ABC for future replacement - Add metadata extraction (filename, upload_date, content_summary) - Add RAGService for ChromaDB ingestion/retrieval/response generation - Add POST /api/v1/ingest endpoint with file validation - Test-first: 20 passed, 2 skipped (python-docx not installed)	2026-04-22 16:49:52 +08:00
Woody	3712397d64	feat: Phase 1.1 project setup with config, database, and models - Add requirements.txt with all dependencies - Add .env.example with required environment variables - Add Pydantic Settings (config.py) with .env loading - Add ChromaDB persistent client (database.py) - Add Pydantic schemas (ingest.py) for request/response - Add FastAPI main.py with CORS middleware - Add package __init__.py files - Add tests: test_phase1_config.py, test_phase1_database.py - All 5 tests pass	2026-04-22 16:13:52 +08:00
Woody	be48b1d8c7	docs: add sub-phase development rules and acceptance test structure	2026-04-22 15:27:31 +08:00
Woody	3c2d647943	init: project setup with AGENTS.md, test structure, and plan directory	2026-04-22 15:22:29 +08:00

45 Commits
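
Note on commit 78d1f8cc91 (delta-based ASR transcript): the message describes computing each delta as text[len(prev_text):], skipping stash-only events, prepending a space at utterance boundaries, and appending the last stash on pause. A minimal Python sketch of that flow follows. It is not the repository's code: DeltaTracker, update, and accumulate are hypothetical names, the event tuples are made-up stand-ins for DashScope messages, and dropping events that carry no new characters is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class DeltaTracker:
    """Hypothetical backend-side state: last text seen and current utterance id."""
    prev_text: str = ""
    item_id: Optional[str] = None

    def update(self, item_id: str, text: str, stash: str = "") -> Optional[dict]:
        # Stash-only events (empty text) are skipped, as the commit describes.
        if not text:
            return None
        prefix = ""
        if item_id != self.item_id:
            # New utterance: reset the baseline; add a space separator
            # unless this is the very first utterance.
            if self.item_id is not None:
                prefix = " "
            self.item_id = item_id
            self.prev_text = ""
        # Simple suffix diff: text is assumed to grow monotonically.
        delta = text[len(self.prev_text):]
        self.prev_text = text
        if not delta:
            # Assumption: nothing new to send, so drop the event.
            return None
        return {"delta": prefix + delta, "stash": stash}


def accumulate(events: Iterable[Tuple[str, str, str]]) -> str:
    """Mimics the frontend: add up deltas, then append the last stash on pause."""
    tracker = DeltaTracker()
    transcript = ""
    last_stash = ""
    for item_id, text, stash in events:
        msg = tracker.update(item_id, text, stash)
        if msg is None:
            continue
        transcript += msg["delta"]
        last_stash = msg["stash"]
    # On pause: append the trailing stash to complete the final sentence.
    return transcript + last_stash


if __name__ == "__main__":
    demo = [
        ("a", "", "he"),      # stash-only event, skipped
        ("a", "hel", "lo"),
        ("a", "hello", ""),
        ("b", "wor", "ld"),   # new utterance, space separator added
    ]
    print(accumulate(demo))
```

The suffix diff only holds while the text field never shrinks or rewrites earlier characters; if a provider revised earlier text, a different merge strategy would be needed.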