legco_ai_assistant

Commit Graph

Author	SHA1	Message	Date
Woody	76c3bec2ab	feat: configurable SubQuestions via Step 1.2 system prompt page - Split 'Step 1: Query Decomposition' into Step 1.1 (prompt template) and Step 1.2 (format config with description + max_length) - Add create_subquestions_model() and parse_decompose_format() to decompose.py - QueryDecomposer reads decompose_format from DB, creates dynamic Pydantic model at runtime - PromptEditor renders Step 1.2 as textarea (description) + number input (max_length 1-5) - Graceful fallback to static SubQuestions when decompose_format unavailable	2026-05-04 17:22:14 +08:00
Woody	40b338d3ca	chore: gitignore .research, switch to flash, tighten sub-questions Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 16:38:58 +08:00
Woody	5535b42ae2	refactor: tighten SubQuestions to 1-3 with Cantonese format hint Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 15:18:14 +08:00
Woody	41f59b396f	feat: track highlight generation prompt, response, and timing in history (Phase 5.5) - Add 3 columns to query_history: highlight_prompt, highlight_response, highlight_time_ms - HistoryService.update_highlights() updates existing row after batch LLM call - ChunkHighlightService measures timing, captures prompt and structured JSON response - SSE completed event includes history_id for frontend to pass back - Frontend captures historyId, passes as ?history_id= query param in batch POST - Highlight time tracked separately (excluded from total_time_ms) - All 153 tests pass (108 backend + 45 frontend)	2026-04-29 11:18:21 +08:00
Woody	b11d31e2d1	feat: add sentence splitter and highlight data models (Phase 5.4.1-5.4.2) - sentence_splitter.py: regex-based sentence splitting for English + Chinese punctuation - highlight.py: 6 Pydantic models (ChunkHighlightTarget, HighlightBatchRequest, RelevantSentence, ChunkHighlights, HighlightBatchResult, HighlightBatchResponse) - 43 tests: 13 sentence splitter + 30 model validation	2026-04-29 09:26:06 +08:00
Woody	f2115ae563	feat: structured LLM output for decompose + citation fuzzy matching (Phase 5) Phase 5.1 — Structured LLM output for query decomposition: - Add SubQuestions Pydantic model with sub_question, keywords, rationale - Add LLMClient.complete_structured() using langchain with_structured_output - Update QueryDecomposer with structured output path + legacy json.loads fallback - Update SQLite seed templates: add subq+citation labeling requirement - Add tests: structured output, subquestions model validation, logging Phase 5.2 — Citation format alignment and fallback links: - Add document_id to SourceMetadata (backend + frontend types) - Rewrite citationParser.ts with fuzzy matching and fallback document links - Add RAGDatabasePage auto-expand from ?document= URL param - Tighten generate_per_subq seed prompt: 'Copy exact bracket labels shown' - Add citation parser tests for fuzzy match and fallback link scenarios - Defer: DOCX/TXT PDF generation → Phase 5.3 (fallback links sufficient)	2026-04-28 15:39:17 +08:00
Woody	23796d6a0c	feat(prompts): add JSON export/import for profile prompt configurations	2026-04-27 19:44:35 +08:00
Woody	40393d81f8	feat(models): add SubQuestionSources model and per-sub-q history fields Add SubQuestionSources, SubQuestionResult, GeneratingSubquestionEvent Pydantic models for the new per-sub-question response format. Add chunks_retrieved_per_subq_count and chunks_filtered_per_subq_count optional fields to QueryHistoryRecord and QueryHistoryDetail for per-sub-question chunk count tracking. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:28:19 +08:00
Woody	475306f2b1	feat(history): Phase 3.5 — Query History backend (service, API, timing, XML capture)	2026-04-25 22:59:53 +08:00
Woody	e49a68b0bd	feat(prompts): Phase 3.2 — Prompt Backend (CRUD service, REST API, 33 tests) - PromptService (services/prompt_service.py): full CRUD for 3 profiles A/B/C with seed template reset, validation, and sqlite3.Row access - REST API (routers/prompts.py): 6 endpoints on /api/v1/prompts - Pydantic models (models/prompts.py): 6 schemas - DI wiring (dependencies.py): get_prompt_service() - App registration (main.py): prompts router - Mock fixture (conftest.py): mock_prompt_service - Tests: test_phase3_prompt_service.py (22) + test_phase3_prompts_router.py (11) - 162/166 total pass, 4 skipped, 0 fail	2026-04-25 21:11:17 +08:00
Woody	3b741c1844	feat(query): stream extracted questions immediately via SSE Convert /query endpoint from synchronous JSON to Server-Sent Events (SSE) streaming. The frontend now receives extracted_questions as soon as the first LLM call completes, without waiting for retrieval, filtering, and answer generation. Backend: - Add StreamingQueryEvent union type (Decomposed, Retrieving, Filtering, Generating, Completed, Error) - Convert /query to return StreamingResponse with SSE format - Yield events after each pipeline phase Frontend: - Add queryDocumentStream() using fetch + ReadableStream - Add useQueryDocumentStream() hook with phase-aware state - Update LTTPage to use streaming instead of mutation - Update ResponsePanel to show phase messages (Searching documents..., Filtering passages..., Generating answer...) - Update ExtractedQuestionsDisplay to accept null Tests: - Update query_flow e2e test to mock queryDocumentStream - 84/85 tests pass (1 pre-existing failure from removed file-input)	2026-04-25 18:29:22 +08:00
Woody	f9dda7bd18	feat(backend): rename keywords to extracted_questions in query pipeline (sub-phase 2.3) Change QueryDecomposer prompt to generate 2-5 sub-questions instead of keywords. Rename API field from keywords to extracted_questions across models, service, and router. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 16:23:53 +08:00
Woody	d49756f374	feat: add chunk PDF serving endpoint and frontend clickable source links (1.5.6) - Add page_number and chunk_file_path to SourceMetadata model and query router - Add GET /chunks/{file_path}/pdf endpoint with path traversal protection - Add View PDF links in ResponsePanel source cards and ChunkList component - Update TypeScript types and API helper for chunk PDF URLs - Add backend tests (5) and frontend ChunkList tests (7) - Update enhancement plan: all 3 features complete	2026-04-24 11:49:39 +08:00
Woody	178461915a	feat(backend): add documents CRUD service methods and Pydantic schemas Add list_documents(), list_chunks(), delete_document(), delete_chunk() to RAGService for ChromaDB document management. New schemas: DocumentInfo, ChunkInfo, DocumentListResponse, DeleteResponse. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 19:02:07 +08:00
Woody	09f8cb7e6d	refactor(backend): update Pydantic models for ingestion and query Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-23 13:26:20 +08:00
Woody	181f4eca5b	feat: Phase 1.3 query pipeline with decomposition, relevance filter, and response - Add QueryDecomposer: extracts keywords from question via LLM JSON response - Add RelevanceFilter: batch scores chunks 0-10, filters by threshold - Add POST /api/v1/query endpoint with full 3-step pipeline: 1. QueryDecomposer.decompose() → keywords 2. RAGService.retrieve() → chunks from ChromaDB 3. RelevanceFilter.filter() → score and filter chunks 4. RAGService.generate_response() → bullet-point answer - Fix SourceMetadata.upload_date type from datetime to str for flexibility - Test-first: 13 new tests pass (5 decomposer, 5 relevance filter, 3 query endpoint) - All Phase 1 tests: 41 passed, 2 skipped	2026-04-22 17:19:21 +08:00
Woody	3712397d64	feat: Phase 1.1 project setup with config, database, and models - Add requirements.txt with all dependencies - Add .env.example with required environment variables - Add Pydantic Settings (config.py) with .env loading - Add ChromaDB persistent client (database.py) - Add Pydantic schemas (ingest.py) for request/response - Add FastAPI main.py with CORS middleware - Add package __init__.py files - Add tests: test_phase1_config.py, test_phase1_database.py - All 5 tests pass	2026-04-22 16:13:52 +08:00

17 Commits