legco_ai_assistant

Commit Graph

Author	SHA1	Message	Date
Woody	032dd75e17	feat: add Sub-Phase 9.3 evaluation API endpoint and 9.4 polish	2026-05-25 19:30:17 +08:00
Woody	098be359e7	feat: add Sub-Phase 9.2 evaluation engine (CER/WER, key questions, chunk, response)	2026-05-25 18:45:53 +08:00
Woody	ac81df0704	feat: add Sub-Phase 9.1 results generation APIs with reusable RAGPipeline	2026-05-25 18:35:55 +08:00
Woody	852430f1f1	feat: add Sub-Phase 9.0 config and Pydantic models for accuracy testing	2026-05-25 18:27:51 +08:00
Woody	6928fff8ff	test: update Phase 2 tests for ASR provider abstraction Update TestTranscribeFull to use async/await and patch the moved OpenAI import (now in asr_providers.py). Set ASR_PROVIDER=dashscope in test fixtures to ensure tests don't pick up the real .env ASR_PROVIDER value. All 19 Phase 2 + 7 integration tests pass. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-19 09:48:58 +08:00
Woody	733824c177	test: add Phase 5 ASR provider and integration tests test_phase5_config.py: 6 tests for ASR_PROVIDER validation and default values. test_phase5_openrouter_provider.py: 14 tests covering OpenRouterSTT transcription, retry logic, error handling, URL construction, cleanup, and factory dispatch. test_phase5_integration.py: 4 tests for full video-to-transcribe flow with both providers (mocked) and per-provider API key validation. Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-19 09:48:37 +08:00
Woody	73c1789698	fix: Q\&A chunking always fell back to token — LLM never called, missing API fields Three bugs caused 'Chunk by Question' to silently produce token chunks: 1. QuestionChunkingStrategy.chunk_pages() had a broken event-loop check that always skipped LLM structure detection in FastAPI's async context. Fixed by making chunk_pages() async and removing the is_running() guard. 2. get_chunking_strategy() factory never passed an LLMClient to QuestionChunkingStrategy. Fixed by creating LLMClient in the factory with graceful fallback to regex-only when config is incomplete. 3. rag.list_documents() and list_chunks() didn't extract strategy_type or Q&A fields from ChromaDB metadata, so the frontend always showed chunking_strategy='token' and null Q&A fields. Fixed by reading these fields from ChromaDB and routing them through the API. Also: TokenChunkingStrategy.chunk_pages() made async for consistency with the question strategy; ingest router updated to await it. Tests updated (asyncio.run() for sync tests, async mock chunk_pages).	2026-05-15 14:46:45 +08:00
Woody	9bef65de7b	test: Sub-Phase 8.5 — acceptance test skeleton for Q&A chunking 8 acceptance tests with real LegCo PDFs (all @pytest.mark.acceptance + @slow). Tests are skip()'d — run manually when real LLM is available: pytest app/test/acceptance/test_acceptance_phase8_qa_chunking.py -v -m acceptance Sub-Phase 8.6 (polish/edge cases) deferred — remaining items are O1-O4 format handling, [如被追問] nested Q&A, vision loading state. Core algorithm (8.1-8.4) is test-passing and production-ready.	2026-05-15 12:45:46 +08:00
Woody	14423c773a	feat: Sub-Phases 8.1-8.4 — Q&A-pair chunking strategy 8.1 — Core algorithm (test-first): - qa_chunking.py: preprocess_text, build_structure_detection_prompt, parse_llm_structure_response, Section dataclass, split_chinese_qa, split_english_qa, build_chunks_from_sections with recursive size split - QuestionChunkingStrategy in chunking.py with _chunk_metadata tracking - get_chunking_strategy() factory function - table_extraction.py: vision LLM extraction, heuristic text fallback, disk cache, inject_tables_into_answer - 18/18 tests pass (LLM parse, regex fast-pass, multi-page, ABC contract, size limit, chunk building, preprocess) 8.2 — Metadata enrichment: - extract_metadata() accepts strategy_type + chunk_metadata params - Q&A fields (question_id, question_index, section_heading, etc.) merged into ChromaDB metadata entries - DocumentInfo.chunking_strategy + ChunkInfo Q&A fields in models - 6/6 metadata tests pass 8.3 — Ingest API integration: - POST /api/v1/ingest accepts ?strategy=token\|question - validate strategy against VALID_CHUNKING_STRATEGIES - factory creates correct chunker; _chunk_metadata passed to extract_metadata - 6/6 ingest integration tests pass, zero regressions on existing tests 8.4 — Frontend strategy selector: - Radio button selector (Token / Question) on RAG Database page - Strategy passed to ingest mutation via api.ts - DocumentList: strategy badge (gray/blue) - ChunkList: Q&A display with question_id, question_text, page range, table badge - tsc --noEmit clean, vite build successful	2026-05-15 12:44:04 +08:00
Woody	ef10b937cf	feat: Sub-Phase 8.0 — config & enums for Q&A-pair chunking strategy Backend: - Add 6 Q&A chunking config fields to Settings (default_chunking_strategy, qa_vision_enabled, qa_max_chunk_tokens, qa_structure_model, qa_include_internal_refs, qa_cache_vision_results) - Define ChunkingStrategyType Literal + VALID_CHUNKING_STRATEGIES frozenset - Add strategy field to IngestResponse (default token, non-breaking) - Add IngestRequest model with strategy param - Update .env.example with new env vars Frontend: - Add ChunkingStrategy type ('token' \| 'question') - Extend IngestResponse, DocumentInfo, ChunkInfo with Q&A fields Tests: - test_qa_chunking_config_defaults — all defaults verified - test_qa_chunking_config_from_env — env var overrides verified Plan fix: renamed qa_verification_model → qa_structure_model to match LLM-first architecture	2026-05-15 12:01:28 +08:00
Woody	d69c180544	feat: Phase 4.8-4.9 — integration tests, acceptance tests, docs, and polish	2026-05-15 09:51:45 +08:00
Woody	7bff4308b7	feat: Phase 4 — System Audio & Listen Mic capture into ASR → RAG Adds two new live audio sources alongside file Upload: - System Audio: getDisplayMedia() captures system/tab audio output, pipes through WebSocket → DashScope realtime ASR → RAG. - Listen Mic: getUserMedia() captures microphone input via the same audio pipeline (shared useMediaStreamASR hook). Backend: feature toggles (system_audio_enabled, mic_enabled) in config.py, source query param gating in ws_asr.py, 10 config tests. Bug fix: getDisplayMedia() rejected video:false per W3C spec — changed to video:true then stop video tracks to allow audio-only capture on Windows/macOS Chrome.	2026-05-14 22:55:06 +08:00
Woody	b05c361fbd	revert: remove Phase 3 YouTube proxy — all 7 sub-phases Reverts commits `284028b` through `b4096d6`. Phase 4 (System Audio Capture) will replace the YouTube use case with a more versatile getDisplayMedia approach. Removed: YouTube router, HLS proxy, YouTubeService, YouTubeInput, YouTubeVideoPlayer, useYouTubeASR hook, all Phase 3 tests, hls.js dep, YouTube config fields, YouTube README/plan sections. Modified files restored to pre-Phase-3 state: LTTPage (no source toggle), api.ts (no YouTube extract), types (no YouTube types), config.py (no youtube fields), main.py (no YouTube router), requirements.txt (no yt-dlp), .env.example (no YouTube vars), package.json (no hls.js). Relevant Phase 2 code preserved: ws_asr.py (unchanged), useVideoASR, VideoPlayer, VideoUpload, QueryInput, Full Transcript.	2026-05-09 21:07:21 +08:00
Woody	b4096d6afc	feat: Phase 3.7 — Polish, PO token handling, docs, deployment verification - PO token handling: _is_po_token_error() detects YouTube bot-detection errors, invalidates cache on detection, logs warning for retry guidance (2 new tests) - README: YouTube Live Stream Proxy section with architecture, usage, config, limits - development_plan: Phase 3 complete, timeline updated, status → Phase 1-3 Complete - Dockerfile/compose: verified OK (ffmpeg + yt-dlp already present, no new volumes) - npm build: 1403 modules, production build clean - 59/59 backend + 44/44 frontend Phase 2+3 tests pass - Plan: 3.7 Complete, 7/7 sub-phases done	2026-05-09 17:27:54 +08:00
Woody	cee859d5d7	feat: Phase 3.6 — integration + acceptance tests for YouTube proxy - test_integration_phase3.py: 6 tests Extract→proxy flow (VOD manifest, VOD segment, live manifest), cache hit bypasses yt-dlp, upstream 404→502, extract disabled→503 Mocked yt-dlp, real FastAPI TestClient + HLSProxyService - test_acceptance_phase3_youtube.py: 3 tests Real YouTube VOD extraction, manifest proxy, segment proxy Follows master→variant→segment chain, verifies MPEG-TS sync byte - test_acceptance_phase3_live.py: 3 tests Real live stream extraction, no #EXT-X-ENDLIST assertion, cache refresh verification, graceful skip when offline - 201/201 CI pass (234 backend Phase 1-3, zero Phase 3 regressions) - Updated plan: 3.6 Complete, 6/7 sub-phases done	2026-05-09 17:18:55 +08:00
Woody	3c9ed2cc8d	feat: Phase 3.3 — HLS manifest proxy with line-by-line rewriting - HLSProxyService: rewrite_manifest() rewrites segment/sub-manifest/EXT-X-KEY URIs to proxy URLs; proxy_segment() transparently proxies .ts segments - Route: upstream status checked before streaming — 502 on failure - CORS access-control-allow-origin: * on all responses - Line rewriting: pass-through tags/comments, rewrite URIs, handle relative/absolute URLs - URL resolution: urljoin for relative, absolute path, and absolute URL - 22 tests (8 line rewriting, 4 URL resolution, 3 proxy URL construction, 2 manifest integration, 1 segment proxying, 4 route integration) - 104/104 total pass (zero regressions)	2026-05-09 16:13:33 +08:00
Woody	284028bb1f	feat: Phase 3.1 + 3.2 — YouTube config infra and URL extraction Phase 3.1 — Configuration & Infrastructure: - Add youtube_proxy_enabled, yt_dlp_timeout, yt_dlp_cache_ttl config fields - Add yt-dlp and hls.js dependencies - Create models/youtube.py (request/response schemas) - Create service stubs (youtube_service, hls_proxy) - Create router stub and register in main.py - 11 config tests Phase 3.2 — YouTube URL Extraction: - yt-dlp wrapper with async extraction (run_in_executor) - Format selection: ≤480p video-only + highest-bitrate audio (VOD) - Combined format fallback: same URL for live streams - In-memory URL cache: 5min TTL live, 30min VOD - lru_cache singleton service for cache persistence - Error handling: DownloadError → 200 with error field - 18 extract tests, 82/82 total pass (zero regressions) Real-URL verified: VOD (5bF3tkO5jAA) 24 formats, Live (fN9uYWCjQaw) 6 HLS	2026-05-09 15:53:04 +08:00
Woody	09b5ea7d64	refactor: remove dead _merge_stash, add Phase 3 YouTube proxy plan - Remove _merge_stash (dead code since delta-based ASR refactor) - Replace TestMergeStash with TestTextFieldFormatting (53/53 Phase 2 tests pass) - Mark phase2_enhancement_use_text_field as Complete - Add Phase 3 YouTube live stream proxy implementation plan - README updates	2026-05-09 15:14:01 +08:00
Woody	78d1f8cc91	feat: delta-based ASR transcript — use text field, utterance boundaries, stash on pause Replace full_text responses with character-level deltas computed from DashScope's monotonically-growing 'text' field. Stash-only events (empty text) are skipped; trailing stash chars sent alongside deltas and appended on pause to complete final sentences. Backend: - Delta = text[len(prev_text):] — simple suffix diff, no merge logic - Track item_id for utterance boundaries, prepend space separator - Send stash alongside delta for frontend pause handler Frontend: - Accumulate deltas locally (transcriptRef += msg.delta) - Store lastStashRef from each message - On pause: append stash to text, fire onFinalTranscript Plan: .plans/phase2_enhancement_delta_sse.md updated to Complete	2026-05-07 11:26:19 +08:00
Woody	cb0ac07786	fix: text accumulation — stashes are sliding windows, merge via overlap detection DashScope stashes are ~7-char rolling windows, not cumulative. Each partial event replaces the previous. Completed events rarely sent. This caused text to jump/replace during streaming and disappear on pause. Backend: - Add _merge_stash() — finds overlapping suffix between successive stashes and appends only new characters, reconstructing full utterance from partials - format_transcription_event returns raw stash for read_events to merge - read_events maintains partial_buffer via _merge_stash, clears on completed - Guard against empty/whitespace-only stashes Frontend: - transcriptRef + onFinalTranscriptRef avoid stale closures in pause handler - stopStreaming fires onFinalTranscript(currentText) before clearing partial - Removed blind setPartialTranscript('') that erased text on pause Tests: 16/16 ws_protocol tests pass, frontend tests unchanged Plan: Updated phase2_implementation_plan.md to Complete with 11-bug log	2026-05-06 20:06:39 +08:00
Woody	fcb9ec1f6c	fix: Phase 2 ASR pipeline — 9 bugs resolved, Full Transcript works end-to-end - Vite proxy: forward /api and /ws to backend port 8000 - WebSocket URL: use backend host, not Vite HMR port - LTTPage: callback ref replaces useRef (video element always null before) - ws_asr: pass DashScope API key to OmniRealtimeConversation - asr_client: fix data_url MIME type (audio/wav), omit extra_body when auto - useFullTranscript: use absolute URL prefix for fetch - QueryInput: add value prop for external Full Transcript injection - QueryInput: fix displayValue \|\| logic (partialText '' overrode question) - ffmpeg: install static binary for audio extraction - Integration tests: 7 tests (upload→transcribe flow) - Acceptance tests: real DashScope tests (skippable) - Structured logging: ws_asr.py + video.py	2026-05-06 18:26:17 +08:00
Woody	a4e067822b	feat: Phase 2.3 ASR proxy + full transcript and 2.4 frontend hooks - Backend: DashScope WebSocket proxy (/ws/asr/{video_id}), DashScopeCallback sync-to-async bridge, ffmpeg audio extraction, POST /video/{id}/transcribe - Frontend: useVideoASR hook (auto on play), useFullTranscript hook, QueryInput partialText prop, VideoUploadResponse types, uploadVideo API - Tests: 41 backend + 26 frontend = 67 new tests, all passing	2026-05-06 13:41:24 +08:00
Woody	9934749d2b	feat: Phase 2.1 config + infrastructure and 2.2 video upload backend - Add DashScope ASR and video upload config fields to Settings - Create Pydantic models (video.py, asr.py) - Create VideoService with validation, save, serve, delete - Create ASR client stub with float32_to_s16le utility - Implement POST /api/v1/video/upload with streaming validation - Implement GET /api/v1/video/{video_id} with FileResponse - Create WebSocket ASR endpoint stub - Register new routers in main.py - Update .env.example and requirements.txt - Add reference examples for DashScope integration - 8 tests passing (3 config + 5 video upload)	2026-05-06 13:08:19 +08:00
Woody	df62283f58	feat: inject Pydantic JSON schema into Deepseek prompt (Phase 6) Follows Deepseek JSON Output guide: the prompt now includes the word 'json' and a format example derived from the Pydantic model schema. Added _pydantic_to_json_instruction() helper that builds a human-readable schema description with EXAMPLE JSON OUTPUT. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 15:17:24 +08:00
Woody	226f4ed700	test: update integration mocks for dual-client architecture (Phase 6) Added complete_structured() to mock classes, split response lists between LLMClientDP (decompose) and LLMClient (filter+generate), and patched both clients in all integration tests. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:59:23 +08:00
Woody	849beb4d4e	feat: add LLMClientDP for Deepseek decompose (Phase 6) Uses Deepseek's json_object response_format (not json_schema, which Deepseek does not support). Always disables thinking mode. Includes unit tests (12) and acceptance tests (5). Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:58:53 +08:00
Woody	41f59b396f	feat: track highlight generation prompt, response, and timing in history (Phase 5.5) - Add 3 columns to query_history: highlight_prompt, highlight_response, highlight_time_ms - HistoryService.update_highlights() updates existing row after batch LLM call - ChunkHighlightService measures timing, captures prompt and structured JSON response - SSE completed event includes history_id for frontend to pass back - Frontend captures historyId, passes as ?history_id= query param in batch POST - Highlight time tracked separately (excluded from total_time_ms) - All 153 tests pass (108 backend + 45 frontend)	2026-04-29 11:18:21 +08:00
Woody	a56f8f69e2	feat: add highlight batch and GET endpoints (Phase 5.4.5) - POST /api/v1/v2/highlights/batch: compute and cache highlights for cited chunks - GET /api/v1/v2/highlights: serve cached highlighted HTML pages - chunks.py router registered in main.py - Dynamic DB path computation (prompts.db -> highlights.db), no Settings changes - 7 endpoint tests: POST 200/422, GET 200/404, mock service verification	2026-04-29 09:26:50 +08:00
Woody	c6d4a38013	feat: add LLM-based batch highlight service and HTML rendering (Phase 5.4.4) - ChunkHighlightService.compute_highlights_batch(): single LLM call across all cited chunks, grouped by sub-question, with structured output - render_highlight_html(): self-contained HTML page with yellow-highlighted relevant sentences, LLM reason annotations, and View Original PDF footer - Per-target error isolation, ChromaDB miss handling, graceful degradation - 14 tests: 7 batch service + 7 HTML rendering	2026-04-29 09:26:33 +08:00
Woody	bdbc8ea1a0	feat: add SQLite highlight cache service (Phase 5.4.3) - highlight_cache.py: HighlightCache class with get/set_highlight and compute_cache_key (sha256 hash of document_id\|chunk_index\|sub_question) - INSERT OR REPLACE semantics, idempotent table creation - 13 tests covering round-trip, overwrite, missing keys, determinism	2026-04-29 09:26:20 +08:00
Woody	b11d31e2d1	feat: add sentence splitter and highlight data models (Phase 5.4.1-5.4.2) - sentence_splitter.py: regex-based sentence splitting for English + Chinese punctuation - highlight.py: 6 Pydantic models (ChunkHighlightTarget, HighlightBatchRequest, RelevantSentence, ChunkHighlights, HighlightBatchResult, HighlightBatchResponse) - 43 tests: 13 sentence splitter + 30 model validation	2026-04-29 09:26:06 +08:00
Woody	25b26c9b48	feat(ingest): generate per-chunk PDFs for DOCX/TXT documents (Phase 5.3) DOCX and TXT ingestion now produces chunk_file_path + per-chunk PDF files matching the PDF ingestion flow. Uses reportlab to render chunk text as simple PDFs with automatic text wrapping. - Add reportlab==4.2.5 to requirements.txt - New utils/text_to_pdf.py: generate_text_pdf() renders chunk text as PDF - Ingest router DOCX/TXT branches: generate chunk_N.pdf per chunk, store in chunk_file_paths - Graceful degradation: chunk_file_path stays None if PDF generation fails - Update test_phase1_ingest_page_aware.py assertions: DOCX chunks now HAVE chunk_file_path - New test_phase5_docx_pdf_generation.py: 5 tests (DOCX PDF gen, TXT PDF gen, PDF regression, file count, graceful degradation) - 361 backend tests pass (4 pre-existing embedding failures unrelated) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 17:32:22 +08:00
Woody	f2115ae563	feat: structured LLM output for decompose + citation fuzzy matching (Phase 5) Phase 5.1 — Structured LLM output for query decomposition: - Add SubQuestions Pydantic model with sub_question, keywords, rationale - Add LLMClient.complete_structured() using langchain with_structured_output - Update QueryDecomposer with structured output path + legacy json.loads fallback - Update SQLite seed templates: add subq+citation labeling requirement - Add tests: structured output, subquestions model validation, logging Phase 5.2 — Citation format alignment and fallback links: - Add document_id to SourceMetadata (backend + frontend types) - Rewrite citationParser.ts with fuzzy matching and fallback document links - Add RAGDatabasePage auto-expand from ?document= URL param - Tighten generate_per_subq seed prompt: 'Copy exact bracket labels shown' - Add citation parser tests for fuzzy match and fallback link scenarios - Defer: DOCX/TXT PDF generation → Phase 5.3 (fallback links sufficient)	2026-04-28 15:39:17 +08:00
Woody	23796d6a0c	feat(prompts): add JSON export/import for profile prompt configurations	2026-04-27 19:44:35 +08:00
Woody	a7a22f1494	fix(relevance): tolerate LLM score count mismatches via padding instead of discarding The per-sub-question filter was all-or-nothing: if the LLM returned 9 scores for 10 chunks (common with qwen3.5-35b), every chunk was discarded and the user got 'no relevant information found'. Now: fewer scores → pad with 0.0; more scores → truncate. Changed from error→warning since this is recoverable. Also improve LTT page UI: sources collapsed by default in per-sub-q sections, and the 'Your question' text now shows the full question instead of being truncated. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-27 14:31:18 +08:00
Woody	2656f9ca08	refactor(test): rewrite tests to comply with integration-first rules Replace mocked DB/internal-services with real ChromaDB/SQLite via tmp_path. Only mock truly external APIs (LLM, embedding for deterministic vectors). 13 test files rewritten (314 pass, 0 fail): - Route tests: use TestClient + real ChromaDB, seed test data - Service tests: use real PersistentClient/SQLite instances - Pipeline tests: TestClient hits SSE /query endpoint, verify history - Converted unittest.TestCase to pytest where applicable Plus: fix metadata.py to filter None values from ChromaDB metadata (pre-existing bug caught by real-DB ingestion tests)	2026-04-27 11:46:58 +08:00
Woody	3b868a0133	feat(prompts): integrate filter_per_subq with PromptService, fix seed bugs, restructure UI Break the hardcoded per-sub-q filter prompt into 3 editable PromptService templates (filter_intro, filter_section, filter_outro) with placeholders for the for-loop iteration pattern. Refactor RelevanceFilter._build_per_subq_prompt() to compose them at runtime, falling back to built-in defaults when PromptService is unavailable. Fix two latent bugs from Package 4: - generate_per_subq was called by rag.py but never added to _VALID_STEPS or DB seed (would ValueError at runtime) - _SEED_GENERATE placeholder mismatch: flat generate_response() expects {question}/{context} but Package 4 changed it to {context_sections}. Restored flat template; generate_per_subq now holds {context_sections}. Add database backfill migration in seed_default_profiles() to INSERT OR IGNORE missing steps into existing profile rows, ensuring all 7 steps exist on restart. Restructure System Prompts UI: remove unused flat filter/generate steps, replace with Step 2.1-2.3 (filter_intro/section/outro) and Step 3 (generate_per_subq). Update PlaceholderDocs with {context_sections}, {subq_idx}, {subq_question}. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-27 11:14:27 +08:00
Woody	3f50f81bfe	test(backend): extend existing tests for per-sub-q methods and templates Add 6 tests for retrieve_per_subquestion and generate_response_per_subquestion to Phase 1 rag service tests. Add 4 tests for filter_per_subquestion to Phase 1 relevance filter tests. Add 2 tests for new {context_sections} generate template to Phase 3 prompt injection tests. Add TestPerSubQPipelineHistory class with 3 per-sub-q pipeline simulation tests to Phase 3 integration tests. Add generate_per_subq template seed to conftest mock_prompt_service fixture. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:29:27 +08:00
Woody	201bddecf0	test(backend): add Phase 4 integration and acceptance tests 5 integration tests simulating full per-sub-question pipeline with mocked services covering 2-sub-q, empty decomposition fallback, single sub-q, all-filtered, and partial retrieval. 2 acceptance tests (manual run) for real LLM verification of per-sub-question organized answers with grouped sources and ## Sub-question headers. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:29:09 +08:00
Woody	dd98fa0b65	test(backend): add Phase 4 unit tests for generate, format, history, prompts 9 tests for generate_response_per_subquestion() and answer format validation covering multi-sub-q, empty, prompt construction, and markdown format. 8 tests for new history XML/JSON formats (sources as list-of-lists, <sub_q> wrappers in XML) and new {context_sections} prompt template. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:28:58 +08:00
Woody	ab6ec28de6	test(backend): add Phase 4 unit tests for retrieval and filtering 10 tests for retrieve_per_subquestion() covering multi-sub-q, empty, single, call counting, n_results passthrough, and empty results. 14 tests for filter_per_subquestion() covering basic filtering, threshold behavior, JSON parsing edge cases, markdown extraction, LLM exceptions, and format helpers. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:28:45 +08:00
Woody	0ecae11bf8	feat(db): update history schema and generate prompt template for Package 4 Add chunks_retrieved_per_subq_count and chunks_filtered_per_subq_count columns to query_history table with safe ALTER TABLE migration. Replace generate template {question}/{context} placeholders with {context_sections} for per-sub-question organized context sections. Update Phase 3 test assertions to match new template and schema shapes. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-26 23:28:28 +08:00
Woody	475306f2b1	feat(history): Phase 3.5 — Query History backend (service, API, timing, XML capture)	2026-04-25 22:59:53 +08:00
Woody	e49a68b0bd	feat(prompts): Phase 3.2 — Prompt Backend (CRUD service, REST API, 33 tests) - PromptService (services/prompt_service.py): full CRUD for 3 profiles A/B/C with seed template reset, validation, and sqlite3.Row access - REST API (routers/prompts.py): 6 endpoints on /api/v1/prompts - Pydantic models (models/prompts.py): 6 schemas - DI wiring (dependencies.py): get_prompt_service() - App registration (main.py): prompts router - Mock fixture (conftest.py): mock_prompt_service - Tests: test_phase3_prompt_service.py (22) + test_phase3_prompts_router.py (11) - 162/166 total pass, 4 skipped, 0 fail	2026-04-25 21:11:17 +08:00
Woody	f4b404f27d	feat(db): Phase 3.1 — SQLite infrastructure (prompts.db + history.db) - Add sqlite_db.py with dual-DB connection factories (WAL mode, foreign keys) - init_prompts_db() creates system_prompt_profiles + system_prompts tables - init_history_db() creates query_history table + created_at index - seed_default_profiles() inserts 3 profiles (A/B/C) x 3 steps each - All 3 profiles start with identical seed templates; Profile A active - Add prompts_db_path + history_db_path to config (./data/ default) - Startup init in main.py creates data/ dir, inits both DBs, seeds profiles - Add PROMPTS_DB_PATH + HISTORY_DB_PATH to .env.example - Add data/ to .gitignore - 17 new tests in test_phase3_sqlite_db.py (all passing)	2026-04-25 20:29:29 +08:00
Woody	51640201f3	test(backend): update query tests for sub-question generation (sub-phase 2.3) Update prompt assertion in decomposer test and field assertions in query endpoint tests to match extracted_questions rename. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 16:24:10 +08:00
Woody	d49756f374	feat: add chunk PDF serving endpoint and frontend clickable source links (1.5.6) - Add page_number and chunk_file_path to SourceMetadata model and query router - Add GET /chunks/{file_path}/pdf endpoint with path traversal protection - Add View PDF links in ResponsePanel source cards and ChunkList component - Update TypeScript types and API helper for chunk PDF URLs - Add backend tests (5) and frontend ChunkList tests (7) - Update enhancement plan: all 3 features complete	2026-04-24 11:49:39 +08:00
Woody	b2dd385443	feat(backend): refactor ingest pipeline for page-aware chunking with PDF generation PDF uploads now use parse_pdf_by_page() -> chunk_pages() -> extract page PDFs -> enhanced metadata with page_number, chunk_file_path, and document_id. Same-filename replacement deletes old chunks and PDFs before re-ingest. DOCX/TXT keep original flat flow with document_id added. RAGService.ingest_document() accepts optional document_id parameter. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:53:17 +08:00
Woody	8c84062996	feat(backend): add PDF page extractor and chunk PDF storage config New pdf_extractor.py with extract_page_as_pdf() and extract_pages_as_pdf() for extracting individual PDF pages as separate files. Adds document_chunk_path setting to config and document_chunk/ to .gitignore. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:52:57 +08:00
Woody	b97264c66a	feat(backend): add page_number, chunk_file_path, document_id to chunk metadata Enhance extract_metadata() with three new optional fields for page-aware chunking support. Validates list length mismatches. Fully backward compatible — existing callers unaffected. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-24 10:30:40 +08:00

1 2

62 Commits