legco_ai_assistant

Commit Graph

Author	SHA1	Message	Date
Woody	17db487dbb	feat: Phase 3 — Half Question button, Final Submit rename, ASR text always black - Backend: add stop_after_decompose flag to QueryRequest, early-return after decomposition in SSE stream with half_question:true event - Frontend: add decomposeOnly method to useQueryDocumentStream hook - QueryInput: remove grey italic from ASR partial text, rename Submit to Final Submit, add gray Half Question button that decomposes without clearing querybox text - LTTPage: wire handleHalfQuestion to decomposeOnly	2026-05-14 21:27:21 +08:00
Woody	64a7a8a46b	chore: add pnpm lockfiles, Phase 4 plan, and dev plan status update	2026-05-14 20:26:17 +08:00
Woody	2501a2c3c0	docs: use pnpm instead of npm in dev commands	2026-05-14 20:22:33 +08:00
Woody	5832a854c5	chore: remove Phase 3 plan file after revert	2026-05-09 21:14:20 +08:00
Woody	b05c361fbd	revert: remove Phase 3 YouTube proxy — all 7 sub-phases Reverts commits `284028b` through `b4096d6`. Phase 4 (System Audio Capture) will replace the YouTube use case with a more versatile getDisplayMedia approach. Removed: YouTube router, HLS proxy, YouTubeService, YouTubeInput, YouTubeVideoPlayer, useYouTubeASR hook, all Phase 3 tests, hls.js dep, YouTube config fields, YouTube README/plan sections. Modified files restored to pre-Phase-3 state: LTTPage (no source toggle), api.ts (no YouTube extract), types (no YouTube types), config.py (no youtube fields), main.py (no YouTube router), requirements.txt (no yt-dlp), .env.example (no YouTube vars), package.json (no hls.js). Relevant Phase 2 code preserved: ws_asr.py (unchanged), useVideoASR, VideoPlayer, VideoUpload, QueryInput, Full Transcript.	2026-05-09 21:07:21 +08:00
Woody	b4096d6afc	feat: Phase 3.7 — Polish, PO token handling, docs, deployment verification - PO token handling: _is_po_token_error() detects YouTube bot-detection errors, invalidates cache on detection, logs warning for retry guidance (2 new tests) - README: YouTube Live Stream Proxy section with architecture, usage, config, limits - development_plan: Phase 3 complete, timeline updated, status → Phase 1-3 Complete - Dockerfile/compose: verified OK (ffmpeg + yt-dlp already present, no new volumes) - npm build: 1403 modules, production build clean - 59/59 backend + 44/44 frontend Phase 2+3 tests pass - Plan: 3.7 Complete, 7/7 sub-phases done	2026-05-09 17:27:54 +08:00
Woody	cee859d5d7	feat: Phase 3.6 — integration + acceptance tests for YouTube proxy - test_integration_phase3.py: 6 tests Extract→proxy flow (VOD manifest, VOD segment, live manifest), cache hit bypasses yt-dlp, upstream 404→502, extract disabled→503 Mocked yt-dlp, real FastAPI TestClient + HLSProxyService - test_acceptance_phase3_youtube.py: 3 tests Real YouTube VOD extraction, manifest proxy, segment proxy Follows master→variant→segment chain, verifies MPEG-TS sync byte - test_acceptance_phase3_live.py: 3 tests Real live stream extraction, no #EXT-X-ENDLIST assertion, cache refresh verification, graceful skip when offline - 201/201 CI pass (234 backend Phase 1-3, zero Phase 3 regressions) - Updated plan: 3.6 Complete, 6/7 sub-phases done	2026-05-09 17:18:55 +08:00
Woody	1699a249b0	feat: Phase 3.5 — YouTube → ASR integration with source toggle - useYouTubeASR.ts: adapted from useVideoASR, captures audio from HTMLAudioElement (hls.js → <audio> → AudioContext.createMediaElementSource → ScriptProcessorNode → WebSocket) Play/pause events on videoElement; same return shape as useVideoASR - LTTPage.tsx: Source toggle (Upload/YouTube tabs), YouTubeInput + YouTubeVideoPlayer wired with handleExtractSuccess → handleAudioReady → useYouTubeASR Full Transcript button hidden for YouTube source; unified asr variable - QueryInput.tsx: no changes needed (already supports partialText/value from any source) - Tests: 18 new (11 useYouTubeASR, 7 LTTPage integration) - 189/189 total pass (zero regressions) - Updated plan: 3.5 marked Complete, 5/7 sub-phases done	2026-05-09 17:00:32 +08:00
Woody	a8eea54c0f	feat: Phase 3.4 — YouTube Input + Video Player frontend components - YouTubeInput.tsx: URL input with validation (youtube.com/watch, youtu.be, /live/, /shorts/), loading/error states, Load Stream button, uses useYouTubeExtract mutation - YouTubeVideoPlayer.tsx: dual hls.js (video + hidden audio), forwardRef, thumbnail placeholder until play, LIVE badge, quality capped ≤480p, onAudioReady callback for ASR hook exposure, dynamic import('hls.js') - Types: YouTubeFormat, YouTubeStreamResponse interfaces - API: extractYouTubeStream() — POST /youtube/extract - Query: useYouTubeExtract() TanStack Query mutation hook - Tests: 16 new (7 YouTubeInput, 9 YouTubeVideoPlayer) - 171/171 total pass (zero regressions) - Updated plan: 3.4 marked Complete, 4/7 sub-phases done	2026-05-09 16:43:42 +08:00
Woody	3c9ed2cc8d	feat: Phase 3.3 — HLS manifest proxy with line-by-line rewriting - HLSProxyService: rewrite_manifest() rewrites segment/sub-manifest/EXT-X-KEY URIs to proxy URLs; proxy_segment() transparently proxies .ts segments - Route: upstream status checked before streaming — 502 on failure - CORS access-control-allow-origin: * on all responses - Line rewriting: pass-through tags/comments, rewrite URIs, handle relative/absolute URLs - URL resolution: urljoin for relative, absolute path, and absolute URL - 22 tests (8 line rewriting, 4 URL resolution, 3 proxy URL construction, 2 manifest integration, 1 segment proxying, 4 route integration) - 104/104 total pass (zero regressions)	2026-05-09 16:13:33 +08:00
Woody	284028bb1f	feat: Phase 3.1 + 3.2 — YouTube config infra and URL extraction Phase 3.1 — Configuration & Infrastructure: - Add youtube_proxy_enabled, yt_dlp_timeout, yt_dlp_cache_ttl config fields - Add yt-dlp and hls.js dependencies - Create models/youtube.py (request/response schemas) - Create service stubs (youtube_service, hls_proxy) - Create router stub and register in main.py - 11 config tests Phase 3.2 — YouTube URL Extraction: - yt-dlp wrapper with async extraction (run_in_executor) - Format selection: ≤480p video-only + highest-bitrate audio (VOD) - Combined format fallback: same URL for live streams - In-memory URL cache: 5min TTL live, 30min VOD - lru_cache singleton service for cache persistence - Error handling: DownloadError → 200 with error field - 18 extract tests, 82/82 total pass (zero regressions) Real-URL verified: VOD (5bF3tkO5jAA) 24 formats, Live (fN9uYWCjQaw) 6 HLS	2026-05-09 15:53:04 +08:00
Woody	09b5ea7d64	refactor: remove dead _merge_stash, add Phase 3 YouTube proxy plan - Remove _merge_stash (dead code since delta-based ASR refactor) - Replace TestMergeStash with TestTextFieldFormatting (53/53 Phase 2 tests pass) - Mark phase2_enhancement_use_text_field as Complete - Add Phase 3 YouTube live stream proxy implementation plan - README updates	2026-05-09 15:14:01 +08:00
Woody	c8d955c45c	fix: add ffmpeg, uploads volume to Docker deployment for Phase 2 - Dockerfile: install ffmpeg for video audio extraction, create /app/uploads - docker-compose.yml: add uploads_data volume mount - README: add uploads_data to volumes table	2026-05-07 11:32:09 +08:00
Woody	563ef263ed	docs: add DashScope API key to Docker prereqs, ffmpeg install guide, Phase 2 env vars	2026-05-07 11:30:30 +08:00
Woody	78d1f8cc91	feat: delta-based ASR transcript — use text field, utterance boundaries, stash on pause Replace full_text responses with character-level deltas computed from DashScope's monotonically-growing 'text' field. Stash-only events (empty text) are skipped; trailing stash chars sent alongside deltas and appended on pause to complete final sentences. Backend: - Delta = text[len(prev_text):] — simple suffix diff, no merge logic - Track item_id for utterance boundaries, prepend space separator - Send stash alongside delta for frontend pause handler Frontend: - Accumulate deltas locally (transcriptRef += msg.delta) - Store lastStashRef from each message - On pause: append stash to text, fire onFinalTranscript Plan: .plans/phase2_enhancement_delta_sse.md updated to Complete	2026-05-07 11:26:19 +08:00
Woody	cb0ac07786	fix: text accumulation — stashes are sliding windows, merge via overlap detection DashScope stashes are ~7-char rolling windows, not cumulative. Each partial event replaces the previous. Completed events rarely sent. This caused text to jump/replace during streaming and disappear on pause. Backend: - Add _merge_stash() — finds overlapping suffix between successive stashes and appends only new characters, reconstructing full utterance from partials - format_transcription_event returns raw stash for read_events to merge - read_events maintains partial_buffer via _merge_stash, clears on completed - Guard against empty/whitespace-only stashes Frontend: - transcriptRef + onFinalTranscriptRef avoid stale closures in pause handler - stopStreaming fires onFinalTranscript(currentText) before clearing partial - Removed blind setPartialTranscript('') that erased text on pause Tests: 16/16 ws_protocol tests pass, frontend tests unchanged Plan: Updated phase2_implementation_plan.md to Complete with 11-bug log	2026-05-06 20:06:39 +08:00
Woody	fcb9ec1f6c	fix: Phase 2 ASR pipeline — 9 bugs resolved, Full Transcript works end-to-end - Vite proxy: forward /api and /ws to backend port 8000 - WebSocket URL: use backend host, not Vite HMR port - LTTPage: callback ref replaces useRef (video element always null before) - ws_asr: pass DashScope API key to OmniRealtimeConversation - asr_client: fix data_url MIME type (audio/wav), omit extra_body when auto - useFullTranscript: use absolute URL prefix for fetch - QueryInput: add value prop for external Full Transcript injection - QueryInput: fix displayValue || logic (partialText '' overrode question) - ffmpeg: install static binary for audio extraction - Integration tests: 7 tests (upload→transcribe flow) - Acceptance tests: real DashScope tests (skippable) - Structured logging: ws_asr.py + video.py	2026-05-06 18:26:17 +08:00
Woody	f3b94381ae	feat: Phase 2.5 video player, upload UI, and LTTPage layout refactor - VideoUpload: native drag-and-drop with axios progress bar, file validation - VideoPlayer: forwardRef wrapper for <video> element (used by useVideoASR) - LTTPage: replaced VideoPlaceholder, wired useVideoASR/useFullTranscript, Full Transcript button, resizable left/right panels (min 30%) - Tests: 25 new (VideoUpload 8, VideoPlayer 7, LTTPage integration 10)	2026-05-06 14:31:27 +08:00
Woody	a4e067822b	feat: Phase 2.3 ASR proxy + full transcript and 2.4 frontend hooks - Backend: DashScope WebSocket proxy (/ws/asr/{video_id}), DashScopeCallback sync-to-async bridge, ffmpeg audio extraction, POST /video/{id}/transcribe - Frontend: useVideoASR hook (auto on play), useFullTranscript hook, QueryInput partialText prop, VideoUploadResponse types, uploadVideo API - Tests: 41 backend + 26 frontend = 67 new tests, all passing	2026-05-06 13:41:24 +08:00
Woody	9934749d2b	feat: Phase 2.1 config + infrastructure and 2.2 video upload backend - Add DashScope ASR and video upload config fields to Settings - Create Pydantic models (video.py, asr.py) - Create VideoService with validation, save, serve, delete - Create ASR client stub with float32_to_s16le utility - Implement POST /api/v1/video/upload with streaming validation - Implement GET /api/v1/video/{video_id} with FileResponse - Create WebSocket ASR endpoint stub - Register new routers in main.py - Update .env.example and requirements.txt - Add reference examples for DashScope integration - 8 tests passing (3 config + 5 video upload)	2026-05-06 13:08:19 +08:00
Woody	63e4c1a385	docs: add plan for configurable SubQuestions format	2026-05-04 17:22:38 +08:00
Woody	76c3bec2ab	feat: configurable SubQuestions via Step 1.2 system prompt page - Split 'Step 1: Query Decomposition' into Step 1.1 (prompt template) and Step 1.2 (format config with description + max_length) - Add create_subquestions_model() and parse_decompose_format() to decompose.py - QueryDecomposer reads decompose_format from DB, creates dynamic Pydantic model at runtime - PromptEditor renders Step 1.2 as textarea (description) + number input (max_length 1-5) - Graceful fallback to static SubQuestions when decompose_format unavailable	2026-05-04 17:22:14 +08:00
Woody	40b338d3ca	chore: gitignore .research, switch to flash, tighten sub-questions Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 16:38:58 +08:00
Woody	5535b42ae2	refactor: tighten SubQuestions to 1-3 with Cantonese format hint Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 15:18:14 +08:00
Woody	df62283f58	feat: inject Pydantic JSON schema into Deepseek prompt (Phase 6) Follows Deepseek JSON Output guide: the prompt now includes the word 'json' and a format example derived from the Pydantic model schema. Added _pydantic_to_json_instruction() helper that builds a human-readable schema description with EXAMPLE JSON OUTPUT. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 15:17:24 +08:00
Woody	226f4ed700	test: update integration mocks for dual-client architecture (Phase 6) Added complete_structured() to mock classes, split response lists between LLMClientDP (decompose) and LLMClient (filter+generate), and patched both clients in all integration tests. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:59:23 +08:00
Woody	3b5bd79839	feat: wire LLMClientDP into query decompose pipeline (Phase 6) QueryDecomposer now uses LLMClientDP (Deepseek) while RelevanceFilter and RAGService continue using LLMClient (OpenRouter/vLLM). Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:59:08 +08:00
Woody	849beb4d4e	feat: add LLMClientDP for Deepseek decompose (Phase 6) Uses Deepseek's json_object response_format (not json_schema, which Deepseek does not support). Always disables thinking mode. Includes unit tests (12) and acceptance tests (5). Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:58:53 +08:00
Woody	73ae621f3b	feat: add Deepseek config fields and DI wiring (Phase 6) Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:58:39 +08:00
Woody	b6562f3d76	docs: add Package 6 enhancement plan Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-05-04 14:58:24 +08:00
Woody	23c665515d	fix: wrap filter chunks in XML tags for clearer LLM input	2026-04-30 13:59:03 +08:00
Woody	fc6b5463b5	fix: vLLM structured output missing thinking-control extra_body	2026-04-29 21:01:10 +08:00
Woody	16de8394aa	fix: add full input/output logging to vLLM structured output path Log the complete prompt, schema, extra_body content, full API response, token counts, and full parsed JSON output. Add exc_info=True tracebacks on all failure paths.	2026-04-29 16:52:26 +08:00
Woody	3ab6fd102a	fix: use vLLM-native guided_json for structured output vLLM servers support JSON schema enforcement via extra_body (guided_json or structured_outputs), not OpenAI's response_format protocol. LangChain's with_structured_output(method='json_schema') sends response_format which vLLM ignores, causing NoneType not iterable parsing errors. - vLLM path: direct OpenAI SDK call with extra_body={guided_json|structured_outputs} - OpenRouter path: unchanged with_structured_output(method='json_schema') - Try new 'structured_outputs' format first, fall back to legacy 'guided_json' - Update _SEED_DECOMPOSE with explicit JSON array instruction - Add diagnostic logging: exc_info=True, schema preview, prompt template preview - Add logging in _parse_legacy_json for fallback failure debugging	2026-04-29 16:49:14 +08:00
Woody	2aca18d30e	docs: add vLLM structured output fix plan - Diagnose: vLLM ignores OpenAI-native response_format, causing NoneType error - Diagnose: legacy fallback prompt lacks JSON instruction → empty questions - Plan: use vLLM-native guided_json via extra_body instead of with_structured_output - Plan: update _SEED_DECOMPOSE with JSON format instruction - Plan: add diagnostic logging (exc_info, method, schema preview) wip: temporary function_calling switch for vLLM (to be replaced by guided_json)	2026-04-29 16:42:23 +08:00
Woody	cbb958d75d	fix: vLLM chat_template_kwargs breaks LangChain structured output vLLM's chat_template_kwargs leaked into LangChain's AsyncCompletions.parse() via _get_langchain_model's model_kwargs, causing structured decomposition to fail on vLLM backends. Skip vLLM-specific params when building the LangChain model — only provider-agnostic params (OpenAI reasoning) pass through.	2026-04-29 16:07:44 +08:00
Woody	90269608bc	fix: display highlight tracking data in history page UI - Add highlight_prompt, highlight_response, highlight_time_ms to QueryHistoryDetail type - Add 'Highlights' bar segment with pink color in TimingBar component - Pass highlightTimeMs to TimingBar in HistoryCard expanded view - Add collapsible sections for highlight prompt and response in HistoryCard detail	2026-04-29 13:42:08 +08:00
Woody	41f59b396f	feat: track highlight generation prompt, response, and timing in history (Phase 5.5) - Add 3 columns to query_history: highlight_prompt, highlight_response, highlight_time_ms - HistoryService.update_highlights() updates existing row after batch LLM call - ChunkHighlightService measures timing, captures prompt and structured JSON response - SSE completed event includes history_id for frontend to pass back - Frontend captures historyId, passes as ?history_id= query param in batch POST - Highlight time tracked separately (excluded from total_time_ms) - All 153 tests pass (108 backend + 45 frontend)	2026-04-29 11:18:21 +08:00
Woody	36dedab485	docs: finalize Phase 5 enhancement plan with completion status - Mark Phase 5.4 complete with actual commit log - Add Phase 5.4 completion checklist (15 items all checked) - Add production notes (Vite proxy, port conflicts, cache location) - Update test counts to current (108 backend, 45 frontend, 153 total) - Update Decision #12 to reflect inline citation link upgrade	2026-04-29 10:54:18 +08:00
Woody	523b27bb58	test: update batch URL assertion to match absolute backend URL	2026-04-29 10:42:18 +08:00
Woody	b47e37f39b	fix: use absolute backend URL for highlight API calls - Vite dev server doesn't proxy /api/v1/v2/ paths to backend - Changed fetch URL and getHighlightUrl to use http://localhost:8000 - Fixed inline citation highlight URLs in buildCitationUrl - Cleaned up debug code	2026-04-29 10:39:01 +08:00
Woody	bcf4a853bf	feat: add highlight status toast notification (Phase 5.4) - Shows 'Preparing highlights...' (amber spinner) while LLM batch runs - Shows 'Highlights ready' (green) for 4 seconds when batch completes - Fixed position top-left corner, auto-dismisses	2026-04-29 10:00:54 +08:00
Woody	1c490ce2fa	fix: inline citations now upgrade to highlighted view (Phase 5.4) - Added sub_question_text to frontend SourceMetadata type - SubQuestionSection enriches sources with parent sub-question text - buildCitationUrl routes to highlight page when sub_question_text present - processCitations threads highlightReadyKeys through inline citations	2026-04-29 09:54:40 +08:00
Woody	c632b9ea3b	feat: cited source extraction, background batch trigger, and View PDF link upgrade (Phase 5.4.6-5.4.8) - citationParser.ts: extractCitedSources() parses answer text for [citations], resolves against SourceMetadata, returns deduplicated cited sources - ResponsePanel.tsx: useEffect fires POST /api/v1/v2/highlights/batch after answer renders; View PDF link upgrades in-place to highlighted HTML when batch completes; stays as raw PDF on failure - Updated plan: LLM-based relevance detection, eager background computation, single batched LLM call, sqlite cache, regex sentence splitter - 45 frontend tests: 28 citationParser + 17 ResponsePanel (including 4 new sub-question highlight tests)	2026-04-29 09:27:04 +08:00
Woody	a56f8f69e2	feat: add highlight batch and GET endpoints (Phase 5.4.5) - POST /api/v1/v2/highlights/batch: compute and cache highlights for cited chunks - GET /api/v1/v2/highlights: serve cached highlighted HTML pages - chunks.py router registered in main.py - Dynamic DB path computation (prompts.db -> highlights.db), no Settings changes - 7 endpoint tests: POST 200/422, GET 200/404, mock service verification	2026-04-29 09:26:50 +08:00
Woody	c6d4a38013	feat: add LLM-based batch highlight service and HTML rendering (Phase 5.4.4) - ChunkHighlightService.compute_highlights_batch(): single LLM call across all cited chunks, grouped by sub-question, with structured output - render_highlight_html(): self-contained HTML page with yellow-highlighted relevant sentences, LLM reason annotations, and View Original PDF footer - Per-target error isolation, ChromaDB miss handling, graceful degradation - 14 tests: 7 batch service + 7 HTML rendering	2026-04-29 09:26:33 +08:00
Woody	bdbc8ea1a0	feat: add SQLite highlight cache service (Phase 5.4.3) - highlight_cache.py: HighlightCache class with get/set_highlight and compute_cache_key (sha256 hash of document_id|chunk_index|sub_question) - INSERT OR REPLACE semantics, idempotent table creation - 13 tests covering round-trip, overwrite, missing keys, determinism	2026-04-29 09:26:20 +08:00
Woody	b11d31e2d1	feat: add sentence splitter and highlight data models (Phase 5.4.1-5.4.2) - sentence_splitter.py: regex-based sentence splitting for English + Chinese punctuation - highlight.py: 6 Pydantic models (ChunkHighlightTarget, HighlightBatchRequest, RelevantSentence, ChunkHighlights, HighlightBatchResult, HighlightBatchResponse) - 43 tests: 13 sentence splitter + 30 model validation	2026-04-29 09:26:06 +08:00
Woody	ec3b5a4ae1	docs: mark Phase 5.3 complete in enhancement plan Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 17:33:00 +08:00
Woody	25b26c9b48	feat(ingest): generate per-chunk PDFs for DOCX/TXT documents (Phase 5.3) DOCX and TXT ingestion now produces chunk_file_path + per-chunk PDF files matching the PDF ingestion flow. Uses reportlab to render chunk text as simple PDFs with automatic text wrapping. - Add reportlab==4.2.5 to requirements.txt - New utils/text_to_pdf.py: generate_text_pdf() renders chunk text as PDF - Ingest router DOCX/TXT branches: generate chunk_N.pdf per chunk, store in chunk_file_paths - Graceful degradation: chunk_file_path stays None if PDF generation fails - Update test_phase1_ingest_page_aware.py assertions: DOCX chunks now HAVE chunk_file_path - New test_phase5_docx_pdf_generation.py: 5 tests (DOCX PDF gen, TXT PDF gen, PDF regression, file count, graceful degradation) - 361 backend tests pass (4 pre-existing embedding failures unrelated) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>	2026-04-28 17:32:22 +08:00
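
The delta scheme described in commit 78d1f8cc91 (delta = text[len(prev_text):], utterance boundaries tracked by item_id, stash-only events skipped) can be illustrated roughly as follows. This is a minimal sketch and not the repository's code: the class name `DeltaTracker`, its `update` method, and the assumption that each event arrives as an `(item_id, text)` pair are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeltaTracker:
    """Hypothetical helper: turns a monotonically growing 'text' field into suffix deltas."""

    prev_text: str = ""
    prev_item_id: Optional[str] = None

    def update(self, item_id: str, text: str) -> Optional[str]:
        """Return the new characters for this event, or None if there is nothing to send."""
        # Stash-only events carry an empty text field and are skipped.
        if not text:
            return None

        separator = ""
        if item_id != self.prev_item_id:
            # New utterance: reset the baseline and separate it from the
            # previous utterance with a single space.
            if self.prev_item_id is not None:
                separator = " "
            self.prev_item_id = item_id
            self.prev_text = ""

        # Simple suffix diff, no merge logic: the text only ever grows.
        delta = text[len(self.prev_text):]
        self.prev_text = text
        return separator + delta if delta else None


def accumulate(events: list[tuple[str, str]]) -> str:
    """Mimic the frontend side: append each delta to a local transcript."""
    tracker = DeltaTracker()
    transcript = ""
    for item_id, text in events:
        delta = tracker.update(item_id, text)
        if delta is not None:
            transcript += delta
    return transcript
```

A receiver that appends every returned delta reconstructs the full transcript without the server resending earlier text, which is the property the commit message attributes to the suffix-diff approach.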



178 Commits

All Branches