- Backend: add stop_after_decompose flag to QueryRequest, early-return
after decomposition in SSE stream with half_question:true event
- Frontend: add decomposeOnly method to useQueryDocumentStream hook
- QueryInput: remove grey italic from ASR partial text, rename Submit
to Final Submit, add gray Half Question button that decomposes
without clearing querybox text
- LTTPage: wire handleHalfQuestion to decomposeOnly
Reverts commits 284028b through b4096d6. Phase 4 (System Audio Capture)
will replace the YouTube use case with a more versatile getDisplayMedia approach.
Removed: YouTube router, HLS proxy, YouTubeService, YouTubeInput,
YouTubeVideoPlayer, useYouTubeASR hook, all Phase 3 tests, hls.js dep,
YouTube config fields, YouTube README/plan sections.
Modified files restored to pre-Phase-3 state: LTTPage (no source toggle),
api.ts (no YouTube extract), types (no YouTube types), config.py (no
youtube fields), main.py (no YouTube router), requirements.txt (no yt-dlp),
.env.example (no YouTube vars), package.json (no hls.js).
Relevant Phase 2 code preserved: ws_asr.py (unchanged), useVideoASR,
VideoPlayer, VideoUpload, QueryInput, Full Transcript.
Replace full_text responses with character-level deltas computed from
DashScope's monotonically-growing 'text' field. Stash-only events (empty
text) are skipped; trailing stash chars sent alongside deltas and
appended on pause to complete final sentences.
Backend:
- Delta = text[len(prev_text):] — simple suffix diff, no merge logic
- Track item_id for utterance boundaries, prepend space separator
- Send stash alongside delta for frontend pause handler
Frontend:
- Accumulate deltas locally (transcriptRef += msg.delta)
- Store lastStashRef from each message
- On pause: append stash to text, fire onFinalTranscript
Plan: .plans/phase2_enhancement_delta_sse.md updated to Complete
DashScope stashes are ~7-char rolling windows, not cumulative. Each partial
event replaces the previous. Completed events rarely sent. This caused text to
jump/replace during streaming and disappear on pause.
Backend:
- Add _merge_stash() — finds overlapping suffix between successive stashes
and appends only new characters, reconstructing full utterance from partials
- format_transcription_event returns raw stash for read_events to merge
- read_events maintains partial_buffer via _merge_stash, clears on completed
- Guard against empty/whitespace-only stashes
Frontend:
- transcriptRef + onFinalTranscriptRef avoid stale closures in pause handler
- stopStreaming fires onFinalTranscript(currentText) before clearing partial
- Removed blind setPartialTranscript('') that erased text on pause
Tests: 16/16 ws_protocol tests pass, frontend tests unchanged
Plan: Updated phase2_implementation_plan.md to Complete with 11-bug log
- Add DashScope ASR and video upload config fields to Settings
- Create Pydantic models (video.py, asr.py)
- Create VideoService with validation, save, serve, delete
- Create ASR client stub with float32_to_s16le utility
- Implement POST /api/v1/video/upload with streaming validation
- Implement GET /api/v1/video/{video_id} with FileResponse
- Create WebSocket ASR endpoint stub
- Register new routers in main.py
- Update .env.example and requirements.txt
- Add reference examples for DashScope integration
- 8 tests passing (3 config + 5 video upload)
- Mark Phase 5.4 complete with actual commit log
- Add Phase 5.4 completion checklist (15 items all checked)
- Add production notes (Vite proxy, port conflicts, cache location)
- Update test counts to current (108 backend, 45 frontend, 153 total)
- Update Decision #12 to reflect inline citation link upgrade
- citationParser.ts: extractCitedSources() parses answer text for [citations],
resolves against SourceMetadata, returns deduplicated cited sources
- ResponsePanel.tsx: useEffect fires POST /api/v1/v2/highlights/batch after
answer renders; View PDF link upgrades in-place to highlighted HTML when
batch completes; stays as raw PDF on failure
- Updated plan: LLM-based relevance detection, eager background computation,
single batched LLM call, sqlite cache, regex sentence splitter
- 45 frontend tests: 28 citationParser + 17 ResponsePanel (including 4 new
sub-question highlight tests)
Break the hardcoded per-sub-q filter prompt into 3 editable PromptService templates (filter_intro, filter_section, filter_outro) with placeholders for the for-loop iteration pattern. Refactor RelevanceFilter._build_per_subq_prompt() to compose them at runtime, falling back to built-in defaults when PromptService is unavailable.
Fix two latent bugs from Package 4:
- generate_per_subq was called by rag.py but never added to _VALID_STEPS or DB seed (would ValueError at runtime)
- _SEED_GENERATE placeholder mismatch: flat generate_response() expects {question}/{context} but Package 4 changed it to {context_sections}. Restored flat template; generate_per_subq now holds {context_sections}.
Add database backfill migration in seed_default_profiles() to INSERT OR IGNORE missing steps into existing profile rows, ensuring all 7 steps exist on restart.
Restructure System Prompts UI: remove unused flat filter/generate steps, replace with Step 2.1-2.3 (filter_intro/section/outro) and Step 3 (generate_per_subq). Update PlaceholderDocs with {context_sections}, {subq_idx}, {subq_question}.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Convert /query endpoint from synchronous JSON to Server-Sent Events (SSE)
streaming. The frontend now receives extracted_questions as soon as the
first LLM call completes, without waiting for retrieval, filtering, and
answer generation.
Backend:
- Add StreamingQueryEvent union type (Decomposed, Retrieving, Filtering,
Generating, Completed, Error)
- Convert /query to return StreamingResponse with SSE format
- Yield events after each pipeline phase
Frontend:
- Add queryDocumentStream() using fetch + ReadableStream
- Add useQueryDocumentStream() hook with phase-aware state
- Update LTTPage to use streaming instead of mutation
- Update ResponsePanel to show phase messages (Searching documents...,
Filtering passages..., Generating answer...)
- Update ExtractedQuestionsDisplay to accept null
Tests:
- Update query_flow e2e test to mock queryDocumentStream
- 84/85 tests pass (1 pre-existing failure from removed file-input)
Mark sub-phases 2.1 (Remove Upload) and 2.2 (Question Display) as done.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add page_number and chunk_file_path to SourceMetadata model and query router
- Add GET /chunks/{file_path}/pdf endpoint with path traversal protection
- Add View PDF links in ResponsePanel source cards and ChunkList component
- Update TypeScript types and API helper for chunk PDF URLs
- Add backend tests (5) and frontend ChunkList tests (7)
- Update enhancement plan: all 3 features complete
Sub-phase 1.5.3: Full RAG Database page with document listing, expandable chunk viewer, delete with confirmation, and document upload. Adds TypeScript types, API functions, TanStack Query hooks (useQuery + useMutation with cache invalidation), and three new components (DocumentList, ChunkList, DocumentUpload).
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Mark sub-phase 1.5.2 (backend CRUD) as complete. Update acceptance criteria, risk mitigations, and test plan.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add react-router-dom with NavBar component (LTT + RAG Database tabs)
- Extract AppContent into LTTPage, add RAGDatabasePage placeholder
- Refactor App.tsx to BrowserRouter + Routes layout
- Switch ResponsePanel to react-markdown for rich formatting
- Fix ResponsePanel test for markdown rendering
- Update RAG prompt to cite source name instead of number
- Save Phase 1 enhancement plan (.plans/phase1_enhancement_plan.md)
- Mark Phase 1.1 and 1.2 as complete with test results
- Update acceptance criteria checklist
- Add Services Status table showing implemented/pending components
- Mark Phase 1.3 tasks that are already done (LLM client, retrieval, response gen)