Without rehype-raw, ReactMarkdown escaped the raw <mark> HTML injected
by highlightTerms(), showing literal tags instead of yellow highlights.
Now 30 marks render with correct bg-yellow-200 (#FEF08A) background.
DashScope realtime ASR sends utterance-completed (final) events
without incremental deltas. The onmessage handler cleared
partialTranscript on every final, so text never appeared.
Set partialTranscript to full_text on final messages instead
of clearing it, keeping the transcript visible in QueryInput.
useMediaStreamASR cleanup() cleared partialTranscript on stop,
causing live ASR text to vanish from QueryInput. Unlike video
ASR (which has onFinalTranscript to persist via queryText),
mic and system-audio hooks rely on partialTranscript for
display. Keep partialTranscript populated with the final
transcript instead of clearing it.
Adds two new live audio sources alongside file Upload:
- System Audio: getDisplayMedia() captures system/tab audio output,
pipes through WebSocket → DashScope realtime ASR → RAG.
- Listen Mic: getUserMedia() captures microphone input via the same
audio pipeline (shared useMediaStreamASR hook).
Backend: feature toggles (system_audio_enabled, mic_enabled) in
config.py, source query param gating in ws_asr.py, 10 config tests.
Bug fix: getDisplayMedia() rejected video:false per W3C spec —
changed to video:true then stop video tracks to allow audio-only
capture on Windows/macOS Chrome.
isDisabled, handleSubmit, and Half Question onClick all checked
question.trim() instead of displayValue.trim(). Since question state
is only updated on onFinalTranscript (complete sentences), interim
ASR delta text shown in the textarea via partialText was invisible
to the disabled check — buttons stayed disabled until sentence end.
Fix: use displayValue which includes partialText when user hasn't typed.
- Backend: add stop_after_decompose flag to QueryRequest, early-return
after decomposition in SSE stream with half_question:true event
- Frontend: add decomposeOnly method to useQueryDocumentStream hook
- QueryInput: remove grey italic from ASR partial text, rename Submit
to Final Submit, add gray Half Question button that decomposes
without clearing querybox text
- LTTPage: wire handleHalfQuestion to decomposeOnly
Reverts commits 284028b through b4096d6. Phase 4 (System Audio Capture)
will replace the YouTube use case with a more versatile getDisplayMedia approach.
Removed: YouTube router, HLS proxy, YouTubeService, YouTubeInput,
YouTubeVideoPlayer, useYouTubeASR hook, all Phase 3 tests, hls.js dep,
YouTube config fields, YouTube README/plan sections.
Modified files restored to pre-Phase-3 state: LTTPage (no source toggle),
api.ts (no YouTube extract), types (no YouTube types), config.py (no
youtube fields), main.py (no YouTube router), requirements.txt (no yt-dlp),
.env.example (no YouTube vars), package.json (no hls.js).
Relevant Phase 2 code preserved: ws_asr.py (unchanged), useVideoASR,
VideoPlayer, VideoUpload, QueryInput, Full Transcript.
Replace full_text responses with character-level deltas computed from
DashScope's monotonically-growing 'text' field. Stash-only events (empty
text) are skipped; trailing stash chars sent alongside deltas and
appended on pause to complete final sentences.
Backend:
- Delta = text[len(prev_text):] — simple suffix diff, no merge logic
- Track item_id for utterance boundaries, prepend space separator
- Send stash alongside delta for frontend pause handler
Frontend:
- Accumulate deltas locally (transcriptRef += msg.delta)
- Store lastStashRef from each message
- On pause: append stash to text, fire onFinalTranscript
Plan: .plans/phase2_enhancement_delta_sse.md updated to Complete
DashScope stashes are ~7-char rolling windows, not cumulative. Each partial
event replaces the previous. Completed events rarely sent. This caused text to
jump/replace during streaming and disappear on pause.
Backend:
- Add _merge_stash() — finds overlapping suffix between successive stashes
and appends only new characters, reconstructing full utterance from partials
- format_transcription_event returns raw stash for read_events to merge
- read_events maintains partial_buffer via _merge_stash, clears on completed
- Guard against empty/whitespace-only stashes
Frontend:
- transcriptRef + onFinalTranscriptRef avoid stale closures in pause handler
- stopStreaming fires onFinalTranscript(currentText) before clearing partial
- Removed blind setPartialTranscript('') that erased text on pause
Tests: 16/16 ws_protocol tests pass, frontend tests unchanged
Plan: Updated phase2_implementation_plan.md to Complete with 11-bug log
- Add highlight_prompt, highlight_response, highlight_time_ms to QueryHistoryDetail type
- Add 'Highlights' bar segment with pink color in TimingBar component
- Pass highlightTimeMs to TimingBar in HistoryCard expanded view
- Add collapsible sections for highlight prompt and response in HistoryCard detail
- Vite dev server doesn't proxy /api/v1/v2/ paths to backend
- Changed fetch URL and getHighlightUrl to use http://localhost:8000
- Fixed inline citation highlight URLs in buildCitationUrl
- Cleaned up debug code
- Added sub_question_text to frontend SourceMetadata type
- SubQuestionSection enriches sources with parent sub-question text
- buildCitationUrl routes to highlight page when sub_question_text present
- processCitations threads highlightReadyKeys through inline citations
- citationParser.ts: extractCitedSources() parses answer text for [citations],
resolves against SourceMetadata, returns deduplicated cited sources
- ResponsePanel.tsx: useEffect fires POST /api/v1/v2/highlights/batch after
answer renders; View PDF link upgrades in-place to highlighted HTML when
batch completes; stays as raw PDF on failure
- Updated plan: LLM-based relevance detection, eager background computation,
single batched LLM call, sqlite cache, regex sentence splitter
- 45 frontend tests: 28 citationParser + 17 ResponsePanel (including 4 new
sub-question highlight tests)
LLMs may cite chunks from one sub-question's context inside another sub-question's answer section. Previously, processCitationsForSubq only looked up the current sub-question's sources, leaving cross-referenced citations unlinked. Now SubQuestionSection passes all sub-question sources and uses processCitations with a combined flat lookup.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The per-sub-question filter was all-or-nothing: if the LLM returned
9 scores for 10 chunks (common with qwen3.5-35b), every chunk was
discarded and the user got 'no relevant information found'.
Now: fewer scores → pad with 0.0; more scores → truncate. Changed
from error→warning since this is recoverable.
Also improve LTT page UI: sources collapsed by default in per-sub-q
sections, and the 'Your question' text now shows the full question
instead of being truncated.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Break the hardcoded per-sub-q filter prompt into 3 editable PromptService templates (filter_intro, filter_section, filter_outro) with placeholders for the for-loop iteration pattern. Refactor RelevanceFilter._build_per_subq_prompt() to compose them at runtime, falling back to built-in defaults when PromptService is unavailable.
Fix two latent bugs from Package 4:
- generate_per_subq was called by rag.py but never added to _VALID_STEPS or DB seed (would ValueError at runtime)
- _SEED_GENERATE placeholder mismatch: flat generate_response() expects {question}/{context} but Package 4 changed it to {context_sections}. Restored flat template; generate_per_subq now holds {context_sections}.
Add database backfill migration in seed_default_profiles() to INSERT OR IGNORE missing steps into existing profile rows, ensuring all 7 steps exist on restart.
Restructure System Prompts UI: remove unused flat filter/generate steps, replace with Step 2.1-2.3 (filter_intro/section/outro) and Step 3 (generate_per_subq). Update PlaceholderDocs with {context_sections}, {subq_idx}, {subq_question}.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
6 stream state tests for completed, backward compat, generating_subquestion, null sources, lifecycle, and reset. 7 type tests for SubQuestionSources shape, discriminated union narrowing, and QueryResponse. 8 ResponsePanel tests for per-subq sections, source scoping, backward compat flat rendering, copy button, skeletons, and empty states. 7 citationParser tests for per-subq lookup, cross-section isolation, and processCitationsForSubq. 2 e2e tests for per-subq SSE display and progressive generation.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Redesign ResponsePanel with SubQuestionSections component that parses answer markdown on ## Sub-question N: boundaries and renders per-sub-question cards with headers, ReactMarkdown body, and collapsible sources scoped to each section. Extract FlatResponse for backward compatibility when subQuestionSources is null. Add buildCitationLookupForSubq and processCitationsForSubq for per-sub-question citation lookup scope isolation. Add anchor links in ExtractedQuestionsDisplay that smooth-scroll to matching ResponsePanel sections. Pass subQuestionSources through LTTPage.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Add SubQuestionSources interface for grouped per-sub-question sources. Convert QueryStreamEvent from flat interface to discriminated union with 7 variants including generating_subquestion and completed with sub_question_sources. Add subQuestionSources to QueryStreamState. Update completed event handler to populate subQuestionSources. Make sub_question_sources optional for backward compatibility with old SSE format.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Root cause: PromptEditor useEffect synced localPrompts back to match prompts after every keystroke, making isDirty() always false.
- Delegate disabled control to parent via hasChanges prop (no local sync)
- Derive currentPrompts synchronously to avoid empty-textarea flash
- Add key={selectedProfile} for clean remount on profile switch
- Update PromptEditor tests for new hasChanges prop
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Convert /query endpoint from synchronous JSON to Server-Sent Events (SSE)
streaming. The frontend now receives extracted_questions as soon as the
first LLM call completes, without waiting for retrieval, filtering, and
answer generation.
Backend:
- Add StreamingQueryEvent union type (Decomposed, Retrieving, Filtering,
Generating, Completed, Error)
- Convert /query to return StreamingResponse with SSE format
- Yield events after each pipeline phase
Frontend:
- Add queryDocumentStream() using fetch + ReadableStream
- Add useQueryDocumentStream() hook with phase-aware state
- Update LTTPage to use streaming instead of mutation
- Update ResponsePanel to show phase messages (Searching documents...,
Filtering passages..., Generating answer...)
- Update ExtractedQuestionsDisplay to accept null
Tests:
- Update query_flow e2e test to mock queryDocumentStream
- 84/85 tests pass (1 pre-existing failure from removed file-input)
Chunk PDFs like 'NEC4 ACC_page_5.pdf' contain only 1 page (the extracted
page). Passing the original page number (e.g. 5) to <Page> caused
'Invalid page request' because the file only has 1 page.
- Always render pageNumber={1} for <Page> component
- Display originalPage in UI title to show source page number
- Disable prev/next navigation since chunk PDFs are single-page
- Encode filePath with encodeURIComponent to handle spaces/special chars
- Add page-level error handling in PdfViewerPage for better diagnostics
- Show PDF URL in error message to aid debugging