# Phase 3: YouTube Live Stream Proxy → ASR → RAG — Implementation Plan

**Created:** 2026-05-09
**Updated:** 2026-05-09 (Phase 3 complete — all 7 sub-phases done)
**Status:** ✅ Complete
**Depends on:** Phase 1 (Complete), Phase 2 (Complete)

---

## 1. Overview

Phase 3 adds YouTube live stream (and VOD) playback as an alternative to file upload. The end-to-end flow:

1. The user pastes a YouTube URL.
2. The backend extracts stream URLs via yt-dlp (separate video-only and audio-only tracks for VODs; combined HLS for live).
3. The backend proxies HLS manifests and `.ts` segments with zero re-encoding.
4. The frontend plays video in `<video>` via hls.js and routes audio through a hidden `<audio>` element into `AudioContext.createMediaElementSource(audioElement)`.
5. The audio feeds the existing ASR pipeline (WebSocket → DashScope).
6. The transcript flows into QueryInput and on to the Phase 1 RAG pipeline.

**The same code works identically for live streams and VODs.**

### Why Full Proxy (Not iframe)

YouTube's official iframe player does not expose the audio track to the Web Audio API because of cross-origin restrictions. We proxy HLS manifests and segments through our backend so the browser treats them as same-origin.

### Audio Routing

```
YouTube HLS stream (combined video+audio for live; separate tracks for VOD)
  → hls.js loads into <video> (muted) and hidden <audio> element
  → AudioContext.createMediaElementSource(audioElement)
  → ScriptProcessorNode (Float32 PCM)
  → WebSocket → FastAPI → DashScope realtime ASR
  → transcript → QueryInput
```

Note: For VODs, separate video-only and audio-only tracks are used. For live streams, YouTube provides combined formats only — the same HLS manifest URL is used for both elements; hls.js demuxes them independently.

### Integration With Existing Pipeline

This phase reuses the existing ASR infrastructure entirely:

- `useVideoASR.ts` AudioContext graph pattern → adapted for YouTube audio element
- `ws_asr.py` WebSocket → DashScope proxy → unchanged
- `QueryInput.tsx` transcript display → unchanged
- `LTTPage.tsx` layout → minor addition (source toggle)
- RAG pipeline → unchanged

---

## 2. User Flow

1. User selects "YouTube" source (instead of "Upload")
2. User pastes YouTube URL → clicks "Load Stream"
3. Backend extracts stream URLs → thumbnail shown as placeholder; video loads behind the scenes
4. User presses play → video appears, audio routes to ASR pipeline (no auto-play)
5. Real-time ASR transcription begins automatically on play
6. Transcript flows into QueryInput → user can edit while streaming continues
7. User pauses/stops → transcript stays, user edits and submits → RAG answer
8. **"Full Transcript" button hidden for YouTube source** — real-time streaming ASR only
9. **If HLS stream fails**: auto-retry up to 3 times with re-extracted URL → after 3 failures, show "Live stream unavailable" error

---

## 3. Sub-Phases

### Phase 3.1 — Configuration & Infrastructure Setup ✅ Complete

Add config fields, install dependencies, create skeletons, register router.

**Test:** `test_phase3_config.py` (11 tests)

**Tasks:**

| # | Task | File | Status |
|---|------|------|--------|
| 3.1.1 | Add config fields: `youtube_proxy_enabled`, `yt_dlp_timeout`, `yt_dlp_cache_ttl` | `core/config.py` | Done |
| 3.1.2 | Update `.env.example` | `.env.example` | Done |
| 3.1.3 | Add deps: `yt-dlp>=2024.0.0` to `requirements.txt`, `hls.js@^1.5.0` to `package.json` | `requirements.txt`, `package.json` | Done |
| 3.1.4 | Create `models/youtube.py` — `YouTubeExtractRequest`, `YouTubeStreamResponse`, `StreamFormat` | `models/youtube.py` | Done |
| 3.1.5 | Create `services/youtube_service.py` stub | `services/youtube_service.py` | Done |
| 3.1.6 | Create `services/hls_proxy.py` stub | `services/hls_proxy.py` | Done |
| 3.1.7 | Create `routers/youtube.py` stub: `POST /youtube/extract`, `GET /youtube/proxy/{stream_type}/{path}` | `routers/youtube.py` | Done |
| 3.1.8 | Register router in `main.py` | `main.py` | Done |
| 3.1.9 | Write and pass `test_phase3_config.py` | `app/test/` | Done (11/11 pass) |

---

### Phase 3.2 — YouTube URL Extraction Backend ✅ Complete

yt-dlp wrapper service that extracts stream URLs and formats. Returns proxy-wrapped URLs pointing back to our HLS proxy.

**Test:** `test_phase3_youtube_extract.py` (18 tests)

**Acceptance Criteria:**

- `POST /api/v1/youtube/extract` accepts `{"url": "https://www.youtube.com/watch?v=..."}`
- Returns `{ video_id, title, is_live, video_proxy_url, audio_proxy_url, thumbnail_url, formats, error }`
- VODs: extracts separate video-only + audio-only tracks, selects best ≤480p + highest-bitrate audio
- Live streams: extracts combined HLS formats, uses same URL for video and audio (hls.js demuxes)
- Upcoming/scheduled streams: returns `is_upcoming: true` with no proxy URLs
- Invalid/private URLs: returns 200 with error field populated (yt-dlp exception caught)
- URL expiration: in-memory cache with TTL (5 min for live, 30 min for VOD)
- Service singleton: `@lru_cache` on `_get_youtube_service()` for cache persistence across requests

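The TTL rule and the singleton requirement in the criteria above can be sketched with the standard library alone. This is a minimal illustration, not the project's `YouTubeService` code: `TTLCache`, `get_cache()`, and the injectable `clock` parameter are hypothetical names, and only the 5-minute and 30-minute lifetimes come from the criteria.

```python
import time
from functools import lru_cache
from typing import Any, Callable, Dict, Optional, Tuple

LIVE_TTL_SECONDS = 5 * 60    # live stream URLs: 5 minutes
VOD_TTL_SECONDS = 30 * 60    # VOD URLs: 30 minutes


class TTLCache:
    """Hypothetical in-memory cache keyed by video ID; entries expire after a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, is_live: bool) -> None:
        ttl = LIVE_TTL_SECONDS if is_live else VOD_TTL_SECONDS
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller re-extracts.
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)


@lru_cache(maxsize=1)
def get_cache() -> TTLCache:
    """Return one shared cache so entries survive across requests."""
    return TTLCache()
```

Injecting the clock keeps expiry testable without sleeping, and `invalidate()` is the hook a caller would use to force a fresh extraction.
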
**Implementation Discoveries:**

- **No iOS client needed** — default yt-dlp works for both VOD (separate tracks) and live (combined HLS)
- **Live streams use combined formats** — all live formats include both video+audio; same HLS URL serves both `<video>` and `<audio>` elements
- **Format selection** (`_pick_best_video`): prefers ≤480p with HLS first, then falls back to ascending height + HLS preference
- **Error response pattern**: extraction errors return HTTP 200 with `error` field (not 4xx); the API call itself succeeds but YouTube returned an error
- **Proxy URL construction** (`_build_proxy_url`): URL-encodes upstream URL into `/api/v1/youtube/proxy/manifest.m3u8?url=<encoded>`

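One reading of the `_pick_best_video` rule is sketched below. The dictionary keys (`vcodec`, `height`, `protocol`) follow the fields found in yt-dlp format entries, but the function body, the `is_hls` helper, and the tie-breaking in the fallback are assumptions about how the stated preference order could be implemented.

```python
from typing import Any, Dict, List, Optional

MAX_HEIGHT = 480  # quality cap stated in the plan

Format = Dict[str, Any]


def is_hls(fmt: Format) -> bool:
    """True when the format is delivered over HLS (protocol starts with 'm3u8')."""
    return str(fmt.get("protocol", "")).startswith("m3u8")


def pick_best_video(formats: List[Format]) -> Optional[Format]:
    """Pick a video format: best HLS at or below 480p, else the lowest available."""
    # Keep only entries that actually carry video and report a height.
    videos = [
        f for f in formats
        if f.get("vcodec") not in (None, "none") and f.get("height")
    ]
    if not videos:
        return None

    # Preferred: HLS formats at or below the cap, highest resolution first.
    capped_hls = [f for f in videos if f["height"] <= MAX_HEIGHT and is_hls(f)]
    if capped_hls:
        return max(capped_hls, key=lambda f: f["height"])

    # Fallback: ascending height, with HLS winning ties at the same height.
    return min(videos, key=lambda f: (f["height"], not is_hls(f)))
```

For a live stream, where every format is combined, the same function would apply unchanged because combined formats also report a video codec and a height.
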
**Real-URL Verification:**

```
VOD:  https://www.youtube.com/watch?v=5bF3tkO5jAA → 24 formats, separate video+audio ✓
Live: https://www.youtube.com/watch?v=fN9uYWCjQaw → 6 combined formats, same URL ✓
```

**Tasks:**

| # | Task | File | Status |
|---|------|------|--------|
| 3.2.1 | Write tests first | `app/test/test_phase3_youtube_extract.py` | Done |
| 3.2.2 | Implement `YouTubeService.extract_streams()` — yt-dlp wrapper with format selection | `services/youtube_service.py` | Done |
| 3.2.3 | Implement `YouTubeService._select_best_formats()` + `_pick_best_video()` — separate video/audio from format list, prefer ≤480p, combined fallback | `services/youtube_service.py` | Done |
| 3.2.4 | Implement format URL caching with TTL (live 5 min, VOD 30 min) | `services/youtube_service.py` | Done |
| 3.2.5 | Implement `POST /api/v1/youtube/extract` route with response model + error handling | `routers/youtube.py` | Done |
| 3.2.6 | Run tests → pass → verified with real URLs | — | Done (82/82 pass) |

---

### Phase 3.3 — HLS Proxy Backend ✅ Complete

Proxy service that rewrites HLS manifests and proxies `.ts` segments. Uses `StreamingResponse` for minimal latency.

**Reference:** mediaflow-proxy M3U8Processor pattern (line-by-line streaming, URL rewriting)

**Tests:** `test_phase3_hls_proxy.py`, `test_phase3_hls_manifest.py` (22 tests total)

**Acceptance Criteria:**

- `GET /api/v1/youtube/proxy/manifest.m3u8?url=<encoded>` — fetches upstream manifest, rewrites all segment/sub-manifest URLs to point back to our proxy, streams response
- `GET /api/v1/youtube/proxy/segment.ts?url=<encoded>` — fetches upstream .ts segment, proxies with correct Content-Type (`video/mp2t`) and CORS headers
- Lines rewritten: segment URIs, sub-manifest URIs, `#EXT-X-KEY:URI=`, absolute URLs
- Lines passed through: `#EXTINF:`, `#EXT-X-TARGETDURATION`, `#EXT-X-MEDIA-SEQUENCE`, `#EXT-X-STREAM-INF`, comments
- Client disconnect → upstream connection closed cleanly
- CORS headers on every response: `access-control-allow-origin: *`
- **Upstream failure → HTTP 502 with error detail; frontend retries up to 3 times with fresh URL before showing "Service unavailable"**

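The rewrite rules in the criteria above can be illustrated with a small generator that processes one manifest line at a time. This is a sketch under stated assumptions rather than the `HLSProxyService.rewrite_manifest()` implementation: the `to_proxy_url` helper, the `.m3u8`-suffix test for telling sub-manifests from segments, and the choice to rewrite every `URI="..."` attribute through the segment route are simplifications introduced here.

```python
import re
from typing import Iterable, Iterator
from urllib.parse import quote, urljoin

PROXY_PREFIX = "/api/v1/youtube/proxy"
URI_ATTR = re.compile(r'URI="([^"]+)"')


def to_proxy_url(absolute_url: str) -> str:
    """Wrap an absolute upstream URL in a proxy URL (route chosen by suffix)."""
    path = absolute_url.split("?", 1)[0]
    # Assumption: sub-manifests end in .m3u8; everything else is treated as a segment.
    route = "manifest.m3u8" if path.endswith(".m3u8") else "segment.ts"
    return f"{PROXY_PREFIX}/{route}?url={quote(absolute_url, safe='')}"


def rewrite_manifest(lines: Iterable[str], base_url: str) -> Iterator[str]:
    """Yield rewritten manifest lines one at a time, never buffering the file."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            # Tags and comments pass through; only a URI="..." attribute
            # (for example on #EXT-X-KEY) is pointed back at the proxy.
            yield URI_ATTR.sub(
                lambda m: 'URI="' + to_proxy_url(urljoin(base_url, m.group(1))) + '"',
                line,
            )
        else:
            # Any other non-empty line is a URI: resolve it against the
            # manifest URL, then wrap it.
            yield to_proxy_url(urljoin(base_url, line))
```

Resolving with `urljoin` before encoding handles relative and absolute URIs in the same branch, which is why the function needs the manifest's own URL as `base_url`.
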
**Tasks:**

| # | Task | File |
|---|------|------|
| 3.3.1 | Write tests first | `app/test/test_phase3_hls_proxy.py`, `app/test/test_phase3_hls_manifest.py` |
| 3.3.2 | Implement `HLSProxyService.rewrite_manifest()` — streaming line-by-line, URL detection + rewriting | `services/hls_proxy.py` |
| 3.3.3 | Implement `HLSProxyService.proxy_segment()` — httpx stream → StreamingResponse | `services/hls_proxy.py` |
| 3.3.4 | Implement `GET /api/v1/youtube/proxy/{type}/{path}` route — dispatch manifest vs segment | `routers/youtube.py` |
| 3.3.5 | Run tests → pass → commit | — |

---

### Phase 3.4 — Frontend: YouTube Input + Video Player ✅ Complete

URL input component and hls.js-based video player. Two media elements: visible `<video muted>` and hidden `<audio>` (for Web Audio API routing).

**Tests:** `test_phase3_YouTubeInput.test.tsx` (7 tests), `test_phase3_YouTubeVideoPlayer.test.tsx` (9 tests)

**Acceptance Criteria:**

- `YouTubeInput` accepts URL, validates format (youtube.com/watch, youtu.be, /live/, /shorts/), shows loading/error states
- `YouTubeVideoPlayer` uses `forwardRef<HTMLVideoElement>` (same pattern as `VideoPlayer`)
- Video HLS loaded via hls.js into `<video muted>` element, quality capped ≤480p via `capLevelsTo480()`
- Audio HLS loaded via hls.js into hidden `<audio>` element, exposed via `onAudioReady` callback
- Thumbnail displayed as placeholder until user presses play; video element replaces it on play
- Video does NOT auto-play on load (waits for manual user play)
- Loading spinner, error overlay, "LIVE" badge for live streams
- hls.js: dynamic `import('hls.js')` with fallback if not supported (SSR-safe)
- `crossOrigin="anonymous"` on both elements (required for AudioContext graph)
- No quality selector (low resolution only, sufficient for reference video)

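The URL check itself lives in the `YouTubeInput` component (TypeScript). To stay consistent with the backend sketches in this plan, the same four URL shapes are expressed below in Python, as a server-side guard might apply them. The function name `extract_video_id`, the host allow-list, and the 11-character ID pattern are assumptions for illustration.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from the URL-safe base64 alphabet.
VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for a supported YouTube URL shape, else None."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return None

    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]
    candidate = None

    if host == "youtu.be" and parts:
        # https://youtu.be/<id>
        candidate = parts[0]
    elif host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # https://www.youtube.com/watch?v=<id>
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(parts) == 2 and parts[0] in ("live", "shorts"):
            # https://www.youtube.com/live/<id> or /shorts/<id>
            candidate = parts[1]

    if candidate and VIDEO_ID.match(candidate):
        return candidate
    return None
```

Returning the ID rather than a boolean lets a caller reuse it as a cache key without parsing the URL a second time.
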
**Implementation Notes:**

- hls.js installed as npm dependency (was already in package.json from Phase 3.1)
- YouTubeVideoPlayer uses `useImperativeHandle`-style callback ref for audio element exposure
- Quality capping: on `MANIFEST_PARSED`, sets `hls.autoLevelCapping` to highest level with height ≤ 480
- Thumbnail overlay: absolute-positioned `<img>` that hides on video `onPlay` event

**Tasks:**

| # | Task | File | Status |
|---|------|------|--------|
| 3.4.1 | Write tests first | `src/test/test_phase3_YouTubeInput.test.tsx`, `src/test/test_phase3_YouTubeVideoPlayer.test.tsx` | Done |
| 3.4.2 | Add YouTube types to `types/index.ts` | `types/index.ts` | Done |
| 3.4.3 | Add API functions to `lib/api.ts` | `lib/api.ts` | Done |
| 3.4.4 | Add TanStack Query hooks to `lib/queries.tsx` | `lib/queries.tsx` | Done |
| 3.4.5 | Create `components/YouTubeInput.tsx` — URL input, validation, loading/error states | `components/YouTubeInput.tsx` | Done |
| 3.4.6 | Create `components/YouTubeVideoPlayer.tsx` — hls.js dual-element player, forwardRef, onAudioReady | `components/YouTubeVideoPlayer.tsx` | Done |
| 3.4.7 | Run tests → pass → commit | — | Done (16/16 pass) |

---

### Phase 3.5 — Integration: YouTube → ASR Pipeline ✅ Complete

Wire YouTube audio output into existing ASR pipeline. Creates `useYouTubeASR` hook (adapted from `useVideoASR`) and integrates YouTube components into `LTTPage` with a source toggle.

**Tests:** `test_phase3_useYouTubeASR.test.ts` (11 tests), `test_phase3_LTTPage_integration.test.tsx` (7 tests)

**Acceptance Criteria:**

- `useYouTubeASR` hook: accepts `audioElement` + `videoElement`, sets up AudioContext graph on mount
- `AudioContext.createMediaElementSource(audioElement)` → ScriptProcessorNode → WebSocket (same as `useVideoASR`, but audio source from `<audio>` element)
- Listens for play/pause/ended events on `videoElement` (user controls video, audio follows)
- Auto-starts ASR on play, stops on pause/end (same lifecycle as `useVideoASR`)
- Transcript flows into QueryInput (same `onFinalTranscript` + `partialTranscript` callbacks)
- QueryInput remains editable during streaming — user can type corrections while ASR appends (already worked, no changes needed)
- "Full Transcript" button hidden when YouTube source is active
- Source toggle: "Upload" / "YouTube" tabs at top of upper-left panel
- Switching between "Upload" and "YouTube" sources clears previous YouTube state
- Upload video state preserved when switching to YouTube and back

**Implementation Notes:**

- Both `useVideoASR` and `useYouTubeASR` initialized unconditionally at top of LTTPage
- Hooks gracefully handle null elements (AudioContext setup aborts early if element is null)
- Unified `asr` variable: `const asr = source === 'youtube' ? youtubeASR : uploadASR`
- Source toggle uses `Upload`/`Youtube` icons from lucide-react, blue active / gray inactive state
- `QueryInput.tsx` — zero changes needed (already supports `partialText` + `value` from any source)
- `YouTubeVideoPlayer` exposes audio element via `onAudioReady` callback → LTTPage wires to `useYouTubeASR`

**Tasks:**

| # | Task | File | Status |
|---|------|------|--------|
| 3.5.1 | Write tests first | `src/test/test_phase3_useYouTubeASR.test.ts` | Done (11 tests) |
| 3.5.2 | Create `hooks/useYouTubeASR.ts` | `hooks/useYouTubeASR.ts` | Done |
| 3.5.3 | Update `QueryInput.tsx` | `components/QueryInput.tsx` | Done (no-op: already works) |
| 3.5.4 | Update `LTTPage.tsx` — source toggle, wire components | `pages/LTTPage.tsx` | Done |
| 3.5.5 | Create LTTPage integration test | `src/test/test_phase3_LTTPage_integration.test.tsx` | Done (7 tests) |
| 3.5.6 | Run tests → pass → commit | — | Done (189/189 pass) |

---

### Phase 3.6 — Integration & Acceptance Testing ✅ Complete

**Tests:** `test_integration_phase3.py` (6 tests), `test_acceptance_phase3_youtube.py` (3 tests), `test_acceptance_phase3_live.py` (3 tests)

**Integration test** (`backend/app/test/test_integration_phase3.py`):

- `TestExtractAndProxyFlow` — full extract→proxy flow (VOD manifest, VOD segment, live manifest), cache hit verification
- `TestProxyAfterExtract` — upstream manifest unavailable after extract → 502
- `TestExtractDisabled` — extract returns 503 when `youtube_proxy_enabled=false`
- Mocked yt-dlp, real FastAPI TestClient, real HLSProxyService

**Acceptance test VOD** (`backend/app/test/acceptance/test_acceptance_phase3_youtube.py`):

- Real YouTube VOD extraction and proxy verification
- Manifest proxy → verify M3U8 structure and CORS
- Segment proxy → follow master→variant→segment chain, verify MPEG-TS data
- Skips gracefully if `YOUTUBE_TEST_VOD_URL` not set

**Acceptance test live** (`backend/app/test/acceptance/test_acceptance_phase3_live.py`):

- Real YouTube live extraction (`is_live=True`, combined formats)
- Live manifest proxy → verify no `#EXT-X-ENDLIST`
- Cache refresh verification (same video_id on re-extract)
- Skips gracefully if `YOUTUBE_TEST_LIVE_URL` not set or stream offline

**How to run acceptance tests:**

```bash
cd backend && YOUTUBE_TEST_VOD_URL="https://www.youtube.com/watch?v=5bF3tkO5jAA" \
  YOUTUBE_TEST_LIVE_URL="https://www.youtube.com/watch?v=fN9uYWCjQaw" \
  python -m pytest app/test/acceptance/test_acceptance_phase3_youtube.py \
    app/test/acceptance/test_acceptance_phase3_live.py -v -m acceptance
```

**Tasks:**

| # | Task | Status |
|---|------|--------|
| 3.6.1 | Integration test (mocked yt-dlp, real httpx + HLSProxyService) | Done (6 tests) |
| 3.6.2 | Acceptance: real YouTube VOD → extract → proxy | Done (3 tests) |
| 3.6.3 | Acceptance: real YouTube live → extract → proxy | Done (3 tests) |
| 3.6.4 | Full regression run | Done (234 pass, 1 pre-existing config mismatch) |
| 3.6.5 | Fix failures, commit | Done |

---

### Phase 3.7 — Polish & Deployment ✅ Complete

**Tasks:**

| # | Task | Status |
|---|------|--------|
| 3.7.1 | PO token expiration handling | Done — `_is_po_token_error()` detection, cache invalidation, 2 tests |
| 3.7.2 | Dockerfile — verify ffmpeg + yt-dlp | Done — ffmpeg already installed, yt-dlp in requirements.txt |
| 3.7.3 | docker-compose.yml — verify volumes/env vars | Done — no new volumes needed (in-memory only), env vars in .env.example |
| 3.7.4 | Verify production build | Done — `npm run build` succeeds (27s, hls chunk 523KB) |
| 3.7.5 | README.md — YouTube feature section | Done — added after Video Q&A section |
| 3.7.6 | development_plan.md — mark Phase 3 complete | Done — Phase 3 row added, status updated to "Phase 1-3 Complete" |
| 3.7.7 | Final commit | In progress |

**PO Token Handling:**

- `_is_po_token_error(msg)` helper detects YouTube bot-detection / PO token errors
- On detection: logs warning, invalidates URL cache (forces re-extract on next attempt)
- Graceful degradation: returns error field to frontend, which can retry
- Indicators: "sign in to confirm", "not a bot", "bot detection", "po token", "potoken"

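The detection and invalidation steps listed above amount to a case-insensitive substring check followed by a cache eviction. A minimal sketch follows; the indicator strings are taken from the list above, while `handle_extraction_error`, the plain-dict cache, and the shape of the returned payload are hypothetical stand-ins for the real service code.

```python
import logging
from typing import Dict, Optional

logger = logging.getLogger(__name__)

# Indicator substrings from the plan, matched case-insensitively.
PO_TOKEN_INDICATORS = (
    "sign in to confirm",
    "not a bot",
    "bot detection",
    "po token",
    "potoken",
)


def is_po_token_error(msg: Optional[str]) -> bool:
    """True when an error message looks like bot detection or a PO token failure."""
    lowered = (msg or "").lower()
    return any(indicator in lowered for indicator in PO_TOKEN_INDICATORS)


def handle_extraction_error(cache: Dict[str, dict], video_id: str, msg: str) -> dict:
    """Evict the cached URLs on a PO token error and return an error payload."""
    if is_po_token_error(msg):
        logger.warning("PO token / bot detection error for %s", video_id)
        # Invalidate so the next attempt re-extracts fresh URLs.
        cache.pop(video_id, None)
    # Content-level failure: reported in the error field, not as an exception.
    return {"video_id": video_id, "error": msg}
```

Other extraction errors leave the cache untouched, so only the failure mode that a fresh URL can plausibly fix triggers a re-extract.
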
**Docker/Infra:**

- Dockerfile already includes ffmpeg and all Python deps via requirements.txt
- docker-compose.yml unchanged (no new volumes or env vars needed)
- Frontend production build: 1403 modules, builds clean in ~27s

**Documentation:**

- README.md: new "YouTube Live Stream Proxy (Phase 3)" section with architecture, usage, config, limitations
- development_plan.md: Phase 3 timeline row, Phase 3 section (backend/frontend additions, design decisions)

---

## 4. Timeline

| Sub-Phase | Description | Effort | Depends On | Status |
|---|---|---|---|---|
| 3.1 | Config & Infrastructure | 0.5 day | — | ✅ Complete |
| 3.2 | YouTube URL Extraction | 0.5 day | 3.1 | ✅ Complete |
| 3.3 | HLS Proxy Backend | 1 day | 3.1 | ✅ Complete |
| 3.4 | Frontend Input + Player | 1 day | 3.2, 3.3 | ✅ Complete |
| 3.5 | YouTube → ASR Integration | 1 day | 3.4 | ✅ Complete |
| 3.6 | Integration & Acceptance | 1 day | 3.5 | ✅ Complete |
| 3.7 | Polish & Deployment | 0.5 day | 3.6 | ✅ Complete |
| **Total** | | **5.5 days** | | **7/7 done ✅** |

---

## 5. Dependencies

**Backend:** `yt-dlp>=2024.0.0` (new), `httpx>=0.26.0` (already present), `aiofiles>=24.0.0` (already present)
**Frontend:** `hls.js@^1.5.0` (new, added to `package.json` in Phase 3.1)
**System:** ffmpeg on server (already required by Phase 2)

---

## 6. Config Fields

```python
# YouTube live stream proxy (Phase 3)
youtube_proxy_enabled: bool = True
yt_dlp_timeout: int = 30  # seconds for yt-dlp extraction
yt_dlp_cache_ttl: int = 300  # seconds to cache extraction results
```

```bash
# .env.example additions
YOUTUBE_PROXY_ENABLED=true
YT_DLP_TIMEOUT=30
YT_DLP_CACHE_TTL=300
```

---

## 7. Key Design Decisions

| Decision | Choice | Why |
|---|---|---|
| Streaming protocol | HLS (m3u8) | hls.js plays it natively; DASH requires dash.js |
| yt-dlp client | **Default** (no special client) | Default extractor works for both VOD (separate tracks) and live (combined HLS); iOS client caused "No video formats" errors on some live streams |
| Live format strategy | **Combined formats, same URL** | Live HLS formats include both video+audio; same URL for `<video>` and `<audio>` elements — hls.js demuxes each independently |
| HTTP client for proxy | httpx (already present) | Streaming support via `httpx.stream()`; no new dependency |
| Manifest rewriting | Line-by-line streaming | Live manifests can be large; never buffer whole file |
| Audio element | Hidden `<audio>` + hls.js | `createMediaElementSource` works on `<audio>` elements |
| URL caching | In-memory dict with TTL | yt-dlp extraction is slow (~2-5s); reuse for 5 min live, 30 min VOD |
| Service lifetime | `@lru_cache` singleton | Cache must persist across HTTP requests for caching to work |
| Error response | **HTTP 200 with error field** | API call succeeded; YouTube error is a content-level failure, not a protocol failure |
| **Full Transcript for YouTube** | **Disabled** | Button hidden; real-time streaming ASR only |
| **QueryInput during streaming** | **Editable** | User can type corrections while transcript streams (same as existing ASR) |
| **Video quality** | **360p–480p auto-best** | Low resolution sufficient for reference; no quality selector |
| **Auto-play on load** | **Wait for manual play** | Thumbnail placeholder; user presses play. Respects autoplay policy. |
| **Thumbnail** | **Stays until user presses play** | Clean transition; no black frame |
| **Error recovery** | **Retry 3× → "Service unavailable"** | Auto-re-extract URL on HLS failure; after 3 failures, show error state |
| **PO Tokens (live streams)** | **Graceful degradation for MVP** | Stream first ~30s; on failure retry 3× with fresh URL; after exhaustion show "Live stream unavailable" |

---

## 8. File Manifest

### New Files

```
backend/
  app/models/youtube.py                                   ✅ Created (3.1)
  app/services/youtube_service.py                         ✅ Created (3.1), implemented (3.2)
  app/services/hls_proxy.py                               ✅ Stub created (3.1), implemented (3.3)
  app/routers/youtube.py                                  ✅ Created (3.1), implemented (3.2, 3.3)
  app/test/test_phase3_config.py                          ✅ Written (3.1, 11 tests)
  app/test/test_phase3_youtube_extract.py                 ✅ Written (3.2, 18 tests)
  app/test/test_phase3_hls_proxy.py                       ✅ Written (3.3)
  app/test/test_phase3_hls_manifest.py                    ✅ Written (3.3)
  app/test/test_integration_phase3.py                     ✅ Written (3.6, 6 tests)
  app/test/acceptance/test_acceptance_phase3_youtube.py   ✅ Written (3.6, 3 tests)
  app/test/acceptance/test_acceptance_phase3_live.py      ✅ Written (3.6, 3 tests)

frontend/src/
  components/YouTubeInput.tsx                             ✅ Created (3.4)
  components/YouTubeVideoPlayer.tsx                       ✅ Created (3.4)
  hooks/useYouTubeASR.ts                                  ✅ Created (3.5)
  pages/LTTPage.tsx                                       ✅ Updated (3.5)
  test/test_phase3_YouTubeInput.test.tsx                  ✅ Written (3.4, 7 tests)
  test/test_phase3_YouTubeVideoPlayer.test.tsx            ✅ Written (3.4, 9 tests)
  test/test_phase3_useYouTubeASR.test.ts                  ✅ Written (3.5, 11 tests)
  test/test_phase3_LTTPage_integration.test.tsx           ✅ Written (3.5, 7 tests)
```

### Modified Files

```
backend/app/core/config.py                ✅ Done (3 fields)
backend/.env.example                      ✅ Done (3 vars)
backend/main.py                           ✅ Done (router registered)
backend/requirements.txt                  ✅ Done (yt-dlp added)

frontend/package.json                     ✅ Done (hls.js added)
frontend/src/types/index.ts               ✅ Done (3.4)
frontend/src/lib/api.ts                   ✅ Done (3.4)
frontend/src/lib/queries.tsx              ✅ Done (3.4)
frontend/src/pages/LTTPage.tsx            ✅ Done (3.5)
frontend/src/components/QueryInput.tsx    ✅ Done (3.5 — no-op, already compatible)

Dockerfile                                ✅ Verified (3.7, no changes needed)
docker-compose.yml                        ✅ Verified (3.7, unchanged)
README.md                                 ✅ Done (3.7)
development_plan.md                       ✅ Done (3.7)
```

---

## 9. Known Risks & Mitigations

| Risk | Impact | Mitigation |
|---|---|---|
| PO Token expiration (live streams cut at 30s) | High — live streams unusable without token | Auto-re-extract on HLS failure; document cookie-based workaround; acceptance test to quantify |
| yt-dlp extraction slow (2-5s) | Medium — poor UX on "Load Stream" click | Cache results with TTL; show progress indicator |
| YouTube format changes break yt-dlp | Medium — sudden breakage | Pin yt-dlp version; CI test with known-good URLs; `pip install -U yt-dlp` in maintenance. **Note**: iOS client caused "No video formats" on Phoenix TV live stream; default extractor works for both tested URLs. Monitor for regressions. |
| hls.js audio sync drift vs video | Low — separate streams may drift | hls.js `liveSyncDuration` keeps both near live edge; test with 10+ min streams |
| Safari `createMediaElementSource` on HLS | Low — known Safari bug with native HLS | hls.js uses MSE, not native HLS — works around Safari bug; Chrome/Firefox unaffected |
| YouTube ToS for proxy | Low for internal demo | Personal/enterprise internal demo is generally fine; review for public product |

---

## 10. Example Data Flow

```
POST /api/v1/youtube/extract
Body: {"url": "https://www.youtube.com/watch?v=5bF3tkO5jAA"}
Response: {
  "video_id": "5bF3tkO5jAA",
  "title": "《2026年稅務(修訂)(自動交換資料)條例草案》委員會會議",
  "is_live": false,
  "is_upcoming": false,
  "video_proxy_url": "/api/v1/youtube/proxy/manifest.m3u8?url=https%3A%2F%2Frr2---sn-jna...",
  "audio_proxy_url": "/api/v1/youtube/proxy/manifest.m3u8?url=https%3A%2F%2Frr2---sn-jna...",
  "thumbnail_url": "https://i.ytimg.com/vi/5bF3tkO5jAA/hqdefault.jpg",
  "formats": [...],
  "error": null
}

# Live stream (combined formats → same URL for video and audio)
POST /api/v1/youtube/extract
Body: {"url": "https://www.youtube.com/watch?v=fN9uYWCjQaw"}
Response: {
  "video_id": "fN9uYWCjQaw",
  "is_live": true,
  "video_proxy_url": "/api/v1/youtube/proxy/manifest.m3u8?url=...",
  "audio_proxy_url": "/api/v1/youtube/proxy/manifest.m3u8?url=...",
  # video_proxy_url == audio_proxy_url (same combined HLS manifest)
}

GET /api/v1/youtube/proxy/manifest.m3u8?url=<encoded_upstream_m3u8>
  → Fetches upstream manifest from googlevideo.com
  → Rewrites segment URLs:
      segment_0.ts → /api/v1/youtube/proxy/segment.ts?url=<encoded_segment_url>
  → Streams rewritten manifest to browser

GET /api/v1/youtube/proxy/segment.ts?url=<encoded_upstream_ts>
  → Fetches upstream .ts segment via httpx.stream()
  → StreamingResponse with Content-Type: video/mp2t
  → CORS: access-control-allow-origin: *
```

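The `?url=<encoded>` parameter in the flow above only works if the whole upstream URL, including its own query string, is percent-encoded into a single value. The sketch below shows that round trip with `urllib.parse`; `build_proxy_url` and `upstream_from_proxy_url` are illustrative names, and the example upstream URL is made up.

```python
from urllib.parse import parse_qs, quote, urlparse

PROXY_PREFIX = "/api/v1/youtube/proxy"


def build_proxy_url(upstream_url: str, route: str = "manifest.m3u8") -> str:
    """Encode the full upstream URL into the proxy's single `url` parameter."""
    # safe="" also encodes "/", "?", "&" and "=", so the upstream query string
    # cannot be mistaken for extra parameters of the proxy URL.
    return f"{PROXY_PREFIX}/{route}?url={quote(upstream_url, safe='')}"


def upstream_from_proxy_url(proxy_url: str) -> str:
    """Recover the upstream URL on the proxy side."""
    return parse_qs(urlparse(proxy_url).query)["url"][0]
```

For example, an upstream URL of `https://example.com/hls/index.m3u8?expire=123&sig=abc` keeps both of its query parameters after the round trip; without full encoding, `sig` would be parsed as a separate proxy parameter and the upstream request would lose it.
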
---

## 11. References

- **mediaflow-proxy**: Production FastAPI HLS proxy with M3U8Processor — [mhdzumair/mediaflow-proxy](https://github.com/mhdzumair/mediaflow-proxy)
- **yt-dlp API docs**: [yt-dlp-yt-dlp.mintlify.app](https://yt-dlp-yt-dlp.mintlify.app/api/extractors)
- **hls.js API docs**: [github.com/video-dev/hls.js/blob/master/docs/API.md](https://github.com/video-dev/hls.js/blob/master/docs/API.md)
- **hls.js low-latency live**: `lowLatencyMode: true`, `liveSyncDuration: 1.5`
- **Existing code patterns**: `.plans/phase2_implementation_plan.md`, `backend/app/routers/video.py`, `frontend/src/hooks/useVideoASR.ts`

---

## 12. Test Results (Current)

| Suite | Tests | Status |
|-------|-------|--------|
| Phase 2 backend (existing) | 53 | ✅ All pass |
| Phase 2 frontend (existing) | 51 | ✅ All pass |
| Phase 3.1 (config) | 11 | ✅ All pass |
| Phase 3.2 (extraction) | 18 | ✅ All pass |
| Phase 3.3 (HLS proxy) | 22 | ✅ All pass |
| Phase 3.4 frontend (YouTube components) | 16 | ✅ All pass |
| Phase 3.5 frontend (ASR integration) | 18 | ✅ All pass |
| Phase 3.6 integration | 6 | ✅ All pass |
| Phase 3.6 acceptance (VOD) | 3 | ⏭ Skip (needs env) |
| Phase 3.6 acceptance (live) | 3 | ⏭ Skip (needs env) |
| Phase 3.7 (PO token) | 2 | ✅ All pass |
| **Total CI** | **197** | **0 failures** |

**Pre-existing failures** (not from Phase 3):

- `test_phase1_config.py::test_config_default_values` — model version mismatch (3.5 vs 3.6)
- `test_phase3_history_service.py` (13 errors) — missing `highlight_prompt` column
- `test_phase3_sqlite_db.py::test_seed_default_profiles_idempotent` — stale assertion
- `e2e/query_flow.test.tsx` (3 failures) — Phase 4 file input tests, unrelated

### Real-URL Smoke Tests

| URL | Type | Result |
|-----|------|--------|
| `5bF3tkO5jAA` (LegCo meeting) | VOD | 24 formats, separate video+audio ✅ |
| `fN9uYWCjQaw` (Phoenix TV 24h) | Live | 6 combined HLS formats, same URL ✅ |