Commit Graph

11 Commits

Author SHA1 Message Date
Woody 39525a2344 feat: add ASR provider config, abstraction layer, and OpenRouter provider
Add ASR_PROVIDER env var (dashscope|openrouter), OPENROUTER_API_KEY, and ASR_OPENROUTER_MODEL to Settings. Create ASRProvider ABC with DashScopeASRProvider (wraps existing OpenAI-based DashScope calls via run_in_executor) and OpenRouterASRProvider (httpx + tenacity retry for batch STT). Add tenacity>=8.0.0 dependency. Realtime WebSocket stays DashScope-only.

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-19 09:47:30 +08:00
Woody b05c361fbd revert: remove Phase 3 YouTube proxy — all 7 sub-phases
Reverts commits 284028b through b4096d6. Phase 4 (System Audio Capture)
will replace the YouTube use case with a more versatile getDisplayMedia approach.

Removed: YouTube router, HLS proxy, YouTubeService, YouTubeInput,
YouTubeVideoPlayer, useYouTubeASR hook, all Phase 3 tests, hls.js dep,
YouTube config fields, YouTube README/plan sections.

Modified files restored to pre-Phase-3 state: LTTPage (no source toggle),
api.ts (no YouTube extract), types (no YouTube types), config.py (no
youtube fields), main.py (no YouTube router), requirements.txt (no yt-dlp),
.env.example (no YouTube vars), package.json (no hls.js).

Relevant Phase 2 code preserved: ws_asr.py (unchanged), useVideoASR,
VideoPlayer, VideoUpload, QueryInput, Full Transcript.
2026-05-09 21:07:21 +08:00
Woody 284028bb1f feat: Phase 3.1 + 3.2 — YouTube config infra and URL extraction
Phase 3.1 — Configuration & Infrastructure:
- Add youtube_proxy_enabled, yt_dlp_timeout, yt_dlp_cache_ttl config fields
- Add yt-dlp and hls.js dependencies
- Create models/youtube.py (request/response schemas)
- Create service stubs (youtube_service, hls_proxy)
- Create router stub and register in main.py
- 11 config tests

Phase 3.2 — YouTube URL Extraction:
- yt-dlp wrapper with async extraction (run_in_executor)
- Format selection: ≤480p video-only + highest-bitrate audio (VOD)
- Combined format fallback: same URL for live streams
- In-memory URL cache: 5min TTL live, 30min VOD
- lru_cache singleton service for cache persistence
- Error handling: DownloadError → 200 with error field
- 18 extract tests, 82/82 total pass (zero regressions)

Real-URL verified: VOD (5bF3tkO5jAA) 24 formats, Live (fN9uYWCjQaw) 6 HLS
2026-05-09 15:53:04 +08:00
Woody 9934749d2b feat: Phase 2.1 config + infrastructure and 2.2 video upload backend
- Add DashScope ASR and video upload config fields to Settings
- Create Pydantic models (video.py, asr.py)
- Create VideoService with validation, save, serve, delete
- Create ASR client stub with float32_to_s16le utility
- Implement POST /api/v1/video/upload with streaming validation
- Implement GET /api/v1/video/{video_id} with FileResponse
- Create WebSocket ASR endpoint stub
- Register new routers in main.py
- Update .env.example and requirements.txt
- Add reference examples for DashScope integration
- 8 tests passing (3 config + 5 video upload)
2026-05-06 13:08:19 +08:00
Woody cbb958d75d fix: vLLM chat_template_kwargs breaks LangChain structured output
vLLM's chat_template_kwargs leaked into LangChain's AsyncCompletions.parse()
via _get_langchain_model's model_kwargs, causing structured decomposition
to fail on vLLM backends. Skip vLLM-specific params when building the
LangChain model — only provider-agnostic params (OpenAI reasoning) pass through.
2026-04-29 16:07:44 +08:00
Woody 25b26c9b48 feat(ingest): generate per-chunk PDFs for DOCX/TXT documents (Phase 5.3)
DOCX and TXT ingestion now produces chunk_file_path + per-chunk PDF files matching the PDF ingestion flow. Uses reportlab to render chunk text as simple PDFs with automatic text wrapping.

- Add reportlab==4.2.5 to requirements.txt
- New utils/text_to_pdf.py: generate_text_pdf() renders chunk text as PDF
- Ingest router DOCX/TXT branches: generate chunk_N.pdf per chunk, store in chunk_file_paths
- Graceful degradation: chunk_file_path stays None if PDF generation fails
- Update test_phase1_ingest_page_aware.py assertions: DOCX chunks now HAVE chunk_file_path
- New test_phase5_docx_pdf_generation.py: 5 tests (DOCX PDF gen, TXT PDF gen, PDF regression, file count, graceful degradation)
- 361 backend tests pass (4 pre-existing embedding failures unrelated)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 17:32:22 +08:00
Woody 36fe1172a0 chore: add langchain dependencies
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:42:17 +08:00
Woody 05af86f5d2 fix(docker): set relative API base URL and pin numpy for ChromaDB compat 2026-04-27 19:15:16 +08:00
Woody 74cb8b83d5 feat(backend): migrate LLM client to OpenAI SDK with thinking control
- Replace httpx with openai.AsyncOpenAI

- Add llm_enable_thinking config (default False)

- Add _build_extra_body() for Qwen3.5 thinking mode control

- Use chat_template_kwargs for vLLM/SGLang compatibility

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-23 14:10:26 +08:00
Woody 4b633d86f7 fix(backend): resolve pytest dependency conflict and remove sentence-transformers
- Change pytest==8.0.0 to pytest==7.4.4 (conflict with pytest-asyncio<8)

- Remove sentence-transformers (not used with API-based embeddings)

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-23 13:24:54 +08:00
Woody 3712397d64 feat: Phase 1.1 project setup with config, database, and models
- Add requirements.txt with all dependencies
- Add .env.example with required environment variables
- Add Pydantic Settings (config.py) with .env loading
- Add ChromaDB persistent client (database.py)
- Add Pydantic schemas (ingest.py) for request/response
- Add FastAPI main.py with CORS middleware
- Add package __init__.py files
- Add tests: test_phase1_config.py, test_phase1_database.py
- All 5 tests pass
2026-04-22 16:13:52 +08:00