Commit Graph

154 Commits

Author SHA1 Message Date
Woody df62283f58 feat: inject Pydantic JSON schema into Deepseek prompt (Phase 6)
Follows Deepseek JSON Output guide: the prompt now includes the word 'json' and a format example derived from the Pydantic model schema. Added _pydantic_to_json_instruction() helper that builds a human-readable schema description with EXAMPLE JSON OUTPUT.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-04 15:17:24 +08:00
Woody 226f4ed700 test: update integration mocks for dual-client architecture (Phase 6)
Added complete_structured() to mock classes, split response lists between LLMClientDP (decompose) and LLMClient (filter+generate), and patched both clients in all integration tests.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-04 14:59:23 +08:00
Woody 3b5bd79839 feat: wire LLMClientDP into query decompose pipeline (Phase 6)
QueryDecomposer now uses LLMClientDP (Deepseek) while RelevanceFilter and RAGService continue using LLMClient (OpenRouter/vLLM).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-04 14:59:08 +08:00
Woody 849beb4d4e feat: add LLMClientDP for Deepseek decompose (Phase 6)
Uses Deepseek's json_object response_format (not json_schema, which Deepseek does not support). Always disables thinking mode. Includes unit tests (12) and acceptance tests (5).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-04 14:58:53 +08:00
Woody 73ae621f3b feat: add Deepseek config fields and DI wiring (Phase 6)
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-04 14:58:39 +08:00
Woody b6562f3d76 docs: add Package 6 enhancement plan
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-05-04 14:58:24 +08:00
Woody 23c665515d fix: wrap filter chunks in XML tags for clearer LLM input 2026-04-30 13:59:03 +08:00
Woody fc6b5463b5 fix: vLLM structured output missing thinking-control extra_body 2026-04-29 21:01:10 +08:00
Woody 16de8394aa fix: add full input/output logging to vLLM structured output path
Log the complete prompt, schema, extra_body content, full API response,
token counts, and full parsed JSON output. Add exc_info=True tracebacks
on all failure paths.
2026-04-29 16:52:26 +08:00
Woody 3ab6fd102a fix: use vLLM-native guided_json for structured output
vLLM servers support JSON schema enforcement via extra_body (guided_json
or structured_outputs), not OpenAI's response_format protocol. LangChain's
with_structured_output(method='json_schema') sends response_format which
vLLM ignores, causing NoneType not iterable parsing errors.

- vLLM path: direct OpenAI SDK call with extra_body={guided_json|structured_outputs}
- OpenRouter path: unchanged with_structured_output(method='json_schema')
- Try new 'structured_outputs' format first, fall back to legacy 'guided_json'
- Update _SEED_DECOMPOSE with explicit JSON array instruction
- Add diagnostic logging: exc_info=True, schema preview, prompt template preview
- Add logging in _parse_legacy_json for fallback failure debugging
2026-04-29 16:49:14 +08:00
Woody 2aca18d30e docs: add vLLM structured output fix plan
- Diagnose: vLLM ignores OpenAI-native response_format, causing NoneType error
- Diagnose: legacy fallback prompt lacks JSON instruction → empty questions
- Plan: use vLLM-native guided_json via extra_body instead of with_structured_output
- Plan: update _SEED_DECOMPOSE with JSON format instruction
- Plan: add diagnostic logging (exc_info, method, schema preview)

wip: temporary function_calling switch for vLLM (to be replaced by guided_json)
2026-04-29 16:42:23 +08:00
Woody cbb958d75d fix: vLLM chat_template_kwargs breaks LangChain structured output
vLLM's chat_template_kwargs leaked into LangChain's AsyncCompletions.parse()
via _get_langchain_model's model_kwargs, causing structured decomposition
to fail on vLLM backends. Skip vLLM-specific params when building the
LangChain model — only provider-agnostic params (OpenAI reasoning) pass through.
2026-04-29 16:07:44 +08:00
Woody 90269608bc fix: display highlight tracking data in history page UI
- Add highlight_prompt, highlight_response, highlight_time_ms to QueryHistoryDetail type
- Add 'Highlights' bar segment with pink color in TimingBar component
- Pass highlightTimeMs to TimingBar in HistoryCard expanded view
- Add collapsible sections for highlight prompt and response in HistoryCard detail
2026-04-29 13:42:08 +08:00
Woody 41f59b396f feat: track highlight generation prompt, response, and timing in history (Phase 5.5)
- Add 3 columns to query_history: highlight_prompt, highlight_response, highlight_time_ms
- HistoryService.update_highlights() updates existing row after batch LLM call
- ChunkHighlightService measures timing, captures prompt and structured JSON response
- SSE completed event includes history_id for frontend to pass back
- Frontend captures historyId, passes as ?history_id= query param in batch POST
- Highlight time tracked separately (excluded from total_time_ms)
- All 153 tests pass (108 backend + 45 frontend)
2026-04-29 11:18:21 +08:00
Woody 36dedab485 docs: finalize Phase 5 enhancement plan with completion status
- Mark Phase 5.4 complete with actual commit log
- Add Phase 5.4 completion checklist (15 items all checked)
- Add production notes (Vite proxy, port conflicts, cache location)
- Update test counts to current (108 backend, 45 frontend, 153 total)
- Update Decision #12 to reflect inline citation link upgrade
2026-04-29 10:54:18 +08:00
Woody 523b27bb58 test: update batch URL assertion to match absolute backend URL 2026-04-29 10:42:18 +08:00
Woody b47e37f39b fix: use absolute backend URL for highlight API calls
- Vite dev server doesn't proxy /api/v1/v2/ paths to backend
- Changed fetch URL and getHighlightUrl to use http://localhost:8000
- Fixed inline citation highlight URLs in buildCitationUrl
- Cleaned up debug code
2026-04-29 10:39:01 +08:00
Woody bcf4a853bf feat: add highlight status toast notification (Phase 5.4)
- Shows 'Preparing highlights...' (amber spinner) while LLM batch runs
- Shows 'Highlights ready' (green) for 4 seconds when batch completes
- Fixed position top-left corner, auto-dismisses
2026-04-29 10:00:54 +08:00
Woody 1c490ce2fa fix: inline citations now upgrade to highlighted view (Phase 5.4)
- Added sub_question_text to frontend SourceMetadata type
- SubQuestionSection enriches sources with parent sub-question text
- buildCitationUrl routes to highlight page when sub_question_text present
- processCitations threads highlightReadyKeys through inline citations
2026-04-29 09:54:40 +08:00
Woody c632b9ea3b feat: cited source extraction, background batch trigger, and View PDF link upgrade (Phase 5.4.6-5.4.8)
- citationParser.ts: extractCitedSources() parses answer text for [citations],
  resolves against SourceMetadata, returns deduplicated cited sources
- ResponsePanel.tsx: useEffect fires POST /api/v1/v2/highlights/batch after
  answer renders; View PDF link upgrades in-place to highlighted HTML when
  batch completes; stays as raw PDF on failure
- Updated plan: LLM-based relevance detection, eager background computation,
  single batched LLM call, sqlite cache, regex sentence splitter
- 45 frontend tests: 28 citationParser + 17 ResponsePanel (including 4 new
  sub-question highlight tests)
2026-04-29 09:27:04 +08:00
Woody a56f8f69e2 feat: add highlight batch and GET endpoints (Phase 5.4.5)
- POST /api/v1/v2/highlights/batch: compute and cache highlights for cited chunks
- GET /api/v1/v2/highlights: serve cached highlighted HTML pages
- chunks.py router registered in main.py
- Dynamic DB path computation (prompts.db -> highlights.db), no Settings changes
- 7 endpoint tests: POST 200/422, GET 200/404, mock service verification
2026-04-29 09:26:50 +08:00
Woody c6d4a38013 feat: add LLM-based batch highlight service and HTML rendering (Phase 5.4.4)
- ChunkHighlightService.compute_highlights_batch(): single LLM call across
  all cited chunks, grouped by sub-question, with structured output
- render_highlight_html(): self-contained HTML page with yellow-highlighted
  relevant sentences, LLM reason annotations, and View Original PDF footer
- Per-target error isolation, ChromaDB miss handling, graceful degradation
- 14 tests: 7 batch service + 7 HTML rendering
2026-04-29 09:26:33 +08:00
Woody bdbc8ea1a0 feat: add SQLite highlight cache service (Phase 5.4.3)
- highlight_cache.py: HighlightCache class with get/set_highlight and
  compute_cache_key (sha256 hash of document_id|chunk_index|sub_question)
- INSERT OR REPLACE semantics, idempotent table creation
- 13 tests covering round-trip, overwrite, missing keys, determinism
2026-04-29 09:26:20 +08:00
Woody b11d31e2d1 feat: add sentence splitter and highlight data models (Phase 5.4.1-5.4.2)
- sentence_splitter.py: regex-based sentence splitting for English + Chinese punctuation
- highlight.py: 6 Pydantic models (ChunkHighlightTarget, HighlightBatchRequest,
  RelevantSentence, ChunkHighlights, HighlightBatchResult, HighlightBatchResponse)
- 43 tests: 13 sentence splitter + 30 model validation
2026-04-29 09:26:06 +08:00
Woody ec3b5a4ae1 docs: mark Phase 5.3 complete in enhancement plan
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 17:33:00 +08:00
Woody 25b26c9b48 feat(ingest): generate per-chunk PDFs for DOCX/TXT documents (Phase 5.3)
DOCX and TXT ingestion now produces chunk_file_path + per-chunk PDF files matching the PDF ingestion flow. Uses reportlab to render chunk text as simple PDFs with automatic text wrapping.

- Add reportlab==4.2.5 to requirements.txt
- New utils/text_to_pdf.py: generate_text_pdf() renders chunk text as PDF
- Ingest router DOCX/TXT branches: generate chunk_N.pdf per chunk, store in chunk_file_paths
- Graceful degradation: chunk_file_path stays None if PDF generation fails
- Update test_phase1_ingest_page_aware.py assertions: DOCX chunks now HAVE chunk_file_path
- New test_phase5_docx_pdf_generation.py: 5 tests (DOCX PDF gen, TXT PDF gen, PDF regression, file count, graceful degradation)
- 361 backend tests pass (4 pre-existing embedding failures unrelated)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 17:32:22 +08:00
Woody bca534e1b5 chore: add .worktrees/ to .gitignore
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 17:18:42 +08:00
Woody 4058c7dffe fix(citations): use all sub-question sources for citation lookup
LLMs may cite chunks from one sub-question's context inside another sub-question's answer section. Previously, processCitationsForSubq only looked up the current sub-question's sources, leaving cross-referenced citations unlinked. Now SubQuestionSection passes all sub-question sources and uses processCitations with a combined flat lookup.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 17:10:02 +08:00
Woody 48e15f8232 feat(llm): log structured LLM response and extra_body
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:50:26 +08:00
Woody 091fa84443 docs: update Phase 5 plan with deferred/planned sub-phases
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:43:38 +08:00
Woody c43cb372e9 feat: integrate bullet points in ResponsePanel with CSS list-style
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:43:30 +08:00
Woody 1fdd2a70a5 feat: add bulletizeMarkdown post-processor with tests
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:43:21 +08:00
Woody 4c56e81872 feat(prompts): enforce bullet-point output in generate template
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:42:55 +08:00
Woody 095f013739 feat(llm): pass extra_body via model_kwargs in LangChain
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:42:49 +08:00
Woody 136c25ae38 feat: rewrite DOCX parser with table extraction
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:42:41 +08:00
Woody 36fe1172a0 chore: add langchain dependencies
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 16:42:17 +08:00
Woody f2115ae563 feat: structured LLM output for decompose + citation fuzzy matching (Phase 5)
Phase 5.1 — Structured LLM output for query decomposition:
- Add SubQuestions Pydantic model with sub_question, keywords, rationale
- Add LLMClient.complete_structured() using langchain with_structured_output
- Update QueryDecomposer with structured output path + legacy json.loads fallback
- Update SQLite seed templates: add subq+citation labeling requirement
- Add tests: structured output, subquestions model validation, logging

Phase 5.2 — Citation format alignment and fallback links:
- Add document_id to SourceMetadata (backend + frontend types)
- Rewrite citationParser.ts with fuzzy matching and fallback document links
- Add RAGDatabasePage auto-expand from ?document= URL param
- Tighten generate_per_subq seed prompt: 'Copy exact bracket labels shown'
- Add citation parser tests for fuzzy match and fallback link scenarios
- Defer: DOCX/TXT PDF generation → Phase 5.3 (fallback links sufficient)
2026-04-28 15:39:17 +08:00
Woody 06ec37492c docs: add LLM_ENABLE_THINKING and VLLM_ENGINE to env example and README 2026-04-28 13:32:41 +08:00
Woody 711be3dfde feat(llm): add VLLM_ENGINE env flag for provider-specific extra_body format 2026-04-28 13:30:27 +08:00
Woody aa5f716578 feat(upload): support multiple file upload on RAG Database page 2026-04-28 13:22:25 +08:00
Woody fdd5a09c28 fix(pdf): polyfill URL.parse for Chrome <129 compatibility 2026-04-28 13:11:54 +08:00
Woody 49eed1f27e docs(readme): add cross-platform amd64 build guide and inline-env test run 2026-04-28 12:56:00 +08:00
Woody 23796d6a0c feat(prompts): add JSON export/import for profile prompt configurations 2026-04-27 19:44:35 +08:00
Woody bb6b159315 docs(plan): add Phase PX profile export/import feature plan 2026-04-27 19:26:33 +08:00
Woody 05af86f5d2 fix(docker): set relative API base URL and pin numpy for ChromaDB compat 2026-04-27 19:15:16 +08:00
Woody 4ad9deeccb feat(deploy): add Dockerfile, compose, nginx config, and README
Multi-stage Dockerfile: Node builds frontend, Python serves both API
and static files. docker-compose.yml with named volumes for ChromaDB,
chunks, and SQLite data. nginx.conf as reverse proxy with 350M upload
limit and 300s LLM proxy timeout. README with dev setup, deploy steps,
env vars table, and architecture diagram.

Backend main.py: add catch-all route to serve frontend/dist/static
files in production. Only activates when dist/ exists.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-27 17:17:53 +08:00
Woody d444c99c23 feat(config): log resolved llm and embedding model names on startup
Add INFO log in get_settings() to print the actual model names
after merging .env and class defaults. Confirms pydantic-settings
priority: env values override class defaults as expected.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-27 15:11:36 +08:00
Woody a7a22f1494 fix(relevance): tolerate LLM score count mismatches via padding instead of discarding
The per-sub-question filter was all-or-nothing: if the LLM returned
9 scores for 10 chunks (common with qwen3.5-35b), every chunk was
discarded and the user got 'no relevant information found'.

Now: fewer scores → pad with 0.0; more scores → truncate. Changed
from error→warning since this is recoverable.

Also improve LTT page UI: sources collapsed by default in per-sub-q
sections, and the 'Your question' text now shows the full question
instead of being truncated.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-27 14:31:18 +08:00
Woody 2656f9ca08 refactor(test): rewrite tests to comply with integration-first rules
Replace mocked DB/internal-services with real ChromaDB/SQLite via tmp_path.
Only mock truly external APIs (LLM, embedding for deterministic vectors).

13 test files rewritten (314 pass, 0 fail):
- Route tests: use TestClient + real ChromaDB, seed test data
- Service tests: use real PersistentClient/SQLite instances
- Pipeline tests: TestClient hits SSE /query endpoint, verify history
- Converted unittest.TestCase to pytest where applicable

Plus: fix metadata.py to filter None values from ChromaDB metadata
(pre-existing bug caught by real-DB ingestion tests)
2026-04-27 11:46:58 +08:00
Woody 3b868a0133 feat(prompts): integrate filter_per_subq with PromptService, fix seed bugs, restructure UI
Break the hardcoded per-sub-q filter prompt into 3 editable PromptService templates (filter_intro, filter_section, filter_outro) with placeholders for the for-loop iteration pattern. Refactor RelevanceFilter._build_per_subq_prompt() to compose them at runtime, falling back to built-in defaults when PromptService is unavailable.

Fix two latent bugs from Package 4:
- generate_per_subq was called by rag.py but never added to _VALID_STEPS or DB seed (would ValueError at runtime)
- _SEED_GENERATE placeholder mismatch: flat generate_response() expects {question}/{context} but Package 4 changed it to {context_sections}. Restored flat template; generate_per_subq now holds {context_sections}.

Add database backfill migration in seed_default_profiles() to INSERT OR IGNORE missing steps into existing profile rows, ensuring all 7 steps exist on restart.

Restructure System Prompts UI: remove unused flat filter/generate steps, replace with Step 2.1-2.3 (filter_intro/section/outro) and Step 3 (generate_per_subq). Update PlaceholderDocs with {context_sections}, {subq_idx}, {subq_question}.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-27 11:14:27 +08:00