Commit Graph

8 Commits

Author SHA1 Message Date
Woody bca534e1b5 chore: add .worktrees/ to .gitignore
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 17:18:42 +08:00
Woody f2115ae563 feat: structured LLM output for decompose + citation fuzzy matching (Phase 5)
Phase 5.1 — Structured LLM output for query decomposition:
- Add SubQuestions Pydantic model with sub_question, keywords, rationale
- Add LLMClient.complete_structured() using langchain with_structured_output
- Update QueryDecomposer with structured output path + legacy json.loads fallback
- Update SQLite seed templates: add subq+citation labeling requirement
- Add tests: structured output, subquestions model validation, logging

Phase 5.2 — Citation format alignment and fallback links:
- Add document_id to SourceMetadata (backend + frontend types)
- Rewrite citationParser.ts with fuzzy matching and fallback document links
- Add RAGDatabasePage auto-expand from ?document= URL param
- Tighten generate_per_subq seed prompt: 'Copy exact bracket labels shown'
- Add citation parser tests for fuzzy match and fallback link scenarios
- Defer: DOCX/TXT PDF generation → Phase 5.3 (fallback links sufficient)
2026-04-28 15:39:17 +08:00
Woody f4b404f27d feat(db): Phase 3.1 — SQLite infrastructure (prompts.db + history.db)
- Add sqlite_db.py with dual-DB connection factories (WAL mode, foreign keys)
- init_prompts_db() creates system_prompt_profiles + system_prompts tables
- init_history_db() creates query_history table + created_at index
- seed_default_profiles() inserts 3 profiles (A/B/C) x 3 steps each
- All 3 profiles start with identical seed templates; Profile A active
- Add prompts_db_path + history_db_path to config (./data/ default)
- Startup init in main.py creates data/ dir, inits both DBs, seeds profiles
- Add PROMPTS_DB_PATH + HISTORY_DB_PATH to .env.example
- Add data/ to .gitignore
- 17 new tests in test_phase3_sqlite_db.py (all passing)
2026-04-25 20:29:29 +08:00
Woody b710002c6e chore: clean up accidentally committed temp files 2026-04-25 18:29:38 +08:00
Woody 8c84062996 feat(backend): add PDF page extractor and chunk PDF storage config
New pdf_extractor.py with extract_page_as_pdf() and extract_pages_as_pdf() for extracting individual PDF pages as separate files. Adds document_chunk_path setting to config and document_chunk/ to .gitignore.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-24 10:52:57 +08:00
Woody e83a4708b5 feat(backend): add rotating file logging to backend/app/log/
- Configure RotatingFileHandler in main.py (10MB per file, 5 backups)

- Log directory auto-created on startup

- Add backend/app/log/ to .gitignore

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-23 14:09:48 +08:00
Woody 02e401740a chore: update .gitignore for frontend lib/ and Vite cache
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-23 10:56:35 +08:00
Woody 1518b72969 chore: add .gitignore with Python, Node, env, and ChromaDB exclusions 2026-04-22 15:57:04 +08:00