Woody
bca534e1b5
chore: add .worktrees/ to .gitignore
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-28 17:18:42 +08:00
Woody
f2115ae563
feat: structured LLM output for decompose + citation fuzzy matching (Phase 5)
Phase 5.1 — Structured LLM output for query decomposition:
- Add SubQuestions Pydantic model with sub_question, keywords, rationale
- Add LLMClient.complete_structured() using langchain with_structured_output
- Update QueryDecomposer with structured output path + legacy json.loads fallback
- Update SQLite seed templates: add subq+citation labeling requirement
- Add tests: structured output, subquestions model validation, logging
Phase 5.2 — Citation format alignment and fallback links:
- Add document_id to SourceMetadata (backend + frontend types)
- Rewrite citationParser.ts with fuzzy matching and fallback document links
- Add RAGDatabasePage auto-expand from ?document= URL param
- Tighten generate_per_subq seed prompt: 'Copy exact bracket labels shown'
- Add citation parser tests for fuzzy match and fallback link scenarios
- Defer: PDF generation for DOCX/TXT documents → Phase 5.3 (fallback links sufficient)
2026-04-28 15:39:17 +08:00
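The fuzzy-match-then-fallback idea from Phase 5.2 can be sketched roughly as follows. This is only an illustration in Python using the standard library's `difflib`; the actual change lives in `citationParser.ts`, whose logic is not shown in this log. The bracket label format, the `sources` mapping, the `0.8` cutoff, and the `/rag-database?document=` route are all assumptions made for the example.

```python
import difflib
import re
from typing import Optional

# Assumption: citations appear in the answer text as [bracketed labels].
BRACKET = re.compile(r"\[([^\[\]]+)\]")


def normalize(label: str) -> str:
    """Collapse whitespace and lowercase so trivial differences do not matter."""
    return re.sub(r"\s+", " ", label).strip().lower()


def resolve_citation(
    label: str, sources: dict[str, str], cutoff: float = 0.8
) -> Optional[str]:
    """Map a cited label to a document_id: exact match first, then fuzzy match.

    `sources` maps the label shown to the LLM to a document_id (hypothetical shape).
    """
    normalized = {normalize(k): v for k, v in sources.items()}
    key = normalize(label)
    if key in normalized:
        return normalized[key]
    close = difflib.get_close_matches(key, list(normalized), n=1, cutoff=cutoff)
    return normalized[close[0]] if close else None


def link_citations(
    text: str, sources: dict[str, str], fallback_document_id: Optional[str] = None
) -> list[tuple[str, Optional[str]]]:
    """Return (label, url) pairs; url is None when nothing can be resolved."""
    results = []
    for label in BRACKET.findall(text):
        doc_id = resolve_citation(label, sources) or fallback_document_id
        # Hypothetical route: a page that auto-expands the document named in ?document=
        url = f"/rag-database?document={doc_id}" if doc_id else None
        results.append((label, url))
    return results
```

A label that the model reproduces slightly wrong (dropped comma, different casing) still resolves to its document, while an unrelated label resolves to nothing unless a fallback document is supplied.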
Woody
f4b404f27d
feat(db): Phase 3.1 — SQLite infrastructure (prompts.db + history.db)
- Add sqlite_db.py with dual-DB connection factories (WAL mode, foreign keys)
- init_prompts_db() creates system_prompt_profiles + system_prompts tables
- init_history_db() creates query_history table + created_at index
- seed_default_profiles() inserts 3 profiles (A/B/C) x 3 steps each
- All 3 profiles start with identical seed templates; Profile A active
- Add prompts_db_path + history_db_path to config (./data/ default)
- Startup init in main.py creates data/ dir, inits both DBs, seeds profiles
- Add PROMPTS_DB_PATH + HISTORY_DB_PATH to .env.example
- Add data/ to .gitignore
- 17 new tests in test_phase3_sqlite_db.py (all passing)
2026-04-25 20:29:29 +08:00
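A connection factory with WAL mode and foreign keys enabled, plus the `query_history` table and its `created_at` index, might look something like the sketch below. It uses only the standard-library `sqlite3` module. The function name `connect`, the column names other than `created_at`, and the index name are assumptions for illustration; the real `sqlite_db.py` is not shown in this log.

```python
import sqlite3
from pathlib import Path


def connect(db_path: Path) -> sqlite3.Connection:
    """Open a SQLite connection with WAL journaling and foreign-key enforcement."""
    db_path.parent.mkdir(parents=True, exist_ok=True)  # e.g. create ./data/ on first run
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")   # persists in the database file
    conn.execute("PRAGMA foreign_keys=ON")    # must be set on every new connection
    return conn


def init_history_db(conn: sqlite3.Connection) -> None:
    """Create the query_history table and an index on created_at (idempotent)."""
    # Assumption: column names other than created_at are placeholders.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS query_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            query TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_query_history_created_at "
        "ON query_history(created_at)"
    )
    conn.commit()
```

Because `foreign_keys` is a per-connection setting while `journal_mode=WAL` is stored in the file, routing every connection through one factory keeps the two databases configured consistently.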
Woody
b710002c6e
chore: clean up accidentally committed temp files
2026-04-25 18:29:38 +08:00
Woody
8c84062996
feat(backend): add PDF page extractor and chunk PDF storage config
New pdf_extractor.py with extract_page_as_pdf() and extract_pages_as_pdf() for extracting individual PDF pages as separate files. Adds document_chunk_path setting to config and document_chunk/ to .gitignore.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-24 10:52:57 +08:00
Woody
e83a4708b5
feat(backend): add rotating file logging to backend/app/log/
- Configure RotatingFileHandler in main.py (10MB per file, 5 backups)
- Log directory auto-created on startup
- Add backend/app/log/ to .gitignore
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-23 14:09:48 +08:00
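The rotating file logging described in this commit (10MB per file, 5 backups, directory created on startup) could be configured along these lines with the standard-library `logging.handlers.RotatingFileHandler`. The function name, the `backend.log` filename, and the format string are assumptions; the commit only states the size, backup count, and log directory.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def configure_file_logging(
    log_dir: Path, filename: str = "backend.log"
) -> RotatingFileHandler:
    """Attach a size-rotated file handler to the root logger."""
    log_dir.mkdir(parents=True, exist_ok=True)  # auto-create the log directory
    handler = RotatingFileHandler(
        log_dir / filename,
        maxBytes=10 * 1024 * 1024,  # rotate once the file reaches 10MB
        backupCount=5,              # keep backend.log.1 through backend.log.5
        encoding="utf-8",
    )
    # Assumption: the format string is a placeholder, not the project's format.
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logging.getLogger().addHandler(handler)
    return handler
```

With these settings the log directory holds at most six files, roughly 60MB in total, before the oldest backup is discarded.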
Woody
02e401740a
chore: update .gitignore for frontend lib/ and Vite cache
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-23 10:56:35 +08:00
Woody
1518b72969
chore: add .gitignore with Python, Node, env, and ChromaDB exclusions
2026-04-22 15:57:04 +08:00