Package 5 Enhancement Plan — Structured Output + Robust Citation Linking

Source: User request (2026-04-28)

Scope:

Phase 5.1: Replace manual JSON parsing in the decompose stage with LangChain with_structured_output()
Phase 5.2: Fix missing PDF links in citations and improve citation robustness

Status: Phases 5.1 ✅, 5.2 ✅; 5.3 Deferred, 5.4 Planned (2026-04-28)

LangChain version: 1.2.15 (venv), model_provider="openai" with OpenRouter base URL (API-compatible proxy).

Test results:

Backend: 115 passed, 0 failed (Phase 5.1 + Phase 5.2 + all integration/regression tests)
Frontend: 187 passed, 1 failed (pre-existing e2e test failure unrelated to these changes)

Objective

Decompose structured output: Eliminate json.JSONDecodeError failures in QueryDecomposer.decompose() by integrating LangChain's with_structured_output() to enforce a Pydantic schema at the API level. On the primary path the LLM response is returned as a validated SubQuestions object, with no manual json.loads() and no regex markdown stripping. If the structured call fails, the legacy parsing path runs as a fallback and every failure is logged, so nothing fails silently.
Robust citation linking: Fix the citation→PDF link pipeline so that:
- document_id flows through to the frontend for fallback document-level links
- chunk_file_path is always available (generate per-chunk PDFs for DOCX/TXT too, or provide a document-level PDF fallback)
- Citation matching in citationParser.ts handles fuzzy filename matching (strips extensions, tolerates whitespace variations)
- Frontend provides fallback "View Document" links when chunk-level PDF is unavailable

Decision Register

#	Decision	Rationale
1	Use LangChain `with_structured_output()` (not OpenAI `response_format` directly)	User explicitly chose Option B. Provides a cleaner API, composes with LangChain's retry helpers (e.g. `.with_retry()`) so validation failures can be retried, and leaves room to reuse the same approach in other pipeline stages (filter, generate).
2	Add `langchain` + `langchain-openai` to `requirements.txt`	Required dependencies for `init_chat_model()` and `with_structured_output()`. `langchain` ~0.3.x for stable API.
3	Define `SubQuestions` Pydantic model with `questions: list[str]`	LangChain's `with_structured_output()` requires a wrapper Pydantic model — bare `list[str]` is unsupported by provider-native schema enforcement.
4	Keep `LLMClient` as the central LLM access layer, add LangChain-based `complete_structured()` method	Minimizes refactoring. `QueryDecomposer` calls `llm_client.complete_structured(prompt, SubQuestions)` instead of `llm_client.complete(prompt)`. Other callers (filter, generate) remain unchanged.
5	Run decomposition at `temperature=0.0` (was `0.7`)	Structured output benefits from deterministic behavior. Lower temperature = more reliable schema compliance.
6	Add `document_id` to `SourceMetadata` Pydantic model and frontend type	`document_id` is already stored in ChromaDB metadata (`metadata.py:70`) but is discarded during serialization. Adding it enables document-level fallback links.
7	~~Generate monolithic PDFs for DOCX/TXT documents~~ → DEFERRED	More complex than needed. Instead, use fallback document-level links via `document_id` when `chunk_file_path` is null. DOCX/TXT PDF generation deferred to Phase 5.3.
8	Fuzzy citation matching: strip extensions, trim whitespace	`citationParser.ts` currently requires exact filename match. LLM may shorten `NEC4 ACC.pdf` to `NEC4 ACC` in citations.
9	Fallback "View Document" link when `chunk_file_path` is null	Even after Decision #7, network failures or edge cases may leave null paths. The frontend should show a document-level PDF link as fallback.
10	Keep `_extract_json_from_markdown()` as a fallback for backward compatibility	During a transition period (or if `with_structured_output()` fails), the existing regex-based extraction serves as a safety net. Log a warning when fallback is used.
11	Add `logger.warning` for JSON parse failures before returning empty	The biggest blind spot today: JSON parse failures are silent. Log the raw LLM response (truncated) so operators can debug.
12	Keep `QueryDecomposer.decompose()` return type as `Tuple[List[str], str]`	Existing callers unpack the tuple. Adding `Tuple[List[str], str, SubQuestions
13	Spike-test LangChain structured output with OpenRouter BEFORE implementation	2-minute test calling `init_chat_model().with_structured_output().ainvoke()` through OpenRouter to confirm `response_format={"type": "json_schema"}` is proxied correctly. If not, fall back to `method="function_calling"`.
14	Tighten `generate_per_subq` prompt alongside frontend fuzzy matching	Add "Copy the exact bracket labels shown in the document chunks — do not modify filenames or add/remove extensions." to seed template. Two-layer defense: prompt reduces hallucinations + fuzzy matching catches remaining cases. No separate task — folded into Task 5.2.3.

Phase 5.1 — Structured Output for Decompose

Test Files (write BEFORE implementation)

#	Test File	Coverage
T5.1.1	`backend/app/test/test_phase5_llm_client_structured.py`	`LLMClient.complete_structured()` with mock LangChain model. Tests: valid Pydantic return, validation error → retry, empty questions list, non-JSON fallback.
T5.1.2	`backend/app/test/test_phase5_query_decomposer_structured.py`	`QueryDecomposer.decompose()` using `MockLLMClient.complete_structured()`. Tests: valid SubQuestions, empty questions, LLM error fallback, prompt service integration.
T5.1.3	`backend/app/test/test_phase5_subquestions_model.py`	`SubQuestions` Pydantic model validation. Tests: valid input, empty list, too many questions, non-string items rejected.
T5.1.4	`backend/app/test/test_phase5_decompose_logging.py`	Verify `logger.warning` is emitted when JSON parse fallback is triggered (backward-compat path).

Acceptance Tests

#	Test File	Coverage
AT5.1.1	`backend/app/test/acceptance/test_acceptance_phase5_structured_decompose.py`	Real LLM call with structured output. Tests: Cantonese question → valid sub-questions, English question → valid sub-questions, very short question → 1 sub-question, very long question → ≤5 sub-questions.

Implementation Tasks

Task 5.1.1: Add LangChain dependencies

Add langchain>=0.3.0,<0.4.0 and langchain-openai>=0.3.0,<0.4.0 to backend/requirements.txt
Run pip install -r backend/requirements.txt in dev venv
Test file: test_phase5_subquestions_model.py (can run immediately after install)

Task 5.1.2: Define `SubQuestions` Pydantic model

Create backend/app/models/decompose.py with:

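from pydantic import BaseModel, Field


class SubQuestions(BaseModel):
    """Wrapper model so the decompose output can be validated against a schema."""

    questions: list[str] = Field(
        description="2-5 simplified sub-questions, each focused on one aspect",
        min_length=1,  # a very short question may yield a single sub-question
        max_length=5,
    )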

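Add min_length=1 and max_length=5 Pydantic constraints: max_length=5 matches the upper bound of the decompose prompt's "2-5", while min_length=1 allows a single sub-question for very short questions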
Test file: test_phase5_subquestions_model.py

Task 5.1.3: Add `complete_structured()` method to `LLMClient`

In llm_client.py, import init_chat_model from langchain.chat_models
Add self._langchain_model attribute (lazy-init from settings)
Add async complete_structured(prompt, pydantic_model, step_name) -> BaseModel method:
1. Calls self._langchain_model.with_structured_output(pydantic_model, method="json_schema").ainvoke(prompt)
2. Returns the validated Pydantic model instance
3. Logs timing (same pattern as existing complete())
4. Wraps errors in LLMClientError
Use temperature=0.0 via model config for structured calls
Test file: test_phase5_llm_client_structured.py
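
The steps listed for Task 5.1.3 can be sketched roughly as follows. This is a minimal illustration, not the project's actual `LLMClient`: the function is written as a free function that takes the model as a parameter, the `LLMClientError` class here is a stand-in for the existing exception, and the log message format is an assumption. The `model` argument is assumed to be a LangChain chat model (for example one returned by `init_chat_model`), which exposes `with_structured_output(...)` and `ainvoke(...)`.

```python
import logging
import time
from typing import Any

logger = logging.getLogger(__name__)


class LLMClientError(Exception):
    """Stand-in for the project's existing LLMClientError (assumption)."""


async def complete_structured(
    model: Any, prompt: str, schema: type, step_name: str = "LLM"
) -> Any:
    """Request output matching `schema` and return the parsed object.

    `model` is assumed to be a LangChain chat model created elsewhere with
    temperature=0.0, as the plan specifies for structured calls.
    """
    started = time.perf_counter()
    try:
        # Bind the schema so the provider enforces it, then invoke asynchronously.
        runnable = model.with_structured_output(schema, method="json_schema")
        result = await runnable.ainvoke(prompt)
    except Exception as exc:
        # Surface provider and validation errors as one exception type,
        # mirroring what the plan describes for the existing complete().
        raise LLMClientError(f"{step_name} structured call failed: {exc}") from exc
    logger.info("%s structured call took %.2fs", step_name, time.perf_counter() - started)
    return result
```

Passing the model in keeps the sketch testable with a mock, which is also how T5.1.1 describes exercising the method.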

Task 5.1.4: Refactor `QueryDecomposer.decompose()` to use structured output

Change decompose() to call self.llm_client.complete_structured(prompt, SubQuestions, step_name="QueryDecomposer")
Add fallback path: if complete_structured() raises → log warning → attempt legacy complete() + json.loads() → if that works, log info "structured output failed, fallback succeeded"
Add logger.warning("Decompose JSON parse failed, raw response (first 500 chars): %s", response[:500]) when both paths fail
Keep return type Tuple[List[str], str] unchanged
Keep _extract_json_from_markdown() for backward-compat fallback path
Test file: test_phase5_query_decomposer_structured.py and test_phase5_decompose_logging.py
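
The two-stage fallback described in Task 5.1.4 might look like the sketch below. Several details are assumptions made for illustration: the function is standalone rather than a `QueryDecomposer` method, the second tuple element is used as a simple label of which path produced the result (the plan does not say what that string holds), and `extract_json_from_markdown` is a simplified stand-in for the existing `_extract_json_from_markdown()`.

```python
import json
import logging
import re
from typing import Any, List, Tuple

logger = logging.getLogger(__name__)

_TICKS = "`" * 3  # built programmatically to keep this example fence-safe
_FENCE_RE = re.compile(_TICKS + r"(?:json)?\s*(.*?)" + _TICKS, re.DOTALL)


def extract_json_from_markdown(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text itself."""
    match = _FENCE_RE.search(text)
    return (match.group(1) if match else text).strip()


async def decompose(llm_client: Any, prompt: str, schema: type) -> Tuple[List[str], str]:
    """Return (sub_questions, path_label); the label is illustrative only."""
    # Primary path: schema-validated structured output.
    try:
        result = await llm_client.complete_structured(
            prompt, schema, step_name="QueryDecomposer"
        )
        return list(result.questions), "structured"
    except Exception as exc:
        logger.warning("Structured decompose failed (%s); trying legacy path", exc)

    # Fallback path: plain completion plus manual JSON parsing.
    response = ""
    try:
        response = await llm_client.complete(prompt)
        questions = json.loads(extract_json_from_markdown(response))
        if not isinstance(questions, list):
            raise ValueError("expected a JSON array of strings")
        logger.info("structured output failed, fallback succeeded")
        return [str(q) for q in questions], "fallback"
    except Exception:
        # Both paths failed: log the truncated raw response for operators.
        logger.warning(
            "Decompose JSON parse failed, raw response (first 500 chars): %s",
            response[:500],
        )
        return [], "failed"
```

Keeping the fallback inside `decompose()` means callers still unpack a two-element tuple and never see which path was taken unless they inspect the logs.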

Task 5.1.5: Update prompt template for structured output

Update _SEED_DECOMPOSE in sqlite_db.py to instruct the LLM about the expected structure
New seed prompt: mention that output will be validated against a schema — more explicit about JSON array of strings requirement
Run seed_default_profiles() to backfill existing profiles
Test file: Existing test_phase3_prompt_service.py should continue to pass

Task 5.1.6: Integration test — end-to-end query pipeline

Verify existing integration tests still pass (test_integration_phase1.py, test_phase4_integration_query_pipeline.py)
Verify acceptance test passes with real LLM (test_acceptance_phase1_rag_query.py)
Run full test suite: cd backend && pytest app/test/test_phase5*.py app/test/test_phase4*.py app/test/test_phase3*.py -v

Phase 5.2 — Robust Citation Linking

Test Files (write BEFORE implementation)

#	Test File	Coverage
T5.2.1	`backend/app/test/test_phase5_source_metadata.py`	`SourceMetadata` model with `document_id`. Tests: serialization includes document_id, backward compat (old data without document_id).
T5.2.2	`backend/app/test/test_phase5_docx_pdf_generation.py`	DOCX/TXT ingestion now sets `chunk_file_path`. Tests: DOCX ingestion produces chunk PDFs, TXT ingestion produces chunk PDFs, PDF generation errors are handled gracefully.
T5.2.3	`frontend/src/test/utils/test_phase5_citation_parser_fuzzy.test.ts`	Fuzzy citation matching. Tests: citation `[NEC4 ACC]` matches source `NEC4 ACC.pdf`, citation `[nec4 acc.pdf, page 3]` matches after whitespace trim, citation `[NEC4 ACC.PDF]` matches case-insensitively, fallback "View Document" link shown when `chunk_file_path` is null.
T5.2.4	`frontend/src/test/utils/test_phase5_citation_fallback_link.test.ts`	Fallback document link rendering. Tests: chunk with `chunk_file_path: null` but `document_id` present → renders "View Document" link, chunk with both null → remains plain text, chunk with `chunk_file_path` → renders page-level PDF link.

Acceptance Tests

#	Test File	Coverage
AT5.2.1	`backend/app/test/acceptance/test_acceptance_phase5_citation_links.py`	Real LLM query with DOCX and PDF documents. Verify citations in the answer are clickable in the SSE response (sources include document_id and chunk_file_path).

Implementation Tasks

Task 5.2.1: Add `document_id` to `SourceMetadata` model

In backend/app/models/common.py, add document_id: Optional[str] = None to SourceMetadata
In backend/app/routers/query.py lines 310-319, include document_id=meta.get("document_id") when building SourceMetadata objects
In frontend/src/types/index.ts, add document_id: string | null to SourceMetadata interface
Test file: test_phase5_source_metadata.py

Task 5.2.2: Generate PDFs for DOCX/TXT documents during ingestion (DEFERRED to Phase 5.3)

Add reportlab to backend/requirements.txt (lightweight, pure Python PDF generation, no external binaries)
In backend/app/routers/ingest.py DOCX and TXT branches, add PDF generation logic:
1. After chunking, generate a single PDF from the full text (one page per chunk)
2. Store chunk_filename = f"{stem}_chunk_{idx}.pdf" for each chunk
3. Set chunk_file_paths list and pass to extract_metadata()
Add error handling: if PDF generation fails, chunk_file_path stays None (graceful degradation)
Use logger.warning on generation failure
Test file: test_phase5_docx_pdf_generation.py

Task 5.2.3: Improve `citationParser.ts` with fuzzy matching

Add extension-stripping helper: stripExtension(filename: string): string — removes .pdf, .docx, .txt
Modify buildCitationLookup() to register both filename and stripExtension(filename) as lookup keys
Add trim-whitespace normalization on citation text before lookup
Add test for LLM-common variations: NEC4 ACC.pdf vs NEC4 ACC vs NEC4_acc.pdf
Test file: test_phase5_citation_parser_fuzzy.test.ts
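
The lookup logic in Task 5.2.3 can be illustrated with a short sketch. The production code lives in `citationParser.ts`; Python is used here only to show the normalization steps, and the snake_case names (`strip_extension`, `normalize_key`, `build_citation_lookup`, `match_citation`) are hypothetical mirrors of the TypeScript helpers. Lowercasing is included because T5.2.3 expects case-insensitive matching. Stripping a trailing `, page N` suffix from a citation is assumed to happen before these functions are called.

```python
import re
from typing import Dict, Iterable, Optional

_KNOWN_EXTENSIONS = (".pdf", ".docx", ".txt")


def strip_extension(filename: str) -> str:
    """Remove a trailing .pdf/.docx/.txt, ignoring case."""
    lowered = filename.lower()
    for ext in _KNOWN_EXTENSIONS:
        if lowered.endswith(ext):
            return filename[: -len(ext)]
    return filename


def normalize_key(label: str) -> str:
    """Trim, collapse runs of whitespace, and lowercase."""
    return re.sub(r"\s+", " ", label.strip()).lower()


def build_citation_lookup(filenames: Iterable[str]) -> Dict[str, str]:
    """Register each filename under both its full and extension-free key."""
    lookup: Dict[str, str] = {}
    for name in filenames:
        lookup.setdefault(normalize_key(name), name)
        lookup.setdefault(normalize_key(strip_extension(name)), name)
    return lookup


def match_citation(label: str, lookup: Dict[str, str]) -> Optional[str]:
    """Resolve a citation label to a source filename, or None if unknown."""
    return lookup.get(normalize_key(label)) or lookup.get(
        normalize_key(strip_extension(label))
    )
```

With a lookup built from `["NEC4 ACC.pdf"]`, both `NEC4 ACC` and `nec4 acc.PDF` resolve to the same source. A variant such as `NEC4_acc.pdf` does not match under these rules, which is why the task lists it as a case to test explicitly.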

Task 5.2.4: Add fallback "View Document" link in frontend

In citationParser.ts replaceCitationPatterns(), when source?.chunk_file_path is null but source?.document_id exists:
1. Build a URL to the document chunk list page: /rag-database?document=${source.document_id}
2. Return [${trimmed}](${url}) with a different CSS class (e.g., text-green-600 for document-level vs text-blue-600 for page-level)
In ResponsePanel.tsx, update CitationLink component to accept a variant prop for visual differentiation
Test file: test_phase5_citation_fallback_link.test.ts
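
The decision between a page-level link, a document-level fallback, and plain text reduces to a small rule, sketched below in Python for illustration (the real logic belongs in `citationParser.ts`). The function name and the `"page"` / `"document"` variant labels are hypothetical. The fallback URL uses the `?document=` parameter that the Phase 5.2 completion checklist names, and the chunk path is returned unchanged because the plan does not specify how page-level URLs are built.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode


def citation_link(
    chunk_file_path: Optional[str], document_id: Optional[str]
) -> Optional[Tuple[str, str]]:
    """Return (url, variant) for a citation, or None to leave it as plain text."""
    if chunk_file_path:
        # Chunk-level PDF exists: link straight to it.
        return chunk_file_path, "page"
    if document_id:
        # No chunk PDF: fall back to the document's chunk list page.
        return "/rag-database?" + urlencode({"document": document_id}), "document"
    # Neither identifier is available, so no link can be built.
    return None
```

The returned variant is what a component like `CitationLink` could use to pick between the two visual styles.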

Task 5.2.5: Integration and regression testing

Verify all existing citation parser tests still pass: cd frontend && npx vitest run src/test/utils/citationParser.test.ts
Verify ResponsePanel tests still pass: npx vitest run src/test/components/ResponsePanel.test.tsx
Run full frontend test suite: npm test
Verify SSE streaming integration: query with a mix of PDF and DOCX documents, confirm citations are clickable

Dependency Graph

Phase 5.1 (Structured Output)
  Task 5.1.1 (add deps) ──┬── Task 5.1.2 (SubQuestions model) ── Task 5.1.3 (complete_structured)
                           │                                           │
                           │                                           ▼
                           │                              Task 5.1.4 (refactor decompose)
                           │                                           │
                           │                              Task 5.1.5 (update prompt template)
                           │                                           │
                           │                                           ▼
                           │                              Task 5.1.6 (integration tests)
                           │
Phase 5.2 (Citation Linking) — independent, can run in parallel with 5.1
  Task 5.2.1 (document_id in model) ──┬── Task 5.2.3 (fuzzy matching)
  Task 5.2.2 (DOCX/TXT PDF gen)    ──┤
                                      ├── Task 5.2.4 (fallback link)
                                      │
                                      ▼
                              Task 5.2.5 (integration tests)

Acceptance Criteria

Phase 5.1 Completion Checklist

LLMClient.complete_structured() returns validated SubQuestions Pydantic model — no json.JSONDecodeError possible
QueryDecomposer.decompose() never returns [] due to JSON parse failure
Fallback path (legacy json.loads()) logs a warning when triggered
Existing decompose tests pass (test_phase1_query_decomposer.py)
New structured output tests pass (test_phase5_*.py) — 33 tests
Spike test passed: Cantonese + English → valid sub-questions
SQLite seed templates updated and backfilled to all profiles
langchain and langchain-openai installed in venv (1.2.x)

Phase 5.2 Completion Checklist

SourceMetadata includes document_id in both backend and frontend types
~~DOCX/TXT ingestion generates per-chunk PDF files~~ → DEFERRED to Phase 5.3
citationParser.ts matches [NEC4 ACC] to source NEC4 ACC.pdf (fuzzy matching)
citationParser.ts renders fallback link to /rag-database?document=xxx when chunk_file_path is null but document_id exists
RAGDatabasePage auto-expands document from ?document= URL param
All existing citation parser tests pass (14 tests)
All existing ResponsePanel tests pass
generate_per_subq seed prompt tightened: "Copy the exact bracket labels shown"

Rollback Plan

If with_structured_output() causes issues in production:

The complete_structured() method wraps errors in LLMClientError — same exception type as existing complete()
QueryDecomposer.decompose() has a fallback to legacy complete() + json.loads() path
The _extract_json_from_markdown() function is preserved for backward compatibility
If LangChain is a complete failure, revert requirements.txt and llm_client.py changes (3 files), keeping the Pydantic model and improved logging

Phase 5.3 — DOCX/TXT PDF Generation (DEFERRED)

Generate per-chunk PDF files for DOCX/TXT documents at ingestion time so they have the same chunk_file_path → PDF viewer flow as PDF documents.

Status: Deferred. Phase 5.2 fallback links (/rag-database?document=xxx) are sufficient. Revisit after Phase 5.4 if plain-text chunk views are still needed alongside highlighted views.

Phase 5.4 — Sentence-Level Highlighting (PLANNED)

Problem

When a user clicks a citation link to view a cited chunk, they see the full chunk text (up to ~1000 tokens) and must scan it manually to find the sentences that made the chunk relevant. This is especially painful for long, dense chunks.

Solution

On-the-fly highlighted HTML chunk views served by the backend. When a citation link is clicked, the frontend passes the sub-question that retrieved that chunk. The backend splits the chunk into sentences, computes embedding similarity of each sentence to the sub-question, and returns a styled HTML page with relevant sentences highlighted.

Why HTML, not PDF?

Approach	Complexity	Works for all doc types?	Preserves original formatting?
Highlighted HTML page	Low	✅ Yes (uses chunk text)	❌ Plain text only
Highlighted PDF via reportlab	Medium	✅ Yes (new PDF)	❌ Plain text only
Overlay highlights on existing PDF	High	⚠️ PDF only	✅ Yes

Recommendation: HTML page. Simple, fast, works uniformly for PDF/DOCX/TXT chunks. Original formatting is preserved in the existing PDF viewer (chunk_file_path link) — the highlighted HTML view is a supplementary view reached via a separate button/link. The two views coexist: "View Original PDF" vs "View Highlighted Text".

How It Works (No LLM Needed)

User clicks citation [NEC4 ACC, chunk 3]
       │
       ▼
Frontend sends: GET /api/v1/chunks/highlight?document_id=abc&chunk_index=2&sub_question=...
       │
       ▼
Backend:
  1. Fetch chunk text from ChromaDB                          [chromadb get()]
  2. Split into sentences                                    [nltk.sent_tokenize or regex]
  3. Embed sub-question                                      [existing embedding model]
  4. Embed each sentence (batch, parallel)                   [same model]
  5. Compute cosine similarity per sentence vs sub-question  [numpy]
  6. Return HTML with yellow background on sentences > threshold
       │
       ▼
Frontend renders HTML in an iframe or new tab

What Gets Highlighted

┌──────────────────────────────────────────────────────────┐
│ Chunk: NEC4 ACC, page 12          [View Original PDF →]  │
├──────────────────────────────────────────────────────────┤
│                                                            │
│ The programme shall be prepared in a form acceptable to   │
│ the Project Manager. It shall include:                    │
│                                                            │
│ ████████████████████████████████████████████████████████ │
│ █ The starting date, access dates, and Key Dates.       █ │  ← High similarity
│ ████████████████████████████████████████████████████████ │
│                                                            │
│ The Contractor shall submit a first programme within      │
│ ████████████████████████████████████████████████████████ │
│ █ two weeks of the starting date.                       █ │  ← High similarity
│ ████████████████████████████████████████████████████████ │
│                                                            │
│ The Project Manager may instruct the Contractor to        │
│ submit a revised programme showing the effects of a       │
│ compensation event. This does not affect the Contractor's │
│ right to be paid for preparing the programme.             │  ← Low similarity (no highlight)
│                                                            │
└──────────────────────────────────────────────────────────┘

Key Design Decisions

#	Decision	Rationale
1	HTML page, not PDF	Zero dependency (`reportlab` not needed). Faster to generate. CSS-based highlighting is more flexible. Original PDF view remains available separately.
2	Embedding similarity, not LLM	No LLM API cost and little added latency. The embedding model is already running, and cosine similarity is cheap to compute.
3	Sentence-level granularity	Paragraph-level is too coarse (a whole paragraph may be only marginally relevant). Word/phrase-level is too noisy. Sentences are the natural unit of meaning.
4	Embed sentences in batch	A 1000-token chunk has ~8-12 sentences. One batch embedding call is fast (single API round-trip).
5	Configurable threshold (env var)	`HIGHLIGHT_SIMILARITY_THRESHOLD` (default 0.5). Tune per embedding model.
6	Cache sentence embeddings per chunk	A chunk may be cited in multiple queries. Cache sentence embeddings in ChromaDB metadata or SQLite to avoid recomputation.
7	Graceful degradation	If embedding fails → return plain text chunk view. If sentence splitting fails → highlight entire chunk.
8	Frontend: "View Highlighted" link alongside "View PDF"	The existing PDF viewer link (`chunk_file_path`) stays. A second link opens the highlighted HTML view. Both visible, user chooses.

Implementation Tasks

Task 5.4.1: Backend — Sentence splitting utility

Create backend/app/utils/sentence_splitter.py
Function split_sentences(text: str) -> list[dict] returns [{text, start_char, end_char}, ...]
Use nltk.sent_tokenize with fallback to regex (re.split(r'(?<=[.!?])\s+', text))
NLTK punkt data auto-downloaded on first use (or bundled)
Handle edge cases: empty text, single sentence, lists/bullets
Test file: test_phase5_sentence_splitter.py
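
As a rough sketch of the regex fallback only (the `nltk.sent_tokenize` path is not shown), the splitter below returns the `{text, start_char, end_char}` shape that Task 5.4.1 describes. It uses `finditer` instead of `re.split` so that character offsets come directly from the match objects; that substitution is a choice made for this example, not something the plan prescribes.

```python
import re
from typing import Dict, List, Union

# A sentence starts at a non-space character and runs to the first
# ., ! or ? that is followed by whitespace or the end of the text.
_SENTENCE_RE = re.compile(r"\S.*?(?:[.!?]+(?=\s|$)|$)", re.DOTALL)


def split_sentences(text: str) -> List[Dict[str, Union[str, int]]]:
    """Split `text` into sentences with their character offsets."""
    return [
        {"text": m.group(), "start_char": m.start(), "end_char": m.end()}
        for m in _SENTENCE_RE.finditer(text)
    ]
```

Empty input yields an empty list, and text without terminal punctuation (for example a bullet list) comes back as a single sentence. Abbreviations such as "e.g." will still cause false splits, which is one reason to prefer the NLTK tokenizer when it is available.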

Task 5.4.2: Backend — Highlighted chunk endpoint

New endpoint: GET /api/v1/chunks/highlight
Query params: document_id, chunk_index, sub_question
Returns text/html (not JSON)
Logic in backend/app/services/chunk_highlight_service.py:
1. Fetch chunk from ChromaDB by document_id + chunk_index
2. Split into sentences via split_sentences()
3. Get embedding for sub_question via existing embedding model
4. Get embeddings for all sentences in one batch call
5. Compute cosine similarity: np.dot(q_emb, s_emb) / (norm(q_emb) * norm(s_emb))
6. Mark sentences with similarity > threshold as highlighted
7. Render HTML template with inline CSS (yellow background, subtle border)
Test file: test_phase5_chunk_highlight.py
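
Steps 3 to 7 of the highlight logic can be sketched as below. This is an illustration under several assumptions: `embed_batch` is a hypothetical callable standing in for the existing embedding model, the sub-question and sentences are embedded in one combined batch, cosine similarity is computed with the standard library rather than numpy so the example has no dependencies, and the HTML is reduced to a single paragraph with `<mark>` tags instead of a full template.

```python
import html
import math
import os
from typing import Callable, List, Sequence

# Threshold default taken from the plan's HIGHLIGHT_SIMILARITY_THRESHOLD setting.
DEFAULT_THRESHOLD = float(os.environ.get("HIGHLIGHT_SIMILARITY_THRESHOLD", "0.5"))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def render_highlighted_chunk(
    sentences: List[str],
    sub_question: str,
    embed_batch: Callable[[List[str]], List[Sequence[float]]],
    threshold: float = DEFAULT_THRESHOLD,
) -> str:
    """Return an HTML paragraph with sentences above `threshold` wrapped in <mark>."""
    try:
        vectors = embed_batch([sub_question] + sentences)
        flags = [cosine_similarity(vectors[0], v) > threshold for v in vectors[1:]]
        if len(flags) != len(sentences):
            raise ValueError("embedding count mismatch")
    except Exception:
        # Graceful degradation: serve the chunk as plain text.
        flags = [False] * len(sentences)

    parts = []
    for sentence, highlighted in zip(sentences, flags):
        safe = html.escape(sentence)  # chunk text is untrusted input
        if highlighted:
            safe = f'<mark style="background: #fff3a3">{safe}</mark>'
        parts.append(safe)
    return "<p>" + " ".join(parts) + "</p>"
```

Escaping every sentence before wrapping it matters because the endpoint returns `text/html` built from document content.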

Task 5.4.3: Frontend — "View Highlighted" link in citations and sources

In citationParser.ts and ResponsePanel.tsx, add a "🔍" or "View Highlighted" link next to each source
Link target: /api/v1/chunks/highlight?document_id=...&chunk_index=...&sub_question=...
The sub-question is the one that retrieved this chunk (already available in the sources structure: source.sub_question_index → look up sub-question text)
Open in new tab or modal
Test file: Update citationParser.test.ts and ResponsePanel.test.tsx

Task 5.4.4: Integration testing

Verify highlight endpoint returns 200 with valid HTML for all doc types (PDF, DOCX, TXT)
Verify sentence highlighting is proportional to relevance (spot-check manually)
Verify caching works (second request for same chunk is faster)
Verify graceful degradation (embedding API down → plain text still served)
Run full test suite

Test Files

#	Test File	Coverage
T5.4.1	`backend/app/test/test_phase5_sentence_splitter.py`	Sentence splitting: English, mixed punctuation, empty, single sentence, bullet lists
T5.4.2	`backend/app/test/test_phase5_chunk_highlight.py`	Highlight endpoint: valid request → HTML with highlights, threshold filtering, no sentences above threshold → all plain, missing document/chunk → 404, embedding failure → fallback plain text
T5.4.3	`frontend/src/test/utils/citationParser.test.ts` (update)	Citation links include highlight URL when sub-question context available
T5.4.4	`frontend/src/test/components/ResponsePanel.test.tsx` (update)	Sources section renders "View Highlighted" link alongside "View PDF"

Acceptance Tests

#	Test File	Coverage
AT5.4.1	`backend/app/test/acceptance/test_acceptance_phase5_highlight.py`	Real LLM query → real embeddings → open highlighted view → verify yellow spans exist on relevant sentences

Commit Plan

Commit	Message	Scope
1	`feat: add LangChain deps and SubQuestions Pydantic model`	Tasks 5.1.1 + 5.1.2 + tests
2	`feat: add LLMClient.complete_structured() with LangChain`	Task 5.1.3 + tests
3	`feat: refactor QueryDecomposer to use structured output with fallback`	Task 5.1.4 + tests
4	`chore: update decompose seed prompt for structured output`	Task 5.1.5
5	`feat: add document_id to SourceMetadata model`	Task 5.2.1 + tests
6	`feat: fuzzy citation matching and document fallback links`	Tasks 5.2.3 + 5.2.4 + tests
7	`feat: sentence-level chunk highlighting via embedding similarity`	Phase 5.4 (all tasks)
