feat(prompts): integrate filter_per_subq with PromptService, fix seed bugs, restructure UI
Break the hardcoded per-sub-q filter prompt into 3 editable PromptService templates (filter_intro, filter_section, filter_outro) with placeholders for the for-loop iteration pattern. Refactor RelevanceFilter._build_per_subq_prompt() to compose them at runtime, falling back to built-in defaults when PromptService is unavailable.
Fix two latent bugs from Package 4:
- generate_per_subq was called by rag.py but never added to _VALID_STEPS or DB seed (would ValueError at runtime)
- _SEED_GENERATE placeholder mismatch: flat generate_response() expects {question}/{context} but Package 4 changed it to {context_sections}. Restored flat template; generate_per_subq now holds {context_sections}.
Add database backfill migration in seed_default_profiles() to INSERT OR IGNORE missing steps into existing profile rows, ensuring all 7 steps exist on restart.
Restructure System Prompts UI: remove unused flat filter/generate steps, replace with Step 2.1-2.3 (filter_intro/section/outro) and Step 3 (generate_per_subq). Update PlaceholderDocs with {context_sections}, {subq_idx}, {subq_question}.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
This commit is contained in:
parent
7069f15b95
commit
3b868a0133
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
**Source**: User request (2026-04-26)
|
||||
**Scope**: Refactor the 3-step RAG query pipeline so retrieval, filtering, and response generation are organized per sub-question instead of batch-flattened.
|
||||
**Status**: ✅ Complete — All 7 sub-phases implemented (2026-04-26)
|
||||
**Status**: ✅ Complete — All 7 sub-phases implemented (2026-04-26). Phase 4a Prompt Integration added (2026-04-27).
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -711,6 +711,72 @@ Minor enhancement:
|
|||
|
||||
---
|
||||
|
||||
## Phase 4a: Prompt Service Integration for Per-Sub-Q Filter (2026-04-27)
|
||||
|
||||
**Root issue**: `filter_per_subquestion()` in `relevance_filter.py` had a hardcoded prompt (`_build_per_subq_prompt()`) — completely bypassing `PromptService`. Users could not edit the per-sub-q filter prompt on the System Prompts page, unlike the flat `filter` step which was already prompt-service-driven.
|
||||
|
||||
**Solution**: Broke the per-sub-q filter prompt into **3 composable pieces**, each a separately editable step on the System Prompts page:
|
||||
|
||||
| Step Name | Label | Placeholders | Default |
|
||||
|-----------|-------|-------------|---------|
|
||||
| `filter_intro` | Step 2.1: Filter Intro (Preamble) | *(none)* | `"Evaluate each chunk for relevance to its associated sub-question only."` |
|
||||
| `filter_section` | Step 2.2: Filter Section (Per Sub-Q) | `{subq_idx}`, `{subq_question}`, `{chunks}` | `'Sub-question {subq_idx}: "{subq_question}"\n{chunks}'` |
|
||||
| `filter_outro` | Step 2.3: Filter Outro (Format) | *(none)* | JSON format instructions + example |
|
||||
|
||||
The `RelevanceFilter._build_per_subq_prompt()` now composes them at runtime:
|
||||
```
|
||||
filter_intro + [filter_section.replace(...) for each sub-q] + filter_outro
|
||||
```
|
||||
Falls back to built-in defaults when `PromptService` is unavailable.
|
||||
|
||||
### Bugs Fixed
|
||||
|
||||
1. **`generate_per_subq` not seeded**: `rag.py` called `get_prompt_template("generate_per_subq")` but this step name was never added to `_VALID_STEPS`, `_SEED_STEPS`, or `_SEED_TEMPLATES` — would crash at runtime with `ValueError`. Now properly seeded with `{context_sections}` placeholder.
|
||||
|
||||
2. **`_SEED_GENERATE` placeholder mismatch from Package 4**: The flat `generate_response()` expects `{question}`/`{context}` placeholders, but Package 4 changed the seed template to use `{context_sections}` (intended for per-sub-q generate). Restored flat template; `generate_per_subq` now holds `{context_sections}`.
|
||||
|
||||
### Database Backfill Migration
|
||||
|
||||
The existing `seed_default_profiles()` only inserted steps for NEWLY created profiles. Added a backfill loop that iterates ALL existing profiles and `INSERT OR IGNORE`s any missing step names. This ensures existing A/B/C profiles pick up `filter_intro`, `filter_section`, `filter_outro`, and `generate_per_subq` on restart.
|
||||
|
||||
### System Prompts UI Restructured
|
||||
|
||||
The flat `filter` and `generate` steps were removed from the UI (they're unused by the current pipeline). The page now shows 5 steps:
|
||||
|
||||
| UI Order | Label | Step Key |
|
||||
|----------|-------|----------|
|
||||
| 1 | Step 1: Query Decomposition | `decompose` |
|
||||
| 2 | Step 2.1: Filter Intro (Preamble) | `filter_intro` |
|
||||
| 3 | Step 2.2: Filter Section (Per Sub-Q) | `filter_section` |
|
||||
| 4 | Step 2.3: Filter Outro (Format) | `filter_outro` |
|
||||
| 5 | Step 3: Generate (Per-Sub-Question) | `generate_per_subq` |
|
||||
|
||||
The old `filter` and `generate` templates remain in the DB (for API backward compatibility) but are hidden from the UI.
|
||||
|
||||
### Files Changed
|
||||
|
||||
| File | Change |
|
||||
|------|--------|
|
||||
| `backend/app/core/sqlite_db.py` | 3 new seed templates + `generate_per_subq` seed; backfill migration; restored `_SEED_GENERATE` to `{question}`/`{context}` |
|
||||
| `backend/app/services/prompt_service.py` | Added 4 step names to `_VALID_STEPS` |
|
||||
| `backend/app/routers/prompts.py` | Added 4 step names to `_VALID_STEPS` |
|
||||
| `backend/app/services/relevance_filter.py` | Refactored `_build_per_subq_prompt()` to use PromptService + built-in fallback constants |
|
||||
| `frontend/src/components/PromptEditor.tsx` | Replaced unused flat steps with 5-step per-sub-q layout (Step 2.1-2.3 + Step 3) |
|
||||
| `frontend/src/components/PlaceholderDocs.tsx` | Added `{context_sections}`, `{subq_idx}`, `{subq_question}` docs |
|
||||
| `backend/app/test/conftest.py` | Added 4 new templates to mock |
|
||||
| `backend/app/test/test_phase3_sqlite_db.py` | Updated counts (9→21 prompts) and placeholder assertions |
|
||||
| `backend/app/test/test_phase3_prompt_service.py` | Updated step set + placeholder assertions |
|
||||
| `backend/app/test/test_phase3_prompts_router.py` | Updated step set assertion |
|
||||
| `backend/app/test/test_phase4_prompt_templates.py` | Updated for split generate/generate_per_subq |
|
||||
| `frontend/src/test/components/PromptEditor.test.tsx` | Updated to 5 textareas, new labels, new placeholder layout |
|
||||
| `frontend/src/test/components/PlaceholderDocs.test.tsx` | Updated to 6 placeholders |
|
||||
|
||||
### Test Results (Post-Phase 4a)
|
||||
- **Backend**: 295 passed, 5 skipped (pre-existing)
|
||||
- **Frontend**: 182 passed, 1 pre-existing failure (unrelated `file-input` e2e)
|
||||
|
||||
---
|
||||
|
||||
## Sub-Phase Summary
|
||||
|
||||
| Sub-Phase | Scope | Backend | Frontend | Tests | Status |
|
||||
|
|
@ -722,6 +788,7 @@ Minor enhancement:
|
|||
| 4.5 | Frontend types + state | None | `types/index.ts`, `lib/queries.tsx` | `test_phase4_stream_state.test.tsx`, `test_phase4_types.test.ts` | ✅ Complete |
|
||||
| 4.6 | Frontend rendering | None | `ResponsePanel.tsx`, `citationParser.ts`, `ExtractedQuestionsDisplay.tsx` | `test_phase4_response_panel.test.tsx`, `test_phase4_citation_parser.test.ts` | ✅ Complete |
|
||||
| 4.7 | Testing & polish | All affected files | All affected files | Integration + acceptance + e2e tests | ✅ Complete |
|
||||
| 4a | Prompt service integration for filter_per_subq | `sqlite_db.py`, `prompt_service.py`, `prompts.py`, `relevance_filter.py` | `PromptEditor.tsx`, `PlaceholderDocs.tsx` | Updated 7 test files, 13 total files changed | ✅ Complete |
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -22,28 +22,60 @@ _SEED_FILTER = (
|
|||
)
|
||||
|
||||
_SEED_GENERATE = (
|
||||
"You must answer each sub-question using ONLY the document chunks provided for it.\n"
|
||||
"Do not use any external knowledge.\n"
|
||||
"Format your answer as markdown sections — one section per sub-question.\n"
|
||||
"Each section should start with \"## Sub-question N: <the question>\"\n"
|
||||
"Each section should contain 1-5 bullet points.\n"
|
||||
"Cite your sources inline using bracket labels, e.g. [filename, page N].\n"
|
||||
"Place the citation at the end of each relevant bullet point.\n\n"
|
||||
"Question: {question}\n\n"
|
||||
"Answer the question using ONLY these document chunks. "
|
||||
"Do not use any external knowledge. "
|
||||
"Format your answer as bullet points. "
|
||||
"Cite your sources inline using bracket labels, e.g. [filename, page N]. "
|
||||
"Place the citation at the end of each relevant point.\n\n"
|
||||
"Document chunks:\n{context}\n\n"
|
||||
"Answer:"
|
||||
)
|
||||
|
||||
_SEED_GENERATE_PER_SUBQ = (
|
||||
"Answer each sub-question using ONLY its document chunks.\n"
|
||||
"Format as markdown sections with ## Sub-question N: headers.\n"
|
||||
"{context_sections}\n\n"
|
||||
"Answer:"
|
||||
)
|
||||
|
||||
_SEED_FILTER_INTRO = (
|
||||
"Evaluate each chunk for relevance to its associated sub-question only."
|
||||
)
|
||||
|
||||
_SEED_FILTER_SECTION = (
|
||||
'\nSub-question {subq_idx}: "{subq_question}"\n{chunks}'
|
||||
)
|
||||
|
||||
_SEED_FILTER_OUTRO = (
|
||||
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question.\n"
|
||||
'Return a JSON object mapping sub-question indices to arrays of scores.\n'
|
||||
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}'
|
||||
)
|
||||
|
||||
_SEED_PROFILES = [
|
||||
("A", 1),
|
||||
("B", 0),
|
||||
("C", 0),
|
||||
]
|
||||
|
||||
_SEED_STEPS = ["decompose", "filter", "generate"]
|
||||
_SEED_STEPS = [
|
||||
"decompose",
|
||||
"filter",
|
||||
"generate",
|
||||
"generate_per_subq",
|
||||
"filter_intro",
|
||||
"filter_section",
|
||||
"filter_outro",
|
||||
]
|
||||
_SEED_TEMPLATES = {
|
||||
"decompose": _SEED_DECOMPOSE,
|
||||
"filter": _SEED_FILTER,
|
||||
"generate": _SEED_GENERATE,
|
||||
"generate_per_subq": _SEED_GENERATE_PER_SUBQ,
|
||||
"filter_intro": _SEED_FILTER_INTRO,
|
||||
"filter_section": _SEED_FILTER_SECTION,
|
||||
"filter_outro": _SEED_FILTER_OUTRO,
|
||||
}
|
||||
|
||||
# ── Connection factories ────────────────────────────────────────────────────
|
||||
|
|
@ -156,5 +188,18 @@ def seed_default_profiles(conn: sqlite3.Connection) -> None:
|
|||
(profile_id, step, _SEED_TEMPLATES[step]),
|
||||
)
|
||||
|
||||
# Backfill: add any newly-added step names to existing profiles that were
|
||||
# created before the new steps existed. INSERT OR IGNORE is idempotent.
|
||||
profile_rows = conn.execute(
|
||||
"SELECT id FROM system_prompt_profiles"
|
||||
).fetchall()
|
||||
for row in profile_rows:
|
||||
for step in _SEED_STEPS:
|
||||
if step in _SEED_TEMPLATES:
|
||||
conn.execute(
|
||||
"INSERT OR IGNORE INTO system_prompts (profile_id, step_name, prompt_template) VALUES (?, ?, ?)",
|
||||
(row["id"], step, _SEED_TEMPLATES[step]),
|
||||
)
|
||||
|
||||
conn.commit()
|
||||
logger.info("Default profiles (A/B/C) seeded.")
|
||||
|
|
|
|||
|
|
@ -17,7 +17,15 @@ logger = logging.getLogger(__name__)
|
|||
router = APIRouter(prefix="/api/v1/prompts", tags=["prompts"])
|
||||
|
||||
_VALID_NAMES = {"A", "B", "C"}
|
||||
_VALID_STEPS = {"decompose", "filter", "generate"}
|
||||
_VALID_STEPS = {
|
||||
"decompose",
|
||||
"filter",
|
||||
"generate",
|
||||
"generate_per_subq",
|
||||
"filter_intro",
|
||||
"filter_section",
|
||||
"filter_outro",
|
||||
}
|
||||
|
||||
|
||||
def _ensure_valid_name(name: str) -> None:
|
||||
|
|
|
|||
|
|
@ -12,7 +12,15 @@ from app.core.sqlite_db import _SEED_TEMPLATES
|
|||
logger = logging.getLogger(__name__)
|
||||
|
||||
_VALID_NAMES = {"A", "B", "C"}
|
||||
_VALID_STEPS = {"decompose", "filter", "generate"}
|
||||
_VALID_STEPS = {
|
||||
"decompose",
|
||||
"filter",
|
||||
"generate",
|
||||
"generate_per_subq",
|
||||
"filter_intro",
|
||||
"filter_section",
|
||||
"filter_outro",
|
||||
}
|
||||
|
||||
|
||||
def _connect(db_path: str) -> sqlite3.Connection:
|
||||
|
|
|
|||
|
|
@ -16,6 +16,18 @@ _BUILTIN_FILTER_TEMPLATE = (
|
|||
"Return JSON array of scores.\n{chunks}\n"
|
||||
)
|
||||
|
||||
_BUILTIN_FILTER_INTRO = "Evaluate each chunk for relevance to its associated sub-question only."
|
||||
|
||||
_BUILTIN_FILTER_SECTION = (
|
||||
'\nSub-question {subq_idx}: "{subq_question}"\n{chunks}'
|
||||
)
|
||||
|
||||
_BUILTIN_FILTER_OUTRO = (
|
||||
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question.\n"
|
||||
'Return a JSON object mapping sub-question indices to arrays of scores.\n'
|
||||
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}'
|
||||
)
|
||||
|
||||
|
||||
def _extract_json_from_markdown(response: str) -> str:
|
||||
if not isinstance(response, str):
|
||||
|
|
@ -109,22 +121,28 @@ class RelevanceFilter:
|
|||
sub_questions: List[str],
|
||||
sub_chunks: List[List[Tuple[str, Dict]]],
|
||||
) -> str:
|
||||
sections: List[str] = [
|
||||
"Evaluate each chunk for relevance to its associated sub-question only."
|
||||
]
|
||||
for idx, (sq, chunks) in enumerate(zip(sub_questions, sub_chunks)):
|
||||
sections.append(f'\nSub-question {idx}: "{sq}"')
|
||||
for c_idx, (text, _meta) in enumerate(chunks):
|
||||
sections.append(f"Chunk {c_idx}: {text}")
|
||||
if self._prompt_service is not None:
|
||||
intro = self._prompt_service.get_prompt_template("filter_intro")
|
||||
section = self._prompt_service.get_prompt_template("filter_section")
|
||||
outro = self._prompt_service.get_prompt_template("filter_outro")
|
||||
else:
|
||||
intro = _BUILTIN_FILTER_INTRO
|
||||
section = _BUILTIN_FILTER_SECTION
|
||||
outro = _BUILTIN_FILTER_OUTRO
|
||||
|
||||
sections.append(
|
||||
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question."
|
||||
parts: List[str] = [intro]
|
||||
for idx, (sq, chunks) in enumerate(zip(sub_questions, sub_chunks)):
|
||||
formatted_chunks = "\n".join(
|
||||
f"Chunk {c_idx}: {text}" for c_idx, (text, _meta) in enumerate(chunks)
|
||||
)
|
||||
sections.append(
|
||||
'Return a JSON object mapping sub-question indices to arrays of scores.\n'
|
||||
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}'
|
||||
parts.append(
|
||||
section
|
||||
.replace("{subq_idx}", str(idx))
|
||||
.replace("{subq_question}", sq)
|
||||
.replace("{chunks}", formatted_chunks)
|
||||
)
|
||||
return "\n".join(sections)
|
||||
parts.append(outro)
|
||||
return "\n".join(parts)
|
||||
|
||||
async def filter_per_subquestion(
|
||||
self,
|
||||
|
|
|
|||
|
|
@ -66,6 +66,15 @@ def mock_prompt_service():
|
|||
"{context_sections}\n\n"
|
||||
"Answer:"
|
||||
),
|
||||
"filter_intro": "Evaluate each chunk for relevance to its associated sub-question only.",
|
||||
"filter_section": (
|
||||
'\nSub-question {subq_idx}: "{subq_question}"\n{chunks}'
|
||||
),
|
||||
"filter_outro": (
|
||||
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question.\n"
|
||||
'Return a JSON object mapping sub-question indices to arrays of scores.\n'
|
||||
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}'
|
||||
),
|
||||
}
|
||||
|
||||
class _MockPromptService:
|
||||
|
|
|
|||
|
|
@ -95,10 +95,15 @@ def test_get_profile_prompts_returns_all_three_steps(tmp_path):
|
|||
svc = _create_service(tmp_path)
|
||||
prompts = svc.get_profile_prompts("A")
|
||||
|
||||
assert set(prompts.keys()) == {"decompose", "filter", "generate"}
|
||||
assert set(prompts.keys()) == {
|
||||
"decompose", "filter", "generate",
|
||||
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
|
||||
}
|
||||
assert "{question}" in prompts["decompose"]
|
||||
assert "{question}" in prompts["filter"]
|
||||
assert "{context_sections}" in prompts["generate"]
|
||||
assert "{question}" in prompts["generate"]
|
||||
assert "{context}" in prompts["generate"]
|
||||
assert "{context_sections}" in prompts["generate_per_subq"]
|
||||
|
||||
|
||||
def test_get_profile_prompts_invalid_name_raises(tmp_path):
|
||||
|
|
|
|||
|
|
@ -68,7 +68,10 @@ def test_get_profile_prompts_a_returns_200(client):
|
|||
|
||||
data = resp.json()
|
||||
assert data["profile_name"] == "A"
|
||||
assert set(data["prompts"].keys()) == {"decompose", "filter", "generate"}
|
||||
assert set(data["prompts"].keys()) == {
|
||||
"decompose", "filter", "generate",
|
||||
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
|
||||
}
|
||||
|
||||
|
||||
def test_get_profile_prompts_invalid_returns_400(client):
|
||||
|
|
|
|||
|
|
@ -251,7 +251,7 @@ def test_seed_default_profiles_A_is_active(tmp_path, monkeypatch):
|
|||
|
||||
|
||||
def test_seed_default_profiles_creates_nine_prompts(tmp_path, monkeypatch):
|
||||
"""seed_default_profiles() should insert 9 prompt rows (3 profiles × 3 steps)."""
|
||||
"""seed_default_profiles() should insert 21 prompt rows (3 profiles × 7 steps)."""
|
||||
prompts_path = str(tmp_path / "prompts.db")
|
||||
history_path = str(tmp_path / "history.db")
|
||||
_patch_settings(monkeypatch, prompts_path, history_path)
|
||||
|
|
@ -268,13 +268,17 @@ def test_seed_default_profiles_creates_nine_prompts(tmp_path, monkeypatch):
|
|||
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
|
||||
ORDER BY spp.name, sp.step_name"""
|
||||
).fetchall()
|
||||
assert len(rows) == 9
|
||||
assert len(rows) == 21
|
||||
|
||||
# Each profile should have all 3 steps
|
||||
# Each profile should have all 7 steps
|
||||
expected_steps = {
|
||||
"decompose", "filter", "generate",
|
||||
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
|
||||
}
|
||||
for profile in ("A", "B", "C"):
|
||||
profile_rows = [r for r in rows if r["profile_name"] == profile]
|
||||
steps = {r["step_name"] for r in profile_rows}
|
||||
assert steps == {"decompose", "filter", "generate"}
|
||||
assert steps == expected_steps
|
||||
conn.close()
|
||||
|
||||
|
||||
|
|
@ -311,7 +315,7 @@ def test_seed_default_profiles_contains_expected_templates(tmp_path, monkeypatch
|
|||
assert "{question}" in filter_row["prompt_template"]
|
||||
assert "{chunks}" in filter_row["prompt_template"]
|
||||
|
||||
# Generate must have {context_sections}
|
||||
# Generate must have {question} and {context}
|
||||
generate_row = conn.execute(
|
||||
"""SELECT sp.prompt_template
|
||||
FROM system_prompts sp
|
||||
|
|
@ -319,7 +323,50 @@ def test_seed_default_profiles_contains_expected_templates(tmp_path, monkeypatch
|
|||
WHERE spp.name='A' AND sp.step_name='generate'"""
|
||||
).fetchone()
|
||||
assert generate_row is not None
|
||||
assert "{context_sections}" in generate_row["prompt_template"]
|
||||
assert "{question}" in generate_row["prompt_template"]
|
||||
assert "{context}" in generate_row["prompt_template"]
|
||||
|
||||
# Generate per-subq must have {context_sections}
|
||||
gen_per_subq_row = conn.execute(
|
||||
"""SELECT sp.prompt_template
|
||||
FROM system_prompts sp
|
||||
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
|
||||
WHERE spp.name='A' AND sp.step_name='generate_per_subq'"""
|
||||
).fetchone()
|
||||
assert gen_per_subq_row is not None
|
||||
assert "{context_sections}" in gen_per_subq_row["prompt_template"]
|
||||
|
||||
# Filter intro exists (no required placeholders)
|
||||
filter_intro_row = conn.execute(
|
||||
"""SELECT sp.prompt_template
|
||||
FROM system_prompts sp
|
||||
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
|
||||
WHERE spp.name='A' AND sp.step_name='filter_intro'"""
|
||||
).fetchone()
|
||||
assert filter_intro_row is not None
|
||||
assert len(filter_intro_row["prompt_template"]) > 0
|
||||
|
||||
# Filter section must have per-subq placeholders
|
||||
filter_section_row = conn.execute(
|
||||
"""SELECT sp.prompt_template
|
||||
FROM system_prompts sp
|
||||
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
|
||||
WHERE spp.name='A' AND sp.step_name='filter_section'"""
|
||||
).fetchone()
|
||||
assert filter_section_row is not None
|
||||
assert "{subq_idx}" in filter_section_row["prompt_template"]
|
||||
assert "{subq_question}" in filter_section_row["prompt_template"]
|
||||
assert "{chunks}" in filter_section_row["prompt_template"]
|
||||
|
||||
# Filter outro exists (no required placeholders)
|
||||
filter_outro_row = conn.execute(
|
||||
"""SELECT sp.prompt_template
|
||||
FROM system_prompts sp
|
||||
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
|
||||
WHERE spp.name='A' AND sp.step_name='filter_outro'"""
|
||||
).fetchone()
|
||||
assert filter_outro_row is not None
|
||||
assert len(filter_outro_row["prompt_template"]) > 0
|
||||
|
||||
conn.close()
|
||||
|
||||
|
|
@ -345,7 +392,7 @@ def test_seed_default_profiles_idempotent(tmp_path, monkeypatch):
|
|||
).fetchone()[0]
|
||||
|
||||
assert profile_count == 3
|
||||
assert prompt_count == 9
|
||||
assert prompt_count == 21
|
||||
conn.close()
|
||||
|
||||
|
||||
|
|
@ -361,7 +408,10 @@ def test_all_three_profiles_have_identical_seed_templates(tmp_path, monkeypatch)
|
|||
init_prompts_db(conn)
|
||||
seed_default_profiles(conn)
|
||||
|
||||
for step in ("decompose", "filter", "generate"):
|
||||
for step in (
|
||||
"decompose", "filter", "generate",
|
||||
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
|
||||
):
|
||||
rows = conn.execute(
|
||||
"""SELECT spp.name, sp.prompt_template
|
||||
FROM system_prompts sp
|
||||
|
|
|
|||
|
|
@ -31,36 +31,43 @@ def prompt_service():
|
|||
|
||||
|
||||
def test_generate_template_uses_context_sections(prompt_service):
|
||||
"""When PromptService returns the generate template, {context_sections}
|
||||
placeholder must be present and old {context} removed."""
|
||||
template = prompt_service.get_prompt_template("generate")
|
||||
"""When PromptService returns the generate_per_subq template,
|
||||
{context_sections} placeholder must be present."""
|
||||
template = prompt_service.get_prompt_template("generate_per_subq")
|
||||
|
||||
assert "{context_sections}" in template
|
||||
assert "{context}" not in template
|
||||
|
||||
|
||||
def test_builtin_generate_template_has_context_sections():
|
||||
"""When no prompt_service, the built-in seed template uses
|
||||
{context_sections} instead of {context}."""
|
||||
from app.core.sqlite_db import _SEED_GENERATE
|
||||
"""The per-subq seed template (_SEED_GENERATE_PER_SUBQ) uses
|
||||
{context_sections}, while the flat generate uses {question}/{context}."""
|
||||
from app.core.sqlite_db import _SEED_GENERATE_PER_SUBQ, _SEED_GENERATE
|
||||
|
||||
assert "{context_sections}" in _SEED_GENERATE
|
||||
assert "{context}" not in _SEED_GENERATE
|
||||
assert "{context_sections}" in _SEED_GENERATE_PER_SUBQ
|
||||
assert "{question}" in _SEED_GENERATE
|
||||
assert "{context}" in _SEED_GENERATE
|
||||
|
||||
|
||||
def test_reset_to_defaults_includes_new_generate_template(prompt_service):
|
||||
"""After calling reset_to_defaults() on a profile, the generate step
|
||||
uses {context_sections} placeholder."""
|
||||
"""After calling reset_to_defaults(), generate_per_subq uses
|
||||
{context_sections} and flat generate uses {question}/{context}."""
|
||||
profile_name = prompt_service.get_active_profile_name()
|
||||
|
||||
prompt_service.update_prompt(profile_name, "generate", "custom template with {context}")
|
||||
# Modify both templates
|
||||
prompt_service.update_prompt(profile_name, "generate_per_subq", "custom per-subq")
|
||||
prompt_service.update_prompt(profile_name, "generate", "custom flat")
|
||||
|
||||
modified = prompt_service.get_prompt_template("generate")
|
||||
assert "{context}" in modified
|
||||
assert "{context_sections}" not in modified
|
||||
modified_per_subq = prompt_service.get_prompt_template("generate_per_subq")
|
||||
assert "custom per-subq" == modified_per_subq
|
||||
|
||||
modified_flat = prompt_service.get_prompt_template("generate")
|
||||
assert "custom flat" == modified_flat
|
||||
|
||||
prompt_service.reset_to_defaults(profile_name)
|
||||
|
||||
template = prompt_service.get_prompt_template("generate")
|
||||
assert "{context_sections}" in template
|
||||
assert "{context}" not in template
|
||||
template_per_subq = prompt_service.get_prompt_template("generate_per_subq")
|
||||
assert "{context_sections}" in template_per_subq
|
||||
|
||||
template_flat = prompt_service.get_prompt_template("generate")
|
||||
assert "{question}" in template_flat
|
||||
assert "{context}" in template_flat
|
||||
|
|
|
|||
|
|
@ -7,9 +7,12 @@ interface PlaceholderRow {
|
|||
}
|
||||
|
||||
const PLACEHOLDERS: PlaceholderRow[] = [
|
||||
{ placeholder: '{question}', description: 'The user\'s input question' },
|
||||
{ placeholder: '{question}', description: "The user's input question" },
|
||||
{ placeholder: '{chunks}', description: 'Retrieved document chunks (filter step only)' },
|
||||
{ placeholder: '{context}', description: 'Formatted chunks with citations (generate step only)' },
|
||||
{ placeholder: '{context_sections}', description: 'Per-sub-question context sections (generate per-subq only)' },
|
||||
{ placeholder: '{subq_idx}', description: 'Sub-question index number (filter section only)' },
|
||||
{ placeholder: '{subq_question}', description: 'Sub-question text (filter section only)' },
|
||||
]
|
||||
|
||||
export const PlaceholderDocs: React.FC = () => {
|
||||
|
|
|
|||
|
|
@ -15,14 +15,20 @@ export interface PromptEditorProps {
|
|||
|
||||
const STEPS = [
|
||||
{ key: 'decompose', label: 'Step 1: Query Decomposition', placeholders: ['question'] },
|
||||
{ key: 'filter', label: 'Step 2: Relevance Filtering', placeholders: ['question', 'chunks'] },
|
||||
{ key: 'generate', label: 'Step 3: Response Generation', placeholders: ['question', 'context'] },
|
||||
{ key: 'filter_intro', label: 'Step 2.1: Filter Intro (Preamble)', placeholders: [] },
|
||||
{ key: 'filter_section', label: 'Step 2.2: Filter Section (Per Sub-Q)', placeholders: ['subq_idx', 'subq_question', 'chunks'] },
|
||||
{ key: 'filter_outro', label: 'Step 2.3: Filter Outro (Format)', placeholders: [] },
|
||||
{ key: 'generate_per_subq', label: 'Step 3: Generate (Per-Sub-Question)', placeholders: ['context_sections'] },
|
||||
] as const
|
||||
|
||||
const VALID_PLACEHOLDERS: Record<string, string[]> = {
|
||||
decompose: ['{question}'],
|
||||
filter: ['{question}', '{chunks}'],
|
||||
generate: ['{question}', '{context}'],
|
||||
generate_per_subq: ['{context_sections}'],
|
||||
filter_intro: [],
|
||||
filter_section: ['{subq_idx}', '{subq_question}', '{chunks}'],
|
||||
filter_outro: [],
|
||||
}
|
||||
|
||||
const findUnknownPlaceholders = (template: string, stepKey: string): string[] => {
|
||||
|
|
|
|||
|
|
@ -8,16 +8,19 @@ describe('PlaceholderDocs', () => {
|
|||
expect(screen.getByText('Available Placeholders')).toBeInTheDocument()
|
||||
})
|
||||
|
||||
it('renders all 3 placeholders: {question}, {chunks}, {context}', () => {
|
||||
it('renders all 6 placeholders', () => {
|
||||
render(<PlaceholderDocs />)
|
||||
expect(screen.getByText('{question}')).toBeInTheDocument()
|
||||
expect(screen.getByText('{chunks}')).toBeInTheDocument()
|
||||
expect(screen.getByText('{context}')).toBeInTheDocument()
|
||||
expect(screen.getByText('{context_sections}')).toBeInTheDocument()
|
||||
expect(screen.getByText('{subq_idx}')).toBeInTheDocument()
|
||||
expect(screen.getByText('{subq_question}')).toBeInTheDocument()
|
||||
})
|
||||
|
||||
it('renders each placeholder in a monospace element', () => {
|
||||
render(<PlaceholderDocs />)
|
||||
const placeholders = ['{question}', '{chunks}', '{context}']
|
||||
const placeholders = ['{question}', '{chunks}', '{context}', '{context_sections}', '{subq_idx}', '{subq_question}']
|
||||
placeholders.forEach((ph) => {
|
||||
const el = screen.getByText(ph)
|
||||
expect(el.tagName.toLowerCase()).toBe('code')
|
||||
|
|
@ -29,6 +32,9 @@ describe('PlaceholderDocs', () => {
|
|||
expect(screen.getByText("The user's input question")).toBeInTheDocument()
|
||||
expect(screen.getByText('Retrieved document chunks (filter step only)')).toBeInTheDocument()
|
||||
expect(screen.getByText('Formatted chunks with citations (generate step only)')).toBeInTheDocument()
|
||||
expect(screen.getByText('Per-sub-question context sections (generate per-subq only)')).toBeInTheDocument()
|
||||
expect(screen.getByText('Sub-question index number (filter section only)')).toBeInTheDocument()
|
||||
expect(screen.getByText('Sub-question text (filter section only)')).toBeInTheDocument()
|
||||
})
|
||||
|
||||
it('hides placeholder details when toggle button is clicked', () => {
|
||||
|
|
@ -39,6 +45,7 @@ describe('PlaceholderDocs', () => {
|
|||
expect(screen.queryByText('{question}')).not.toBeInTheDocument()
|
||||
expect(screen.queryByText('{chunks}')).not.toBeInTheDocument()
|
||||
expect(screen.queryByText('{context}')).not.toBeInTheDocument()
|
||||
expect(screen.queryByText('{context_sections}')).not.toBeInTheDocument()
|
||||
})
|
||||
|
||||
it('shows placeholder details again when toggle is clicked twice', () => {
|
||||
|
|
@ -50,5 +57,6 @@ describe('PlaceholderDocs', () => {
|
|||
|
||||
fireEvent.click(toggleBtn)
|
||||
expect(screen.getByText('{question}')).toBeInTheDocument()
|
||||
expect(screen.getByText('{context_sections}')).toBeInTheDocument()
|
||||
})
|
||||
})
|
||||
|
|
|
|||
|
|
@ -4,8 +4,10 @@ import { PromptEditor } from '../../components/PromptEditor'
|
|||
|
||||
const mockPrompts: Record<string, string> = {
|
||||
decompose: 'Decompose template with {question}',
|
||||
filter: 'Filter template with {question} and {chunks}',
|
||||
generate: 'Generate template with {question} and {context}',
|
||||
filter_intro: 'Evaluate each chunk for relevance',
|
||||
filter_section: 'Sub-question {subq_idx}: "{subq_question}"\n{chunks}',
|
||||
filter_outro: 'Return JSON object mapping sub-question indices to scores.',
|
||||
generate_per_subq: 'Generate per-subq with {context_sections}',
|
||||
}
|
||||
|
||||
const defaultProps = {
|
||||
|
|
@ -29,46 +31,54 @@ describe('PromptEditor', () => {
|
|||
defaultProps.onCancel.mockClear()
|
||||
})
|
||||
|
||||
it('renders 3 textareas for decompose, filter, and generate steps', () => {
|
||||
it('renders 5 textareas for the active steps', () => {
|
||||
render(<PromptEditor {...defaultProps} />)
|
||||
const textareas = screen.getAllByRole('textbox')
|
||||
expect(textareas).toHaveLength(3)
|
||||
expect(textareas).toHaveLength(5)
|
||||
})
|
||||
|
||||
it('renders labels for each step', () => {
|
||||
render(<PromptEditor {...defaultProps} />)
|
||||
expect(screen.getByText('Step 1: Query Decomposition')).toBeInTheDocument()
|
||||
expect(screen.getByText('Step 2: Relevance Filtering')).toBeInTheDocument()
|
||||
expect(screen.getByText('Step 3: Response Generation')).toBeInTheDocument()
|
||||
expect(screen.getByText('Step 2.1: Filter Intro (Preamble)')).toBeInTheDocument()
|
||||
expect(screen.getByText('Step 2.2: Filter Section (Per Sub-Q)')).toBeInTheDocument()
|
||||
expect(screen.getByText('Step 2.3: Filter Outro (Format)')).toBeInTheDocument()
|
||||
expect(screen.getByText('Step 3: Generate (Per-Sub-Question)')).toBeInTheDocument()
|
||||
})
|
||||
|
||||
it('shows valid placeholders in the label area for each step', () => {
|
||||
render(<PromptEditor {...defaultProps} />)
|
||||
// {question} appears in all 3 steps (decompose, filter, generate)
|
||||
const questionPlaceholders = screen.getAllByText('{question}')
|
||||
expect(questionPlaceholders).toHaveLength(3)
|
||||
// {chunks} appears only in filter step
|
||||
// {question} appears only in decompose step
|
||||
expect(screen.getByText('{question}')).toBeInTheDocument()
|
||||
// {chunks} appears only in filter_section
|
||||
expect(screen.getByText('{chunks}')).toBeInTheDocument()
|
||||
// {context} appears only in generate step
|
||||
expect(screen.getByText('{context}')).toBeInTheDocument()
|
||||
// {context_sections} appears in generate_per_subq
|
||||
expect(screen.getByText('{context_sections}')).toBeInTheDocument()
|
||||
// {subq_idx}, {subq_question} appear in filter_section
|
||||
expect(screen.getByText('{subq_idx}')).toBeInTheDocument()
|
||||
expect(screen.getByText('{subq_question}')).toBeInTheDocument()
|
||||
// {context} is no longer shown (flat generate removed)
|
||||
expect(screen.queryByText('{context}')).not.toBeInTheDocument()
|
||||
})
|
||||
|
||||
it('renders per-step reset buttons that call onResetStep when clicked', () => {
|
||||
// Mock window.confirm to return true
|
||||
const confirmSpy = vi.spyOn(window, 'confirm').mockReturnValue(true)
|
||||
|
||||
render(<PromptEditor {...defaultProps} />)
|
||||
const resetButtons = screen.getAllByTitle(/reset.*to default/i)
|
||||
expect(resetButtons).toHaveLength(3)
|
||||
expect(resetButtons).toHaveLength(5)
|
||||
|
||||
fireEvent.click(resetButtons[0])
|
||||
expect(defaultProps.onResetStep).toHaveBeenCalledWith('decompose')
|
||||
|
||||
fireEvent.click(resetButtons[1])
|
||||
expect(defaultProps.onResetStep).toHaveBeenCalledWith('filter')
|
||||
expect(defaultProps.onResetStep).toHaveBeenCalledWith('filter_intro')
|
||||
|
||||
fireEvent.click(resetButtons[2])
|
||||
expect(defaultProps.onResetStep).toHaveBeenCalledWith('generate')
|
||||
fireEvent.click(resetButtons[3])
|
||||
expect(defaultProps.onResetStep).toHaveBeenCalledWith('filter_outro')
|
||||
|
||||
fireEvent.click(resetButtons[4])
|
||||
expect(defaultProps.onResetStep).toHaveBeenCalledWith('generate_per_subq')
|
||||
|
||||
confirmSpy.mockRestore()
|
||||
})
|
||||
|
|
@ -133,8 +143,10 @@ describe('PromptEditor', () => {
|
|||
const textareas = screen.getAllByRole('textbox')
|
||||
|
||||
expect(textareas[0]).toHaveValue('Decompose template with {question}')
|
||||
expect(textareas[1]).toHaveValue('Filter template with {question} and {chunks}')
|
||||
expect(textareas[2]).toHaveValue('Generate template with {question} and {context}')
|
||||
expect(textareas[1]).toHaveValue('Evaluate each chunk for relevance')
|
||||
expect(textareas[2]).toHaveValue('Sub-question {subq_idx}: "{subq_question}"\n{chunks}')
|
||||
expect(textareas[3]).toHaveValue('Return JSON object mapping sub-question indices to scores.')
|
||||
expect(textareas[4]).toHaveValue('Generate per-subq with {context_sections}')
|
||||
})
|
||||
|
||||
it('calls onUpdate when typing in a textarea', () => {
|
||||
|
|
@ -149,13 +161,11 @@ describe('PromptEditor', () => {
|
|||
it('displays character count for each textarea', () => {
|
||||
render(<PromptEditor {...defaultProps} />)
|
||||
|
||||
const decomposeLength = mockPrompts.decompose.length
|
||||
const filterLength = mockPrompts.filter.length
|
||||
const generateLength = mockPrompts.generate.length
|
||||
|
||||
expect(screen.getByText(`${decomposeLength} characters`)).toBeInTheDocument()
|
||||
expect(screen.getByText(`${filterLength} characters`)).toBeInTheDocument()
|
||||
expect(screen.getByText(`${generateLength} characters`)).toBeInTheDocument()
|
||||
expect(screen.getByText(`${mockPrompts.decompose.length} characters`)).toBeInTheDocument()
|
||||
expect(screen.getByText(`${mockPrompts.filter_intro.length} characters`)).toBeInTheDocument()
|
||||
expect(screen.getByText(`${mockPrompts.filter_section.length} characters`)).toBeInTheDocument()
|
||||
expect(screen.getByText(`${mockPrompts.filter_outro.length} characters`)).toBeInTheDocument()
|
||||
expect(screen.getByText(`${mockPrompts.generate_per_subq.length} characters`)).toBeInTheDocument()
|
||||
})
|
||||
|
||||
it('shows the profile name being edited', () => {
|
||||
|
|
@ -169,7 +179,6 @@ describe('PromptEditor', () => {
|
|||
const textareas = screen.getAllByRole('textbox')
|
||||
textareas.forEach((ta) => expect(ta).toBeDisabled())
|
||||
|
||||
// "Reset All to Defaults" and "Cancel" should be disabled when saving
|
||||
expect(screen.getByRole('button', { name: /saving\.\.\./i })).toBeDisabled()
|
||||
expect(screen.getByRole('button', { name: /reset all to defaults/i })).toBeDisabled()
|
||||
expect(screen.getByRole('button', { name: /cancel/i })).toBeDisabled()
|
||||
|
|
|
|||
Loading…
Reference in New Issue