feat(prompts): integrate filter_per_subq with PromptService, fix seed bugs, restructure UI

Break the hardcoded per-sub-q filter prompt into 3 editable PromptService templates (filter_intro, filter_section, filter_outro) with placeholders for the for-loop iteration pattern. Refactor RelevanceFilter._build_per_subq_prompt() to compose them at runtime, falling back to built-in defaults when PromptService is unavailable.

Fix two latent bugs from Package 4:
- generate_per_subq was called by rag.py but never added to _VALID_STEPS or DB seed (would ValueError at runtime)
- _SEED_GENERATE placeholder mismatch: flat generate_response() expects {question}/{context} but Package 4 changed it to {context_sections}. Restored flat template; generate_per_subq now holds {context_sections}.

Add database backfill migration in seed_default_profiles() to INSERT OR IGNORE missing steps into existing profile rows, ensuring all 7 steps exist on restart.

Restructure System Prompts UI: remove unused flat filter/generate steps, replace with Step 2.1-2.3 (filter_intro/section/outro) and Step 3 (generate_per_subq). Update PlaceholderDocs with {context_sections}, {subq_idx}, {subq_question}.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
This commit is contained in:
Woody 2026-04-27 11:14:27 +08:00
parent 7069f15b95
commit 3b868a0133
14 changed files with 333 additions and 87 deletions

View File

@ -2,7 +2,7 @@
**Source**: User request (2026-04-26) **Source**: User request (2026-04-26)
**Scope**: Refactor the 3-step RAG query pipeline so retrieval, filtering, and response generation are organized per sub-question instead of batch-flattened. **Scope**: Refactor the 3-step RAG query pipeline so retrieval, filtering, and response generation are organized per sub-question instead of batch-flattened.
**Status**: ✅ Complete — All 7 sub-phases implemented (2026-04-26) **Status**: ✅ Complete — All 7 sub-phases implemented (2026-04-26). Phase 4a Prompt Integration added (2026-04-27).
--- ---
@ -711,6 +711,72 @@ Minor enhancement:
--- ---
## Phase 4a: Prompt Service Integration for Per-Sub-Q Filter (2026-04-27)
**Root issue**: `filter_per_subquestion()` in `relevance_filter.py` had a hardcoded prompt (`_build_per_subq_prompt()`) — completely bypassing `PromptService`. Users could not edit the per-sub-q filter prompt on the System Prompts page, unlike the flat `filter` step which was already prompt-service-driven.
**Solution**: Broke the per-sub-q filter prompt into **3 composable pieces**, each a separately editable step on the System Prompts page:
| Step Name | Label | Placeholders | Default |
|-----------|-------|-------------|---------|
| `filter_intro` | Step 2.1: Filter Intro (Preamble) | *(none)* | `"Evaluate each chunk for relevance to its associated sub-question only."` |
| `filter_section` | Step 2.2: Filter Section (Per Sub-Q) | `{subq_idx}`, `{subq_question}`, `{chunks}` | `'Sub-question {subq_idx}: "{subq_question}"\n{chunks}'` |
| `filter_outro` | Step 2.3: Filter Outro (Format) | *(none)* | JSON format instructions + example |
The `RelevanceFilter._build_per_subq_prompt()` now composes them at runtime:
```
filter_intro + [filter_section.replace(...) for each sub-q] + filter_outro
```
Falls back to built-in defaults when `PromptService` is unavailable.
### Bugs Fixed
1. **`generate_per_subq` not seeded**: `rag.py` called `get_prompt_template("generate_per_subq")` but this step name was never added to `_VALID_STEPS`, `_SEED_STEPS`, or `_SEED_TEMPLATES` — would crash at runtime with `ValueError`. Now properly seeded with `{context_sections}` placeholder.
2. **`_SEED_GENERATE` placeholder mismatch from Package 4**: The flat `generate_response()` expects `{question}`/`{context}` placeholders, but Package 4 changed the seed template to use `{context_sections}` (intended for per-sub-q generate). Restored flat template; `generate_per_subq` now holds `{context_sections}`.
### Database Backfill Migration
The existing `seed_default_profiles()` only inserted steps for NEWLY created profiles. Added a backfill loop that iterates ALL existing profiles and `INSERT OR IGNORE`s any missing step names. This ensures existing A/B/C profiles pick up `filter_intro`, `filter_section`, `filter_outro`, and `generate_per_subq` on restart.
### System Prompts UI Restructured
The flat `filter` and `generate` steps were removed from the UI (they're unused by the current pipeline). The page now shows 5 steps:
| UI Order | Label | Step Key |
|----------|-------|----------|
| 1 | Step 1: Query Decomposition | `decompose` |
| 2 | Step 2.1: Filter Intro (Preamble) | `filter_intro` |
| 3 | Step 2.2: Filter Section (Per Sub-Q) | `filter_section` |
| 4 | Step 2.3: Filter Outro (Format) | `filter_outro` |
| 5 | Step 3: Generate (Per-Sub-Question) | `generate_per_subq` |
The old `filter` and `generate` templates remain in the DB (for API backward compatibility) but are hidden from the UI.
### Files Changed
| File | Change |
|------|--------|
| `backend/app/core/sqlite_db.py` | 3 new seed templates + `generate_per_subq` seed; backfill migration; restored `_SEED_GENERATE` to `{question}`/`{context}` |
| `backend/app/services/prompt_service.py` | Added 4 step names to `_VALID_STEPS` |
| `backend/app/routers/prompts.py` | Added 4 step names to `_VALID_STEPS` |
| `backend/app/services/relevance_filter.py` | Refactored `_build_per_subq_prompt()` to use PromptService + built-in fallback constants |
| `frontend/src/components/PromptEditor.tsx` | Replaced unused flat steps with 5-step per-sub-q layout (Step 2.1-2.3 + Step 3) |
| `frontend/src/components/PlaceholderDocs.tsx` | Added `{context_sections}`, `{subq_idx}`, `{subq_question}` docs |
| `backend/app/test/conftest.py` | Added 4 new templates to mock |
| `backend/app/test/test_phase3_sqlite_db.py` | Updated counts (9→21 prompts) and placeholder assertions |
| `backend/app/test/test_phase3_prompt_service.py` | Updated step set + placeholder assertions |
| `backend/app/test/test_phase3_prompts_router.py` | Updated step set assertion |
| `backend/app/test/test_phase4_prompt_templates.py` | Updated for split generate/generate_per_subq |
| `frontend/src/test/components/PromptEditor.test.tsx` | Updated to 5 textareas, new labels, new placeholder layout |
| `frontend/src/test/components/PlaceholderDocs.test.tsx` | Updated to 6 placeholders |
### Test Results (Post-Phase 4a)
- **Backend**: 295 passed, 5 skipped (pre-existing)
- **Frontend**: 182 passed, 1 pre-existing failure (unrelated `file-input` e2e)
---
## Sub-Phase Summary ## Sub-Phase Summary
| Sub-Phase | Scope | Backend | Frontend | Tests | Status | | Sub-Phase | Scope | Backend | Frontend | Tests | Status |
@ -722,6 +788,7 @@ Minor enhancement:
| 4.5 | Frontend types + state | None | `types/index.ts`, `lib/queries.tsx` | `test_phase4_stream_state.test.tsx`, `test_phase4_types.test.ts` | ✅ Complete | | 4.5 | Frontend types + state | None | `types/index.ts`, `lib/queries.tsx` | `test_phase4_stream_state.test.tsx`, `test_phase4_types.test.ts` | ✅ Complete |
| 4.6 | Frontend rendering | None | `ResponsePanel.tsx`, `citationParser.ts`, `ExtractedQuestionsDisplay.tsx` | `test_phase4_response_panel.test.tsx`, `test_phase4_citation_parser.test.ts` | ✅ Complete | | 4.6 | Frontend rendering | None | `ResponsePanel.tsx`, `citationParser.ts`, `ExtractedQuestionsDisplay.tsx` | `test_phase4_response_panel.test.tsx`, `test_phase4_citation_parser.test.ts` | ✅ Complete |
| 4.7 | Testing & polish | All affected files | All affected files | Integration + acceptance + e2e tests | ✅ Complete | | 4.7 | Testing & polish | All affected files | All affected files | Integration + acceptance + e2e tests | ✅ Complete |
| 4a | Prompt service integration for filter_per_subq | `sqlite_db.py`, `prompt_service.py`, `prompts.py`, `relevance_filter.py` | `PromptEditor.tsx`, `PlaceholderDocs.tsx` | Updated 7 test files, 13 total files changed | ✅ Complete |
--- ---

View File

@ -22,28 +22,60 @@ _SEED_FILTER = (
) )
_SEED_GENERATE = ( _SEED_GENERATE = (
"You must answer each sub-question using ONLY the document chunks provided for it.\n" "Question: {question}\n\n"
"Do not use any external knowledge.\n" "Answer the question using ONLY these document chunks. "
"Format your answer as markdown sections — one section per sub-question.\n" "Do not use any external knowledge. "
"Each section should start with \"## Sub-question N: <the question>\"\n" "Format your answer as bullet points. "
"Each section should contain 1-5 bullet points.\n" "Cite your sources inline using bracket labels, e.g. [filename, page N]. "
"Cite your sources inline using bracket labels, e.g. [filename, page N].\n" "Place the citation at the end of each relevant point.\n\n"
"Place the citation at the end of each relevant bullet point.\n\n" "Document chunks:\n{context}\n\n"
"Answer:"
)
_SEED_GENERATE_PER_SUBQ = (
"Answer each sub-question using ONLY its document chunks.\n"
"Format as markdown sections with ## Sub-question N: headers.\n"
"{context_sections}\n\n" "{context_sections}\n\n"
"Answer:" "Answer:"
) )
_SEED_FILTER_INTRO = (
"Evaluate each chunk for relevance to its associated sub-question only."
)
_SEED_FILTER_SECTION = (
'\nSub-question {subq_idx}: "{subq_question}"\n{chunks}'
)
_SEED_FILTER_OUTRO = (
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question.\n"
'Return a JSON object mapping sub-question indices to arrays of scores.\n'
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}'
)
_SEED_PROFILES = [ _SEED_PROFILES = [
("A", 1), ("A", 1),
("B", 0), ("B", 0),
("C", 0), ("C", 0),
] ]
_SEED_STEPS = ["decompose", "filter", "generate"] _SEED_STEPS = [
"decompose",
"filter",
"generate",
"generate_per_subq",
"filter_intro",
"filter_section",
"filter_outro",
]
_SEED_TEMPLATES = { _SEED_TEMPLATES = {
"decompose": _SEED_DECOMPOSE, "decompose": _SEED_DECOMPOSE,
"filter": _SEED_FILTER, "filter": _SEED_FILTER,
"generate": _SEED_GENERATE, "generate": _SEED_GENERATE,
"generate_per_subq": _SEED_GENERATE_PER_SUBQ,
"filter_intro": _SEED_FILTER_INTRO,
"filter_section": _SEED_FILTER_SECTION,
"filter_outro": _SEED_FILTER_OUTRO,
} }
# ── Connection factories ──────────────────────────────────────────────────── # ── Connection factories ────────────────────────────────────────────────────
@ -156,5 +188,18 @@ def seed_default_profiles(conn: sqlite3.Connection) -> None:
(profile_id, step, _SEED_TEMPLATES[step]), (profile_id, step, _SEED_TEMPLATES[step]),
) )
# Backfill: add any newly-added step names to existing profiles that were
# created before the new steps existed. INSERT OR IGNORE is idempotent.
profile_rows = conn.execute(
"SELECT id FROM system_prompt_profiles"
).fetchall()
for row in profile_rows:
for step in _SEED_STEPS:
if step in _SEED_TEMPLATES:
conn.execute(
"INSERT OR IGNORE INTO system_prompts (profile_id, step_name, prompt_template) VALUES (?, ?, ?)",
(row["id"], step, _SEED_TEMPLATES[step]),
)
conn.commit() conn.commit()
logger.info("Default profiles (A/B/C) seeded.") logger.info("Default profiles (A/B/C) seeded.")

View File

@ -17,7 +17,15 @@ logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/v1/prompts", tags=["prompts"]) router = APIRouter(prefix="/api/v1/prompts", tags=["prompts"])
_VALID_NAMES = {"A", "B", "C"} _VALID_NAMES = {"A", "B", "C"}
_VALID_STEPS = {"decompose", "filter", "generate"} _VALID_STEPS = {
"decompose",
"filter",
"generate",
"generate_per_subq",
"filter_intro",
"filter_section",
"filter_outro",
}
def _ensure_valid_name(name: str) -> None: def _ensure_valid_name(name: str) -> None:

View File

@ -12,7 +12,15 @@ from app.core.sqlite_db import _SEED_TEMPLATES
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
_VALID_NAMES = {"A", "B", "C"} _VALID_NAMES = {"A", "B", "C"}
_VALID_STEPS = {"decompose", "filter", "generate"} _VALID_STEPS = {
"decompose",
"filter",
"generate",
"generate_per_subq",
"filter_intro",
"filter_section",
"filter_outro",
}
def _connect(db_path: str) -> sqlite3.Connection: def _connect(db_path: str) -> sqlite3.Connection:

View File

@ -16,6 +16,18 @@ _BUILTIN_FILTER_TEMPLATE = (
"Return JSON array of scores.\n{chunks}\n" "Return JSON array of scores.\n{chunks}\n"
) )
_BUILTIN_FILTER_INTRO = "Evaluate each chunk for relevance to its associated sub-question only."
_BUILTIN_FILTER_SECTION = (
'\nSub-question {subq_idx}: "{subq_question}"\n{chunks}'
)
_BUILTIN_FILTER_OUTRO = (
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question.\n"
'Return a JSON object mapping sub-question indices to arrays of scores.\n'
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}'
)
def _extract_json_from_markdown(response: str) -> str: def _extract_json_from_markdown(response: str) -> str:
if not isinstance(response, str): if not isinstance(response, str):
@ -109,22 +121,28 @@ class RelevanceFilter:
sub_questions: List[str], sub_questions: List[str],
sub_chunks: List[List[Tuple[str, Dict]]], sub_chunks: List[List[Tuple[str, Dict]]],
) -> str: ) -> str:
sections: List[str] = [ if self._prompt_service is not None:
"Evaluate each chunk for relevance to its associated sub-question only." intro = self._prompt_service.get_prompt_template("filter_intro")
] section = self._prompt_service.get_prompt_template("filter_section")
for idx, (sq, chunks) in enumerate(zip(sub_questions, sub_chunks)): outro = self._prompt_service.get_prompt_template("filter_outro")
sections.append(f'\nSub-question {idx}: "{sq}"') else:
for c_idx, (text, _meta) in enumerate(chunks): intro = _BUILTIN_FILTER_INTRO
sections.append(f"Chunk {c_idx}: {text}") section = _BUILTIN_FILTER_SECTION
outro = _BUILTIN_FILTER_OUTRO
sections.append( parts: List[str] = [intro]
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question." for idx, (sq, chunks) in enumerate(zip(sub_questions, sub_chunks)):
formatted_chunks = "\n".join(
f"Chunk {c_idx}: {text}" for c_idx, (text, _meta) in enumerate(chunks)
) )
sections.append( parts.append(
'Return a JSON object mapping sub-question indices to arrays of scores.\n' section
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}' .replace("{subq_idx}", str(idx))
.replace("{subq_question}", sq)
.replace("{chunks}", formatted_chunks)
) )
return "\n".join(sections) parts.append(outro)
return "\n".join(parts)
async def filter_per_subquestion( async def filter_per_subquestion(
self, self,

View File

@ -66,6 +66,15 @@ def mock_prompt_service():
"{context_sections}\n\n" "{context_sections}\n\n"
"Answer:" "Answer:"
), ),
"filter_intro": "Evaluate each chunk for relevance to its associated sub-question only.",
"filter_section": (
'\nSub-question {subq_idx}: "{subq_question}"\n{chunks}'
),
"filter_outro": (
"\nFor each chunk, rate its relevance 0-10 considering ONLY its associated sub-question.\n"
'Return a JSON object mapping sub-question indices to arrays of scores.\n'
'Example: {"0": [8.5, 3.2, 9.0], "1": [7.0, 9.1]}'
),
} }
class _MockPromptService: class _MockPromptService:

View File

@ -95,10 +95,15 @@ def test_get_profile_prompts_returns_all_three_steps(tmp_path):
svc = _create_service(tmp_path) svc = _create_service(tmp_path)
prompts = svc.get_profile_prompts("A") prompts = svc.get_profile_prompts("A")
assert set(prompts.keys()) == {"decompose", "filter", "generate"} assert set(prompts.keys()) == {
"decompose", "filter", "generate",
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
}
assert "{question}" in prompts["decompose"] assert "{question}" in prompts["decompose"]
assert "{question}" in prompts["filter"] assert "{question}" in prompts["filter"]
assert "{context_sections}" in prompts["generate"] assert "{question}" in prompts["generate"]
assert "{context}" in prompts["generate"]
assert "{context_sections}" in prompts["generate_per_subq"]
def test_get_profile_prompts_invalid_name_raises(tmp_path): def test_get_profile_prompts_invalid_name_raises(tmp_path):

View File

@ -68,7 +68,10 @@ def test_get_profile_prompts_a_returns_200(client):
data = resp.json() data = resp.json()
assert data["profile_name"] == "A" assert data["profile_name"] == "A"
assert set(data["prompts"].keys()) == {"decompose", "filter", "generate"} assert set(data["prompts"].keys()) == {
"decompose", "filter", "generate",
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
}
def test_get_profile_prompts_invalid_returns_400(client): def test_get_profile_prompts_invalid_returns_400(client):

View File

@ -251,7 +251,7 @@ def test_seed_default_profiles_A_is_active(tmp_path, monkeypatch):
def test_seed_default_profiles_creates_nine_prompts(tmp_path, monkeypatch): def test_seed_default_profiles_creates_nine_prompts(tmp_path, monkeypatch):
"""seed_default_profiles() should insert 9 prompt rows (3 profiles × 3 steps).""" """seed_default_profiles() should insert 21 prompt rows (3 profiles × 7 steps)."""
prompts_path = str(tmp_path / "prompts.db") prompts_path = str(tmp_path / "prompts.db")
history_path = str(tmp_path / "history.db") history_path = str(tmp_path / "history.db")
_patch_settings(monkeypatch, prompts_path, history_path) _patch_settings(monkeypatch, prompts_path, history_path)
@ -268,13 +268,17 @@ def test_seed_default_profiles_creates_nine_prompts(tmp_path, monkeypatch):
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
ORDER BY spp.name, sp.step_name""" ORDER BY spp.name, sp.step_name"""
).fetchall() ).fetchall()
assert len(rows) == 9 assert len(rows) == 21
# Each profile should have all 3 steps # Each profile should have all 7 steps
expected_steps = {
"decompose", "filter", "generate",
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
}
for profile in ("A", "B", "C"): for profile in ("A", "B", "C"):
profile_rows = [r for r in rows if r["profile_name"] == profile] profile_rows = [r for r in rows if r["profile_name"] == profile]
steps = {r["step_name"] for r in profile_rows} steps = {r["step_name"] for r in profile_rows}
assert steps == {"decompose", "filter", "generate"} assert steps == expected_steps
conn.close() conn.close()
@ -311,7 +315,7 @@ def test_seed_default_profiles_contains_expected_templates(tmp_path, monkeypatch
assert "{question}" in filter_row["prompt_template"] assert "{question}" in filter_row["prompt_template"]
assert "{chunks}" in filter_row["prompt_template"] assert "{chunks}" in filter_row["prompt_template"]
# Generate must have {context_sections} # Generate must have {question} and {context}
generate_row = conn.execute( generate_row = conn.execute(
"""SELECT sp.prompt_template """SELECT sp.prompt_template
FROM system_prompts sp FROM system_prompts sp
@ -319,7 +323,50 @@ def test_seed_default_profiles_contains_expected_templates(tmp_path, monkeypatch
WHERE spp.name='A' AND sp.step_name='generate'""" WHERE spp.name='A' AND sp.step_name='generate'"""
).fetchone() ).fetchone()
assert generate_row is not None assert generate_row is not None
assert "{context_sections}" in generate_row["prompt_template"] assert "{question}" in generate_row["prompt_template"]
assert "{context}" in generate_row["prompt_template"]
# Generate per-subq must have {context_sections}
gen_per_subq_row = conn.execute(
"""SELECT sp.prompt_template
FROM system_prompts sp
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
WHERE spp.name='A' AND sp.step_name='generate_per_subq'"""
).fetchone()
assert gen_per_subq_row is not None
assert "{context_sections}" in gen_per_subq_row["prompt_template"]
# Filter intro exists (no required placeholders)
filter_intro_row = conn.execute(
"""SELECT sp.prompt_template
FROM system_prompts sp
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
WHERE spp.name='A' AND sp.step_name='filter_intro'"""
).fetchone()
assert filter_intro_row is not None
assert len(filter_intro_row["prompt_template"]) > 0
# Filter section must have per-subq placeholders
filter_section_row = conn.execute(
"""SELECT sp.prompt_template
FROM system_prompts sp
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
WHERE spp.name='A' AND sp.step_name='filter_section'"""
).fetchone()
assert filter_section_row is not None
assert "{subq_idx}" in filter_section_row["prompt_template"]
assert "{subq_question}" in filter_section_row["prompt_template"]
assert "{chunks}" in filter_section_row["prompt_template"]
# Filter outro exists (no required placeholders)
filter_outro_row = conn.execute(
"""SELECT sp.prompt_template
FROM system_prompts sp
JOIN system_prompt_profiles spp ON sp.profile_id = spp.id
WHERE spp.name='A' AND sp.step_name='filter_outro'"""
).fetchone()
assert filter_outro_row is not None
assert len(filter_outro_row["prompt_template"]) > 0
conn.close() conn.close()
@ -345,7 +392,7 @@ def test_seed_default_profiles_idempotent(tmp_path, monkeypatch):
).fetchone()[0] ).fetchone()[0]
assert profile_count == 3 assert profile_count == 3
assert prompt_count == 9 assert prompt_count == 21
conn.close() conn.close()
@ -361,7 +408,10 @@ def test_all_three_profiles_have_identical_seed_templates(tmp_path, monkeypatch)
init_prompts_db(conn) init_prompts_db(conn)
seed_default_profiles(conn) seed_default_profiles(conn)
for step in ("decompose", "filter", "generate"): for step in (
"decompose", "filter", "generate",
"generate_per_subq", "filter_intro", "filter_section", "filter_outro",
):
rows = conn.execute( rows = conn.execute(
"""SELECT spp.name, sp.prompt_template """SELECT spp.name, sp.prompt_template
FROM system_prompts sp FROM system_prompts sp

View File

@ -31,36 +31,43 @@ def prompt_service():
def test_generate_template_uses_context_sections(prompt_service): def test_generate_template_uses_context_sections(prompt_service):
"""When PromptService returns the generate template, {context_sections} """When PromptService returns the generate_per_subq template,
placeholder must be present and old {context} removed.""" {context_sections} placeholder must be present."""
template = prompt_service.get_prompt_template("generate") template = prompt_service.get_prompt_template("generate_per_subq")
assert "{context_sections}" in template assert "{context_sections}" in template
assert "{context}" not in template
def test_builtin_generate_template_has_context_sections(): def test_builtin_generate_template_has_context_sections():
"""When no prompt_service, the built-in seed template uses """The per-subq seed template (_SEED_GENERATE_PER_SUBQ) uses
{context_sections} instead of {context}.""" {context_sections}, while the flat generate uses {question}/{context}."""
from app.core.sqlite_db import _SEED_GENERATE from app.core.sqlite_db import _SEED_GENERATE_PER_SUBQ, _SEED_GENERATE
assert "{context_sections}" in _SEED_GENERATE assert "{context_sections}" in _SEED_GENERATE_PER_SUBQ
assert "{context}" not in _SEED_GENERATE assert "{question}" in _SEED_GENERATE
assert "{context}" in _SEED_GENERATE
def test_reset_to_defaults_includes_new_generate_template(prompt_service): def test_reset_to_defaults_includes_new_generate_template(prompt_service):
"""After calling reset_to_defaults() on a profile, the generate step """After calling reset_to_defaults(), generate_per_subq uses
uses {context_sections} placeholder.""" {context_sections} and flat generate uses {question}/{context}."""
profile_name = prompt_service.get_active_profile_name() profile_name = prompt_service.get_active_profile_name()
prompt_service.update_prompt(profile_name, "generate", "custom template with {context}") # Modify both templates
prompt_service.update_prompt(profile_name, "generate_per_subq", "custom per-subq")
prompt_service.update_prompt(profile_name, "generate", "custom flat")
modified = prompt_service.get_prompt_template("generate") modified_per_subq = prompt_service.get_prompt_template("generate_per_subq")
assert "{context}" in modified assert "custom per-subq" == modified_per_subq
assert "{context_sections}" not in modified
modified_flat = prompt_service.get_prompt_template("generate")
assert "custom flat" == modified_flat
prompt_service.reset_to_defaults(profile_name) prompt_service.reset_to_defaults(profile_name)
template = prompt_service.get_prompt_template("generate") template_per_subq = prompt_service.get_prompt_template("generate_per_subq")
assert "{context_sections}" in template assert "{context_sections}" in template_per_subq
assert "{context}" not in template
template_flat = prompt_service.get_prompt_template("generate")
assert "{question}" in template_flat
assert "{context}" in template_flat

View File

@ -7,9 +7,12 @@ interface PlaceholderRow {
} }
const PLACEHOLDERS: PlaceholderRow[] = [ const PLACEHOLDERS: PlaceholderRow[] = [
{ placeholder: '{question}', description: 'The user\'s input question' }, { placeholder: '{question}', description: "The user's input question" },
{ placeholder: '{chunks}', description: 'Retrieved document chunks (filter step only)' }, { placeholder: '{chunks}', description: 'Retrieved document chunks (filter step only)' },
{ placeholder: '{context}', description: 'Formatted chunks with citations (generate step only)' }, { placeholder: '{context}', description: 'Formatted chunks with citations (generate step only)' },
{ placeholder: '{context_sections}', description: 'Per-sub-question context sections (generate per-subq only)' },
{ placeholder: '{subq_idx}', description: 'Sub-question index number (filter section only)' },
{ placeholder: '{subq_question}', description: 'Sub-question text (filter section only)' },
] ]
export const PlaceholderDocs: React.FC = () => { export const PlaceholderDocs: React.FC = () => {

View File

@ -15,14 +15,20 @@ export interface PromptEditorProps {
const STEPS = [ const STEPS = [
{ key: 'decompose', label: 'Step 1: Query Decomposition', placeholders: ['question'] }, { key: 'decompose', label: 'Step 1: Query Decomposition', placeholders: ['question'] },
{ key: 'filter', label: 'Step 2: Relevance Filtering', placeholders: ['question', 'chunks'] }, { key: 'filter_intro', label: 'Step 2.1: Filter Intro (Preamble)', placeholders: [] },
{ key: 'generate', label: 'Step 3: Response Generation', placeholders: ['question', 'context'] }, { key: 'filter_section', label: 'Step 2.2: Filter Section (Per Sub-Q)', placeholders: ['subq_idx', 'subq_question', 'chunks'] },
{ key: 'filter_outro', label: 'Step 2.3: Filter Outro (Format)', placeholders: [] },
{ key: 'generate_per_subq', label: 'Step 3: Generate (Per-Sub-Question)', placeholders: ['context_sections'] },
] as const ] as const
const VALID_PLACEHOLDERS: Record<string, string[]> = { const VALID_PLACEHOLDERS: Record<string, string[]> = {
decompose: ['{question}'], decompose: ['{question}'],
filter: ['{question}', '{chunks}'], filter: ['{question}', '{chunks}'],
generate: ['{question}', '{context}'], generate: ['{question}', '{context}'],
generate_per_subq: ['{context_sections}'],
filter_intro: [],
filter_section: ['{subq_idx}', '{subq_question}', '{chunks}'],
filter_outro: [],
} }
const findUnknownPlaceholders = (template: string, stepKey: string): string[] => { const findUnknownPlaceholders = (template: string, stepKey: string): string[] => {

View File

@ -8,16 +8,19 @@ describe('PlaceholderDocs', () => {
expect(screen.getByText('Available Placeholders')).toBeInTheDocument() expect(screen.getByText('Available Placeholders')).toBeInTheDocument()
}) })
it('renders all 3 placeholders: {question}, {chunks}, {context}', () => { it('renders all 6 placeholders', () => {
render(<PlaceholderDocs />) render(<PlaceholderDocs />)
expect(screen.getByText('{question}')).toBeInTheDocument() expect(screen.getByText('{question}')).toBeInTheDocument()
expect(screen.getByText('{chunks}')).toBeInTheDocument() expect(screen.getByText('{chunks}')).toBeInTheDocument()
expect(screen.getByText('{context}')).toBeInTheDocument() expect(screen.getByText('{context}')).toBeInTheDocument()
expect(screen.getByText('{context_sections}')).toBeInTheDocument()
expect(screen.getByText('{subq_idx}')).toBeInTheDocument()
expect(screen.getByText('{subq_question}')).toBeInTheDocument()
}) })
it('renders each placeholder in a monospace element', () => { it('renders each placeholder in a monospace element', () => {
render(<PlaceholderDocs />) render(<PlaceholderDocs />)
const placeholders = ['{question}', '{chunks}', '{context}'] const placeholders = ['{question}', '{chunks}', '{context}', '{context_sections}', '{subq_idx}', '{subq_question}']
placeholders.forEach((ph) => { placeholders.forEach((ph) => {
const el = screen.getByText(ph) const el = screen.getByText(ph)
expect(el.tagName.toLowerCase()).toBe('code') expect(el.tagName.toLowerCase()).toBe('code')
@ -29,6 +32,9 @@ describe('PlaceholderDocs', () => {
expect(screen.getByText("The user's input question")).toBeInTheDocument() expect(screen.getByText("The user's input question")).toBeInTheDocument()
expect(screen.getByText('Retrieved document chunks (filter step only)')).toBeInTheDocument() expect(screen.getByText('Retrieved document chunks (filter step only)')).toBeInTheDocument()
expect(screen.getByText('Formatted chunks with citations (generate step only)')).toBeInTheDocument() expect(screen.getByText('Formatted chunks with citations (generate step only)')).toBeInTheDocument()
expect(screen.getByText('Per-sub-question context sections (generate per-subq only)')).toBeInTheDocument()
expect(screen.getByText('Sub-question index number (filter section only)')).toBeInTheDocument()
expect(screen.getByText('Sub-question text (filter section only)')).toBeInTheDocument()
}) })
it('hides placeholder details when toggle button is clicked', () => { it('hides placeholder details when toggle button is clicked', () => {
@ -39,6 +45,7 @@ describe('PlaceholderDocs', () => {
expect(screen.queryByText('{question}')).not.toBeInTheDocument() expect(screen.queryByText('{question}')).not.toBeInTheDocument()
expect(screen.queryByText('{chunks}')).not.toBeInTheDocument() expect(screen.queryByText('{chunks}')).not.toBeInTheDocument()
expect(screen.queryByText('{context}')).not.toBeInTheDocument() expect(screen.queryByText('{context}')).not.toBeInTheDocument()
expect(screen.queryByText('{context_sections}')).not.toBeInTheDocument()
}) })
it('shows placeholder details again when toggle is clicked twice', () => { it('shows placeholder details again when toggle is clicked twice', () => {
@ -50,5 +57,6 @@ describe('PlaceholderDocs', () => {
fireEvent.click(toggleBtn) fireEvent.click(toggleBtn)
expect(screen.getByText('{question}')).toBeInTheDocument() expect(screen.getByText('{question}')).toBeInTheDocument()
expect(screen.getByText('{context_sections}')).toBeInTheDocument()
}) })
}) })

View File

@ -4,8 +4,10 @@ import { PromptEditor } from '../../components/PromptEditor'
const mockPrompts: Record<string, string> = { const mockPrompts: Record<string, string> = {
decompose: 'Decompose template with {question}', decompose: 'Decompose template with {question}',
filter: 'Filter template with {question} and {chunks}', filter_intro: 'Evaluate each chunk for relevance',
generate: 'Generate template with {question} and {context}', filter_section: 'Sub-question {subq_idx}: "{subq_question}"\n{chunks}',
filter_outro: 'Return JSON object mapping sub-question indices to scores.',
generate_per_subq: 'Generate per-subq with {context_sections}',
} }
const defaultProps = { const defaultProps = {
@ -29,46 +31,54 @@ describe('PromptEditor', () => {
defaultProps.onCancel.mockClear() defaultProps.onCancel.mockClear()
}) })
it('renders 3 textareas for decompose, filter, and generate steps', () => { it('renders 5 textareas for the active steps', () => {
render(<PromptEditor {...defaultProps} />) render(<PromptEditor {...defaultProps} />)
const textareas = screen.getAllByRole('textbox') const textareas = screen.getAllByRole('textbox')
expect(textareas).toHaveLength(3) expect(textareas).toHaveLength(5)
}) })
it('renders labels for each step', () => { it('renders labels for each step', () => {
render(<PromptEditor {...defaultProps} />) render(<PromptEditor {...defaultProps} />)
expect(screen.getByText('Step 1: Query Decomposition')).toBeInTheDocument() expect(screen.getByText('Step 1: Query Decomposition')).toBeInTheDocument()
expect(screen.getByText('Step 2: Relevance Filtering')).toBeInTheDocument() expect(screen.getByText('Step 2.1: Filter Intro (Preamble)')).toBeInTheDocument()
expect(screen.getByText('Step 3: Response Generation')).toBeInTheDocument() expect(screen.getByText('Step 2.2: Filter Section (Per Sub-Q)')).toBeInTheDocument()
expect(screen.getByText('Step 2.3: Filter Outro (Format)')).toBeInTheDocument()
expect(screen.getByText('Step 3: Generate (Per-Sub-Question)')).toBeInTheDocument()
}) })
it('shows valid placeholders in the label area for each step', () => { it('shows valid placeholders in the label area for each step', () => {
render(<PromptEditor {...defaultProps} />) render(<PromptEditor {...defaultProps} />)
// {question} appears in all 3 steps (decompose, filter, generate) // {question} appears only in decompose step
const questionPlaceholders = screen.getAllByText('{question}') expect(screen.getByText('{question}')).toBeInTheDocument()
expect(questionPlaceholders).toHaveLength(3) // {chunks} appears only in filter_section
// {chunks} appears only in filter step
expect(screen.getByText('{chunks}')).toBeInTheDocument() expect(screen.getByText('{chunks}')).toBeInTheDocument()
// {context} appears only in generate step // {context_sections} appears in generate_per_subq
expect(screen.getByText('{context}')).toBeInTheDocument() expect(screen.getByText('{context_sections}')).toBeInTheDocument()
// {subq_idx}, {subq_question} appear in filter_section
expect(screen.getByText('{subq_idx}')).toBeInTheDocument()
expect(screen.getByText('{subq_question}')).toBeInTheDocument()
// {context} is no longer shown (flat generate removed)
expect(screen.queryByText('{context}')).not.toBeInTheDocument()
}) })
it('renders per-step reset buttons that call onResetStep when clicked', () => { it('renders per-step reset buttons that call onResetStep when clicked', () => {
// Mock window.confirm to return true
const confirmSpy = vi.spyOn(window, 'confirm').mockReturnValue(true) const confirmSpy = vi.spyOn(window, 'confirm').mockReturnValue(true)
render(<PromptEditor {...defaultProps} />) render(<PromptEditor {...defaultProps} />)
const resetButtons = screen.getAllByTitle(/reset.*to default/i) const resetButtons = screen.getAllByTitle(/reset.*to default/i)
expect(resetButtons).toHaveLength(3) expect(resetButtons).toHaveLength(5)
fireEvent.click(resetButtons[0]) fireEvent.click(resetButtons[0])
expect(defaultProps.onResetStep).toHaveBeenCalledWith('decompose') expect(defaultProps.onResetStep).toHaveBeenCalledWith('decompose')
fireEvent.click(resetButtons[1]) fireEvent.click(resetButtons[1])
expect(defaultProps.onResetStep).toHaveBeenCalledWith('filter') expect(defaultProps.onResetStep).toHaveBeenCalledWith('filter_intro')
fireEvent.click(resetButtons[2]) fireEvent.click(resetButtons[3])
expect(defaultProps.onResetStep).toHaveBeenCalledWith('generate') expect(defaultProps.onResetStep).toHaveBeenCalledWith('filter_outro')
fireEvent.click(resetButtons[4])
expect(defaultProps.onResetStep).toHaveBeenCalledWith('generate_per_subq')
confirmSpy.mockRestore() confirmSpy.mockRestore()
}) })
@ -133,8 +143,10 @@ describe('PromptEditor', () => {
const textareas = screen.getAllByRole('textbox') const textareas = screen.getAllByRole('textbox')
expect(textareas[0]).toHaveValue('Decompose template with {question}') expect(textareas[0]).toHaveValue('Decompose template with {question}')
expect(textareas[1]).toHaveValue('Filter template with {question} and {chunks}') expect(textareas[1]).toHaveValue('Evaluate each chunk for relevance')
expect(textareas[2]).toHaveValue('Generate template with {question} and {context}') expect(textareas[2]).toHaveValue('Sub-question {subq_idx}: "{subq_question}"\n{chunks}')
expect(textareas[3]).toHaveValue('Return JSON object mapping sub-question indices to scores.')
expect(textareas[4]).toHaveValue('Generate per-subq with {context_sections}')
}) })
it('calls onUpdate when typing in a textarea', () => { it('calls onUpdate when typing in a textarea', () => {
@ -149,13 +161,11 @@ describe('PromptEditor', () => {
it('displays character count for each textarea', () => { it('displays character count for each textarea', () => {
render(<PromptEditor {...defaultProps} />) render(<PromptEditor {...defaultProps} />)
const decomposeLength = mockPrompts.decompose.length expect(screen.getByText(`${mockPrompts.decompose.length} characters`)).toBeInTheDocument()
const filterLength = mockPrompts.filter.length expect(screen.getByText(`${mockPrompts.filter_intro.length} characters`)).toBeInTheDocument()
const generateLength = mockPrompts.generate.length expect(screen.getByText(`${mockPrompts.filter_section.length} characters`)).toBeInTheDocument()
expect(screen.getByText(`${mockPrompts.filter_outro.length} characters`)).toBeInTheDocument()
expect(screen.getByText(`${decomposeLength} characters`)).toBeInTheDocument() expect(screen.getByText(`${mockPrompts.generate_per_subq.length} characters`)).toBeInTheDocument()
expect(screen.getByText(`${filterLength} characters`)).toBeInTheDocument()
expect(screen.getByText(`${generateLength} characters`)).toBeInTheDocument()
}) })
it('shows the profile name being edited', () => { it('shows the profile name being edited', () => {
@ -169,7 +179,6 @@ describe('PromptEditor', () => {
const textareas = screen.getAllByRole('textbox') const textareas = screen.getAllByRole('textbox')
textareas.forEach((ta) => expect(ta).toBeDisabled()) textareas.forEach((ta) => expect(ta).toBeDisabled())
// "Reset All to Defaults" and "Cancel" should be disabled when saving
expect(screen.getByRole('button', { name: /saving\.\.\./i })).toBeDisabled() expect(screen.getByRole('button', { name: /saving\.\.\./i })).toBeDisabled()
expect(screen.getByRole('button', { name: /reset all to defaults/i })).toBeDisabled() expect(screen.getByRole('button', { name: /reset all to defaults/i })).toBeDisabled()
expect(screen.getByRole('button', { name: /cancel/i })).toBeDisabled() expect(screen.getByRole('button', { name: /cancel/i })).toBeDisabled()