fix: tighten skill specs — exact prd.json schema, explicit scaffold, validation

2026-03-27 08:57:40 -04:00
parent fe14d81073
commit fee323a2d6
3 changed files with 220 additions and 248 deletions

@@ -7,40 +7,53 @@ description: "Setup task: scaffold .loop/ directory and generate config for the
Set up the agent loop harness in the current project. This is infrastructure setup, not creative work. Set up the agent loop harness in the current project. This is infrastructure setup, not creative work.
**IMPORTANT:** Do NOT invoke brainstorming, planning, or idea-generation skills. This skill handles its own flow. If the user wants to brainstorm first, they should do that separately before running this skill. **IMPORTANT:** Do NOT invoke brainstorming, planning, or idea-generation skills. This skill handles its own flow.
## What This Skill Does
1. Checks for existing specs/plans in the project (uses them if found)
2. Scaffolds the `.loop/` directory
3. Detects tech stack
4. Picks mode and generates config
5. Flows into `/agent-loop:plan` to decompose the spec into stories
## Instructions ## Instructions
When the user invokes this skill, follow this sequence: Follow these steps exactly. Do not skip or reorder them.
### Step 0: Scaffold .loop/ Directory ### Step 1: Scaffold .loop/ Directory
Check if `.loop/` already exists in the project root. Check if `.loop/` already exists in the project root.
**If it does NOT exist**, create it by copying from the plugin: **If `.loop/` already exists** and contains `prd.json`, ask the user if they want to re-initialize. If yes, delete `.loop/` and continue. If no, skip to Step 3.
1. The plugin's root directory is available at `${CLAUDE_PLUGIN_ROOT}`. Copy the harness files: **Create `.loop/` and copy ALL required harness files.** Run these commands exactly:
```bash ```bash
mkdir -p .loop mkdir -p .loop
cp -r "${CLAUDE_PLUGIN_ROOT}/prompts" .loop/
cp -r "${CLAUDE_PLUGIN_ROOT}/templates" .loop/
cp -r "${CLAUDE_PLUGIN_ROOT}/lib" .loop/
cp "${CLAUDE_PLUGIN_ROOT}/loop.sh" .loop/
chmod +x .loop/loop.sh
``` ```
**IMPORTANT:** If `${CLAUDE_PLUGIN_ROOT}` is not set or the path doesn't exist, look for the files in the plugin's own directory structure. The prompts, templates, and lib directories are bundled with this plugin. Then check if the plugin root has the harness files. Try these paths in order:
1. `${CLAUDE_PLUGIN_ROOT}/prompts/` (if CLAUDE_PLUGIN_ROOT env var is set)
2. `~/.claude/plugins/cache/agent-loop/agent-loop/*/prompts/` (glob for any version)
2. Create `.loop/.gitignore` with runtime artifacts: Copy ALL of these directories and files — every one is required:
```bash
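# Find the harness source: prefer CLAUDE_PLUGIN_ROOT, then fall back to the plugin cache
if [ -n "${CLAUDE_PLUGIN_ROOT:-}" ] && [ -d "${CLAUDE_PLUGIN_ROOT}/prompts" ]; then
  HARNESS_SRC="${CLAUDE_PLUGIN_ROOT}"
else
  HARNESS_SRC=$(ls -d ~/.claude/plugins/cache/agent-loop/agent-loop/*/prompts/.. 2>/dev/null | head -1)
fi
# Copy every required directory and file; skipped when no source was found
if [ -n "$HARNESS_SRC" ]; then
  cp -r "$HARNESS_SRC/prompts" .loop/
  cp -r "$HARNESS_SRC/templates" .loop/
  cp -r "$HARNESS_SRC/lib" .loop/
  cp "$HARNESS_SRC/loop.sh" .loop/
  chmod +x .loop/loop.sh
fi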
```
**Verify the copy worked.** Check that these paths exist:
- `.loop/prompts/generator/_base.md`
- `.loop/prompts/evaluator/_base.md`
- `.loop/templates/progress.md.template`
- `.loop/lib/state.sh`
- `.loop/loop.sh`
If any are missing, tell the user the scaffold failed and show which files are missing.
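The verification step can be sketched in a few lines of Python. This is only an illustration of the check described above, not part of the skill. The helper name `missing_scaffold_files` is hypothetical, and the path list is copied from the bullets in this step.

```python
from pathlib import Path

# Paths named in the verification step, relative to the project root.
REQUIRED_PATHS = [
    ".loop/prompts/generator/_base.md",
    ".loop/prompts/evaluator/_base.md",
    ".loop/templates/progress.md.template",
    ".loop/lib/state.sh",
    ".loop/loop.sh",
]


def missing_scaffold_files(project_root="."):
    """Return the required harness paths that are absent under project_root."""
    root = Path(project_root)
    return [rel for rel in REQUIRED_PATHS if not (root / rel).is_file()]
```

An empty result means the scaffold is complete. Otherwise the returned list is what the user should be shown.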
Create `.loop/.gitignore`:
``` ```
prd.json prd.json
@@ -56,74 +69,78 @@ archive/
.loop.lock .loop.lock
``` ```
**If `.loop/` already exists**, ask the user if they want to re-initialize (which resets config but preserves prd.json/progress.md if they exist). ### Step 2: Check for Existing Specs
### Step 1: Check for Existing Specs Search for existing design documents or specs:
Search for existing design documents or specs in the project:
- `docs/superpowers/specs/*.md` - `docs/superpowers/specs/*.md`
- `docs/specs/*.md` - `docs/specs/*.md`
- `docs/*.md` (that look like feature specs)
- `SPEC.md`, `PRD.md`, `DESIGN.md` at root - `SPEC.md`, `PRD.md`, `DESIGN.md` at root
- Any markdown file that contains design/architecture/requirements content
**If a spec is found:** **If a spec is found:**
> "I found an existing spec at `{path}`. I'll use this as the basis for generating stories." > "I found an existing spec at `{path}`. I'll use this as the basis for generating stories."
Read the spec and use it as input for planning. Do NOT ask the user to re-describe what they want — the spec already has it. Skip to Step 3 (mode is almost certainly **implement**). Read it. The user does NOT need to re-describe what they want. Set mode to `implement` and skip to Step 4.
**If no spec is found**, proceed to Step 2. **If no spec is found**, proceed to Step 3.
### Step 2: Mode Selection and Description ### Step 3: Mode Selection
Ask the user: Ask the user:
> **What would you like to do?** > **What would you like to do?**
> >
> a) **Explore** — Analyze the codebase to understand what exists, find issues, and document the system. No code changes. > a) **Explore** — Read-only codebase analysis. No code changes.
> b) **Implement** — Build a new feature from a PRD. Code changes, commits, and tests. > b) **Implement** — Build a feature. Code changes, commits, and tests.
> c) **Fix** — Work through a list of bugs or tech debt items. Targeted code changes. > c) **Fix** — Targeted bug fixes or tech debt.
Based on the mode, ask 2-3 brief clarifying questions. Do NOT over-interview — keep it focused: For **implement** without a spec: "Describe the feature in 1-3 sentences."
For **explore**: "What areas should I focus on?"
For **fix**: "Do you have a list of issues, or should I find them?"
**For Implement:** "Describe the feature in 1-3 sentences." Also read the project to detect the tech stack:
**For Explore:** "What areas should I focus on?" - Check for `package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod`, etc.
**For Fix:** "Do you have a list of issues, or should I find them?" - Run `ls` on the project root
### Step 3: Project Discovery
Read the project to understand what we're working with:
- Check for `CLAUDE.md`, `AGENTS.md`, `README.md` at the project root
- Check for `package.json`, `Cargo.toml`, `pyproject.toml`, `go.mod`, `Package.swift`, `composer.json` to identify the tech stack
- Run `ls` on the project root to see the top-level structure
Present a brief summary:
> "I see this is a [language/framework] project with [key characteristics]."
### Step 4: Generate Configuration ### Step 4: Generate Configuration
Create `.loop/config.json` based on the project: Create `.loop/config.json` with this EXACT structure:
```json ```json
{ {
"tool": "claude", "tool": "claude",
"mode": "<selected mode>", "mode": "<implement|explore|fix>",
"maxIterations": <appropriate default>, "maxIterations": 20,
"skipEval": false, "skipEval": false,
"evalRetries": 2, "evalRetries": 2,
"autoHooks": true, "autoHooks": true,
"branchPrefix": "loop/", "branchPrefix": "loop/",
"scopeBudgets": { "scopeBudgets": {
// Set based on project size and mode "explore": {
"maxFilesToRead": 15,
"maxLinesToWrite": 0,
"maxFilesToModify": 0
},
"implement": {
"maxFilesToRead": 50,
"maxLinesToWrite": 500,
"maxFilesToModify": 10
},
"fix": {
"maxFilesToRead": 30,
"maxLinesToWrite": 200,
"maxFilesToModify": 5
}
} }
} }
``` ```
Create `.loop/init.sh` with project-specific setup commands (dev server, test runner, linter, etc.). Make it executable. Adjust `maxIterations` based on estimated story count (stories + 30% for rejections).
Create `.loop/init.sh` with project-specific setup commands. Make it executable.
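As a worked example of the "stories + 30%" rule, the following sketch computes an iteration budget. The function name is hypothetical. Rounding the margin up to a whole iteration is an assumption, since the spec does not say how to round.

```python
def estimate_max_iterations(story_count, overhead_percent=30):
    """Story count plus a rejection margin, with the margin rounded up."""
    # Integer ceiling division avoids floating-point surprises.
    extra = (story_count * overhead_percent + 99) // 100
    return story_count + extra
```

For a 10-story plan this gives 13 iterations, and for a 7-story plan it gives 10.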
### Step 5: Flow into Planning ### Step 5: Flow into Planning
Tell the user: Say:
> "Project configured. Generating stories from {spec name / user description}..." > "Project configured. Generating stories..."
Then invoke `/agent-loop:plan` to generate the PRD and sprint contracts. If a spec was found in Step 1, pass it as context so the plan skill uses it directly. Then invoke `/agent-loop:plan`. If a spec was found in Step 2, tell the plan skill to use it.

@@ -1,94 +1,91 @@
--- ---
name: plan name: plan
description: Interactive planning session that generates PRD (prd.json) and sprint contracts for the agent loop. Run /agent-loop:init first. description: Generate PRD (prd.json) with user stories and sprint contracts for the agent loop. Requires .loop/ directory (run /agent-loop:init first).
--- ---
# /plan — Generate PRD and Sprint Contracts # /plan — Generate PRD and Sprint Contracts
Interactive planning session that produces all artifacts needed for the autonomous agent loop. Produces all artifacts needed for the autonomous agent loop.
## Prerequisites ## Prerequisites
- `.loop/` directory must exist with `config.json` (run `/agent-loop:init` first if not) - `.loop/` directory must exist with `config.json` (run `/agent-loop:init` first)
- User should have a clear idea of what they want to build/explore/fix
## Usage
```
/loop-plan <optional feature description>
```
Examples:
- `/loop-plan Add OAuth authentication with Google and GitHub`
- `/loop-plan Explore the payment processing system`
- `/loop-plan Fix all critical security issues from the audit`
## Instructions ## Instructions
Follow these steps exactly.
### Step 1: Understand the Request ### Step 1: Understand the Request
If the user provided a feature description, use it. Otherwise ask: Check if a spec or feature description was passed from `/agent-loop:init`. If so, use it directly.
> "What would you like to work on? Describe it in 1-3 sentences."
Otherwise, check for specs in the project:
- `docs/superpowers/specs/*.md`
- `docs/specs/*.md`
- `SPEC.md`, `PRD.md`, `DESIGN.md`
If still nothing, ask: "What would you like to work on? Describe it in 1-3 sentences."
### Step 2: Codebase Analysis ### Step 2: Codebase Analysis
Read key project files to understand existing patterns: Read key project files to understand existing patterns:
- Relevant source directories for the feature - Relevant source directories
- Existing tests to understand testing patterns - Existing tests
- Configuration files for conventions - Configuration files
- Recent git history (`git log --oneline -20`) for active work - Recent git history (`git log --oneline -20`)
### Step 3: Clarifying Questions ### Step 3: Clarifying Questions
Ask 3-5 targeted questions based on what you found in the code. These should be questions where the answer isn't obvious from the codebase. Examples: Ask 2-3 targeted questions where human judgment is needed. Do NOT ask questions you can answer from the code or spec.
- "I see you have both REST endpoints and GraphQL. Should this feature use REST or GraphQL?"
- "The existing auth uses JWT. Should I add OAuth alongside it or replace it?"
- "I found two competing patterns for data validation. Which should I follow?"
**Do NOT ask questions you can answer from the code.** Only ask when human judgment is needed.
### Step 4: Generate PRD (`prd.json`) ### Step 4: Generate PRD (`prd.json`)
Create `.loop/prd.json` with properly-sized, dependency-ordered stories. **CRITICAL: The prd.json MUST use this EXACT schema.** The loop orchestrator parses this structure. Any deviation will break execution.
**Story Sizing Rules (CRITICAL):** Write `.loop/prd.json` with this structure:
- Each story must be completable in ONE context window (~100K tokens of work)
- Target: 1-3 files changed per story
- Too big: "Build the authentication system" → split into migration, endpoint, middleware, UI, tests
- Too small: "Add import statement" → combine with the story that needs it
**Dependency Ordering:**
1. Schema/database changes first (they block everything)
2. Backend logic (depends on schema)
3. Frontend components (depend on backend)
4. Integration/wiring (depends on components)
5. Polish/edge cases (depends on core being done)
**Required Fields Per Story:**
```json ```json
{ {
"id": "US-001", "project": "<project name>",
"title": "Short descriptive title", "branchName": "loop/<feature-slug>",
"description": "As a [role], I want [feature] so that [benefit].", "description": "<one-line description>",
"acceptanceCriteria": [ "userStories": [
"Specific, verifiable criterion", {
"Another criterion", "id": "US-001",
"Typecheck passes" "title": "Short descriptive title",
], "description": "What this story delivers",
"priority": 1, "acceptanceCriteria": [
"passes": false, "Specific, verifiable criterion 1",
"notes": "", "Specific, verifiable criterion 2"
"rejections": 0 ],
"priority": 1,
"passes": false,
"notes": "",
"rejections": 0
}
]
} }
``` ```
**Acceptance Criteria Rules:** **Schema rules — do NOT deviate:**
- Every criterion must be independently verifiable (not "works well" — "returns 200 with valid token") - Top-level key is `userStories` (array). NOT `sprints`, NOT `stories`, NOT `tasks`.
- Always include "Typecheck passes" (or equivalent for the language) - Each story has: `id`, `title`, `description`, `acceptanceCriteria`, `priority`, `passes`, `notes`, `rejections`
- UI stories must include "Verify UI renders and responds to interaction" - `id` format: `US-001`, `US-002`, etc.
- API stories must include status code expectations - `passes` is always `false` initially
- Database stories must include migration success check - `notes` is always `""` initially
- `rejections` is always `0` initially
- `priority` is a number (1 = highest). No two stories share a priority.
- `branchName` must be set — the loop uses it for git checkout
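The schema rules above lend themselves to a mechanical check. The sketch below is one possible validator, not the orchestrator's actual parser. The function name `validate_prd` is hypothetical, it assumes every story is a JSON object, and it assumes story ids use at least three digits after `US-`.

```python
import re

STORY_FIELDS = ("id", "title", "description", "acceptanceCriteria",
                "priority", "passes", "notes", "rejections")


def validate_prd(prd):
    """Return a list of schema violations; an empty list means the PRD conforms."""
    errors = []
    if not prd.get("branchName"):
        errors.append("branchName must be set")
    stories = prd.get("userStories")
    if not isinstance(stories, list):
        return errors + ["top-level key must be a userStories array"]
    seen_priorities = set()
    for index, story in enumerate(stories):
        label = story.get("id", f"story #{index}")
        for field in STORY_FIELDS:
            if field not in story:
                errors.append(f"{label}: missing field {field}")
        # Assumption: ids look like US-001, US-002, ... (three or more digits).
        if not re.fullmatch(r"US-\d{3,}", str(story.get("id", ""))):
            errors.append(f"{label}: id must look like US-001")
        priority = story.get("priority")
        if isinstance(priority, bool) or not isinstance(priority, (int, float)):
            errors.append(f"{label}: priority must be a number")
        elif priority in seen_priorities:
            errors.append(f"{label}: duplicate priority {priority}")
        else:
            seen_priorities.add(priority)
        # A freshly planned story starts unpassed, with no notes or rejections.
        if (story.get("passes") is not False or story.get("notes") != ""
                or story.get("rejections") != 0):
            errors.append(f"{label}: passes, notes, rejections must start as false, empty, 0")
    return errors
```

Loading `.loop/prd.json` with `json.load` and passing the result to `validate_prd` would surface deviations such as a `stories` or `sprints` key before the loop runs.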
**Story sizing:**
- Each story must be completable in ONE agent context window
- Target: 1-3 files changed per story
- Too big → split. Too small → combine.
**Acceptance criteria rules:**
- Every criterion must be independently verifiable by the evaluator
- NOT "works well" — instead "function returns X when given Y"
- Include quality gates: "No lint errors", "Tests pass", etc.
### Step 5: Generate Sprint Contracts ### Step 5: Generate Sprint Contracts
@@ -98,7 +95,7 @@ For each story, create `.loop/contracts/{story-id}.contract.md`:
# Sprint Contract: {Story ID} — {Story Title} # Sprint Contract: {Story ID} — {Story Title}
## What Will Be Built ## What Will Be Built
Concrete description of the deliverable. Not the user story — the actual thing being built. Concrete description of the deliverable.
## Done Conditions ## Done Conditions
- [ ] Condition 1 (specific, testable) - [ ] Condition 1 (specific, testable)
@@ -109,35 +106,30 @@ Concrete description of the deliverable. Not the user story — the actual thing
What the evaluator will specifically check: What the evaluator will specifically check:
- [ ] Check 1 - [ ] Check 1
- [ ] Check 2 - [ ] Check 2
- [ ] No regressions in [specific area] - [ ] No regressions in existing functionality
## Out of Scope ## Out of Scope
Things explicitly NOT part of this story:
- Thing 1 - Thing 1
- Thing 2 - Thing 2
## Key Files ## Key Files
Files likely to be created or modified:
- path/to/file.ext — what changes - path/to/file.ext — what changes
- path/to/other.ext — what changes
## Dependencies ## Dependencies
- Depends on: [story IDs that must be done first, or "none"] - Depends on: [story IDs or "none"]
- Blocks: [story IDs that depend on this one, or "none"] - Blocks: [story IDs or "none"]
``` ```
### Step 6: Initialize Progress File ### Step 6: Initialize Progress File
Create `.loop/progress.md` from the template with an initial Codebase Patterns section populated from what you learned during analysis: Create `.loop/progress.md`:
```markdown ```markdown
# Progress # Progress
## Codebase Patterns ## Codebase Patterns
- [Pattern you discovered during analysis] - [Patterns discovered during analysis]
- [Convention you noticed]
- [Testing approach used in the project]
--- ---
@@ -146,56 +138,38 @@ Create `.loop/progress.md` from the template with an initial Codebase Patterns s
### Planning Session ### Planning Session
Date: YYYY-MM-DD HH:MM Date: YYYY-MM-DD HH:MM
**PRD created:** {N} stories for "{feature description}" **PRD created:** {N} stories for "{description}"
**Estimated iterations:** {N stories + ~30% for evaluator rejections} **Estimated iterations:** {N + 30%}
**Key decisions:**
- [Decision 1 and why]
- [Decision 2 and why]
--- ---
``` ```
### Step 7: Present Summary ### Step 7: Present Summary
Show the user a summary: Show the user:
> **Plan Ready** > **Plan Ready — Review Before Running**
> >
> | Stories | Est. Iterations | Mode | Branch | > | Stories | Est. Iterations | Mode | Branch |
> |---------|----------------|------|--------| > |---------|----------------|------|--------|
> | {N} | {N+30%} | {mode} | {branchName} | > | {N} | {N+30%} | {mode} | {branchName} |
> >
> **Story Overview:** > **Stories:**
> 1. US-001: {title} (priority 1) > 1. US-001: {title}
> 2. US-002: {title} (priority 2) > 2. US-002: {title}
> ... > ...
> >
> Review the stories in `.loop/prd.json` and contracts in `.loop/contracts/`. > **Review these files before running:**
> - `.loop/prd.json` — stories and acceptance criteria
> - `.loop/contracts/` — done conditions and scope per story
>
> Adjust anything you'd like, then run: > Adjust anything you'd like, then run:
> ``` > ```
> /agent-loop:run # Interactive (recommended)
> .loop/loop.sh # Headless
> ```
### Step 8: Review Before Execution
Tell the user:
> **Review the plan before running.** The stories and contracts are designed to be human-reviewed and adjusted before handing off to the autonomous loop.
>
> **Files to review:**
> - `.loop/prd.json` — stories, acceptance criteria, priorities
> - `.loop/contracts/` — sprint contracts with done conditions and scope
>
> **Common adjustments:**
> - Split a story that's too large
> - Reorder priorities
> - Tighten or loosen acceptance criteria
> - Add/remove stories
> - Adjust scope in contracts
>
> Let me know what changes you'd like, or when you're happy with the plan, run:
> ```
> /agent-loop:run > /agent-loop:run
> ``` > ```
Wait for the user to review. If they request changes, make them and re-present the summary. Do NOT automatically start the loop — the user must explicitly invoke `/agent-loop:run` when they're ready. ### Step 8: Wait for Review
Wait for the user to review. If they request changes, make them and re-present.
**Do NOT automatically start the loop.** The user must explicitly invoke `/agent-loop:run` when ready.

@@ -1,43 +1,47 @@
--- ---
name: run name: run
description: Execute the generator-evaluator loop interactively inside Claude Code. Dispatches subagents with full visibility and intervention capability. Run /agent-loop:init and /agent-loop:plan first. description: Execute the generator-evaluator loop interactively inside Claude Code. Dispatches subagents with full visibility. Run /agent-loop:init and /agent-loop:plan first.
--- ---
# /run — Execute Agent Loop Inside Claude Code # /run — Execute Agent Loop Inside Claude Code
Run the generator-evaluator loop natively in Claude Code using subagents. Unlike `loop.sh` (headless), this gives you full visibility into each agent's work and the ability to intervene at any point. Run the generator-evaluator loop natively in Claude Code. You see every tool call and can intervene at any point.
## Usage ## Usage
``` ```
/agent-loop:run # Run until all stories pass or max iterations /agent-loop:run # Run until all stories pass or max iterations
/agent-loop:run 3 # Run at most 3 iterations /agent-loop:run 3 # Run at most 3 iterations
/agent-loop:run --skip-eval # Skip evaluator (generator marks stories done) /agent-loop:run --skip-eval # Skip evaluator pass
/agent-loop:run --story US-003 # Run only a specific story /agent-loop:run --story US-003 # Run only a specific story
``` ```
## Prerequisites
- `.loop/config.json` exists (run `/agent-loop:init` first)
- `.loop/prd.json` exists with stories (run `/agent-loop:plan` first)
## Instructions ## Instructions
When the user invokes `/loop-run`, follow this orchestration sequence exactly. Follow this orchestration sequence exactly.
### Step 0: Parse Arguments ### Step 0: Validate Prerequisites
- If a number is provided, use it as max iterations. Otherwise read `maxIterations` from `.loop/config.json`. 1. Check `.loop/config.json` exists. If not: tell user to run `/agent-loop:init` and stop.
- If `--skip-eval` is provided, skip the evaluator pass. 2. Check `.loop/prd.json` exists. If not: tell user to run `/agent-loop:plan` and stop.
- If `--story <ID>` is provided, only work on that specific story. 3. **Validate prd.json schema.** Read the file and verify:
- Has a `userStories` array (NOT `sprints`, `stories`, or `tasks`)
- Each story has: `id`, `title`, `passes`, `priority`
- If validation fails, show the error and stop. Do NOT attempt to fix it automatically.
4. Check prompts exist. Look for `.loop/prompts/generator/_base.md` in these locations (first match wins):
- `.loop/prompts/` (local project copy)
- `${CLAUDE_PLUGIN_ROOT}/prompts/` (plugin install)
- `~/.claude/plugins/cache/agent-loop/agent-loop/*/prompts/` (plugin cache)
### Step 1: Load State Save the resolved prompt base path for later use. If no prompts found, tell user to run `/agent-loop:init` and stop.
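The "first match wins" lookup for the prompt base path could look roughly like this. It is a sketch under stated assumptions: the function name is hypothetical, and sorting the cache glob results is a choice made here, since the spec only says to glob for any version.

```python
import glob
import os


def resolve_prompt_base(project_root="."):
    """Return the first prompts directory containing generator/_base.md, or None."""
    candidates = [os.path.join(project_root, ".loop", "prompts")]
    plugin_root = os.environ.get("CLAUDE_PLUGIN_ROOT")
    if plugin_root:
        candidates.append(os.path.join(plugin_root, "prompts"))
    cache_pattern = os.path.expanduser(
        "~/.claude/plugins/cache/agent-loop/agent-loop/*/prompts")
    # Assumption: any cached version is acceptable; sorted for determinism.
    candidates.extend(sorted(glob.glob(cache_pattern)))
    for base in candidates:
        if os.path.isfile(os.path.join(base, "generator", "_base.md")):
            return base
    return None
```

A `None` result corresponds to the "no prompts found" case, where the user is told to run `/agent-loop:init`.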
1. Read `.loop/config.json` — get `mode`, `maxIterations`, `evalRetries`, `scopeBudgets` ### Step 1: Parse Arguments and Load State
2. Read `.loop/prd.json` — get the story list and their statuses
3. Check `.loop/progress.md` exists; if not, create it from `.loop/templates/progress.md.template`
Report to the user: - Parse arguments: number → max iterations, `--skip-eval`, `--story <ID>`
- Read `.loop/config.json` for defaults
- Read `.loop/prd.json` for story list
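To make the argument handling concrete, here is a minimal `argparse` sketch of the three inputs listed above. The function name is hypothetical, and the fallback of 20 iterations is an assumption taken from the config template in the init skill.

```python
import argparse


def parse_run_args(argv, config):
    """Parse /agent-loop:run arguments, using config.json values as defaults."""
    parser = argparse.ArgumentParser(prog="/agent-loop:run")
    # A bare number overrides maxIterations from the config.
    parser.add_argument("max_iterations", nargs="?", type=int,
                        default=config.get("maxIterations", 20))
    parser.add_argument("--skip-eval", action="store_true",
                        default=bool(config.get("skipEval", False)))
    parser.add_argument("--story", metavar="ID", default=None)
    return parser.parse_args(argv)
```

For example, `parse_run_args(["3", "--story", "US-003"], config)` yields `max_iterations=3` and `story="US-003"`, while an empty argument list falls back to the config values.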
Report:
> **Loop Ready** > **Loop Ready**
> - Mode: {mode} > - Mode: {mode}
@@ -45,7 +49,7 @@ Report to the user:
> - Max iterations: {N} > - Max iterations: {N}
> - Eval: {on/off} > - Eval: {on/off}
> >
> Starting loop. You can interrupt me at any time to adjust course. > Starting. Interrupt me at any time.
### Step 2: Iteration Loop ### Step 2: Iteration Loop
@@ -53,45 +57,42 @@ For each iteration (1 to max iterations):
#### 2a. Find Next Story #### 2a. Find Next Story
Find the highest-priority story in `prd.json` where `passes` is `false` and `blocked` is not `true`. If `--story` was specified, use that story instead. Find the story with the lowest `priority` number where `passes` is `false` and `blocked` is not `true`. If `--story` was specified, use that story.
**If no actionable story remains:** **If no actionable story remains:**
- If all stories have `passes: true` → report success and stop - All `passes: true` → report success and stop
- If some stories are `blocked: true` → report which are blocked and suggest `/agent-loop:triage` - Some `blocked: true` → report which and suggest `/agent-loop:triage`
- Stop the loop - Stop the loop
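The selection rule in 2a can be expressed as a small function. This is an illustrative sketch with a hypothetical name, and it operates on the `userStories` list from `prd.json`.

```python
def find_next_story(stories, story_id=None):
    """Pick the story to work on, or None when nothing is actionable."""
    if story_id is not None:
        # --story was given: use that story regardless of priority.
        return next((s for s in stories if s.get("id") == story_id), None)
    actionable = [s for s in stories
                  if not s.get("passes") and s.get("blocked") is not True]
    # Lowest priority number wins (1 = highest priority).
    return min(actionable, key=lambda s: s["priority"], default=None)
```

When this returns `None`, the caller distinguishes "all passed" from "some blocked" by inspecting the remaining stories, as described in the bullets above.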
#### 2b. Report Iteration Start #### 2b. Report Iteration Start
Tell the user:
> **Iteration {N}/{max} — {story.id}: {story.title}** > **Iteration {N}/{max} — {story.id}: {story.title}**
If the story has `[REJECTED]` entries in its `notes` field, summarize the previous feedback so the user has context. If the story has `[REJECTED]` in its `notes`, summarize the feedback.
#### 2c. Assemble Generator Prompt #### 2c. Assemble Generator Prompt
Read these files and concatenate them with `---` separators: Read and concatenate with `---` separator:
1. `.loop/prompts/generator/_base.md` 1. `{prompt_base_path}/generator/_base.md`
2. `.loop/prompts/generator/{mode}.md` 2. `{prompt_base_path}/generator/{mode}.md`
Then substitute these template variables in the assembled text: Substitute template variables:
- `{{MAX_FILES_TO_READ}}` → from `config.scopeBudgets.{mode}.maxFilesToRead` - `{{MAX_FILES_TO_READ}}` → from config scopeBudgets
- `{{MAX_LINES_TO_WRITE}}` → from `config.scopeBudgets.{mode}.maxLinesToWrite` - `{{MAX_LINES_TO_WRITE}}` → from config scopeBudgets
- `{{MAX_FILES_TO_MODIFY}}` → from `config.scopeBudgets.{mode}.maxFilesToModify` - `{{MAX_FILES_TO_MODIFY}}` → from config scopeBudgets
- `{{MODE}}` → the mode - `{{MODE}}` → mode from config
- `{{ITERATION}}` → current iteration number - `{{ITERATION}}` → current iteration
- `{{MAX_ITERATIONS}}` → max iterations - `{{MAX_ITERATIONS}}` → max iterations
- `{{LOOP_DIR}}` → path to `.loop/` directory - `{{LOOP_DIR}}` → absolute path to `.loop/`
- `{{PROJECT_ROOT}}` → project root path - `{{PROJECT_ROOT}}` → project root absolute path
- `{{CURRENT_STORY_ID}}` → the story ID being worked on - `{{CURRENT_STORY_ID}}` → story ID
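Prompt assembly and variable substitution can be sketched as plain string replacement. The function name and its parameters are hypothetical, and the exact whitespace around the `---` separator is an assumption. The budget lookup follows the `config.scopeBudgets.{mode}` layout from `config.json`.

```python
def render_prompt(parts, config, iteration, max_iterations,
                  loop_dir, project_root, story_id):
    """Join prompt parts with a --- separator and fill in {{VARIABLES}}."""
    mode = config["mode"]
    budget = config["scopeBudgets"][mode]
    values = {
        "MAX_FILES_TO_READ": budget["maxFilesToRead"],
        "MAX_LINES_TO_WRITE": budget["maxLinesToWrite"],
        "MAX_FILES_TO_MODIFY": budget["maxFilesToModify"],
        "MODE": mode,
        "ITERATION": iteration,
        "MAX_ITERATIONS": max_iterations,
        "LOOP_DIR": loop_dir,
        "PROJECT_ROOT": project_root,
        "CURRENT_STORY_ID": story_id,
    }
    # Assumption: parts are separated by a blank-line-padded horizontal rule.
    text = "\n\n---\n\n".join(parts)
    for name, value in values.items():
        text = text.replace("{{" + name + "}}", str(value))
    return text
```

Here `parts` would hold the contents of `generator/_base.md` and `generator/{mode}.md`, read from the resolved prompt base path.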
#### 2d. Capture Pre-Generator Git State #### 2d. Capture Pre-Generator Git State
Run `git rev-parse HEAD` and save it. This is needed for the evaluator's diff. Run `git rev-parse HEAD` and save the SHA.
#### 2e. Dispatch Generator Agent #### 2e. Dispatch Generator Agent
Use the **Agent tool** to launch the generator:
``` ```
Agent( Agent(
prompt: <assembled generator prompt>, prompt: <assembled generator prompt>,
@@ -101,32 +102,28 @@ Agent(
) )
``` ```
**IMPORTANT:** Use `mode: "auto"` so the user can see tool calls but isn't prompted for every action. If the user has expressed a preference for more control, use `mode: "default"` instead. Wait for completion.
Wait for the agent to complete. The Agent tool returns the generator's final output.
#### 2f. Check for Completion Signal #### 2f. Check for Completion Signal
If the generator output contains `<promise>COMPLETE</promise>`, report all stories complete and stop. If output contains `<promise>COMPLETE</promise>`, report all stories complete and stop.
#### 2g. Skip Evaluator (if configured) #### 2g. Skip Evaluator (if configured)
If `--skip-eval` was specified or `config.skipEval` is true, skip to step 2j. If `--skip-eval` or `config.skipEval` is true, skip to 2j and treat as PASS.
#### 2h. Assemble Evaluator Prompt #### 2h. Assemble Evaluator Prompt
Read these files and concatenate them: Read and concatenate:
1. `.loop/prompts/evaluator/_base.md` 1. `{prompt_base_path}/evaluator/_base.md`
2. `.loop/prompts/evaluator/{mode}.md` 2. `{prompt_base_path}/evaluator/{mode}.md`
Substitute the same template variables as the generator, plus: Substitute same variables plus:
- `{{PRE_GENERATOR_SHA}}` → the git SHA captured in step 2d - `{{PRE_GENERATOR_SHA}}` → SHA from step 2d
- `{{CURRENT_STORY_ID}}` → the story ID - `{{CURRENT_STORY_ID}}` → story ID
#### 2i. Dispatch Evaluator Agent #### 2i. Dispatch Evaluator Agent
Use the **Agent tool** to launch the evaluator:
``` ```
Agent( Agent(
prompt: <assembled evaluator prompt>, prompt: <assembled evaluator prompt>,
@@ -136,29 +133,24 @@ Agent(
) )
``` ```
Wait for completion. Parse the verdict from the output: Parse the verdict:
- `<verdict>PASS</verdict>` → PASS
- `<verdict>REJECT</verdict>` → REJECT, extract `<rejection_reason>...</rejection_reason>`
- No verdict tag → REJECT (fail-safe)
- Look for `<verdict>PASS</verdict>` → story passes #### 2j. Update State
- Look for `<verdict>REJECT</verdict>` → story rejected; extract reason from `<rejection_reason>...</rejection_reason>`
- No verdict tag found → treat as REJECT (fail-safe)
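The verdict rules, including the fail-safe for a missing tag, might be implemented along these lines. The function name is hypothetical. Checking for REJECT before PASS when both tags appear is an assumption made here to keep ambiguous output on the safe side.

```python
import re


def parse_verdict(output):
    """Return (verdict, reason) where verdict is 'PASS' or 'REJECT'."""
    match = re.search(r"<rejection_reason>(.*?)</rejection_reason>",
                      output, re.DOTALL)
    reason = match.group(1).strip() if match else "no reason given"
    # Assumption: REJECT takes precedence if both tags are present.
    if "<verdict>REJECT</verdict>" in output:
        return "REJECT", reason
    if "<verdict>PASS</verdict>" in output:
        return "PASS", None
    # Fail-safe: no verdict tag at all is treated as a rejection.
    return "REJECT", "no verdict tag found"
```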
#### 2j. Update State Based on Verdict **On PASS:**
1. Read `.loop/prd.json`, set `passes: true` for the story, write it back
**On PASS (or skip-eval):** 2. Report: **{story.id} PASSED**
1. Update `.loop/prd.json` — set `passes: true` for the story
2. Report to user: ✓ **{story.id} PASSED**
**On REJECT:** **On REJECT:**
1. Update `.loop/prd.json`: 1. Read `.loop/prd.json`, increment `rejections`, append `[REJECTED] {reason}` to `notes`, write back
- Keep `passes: false` 2. Report: **{story.id} REJECTED** — {reason}
- Increment `rejections` count 3. If `rejections >= evalRetries`: set `blocked: true`, append `[BLOCKED]` to notes
- Append `[REJECTED] {reason}` to `notes` - Report: **{story.id} BLOCKED** — rejected {N} times, needs human review
2. Report to user: ✗ **{story.id} REJECTED** — {reason}
3. Check if `rejections` >= `evalRetries` from config:
- If yes: set `blocked: true` in prd.json, append `[BLOCKED]` to notes
- Report: ⚠ **{story.id} BLOCKED** — rejected {N} times, needs human review
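The state transitions for a single story can be sketched as below. The function name and return values are hypothetical, and joining note entries with newlines is an assumption, since the spec only says to append. Reading and writing `.loop/prd.json` is left to the caller.

```python
def apply_verdict(story, verdict, reason, eval_retries):
    """Mutate one story dict per the verdict; return PASSED, REJECTED or BLOCKED."""
    if verdict == "PASS":
        story["passes"] = True
        return "PASSED"
    story["rejections"] = story.get("rejections", 0) + 1
    entry = f"[REJECTED] {reason}"
    # Assumption: note entries are separated by newlines.
    story["notes"] = f"{story['notes']}\n{entry}" if story.get("notes") else entry
    if story["rejections"] >= eval_retries:
        story["blocked"] = True
        story["notes"] += "\n[BLOCKED]"
        return "BLOCKED"
    return "REJECTED"
```

With `evalRetries` set to 2, as in the config template, the first rejection leaves the story retryable and the second one blocks it for human review.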
#### 2k. Append Progress Entry #### 2k. Append Progress
Append to `.loop/progress.md`: Append to `.loop/progress.md`:
@@ -171,33 +163,22 @@ Verdict: {PASS/REJECT/SKIP-EVAL}
--- ---
``` ```
#### 2l. Report Iteration Summary #### 2l. Report and Continue
Show current story counts: `{passed}/{total} stories complete` Show: `{passed}/{total} stories complete`
If there are more iterations and more stories, continue to the next iteration. Continue to next iteration.
### Step 3: Loop Exit ### Step 3: Loop Exit
When the loop ends (all stories done, max iterations, or all remaining blocked), report:
> **Loop Complete** > **Loop Complete**
> - Iterations used: {N} > - Iterations used: {N}
> - Stories: {passed}/{total} complete, {blocked} blocked > - Stories: {passed}/{total} complete, {blocked} blocked
> - {Suggest `/agent-loop:triage` if anything is blocked or incomplete}
If incomplete, suggest `/agent-loop:triage`.
### Error Handling ### Error Handling
- If an Agent subagent fails or returns empty output, log a warning and continue to the next iteration. Do NOT stop the loop for a single agent failure. - Agent fails or empty output → warn and continue to next iteration
- If `prd.json` cannot be parsed, stop immediately and report the error. - prd.json unparseable → stop immediately
- If the user interrupts (denies a tool call, says "stop", etc.), gracefully end the loop and report current status. - User says "stop" → end loop, report current status
### Key Differences from loop.sh
| Feature | loop.sh | /loop-run |
|---------|---------|-----------|
| Execution | Headless (`claude --print`) | Visible in Claude Code |
| Intervention | Kill the process | Deny tool calls, chat mid-loop |
| Permissions | `--dangerously-skip-permissions` | User-controlled |
| Context | Fresh process per agent | Fresh Agent subagent per agent |
| State updates | Shell functions | Claude Code reads/writes files directly |