fix: critical bugs, stale refs, README rewrite, security fixes
- Fix evaluator bypass on last story (moved completion check)
- Fix all stale command name references across README, loop.sh, skills, plugin.json
- Fix explore evaluator false rejects (.loop/ files are expected)
- Fix stderr capture order in headless mode
- Fix shell injection risk in hooks.sh python fallback
- Remove .DS_Store from tracking
- Rewrite README to match current architecture (single entry point, tmux, optional tools)
- Add XcodeBuildMCP and iOS simulator MCP to optional tools docs
@@ -1,7 +1,7 @@
|
|||||||
{
|
{
|
||||||
"name": "agent-loop",
|
"name": "agent-loop",
|
||||||
"version": "0.8.0",
|
"version": "0.8.0",
|
||||||
"description": "Autonomous generator-evaluator agent loop for long-running coding tasks. Plan with /agent-loop:init, then execute with /agent-loop:run.",
|
"description": "Autonomous generator-evaluator agent loop for long-running coding tasks. Run /agent-loop:run to start.",
|
||||||
"author": {
|
"author": {
|
||||||
"name": "Sheldon"
|
"name": "Sheldon"
|
||||||
},
|
},
|
||||||
|
|||||||
150
README.md
@@ -4,9 +4,7 @@ Autonomous AI agent harness that combines a generator-evaluator architecture wit
|
|||||||
|
|
||||||
Inspired by [Geoffrey Huntley's Ralph pattern](https://ghuntley.com/ralph/) and [Anthropic's harness design research](https://www.anthropic.com/engineering/harness-design-long-running-apps).
|
Inspired by [Geoffrey Huntley's Ralph pattern](https://ghuntley.com/ralph/) and [Anthropic's harness design research](https://www.anthropic.com/engineering/harness-design-long-running-apps).
|
||||||
|
|
||||||
A generator-evaluator loop runs fresh agent instances per iteration. Each iteration: a **Generator** does the work, then an **Evaluator** verifies it. Human judgment stays in the planning phase; execution is autonomous.
|
A generator-evaluator loop runs fresh Claude Code sessions per iteration. Each iteration: a **Generator** does the work, then an **Evaluator** verifies it. Human judgment stays in the planning phase; execution is autonomous with full visibility.
|
||||||
|
|
||||||
Two execution modes: **headless** via `loop.sh` (fully autonomous bash process) or **interactive** via `/loop-run` (Claude Code-native with full visibility and intervention).
|
|
||||||
|
|
||||||
## Install
|
## Install
|
||||||
|
|
||||||
@@ -20,46 +18,32 @@ Two execution modes: **headless** via `loop.sh` (fully autonomous bash process)
|
|||||||
Then in any project:
|
Then in any project:
|
||||||
|
|
||||||
```
|
```
|
||||||
/agent-loop:init # Set up the loop for your project
|
/agent-loop:run
|
||||||
/agent-loop:plan # Generate PRD and sprint contracts
|
|
||||||
/agent-loop:run # Run the loop interactively
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
That's it. The single command handles setup, planning, and execution.
|
||||||
|
|
||||||
### Manual Install
|
### Manual Install
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Clone into your project
|
|
||||||
cp -r /path/to/loop-loop .loop
|
cp -r /path/to/loop-loop .loop
|
||||||
|
|
||||||
# Install skills as Claude Code commands
|
|
||||||
mkdir -p .claude/commands
|
|
||||||
for skill in loop-init loop-plan loop-run loop-triage; do
|
|
||||||
ln -sf "../../.loop/skills/$skill/SKILL.md" ".claude/commands/$skill.md"
|
|
||||||
done
|
|
||||||
|
|
||||||
# Then in Claude Code:
|
|
||||||
/loop-init && /loop-plan && /loop-run
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Then run `.loop/loop.sh` directly.
|
||||||
|
|
||||||
## How It Works
|
## How It Works
|
||||||
|
|
||||||
```
|
```
|
||||||
[You + Claude Code] [Loop Execution]
|
/agent-loop:run
|
||||||
|
├─ Phase 1: Scaffold .loop/ (if needed)
|
||||||
/agent-loop:init Interactive (/agent-loop:run)
|
├─ Phase 2: Generate stories from spec (if needed)
|
||||||
→ scaffolds .loop/ └─ dispatches Agent subagents
|
│ └─ Presents stories for human review
|
||||||
→ detects project └─ visible tool calls, can intervene
|
│ └─ STOPS — user reviews and says "go"
|
||||||
→ picks mode └─ chat mid-loop to adjust course
|
└─ Phase 3: Launch loop in tmux
|
||||||
→ creates config.json
|
├─→ Generator → picks story → implements → commits
|
||||||
Headless (.loop/loop.sh)
|
├─→ Evaluator → verifies → PASS or REJECT
|
||||||
/agent-loop:plan └─ spawns claude --print per iteration
|
├─→ next iteration (fresh CC session each time)
|
||||||
→ asks clarifying questions └─ fully autonomous, no UI
|
└─→ all stories pass → done
|
||||||
→ generates prd.json
|
|
||||||
→ generates sprint contracts Both paths:
|
|
||||||
→ populates progress.md ├─→ Generator → picks story → implements → commits
|
|
||||||
├─→ Evaluator → verifies → PASS or REJECT
|
|
||||||
├─→ next iteration...
|
|
||||||
└─→ all stories pass → done
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Modes
|
## Modes
|
||||||
@@ -70,47 +54,48 @@ done
|
|||||||
| **explore** | Read-only codebase analysis | No |
|
| **explore** | Read-only codebase analysis | No |
|
||||||
| **fix** | Targeted bug fixes / tech debt | Yes |
|
| **fix** | Targeted bug fixes / tech debt | Yes |
|
||||||
|
|
||||||
## Running the Loop
|
## Monitoring
|
||||||
|
|
||||||
### Option A: Interactive (`/loop-run`) — Recommended
|
After the loop launches in tmux:
|
||||||
|
|
||||||
Run inside Claude Code. You see every tool call, file edit, and test run. You can intervene at any point — deny a tool call, chat to adjust course, or stop the loop.
|
|
||||||
|
|
||||||
```
|
|
||||||
/loop-run # Run until done or max iterations
|
|
||||||
/loop-run 3 # Run at most 3 iterations
|
|
||||||
/loop-run --skip-eval # Skip evaluator pass
|
|
||||||
/loop-run --story US-003 # Run only a specific story
|
|
||||||
```
|
|
||||||
|
|
||||||
### Option B: Headless (`loop.sh`)
|
|
||||||
|
|
||||||
Run as a standalone bash process. Fully autonomous — no UI, no intervention. Useful for background execution or CI.
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
.loop/loop.sh [options]
|
# Watch live (from Claude Code)
|
||||||
|
! tmux attach -t agent-loop
|
||||||
|
|
||||||
|
# Detach back to Claude Code
|
||||||
|
Ctrl+B then D
|
||||||
|
|
||||||
|
# Stop the loop
|
||||||
|
Ctrl+C in the tmux session
|
||||||
|
```
|
||||||
|
|
||||||
|
Or ask Claude Code "status" — it reads `.loop/prd.json` and `.loop/progress.md`.
|
||||||
|
|
||||||
|
## Headless Mode
|
||||||
|
|
||||||
|
For CI or background execution without the interactive UI:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
.loop/loop.sh --headless [options]
|
||||||
|
|
||||||
--mode <implement|explore|fix> Operating mode
|
--mode <implement|explore|fix> Operating mode
|
||||||
--max <N> Maximum iterations (default: 20)
|
--max <N> Maximum iterations (default: 20)
|
||||||
--skip-eval Skip evaluator pass
|
--skip-eval Skip evaluator pass
|
||||||
--tool <claude|amp> AI tool to use
|
--dry-run Print assembled prompts without running
|
||||||
--no-hooks Don't install stop hooks
|
|
||||||
--dry-run Print assembled prompts without running agents
|
|
||||||
--resume Skip already-passed stories (explicit exit when none remain)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
### Generator
|
### Generator
|
||||||
Fresh Claude Code instance each iteration. Reads `prd.json` to find the highest-priority incomplete story, reads the sprint contract, implements the story, runs quality gates, commits, and marks it done.
|
Fresh Claude Code session each iteration. Reads `prd.json` to find the highest-priority incomplete story, reads the sprint contract, implements the story, runs quality gates, commits, and marks it done.
|
||||||
|
|
||||||
### Evaluator
|
### Evaluator
|
||||||
Separate fresh instance after each generator pass. Skeptically verifies the work: checks acceptance criteria against actual code, runs tests independently, and issues a `PASS` or `REJECT` verdict. Rejection sends the story back to the generator with specific feedback.
|
Separate fresh session after each generator pass. Skeptically verifies the work: checks acceptance criteria against actual code, runs tests and the application, and issues a `PASS` or `REJECT` verdict. Rejection sends the story back with specific feedback.
|
||||||
|
|
||||||
Evaluator skepticism is deliberately tuned — Claude's default tendency is to rationalize away issues. The evaluator prompt includes explicit bias correction.
|
Evaluator skepticism is deliberately tuned — Claude's default tendency is to rationalize away issues. The evaluator prompt includes explicit bias correction.
|
||||||
|
|
||||||
### Sprint Contracts
|
### Sprint Contracts
|
||||||
Before the loop starts, `/loop-plan` generates contracts for each story. These define "done" conditions that both generator and evaluator reference, eliminating ambiguity about whether work is complete.
|
Before the loop starts, the planner generates contracts for each story. These define "done" conditions that both generator and evaluator reference, eliminating ambiguity about whether work is complete.
|
||||||
|
|
||||||
### State Persistence
|
### State Persistence
|
||||||
|
|
||||||
@@ -122,59 +107,36 @@ Before the loop starts, `/loop-plan` generates contracts for each story. These d
|
|||||||
| `config.json` | Harness configuration |
|
| `config.json` | Harness configuration |
|
||||||
| Git commits | Code changes with story-tagged messages |
|
| Git commits | Code changes with story-tagged messages |
|
||||||
|
|
||||||
## File Structure
|
## Optional: Runtime Testing Tools
|
||||||
|
|
||||||
```
|
The evaluator verifies code actually runs, not just that it looks correct. It uses whatever tools are available. For richer verification, install these optional MCP servers:
|
||||||
.loop/
|
|
||||||
loop.sh # Main loop orchestrator
|
|
||||||
config.json # Project config (generated by /loop-init)
|
|
||||||
init.sh # Project setup script (generated by /loop-init)
|
|
||||||
prd.json # Active PRD (generated by /loop-plan)
|
|
||||||
progress.md # Cross-session memory (append-only)
|
|
||||||
|
|
||||||
prompts/
|
|
||||||
generator/_base.md # Shared generator instructions
|
|
||||||
generator/implement.md # Implement mode overlay
|
|
||||||
generator/explore.md # Explore mode overlay
|
|
||||||
generator/fix.md # Fix mode overlay
|
|
||||||
evaluator/_base.md # Skeptical evaluator base
|
|
||||||
evaluator/implement.md # Implement verification
|
|
||||||
evaluator/explore.md # Analysis verification
|
|
||||||
evaluator/fix.md # Fix verification
|
|
||||||
planner/plan.md # Planning context
|
|
||||||
|
|
||||||
templates/ # Reference templates
|
|
||||||
lib/ # Shell library functions
|
|
||||||
skills/ # Claude Code skills (/loop-init, /loop-plan, /loop-run, /loop-triage)
|
|
||||||
contracts/ # Sprint contracts (generated by /loop-plan)
|
|
||||||
triage/ # Analysis output (explore mode)
|
|
||||||
archive/ # Completed feature archives
|
|
||||||
```
|
|
||||||
|
|
||||||
## Browser Testing (Optional)
|
|
||||||
|
|
||||||
The evaluator includes basic runtime verification for web projects (starts a local server, checks HTTP response). For full browser testing with console error detection and screenshots, install the Playwright MCP server:
|
|
||||||
|
|
||||||
|
**Web projects (Playwright):**
|
||||||
```bash
|
```bash
|
||||||
claude mcp add playwright npx @playwright/mcp@latest --headless --browser=chromium
|
claude mcp add playwright npx @playwright/mcp@latest --headless --browser=chromium
|
||||||
```
|
```
|
||||||
|
|
||||||
When Playwright is available, the evaluator will use it to:
|
**iOS/Xcode projects (XcodeBuildMCP):**
|
||||||
- Navigate to the running application
|
```bash
|
||||||
- Check for JavaScript console errors
|
brew tap getsentry/xcodebuildmcp && brew install xcodebuildmcp
|
||||||
- Take screenshots for visual verification
|
claude mcp add xcodebuild -- xcodebuildmcp
|
||||||
- Reject stories with runtime errors
|
```
|
||||||
|
|
||||||
This is optional — the evaluator works without it, but may miss runtime issues that only surface in a browser.
|
**iOS Simulator interaction:**
|
||||||
|
```bash
|
||||||
|
claude mcp add ios-simulator -- npx -y ios-simulator-mcp
|
||||||
|
```
|
||||||
|
|
||||||
|
These are optional — the evaluator works without them but may miss runtime-only issues.
|
||||||
|
|
||||||
## Design Principles
|
## Design Principles
|
||||||
|
|
||||||
- **Fresh context per iteration** — no accumulated hallucination drift
|
- **Fresh context per iteration** — no accumulated hallucination drift
|
||||||
- **Separate generation from evaluation** — external skepticism is easier to tune than self-criticism
|
- **Separate generation from evaluation** — external skepticism is easier to tune than self-criticism
|
||||||
- **Human judgment for planning, AI for execution** — interactive `/loop-plan`, autonomous loop
|
- **Human judgment for planning, AI for execution** — human reviews stories, loop executes autonomously
|
||||||
- **Structured handoffs via artifacts** — not conversation memory
|
- **Structured handoffs via artifacts** — not conversation memory
|
||||||
- **No git revert on rejection** — next generator sees partial work + feedback (more signal)
|
- **No git revert on rejection** — next generator sees partial work + feedback (more signal)
|
||||||
- **Advisory scope budgets** — prompt-enforced limits on files read/written per iteration
|
- **Tool-agnostic** — evaluator uses whatever tools are available, no hardcoded dependencies
|
||||||
|
|
||||||
## Credits
|
## Credits
|
||||||
|
|
||||||
|
|||||||
@@ -5,7 +5,7 @@
|
|||||||
# 1. Copies the harness to ~/.claude/loop/ (prompts, templates, lib, loop.sh)
|
# 1. Copies the harness to ~/.claude/loop/ (prompts, templates, lib, loop.sh)
|
||||||
# 2. Installs skills as Claude Code commands at ~/.claude/commands/
|
# 2. Installs skills as Claude Code commands at ~/.claude/commands/
|
||||||
#
|
#
|
||||||
# After install, use /loop-init in any project to get started.
|
# After install, use /agent-loop:run in any project to get started.
|
||||||
#
|
#
|
||||||
# Usage:
|
# Usage:
|
||||||
# ./install.sh # Install
|
# ./install.sh # Install
|
||||||
@@ -18,7 +18,7 @@ HARNESS_DIR="$CLAUDE_DIR/loop"
|
|||||||
COMMANDS_DIR="$CLAUDE_DIR/commands"
|
COMMANDS_DIR="$CLAUDE_DIR/commands"
|
||||||
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
|
||||||
|
|
||||||
SKILLS=(loop-init loop-plan loop-run loop-triage)
|
SKILLS=(setup stories run triage)
|
||||||
|
|
||||||
# --- Colors (if terminal supports them) ---
|
# --- Colors (if terminal supports them) ---
|
||||||
if [ -t 1 ]; then
|
if [ -t 1 ]; then
|
||||||
@@ -100,9 +100,7 @@ info "${BOLD}Installation complete.${RESET}"
|
|||||||
echo ""
|
echo ""
|
||||||
echo " Next steps (inside Claude Code, in any project):"
|
echo " Next steps (inside Claude Code, in any project):"
|
||||||
echo ""
|
echo ""
|
||||||
echo " /loop-init # Set up the loop for your project"
|
echo " /agent-loop:run # Single command — setup, plan, and run"
|
||||||
echo " /loop-plan # Generate PRD and sprint contracts"
|
|
||||||
echo " /loop-run # Run the loop interactively"
|
|
||||||
echo ""
|
echo ""
|
||||||
echo " Or run headless: .loop/loop.sh"
|
echo " Or run headless: .loop/loop.sh"
|
||||||
echo ""
|
echo ""
|
||||||
|
|||||||
13
lib/hooks.sh
@@ -20,9 +20,9 @@ install_hooks() {
|
|||||||
jq '.hooks.Stop = [{"matcher": "", "hooks": [{"type": "command", "command": "kill -INT $PPID || true"}]}]' \
|
jq '.hooks.Stop = [{"matcher": "", "hooks": [{"type": "command", "command": "kill -INT $PPID || true"}]}]' \
|
||||||
"$SETTINGS_FILE" > "${SETTINGS_FILE}.tmp" && mv "${SETTINGS_FILE}.tmp" "$SETTINGS_FILE"
|
"$SETTINGS_FILE" > "${SETTINGS_FILE}.tmp" && mv "${SETTINGS_FILE}.tmp" "$SETTINGS_FILE"
|
||||||
else
|
else
|
||||||
python3 -c "
|
LOOP_SETTINGS="$SETTINGS_FILE" python3 -c "
|
||||||
import json, os
|
import json, os
|
||||||
p = '$SETTINGS_FILE'
|
p = os.environ['LOOP_SETTINGS']
|
||||||
s = json.load(open(p)) if os.path.exists(p) else {}
|
s = json.load(open(p)) if os.path.exists(p) else {}
|
||||||
s.setdefault('hooks', {})['Stop'] = [{'matcher': '', 'hooks': [{'type': 'command', 'command': 'kill -INT \$PPID || true'}]}]
|
s.setdefault('hooks', {})['Stop'] = [{'matcher': '', 'hooks': [{'type': 'command', 'command': 'kill -INT \$PPID || true'}]}]
|
||||||
json.dump(s, open(p, 'w'), indent=2)
|
json.dump(s, open(p, 'w'), indent=2)
|
||||||
@@ -37,12 +37,13 @@ remove_hooks() {
|
|||||||
jq 'del(.hooks.Stop)' "$SETTINGS_FILE" > "${SETTINGS_FILE}.tmp" && mv "${SETTINGS_FILE}.tmp" "$SETTINGS_FILE"
|
jq 'del(.hooks.Stop)' "$SETTINGS_FILE" > "${SETTINGS_FILE}.tmp" && mv "${SETTINGS_FILE}.tmp" "$SETTINGS_FILE"
|
||||||
jq 'if .hooks == {} then del(.hooks) else . end' "$SETTINGS_FILE" > "${SETTINGS_FILE}.tmp" && mv "${SETTINGS_FILE}.tmp" "$SETTINGS_FILE"
|
jq 'if .hooks == {} then del(.hooks) else . end' "$SETTINGS_FILE" > "${SETTINGS_FILE}.tmp" && mv "${SETTINGS_FILE}.tmp" "$SETTINGS_FILE"
|
||||||
else
|
else
|
||||||
python3 -c "
|
LOOP_SETTINGS="$SETTINGS_FILE" python3 -c "
|
||||||
import json
|
import json, os
|
||||||
s = json.load(open('$SETTINGS_FILE'))
|
p = os.environ['LOOP_SETTINGS']
|
||||||
|
s = json.load(open(p))
|
||||||
s.get('hooks', {}).pop('Stop', None)
|
s.get('hooks', {}).pop('Stop', None)
|
||||||
if not s.get('hooks'): s.pop('hooks', None)
|
if not s.get('hooks'): s.pop('hooks', None)
|
||||||
json.dump(s, open('$SETTINGS_FILE', 'w'), indent=2)
|
json.dump(s, open(p, 'w'), indent=2)
|
||||||
"
|
"
|
||||||
fi
|
fi
|
||||||
log "Stop hook removed"
|
log "Stop hook removed"
|
||||||
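The hooks.sh change above stops splicing `$SETTINGS_FILE` into the program text and reads it from the environment instead. The sketch below illustrates why, under stated assumptions: `tricky_path` is a made-up value, and `sh -c` stands in for `python3 -c` so the example runs without Python. It is not the code in hooks.sh.

```shell
#!/bin/sh
# Hypothetical path containing a single quote; not taken from hooks.sh.
tricky_path="/tmp/it's settings.json"

# Unsafe pattern: the value becomes part of the program text, so the quote
# inside the path ends the string literal early and the child fails to parse.
if sh -c "p='$tricky_path'; printf '%s' \"\$p\"" >/dev/null 2>&1; then
  interpolated_ok=yes
else
  interpolated_ok=no
fi

# Safer pattern (the shape used by the fix): keep the program text constant
# and hand the value over through an environment variable.
from_env="$(LOOP_SETTINGS="$tricky_path" sh -c 'printf "%s" "$LOOP_SETTINGS"')"

echo "interpolated program parsed: $interpolated_ok"
echo "value read from environment: $from_env"
```

With a crafted value the same unsafe splice could run attacker-chosen code rather than just failing, which is the risk the commit message refers to.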
|
|||||||
29
loop.sh
@@ -124,7 +124,7 @@ while [[ $# -gt 0 ]]; do
|
|||||||
--dry-run) DRY_RUN=true; shift ;;
|
--dry-run) DRY_RUN=true; shift ;;
|
||||||
--headless) export LOOP_HEADLESS=true; shift ;;
|
--headless) export LOOP_HEADLESS=true; shift ;;
|
||||||
--resume) RESUME=true; shift ;;
|
--resume) RESUME=true; shift ;;
|
||||||
--replan) log "ERROR: --replan is not yet implemented. Use /loop-plan interactively."; exit 1 ;;
|
--replan) log "ERROR: --replan is not yet implemented. Use /agent-loop:stories interactively."; exit 1 ;;
|
||||||
[0-9]*) MAX_ITERATIONS="$1"; shift ;;
|
[0-9]*) MAX_ITERATIONS="$1"; shift ;;
|
||||||
*) log "Unknown option: $1"; exit 1 ;;
|
*) log "Unknown option: $1"; exit 1 ;;
|
||||||
esac
|
esac
|
||||||
@@ -162,7 +162,7 @@ check_archive
|
|||||||
|
|
||||||
# Validate prd.json exists (AFTER archive check, which may delete it on branch change)
|
# Validate prd.json exists (AFTER archive check, which may delete it on branch change)
|
||||||
if [ ! -f "$LOOP_DIR/prd.json" ]; then
|
if [ ! -f "$LOOP_DIR/prd.json" ]; then
|
||||||
log "ERROR: No prd.json found. Run /loop-plan first to create one."
|
log "ERROR: No prd.json found. Run /agent-loop:stories first to create one."
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
@@ -240,11 +240,11 @@ run_agent() {
|
|||||||
claude)
|
claude)
|
||||||
printf '%s\n' "$prompt" | timeout "${LOOP_AGENT_TIMEOUT:-600}" \
|
printf '%s\n' "$prompt" | timeout "${LOOP_AGENT_TIMEOUT:-600}" \
|
||||||
claude --dangerously-skip-permissions --output-format text \
|
claude --dangerously-skip-permissions --output-format text \
|
||||||
--print 2>&1 > "$output_file"
|
--print > "$output_file" 2>&1
|
||||||
;;
|
;;
|
||||||
amp)
|
amp)
|
||||||
printf '%s\n' "$prompt" | timeout "${LOOP_AGENT_TIMEOUT:-600}" \
|
printf '%s\n' "$prompt" | timeout "${LOOP_AGENT_TIMEOUT:-600}" \
|
||||||
amp --dangerously-allow-all 2>&1 > "$output_file"
|
amp --dangerously-allow-all > "$output_file" 2>&1
|
||||||
;;
|
;;
|
||||||
*)
|
*)
|
||||||
log "ERROR: Unknown tool '$TOOL'"
|
log "ERROR: Unknown tool '$TOOL'"
|
||||||
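The redirection swap in this hunk matters because the shell applies redirections left to right. A minimal sketch, using a hypothetical `emit` helper that is not part of loop.sh, shows what each order captures:

```shell
#!/bin/sh
# Hypothetical helper (not part of loop.sh): one line to stdout, one to stderr.
emit() {
  echo "out"
  echo "err" >&2
}

tmp_old="$(mktemp)"
tmp_new="$(mktemp)"

# Old order: stderr is duplicated onto the current stdout first, and only
# then is stdout pointed at the file. stderr never reaches the file.
leaked="$(emit 2>&1 > "$tmp_old")"

# New order: stdout is pointed at the file first, then stderr is duplicated
# onto it. Both streams land in the file.
emit > "$tmp_new" 2>&1

old_contents="$(cat "$tmp_old")"
new_contents="$(cat "$tmp_new")"
rm -f "$tmp_old" "$tmp_new"

echo "old order, file:   $old_contents"
echo "old order, leaked: $leaked"
echo "new order, file:   $new_contents"
```

In the old form the agent's error output would have gone to the terminal or parent log rather than to `$output_file`, so failures were invisible to anything that only read the file.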
@@ -319,7 +319,7 @@ while [ "$ITERATION" -lt "$MAX_ITERATIONS" ]; do
|
|||||||
fi
|
fi
|
||||||
snapshot_for_archive
|
snapshot_for_archive
|
||||||
if any_stories_blocked 2>/dev/null; then
|
if any_stories_blocked 2>/dev/null; then
|
||||||
log "Some stories are blocked and need human review. Run /loop-triage for details."
|
log "Some stories are blocked and need human review. Run /agent-loop:triage for details."
|
||||||
exit $EXIT_ALL_BLOCKED
|
exit $EXIT_ALL_BLOCKED
|
||||||
fi
|
fi
|
||||||
exit $EXIT_OK
|
exit $EXIT_OK
|
||||||
@@ -364,7 +364,7 @@ while [ "$ITERATION" -lt "$MAX_ITERATIONS" ]; do
|
|||||||
# --- Scope budget check ---
|
# --- Scope budget check ---
|
||||||
# Verify the generator stayed within configured limits (files modified, lines written).
|
# Verify the generator stayed within configured limits (files modified, lines written).
|
||||||
# Advisory in implement/fix modes (log warning), but enforced as rejection reason for evaluator.
|
# Advisory in implement/fix modes (log warning), but enforced as rejection reason for evaluator.
|
||||||
if [ -n "$PRE_GENERATOR_SHA" ] && [ "$PRE_GENERATOR_SHA" != "" ]; then
|
if [ -n "$PRE_GENERATOR_SHA" ]; then
|
||||||
SCOPE_FILES_MODIFIED=$(git diff --name-only "$PRE_GENERATOR_SHA" HEAD 2>/dev/null | wc -l | tr -d ' ')
|
SCOPE_FILES_MODIFIED=$(git diff --name-only "$PRE_GENERATOR_SHA" HEAD 2>/dev/null | wc -l | tr -d ' ')
|
||||||
SCOPE_LINES_WRITTEN=$(git diff --stat "$PRE_GENERATOR_SHA" HEAD 2>/dev/null | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
|
SCOPE_LINES_WRITTEN=$(git diff --stat "$PRE_GENERATOR_SHA" HEAD 2>/dev/null | tail -1 | grep -oE '[0-9]+ insertion' | grep -oE '[0-9]+' || echo "0")
|
||||||
|
|
||||||
@@ -381,18 +381,9 @@ while [ "$ITERATION" -lt "$MAX_ITERATIONS" ]; do
|
|||||||
export SCOPE_FILES_MODIFIED SCOPE_LINES_WRITTEN
|
export SCOPE_FILES_MODIFIED SCOPE_LINES_WRITTEN
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Check for completion — in interactive mode, check prd.json directly
|
# NOTE: Do NOT check all_stories_pass here. The generator marks its own story
|
||||||
if all_stories_pass 2>/dev/null; then
|
# as passed, but the evaluator hasn't verified yet. Checking here would skip
|
||||||
log_header "All Stories Complete! ($(story_counts))"
|
# evaluation on the last story. The completion check is at the top of the loop.
|
||||||
snapshot_for_archive
|
|
||||||
exit 0
|
|
||||||
fi
|
|
||||||
# Headless mode: also check output sentinel
|
|
||||||
if [ -n "$GENERATOR_OUTPUT" ] && echo "$GENERATOR_OUTPUT" | grep -q "<promise>COMPLETE</promise>"; then
|
|
||||||
log_header "Generator signaled COMPLETE ($(story_counts))"
|
|
||||||
snapshot_for_archive
|
|
||||||
exit 0
|
|
||||||
fi
|
|
||||||
|
|
||||||
# --- Evaluator pass ---
|
# --- Evaluator pass ---
|
||||||
if [ "$SKIP_EVAL" != true ]; then
|
if [ "$SKIP_EVAL" != true ]; then
|
||||||
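The NOTE in this hunk describes an ordering problem: the generator marks its own story as passed, so a completion check placed between generator and evaluator exits before the last story is verified. The following is a simplified sketch of the corrected ordering. Every name in it is a hypothetical stand-in, not a function from loop.sh.

```shell
#!/bin/sh
# All names below are hypothetical stand-ins, not the functions in loop.sh.
stories_left=2   # stories not yet marked as passed
evaluated=0      # number of evaluator passes that ran

run_generator() { stories_left=$((stories_left - 1)); }  # generator marks its own story passed
run_evaluator() { evaluated=$((evaluated + 1)); }        # evaluator verifies that story
all_passed()    { [ "$stories_left" -eq 0 ]; }

iteration=0
while [ "$iteration" -lt 5 ]; do
  # The completion check lives only here, before any generator work.
  if all_passed; then
    break
  fi
  run_generator
  # No completion check between generator and evaluator,
  # so the final story is still evaluated.
  run_evaluator
  iteration=$((iteration + 1))
done

echo "iterations: $iteration, evaluator passes: $evaluated"
```

Had the check sat right after `run_generator`, the loop in this sketch would have stopped with only one evaluator pass for two stories.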
@@ -460,6 +451,6 @@ done
|
|||||||
# --- Max iterations reached ---
|
# --- Max iterations reached ---
|
||||||
log_header "Max Iterations Reached ($MAX_ITERATIONS)"
|
log_header "Max Iterations Reached ($MAX_ITERATIONS)"
|
||||||
log "Stories completed: $(story_counts)"
|
log "Stories completed: $(story_counts)"
|
||||||
log "Run /loop-triage to generate a handoff brief."
|
log "Run /agent-loop:triage to generate a handoff brief."
|
||||||
snapshot_for_archive
|
snapshot_for_archive
|
||||||
exit $EXIT_MAX_ITERATIONS
|
exit $EXIT_MAX_ITERATIONS
|
||||||
|
|||||||
@@ -6,7 +6,7 @@ You are evaluating an analysis/exploration task. The generator claims to have an
|
|||||||
|
|
||||||
Before any other checks, verify explore mode's read-only constraint:
|
Before any other checks, verify explore mode's read-only constraint:
|
||||||
1. Run `git diff {{PRE_GENERATOR_SHA}}..HEAD --name-only`
|
1. Run `git diff {{PRE_GENERATOR_SHA}}..HEAD --name-only`
|
||||||
2. If ANY file outside `.loop/triage/` was modified or committed, **REJECT immediately** — explore mode is read-only. The generator must not modify host project files.
|
2. If ANY file outside `.loop/` was modified or committed, **REJECT immediately** — explore mode is read-only. The generator must not modify host project files. (Files inside `.loop/` like `prd.json` and `progress.md` are expected.)
|
||||||
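The read-only rule above is an instruction to the evaluator agent, not a script. Expressed as shell, the relaxed check might look like the sketch below. The helper name and the sample file list are assumptions for illustration; in a real run the list would come from `git diff {{PRE_GENERATOR_SHA}}..HEAD --name-only`.

```shell
#!/bin/sh
# Hypothetical helper: reads changed paths on stdin, prints those outside .loop/.
files_outside_loop() {
  grep -v '^\.loop/' || true
}

# Sample input standing in for the output of `git diff ... --name-only`.
changed="$(printf '%s\n' '.loop/prd.json' '.loop/progress.md' 'src/app.js')"
violations="$(printf '%s\n' "$changed" | files_outside_loop)"

# Anything left after filtering breaks the read-only constraint.
if [ -n "$violations" ]; then
  readonly_check="REJECT"
else
  readonly_check="ok"
fi

echo "read-only check: $readonly_check"
echo "files outside .loop/: $violations"
```

Under the old wording, which only allowed `.loop/triage/`, the two `.loop/` bookkeeping files in this sample would also have counted as violations.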
|
|
||||||
## Exploration-Specific Checks
|
## Exploration-Specific Checks
|
||||||
|
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
# Planner Context
|
# Planner Context
|
||||||
|
|
||||||
This file is loaded by the `/loop-plan` skill to provide additional context for PRD generation.
|
This file provides additional context for PRD generation.
|
||||||
|
|
||||||
## Story Decomposition Guidelines
|
## Story Decomposition Guidelines
|
||||||
|
|
||||||
|
|||||||
4
setup.sh
@@ -1,7 +1,7 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
# Agent Loop — project setup script
|
# Agent Loop — project setup script
|
||||||
# Scaffolds .loop/ directory and generates config.json.
|
# Scaffolds .loop/ directory and generates config.json.
|
||||||
# Called by /agent-loop:init or run directly.
|
# Called by /agent-loop:setup or /agent-loop:run, or run directly.
|
||||||
#
|
#
|
||||||
# Usage:
|
# Usage:
|
||||||
# setup.sh <mode> # mode: implement, explore, or fix
|
# setup.sh <mode> # mode: implement, explore, or fix
|
||||||
@@ -120,5 +120,5 @@ echo "[setup] Mode: $MODE"
|
|||||||
echo "[setup] Config: .loop/config.json"
|
echo "[setup] Config: .loop/config.json"
|
||||||
echo ""
|
echo ""
|
||||||
echo "Next steps (in Claude Code):"
|
echo "Next steps (in Claude Code):"
|
||||||
echo " /agent-loop:plan # Generate stories from your spec or description"
|
echo " /agent-loop:stories # Generate stories from your spec or description"
|
||||||
echo ""
|
echo ""
|
||||||
|
|||||||
@@ -3,7 +3,7 @@ name: setup
|
|||||||
description: "Run the setup script to scaffold .loop/ directory. Does not plan features or write code."
|
description: "Run the setup script to scaffold .loop/ directory. Does not plan features or write code."
|
||||||
---
|
---
|
||||||
|
|
||||||
# /init — Scaffold the Agent Loop
|
# /setup — Scaffold the Agent Loop
|
||||||
|
|
||||||
Run the setup script to create `.loop/` with harness files and config. This skill does ONE thing: run a bash command.
|
Run the setup script to create `.loop/` with harness files and config. This skill does ONE thing: run a bash command.
|
||||||
|
|
||||||
|
|||||||
@@ -3,7 +3,7 @@ name: stories
|
|||||||
description: "Generate prd.json and sprint contracts by dispatching the planner agent. Does not write source code."
|
description: "Generate prd.json and sprint contracts by dispatching the planner agent. Does not write source code."
|
||||||
---
|
---
|
||||||
|
|
||||||
# /plan — Generate PRD and Sprint Contracts
|
# /stories — Generate PRD and Sprint Contracts
|
||||||
|
|
||||||
Dispatch the planner agent to decompose a spec into stories. The planner agent cannot write source code or run bash commands — it can only write to `.loop/`.
|
Dispatch the planner agent to decompose a spec into stories. The planner agent cannot write source code or run bash commands — it can only write to `.loop/`.
|
||||||
|
|
||||||
@@ -11,7 +11,7 @@ Dispatch the planner agent to decompose a spec into stories. The planner agent c
|
|||||||
|
|
||||||
### 1. Check prerequisites
|
### 1. Check prerequisites
|
||||||
|
|
||||||
Verify `.loop/config.json` exists. If not, tell the user to run `/agent-loop:init` first and stop.
|
Verify `.loop/config.json` exists. If not, tell the user to run `/agent-loop:setup` first and stop.
|
||||||
|
|
||||||
### 2. Find the spec
|
### 2. Find the spec
|
||||||
|
|
||||||
|
|||||||