diff --git a/README.md b/README.md
index 8f3506a..f4745be 100644
--- a/README.md
+++ b/README.md
@@ -8,10 +8,7 @@ A generator-evaluator loop runs fresh Claude Code sessions per iteration. Each i
 
 ## Install
 
-### As a Claude Code Plugin (Recommended)
-
 ```
-/plugin marketplace add https://git.jagfly.com/sheldon/loop-loop.git
 /plugin install agent-loop@agent-loop
 ```
 
@@ -23,16 +20,18 @@ Then in any project:
 
 That's it. The single command handles setup, planning, and execution.
 
-### Manual Install
+## Prerequisites
 
-```bash
-cp -r /path/to/loop-loop .loop
-```
-
-Then run `.loop/loop.sh` directly.
+- [Claude Code](https://docs.anthropic.com/en/docs/claude-code) CLI installed
+- `tmux` available (used to run the loop in a detachable session)
+- `jq` or `python3` (for JSON state management)
 
 ## How It Works
 
+1. Write a spec describing what you want to build (`SPEC.md`, `docs/specs/*.md`, or similar)
+2. Run `/agent-loop:run` — it scaffolds `.loop/`, generates stories from your spec, and presents them for review
+3. Say "go" — the loop launches in tmux and runs autonomously
+
 ```
 /agent-loop:run
   ├─ Phase 1: Scaffold .loop/ (if needed)
@@ -50,7 +49,7 @@ Then run `.loop/loop.sh` directly.
 
 | Mode | What it does | Git writes? |
 |------|-------------|-------------|
-| **implement** | Build features from a PRD | Yes |
+| **implement** | Build features from a spec | Yes |
 | **explore** | Read-only codebase analysis | No |
 | **fix** | Targeted bug fixes / tech debt | Yes |
 
@@ -80,6 +79,7 @@ For CI or background execution without the interactive UI:
 ```bash
 .loop/loop.sh --headless [options]
 
+--headless      Run without interactive UI
 --mode          Operating mode
 --max           Maximum iterations (default: 20)
 --skip-eval     Skip evaluator pass
@@ -89,12 +89,12 @@ For CI or background execution without the interactive UI:
 ## Architecture
 
 ### Generator
-Fresh Claude Code session each iteration. Reads `prd.json` to find the highest-priority incomplete story, reads the sprint contract, implements the story, runs quality gates, commits, and marks it done.
+Fresh Claude Code session each iteration. Follows a strict startup sequence: reads progress.md, finds the next story from prd.json, reads the sprint contract, checks for evaluator feedback, reviews git history, and runs a smoke test if available — all before writing any code. Then implements the story, runs quality gates, commits, and marks it done.
 
 ### Evaluator
-Separate fresh session after each generator pass. Skeptically verifies the work: checks acceptance criteria against actual code, runs tests and the application, and issues a `PASS` or `REJECT` verdict. Rejection sends the story back with specific feedback.
+Separate fresh session after each generator pass. Skeptically verifies the work: checks each acceptance criterion against actual code with file paths and line numbers, runs tests, and issues a `PASS` or `REJECT` verdict. Rejection sends the story back with specific feedback.
 
-Evaluator skepticism is deliberately tuned — Claude's default tendency is to rationalize away issues. The evaluator prompt includes explicit bias correction.
+Evaluator skepticism is deliberately tuned — Claude's default tendency is to rationalize away issues. The evaluator prompt includes explicit bias correction and few-shot calibration examples.
 
 ### Sprint Contracts
 Before the loop starts, the planner generates contracts for each story. These define "done" conditions that both generator and evaluator reference, eliminating ambiguity about whether work is complete.
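
The generator/evaluator cycle described in the Architecture hunk can be sketched roughly as below. This is an illustration only, not the plugin's actual code: the `prd.json` layout (a `stories` list whose entries carry `id`, `priority`, and `done`), the rule that a lower `priority` number wins, and the `generate`/`evaluate` callables standing in for the fresh Claude Code sessions are all assumptions. Only the `PASS`/`REJECT` verdicts, the default of 20 iterations, and the skip-evaluator option come from the README text.

```python
from typing import Callable, Optional, Tuple

Story = dict
Verdict = Tuple[str, str]  # ("PASS" | "REJECT", feedback text)


def next_story(prd: dict) -> Optional[Story]:
    """Return the highest-priority incomplete story, or None if all are done."""
    pending = [s for s in prd["stories"] if not s.get("done", False)]
    # Assumption: a lower number means a higher priority.
    return min(pending, key=lambda s: s["priority"]) if pending else None


def run_loop(
    prd: dict,
    generate: Callable[[Story, Optional[str]], None],
    evaluate: Callable[[Story], Verdict],
    max_iterations: int = 20,  # mirrors "--max (default: 20)"
    skip_eval: bool = False,   # mirrors "--skip-eval"
) -> list:
    """Run generator passes, each followed by an evaluator pass.

    Returns a log of (story id, verdict) pairs, one per iteration.
    """
    feedback: dict = {}  # story id -> evaluator notes from the last REJECT
    log = []
    for _ in range(max_iterations):
        story = next_story(prd)
        if story is None:
            break  # every story is marked done
        # Generator pass: sees any feedback left by a previous rejection.
        generate(story, feedback.get(story["id"]))
        # Evaluator pass: an independent check, unless it is skipped.
        verdict, notes = ("PASS", "") if skip_eval else evaluate(story)
        log.append((story["id"], verdict))
        if verdict == "PASS":
            story["done"] = True
            feedback.pop(story["id"], None)
        else:
            # A rejected story stays incomplete and is retried with feedback.
            feedback[story["id"]] = notes
    return log
```

In this sketch a rejected story is simply picked up again on the next iteration, because it is still the highest-priority incomplete one; a real loop might instead cap retries per story.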