docs: rewrite README for plugin-first install

- Remove manual install and install.sh references
- Add prerequisites section (tmux, jq/python3)
- Add step to write a spec before running
- Fix "PRD" → "spec" in modes table
- Add --headless to options list
- Update generator description with startup sequence
- Note evaluator calibration examples
This commit is contained in:
2026-03-28 12:01:05 -04:00
parent 60ce0fef54
commit b4d4e1952a

View File

@@ -8,10 +8,7 @@ A generator-evaluator loop runs fresh Claude Code sessions per iteration. Each i
## Install ## Install
### As a Claude Code Plugin (Recommended)
``` ```
/plugin marketplace add https://git.jagfly.com/sheldon/loop-loop.git
/plugin install agent-loop@agent-loop /plugin install agent-loop@agent-loop
``` ```
@@ -23,16 +20,18 @@ Then in any project:
That's it. The single command handles setup, planning, and execution. That's it. The single command handles setup, planning, and execution.
### Manual Install ## Prerequisites
```bash - [Claude Code](https://docs.anthropic.com/en/docs/claude-code) CLI installed
cp -r /path/to/loop-loop .loop - `tmux` available (used to run the loop in a detachable session)
``` - `jq` or `python3` (for JSON state management)
Then run `.loop/loop.sh` directly.
## How It Works ## How It Works
1. Write a spec describing what you want to build (`SPEC.md`, `docs/specs/*.md`, or similar)
2. Run `/agent-loop:run` — it scaffolds `.loop/`, generates stories from your spec, and presents them for review
3. Say "go" — the loop launches in tmux and runs autonomously
``` ```
/agent-loop:run /agent-loop:run
├─ Phase 1: Scaffold .loop/ (if needed) ├─ Phase 1: Scaffold .loop/ (if needed)
@@ -50,7 +49,7 @@ Then run `.loop/loop.sh` directly.
| Mode | What it does | Git writes? | | Mode | What it does | Git writes? |
|------|-------------|-------------| |------|-------------|-------------|
| **implement** | Build features from a PRD | Yes | | **implement** | Build features from a spec | Yes |
| **explore** | Read-only codebase analysis | No | | **explore** | Read-only codebase analysis | No |
| **fix** | Targeted bug fixes / tech debt | Yes | | **fix** | Targeted bug fixes / tech debt | Yes |
@@ -80,6 +79,7 @@ For CI or background execution without the interactive UI:
```bash ```bash
.loop/loop.sh --headless [options] .loop/loop.sh --headless [options]
--headless Run without interactive UI
--mode <implement|explore|fix> Operating mode --mode <implement|explore|fix> Operating mode
--max <N> Maximum iterations (default: 20) --max <N> Maximum iterations (default: 20)
--skip-eval Skip evaluator pass --skip-eval Skip evaluator pass
@@ -89,12 +89,12 @@ For CI or background execution without the interactive UI:
## Architecture ## Architecture
### Generator ### Generator
Fresh Claude Code session each iteration. Reads `prd.json` to find the highest-priority incomplete story, reads the sprint contract, implements the story, runs quality gates, commits, and marks it done. Fresh Claude Code session each iteration. Follows a strict startup sequence: reads progress.md, finds the next story from prd.json, reads the sprint contract, checks for evaluator feedback, reviews git history, and runs a smoke test if available — all before writing any code. Then implements the story, runs quality gates, commits, and marks it done.
### Evaluator ### Evaluator
Separate fresh session after each generator pass. Skeptically verifies the work: checks acceptance criteria against actual code, runs tests and the application, and issues a `PASS` or `REJECT` verdict. Rejection sends the story back with specific feedback. Separate fresh session after each generator pass. Skeptically verifies the work: checks each acceptance criterion against actual code with file paths and line numbers, runs tests, and issues a `PASS` or `REJECT` verdict. Rejection sends the story back with specific feedback.
Evaluator skepticism is deliberately tuned — Claude's default tendency is to rationalize away issues. The evaluator prompt includes explicit bias correction. Evaluator skepticism is deliberately tuned — Claude's default tendency is to rationalize away issues. The evaluator prompt includes explicit bias correction and few-shot calibration examples.
### Sprint Contracts ### Sprint Contracts
Before the loop starts, the planner generates contracts for each story. These define "done" conditions that both generator and evaluator reference, eliminating ambiguity about whether work is complete. Before the loop starts, the planner generates contracts for each story. These define "done" conditions that both generator and evaluator reference, eliminating ambiguity about whether work is complete.