loop-loop

Author	SHA1	Message	Date
Sheldon Finlay	344b179b4d	feat: support parallel loops with per-project tmux session names The tmux session name is now derived from the project directory name (e.g., agent-loop-server, agent-loop-webapp). This allows running multiple loops in parallel on different projects without collisions. Previously hardcoded to "agent-loop", which meant launching a second loop would kill the first project's tmux session.	2026-04-02 10:54:22 -04:00
Sheldon Finlay	b516492a91	fix: install Stop hook once at loop startup, not per-iteration Per-iteration install/remove had a race condition: settings.local.json was written immediately before CC started, and CC could read the old file (without the hook) on the first iteration. Now the hook is installed once when loop.sh starts and removed on exit. The AGENT_LOOP_ACTIVE env var guard ensures it only fires for CC sessions spawned by the loop, so keeping it installed the whole time is safe.	2026-04-02 10:51:48 -04:00
Sheldon Finlay	a1a3dfbd63	fix: use env var instead of tmux check for Stop hook scoping The tmux display-message approach had edge cases: it could succeed outside tmux, fail on first iteration, or behave differently depending on tmux socket state. Replace with AGENT_LOOP_ACTIVE env var exported by loop.sh. CC sessions spawned by the loop inherit it; interactive CC sessions don't. Simple, no external dependencies, no race conditions.	2026-04-02 10:42:46 -04:00
Sheldon Finlay	bab002b927	fix: prevent Stop hook from killing sessions outside tmux tmux display-message succeeds even outside tmux by falling back to the most recently created session (agent-loop). This caused the hook to match and kill interactive CC sessions. Fix: check $TMUX env var first — only set when actually inside tmux.	2026-04-02 09:14:43 -04:00
Sheldon Finlay	71b00cf11f	feat: auto-update harness files when plugin version changes setup.sh now stamps .harness-version in .loop/ at scaffold time. On each /agent-loop:run, Phase 1 compares the installed harness version against the plugin version and auto-updates lib/, prompts/, and loop.sh if stale. Run state (prd.json, contracts, config.json) is preserved. Also adds setup.sh --update mode for refreshing harness files without re-scaffolding. Bump to 0.10.0.	2026-04-02 09:02:41 -04:00
Sheldon Finlay	1bd8004854	fix: scope Stop hook to agent-loop tmux session only The Stop hook (kill -INT $PPID) was written to the project's settings.local.json, causing ANY Claude Code session in the same project to kill its parent shell on exit — not just the loop's sessions. Now the hook checks tmux session name before firing: only CC sessions inside the "agent-loop" tmux session trigger the kill. Other CC sessions in the same project are unaffected.	2026-04-02 08:17:15 -04:00
Sheldon Finlay	ad58a49182	feat: auto-archive completed runs before starting new features When /agent-loop:run detects a previous run with all stories passed (or the feature branch deleted after merge), it archives the old artifacts and resets .loop/ automatically — no more manual rm -rf .loop. - Add archive_and_reset() for on-demand archiving from skills - Add runs.log index tracking all archived runs - Update /run and /stories skills to detect completed runs - setup.sh archives instead of hard-failing when prd.json exists - Bump version to 0.9.0	2026-04-02 07:40:07 -04:00
Sheldon Finlay	ce111b4cbe	feat: add guidance for subjective acceptance criteria Planner now has examples for design/UX criteria that are evaluable without being purely binary. Prevents the planner from avoiding qualitative criteria just because they aren't grep-checkable.	2026-03-28 12:59:42 -04:00
Sheldon Finlay	77fd9e0cd6	feat: add concrete examples of good vs bad acceptance criteria Planner now sees specific examples of verifiable criteria (grep, test commands, file checks) alongside vague anti-patterns. Drives higher story quality which directly improves evaluator accuracy.	2026-03-28 12:56:53 -04:00
Sheldon Finlay	1efca3c185	feat: add blocker handling and artifact protection to generator Generator now has explicit instructions for when it's stuck: write the blocker to notes, leave passes as false, and stop. Also adds a "Do Not Modify" section preventing changes to other stories, contracts, or config.	2026-03-28 12:40:05 -04:00
Sheldon Finlay	e4df81fdac	feat: add self-verification gate before generator marks story done Generator must now verify each acceptance criterion against actual code before setting passes: true. Acts as a first filter before the evaluator runs, reducing false completions.	2026-03-28 12:36:24 -04:00
Sheldon Finlay	6833d94cf4	docs: mention using Claude or /plan to generate specs	2026-03-28 12:26:40 -04:00
Sheldon Finlay	c293f53d90	docs: make runtime verification claim accurate Only claim what the evaluator actually does: runs tests, builds, and checks for errors. Don't overstate MCP server discovery.	2026-03-28 12:20:31 -04:00
Sheldon Finlay	9fd428ac51	docs: replace specific MCP recommendations with general guidance Avoid maintaining specific install commands that will go stale. The evaluator uses whatever tools are available — let users configure their own testing environment.	2026-03-28 12:19:50 -04:00
Sheldon Finlay	c46de6815c	refactor: remove headless mode Headless mode was half-built and untested. Agent-loop is a plugin that runs interactively via tmux — there's no CI use case yet. Removes --headless flag, timeout compatibility shim, output capture logic, and LOOP_AGENT_TMPFILE handling. Cuts 82 lines from loop.sh.	2026-03-28 12:17:30 -04:00
Sheldon Finlay	b4d4e1952a	docs: rewrite README for plugin-first install - Remove manual install and install.sh references - Add prerequisites section (tmux, jq/python3) - Add step to write a spec before running - Fix "PRD" → "spec" in modes table - Add --headless to options list - Update generator description with startup sequence - Note evaluator calibration examples	2026-03-28 12:01:05 -04:00
Sheldon Finlay	60ce0fef54	fix: tighten vague language across all prompt files - Remove blanket "write tests" instructions; tests only when acceptance criteria require them - Replace arbitrary "30-50% rejection rate" with clear directive - Replace "4/5 threshold" with "majority of claims" rule - List concrete quality gate commands instead of "whatever project uses" - Remove "learnings" from progress summary (too vague) - Make error-leak pattern generic (not HTTP-specific) - Align fix evaluator with updated test expectations	2026-03-28 11:58:13 -04:00
Sheldon Finlay	f26bdce534	fix: replace misleading context budget percentages with scope guidance The planner prompt had vague context window budget percentages that don't reflect how agents actually work. Replaced with concrete scope guidance (keep stories to ~10 files) which aligns with the existing scope budgets in config.json.	2026-03-28 11:49:04 -04:00
Sheldon Finlay	2dc291aac4	fix: make evaluator calibration examples project-agnostic Replace ChaosRush-specific references with generic examples that apply to any codebase.	2026-03-28 11:21:11 -04:00
Sheldon Finlay	1d059e218b	feat: add few-shot calibration examples to evaluator prompt Three examples showing bad rubber-stamp, good rejection, and good pass patterns. Based on Anthropic's harness design recommendation to calibrate evaluators with few-shot score breakdowns, and informed by real failures observed in a production loop run.	2026-03-28 11:15:52 -04:00
Sheldon Finlay	80b0f0f4c1	feat: add regression patterns to evaluator implement prompt Three new failure patterns: missing imports after refactoring, orphaned resource instances, and error detail leakage. These were observed in a real loop run where the evaluator missed them.	2026-03-28 10:57:44 -04:00
Sheldon Finlay	5e4ad3b12e	feat: add smoke test step to generator startup sequence Generator now runs a quick health check before implementing if the project has tests or a dev server. Catches regressions from previous iterations early instead of building on a broken foundation.	2026-03-27 21:09:36 -04:00
Sheldon Finlay	9a7fa3a1bd	fix: enforce strict orientation sequence in generator prompt Add git log step and explicit gate requiring all startup steps complete before implementation begins. Based on Anthropic's prompting guide recommendation for prescriptive session orientation.	2026-03-27 21:07:48 -04:00
Sheldon Finlay	50e62ca979	fix: correct URLs, author name, and clean up stale hook - Revert plugin/README/CONTRIBUTING URLs to git.jagfly.com (not on GitHub yet) - Fix LICENSE copyright to Sheldon Finlay - Remove leftover Stop hook from settings.local.json	2026-03-27 19:00:26 -04:00
Sheldon Finlay	d8c95397f2	feat: US-008 - Add CONTRIBUTING.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 18:51:33 -04:00
Sheldon Finlay	410c17b3b3	feat: US-007 - Increase evalRetries default from 2 to 3	2026-03-27 18:49:40 -04:00
Sheldon Finlay	25d53a6b4f	feat: US-006 - Improve init.sh.example with project-type guidance	2026-03-27 18:47:44 -04:00
Sheldon Finlay	6b6cf842b9	feat: US-005 - Add MIT LICENSE file Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 18:46:38 -04:00
Sheldon Finlay	978783d1be	feat: US-004 - Update plugin URLs from jagfly.com to GitHub Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 18:44:59 -04:00
Sheldon Finlay	a4e9c4de05	feat: US-003 - Clarify .loop/ changes are expected in explore evaluator	2026-03-27 18:42:46 -04:00
Sheldon Finlay	3c518794ee	feat: US-002 - Guard against data loss in archive.sh	2026-03-27 18:40:31 -04:00
Sheldon Finlay	a935997ac4	feat: US-001 - Clean up settings.local.json Remove hardcoded development paths (ralph-loop, loop-test2) and absolute-path permissions from the allow list, keeping only project-agnostic and relative-path permissions.	2026-03-27 18:38:33 -04:00
Sheldon Finlay	e3554010dd	fix: auto-close finish screen after 30s so background watcher fires	2026-03-27 18:18:07 -04:00
Sheldon Finlay	3d86562205	fix: scope Stop hook to per-agent — prevents killing orchestrating CC session	2026-03-27 16:06:48 -04:00
Sheldon Finlay	a9af753a2e	fix: setup.sh initializes git repo if none exists	2026-03-27 15:47:43 -04:00
Sheldon Finlay	6b13fc3d38	feat: background watcher notifies CC session when loop completes	2026-03-27 15:22:43 -04:00
Sheldon Finlay	ddd8790481	docs: note that each loop session is resumable via claude -r	2026-03-27 15:20:05 -04:00
Sheldon Finlay	f1fde5cb01	fix: show summary and pause on loop exit — tmux doesn't vanish abruptly	2026-03-27 15:17:13 -04:00
Sheldon Finlay	bc7a1e2f04	fix: require spec file before story generation — don't reinvent planning	2026-03-27 15:08:30 -04:00
Sheldon Finlay	b3d263258a	fix: critical bugs, stale refs, README rewrite, security fixes - Fix evaluator bypass on last story (moved completion check) - Fix all stale command name references across README, loop.sh, skills, plugin.json - Fix explore evaluator false rejects (.loop/ files are expected) - Fix stderr capture order in headless mode - Fix shell injection risk in hooks.sh python fallback - Remove .DS_Store from tracking - Rewrite README to match current architecture (single entry point, tmux, optional tools) - Add XcodeBuildMCP and iOS simulator MCP to optional tools docs	2026-03-27 14:58:01 -04:00
Sheldon Finlay	f3cbfd258c	refactor: remove domain-specific language from prompts — fully universal	2026-03-27 14:50:52 -04:00
Sheldon Finlay	48bc656cd8	refactor: trim generator and evaluator prompts — cut total in half	2026-03-27 14:48:42 -04:00
Sheldon Finlay	5f8a34cc7b	fix: simplify evaluator runtime verification — let claude figure out the tools	2026-03-27 14:45:55 -04:00
Sheldon Finlay	ee08e3617c	feat: evaluator runtime verification for web projects, optional Playwright docs	2026-03-27 14:30:09 -04:00
Sheldon Finlay	18d95fed0d	fix: don't capture stdout in interactive mode — run claude directly so UI renders	2026-03-27 13:34:54 -04:00
Sheldon Finlay	994908aed2	feat: adopt Ralph pattern — pipe to claude (no --print), working Stop hook	2026-03-27 13:24:13 -04:00
Sheldon Finlay	1e7f7ea6ed	feat: true interactive mode — run claude directly, verdict via file, no script/capture	2026-03-27 13:07:25 -04:00
Sheldon Finlay	5e456cff6d	fix: drop osascript, use universal ! tmux attach approach	2026-03-27 12:53:26 -04:00
Sheldon Finlay	4a6ddaa193	fix: pass prompt as CLI arg instead of stdin to preserve interactive UI	2026-03-27 12:49:42 -04:00
Sheldon Finlay	8129b5736b	fix: platform-aware terminal launch — osascript on macOS, fallback on Linux	2026-03-27 12:42:01 -04:00
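
The two most recent Stop hook and tmux changes (344b179b4d and a1a3dfbd63) describe mechanisms that are easy to misread from the commit messages alone. The following is a minimal sketch of both ideas, not the code in loop.sh. The function names `session_name_for` and `stop_hook_should_fire` are hypothetical, and the `agent-loop-` prefix is an assumption based on the example names in the commit message. Only the `AGENT_LOOP_ACTIVE` variable name comes from the log itself.

```shell
#!/usr/bin/env bash
# Illustrative sketch only. Function names and the "agent-loop-" prefix
# are assumptions; the real loop.sh may differ.

# Derive a per-project tmux session name from the project directory,
# so that loops on different projects do not share one session.
session_name_for() {
  local base
  base="$(basename "$1")"
  # ':' and '.' act as separators in tmux targets, so swap them out.
  base="${base//[.:]/_}"
  printf 'agent-loop-%s\n' "$base"
}

# Guard for the Stop hook: succeed only when the loop has exported
# AGENT_LOOP_ACTIVE. Sessions started outside the loop never inherit
# the variable, so the hook stays inert for them.
stop_hook_should_fire() {
  [ -n "${AGENT_LOOP_ACTIVE:-}" ]
}
```

Under these assumptions, a loop script would run `export AGENT_LOOP_ACTIVE=1` once at startup and create its session with something like `tmux new-session -d -s "$(session_name_for "$PWD")"`. The hook command would then call the guard before sending any signal to its parent.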


72 Commits