f26bdce534
fix: replace misleading context budget percentages with scope guidance
The planner prompt had vague context window budget percentages that
don't reflect how agents actually work. Replaced with concrete
scope guidance (keep stories to ~10 files) which aligns with the
existing scope budgets in config.json.
2026-03-28 11:49:04 -04:00
2dc291aac4
fix: make evaluator calibration examples project-agnostic
Replace ChaosRush-specific references with generic examples
that apply to any codebase.
2026-03-28 11:21:11 -04:00
1d059e218b
feat: add few-shot calibration examples to evaluator prompt
Three examples showing bad rubber-stamp, good rejection, and good
pass patterns. Based on Anthropic's harness design recommendation
to calibrate evaluators with few-shot score breakdowns, and
informed by real failures observed in a production loop run.
2026-03-28 11:15:52 -04:00
80b0f0f4c1
feat: add regression patterns to evaluator implement prompt
Three new failure patterns: missing imports after refactoring,
orphaned resource instances, and error detail leakage. These were
observed in a real loop run where the evaluator missed them.
2026-03-28 10:57:44 -04:00
5e4ad3b12e
feat: add smoke test step to generator startup sequence
Generator now runs a quick health check before implementing if the
project has tests or a dev server. Catches regressions from previous
iterations early instead of building on a broken foundation.
2026-03-27 21:09:36 -04:00
9a7fa3a1bd
fix: enforce strict orientation sequence in generator prompt
Add git log step and explicit gate requiring all startup steps
complete before implementation begins. Based on Anthropic's
prompting guide recommendation for prescriptive session orientation.
2026-03-27 21:07:48 -04:00
50e62ca979
fix: correct URLs, author name, and clean up stale hook
- Revert plugin/README/CONTRIBUTING URLs to git.jagfly.com (not on GitHub yet)
- Fix LICENSE copyright to Sheldon Finlay
- Remove leftover Stop hook from settings.local.json
2026-03-27 19:00:26 -04:00
d8c95397f2
feat: US-008 - Add CONTRIBUTING.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:51:33 -04:00
410c17b3b3
feat: US-007 - Increase evalRetries default from 2 to 3
2026-03-27 18:49:40 -04:00
25d53a6b4f
feat: US-006 - Improve init.sh.example with project-type guidance
2026-03-27 18:47:44 -04:00
6b6cf842b9
feat: US-005 - Add MIT LICENSE file
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:46:38 -04:00
978783d1be
feat: US-004 - Update plugin URLs from jagfly.com to GitHub
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:44:59 -04:00
a4e9c4de05
feat: US-003 - Clarify .loop/ changes are expected in explore evaluator
2026-03-27 18:42:46 -04:00
3c518794ee
feat: US-002 - Guard against data loss in archive.sh
2026-03-27 18:40:31 -04:00
a935997ac4
feat: US-001 - Clean up settings.local.json
Remove hardcoded development paths (ralph-loop, loop-test2) and
absolute-path permissions from the allow list, keeping only
project-agnostic and relative-path permissions.
2026-03-27 18:38:33 -04:00
e3554010dd
fix: auto-close finish screen after 30s so background watcher fires
2026-03-27 18:18:07 -04:00
3d86562205
fix: scope Stop hook to per-agent — prevents killing orchestrating CC session
2026-03-27 16:06:48 -04:00
a9af753a2e
fix: setup.sh initializes git repo if none exists
2026-03-27 15:47:43 -04:00
6b13fc3d38
feat: background watcher notifies CC session when loop completes
2026-03-27 15:22:43 -04:00
ddd8790481
docs: note that each loop session is resumable via claude -r
2026-03-27 15:20:05 -04:00
f1fde5cb01
fix: show summary and pause on loop exit — tmux doesn't vanish abruptly
2026-03-27 15:17:13 -04:00
bc7a1e2f04
fix: require spec file before story generation — don't reinvent planning
2026-03-27 15:08:30 -04:00
b3d263258a
fix: critical bugs, stale refs, README rewrite, security fixes
- Fix evaluator bypass on last story (moved completion check)
- Fix all stale command name references across README, loop.sh, skills, plugin.json
- Fix explore evaluator false rejects (.loop/ files are expected)
- Fix stderr capture order in headless mode
- Fix shell injection risk in hooks.sh python fallback
- Remove .DS_Store from tracking
- Rewrite README to match current architecture (single entry point, tmux, optional tools)
- Add XcodeBuildMCP and iOS simulator MCP to optional tools docs
2026-03-27 14:58:01 -04:00
f3cbfd258c
refactor: remove domain-specific language from prompts — fully universal
2026-03-27 14:50:52 -04:00
48bc656cd8
refactor: trim generator and evaluator prompts — cut total in half
2026-03-27 14:48:42 -04:00
5f8a34cc7b
fix: simplify evaluator runtime verification — let claude figure out the tools
2026-03-27 14:45:55 -04:00
ee08e3617c
feat: evaluator runtime verification for web projects, optional Playwright docs
2026-03-27 14:30:09 -04:00
18d95fed0d
fix: don't capture stdout in interactive mode — run claude directly so UI renders
2026-03-27 13:34:54 -04:00
994908aed2
feat: adopt Ralph pattern — pipe to claude (no --print), working Stop hook
2026-03-27 13:24:13 -04:00
1e7f7ea6ed
feat: true interactive mode — run claude directly, verdict via file, no script/capture
2026-03-27 13:07:25 -04:00
5e456cff6d
fix: drop osascript, use universal ! tmux attach approach
2026-03-27 12:53:26 -04:00
4a6ddaa193
fix: pass prompt as CLI arg instead of stdin to preserve interactive UI
2026-03-27 12:49:42 -04:00
8129b5736b
fix: platform-aware terminal launch — osascript on macOS, fallback on Linux
2026-03-27 12:42:01 -04:00
d457344806
feat: auto-open terminal window attached to tmux session
2026-03-27 12:41:02 -04:00
2a02a54b9d
feat: interactive mode — full CC sessions visible in tmux, headless mode via --headless flag
2026-03-27 12:36:56 -04:00
a3cf3e7bae
fix: add macOS timeout compatibility (gtimeout or perl fallback)
2026-03-27 12:24:53 -04:00
0666903b5f
fix: launch tmux detached, prompt user to attach with ! prefix
2026-03-27 12:14:55 -04:00
e810d1a1db
fix: attach to tmux session instead of detaching
2026-03-27 12:10:12 -04:00
a2b4369035
feat: launch execution in tmux, orchestrator monitors progress
2026-03-27 11:48:15 -04:00
f867630639
fix: use bypassPermissions for generator/evaluator agents (autonomous mode)
2026-03-27 10:14:11 -04:00
9508ad20b6
fix: rename init to setup to avoid built-in /init conflict
2026-03-27 10:01:50 -04:00
2a78915dcf
feat: single entry point /agent-loop:run handles setup, planning, and execution
2026-03-27 09:53:52 -04:00
381741509d
fix: rename generate to stories to avoid autocomplete issues
2026-03-27 09:49:10 -04:00
8c4e123976
fix: rename plan skill to generate to avoid name collision with built-in /plan
2026-03-27 09:39:13 -04:00
e9d87fa6a1
chore: bump to 0.3.0
2026-03-27 09:28:06 -04:00
86b2b7271b
feat: bash setup script, planner agent with disallowedTools, simplified skills
2026-03-27 09:23:42 -04:00
53086c9dbc
fix: radically simplify skills — each does exactly one thing, no chaining, explicit boundaries
2026-03-27 09:03:47 -04:00
fee323a2d6
fix: tighten skill specs — exact prd.json schema, explicit scaffold, validation
2026-03-27 08:57:40 -04:00
fe14d81073
fix: init skill avoids brainstorming interception, detects existing specs
2026-03-27 08:46:18 -04:00
2c8ea90176
fix: plan skill requires explicit user review before execution
2026-03-27 08:41:11 -04:00