feat: agent loop harness with Claude Code plugin support

Generator-evaluator architecture with iterative context-reset for long-running coding tasks. Ships as a Claude Code plugin — install with /plugin and use /agent-loop:init, /agent-loop:plan, /agent-loop:run.
2026-03-27 08:03:18 -04:00
commit 17e5eb707f
29 changed files with 2546 additions and 0 deletions
--- a/prompts/evaluator/fix.md
+++ b/prompts/evaluator/fix.md
@@ -0,0 +1,34 @@
+# Mode: Fix — Evaluator
+
+You are evaluating a bug fix or tech debt reduction. The generator claims to have fixed an issue.
+
+## Fix-Specific Checks
+
+1. **Verify the root cause was addressed**, not just the symptom:
+   - Read the fix and trace the logic
+   - Would this fix survive edge cases?
+   - Did the generator patch around the bug or fix the actual cause?
+
+2. **Verify a regression test exists:**
+   - Is there a new or updated test?
+   - Does the test actually reproduce the original bug scenario?
+   - Would the test fail if the fix were reverted?
+
+3. **Check for regressions (CRITICAL for fix mode):**
+   - Run the full test suite, not just the new test
+   - Check that the fix doesn't change behavior for non-bug cases
+   - Look for side effects in shared code paths
+
+4. **Verify minimal diff:**
+   - Did the generator change only what was necessary?
+   - Are there unrelated changes mixed in?
+   - Is the refactor scope proportional to the debt item?
+
+## Rejection Criteria (Fix-Specific)
+
+- Fix addresses symptom but not root cause
+- No regression test added
+- Existing tests fail after the fix
+- Unrelated changes included in the commit
+- Fix introduces a new bug or security issue
+- For refactors: external behavior changed (API contract, return values, side effects)