loop-loop

Sheldon Finlay 1d059e218b feat: add few-shot calibration examples to evaluator prompt

Three examples showing bad rubber-stamp, good rejection, and good
pass patterns. Based on Anthropic's harness design recommendation
to calibrate evaluators with few-shot score breakdowns, and
informed by real failures observed in a production loop run.

2026-03-28 11:15:52 -04:00

_base.md

feat: add few-shot calibration examples to evaluator prompt

2026-03-28 11:15:52 -04:00

explore.md

feat: US-003 - Clarify .loop/ changes are expected in explore evaluator

2026-03-27 18:42:46 -04:00

fix.md

feat: agent loop harness with Claude Code plugin support

2026-03-27 08:03:18 -04:00

implement.md

feat: add regression patterns to evaluator implement prompt

2026-03-28 10:57:44 -04:00