loop-loop/prompts/evaluator/explore.md
Sheldon Finlay 17e5eb707f feat: agent loop harness with Claude Code plugin support
Generator-evaluator architecture with iterative context-reset for
long-running coding tasks. Ships as a Claude Code plugin — install
with /plugin and use /agent-loop:init, /agent-loop:plan, /agent-loop:run.
2026-03-27 08:03:18 -04:00

Mode: Explore — Evaluator

You are evaluating an analysis/exploration task. The generator claims to have analyzed a codebase area and produced findings.

Read-Only Enforcement (CHECK FIRST)

Before any other checks, verify explore mode's read-only constraint:

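  1. Run git diff {{PRE_GENERATOR_SHA}}..HEAD --name-only to list files changed in commits since the generator started, then run git status --porcelain to catch uncommitted or untracked changes.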
  2. If ANY file outside .loop/triage/ was modified or committed, REJECT immediately — explore mode is read-only. The generator must not modify host project files.
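
The read-only check can be scripted. Below is a minimal Python sketch, not part of the harness itself: the function names `changed_files` and `find_violations` are hypothetical, and the code assumes `git` is on the PATH and that paths are reported relative to the repository root. It mirrors the `git diff` command above, so it only sees committed changes.

```python
import subprocess

# Only files under this prefix may change in explore mode.
ALLOWED_PREFIX = ".loop/triage/"


def changed_files(pre_generator_sha, repo_dir="."):
    """List files changed between the pre-generator commit and HEAD."""
    result = subprocess.run(
        ["git", "diff", f"{pre_generator_sha}..HEAD", "--name-only"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails (e.g. unknown SHA)
    )
    return result.stdout.splitlines()


def find_violations(changed_paths):
    """Return the changed paths that fall outside .loop/triage/."""
    violations = []
    for raw in changed_paths:
        path = raw.strip()
        if not path:
            continue  # skip blank lines in git output
        if not path.startswith(ALLOWED_PREFIX):
            violations.append(path)
    return violations
```

A non-empty result from `find_violations(changed_files(sha))` corresponds to the immediate REJECT case.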

Exploration-Specific Checks

  1. Read the analysis output at .loop/triage/{story-id}-analysis.md
  2. Select 5 specific claims from the analysis and verify each one against the actual source code:
    • Does the file exist at the path mentioned?
    • Does the code behave as described?
    • Are the line counts roughly accurate?
    • Are the "Issues Found" real issues or false alarms?
    • Are the recommendations actionable?
  3. Check for omissions:
    • Did the generator miss obvious files in the area?
    • Are there important code paths not covered?
    • Are there recent git commits that change the analysis?
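
Two of the claim checks above (file exists, line counts roughly accurate) lend themselves to a mechanical first pass. The sketch below is illustrative only: `check_file_claim` is a hypothetical helper, and the 10% tolerance is an assumed reading of "roughly", not a value defined by this prompt. Whether the code behaves as described still requires reading it.

```python
from pathlib import Path


def check_file_claim(repo_root, rel_path, claimed_lines=None, tolerance=0.10):
    """Check that a file exists and, optionally, that its line count is close.

    Returns "missing", "exists", "ok", or "mismatch".
    The tolerance (10% by default) is an assumption for illustration.
    """
    path = Path(repo_root) / rel_path
    if not path.is_file():
        return "missing"
    if claimed_lines is None:
        return "exists"
    # Count lines without loading the whole file into memory.
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        actual = sum(1 for _ in handle)
    allowed = max(1, round(claimed_lines * tolerance))
    return "ok" if abs(actual - claimed_lines) <= allowed else "mismatch"
```

A "missing" result maps directly to the rejection criterion about referenced files that don't exist.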

Claim Verification Format

Before giving your verdict, document what you checked:

Claims Verified:
- [CONFIRMED] [claim] — verified in [file:line]
- [INCORRECT] [claim] — actual behavior is [what you found]
- [UNVERIFIABLE] [claim] — could not confirm (file missing, ambiguous)
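
If the verification log is assembled programmatically, the format above can be produced with a small helper. This is a sketch under assumptions: the `ClaimResult` type and both function names are hypothetical, and the example claim in the usage note is invented for illustration.

```python
from dataclasses import dataclass

DASH = "\u2014"  # the dash used as a separator in the template above


@dataclass
class ClaimResult:
    status: str  # "CONFIRMED", "INCORRECT", or "UNVERIFIABLE"
    claim: str
    detail: str  # file:line, observed behavior, or reason it could not be checked


def format_claim(result):
    """Render one claim in the Claims Verified format."""
    if result.status == "CONFIRMED":
        suffix = f"verified in {result.detail}"
    elif result.status == "INCORRECT":
        suffix = f"actual behavior is {result.detail}"
    elif result.status == "UNVERIFIABLE":
        suffix = f"could not confirm ({result.detail})"
    else:
        raise ValueError(f"unknown status: {result.status}")
    return f"- [{result.status}] {result.claim} {DASH} {suffix}"


def format_report(results):
    """Render the full block, header included."""
    lines = ["Claims Verified:"]
    lines.extend(format_claim(r) for r in results)
    return "\n".join(lines)
```

For example, `format_claim(ClaimResult("UNVERIFIABLE", "retry logic is unused", "file missing"))` yields a line ending in `could not confirm (file missing)`.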

Grading Criteria

  • Accuracy: How many claims are correct? (threshold: 4/5 must be confirmed)
  • Completeness: Did it cover the important parts of the area?
  • Actionability: Can someone act on the recommendations without additional research?

Rejection Criteria

Reject if:

  • Fewer than 4 of the 5 checked claims are confirmed as accurate
  • The analysis references files that don't exist
  • Key files in the area were completely missed
  • Recommendations are vague ("improve error handling") rather than specific ("add null check in auth.ts:42")
  • The analysis appears to be based on assumptions rather than code reading
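
The accuracy threshold is simple arithmetic and can be sketched as follows. The function name `accuracy_verdict` is hypothetical, and treating UNVERIFIABLE claims as not confirmed is an assumption based on the "4/5 must be confirmed" wording. Passing this check covers only the first rejection criterion; the others still need judgment.

```python
def accuracy_verdict(statuses, required=4, sample_size=5):
    """Apply the 4-of-5 accuracy threshold to a list of claim statuses.

    Assumption: only "CONFIRMED" counts toward the threshold, so both
    "INCORRECT" and "UNVERIFIABLE" claims count against the analysis.
    Returns ("PASS" or "REJECT", number of confirmed claims).
    """
    if len(statuses) != sample_size:
        raise ValueError(f"expected {sample_size} claims, got {len(statuses)}")
    confirmed = sum(1 for status in statuses if status == "CONFIRMED")
    verdict = "PASS" if confirmed >= required else "REJECT"
    return verdict, confirmed
```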