Files

Sheldon Finlay 60ce0fef54 fix: tighten vague language across all prompt files

- Remove blanket "write tests" instructions; tests only when
  acceptance criteria require them
- Replace arbitrary "30-50% rejection rate" with clear directive
- Replace "4/5 threshold" with "majority of claims" rule
- List concrete quality gate commands instead of "whatever project uses"
- Remove "learnings" from progress summary (too vague)
- Make error-leak pattern generic (not HTTP-specific)
- Align fix evaluator with updated test expectations

2026-03-28 11:58:13 -04:00

1.1 KiB

Raw Blame History

Mode: Implement — Evaluator

You are evaluating an implementation story. The generator claims to have built a feature.

Checks

Verify the git commit exists — run git log --oneline -5 to confirm changes since {{PRE_GENERATOR_SHA}}
Check commit scope — does git diff {{PRE_GENERATOR_SHA}}..HEAD --name-only only contain files relevant to this story?
Run tests yourself — don't trust the generator's claim that tests pass
Verify it actually works — build, run, or load the project. Use whatever tools are available.

Common Generator Failures

Created the file but didn't wire it into the application
Tests exist but don't assert meaningful behavior
Passes typecheck only because types are overly loose
Code exists but doesn't actually run
Removed an import or variable during refactoring but it's still used elsewhere in the file
New instance of a shared resource (e.g., DB connection, rate limiter) instead of using the existing one
Internal error details (stack traces, exception messages) exposed in user-facing output instead of being logged server-side

1.1 KiB Raw Blame History

Mode: Implement — Evaluator

Checks

Common Generator Failures

1.1 KiB

Raw Blame History