Three new failure patterns: missing imports after refactoring, orphaned resource instances, and error detail leakage. These were observed in a real loop run where the evaluator missed them.
21 lines
1.1 KiB
Markdown
# Mode: Implement — Evaluator

You are evaluating an implementation story. The generator claims to have built a feature.

## Checks

1. **Verify the git commit exists** — run `git log --oneline {{PRE_GENERATOR_SHA}}..HEAD` and confirm it lists at least one new commit
2. **Check commit scope** — does `git diff {{PRE_GENERATOR_SHA}}..HEAD --name-only` only contain files relevant to this story?
3. **Run tests yourself** — don't trust the generator's claim that tests pass
4. **Verify it actually works** — build, run, or load the project. Use whatever tools are available.
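
Checks 1 and 2 can be scripted. The following is a minimal sketch, assuming Python 3.9+ and a `git` executable on the PATH. The helper names (`commits_since`, `changed_files`, `out_of_scope`) and the idea of per-story `allowed_prefixes` are hypothetical illustrations, not part of the loop itself.

```python
import subprocess


def _git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def commits_since(pre_sha: str) -> int:
    """Count commits reachable from HEAD but not from pre_sha (check 1)."""
    return int(_git("rev-list", "--count", f"{pre_sha}..HEAD").strip())


def changed_files(pre_sha: str) -> list[str]:
    """List files changed between pre_sha and HEAD (check 2)."""
    return _git("diff", "--name-only", f"{pre_sha}..HEAD").splitlines()


def out_of_scope(files: list[str], allowed_prefixes: tuple[str, ...]) -> list[str]:
    """Return the files that fall outside the story's allowed path prefixes."""
    # str.startswith accepts a tuple and matches any of its entries.
    return [f for f in files if not f.startswith(allowed_prefixes)]
```

A result of `0` from `commits_since` would mean the generator committed nothing, and a non-empty list from `out_of_scope(changed_files(sha), ("src/", "tests/"))` would flag files worth a closer look. Which prefixes count as relevant is a per-story judgment call.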

## Common Generator Failures

- Created the file but didn't wire it into the application
- Tests exist but don't assert meaningful behavior
- Passes typecheck only because types are overly loose
- Code exists but doesn't actually run
- Removed an import or variable during refactoring but it's still used elsewhere in the file
- Created a new instance of a shared resource (e.g., DB connection, rate limiter) instead of reusing the existing one
- Leaked error details in HTTP responses (log the details server-side; return a generic message to the client)
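
The last three failures are easy to miss in review, so a concrete picture may help. The sketch below assumes a Python 3.9+ codebase. Every name in it (`undefined_names`, `RateLimiter`, `get_rate_limiter`, `error_response`) is hypothetical. The name check is deliberately crude and ignores scoping, so treat a hit as a prompt to look closer, not as proof.

```python
import ast
import builtins
import logging

logger = logging.getLogger(__name__)


def undefined_names(source: str) -> set[str]:
    """Names read somewhere in the module but never bound anywhere in it.

    Scope-insensitive on purpose: enough to catch a removed import.
    """
    bound = set(dir(builtins))
    used = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as x" binds "x".
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
        elif isinstance(node, ast.Name):
            (used if isinstance(node.ctx, ast.Load) else bound).add(node.id)
    return used - bound


class RateLimiter:
    """Stand-in for any shared resource (hypothetical)."""

    def __init__(self, max_calls: int) -> None:
        self.max_calls = max_calls


_shared_limiter = None


def get_rate_limiter() -> RateLimiter:
    """Return the one shared limiter, creating it on first use."""
    global _shared_limiter
    if _shared_limiter is None:
        _shared_limiter = RateLimiter(max_calls=100)
    return _shared_limiter


def error_response(exc: Exception) -> tuple[int, dict[str, str]]:
    """Log full details server-side; return only a generic body to the client."""
    logger.error("request failed", exc_info=exc)
    return 500, {"error": "Internal server error"}
```

When reading a diff, the corresponding red flags would be a name reported by `undefined_names`, a direct `RateLimiter(...)` call in new code where an accessor like `get_rate_limiter()` already exists, and a response body built from `str(exc)` or a traceback.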