From 77fd9e0cd61969043159b6a9ef7b813de0a7816d Mon Sep 17 00:00:00 2001
From: Sheldon Finlay <sheldon@outerlimitsmedia.com>
Date: Sat, 28 Mar 2026 12:56:53 -0400
Subject: [PATCH] feat: add concrete examples of good vs bad acceptance
 criteria

The planner now sees specific examples of verifiable criteria (grep
checks, test commands, file checks) alongside vague anti-patterns. This
drives higher story quality, which directly improves evaluator accuracy.
---
 prompts/planner/plan.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
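
Note: as a rough illustration of what "specific and checkable" means in
this change, two of the good criteria (the zero-match grep and the three
required env vars) can be verified mechanically. The sketch below is
hypothetical and not part of this patch: the helper names `grep_count`
and `missing_env_vars`, the KEY=value config format, and the variable
names in `demo` are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path


def grep_count(root: Path, pattern: str) -> int:
    """Count lines in files under root that match pattern (rough stand-in for grep -r)."""
    regex = re.compile(pattern)
    count = 0
    for path in root.rglob("*"):
        if path.is_file():
            text = path.read_text(errors="ignore")
            count += sum(1 for line in text.splitlines() if regex.search(line))
    return count


def missing_env_vars(config_path: Path, required: list[str]) -> list[str]:
    """Return the required names that have no KEY=value entry in the config file."""
    defined = set()
    for line in config_path.read_text().splitlines():
        line = line.strip()
        # Skip blanks and comments; assumes a simple KEY=value layout.
        if line and not line.startswith("#") and "=" in line:
            defined.add(line.split("=", 1)[0].strip())
    return [name for name in required if name not in defined]


def demo() -> tuple[int, list[str]]:
    """Build a throwaway project and evaluate both criteria against it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "app.py").write_text("key = load_key()\n")
        (root / ".env").write_text("# settings\nDB_URL=x\nAPI_TOKEN=y\n")
        matches = grep_count(root, "HARDCODED_KEY")
        missing = missing_env_vars(root / ".env", ["DB_URL", "API_TOKEN", "LOG_LEVEL"])
        return matches, missing


if __name__ == "__main__":
    print(demo())
```

Each check yields a value that either satisfies the criterion or does
not (a match count of zero, an empty list of missing names). A criterion
such as "The code is clean" offers nothing comparable to compute.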

diff --git a/prompts/planner/plan.md b/prompts/planner/plan.md
index 97588b4..d860ce9 100644
--- a/prompts/planner/plan.md
+++ b/prompts/planner/plan.md
@@ -16,7 +16,19 @@ A story must be completable in a single iteration. Keep each story focused — a
 If a story fails (evaluator rejects it), the next iteration should be able to retry it cleanly. Stories with too many moving parts are hard to retry because partial state is messy.
 
 ### Evaluability
-Every story must have criteria the evaluator can independently verify. "The code is clean" is not evaluable. "The function returns 404 when the user doesn't exist" is evaluable.
+Every story must have criteria the evaluator can independently verify by reading code, running commands, or testing behavior.
+
+Good criteria are specific and checkable:
+- "Grep for 'HARDCODED_KEY' returns zero matches"
+- "The function returns 404 when the user doesn't exist"
+- "Running `npm test` passes with no failures"
+- "The config file contains entries for all three required env vars"
+
+Bad criteria are vague or subjective:
+- "The code is clean"
+- "Works correctly"
+- "Performance is improved"
+- "Error handling is robust"
 
 ## PRD Anti-Patterns