From 5f8a34cc7bfabc664a3a21685315092830f93aac Mon Sep 17 00:00:00 2001
From: Sheldon Finlay
Date: Fri, 27 Mar 2026 14:45:55 -0400
Subject: [PATCH] =?UTF-8?q?fix:=20simplify=20evaluator=20runtime=20verific?=
 =?UTF-8?q?ation=20=E2=80=94=20let=20claude=20figure=20out=20the=20tools?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 prompts/evaluator/_base.md | 46 +++-----------------------------------
 1 file changed, 3 insertions(+), 43 deletions(-)

diff --git a/prompts/evaluator/_base.md b/prompts/evaluator/_base.md
index 318ff31..78d1094 100644
--- a/prompts/evaluator/_base.md
+++ b/prompts/evaluator/_base.md
@@ -67,51 +67,11 @@ Be concrete — "the function doesn't handle null input" not "there might be edg
 
 End your response with the same verdict block so it's visible in the terminal output.
 
-## Runtime Verification (Web Projects)
+## Runtime Verification
 
-If the project has an `index.html` or is a web application, you MUST verify it actually runs:
+Do not just read the code — **actually run it.** Use whatever tools are available to you (bash, MCP tools, etc.) to verify the project builds, runs, and behaves correctly. Code that looks correct but doesn't run is not complete.
 
-1. **Start a local server** (if not already running):
-   ```bash
-   python3 -m http.server 8080 &
-   SERVER_PID=$!
-   sleep 1
-   ```
-
-2. **Check the page loads** — use curl to verify the server responds:
-   ```bash
-   curl -s -o /dev/null -w "%{http_code}" http://localhost:8080
-   ```
-   Expected: 200. If not, REJECT.
-
-3. **Check for JavaScript errors** — if Node.js is available, run a quick headless check:
-   ```bash
-   node -e "
-   const http = require('http');
-   http.get('http://localhost:8080', res => {
-     let data = '';
-     res.on('data', chunk => data += chunk);
-     res.on('end', () => {
-       const hasModules = data.includes('type=\"module\"');
-       const hasCanvas = data.includes('/dev/null
-   ```
-
-**Runtime errors = automatic REJECT.** Code that looks correct but doesn't run is not complete.
+**Runtime errors = automatic REJECT.**
 
 ## What Warrants Rejection