Showcase

Same Prompt. Four Outputs.

We gave the same prompt to vanilla Claude and three Godmode tiers. The difference isn't subtle.

Claude Opus 4.7 · May 2026 · Identical environment
claude-code — prompt
$ Make an instructional animation on how to boil an egg
The test: One prompt. No follow-up. No clarification. Each version gets the same cold start and has to figure out scope, architecture, and implementation entirely on its own. The metrics below are from real runs.
Results
Vanilla · Single-pass output, no self-review
Total Tokens: 35,500 (22,000 in / 13,500 out)
API Cost: $0.45 (estimated)
Time: 6m 40s (wall clock)
Files: 1 created
Test Suite: 0 tests written
Loops: 0 (no self-review)
Quality Audit
Code Quality: 0.78
Testing: 0.00
Security: 0.72
Error Handling: 0.55
Completeness: 0.85
UX / Polish: 0.84
Issues Found
  • [critical] No test suite; zero coverage of step transitions or state resets
  • [high] Mobile CSS source-order bug: unconditional .stage rules overrode media queries until manually fixed
  • [high] No reduced-motion support; ignores prefers-reduced-motion users
  • [medium] applyStepInstant() doesn't fully reset bubble/flame intervals when scrubbing backwards
  • [medium] No ARIA labels on SVG stage or controls; screen-reader users get nothing
  • [medium] Doneness change mid-step doesn't recompute timer label until step 4 is re-entered
  • [low] Cross-section yolk shrink uses setInterval, not requestAnimationFrame; stutters on low-end devices
  • [low] No graceful handling for very wide aspect ratios (ultrawide monitors)
Composite Score 0.60
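The reduced-motion gap flagged in this audit is usually closed with a media-query check before any animation starts. A minimal sketch, assuming a hypothetical helper `stepDurationMs` and an injected `win` object standing in for the browser `window`; neither name comes from the audited build:

```javascript
// Hypothetical helper: decides how long an animation step should run.
// `win` is injected so the function can be exercised outside a browser.
function stepDurationMs(win, normalMs) {
  const query =
    win && typeof win.matchMedia === "function"
      ? win.matchMedia("(prefers-reduced-motion: reduce)")
      : null;
  // When the user asks for reduced motion, jump straight to the end state.
  return query && query.matches ? 0 : normalMs;
}
```

In a page this would be called as `stepDurationMs(window, 800)`, where 800 is an illustrative duration. Returning a duration rather than toggling a class keeps the decision in one place for both CSS-driven and timer-driven steps.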
Godmode · 8-layer execution, single pass, no scoring
Total Tokens: 14,972,064 (112 in / 45,863 out)
API Cost: $33.38 (estimated)
Time: 26m 15s (wall clock)
Files: 1 created
Test Suite: 0 tests written
Loops: 0 (single pass)
Quality Audit
Code Quality: 0.90
Testing: 0.30
Security: 0.92
Error Handling: 0.78
Completeness: 0.95
UX / Polish: 0.92
Issues Found
  • [medium] No unit tests for state-machine transitions. Coverage came from an 11-state visual audit, but there's no persistent regression suite to re-run later
  • [medium] Partial ARIA: doneness tabs are labelled, but the SVG scene is aria-hidden and the step list doesn't announce the current cooking step
  • [low] Keyboard shortcuts (space, R, 1–4) work but aren't documented anywhere in the UI
  • [low] Mobile breakpoints (768 px, 480 px) added in showcase prep but only validated in a desktop-Chromium emulator, not on physical devices
  • [low] No graceful handling for very wide aspect ratios beyond the 1280 px max-width container
Composite Score 0.78
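One way to keep shortcuts from going undocumented is to drive both the key handler and the on-screen help from a single table. A minimal sketch: the keys (space, R, 1–4) come from the audit note above, while the action labels and the names `SHORTCUTS`, `findShortcut`, and `shortcutHelpText` are assumptions for illustration:

```javascript
// Hypothetical shortcut table. Keys are from the audit note;
// the action descriptions are assumed, not taken from the build.
const SHORTCUTS = [
  { key: " ", label: "Space", action: "Play / pause" },
  { key: "r", label: "R", action: "Restart" },
  ...[1, 2, 3, 4].map((n) => ({
    key: String(n),
    label: String(n),
    action: `Jump to step ${n}`,
  })),
];

// Looks up a shortcut for a KeyboardEvent.key value (case-insensitive).
function findShortcut(key) {
  const normalized = key.length === 1 ? key.toLowerCase() : key;
  return SHORTCUTS.find((s) => s.key === normalized) || null;
}

// Builds the help text shown in the UI from the same table.
function shortcutHelpText() {
  return SHORTCUTS.map((s) => `${s.label}: ${s.action}`).join("\n");
}
```

Because the help text is generated from the table, adding a shortcut without documenting it is no longer possible.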
Godmode+ · 7-phase execution, enhanced verification
Total Tokens: 274,500 (218,000 in / 56,500 out)
API Cost: $2.50 (estimated)
Time: 48m 22s (wall clock)
Files: 33 created
Test Suite: 1 test written
Loops: 0 (single pass)
Quality Audit
Code Quality: 0.88
Testing: 0.60
Security: 0.90
Error Handling: 0.70
Completeness: 0.94
UX / Polish: 0.88
Issues Found
  • [medium] No automated regression test suite, only screenshot-based visual sanity checks (screenshot-test.js, verify/ frames). A future text/timing change could silently regress without being caught.
  • [medium] TTS voice is en-US-AvaNeural, but the user is Australian, so the accent mismatch is a minor authenticity issue. Easy fix (swap to en-AU-NatashaNeural), but it wasn't caught at planning time.
  • [low] Step-6 caption initially overlapped the "Perfect every time." final text. Caught only after user feedback; codified as a new "Don't regress what was working" hard rule in the skill.
  • [low] Initial water level in pot scenes filled to the rim despite narration saying "about one inch above the tops". Caught only after user feedback; codified as a new "Internal consistency" hard rule in the skill.
  • [low] Original narration title said "the complete six-step guide" but only Steps 1–5 were spoken: the same internal-consistency class of bug as above.
  • [low] Captured video had a 5s trailing blank stage (Playwright recorded longer than the JS animation timeline). Fixed by adding `-t 61.5` trim flag to record.js and narrate.py.
  • [fixed] Original silent video had no narration; added a 7-line edge-tts script (en-US-AvaNeural, +5% rate) muxed in via ffmpeg adelay+amix. Codified as the "Match the medium's expectations" hard rule.
  • [fixed] Water level corrected from full-pot fill to ~1 inch above egg tops; bracket repositioned inside the pot to clearly indicate the gap.
  • [fixed] Step 6 caption added to peel scene + "Step six. Crack, peel..." narration line so spoken count matches the title's six-step claim.
  • [fixed] "Perfect every time." text moved to the top of scene-peel so the new caption strip doesn't cover it.
  • [fixed] Bubble keyframe travel reduced from -280px to -155px so bubbles pop at the new water surface instead of floating into the air above it.
  • [fixed] CSS-transform-overrides-SVG-transform bug: animated eggs were rendering at (0,0) instead of their position. Wrapped each animated egg in outer <g transform="translate(...)"> + inner animated <g> with transform-box: fill-box.
Composite Score 0.81
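The last fix in the list above relies on the fact that a CSS `transform` applied to an SVG element replaces that element's `transform` attribute, so positioning and animation need separate groups. A minimal sketch of the wrapper pattern as a markup-building helper; the function name `wrapAnimatedEgg`, the class name `egg-anim`, and the coordinates are hypothetical:

```javascript
// Hypothetical helper: wraps SVG content in a positioning group (outer)
// and an animation group (inner). CSS animations target only the inner
// group, so the outer translate is never overridden.
function wrapAnimatedEgg(x, y, innerSvg) {
  return (
    `<g transform="translate(${x} ${y})">` +
    `<g class="egg-anim" style="transform-box: fill-box;">` +
    innerSvg +
    `</g></g>`
  );
}

// Example: an egg drawn around its own origin, placed at (120, 80).
const eggMarkup = wrapAnimatedEgg(120, 80, '<ellipse rx="18" ry="24" />');
```

With `transform-box: fill-box`, the inner group's transform reference box is its own bounding box rather than the SVG viewport.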
One-Shot · Assess-fix loop, shipped only when all dimensions passed
Total Tokens: 9,715,635 (228 in / 130,946 out)
API Cost: $30.29 (estimated)
Time: 29m 47s (wall clock)
Files: 13 created
Test Suite: 0 tests written
Loops: 4 self-corrections
Quality Audit
Code Quality: 0.78
Testing: 0.05
Security: 0.85
Error Handling: 0.55
Completeness: 0.72
UX / Polish: 0.50
Issues Found
  • [critical] User verdict: Rejected. Flagged visual quality, sync/timing, and content/instructions all as weak; wanted higher-fidelity animation AND more polish on the executed approach.
  • [high] Caption fade transitions bleed across scene boundaries: at scene N → N+1 boundaries, the next scene's caption appears while the previous scene's visuals are still on screen, reading as a glitch in the rendered video.
  • [high] capture.js used wrong Playwright API signature for page.waitForFunction: passed { timeout } as 2nd arg instead of 3rd. The timeout was silently ignored (defaulted to 30s), so the 43s animation never reached the __animationDone flag and the recording failed. Fixed mid-session by passing null as the arg parameter.
  • [medium] Flames are visually small and rely only on CSS keyframe motion; they don't convincingly read as a 'rolling boil' even after multiple polish passes.
  • [medium] Bubbles barely visible in the boil scene despite explicit polish iterations to make them more prominent.
  • [medium] Scene durations duplicated between timings.json and the SCENES array in animation.html; drift risk if one is changed without the other.
  • [medium] Handover suggested NODE_PATH=$HOME/playground/node_modules, but the POSIX-style path didn't work on Windows Node 24. Required Windows-style 'C:\\Users\\Lloyd Gibbs\\playground\\node_modules' instead. Hidden footgun for the next agent.
  • [low] Zero automated tests; only manual frame verification via screenshot pass + ad-hoc audit frame extraction. No regression suite to catch future drift.
  • [low] Peel scene shows whole egg + cross-section side-by-side but lacks visual progression of the peeling action itself; static end-state instead of demonstrative motion.
  • [low] Step indicator and progress dots update instantly on scene activation while captions cross-fade, which contributes to the caption-overlap effect at boundaries.
  • [fixed] capture.js page.waitForFunction signature corrected mid-session (timeout option moved from 2nd to 3rd argument with null arg).
  • [fixed] NODE_PATH switched from POSIX-style to Windows-style for Node 24 compatibility on the user's Win11 setup.
Composite Score 0.57
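The `page.waitForFunction` bug described in this audit comes down to argument order: Playwright expects `(pageFunction, arg, options)`, so options passed in the second slot are treated as the argument and the default timeout applies. A minimal sketch of a wrapper that keeps the order explicit; the name `waitForAnimationDone` and its parameters are hypothetical, not the contents of capture.js:

```javascript
// Hypothetical wrapper around Playwright's
// page.waitForFunction(pageFunction, arg, options).
async function waitForAnimationDone(page, flagName, timeoutMs) {
  return page.waitForFunction(
    (name) => window[name] === true, // evaluated inside the page
    flagName,                        // 2nd argument: value handed to the function
    { timeout: timeoutMs }           // 3rd argument: options, including timeout
  );
}
```

A call such as `await waitForAnimationDone(page, "__animationDone", 60000)` would give the 43s animation room to finish; the 60000 ms value is an assumed figure, chosen only to exceed the animation length.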
Head-to-Head
Metric              Vanilla   Godmode      Godmode+   One-Shot
Total Tokens        35,500    14,972,064   274,500    9,715,635
API Cost            $0.45     $33.38       $2.50      $30.29
Time                6m 40s    26m 15s      48m 22s    29m 47s
Files Created       1         1            33         13
Tests Written       0         0            1          0
Self-Corrections    0         0            0          4
Composite Score     0.60      0.78         0.81       0.57
Issues at Delivery  8         5            6          10
Note: Higher token usage and cost for the Godmode tiers reflect deeper execution: more context loaded, more security checks, more verification passes. You're paying for quality, not verbosity.
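The cost-versus-quality trade-off can be read directly off the Head-to-Head figures. A small sketch that expresses each run's API cost as a multiple of the vanilla run and its composite score as a difference from vanilla; the numbers are copied from the comparison above, while the function and field names are illustrative:

```javascript
// Figures copied from the Head-to-Head comparison.
const runs = [
  { name: "Vanilla", costUsd: 0.45, composite: 0.60 },
  { name: "Godmode", costUsd: 33.38, composite: 0.78 },
  { name: "Godmode+", costUsd: 2.50, composite: 0.81 },
  { name: "One-Shot", costUsd: 30.29, composite: 0.57 },
];

// Returns cost as a multiple of the baseline and score as a delta from it.
function compareToBaseline(allRuns, baselineName) {
  const base = allRuns.find((r) => r.name === baselineName);
  if (!base) throw new Error(`Unknown baseline: ${baselineName}`);
  return allRuns.map((r) => ({
    name: r.name,
    costMultiple: Number((r.costUsd / base.costUsd).toFixed(1)),
    scoreDelta: Number((r.composite - base.composite).toFixed(2)),
  }));
}

const comparison = compareToBaseline(runs, "Vanilla");
```

Rounding happens only at the end so the multiples are computed from the published figures rather than from already-rounded intermediates.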

See for yourself.

Same prompt. Same model. The only difference is the skill.
Stop settling for first-draft output.
