We gave the same prompt to vanilla Claude and three Godmode tiers. The difference isn't subtle.
Claude Opus 4.6 · April 2026 · Identical environment
claude-code — prompt
$ Create a markdown note-taking app with live preview, folder organization, search, and local storage persistence.
The test: One prompt. No follow-up. No clarification. Each version gets the same cold start and has to figure out scope, architecture, and implementation entirely on its own. The metrics below are from real runs.
Assess-fix loop — shipped only when all dimensions passed
Total Tokens: 41,000 (30,000 in / 11,000 out)
API Cost: $0.42 (estimated)
Time: 9m 30s (wall clock)
Files: 1 created
Test Suite: 0 tests written
Loops: 0 self-corrections
Quality Audit
Code Quality: 0.93
Testing: 0.20
Security: 0.95
Error Handling: 0.90
Completeness: 0.97
UX / Polish: 0.94
Issues Found
medium: Folder context menu uses the native confirm() dialog for rename/delete while the new/rename flow uses a custom modal, an inconsistent affordance
low: No persisted test suite. Phase 6 verification ran live via headless Chromium, but the tests were not saved to disk
low: Marked.js / DOMPurify CDN failure is not handled. The preview silently fails if the scripts don't load
low: No QuotaExceededError handling on save. Large note collections could hit the ~5MB localStorage ceiling and fail silently
low: No cross-tab sync via storage events. Editing the same note in two tabs is last-write-wins
low: No undo for deletions (confirm() prompts protect against accidents, but there is no recovery once confirmed)
fixed: Initial mobile CSS used 1000px/640px breakpoints instead of the iframe-required 768px/480px. Media queries were rewritten
fixed: Touch targets were 30-32px tall (buttons, inputs, list rows). All interactive elements were bumped to a 44px minimum in mobile queries
fixed: Topbar edited-date meta ate horizontal space at 375px. It is now hidden on mobile
fixed: Phase 6 headless run caught zero console errors and confirmed state fully restored across reload, including active note + folder + content
Composite Score: 0.79
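The page does not say how the composite is derived from the six dimension scores. As a quick sanity check, the sketch below averages them with equal weights, which is an assumption and not the scoring method used here; the variable names are illustrative. The equal-weight mean comes out a little above the published figure, which suggests the real composite weights the dimensions unevenly.

```python
# Dimension scores copied from the Quality Audit above.
scores = {
    "Code Quality": 0.93,
    "Testing": 0.20,
    "Security": 0.95,
    "Error Handling": 0.90,
    "Completeness": 0.97,
    "UX / Polish": 0.94,
}
published_composite = 0.79

# Assumption: equal weights. The actual weighting is not stated on the page.
equal_weight_mean = sum(scores.values()) / len(scores)
gap = equal_weight_mean - published_composite

print(f"equal-weight mean:   {equal_weight_mean:.3f}")
print(f"published composite: {published_composite:.2f}")
print(f"gap:                 {gap:+.3f}")
```

A gap of this size would be consistent with Testing (the one low score) carrying more weight than the other dimensions, but that is a guess.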
Head-to-Head
| Metric | Vanilla | Godmode | One-Shot |
| --- | --- | --- | --- |
| Total Tokens | 34,200 | 46,000 | 41,000 |
| API Cost | $0.29 | $0.45 | $0.42 |
| Time | 6m 30s | 5m 0s | 9m 30s |
| Files Created | 1 | 1 | 1 |
| Tests Written | 0 | 0 | 0 |
| Self-Corrections | 0 | 0 | 0 |
| Composite Score | 0.68 | 0.77 | 0.79 |
| Issues at Delivery | 7 | 5 | 6 |
Note: Higher token usage and cost for the Godmode tiers reflect deeper execution: more context loaded, more security checks, more verification passes. You're paying for quality, not verbosity.
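To put the trade-off in concrete terms, the following sketch computes each tier's extra cost, score gain, and token increase relative to Vanilla, using only the Head-to-Head figures. The `runs` dictionary and the `delta_vs_baseline` helper are hypothetical names introduced for illustration.

```python
# Figures copied from the Head-to-Head comparison.
runs = {
    "Vanilla":  {"tokens": 34_200, "cost": 0.29, "score": 0.68, "issues": 7},
    "Godmode":  {"tokens": 46_000, "cost": 0.45, "score": 0.77, "issues": 5},
    "One-Shot": {"tokens": 41_000, "cost": 0.42, "score": 0.79, "issues": 6},
}


def delta_vs_baseline(name, baseline="Vanilla"):
    """Return cost, score, and token differences of one run against the baseline."""
    base, run = runs[baseline], runs[name]
    token_pct = 100 * (run["tokens"] - base["tokens"]) / base["tokens"]
    return {
        "extra_cost": round(run["cost"] - base["cost"], 2),
        "score_gain": round(run["score"] - base["score"], 2),
        "token_increase_pct": round(token_pct, 1),
        "fewer_issues": base["issues"] - run["issues"],
    }


for name in ("Godmode", "One-Shot"):
    print(name, delta_vs_baseline(name))
```

On these numbers, Godmode costs $0.16 more than Vanilla for a 0.09 gain in composite score, and One-Shot costs $0.13 more for a 0.11 gain.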
See for yourself.
Same prompt. Same model. The only difference is the skill. Stop settling for first-draft output.