We gave the same prompt to vanilla Claude and three Godmode tiers. The difference isn't subtle.
Claude Opus 4.6 · April 2026 · Identical environment
claude-code — prompt
$ Build a personal finance dashboard that imports CSV bank statements, categorizes transactions, and shows charts and trends.
The test: One prompt. No follow-up. No clarification. Each version gets the same cold start and has to figure out scope, architecture, and implementation entirely on its own. The metrics below are from real runs.
[fixed] Initial categorization rule order matched 'coles' before 'shell coles express', tagging fuel spend as Groceries. Caught by the test suite; fixed by reordering Fuel ahead of Groceries and adding a `coles(?!\s*express)` negative lookahead.
[fixed] Sample data generator used Math.random in transaction IDs, so re-clicking 'Load sample' duplicated rows instead of deduping. Replaced with a seeded LCG so IDs are stable across calls.
[fixed] CSV importer's headerless retry path used a Papa.unparse(parsed-objects) → re-parse hack that broke on some inputs. Refactored to retry the original text directly with `header: false`.
[fixed] Mobile CSS had only a single 900px breakpoint and no touch-target sizing. Added 768px and 480px breakpoints, a 44px min-height on every interactive element, a horizontally scrolling tx table, stacking budgets, and KPI collapse on phones.
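The rule-order fix can be sketched roughly as follows. This is a minimal illustration, not the code from the run: the `CategoryRule` shape, the `RULES` list, and the `categorize` helper are hypothetical names, and only the `coles(?!\s*express)` pattern comes from the report above.

```typescript
// Hypothetical rule shape; names are illustrative, not taken from the generated app.
interface CategoryRule {
  category: string;
  pattern: RegExp;
}

// First matching rule wins, so Fuel is listed ahead of Groceries.
const RULES: CategoryRule[] = [
  { category: "Fuel", pattern: /shell|coles\s*express/i },
  // Negative lookahead keeps "coles express" out of Groceries even if the order changes later.
  { category: "Groceries", pattern: /coles(?!\s*express)/i },
];

function categorize(description: string): string {
  for (const rule of RULES) {
    if (rule.pattern.test(description)) {
      return rule.category;
    }
  }
  return "Uncategorized";
}
```

With both safeguards in place, a description such as "SHELL COLES EXPRESS 1234" lands in Fuel, while a plain "COLES 0423 SYDNEY" still lands in Groceries.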
[low] Inline category edit still uses native prompt(): functional, but lacks autocomplete and bulk-reassign. A custom popover would be better UX.
[low] Charts are fully destroyed and rebuilt on every state change rather than updating datasets in place. Fine at sample volumes (240 tx), but could lag at 10k+ tx.
[low] CDN script tags (PapaParse, Chart.js) have no Subresource Integrity hashes. Low blast radius for a local-first app, but worth tightening.
[low] Headerless CSV fallback uses positional heuristics (longest string = description, first numeric = amount). Robust for common AU bank exports, but may misorder unusual layouts.
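To make the headerless fallback concrete, here is a minimal sketch of the "longest string = description, first numeric = amount" heuristic. The `GuessedRow` type, the `NUMERIC` pattern, and the `guessColumns` helper are assumptions made for illustration; the importer in the run may differ in detail.

```typescript
// Hypothetical output shape for one headerless CSV row.
interface GuessedRow {
  description: string;
  amount: number | null;
}

// Accepts plain or signed decimals such as "-42.50", "1,234.00" or "$12".
const NUMERIC = /^-?\$?\d[\d,]*(\.\d+)?$/;

function guessColumns(cells: string[]): GuessedRow {
  const trimmed = cells.map((c) => c.trim());

  // First numeric-looking cell is treated as the amount.
  const numericCell = trimmed.find((c) => NUMERIC.test(c));
  const amount =
    numericCell === undefined ? null : Number(numericCell.replace(/[$,]/g, ""));

  // Longest non-numeric cell is treated as the description.
  const description = trimmed
    .filter((c) => !NUMERIC.test(c))
    .reduce((longest, c) => (c.length > longest.length ? c : longest), "");

  return { description, amount };
}
```

The weakness flagged above shows up directly in this sketch: if a statement lists the running balance before the transaction amount, the balance is picked up as the amount.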
Composite Score: 0.92
Head-to-Head
| Metric | Vanilla | Godmode |
|---|---|---|
| Total Tokens | 35,000 | 149,000 |
| API Cost | $0.38 | $1.02 |
| Time | 4m 20s | 14m 12s |
| Files Created | 4 | 13 |
| Tests Written | 0 | 27 |
| Self-Corrections | 0 | 0 |
| Composite Score | 0.63 | 0.92 |
| Issues at Delivery | 7 | 4 |
Note: Higher token usage and cost for the Godmode tiers reflect deeper execution: more context loaded, more tests written, more security checks, and more verification passes. You're paying for quality, not verbosity.
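For a rough sense of the trade-off, the reported figures can be turned into multiples. This is a back-of-the-envelope sketch using only the Head-to-Head numbers; the `ratio` helper and the object names are illustrative.

```typescript
// Figures copied from the Head-to-Head comparison; times are converted to seconds.
const vanilla = { tokens: 35000, costUsd: 0.38, seconds: 4 * 60 + 20, score: 0.63 };
const godmode = { tokens: 149000, costUsd: 1.02, seconds: 14 * 60 + 12, score: 0.92 };

// Godmode figure divided by the Vanilla figure, rounded to two decimals.
function ratio(a: number, b: number): number {
  return Math.round((a / b) * 100) / 100;
}

const multiples = {
  tokens: ratio(godmode.tokens, vanilla.tokens),  // about 4.26x
  cost: ratio(godmode.costUsd, vanilla.costUsd),  // about 2.68x
  time: ratio(godmode.seconds, vanilla.seconds),  // about 3.28x
  score: ratio(godmode.score, vanilla.score),     // about 1.46x
};

console.log(multiples);
```

On these numbers, the Godmode run used roughly 4.3 times the tokens, 2.7 times the cost, and 3.3 times the wall-clock time for a composite score about 1.5 times higher.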
See for yourself.
Same prompt. Same model. The only difference is the skill. Stop settling for first-draft output.