June 14, 2026 · deterministic · no model in the loop · reproduce in one command
The paid skills do not just advise. They run a security scan and a correctness gate that block a run from being reported done. Here is each gate driven against a fixture with known defects and against the same code with the defects fixed.
For each gate, a fixture is copied into an isolated, git-initialised temporary directory and the real shipped CLI is driven start -> gate. The security scan and the file discovery only ever see the fixture. Nothing is mocked: these are the exact binaries that ship in the skill zips.
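The staging step described above can be sketched in a few lines of Node. This is a minimal illustration, not the runner's source: the function names `stageFixture` and `cleanup`, the temp-directory prefix, and the `initGit` switch are all hypothetical, introduced only so the copy step can be exercised without git installed.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { execFileSync } = require('node:child_process');

// Copy a fixture into a fresh temp directory so a gate can only see the fixture.
// `initGit` is a hypothetical switch for this sketch, not a documented runner option.
function stageFixture(fixtureDir, { initGit = true } = {}) {
  const staging = fs.mkdtempSync(path.join(os.tmpdir(), 'gate-demo-sketch-'));
  fs.cpSync(fixtureDir, staging, { recursive: true });
  if (initGit) {
    // Requires git on PATH, as the page's reproduce step also does.
    execFileSync('git', ['init', '--quiet'], { cwd: staging, stdio: 'ignore' });
  }
  return staging;
}

// Remove the staging directory once the gate output has been captured.
function cleanup(staging) {
  fs.rmSync(staging, { recursive: true, force: true });
}
```

Staging into a fresh directory per gate is what keeps one run's files from leaking into the next run's file discovery.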
Both controls are run on purpose. A gate that flagged everything would also flag the clean fixture, so the clean pass is as much a part of the proof as the defective catch. The runner grades itself against an answer key; if any expected outcome is missing the verdict below turns false.
Judge-free, not assumption-free. A string-and-state comparator grades this run, not a model, so there is no AI opinion to discount. That is the whole point: it sidesteps the same-model-family question that applies to any review where a model is the judge. But judge-free is not the same as independent. We wrote the fixture, we planted the defects, we wrote the answer key, and the runner grades its own output. What this rules out is a model's subjective bias. What it does not rule out is that we chose the test.
It proves: the paid harden and verify gates fire on the defect classes they claim, on this fixture, and that the clean control passes without a false-positive avalanche. The envelopes below are the real CLI output, published verbatim.
It does not prove: that the full eight-layer protocol makes Claude write better code. That efficacy question is separate, and it is what the blind review and the showcase address, not this page.
On the gates themselves: harden is a static heuristic scan of the kind open tools also do, and verify is the language's own checker plus a boot smoke test. What the paid tiers add is not novel detection but wiring: each check becomes a gate, and the agent cannot mark a run done while that gate fails. The defective fixture exercises all eight of harden's pattern rules (both XSS variants and both weak-hash algorithms included). What these gates genuinely do not catch is shown below, with real defects they pass, under What the gates miss, on purpose.
This is the verbatim terminal output of a real run on Node v24.14.0, win32 x64. Every line is the runner driving a real shipped binary against a fixture and recording the gate state it returned; the last line is the self-graded verdict. The fixture source and the detection patterns are not on screen here by design: what you are watching is the binaries actually executing and agreeing with the answer key.
No timing tricks: the run takes about half a minute because the smoke check holds each cleanly-booting server for its full ten-second window. The raw capture is published verbatim at /showcase/data/gate-demo-run.txt. The structured envelopes these commands emitted are under each gate's receipts below; to confirm the data behind this page is byte-for-byte unchanged, see the SHA-256 tamper check under Reproduce it.
Transcript tamper check: the transcript file above has its own SHA-256, 6535f9397a1331063c9be9a2ac221d95465d1bbeda689efc0a960a1704d6c339. This is an independent per-file digest (separate from the JSON digest under Reproduce it), so the capture is tamper-checkable on its own, not just as the text rendered here. Fetch the file and run sha256sum to confirm.
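For readers without sha256sum to hand, the same check can be sketched with Node's built-in crypto module. The helper names `sha256Hex` and `verifyFile` are hypothetical, and the sketch assumes the transcript has already been downloaded to a local path of your choosing.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');

// Hex SHA-256 of a string or Buffer.
function sha256Hex(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

// Compare a local file's digest with the digest published on the page.
function verifyFile(filePath, expectedHex) {
  const actual = sha256Hex(fs.readFileSync(filePath));
  return { ok: actual === expectedHex.toLowerCase(), actual };
}
```

Calling `verifyFile` with the downloaded file and the digest quoted above returns `ok: true` only if the bytes are identical; the file is read as raw bytes, so any re-encoding or line-ending conversion during download will change the result.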
godmode v2.3.2 · harden · paid tier only
Six defect classes by static scan: hardcoded secrets, SQL built by string concatenation, command injection, innerHTML / dangerouslySetInnerHTML XSS, MD5 and SHA1 weak hashing, and Math.random used for a secret or token. The defective fixture exercises all eight underlying pattern rules (both XSS variants and both weak-hash algorithms included).
hardened_dirty
Found 9 candidate findings across 8 files. Triage with judgment, heuristic flags, not all true positives.
9 findings, tagged across 8 distinct harden pattern rules, in 8 scanned files.
These are the 9 security-class defects. The other 2 of the 11 planted defects are correctness-class (a syntax error and a crash on boot) and are caught by the verify gate below, not by the security scan.
hardcoded-secret ×2 · weak-crypto-md5 ×1 · weak-crypto-sha1 ×1 · weak-rng-secret ×1 · sql-concat ×1 · command-injection ×1 · xss-innerhtml ×1 · xss-react ×1

| Location | Pattern | Offending line |
|---|---|---|
| config.js:4 | hardcoded-secret | `const apiKey = "EXAMPLE_FAKE_api_key_1234567890abcdef";` |
| config.js:5 | hardcoded-secret | `const dbPassword = "EXAMPLE_FAKE_password_not_a_real_value";` |
| crypto.js:7 | weak-crypto-md5 | `return crypto.createHash("MD5").update(pw).digest('hex');` |
| crypto.js:11 | weak-crypto-sha1 | `return crypto.createHash("SHA1").update(payload).digest('hex');` |
| crypto.js:15 | weak-rng-secret | `const r = Math.random().toString(36).slice(2);` |
| db.js:7 | sql-concat | `const query = "SELECT * FROM users WHERE id = " + req.params.id;` |
| exec.js:7 | command-injection | `exec("rm -rf " + userPath, (err) => {` |
| render.js:6 | xss-innerhtml | `element.innerHTML = userContent;` |
| render.js:10 | xss-react | `return { dangerouslySetInnerHTML: { __html: userContent } };` |
hardened_clean
Layer 4 complete. No security/quality red flags across 8 files.
0 findings, 8 files scanned.
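To make "static heuristic scan" concrete, here is a minimal sketch of a line-by-line pattern scanner. It is not harden's source, and the detection patterns are deliberately not published on this page: the tag names are taken from the findings above, while the four regexes, the `RULES` table, and the `scan` function are hypothetical stand-ins covering only half of the eight rules.

```javascript
// Hypothetical pattern rules: the tags match the page, the regexes are this sketch's own.
const RULES = [
  { tag: 'hardcoded-secret', re: /\b\w*(?:key|password|secret|token)\w*\s*=\s*["'][^"']{8,}["']/i },
  { tag: 'weak-crypto-md5', re: /createHash\(\s*["']md5["']/i },
  { tag: 'sql-concat', re: /["'`]\s*(?:SELECT|INSERT|UPDATE|DELETE)\b[^"'`]*["'`]\s*\+/i },
  { tag: 'xss-innerhtml', re: /\.innerHTML\s*=/ },
];

// files: an object mapping file name to source text.
function scan(files) {
  const findings = [];
  for (const [file, source] of Object.entries(files)) {
    source.split(/\r?\n/).forEach((text, i) => {
      for (const { tag, re } of RULES) {
        // Record every rule that matches this line, with a 1-based line number.
        if (re.test(text)) findings.push({ file, line: i + 1, tag, text: text.trim() });
      }
    });
  }
  return {
    state: findings.length > 0 ? 'hardened_dirty' : 'hardened_clean',
    scanned_files: Object.keys(files).length,
    total_findings: findings.length,
    findings,
  };
}
```

A scanner like this matches text, not data flow, which is why its output is best read as candidates to triage rather than confirmed vulnerabilities.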
godmode harden (verbatim; temp path shown as <STAGING>, run id as <run_id>):
{
"state": "hardened_dirty",
"run_id": "<run_id>",
"narrate": "Found 9 candidate findings across 8 files. Triage with judgment, heuristic flags, not all true positives.",
"next": "alternatives",
"instructions": "Layer 4 hardening complete. Triage data.findings with judgment, these are heuristic flags, not all true positives.\n\nSeverity ladder for each finding:\n1. Confirm the pattern is real (read the file, not just the matched line).\n2. If real and exploitable: fix immediately.\n3. If real but not exploitable in context: leave a brief justification comment.\n4. If a false positive: ignore.\n\nAfter triage, run `node bin/godmode alternatives` to begin Layer 5.",
"data": {
"scanned_files": 8,
"total_findings": 9,
"by_pattern": {
"hardcoded-secret": 2,
"weak-crypto-md5": 1,
"weak-crypto-sha1": 1,
"weak-rng-secret": 1,
"sql-concat": 1,
"command-injection": 1,
"xss-innerhtml": 1,
"xss-react": 1
},
"findings": [
{
"file": "config.js",
"line": 4,
"tag": "hardcoded-secret",
"text": "const apiKey = \"EXAMPLE_FAKE_api_key_1234567890abcdef\";"
},
{
"file": "config.js",
"line": 5,
"tag": "hardcoded-secret",
"text": "const dbPassword = \"EXAMPLE_FAKE_password_not_a_real_value\";"
},
{
"file": "crypto.js",
"line": 7,
"tag": "weak-crypto-md5",
"text": "return crypto.createHash(\"MD5\").update(pw).digest('hex');"
},
{
"file": "crypto.js",
"line": 11,
"tag": "weak-crypto-sha1",
"text": "return crypto.createHash(\"SHA1\").update(payload).digest('hex');"
},
{
"file": "crypto.js",
"line": 15,
"tag": "weak-rng-secret",
"text": "const r = Math.random().toString(36).slice(2);"
},
{
"file": "db.js",
"line": 7,
"tag": "sql-concat",
"text": "const query = \"SELECT * FROM users WHERE id = \" + req.params.id;"
},
{
"file": "exec.js",
"line": 7,
"tag": "command-injection",
"text": "exec(\"rm -rf \" + userPath, (err) => {"
},
{
"file": "render.js",
"line": 6,
"tag": "xss-innerhtml",
"text": "element.innerHTML = userContent;"
},
{
"file": "render.js",
"line": 10,
"tag": "xss-react",
"text": "return { dangerouslySetInnerHTML: { __html: userContent } };"
}
]
}
}
{
"state": "hardened_clean",
"run_id": "<run_id>",
"narrate": "Layer 4 complete. No security/quality red flags across 8 files.",
"next": "alternatives",
"instructions": "Layer 4 hardening complete. Triage data.findings with judgment, these are heuristic flags, not all true positives.\n\nSeverity ladder for each finding:\n1. Confirm the pattern is real (read the file, not just the matched line).\n2. If real and exploitable: fix immediately.\n3. If real but not exploitable in context: leave a brief justification comment.\n4. If a false positive: ignore.\n\nAfter triage, run `node bin/godmode alternatives` to begin Layer 5.",
"data": {
"scanned_files": 8,
"total_findings": 0,
"by_pattern": {},
"findings": []
}
}
godmode-plus v2.7.1 · verify · paid tier only
Every changed source file is parsed with the language's own checker (node --check, py_compile, tsc --noEmit, ruby -c, gofmt). If a start script exists the app is booted for up to ten seconds to catch a crash-on-boot. A run cannot be reported done while either fails.
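The syntax half of this gate can be sketched for JavaScript files with nothing more than `node --check`. The helpers `syntaxCheck` and `checkAll` are hypothetical names, the result fields loosely follow the envelope shown on this page, and the other checkers listed above (py_compile, tsc --noEmit, ruby -c, gofmt) are left out of this sketch.

```javascript
const { spawnSync } = require('node:child_process');
const path = require('node:path');

// Parse one file with Node's own checker; nothing in the file is executed.
function syntaxCheck(file) {
  const r = spawnSync(process.execPath, ['--check', file], { encoding: 'utf8' });
  return {
    file: path.basename(file),
    ext: path.extname(file),
    tool: 'node --check',
    ok: r.status === 0,
    exit_code: r.status,
    stderr: r.stderr,
  };
}

// Check every file and list the names of the ones that failed to parse.
function checkAll(files) {
  const syntax = files.map(syntaxCheck);
  const failures = syntax.filter((s) => !s.ok).map((s) => s.file);
  return { syntax, failures };
}
```

Because `--check` only parses, a file can pass here and still throw at runtime, which is the gap the boot smoke test is there to cover.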
verified_fail
Verify FAILED. Syntax failures: total.js; smoke check failed. Loop back to Phase 2.
7 files syntax-checked — failure: total.js
| File | Checker | Result |
|---|---|---|
| config.js | node --check | ✓ ok |
| crypto.js | node --check | ✓ ok |
| db.js | node --check | ✓ ok |
| exec.js | node --check | ✓ ok |
| render.js | node --check | ✓ ok |
| server.js | node --check | ✓ ok |
| total.js | node --check | ✗ failed |
✗ smoke boot failed — Process exited with code 1 inside 10s, likely crashed.
verified_pass
Phase 6 verified: 7 files syntax-checked (recent), smoke ok, 0 entry points listed.
7 files syntax-checked — 0 failures
| File | Checker | Result |
|---|---|---|
| config.js | node --check | ✓ ok |
| crypto.js | node --check | ✓ ok |
| db.js | node --check | ✓ ok |
| exec.js | node --check | ✓ ok |
| render.js | node --check | ✓ ok |
| server.js | node --check | ✓ ok |
| total.js | node --check | ✓ ok |
✓ smoke boot passed — Process still running after 10s (tree-killed), treated as ok (no immediate crash).
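The smoke rule reported above, where a process still running at the end of the window counts as a pass, can be sketched with `spawnSync` and its `timeout` option. This is an illustration under stated assumptions, not the shipped implementation: `smokeBoot` and `windowMs` are hypothetical names, treating a clean exit (code 0) inside the window as ok is this sketch's own choice, and the sketch stops only the direct child rather than tree-killing.

```javascript
const { spawnSync } = require('node:child_process');

// Boot `node <args>` and wait up to windowMs for it to exit.
function smokeBoot(args, { cwd = process.cwd(), windowMs = 10_000 } = {}) {
  const r = spawnSync(process.execPath, args, { cwd, timeout: windowMs, encoding: 'utf8' });
  // spawnSync reports an expired timeout as an error with code ETIMEDOUT.
  const timedOut = Boolean(r.error && r.error.code === 'ETIMEDOUT');
  // Assumption of this sketch: exiting with code 0 inside the window is also ok.
  const ok = timedOut || r.status === 0;
  return { ok, timed_out: timedOut, exit_code: r.status, stderr: r.stderr || '' };
}
```

For example, `smokeBoot(['server.js'], { cwd: stagingDir })` would hold a healthy server for the full window, which is why a run of several clean fixtures takes tens of seconds.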
godmode-plus verify (verbatim; temp path shown as <STAGING>, run id as <run_id>):
{
"state": "verified_fail",
"run_id": "<run_id>",
"narrate": "Verify FAILED. Syntax failures: total.js; smoke check failed. Loop back to Phase 2.",
"next": "check (fix the failures, then re-verify)",
"instructions": "Verify failed. Read data.syntax for files that did not parse and data.smoke for app boot failures.\nLoop back to Phase 2 for the broken pieces only, do not re-run the full protocol.\nAfter fixing, run `node bin/godmode-plus check`, `test`, then `verify` again.",
"data": {
"file_source": "recent",
"syntax": [
{
"file": "config.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "crypto.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "db.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "exec.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "render.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "server.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "total.js",
"ext": ".js",
"tool": "node --check",
"ok": false,
"exit_code": 1,
"stderr": "<STAGING>\\total.js:5\r\nfunction computeTotal(items {\r\n ^\r\n\r\nSyntaxError: Unexpected token '{'\r\n at wrapSafe (node:internal/modules/cjs/loader:1743:18)\r\n at checkSyntax (node:internal/main/check_syntax:76:3)\r\n\r\nNode.js v24.14.0\r\n"
}
],
"smoke": {
"applicable": true,
"script": "node server.js",
"ok": false,
"timed_out": false,
"exit_code": 1,
"stderr": "ppData\\Local\\Temp\\godmode-gate-demo-<stage>\\server.js:8\r\n throw new Error('Fatal: missingRequiredSetting is not configured, cannot boot');\r\n ^\r\n\r\nError: Fatal: missingRequiredSetting is not configured, cannot boot\r\n at Object.<anonymous> (<STAGING>\\server.js:8:9)\r\n at Module._compile (node:internal/modules/cjs/loader:1812:14)\r\n at Object..js (node:internal/modules/cjs/loader:1943:10)\r\n at Module.load (node:internal/modules/cjs/loader:1533:32)\r\n at Module._load (node:internal/modules/cjs/loader:1335:12)\r\n at wrapModuleLoad (node:internal/modules/cjs/loader:255:19)\r\n at Module.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:154:5)\r\n at node:internal/main/run_main_module:33:47\r\n\r\nNode.js v24.14.0\r\n",
"note": "Process exited with code 1 inside 10s, likely crashed."
},
"entry_points": [],
"changed_files": [
"config.js",
"crypto.js",
"db.js",
"exec.js",
"render.js",
"server.js",
"total.js"
]
}
}
{
"state": "verified_pass",
"run_id": "<run_id>",
"narrate": "Phase 6 verified: 7 files syntax-checked (recent), smoke ok, 0 entry points listed.",
"next": "polish",
"instructions": "Phase 7 of 7: Polish-Report.\n\nFinal pass. The application is verified working. Now leave it cleaner than you found it.\n\n1. Run `node bin/godmode-plus polish` to run the formatter / linter and write the run report.\n2. Manually trace the user flow listed in data.entry_points, confirm each new entry point does what the task asked.\n3. List any adjacent issues you noticed but did not fix.\n4. Then `node bin/godmode-plus end` for the final summary.",
"data": {
"file_source": "recent",
"syntax": [
{
"file": "config.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "crypto.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "db.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "exec.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "render.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "server.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "total.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
}
],
"smoke": {
"applicable": true,
"script": "node server.js",
"ok": true,
"timed_out": true,
"exit_code": null,
"stderr": "",
"note": "Process still running after 10s (tree-killed), treated as ok (no immediate crash)."
},
"entry_points": [],
"changed_files": [
"config.js",
"crypto.js",
"db.js",
"exec.js",
"render.js",
"server.js",
"total.js"
]
}
}
The free tier ships a lighter protocol (see ships_commands). It has no harden (security scan) and no smoke-boot verify, so it runs neither gate shown here. Each gate's free_tier_probe records the free binary answering "Unknown command" when asked. The free tier does have check, test and polish, so the delta is specifically these two paid-only gates, not "the free tier does nothing".
The free tier (godmode-lite v2.3.1) ships these commands:
start discover check test polish end status
| Paid gate | Free tier asked the same | Free binary answered |
|---|---|---|
| godmode harden | godmode-lite harden | Unknown command: harden |
| godmode-plus verify | godmode-lite verify | Unknown command: verify |
So the delta the paid tiers add here is specifically these two gates, recorded by asking the free binary to run them and capturing its own refusal.
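Recording a refusal of this kind amounts to running the binary with the command and classifying what comes back. The sketch below is a hedged illustration only: `probe` is a hypothetical helper, and `STAND_IN` is a tiny inline script that imitates a CLI knowing just the seven free-tier commands listed above, so the example runs without the real godmode-lite binary.

```javascript
const { spawnSync } = require('node:child_process');

// Ask a CLI to run `command` and record whether it recognised it.
function probe(bin, baseArgs, command) {
  const r = spawnSync(bin, [...baseArgs, command], { encoding: 'utf8' });
  const answer = `${r.stdout || ''}${r.stderr || ''}`.trim();
  return { command, available: !/^Unknown command\b/i.test(answer), answer };
}

// Hypothetical stand-in for the free binary: it knows only the commands the page lists.
const STAND_IN = `
const known = ['start', 'discover', 'check', 'test', 'polish', 'end', 'status'];
const cmd = process.argv[process.argv.length - 1];
if (!known.includes(cmd)) { console.error('Unknown command: ' + cmd); process.exit(1); }
console.log('ok: ' + cmd);
`;

// Usage: probe the stand-in for a paid-only gate and for a command it does ship.
const hardenProbe = probe(process.execPath, ['-e', STAND_IN], 'harden');
const checkProbe = probe(process.execPath, ['-e', STAND_IN], 'check');
```

Capturing the binary's own answer, rather than asserting its absence from a list, keeps the claim checkable against what actually shipped.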
Every defect was planted on purpose, so the grade is checkable. These are the 11 defects in the negative-control fixture and the gate each one must trip. The positive control is the same eight files with every one of them fixed.
| Location | Planted defect | CWE class | Gate that must catch it |
|---|---|---|---|
| config.js:4 | Hardcoded API key | CWE-798 | harden:hardcoded-secret |
| config.js:5 | Hardcoded password | CWE-798 | harden:hardcoded-secret |
| db.js:7 | SQL built from user input by string concatenation | CWE-89 | harden:sql-concat |
| exec.js:7 | Shell command built from user input | CWE-78 | harden:command-injection |
| render.js:6 | Untrusted value assigned to innerHTML | CWE-79 | harden:xss-innerhtml |
| render.js:10 | Untrusted value passed to React dangerouslySetInnerHTML | CWE-79 | harden:xss-react |
| crypto.js:7 | MD5 used to hash a password | CWE-328 | harden:weak-crypto-md5 |
| crypto.js:11 | SHA1 used to sign a payload | CWE-328 | harden:weak-crypto-sha1 |
| crypto.js:15 | Math.random used to mint a session token | CWE-338 | harden:weak-rng-secret |
| total.js:5 | Syntax error (malformed function signature) | not a CWE class | verify:syntax |
| server.js:8 | Throws on startup (crash on boot) | not a CWE class | verify:smoke |
Every planted security defect is tagged with its public CWE id (see the cwe field on each entry under fixtures.defective.planted and fixtures.blindspot.planted_missed). CWE is the vendor-neutral Common Weakness Enumeration catalogue at cwe.mitre.org. The point is to settle the "are these strawman defects the vendor invented to be easy?" question without asking anyone to trust us: the defects map to canonical weakness classes (CWE-798 hardcoded credentials, CWE-89 SQL injection, CWE-78 OS command injection, CWE-79 XSS, CWE-328 weak hash, CWE-338 weak PRNG), and each label is verifiable by reading the few-line fixture file against the public CWE definition. The blindspot path traversal is CWE-22, a famous class harden deliberately does not cover. The three correctness defects (syntax error, crash on boot, discount logic bug) carry no CWE because they are not security weaknesses.
Clean fixture: The same eight files with every defect fixed: env-var config, parameterised SQL, execFile with an argument array, textContent and escaping React markup, SHA-256 and HMAC with a CSPRNG, a valid function, and a server that boots and stays up.
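The fixes named above follow well-known patterns, sketched here with Node built-ins. This is not the clean fixture's source: every function name, the `EXAMPLE_API_KEY` variable, and the `?` placeholder style are hypothetical, chosen only to show the shape of each fix next to the defect it replaces.

```javascript
const crypto = require('node:crypto');
const { execFile } = require('node:child_process');

// Config from the environment instead of a string literal (variable name is hypothetical).
const apiKey = process.env.EXAMPLE_API_KEY;

// Parameterised SQL: the id travels as a bound value, never as part of the query text.
// The `?` placeholder is an assumption; database drivers differ.
function userQuery(id) {
  return { text: 'SELECT * FROM users WHERE id = ?', values: [id] };
}

// execFile with an argument array: no shell ever parses userPath.
function removePath(userPath, callback) {
  return execFile('rm', ['-rf', '--', userPath], callback);
}

// SHA-256 digest and HMAC-SHA256 signature in place of MD5 and SHA1.
function digest(data) {
  return crypto.createHash('sha256').update(data).digest('hex');
}
function sign(payload, key) {
  return crypto.createHmac('sha256', key).update(payload).digest('hex');
}

// Session token from the CSPRNG instead of Math.random.
function mintToken() {
  return crypto.randomBytes(32).toString('hex');
}

// textContent assigns text, so markup in userContent is never parsed as HTML.
function renderText(element, userContent) {
  element.textContent = userContent;
}
```

Each function changes how untrusted input reaches a sink rather than trying to sanitise the input itself, which is also why a pattern scan that flagged the defective line stays quiet on the fixed one.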
The runner carries the answer key and grades the captured output against it. If any expected outcome were missing, the verdict below would read false and the failing row would show it. 19 of 19 checks pass.
| | Check | Expected | Actual |
|---|---|---|---|
| ✓ | godmode harden: defective fixture is rejected | hardened_dirty | hardened_dirty |
| ✓ | godmode harden: clean fixture passes | hardened_clean | hardened_clean |
| ✓ | godmode-plus verify: defective fixture is rejected | verified_fail | verified_fail |
| ✓ | godmode-plus verify: clean fixture passes | verified_pass | verified_pass |
| ✓ | harden catches config.js:4 (hardcoded-secret) | found at file:line | found at file:line |
| ✓ | harden catches config.js:5 (hardcoded-secret) | found at file:line | found at file:line |
| ✓ | harden catches db.js:7 (sql-concat) | found at file:line | found at file:line |
| ✓ | harden catches exec.js:7 (command-injection) | found at file:line | found at file:line |
| ✓ | harden catches render.js:6 (xss-innerhtml) | found at file:line | found at file:line |
| ✓ | harden catches render.js:10 (xss-react) | found at file:line | found at file:line |
| ✓ | harden catches crypto.js:7 (weak-crypto-md5) | found at file:line | found at file:line |
| ✓ | harden catches crypto.js:11 (weak-crypto-sha1) | found at file:line | found at file:line |
| ✓ | harden catches crypto.js:15 (weak-rng-secret) | found at file:line | found at file:line |
| ✓ | verify catches total.js (syntax) | syntax failure | syntax failure |
| ✓ | verify catches server.js (crash on boot) | smoke failure | smoke failure |
| ✓ | boundary: godmode harden stays silent on the out-of-scope blindspot defect (scoped, not catch-all) | hardened_clean | hardened_clean |
| ✓ | boundary: godmode-plus verify stays silent on the out-of-scope blindspot defect (scoped, not catch-all) | verified_pass | verified_pass |
| ✓ | free tier (godmode-lite) does not ship harden | unavailable | unavailable |
| ✓ | free tier (godmode-lite) does not ship verify | unavailable | unavailable |
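The grading rule behind the table above, a plain string comparison per check with a verdict that is true only when every check passes, can be sketched as follows. The `grade` function and the shape of its inputs are hypothetical; the real runner's answer-key format is not shown on this page.

```javascript
// answerKey: array of { name, expected }; actual: object mapping check name to observed value.
function grade(answerKey, actual) {
  const checks = answerKey.map(({ name, expected }) => {
    // A check with no recorded outcome is graded as 'missing', which never equals an expectation here.
    const got = Object.hasOwn(actual, name) ? actual[name] : 'missing';
    return { name, expected, actual: got, pass: got === expected };
  });
  const passed = checks.filter((c) => c.pass).length;
  return { checks, passed, total: checks.length, verdict: passed === checks.length };
}
```

Because the comparison is exact string equality, there is nothing for a model to interpret: a missing or renamed state flips the verdict to false and the failing row identifies which check did it.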
A third fixture (the blindspot control) plants two real, working defects that are deliberately out of each gate's scope: a path traversal (not one of harden's pattern rules) and a boot-clean logic bug (verify checks syntax and crash-on-boot, not whether the math is right). A truthful proof has to show the edges, not just the catches: the gates are supposed to pass both, and they do. This shows what each gate catches and what it does not, so "catches what it claims" cannot be misread as "catches everything".
| Location | Real defect | CWE class | Gate that misses it | Why it is out of scope |
|---|---|---|---|---|
| lookup.js:12 | Path traversal: a user-controlled filename is joined to a base dir and read with no containment check | CWE-22 | harden | Path traversal (CWE-22) is a real, canonical weakness, and it is deliberately NOT one of harden's eight pattern rules, so the static scan stays clean. The miss is the point: the scan is scoped, not a guarantee of no vulnerabilities. |
| pricing.js:8 | Logic bug: applyDiscount returns the discount amount, not the discounted price | not a CWE class | verify | verify checks syntax and crash-on-boot; the file parses and the server boots, so a wrong-but-valid computation passes. This is a correctness bug, not a security weakness, so it carries no CWE. |
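Both blindspot defects are easy to state in code, which helps show why neither gate is positioned to see them. The sketch below is not the fixture source: `resolveInside` is a hypothetical containment check of the kind the table says lookup.js lacks, and the two `applyDiscount` variants assume a `(price, percent)` signature that the page does not specify.

```javascript
const path = require('node:path');

// Containment check: resolve the user-supplied name and refuse anything outside baseDir.
function resolveInside(baseDir, userName) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userName);
  const rel = path.relative(base, target);
  if (rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel)) {
    throw new Error('path escapes base directory');
  }
  return target;
}

// The logic bug as described: this returns the discount amount.
function applyDiscountBuggy(price, percent) {
  return price * percent / 100;
}

// The intended behaviour: return the discounted price.
function applyDiscount(price, percent) {
  return price - price * percent / 100;
}
```

The buggy variant parses and runs without error, so only a test that asserts on the returned value, for example expecting 80 for a 20 percent discount on 100, would distinguish it from the correct one.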
Run against that fixture, both gates stay silent, exactly as scope predicts:
godmode harden on the blindspot fixture: hardened_clean
Layer 4 complete. No security/quality red flags across 4 files.
godmode-plus verify on the blindspot fixture: verified_pass
Phase 6 verified: 3 files syntax-checked (recent), smoke ok, 0 entry points listed.
godmode harden (verbatim):
{
"state": "hardened_clean",
"run_id": "<run_id>",
"narrate": "Layer 4 complete. No security/quality red flags across 4 files.",
"next": "alternatives",
"instructions": "Layer 4 hardening complete. Triage data.findings with judgment, these are heuristic flags, not all true positives.\n\nSeverity ladder for each finding:\n1. Confirm the pattern is real (read the file, not just the matched line).\n2. If real and exploitable: fix immediately.\n3. If real but not exploitable in context: leave a brief justification comment.\n4. If a false positive: ignore.\n\nAfter triage, run `node bin/godmode alternatives` to begin Layer 5.",
"data": {
"scanned_files": 4,
"total_findings": 0,
"by_pattern": {},
"findings": []
}
}
godmode-plus verify (verbatim):
{
"state": "verified_pass",
"run_id": "<run_id>",
"narrate": "Phase 6 verified: 3 files syntax-checked (recent), smoke ok, 0 entry points listed.",
"next": "polish",
"instructions": "Phase 7 of 7: Polish-Report.\n\nFinal pass. The application is verified working. Now leave it cleaner than you found it.\n\n1. Run `node bin/godmode-plus polish` to run the formatter / linter and write the run report.\n2. Manually trace the user flow listed in data.entry_points, confirm each new entry point does what the task asked.\n3. List any adjacent issues you noticed but did not fix.\n4. Then `node bin/godmode-plus end` for the final summary.",
"data": {
"file_source": "recent",
"syntax": [
{
"file": "lookup.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "pricing.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
},
{
"file": "server.js",
"ext": ".js",
"tool": "node --check",
"ok": true,
"exit_code": 0,
"stderr": ""
}
],
"smoke": {
"applicable": true,
"script": "node server.js",
"ok": true,
"timed_out": true,
"exit_code": null,
"stderr": "",
"note": "Process still running after 10s (tree-killed), treated as ok (no immediate crash)."
},
"entry_points": [],
"changed_files": [
"lookup.js",
"pricing.js",
"server.js"
]
}
}
This is why the verdict above counts the blindspot passes as checks too: "catches what it claims" only means something if "does not catch everything" is also on the record. This bounds the claim.
With the Godmode skills installed under ~/.claude/skills/, plus Node and git on PATH, run the runner from the repo root. It copies each fixture into an isolated, git-initialised temp directory, drives the real CLIs, and rewrites this page's data file (about half a minute: the smoke check holds each cleanly-booting server for its full ten-second window):
node scripts/gate-demo/run-gates.js
Runner, fixtures, and a full README live in the repo under scripts/gate-demo/. The machine-readable data behind this page is at /showcase/data/gate-demo.json. Environment of the recorded run: Node v24.14.0, win32 x64.
Tamper check: this page was built from a data file whose SHA-256 is 43969873747fe9786b753482aeb7c3cfa86eda7995d14f9052711a580bbce99e. Fetch the JSON and run sha256sum on it to confirm you are reading the same bytes this page renders.
start then gate commands.
harden is a static heuristic scan, not a guarantee of no vulnerabilities. It catches the defect classes it knows about and labels its findings as candidates to triage. This proves it catches what it claims to, not that it catches everything.
verify's smoke check treats "still running after ten seconds" as a pass. It catches a crash on boot, not a logic bug that boots fine.
This is the judge-free companion to the independent blind review. The review asks whether the output is better; this asks whether the gates do what the product says they do. Neither involves a human or an outside lab grading the result, and both reproduce from the repo.