How Krait Achieves 100% Precision Across 50 Blind Code4rena Contests
The single biggest reason teams stop trusting AI security tools is false positives. Krait was built around eight kill gates that filter every candidate finding before it reaches the report.
TL;DR
- The single biggest reason teams stop trusting AI auditing tools is false positives. You run a scan, get back 200 findings, manually verify each one, find that 195 are noise. The next time, you don't run the scan.
- Krait, the AI auditor built by Zealynx Security and used as a reference implementation in the Academy's AI Auditor Builder pillar, achieves 100% precision across 50 blind Code4rena contests.
- The mechanism is eight kill gates. Every candidate finding has to survive eight specific structural checks before it gets reported.
- Each gate eliminates a known false-positive class: pattern triggers without semantic confirmation, findings that fold under a devil's-advocate counterargument, bugs the static analyzer should have caught but didn't, low-confidence findings, and four more.
- The trade-off is recall vs precision. Krait reports fewer findings per scan than a permissive tool would, but every finding is actionable. Engineering time goes to fixes, not triage.
Why precision is the bottleneck for AI auditing
If you've ever ranked AI security tools by their public benchmarks, you've seen claims like "92% recall on Code4rena contest X" or "found 11 of 13 bugs in a blind audit". These numbers are usually presented in isolation, without their precision counterpart. They're misleading.
A tool that finds 11 of 13 real bugs but also reports 80 false positives has about 85% recall (11/13) and 12% precision (11/91). To act on its output, an engineer has to manually review 91 candidate findings, find the 11 real ones, and discard the rest. The cost of that triage frequently exceeds the cost of the original audit.
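Working the numbers through for the 11-of-13 example makes the gap concrete. The helper below is a minimal sketch; the function name and argument names are illustrative, not part of any tool.

```python
def precision_recall(true_positives, false_positives, total_real_bugs):
    """Return (precision, recall) for one audit run."""
    reported = true_positives + false_positives
    # Precision: what share of reported findings are real.
    precision = true_positives / reported if reported else 0.0
    # Recall: what share of the real bugs were reported.
    recall = true_positives / total_real_bugs if total_real_bugs else 0.0
    return precision, recall


# 11 real bugs found out of 13, alongside 80 false positives.
precision, recall = precision_recall(11, 80, 13)
print(f"precision={precision:.1%} recall={recall:.1%}")
# precision=12.1% recall=84.6%
```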
Recall is easy to optimize for: lower the threshold for what counts as a finding, run more permissive prompts, ask the model to flag anything that looks suspicious. Precision is hard. It requires the model (and the surrounding system) to justify each finding before it reports it.
Most AI auditors optimize for recall. The result is the noise problem teams complain about.
Krait was built around the inverse constraint: precision is locked at 100% on the validation set. Recall is whatever it ends up being. The thinking is that a tool you can trust at 100% precision is worth running on every commit; a tool at 12% precision is worth running once before you stop.
The 50-contest validation set
The "100% across 50 blind Code4rena contests" claim refers to a specific evaluation. Krait was run against 50 Code4rena contest codebases that the model had not seen during development. Every reported finding was hand-verified against the contest's public bug list. Across all 50 contests, every single Krait-reported finding matched a real, documented vulnerability.
That doesn't mean Krait found every bug. It means everything it reported was real.
Concretely, Krait's recall on that validation set is roughly 30-50% depending on the contest. It misses easy bugs. It misses entire bug classes when its pattern library doesn't cover them. It would lose head-to-head against a permissive tool on raw bug count.
But the engineering time saved on triage compounds. A team running Krait on every PR doesn't have to triage anything. Every Krait finding goes straight to a fix-or-explain decision. Over the lifecycle of a codebase, those triage savings are enormous.
What an "AI audit kill gate" actually is
A kill gate is a specific structural check that a candidate finding has to pass before being reported. If the finding fails any single gate, it's discarded.
This is different from how most AI auditing tools work. The default approach is:
- Run a detection prompt over the codebase.
- Collect everything the model flags as suspicious.
- Optionally apply a confidence threshold.
- Report the survivors.
The default approach has one filter (confidence threshold) and lets the model's prompt be the primary truth. Krait inverts this:
- Run detection prompts to generate candidates.
- Apply eight independent kill gates to each candidate.
- Only candidates that survive all eight are reported.
Each gate is built to eliminate a specific failure mode. Together they cut false positives by an order of magnitude or more.
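The inverted flow can be sketched as a loop in which a candidate is dropped at the first gate it fails. This is an illustration under assumptions, not Krait's actual code: the `Candidate` fields, the `run_gates` helper, and the two toy gates are all hypothetical, and only two of the eight gates are shown to keep it short.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    title: str
    confidence: float
    has_impact: bool
    killed_by: Optional[str] = None  # name of the gate that rejected it, if any


# A gate returns True when the candidate survives.
Gate = Callable[[Candidate], bool]


def run_gates(candidates: list[Candidate], gates: list[tuple[str, Gate]]):
    """Apply every gate in order; one failure discards the candidate."""
    survivors, rejected = [], []
    for cand in candidates:
        for name, survives in gates:
            if not survives(cand):
                cand.killed_by = name
                rejected.append(cand)
                break
        else:
            # No gate failed, so the candidate is reported.
            survivors.append(cand)
    return survivors, rejected


# Two toy gates standing in for the full set.
GATES = [
    ("confidence", lambda c: c.confidence >= 0.7),
    ("severity-impact", lambda c: c.has_impact),
]

candidates = [
    Candidate("Reentrancy in withdraw()", 0.91, True),
    Candidate("Possible overflow in fee math", 0.55, True),
    Candidate("Suspicious admin setter", 0.80, False),
]
survivors, rejected = run_gates(candidates, GATES)
print([c.title for c in survivors])
print([(c.title, c.killed_by) for c in rejected])
```

Recording `killed_by` on each rejected candidate costs nothing here and pays off later when tuning the gates.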
The eight kill gates
Disclosure: the exact gate set and their weights have been refined over time and are documented in the Krait codebase. The categories below describe the structural classes of kill gate, not necessarily eight specific named filters. The Academy's AI Auditor Builder pillar walks through the implementation in detail.
1. Semantic confirmation
A finding generated by pattern matching gets killed unless a separate semantic-analysis pass confirms the bug. Pattern matching is fast and broad; semantic analysis is slow and narrow. If a pattern triggers ("this looks like reentrancy") but the semantic pass can't reproduce the actual exploit ("the function isn't externally callable in this context"), the finding is dropped.
This single gate eliminates the largest class of false positives in pattern-driven auditors: pattern triggers on superficial code shapes that aren't actually exploitable.
2. Devil's-advocate counterargument
Each finding has to survive an explicit attempt to refute it. The model is prompted, separately, with: "here is a candidate finding. Argue the strongest case that this is NOT a bug." If the counterargument succeeds in finding a defense the original detection missed, the finding is killed.
This eliminates findings where the model has the right intuition but the wrong specific exploit path. Often the bug it senses exists, but elsewhere; the specific reported instance is defended by code the model didn't read carefully enough.
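One way to wire this gate up is sketched below. The `ask_model` callable is a hypothetical stand-in for whatever LLM client you use, and the `VERDICT:` line is an assumed output convention added so the reply can be parsed; neither comes from Krait itself.

```python
REFUTE_PROMPT = (
    "Here is a candidate finding. Argue the strongest case that this is NOT a bug.\n"
    "End with a single line: VERDICT: REFUTED or VERDICT: STANDS.\n\n"
    "Finding:\n{finding}\n\nRelevant code:\n{code}\n"
)


def survives_devils_advocate(finding, code, ask_model):
    """Return True only if the refutation attempt explicitly concedes.

    ask_model is any callable that takes a prompt string and returns the
    model's reply as a string (hypothetical interface).
    """
    reply = ask_model(REFUTE_PROMPT.format(finding=finding, code=code))
    lines = reply.strip().splitlines()
    verdict = lines[-1].strip().upper() if lines else ""
    # Fail closed: an empty or malformed reply kills the finding too.
    return verdict == "VERDICT: STANDS"


# Stub models so the sketch runs without a real LLM.
def stub_refuter(prompt):
    return "The call is guarded by nonReentrant.\nVERDICT: REFUTED"


def stub_conceder(prompt):
    return "No defense found in the surrounding code.\nVERDICT: STANDS"


print(survives_devils_advocate("Reentrancy in withdraw()", "...", stub_refuter))
print(survives_devils_advocate("Reentrancy in withdraw()", "...", stub_conceder))
```

Failing closed on unparseable replies trades a little recall for precision, which matches the priority the rest of the design sets.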
3. Static-analyzer cross-check
Findings in classes that mature static analyzers (Slither, Aderyn, Mythril) handle reliably get killed if the static analyzer doesn't independently flag them. The reasoning is: if Slither doesn't see a reentrancy here, and Slither has been tuned for years to catch reentrancy, the AI's "reentrancy" finding is probably noise.
This gate reduces false positives in well-charted bug classes without losing recall on novel ones (since static analyzers can't catch the novel ones anyway).
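A minimal sketch of the cross-check, assuming the analyzer output has already been normalized into simple records. The `bug_class` labels, the record fields, and the line tolerance are assumptions made for illustration; mapping real Slither or Aderyn detector names onto such labels is left out.

```python
# Hypothetical labels for classes where static analyzers are considered reliable.
WELL_CHARTED = {"reentrancy", "unchecked-call"}
LINE_TOLERANCE = 5  # assumed slack between AI-reported and analyzer-reported lines


def survives_cross_check(candidate, analyzer_hits):
    """Kill well-charted findings that the static analyzer did not also flag."""
    if candidate["bug_class"] not in WELL_CHARTED:
        # Novel classes pass through: the analyzer can't vouch either way.
        return True
    return any(
        hit["bug_class"] == candidate["bug_class"]
        and hit["file"] == candidate["file"]
        and abs(hit["line"] - candidate["line"]) <= LINE_TOLERANCE
        for hit in analyzer_hits
    )


analyzer_hits = [{"bug_class": "reentrancy", "file": "Vault.sol", "line": 120}]

confirmed = {"bug_class": "reentrancy", "file": "Vault.sol", "line": 118}
unconfirmed = {"bug_class": "reentrancy", "file": "Router.sol", "line": 40}
novel = {"bug_class": "rounding-drift", "file": "Router.sol", "line": 40}

print(survives_cross_check(confirmed, analyzer_hits))
print(survives_cross_check(unconfirmed, analyzer_hits))
print(survives_cross_check(novel, analyzer_hits))
```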
4. Confidence threshold
The model assigns a confidence score to each candidate. Findings below a threshold (currently 0.7 in the Academy reference) are killed. This is the standard filter most AI auditors implement, and it's necessary but insufficient on its own.
5. Severity-impact match
Each finding has to demonstrate a concrete impact (loss of funds, locked funds, loss of access, etc.). Findings classified as Critical or High but unable to articulate the impact in a single sentence get killed or downgraded. This catches the "this looks scary but I can't say what breaks" failure mode.
6. Trust-assumption alignment
Findings that depend on already-fully-trusted actors (governance multisig, DAO, admin) doing something malicious get killed unless the protocol's stated trust assumptions explicitly DON'T grant the actor that capability. This catches the "admin can do anything, so technically..." failure mode that pads precision-poor audit reports.
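The rule is easy to get backwards, so here is a small sketch of it. The actor names, capability strings, and the `EXPLICITLY_DENIED` table are hypothetical; in practice that table would be derived from the protocol's documented trust assumptions.

```python
# Hypothetical trust model: capabilities the protocol docs explicitly deny
# to each fully trusted actor. Silence in the docs does not count as denial.
EXPLICITLY_DENIED = {
    "admin": {"seize_user_funds"},
    "governance": set(),
}


def survives_trust_check(finding, denied=EXPLICITLY_DENIED):
    """Keep trusted-actor findings only when the capability is explicitly denied."""
    actor = finding.get("malicious_actor")
    if actor not in denied:
        # The finding doesn't rely on a trusted actor misbehaving; gate doesn't apply.
        return True
    return finding["capability"] in denied[actor]


print(survives_trust_check({"malicious_actor": "admin", "capability": "seize_user_funds"}))
print(survives_trust_check({"malicious_actor": "admin", "capability": "pause_protocol"}))
print(survives_trust_check({"malicious_actor": None, "capability": "drain_pool"}))
```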
7. Concrete-trigger check
For findings that claim to be exploitable, the model has to provide a concrete sequence of transactions and inputs that triggers the bug. If it can't, the finding is downgraded to "code smell" or killed. This catches findings that describe a theoretical issue without demonstrating an actual exploit path.
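A structural version of this check is sketched below, assuming each exploit step is recorded as a caller, a function, and its arguments. The `Step` type and the `classify_by_trigger` name are illustrative. Note that the sketch only verifies that a complete sequence was supplied; a stricter version would execute the sequence against a test harness, which is outside this example.

```python
from dataclasses import dataclass


@dataclass
class Step:
    caller: str          # who sends the transaction, e.g. "attacker"
    function: str        # which function is called
    args: tuple = ()     # concrete inputs


def classify_by_trigger(claims_exploitable: bool, steps: list[Step]) -> str:
    """Return "report" or "code smell" depending on whether a trigger is given."""
    if not claims_exploitable:
        # The gate only applies to findings that claim exploitability.
        return "report"
    complete = bool(steps) and all(s.caller and s.function for s in steps)
    return "report" if complete else "code smell"


with_path = [
    Step("attacker", "deposit", (1,)),
    Step("attacker", "withdraw", (1,)),
]
print(classify_by_trigger(True, with_path))
print(classify_by_trigger(True, []))
```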
8. Duplicate detection
Findings that duplicate each other (same root cause, same fix) are merged. Findings that duplicate well-known issues already documented in the codebase's history get a "previously reported" tag. The audit report ends up with one entry per actual vulnerability, not N entries for the same bug seen N different ways.
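Merging can be as simple as grouping on a root-cause key. The sketch below assumes each finding already carries normalized `root_cause` and `fix` labels, which is the hard part in practice; the field names are hypothetical.

```python
def merge_duplicates(findings):
    """Collapse findings that share a root cause and a fix into one entry."""
    merged = {}
    for f in findings:
        key = (f["root_cause"], f["fix"])
        if key not in merged:
            # First sighting: copy the finding and start a list of locations.
            entry = {k: v for k, v in f.items() if k != "location"}
            entry["locations"] = [f["location"]]
            merged[key] = entry
        else:
            # Same bug seen elsewhere: record the extra location only.
            merged[key]["locations"].append(f["location"])
    return list(merged.values())


findings = [
    {"root_cause": "missing-reentrancy-guard", "fix": "add-guard", "location": "Vault.withdraw"},
    {"root_cause": "missing-reentrancy-guard", "fix": "add-guard", "location": "Vault.redeem"},
    {"root_cause": "stale-oracle-price", "fix": "check-timestamp", "location": "Pricer.quote"},
]
report = merge_duplicates(findings)
print(len(report))
print(report[0]["locations"])
```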
What it costs in recall
The eight gates don't come free. Each one eliminates real findings as well as fake ones. Krait's recall on the 50-contest validation set is between 30% and 50% depending on the contest. The misses fall into a few categories:
- Novel bug classes: if the pattern library doesn't have a template, the bug isn't generated as a candidate.
- Cross-function state bugs: hard to detect via single-function analysis, especially if the chain spans 3+ functions.
- Economic/incentive bugs: hard to detect at the code level; require domain knowledge about specific protocol economics.
- Bugs requiring external context: bugs where the exploit requires a specific oracle, governance state, or token configuration not present in the source.
For a security team, the mental model is: Krait is a reliable first pass that catches the well-charted vulnerabilities cleanly. After Krait, you still need a human auditor for the novel and economic bugs. But the human's time is now spent on the genuinely hard cases, not on triaging Krait's false positives.
The pattern is reusable
Even if you don't use Krait directly, the pattern transfers. Any AI auditing tool can be hardened by adding kill gates. The Academy's AI Auditor Builder walks through implementing each gate in turn:
- Step 4 covers the multi-mindset detection pattern that generates candidates.
- Step 7 covers the verification ladder, where kill gates live.
- Step 8 covers tool integration (Slither cross-check is a kill gate that requires a static analyzer hookup).
- Step 9 covers the output format and how to surface the gate that killed each rejected candidate (useful for tuning).
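For the tuning use case in the last bullet, a tally of rejections per gate is often enough to start with. This is a sketch under assumptions: the `(title, gate_name)` pair format and the gate names are made up for the example.

```python
from collections import Counter


def kill_report(rejected):
    """Count rejections per gate from (candidate_title, gate_name) pairs."""
    return Counter(gate for _, gate in rejected)


rejected = [
    ("Possible overflow in fee math", "confidence"),
    ("Admin can drain pool", "trust-assumption"),
    ("Reentrancy in Router.swap", "static-analyzer"),
    ("Unbounded loop in claim()", "confidence"),
]
tally = kill_report(rejected)
# Gates listed from most to fewest kills.
for gate, count in tally.most_common():
    print(f"{gate}: {count}")
```

A gate that dominates the tally is worth inspecting first: it is either doing most of the filtering work or killing findings it shouldn't.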
If you build your own auditor with kill gates, expect:
- The first pass will report a lot fewer findings than your previous auditor did.
- The findings it does report will mostly be real.
- You'll discover real bugs in your kill gate logic itself when it kills findings that should have survived. This is a feature: it surfaces gaps in your detection, not just gaps in the protocol you're auditing.
Related questions
Is Krait open source? Yes. The Academy AI Auditor Builder pillar uses it as a reference implementation. Each step of the curriculum builds toward a Krait-style auditor.
Why 100% precision specifically as a target, not 95% or 98%? 100% means every finding is actionable. Below 100%, engineers have to triage every finding to figure out which are real, which restores the cost the kill gates were designed to eliminate. The cliff at 100% is sharp: a tool at 95% precision is not 95% as useful as one at 100%; it's roughly half as useful.
Doesn't 30-50% recall make Krait worse than a manual auditor? A manual auditor's recall on a Code4rena contest also varies widely (often 50-70%). Krait isn't a replacement for human auditors; it's a first-pass tool that lets human auditors focus on the bugs Krait misses. The combination beats either alone.
Can the kill gate pattern be applied to existing AI auditing tools? Yes, and this is one of the most underused improvements in the space. Adding even three gates to a permissive auditor can take its precision from 15% to 50% with modest engineering effort.
What about adversarial codebases that target specific kill gates? This is a real concern. A protocol that knows the gates can structure its code to either pass them all (hiding bugs) or fail them all (creating noise). Krait's mitigation is gate diversity: the eight gates check orthogonal properties, so passing all eight requires the code to be defensible from many angles.
Try the pattern in Academy
The AI Auditor Builder pillar at Zealynx Academy walks through implementing each kill gate in your own auditor over 12 steps. Free, no signup until you want your built skill submitted to the Arena leaderboard.