AI Security Research → OWASP, NIST, and MITRE Standards Mapping

Last updated: 2026-03-31. A cross-reference between original AI security research and the frameworks practitioners use to assess, govern, and defend AI systems. Security practitioners work within frameworks — the Open Web Application Security Project (OWASP), the National Institute of Standards and Technology (NIST), MITRE — but researchers often publish findings without connecting them to those frameworks, leaving practitioners to do the mapping themselves (or never find the research at all). ...

March 31, 2026 · 11 min · Rex Coleman
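As a sketch of what the mapping above might look like in code, a research finding could be keyed to the framework entries practitioners search by. The finding name and the ATLAS label here are illustrative assumptions, not entries from the post, and IDs should be checked against the current framework versions:

```python
# Illustrative mapping from a research finding to framework entries.
# The finding key and the ATLAS label are hypothetical placeholders;
# "LLM01: Prompt Injection" is the OWASP Top 10 for LLM Applications ID.
MAPPING = {
    "multi-agent cascade poisoning": {
        "owasp_llm_top10": "LLM01: Prompt Injection",
        "mitre_atlas": "LLM Prompt Injection",
    },
}

def frameworks_for(finding: str) -> dict:
    """Look up every framework entry a finding maps to."""
    return MAPPING.get(finding, {})
```

The point of the structure is discoverability: a practitioner searching by framework ID can land on the research, rather than the other way around.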

Our Simulation Was Wrong by 37 Percentage Points — What Real LLM Agents Taught Us About Multi-Agent Cascade

I built a multi-agent security simulation, ran 6 experiments, then validated against real Claude Haiku agents. The simulation predicted a 97% poison rate; real agents showed 60%. And the biggest surprise: topology matters — something the simulation said was irrelevant. What I built: a simulation-based testbed that models multi-agent systems with configurable trust architectures, network topologies, attacker types, and agent compositions. One agent gets compromised, and we measure how poisoned outputs cascade through the system. ...

March 20, 2026 · 5 min · Rex Coleman
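The cascade measurement described above could be sketched as follows. The mechanics are assumptions, not the post's actual testbed: a fixed per-message acceptance probability stands in for the trust architecture, and the topology is a plain edge list:

```python
import random

def simulate_cascade(edges, compromised, accept_prob, n_agents, trials=1000, seed=0):
    """Estimate the poison rate: the average fraction of agents that end up
    relaying poisoned output after one agent is compromised.

    edges       -- directed (src, dst) communication links (the topology)
    accept_prob -- assumed probability an agent accepts a poisoned message
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        poisoned = {compromised}
        frontier = [compromised]
        while frontier:                      # breadth-first spread of poison
            nxt = []
            for src in frontier:
                for s, dst in edges:
                    if s == src and dst not in poisoned and rng.random() < accept_prob:
                        poisoned.add(dst)
                        nxt.append(dst)
            frontier = nxt
        total += len(poisoned) / n_agents
    return total / trials
```

Running this over a chain versus a fully connected graph with the same `accept_prob` is exactly the kind of comparison where topology shows up as a variable.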

Your AI Makes SQL Injection Worse: CWE-Stratified Patch Safety for LLM Code Generation

LLM-generated security patches have a 42% fix rate and a 10% regression rate — but the aggregate hides a dangerous pattern. SQL injection patches are net-negative: 0% fix rate, 50% regression. The model recognizes the vulnerability but its rewrites introduce new injection vectors. Cryptography patches, by contrast, hit 100% fix rate with 0% regression. I tested Claude Haiku generating patches for 50 vulnerable code snippets across 5 CWE categories, using static analysis to measure both fix rate and regression rate. ...

March 20, 2026 · 5 min · Rex Coleman
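The per-CWE stratification above could be computed with something like this; the record schema (`cwe`, `fixed`, `regressed` flags from static analysis) is an assumption for illustration:

```python
from collections import defaultdict

def rates_by_cwe(results):
    """Aggregate per-snippet static-analysis verdicts into per-CWE rates.

    results -- list of dicts with keys 'cwe', 'fixed' (bool), 'regressed'
               (bool), one per patched snippet.
    Returns {cwe: (fix_rate, regression_rate)}.
    """
    buckets = defaultdict(list)
    for r in results:
        buckets[r["cwe"]].append(r)
    return {
        cwe: (sum(r["fixed"] for r in rs) / len(rs),
              sum(r["regressed"] for r in rs) / len(rs))
        for cwe, rs in buckets.items()
    }
```

Stratifying like this is what surfaces a net-negative category (high regression, low fix) that an aggregate fix rate would hide.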

How Many Rewrites to Strip a Watermark? Empirical Paraphrase-Removal Curves for LLM Watermarks

Cross-model paraphrasing drops Kirchenbauer watermark detection from 100% to 60% in a single pass. After ten passes, it plateaus at 40%. The watermark is partially robust — but not enough for adversarial settings where the attacker has access to any LLM. I measured this by watermarking text with GPT-2, paraphrasing with Claude Haiku, and tracking how the z-score decays. Five experiments. Six pre-registered hypotheses. Real green-list watermarking with logit access. ...

March 20, 2026 · 6 min · Rex Coleman
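The z-score the entry above tracks is the Kirchenbauer et al. detection statistic, z = (|s|_G − γT) / √(Tγ(1−γ)), where |s|_G counts green-list tokens out of T scored tokens and γ is the green-list fraction. A minimal sketch follows; the hash-based green-list membership is a toy stand-in (the real scheme seeds an RNG with the previous token and partitions the model's vocabulary via logit access):

```python
import hashlib
import math

def in_green_list(prev_token, token, gamma=0.25):
    """Toy green-list membership: hash the (prev, token) pair and keep
    the bottom gamma fraction of the hash range."""
    h = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return int.from_bytes(h[:4], "big") / 2**32 < gamma

def watermark_z_score(tokens, gamma=0.25):
    """Kirchenbauer detection statistic:
    z = (|s|_G - gamma*T) / sqrt(T * gamma * (1 - gamma))."""
    T = len(tokens) - 1  # each token after the first is scored
    green = sum(in_green_list(p, t, gamma) for p, t in zip(tokens, tokens[1:]))
    return (green - gamma * T) / math.sqrt(T * gamma * (1 - gamma))
```

Paraphrasing attacks the numerator: each rewrite replaces green tokens with non-green synonyms, so |s|_G drifts toward the γT chance level and z decays toward 0.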

Privilege Escalation Cascades at 98% While Domain-Aligned Attacks Are Invisible

Domain-aligned prompt injections cascade through multi-agent systems at a 0% detection rate. Privilege escalation payloads hit 97.6%. That’s a 98 percentage-point spread across payload types in the same agent architecture — the single biggest variable determining whether your multi-agent system catches an attack or never sees it. I ran six experiments on real Claude Haiku agents to find out why. Three resistance patterns explain the gap — and each has a quantified bypass condition. ...

March 20, 2026 · 5 min · Rex Coleman

Your AI Can't Beat EPSS at Vulnerability Triage (But the Ensemble Might)

Can an LLM agent prioritize vulnerabilities better than EPSS? Every security team drowning in CVEs wants to know whether AI can help them triage faster. We tested this empirically: Claude Haiku as a vulnerability triage agent, ranked against EPSS, CVSS, and random baselines, with CISA KEV as ground truth for “actually exploited.” The short answer: no, the agent doesn’t beat EPSS. But the longer answer is more interesting. ...

March 20, 2026 · 5 min · Rex Coleman
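The evaluation setup the entry describes — rankings scored against CISA KEV as ground truth — could be sketched like this; `precision_at_k` and the mean-rank ensemble are illustrative choices, not necessarily the post's exact metric or fusion method:

```python
def precision_at_k(ranked_cves, exploited, k):
    """Fraction of the top-k ranked CVEs that appear in the ground-truth
    exploited set (e.g. CISA KEV)."""
    return sum(c in exploited for c in ranked_cves[:k]) / k

def ensemble_rank(rankings):
    """Combine rankings by mean position (lower position = higher
    priority) -- one simple way to fuse an agent's ranking with EPSS."""
    scores = {}
    for ranking in rankings:
        for pos, cve in enumerate(ranking):
            scores[cve] = scores.get(cve, 0) + pos
    return sorted(scores, key=scores.get)
```

The appeal of a rank-level ensemble is that it needs no calibrated scores from either ranker, only their orderings.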

We Built a Multi-Agent Defense and It Failed — Here's Why That Matters More

We proposed a verified delegation protocol — LLM-as-judge verification, cryptographic signing, adaptive rate limiting — and pre-registered 7 hypotheses predicting it would reduce multi-agent cascade poisoning by 70%. Then we tested it on real Claude agents. Five hypotheses were refuted. The protocol doesn’t work. And that’s the finding. ...

March 19, 2026 · 5 min · Rex Coleman
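Of the three components named above, the cryptographic-signing piece is the simplest to sketch; this HMAC version is a minimal illustration with hypothetical names, not the protocol's actual implementation:

```python
import hashlib
import hmac

def sign_delegation(key: bytes, sender: str, payload: str) -> str:
    """Sign a delegated message so downstream agents can verify origin
    and integrity. Note the limit of this mechanism: it authenticates
    the sender, but a compromised sender signs its poisoned output
    just as validly -- one reason signing alone cannot stop a cascade."""
    msg = f"{sender}:{payload}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def verify_delegation(key: bytes, sender: str, payload: str, tag: str) -> bool:
    """Constant-time check of a delegation tag."""
    return hmac.compare_digest(sign_delegation(key, sender, payload), tag)
```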

A CFA Charterholder Built an ML Fraud Detector: Here's What the Models Miss

Note (2026-03-19): This was an early exploration in my AI security research. The methodology has known limitations documented in the quality assessment. For the current state of this work, see Multi-Agent Security and Verified Delegation Protocol. I’m a CFA charterholder who builds ML systems. I trained XGBoost on 100K financial transactions to detect fraud — AUC 0.987. But the most interesting finding wasn’t the model performance. It was that CFA-informed rule-based scoring achieves 0.898 AUC on its own, and 8 of the top 20 predictive features come from domain expertise, not raw data. ...

March 19, 2026 · 4 min · Rex Coleman
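The rule-based scoring mentioned above could look something like the sketch below; these three rules and field names are hypothetical examples of domain-informed features, not the post's actual rule set:

```python
def rule_score(txn):
    """Sum of binary domain rules into a fraud score. Each rule encodes
    practitioner knowledge rather than a learned pattern; the thresholds
    and fields here are illustrative assumptions."""
    score = 0
    score += txn["amount"] > 10_000                  # unusually large transfer
    score += txn["hour"] in range(0, 5)              # off-hours activity
    score += txn["country"] != txn["home_country"]   # cross-border transaction
    return score
```

A score like this can be used on its own or fed to a model such as XGBoost as extra features — which is how domain expertise ends up among the top predictive features.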

I Built a PQC Migration Scanner: Here's What Your Codebase Is Hiding

Note (2026-03-19): This was an early exploration in my AI security research. The methodology has known limitations documented in the quality assessment. For the current state of this work, see Multi-Agent Security and Verified Delegation Protocol. I scanned Python’s standard library for quantum-vulnerable cryptography and found 39 issues — 19 critical, all Shor-vulnerable. Then I trained ML models on 21,142 crypto-related CVEs to score migration priority. The surprise: classical exploit risk matters more than quantum vulnerability for deciding what to fix first. And 70% of the crypto in your codebase isn’t yours to change. ...

March 19, 2026 · 4 min · Rex Coleman
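The core of such a scanner can be sketched as pattern matching for Shor-vulnerable primitives (RSA, DSA, Diffie-Hellman, and elliptic-curve schemes all fall to Shor's algorithm). The patterns below are a hypothetical starter list, not the post's scanner:

```python
import re

# Starter patterns for Shor-vulnerable primitives; a real scanner would
# also walk the AST to resolve imports and calls, not just grep text.
SHOR_VULNERABLE = {
    "RSA": re.compile(r"\bRSA\b|rsa\.generate", re.I),
    "ECDSA/ECDH": re.compile(r"\bECDSA\b|\bECDH\b|SECP256", re.I),
    "DH": re.compile(r"\bDiffie[- ]?Hellman\b", re.I),
}

def scan_source(text):
    """Return the set of quantum-vulnerable primitives a source file
    references."""
    return {name for name, pat in SHOR_VULNERABLE.items() if pat.search(text)}
```

The "70% isn't yours to change" point falls out of where matches land: most hits sit in vendored or third-party code, where the finding becomes a dependency-upgrade task rather than a patch.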

Beyond Prompt Injection: RL Attacks on AI Agent Decision-Making

Observation perturbation degrades RL agent performance 20-50x more effectively than reward poisoning. And prompt-injection defenses? 0% effective against RL-specific attacks — they target completely different surfaces. I built two custom Gymnasium environments (access control, tool selection), trained 40 agents across 4 algorithms and 5 seeds, then ran 150 attack experiments across 4 attack classes. The result: if you’re monitoring reward signals but not observation channels, you’re watching the wrong surface. ...

March 16, 2026 · 5 min · Rex Coleman
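The observation-perturbation attack surface described above can be sketched as an environment wrapper. This assumes a minimal env API (`reset() -> obs`, `step(a) -> (obs, reward, done)`) rather than the full Gymnasium interface, and uniform bounded noise as the perturbation:

```python
import random

class ObservationPerturbation:
    """Wrap an environment and add bounded noise to observations before
    the agent sees them. Rewards pass through untouched, which is the
    point: monitoring reward signals alone never sees this attack."""

    def __init__(self, env, epsilon=0.1, seed=0):
        self.env = env
        self.eps = epsilon
        self.rng = random.Random(seed)

    def _perturb(self, obs):
        # Uniform noise within [-epsilon, epsilon] per observation dim
        return [x + self.rng.uniform(-self.eps, self.eps) for x in obs]

    def reset(self):
        return self._perturb(self.env.reset())

    def step(self, action):
        obs, reward, done = self.env.step(action)
        return self._perturb(obs), reward, done
```

Because the reward channel is untouched, any detector watching reward statistics reports a healthy agent while its policy is being steered through corrupted observations.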
© 2026 Rex Coleman. Content under CC BY 4.0. Code under MIT. GitHub · LinkedIn · Email