We Mapped 8 AI Security Research Projects to OWASP, NIST, and MITRE — Here's Where the Gaps Are

Five of eight research projects map to OWASP LLM01 (Prompt Injection). Three OWASP categories — LLM03 (Supply Chain), LLM07 (System Prompt Leakage), and LLM08 (Vector/Embedding Weaknesses) — have zero research coverage. That gap tells you where the next round of experiments needs to go. I published a full standards mapping that cross-references 8 original AI security research projects against four frameworks. The mapping covers OWASP Top 10 for Large Language Model (LLM) Applications, OWASP Top 10 for Agentic Applications, National Institute of Standards and Technology (NIST) AI Risk Management Framework (RMF), and MITRE Adversarial Threat Landscape for AI Systems (ATLAS). ...
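The gap-finding step described above can be sketched in a few lines. This is a hypothetical reconstruction, not the post's actual mapping: the project names and category assignments below are illustrative placeholders, and only the OWASP LLM category IDs are real.

```python
# Illustrative sketch: find OWASP LLM Top 10 categories with zero
# research coverage, given a project -> categories mapping.
# Project names and their category sets are made up for the example.
OWASP_LLM = {f"LLM{i:02d}" for i in range(1, 11)}

project_mappings = {
    "multi-agent-cascade": {"LLM01", "LLM06"},
    "patch-safety": {"LLM02"},
    "watermark-removal": {"LLM09"},
}

# Union of every category any project touches, then subtract from the
# full Top 10 to get the uncovered categories.
covered = set().union(*project_mappings.values())
gaps = sorted(OWASP_LLM - covered)
print("Uncovered categories:", gaps)
```

With a real project list in `project_mappings`, `gaps` is exactly the "zero research coverage" set the teaser calls out.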

March 31, 2026 · 2 min · Rex Coleman

AI Security Research → OWASP, NIST, and MITRE Standards Mapping

AI Security Research → Standards Mapping. Last updated: 2026-03-31. A cross-reference between original AI security research and the frameworks practitioners use to assess, govern, and defend AI systems. Why This Mapping Exists: Security practitioners work within frameworks — Open Web Application Security Project (OWASP), National Institute of Standards and Technology (NIST), MITRE. Researchers often publish findings without connecting them to these frameworks, leaving practitioners to do the mapping themselves (or never find the research at all). ...

March 31, 2026 · 11 min · Rex Coleman

5 AI Security Gaps That Jensen Huang, Eric Schmidt, and the OpenClaw Creator All Flagged This Month

I spent this week extracting AI security signals from five frontier podcasts — Jensen Huang on Lex Fridman, Eric Schmidt on Moonshots, Peter Steinberger (OpenClaw creator) on Lex Fridman, and two Moonshots panel episodes covering NVIDIA, Anthropic, and Tesla. 68 claims, 30 concepts, 26 signals logged to a structured knowledge base. The finding that surprised me: three independent sources — a $4 trillion CEO, a former Google CEO, and the creator of the fastest-growing open-source project in history — all flagged the same security gaps without coordinating. Here are the five signals that converged. ...

March 29, 2026 · 6 min · Rex Coleman

Our Simulation Was Wrong by 37 Percentage Points — What Real LLM Agents Taught Us About Multi-Agent Cascade

I built a multi-agent security simulation, ran 6 experiments, then validated against real Claude Haiku agents. The simulation predicted a 97% poison rate. Real agents: 60%. And the biggest surprise: topology matters — something the simulation said was irrelevant. What I Built: A simulation-based testbed that models multi-agent systems with configurable trust architectures, network topologies, attacker types, and agent compositions. One agent gets compromised. We measure how poisoned outputs cascade through the system. ...
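The core measurement here — how far a compromised agent's output spreads — can be sketched as a probabilistic traversal of the agent communication graph. This is a minimal stand-in, not the testbed itself: `accept_prob` is a made-up scalar approximating whatever trust/verification logic the real agents apply, and the star topology is just one example configuration.

```python
import random
from collections import deque

def cascade_poison_rate(adjacency, compromised, accept_prob, seed=0):
    """Fraction of agents poisoned when `compromised` emits malicious
    output and each downstream neighbor accepts it with probability
    `accept_prob` (a stand-in for per-agent trust/verification)."""
    rng = random.Random(seed)
    poisoned = {compromised}
    queue = deque([compromised])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in poisoned and rng.random() < accept_prob:
                poisoned.add(neighbor)
                queue.append(neighbor)
    return len(poisoned) / len(adjacency)

# Star topology: the hub sends output to every spoke, spokes send nothing.
star = {"hub": ["a", "b", "c"], "a": [], "b": [], "c": []}
rate = cascade_poison_rate(star, "hub", accept_prob=1.0)
```

Even this toy version shows why topology matters: compromising the hub poisons everything, while compromising a leaf poisons only itself.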

March 20, 2026 · 5 min · Rex Coleman

Your AI Makes SQL Injection Worse: CWE-Stratified Patch Safety for LLM Code Generation

LLM-generated security patches have a 42% fix rate and a 10% regression rate — but the aggregate hides a dangerous pattern. SQL injection patches are net-negative: 0% fix rate, 50% regression. The model recognizes the vulnerability but its rewrites introduce new injection vectors. Cryptography patches, by contrast, hit 100% fix rate with 0% regression. I tested Claude Haiku generating patches for 50 vulnerable code snippets across 5 CWE categories, measured by static analysis for both fix rate and regression rate. ...
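The two headline metrics reduce to simple counting over per-snippet static-analysis verdicts. A minimal sketch, assuming each result record carries two booleans (`fixed`: the original finding is gone after the patch; `regressed`: the patch introduced a new finding) — the record shape is my assumption, not the post's actual harness:

```python
def patch_metrics(results):
    """Compute fix rate and regression rate from a list of per-snippet
    static-analysis results. Each result is a dict with booleans
    'fixed' and 'regressed' (an assumed schema for illustration)."""
    n = len(results)
    fix_rate = sum(r["fixed"] for r in results) / n
    regression_rate = sum(r["regressed"] for r in results) / n
    return fix_rate, regression_rate

# Toy SQL-injection batch mirroring the teaser's numbers:
# no patch fixes the flaw, half introduce a new injection vector.
sqli = [
    {"fixed": False, "regressed": True},
    {"fixed": False, "regressed": False},
]
fix, reg = patch_metrics(sqli)
```

Stratifying by CWE just means running `patch_metrics` per category instead of over the pooled 50 snippets — which is exactly how the aggregate 42%/10% hides the 0%/50% SQL-injection pattern.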

March 20, 2026 · 5 min · Rex Coleman

How Many Rewrites to Strip a Watermark? Empirical Paraphrase-Removal Curves for LLM Watermarks

Cross-model paraphrasing drops Kirchenbauer watermark detection from 100% to 60% in a single pass. After ten passes, it plateaus at 40%. The watermark is partially robust — but not enough for adversarial settings where the attacker has access to any LLM. I measured this by watermarking text with GPT-2, paraphrasing with Claude Haiku, and tracking how the z-score decays. Five experiments. Six pre-registered hypotheses. Real green-list watermarking with logit access. ...

March 20, 2026 · 6 min · Rex Coleman

Privilege Escalation Cascades at 98% While Domain-Aligned Attacks Are Invisible

Domain-aligned prompt injections cascade through multi-agent systems at a 0% detection rate. Privilege escalation payloads hit 97.6%. That’s nearly a 98-percentage-point spread across payload types in the same agent architecture — the single biggest variable determining whether your multi-agent system catches an attack or never sees it. I ran six experiments on real Claude Haiku agents to find out why. Three resistance patterns explain the gap — and each has a quantified bypass condition. ...

March 20, 2026 · 5 min · Rex Coleman

Your AI Can't Beat EPSS at Vulnerability Triage (But the Ensemble Might)

Can an LLM agent prioritize vulnerabilities better than EPSS? Every security team drowning in CVEs wants to know whether AI can help them triage faster. We tested this empirically: Claude Haiku as a vulnerability triage agent, ranked against EPSS, CVSS, and random baselines, with CISA KEV as ground truth for “actually exploited.” The short answer: no, the agent doesn’t beat EPSS. But the longer answer is more interesting. ...
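Scoring a ranking against KEV ground truth comes down to top-k overlap metrics. A minimal sketch using precision@k — the CVE IDs and rankings below are placeholders, and the post's actual evaluation may use different metrics:

```python
def precision_at_k(ranked_cves, exploited, k):
    """Fraction of the top-k ranked CVEs that appear in the
    ground-truth exploited set (e.g., the CISA KEV catalog)."""
    top = ranked_cves[:k]
    return sum(cve in exploited for cve in top) / k

# Placeholder rankings and ground truth, for illustration only.
epss_ranking = ["CVE-A", "CVE-B", "CVE-C", "CVE-D"]
agent_ranking = ["CVE-C", "CVE-A", "CVE-D", "CVE-B"]
kev = {"CVE-A", "CVE-D"}

p_epss = precision_at_k(epss_ranking, kev, 2)
p_agent = precision_at_k(agent_ranking, kev, 2)
```

The same function scores EPSS, CVSS, the agent, and a random baseline on equal footing; an ensemble is just another `ranked_cves` list produced by combining the individual scores.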

March 20, 2026 · 5 min · Rex Coleman

Why Third-Party Skills Are the Biggest Agent Attack Vector

Last week I published a 30-minute hardening guide for OpenClaw. The #1 risk on that list was third-party skills. Since then, the numbers have gotten worse. 820+ malicious skills are now on ClawHub — roughly 20% of the entire registry. That’s not a rounding error. That’s one in five skills being actively hostile to the agent that installs them. But the number isn’t what makes this the biggest attack vector. The architecture is. ...

March 20, 2026 · 5 min · Rex Coleman

We Built a Multi-Agent Defense and It Failed — Here's Why That Matters More

We proposed a verified delegation protocol — LLM-as-judge verification, cryptographic signing, adaptive rate limiting — and pre-registered 7 hypotheses predicting it would reduce multi-agent cascade poisoning by 70%. Then we tested it on real Claude agents. Five hypotheses were refuted. The protocol doesn’t work. And that’s the finding. ...

March 19, 2026 · 5 min · Rex Coleman
© 2026 Rex Coleman. Content under CC BY 4.0. Code under MIT. GitHub · LinkedIn · Email