We Mapped 8 AI Security Research Projects to OWASP, NIST, and MITRE — Here's Where the Gaps Are

Five of eight research projects map to OWASP LLM01 (Prompt Injection). Three OWASP categories — LLM03 (Supply Chain), LLM07 (System Prompt Leakage), and LLM08 (Vector/Embedding Weaknesses) — have zero research coverage. That gap tells you where the next round of experiments needs to go. I published a full standards mapping that cross-references 8 original AI security research projects against four frameworks. The mapping covers OWASP Top 10 for Large Language Model (LLM) Applications, OWASP Top 10 for Agentic Applications, National Institute of Standards and Technology (NIST) AI Risk Management Framework (RMF), and MITRE Adversarial Threat Landscape for AI Systems (ATLAS). ...
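To make the gap-finding concrete, here is a minimal sketch of how a mapping like this can be queried for uncovered categories. The project names and category assignments below are placeholders, not the published mapping.

```python
# Hypothetical sketch: find OWASP LLM Top 10 categories with no research coverage.
# Project names and their category assignments are placeholders, not the real mapping.
from collections import defaultdict

OWASP_LLM_TOP10 = [f"LLM{i:02d}" for i in range(1, 11)]  # LLM01..LLM10

project_to_categories = {
    "multi-agent-cascade": ["LLM01", "LLM06"],
    "watermark-removal": ["LLM09"],
    # ... remaining projects would be listed here
}

coverage = defaultdict(list)
for project, categories in project_to_categories.items():
    for category in categories:
        coverage[category].append(project)

gaps = [c for c in OWASP_LLM_TOP10 if c not in coverage]
print("Uncovered categories:", gaps)
```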

March 31, 2026 · 2 min · Rex Coleman

AI Security Research → OWASP, NIST, and MITRE Standards Mapping

AI Security Research → Standards Mapping. Last updated: 2026-03-31. A cross-reference between original AI security research and the frameworks practitioners use to assess, govern, and defend AI systems. Why this mapping exists: security practitioners work within frameworks — Open Web Application Security Project (OWASP), National Institute of Standards and Technology (NIST), MITRE. Researchers often publish findings without connecting them to these frameworks, leaving practitioners to do the mapping themselves (or never find the research at all). ...

March 31, 2026 · 11 min · Rex Coleman

5 AI Security Gaps That Jensen Huang, Eric Schmidt, and the OpenClaw Creator All Flagged This Month

I spent this week extracting AI security signals from five frontier podcasts — Jensen Huang on Lex Fridman, Eric Schmidt on Moonshots, Peter Steinberger (OpenClaw creator) on Lex Fridman, and two Moonshots panel episodes covering NVIDIA, Anthropic, and Tesla. I logged 68 claims, 30 concepts, and 26 signals to a structured knowledge base. The finding that surprised me: three independent sources — the CEO of a $4 trillion company, a former Google CEO, and the creator of the fastest-growing open-source project in history — all flagged the same security gaps without coordinating. Here are the five signals that converged. ...

March 29, 2026 · 6 min · Rex Coleman

Our Simulation Was Wrong by 37 Percentage Points — What Real LLM Agents Taught Us About Multi-Agent Cascade

I built a multi-agent security simulation, ran 6 experiments, then validated against real Claude Haiku agents. The simulation predicted a 97% poison rate. Real agents: 60%. And the biggest surprise: topology matters — something the simulation said was irrelevant. What I built: a simulation-based testbed that models multi-agent systems with configurable trust architectures, network topologies, attacker types, and agent compositions. One agent gets compromised. We measure how poisoned outputs cascade through the system. ...
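For intuition on why topology can matter, here is a toy version of the cascade measurement, assuming a single per-edge transmission probability in place of the testbed's trust architectures and attacker types. The numbers it produces are illustrative only.

```python
# Toy cascade model over a fixed message-passing topology (illustrative only;
# the real testbed varies trust architectures and attacker types, which a
# single per-edge transmission probability stands in for here).
import random

def poison_rate(adjacency: dict[int, list[int]], seed: int, p_transmit: float, trials: int = 2000) -> float:
    total_poisoned = 0
    for _ in range(trials):
        poisoned, frontier = {seed}, [seed]
        while frontier:
            node = frontier.pop()
            for neighbor in adjacency[node]:
                if neighbor not in poisoned and random.random() < p_transmit:
                    poisoned.add(neighbor)
                    frontier.append(neighbor)
        total_poisoned += len(poisoned)
    return total_poisoned / (trials * len(adjacency))

# Two topologies with 6 agents each: a ring and a hub-and-spoke star.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
star = {0: [1, 2, 3, 4, 5], **{i: [0] for i in range(1, 6)}}
print(poison_rate(ring, 0, 0.6), poison_rate(star, 0, 0.6))
```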

March 20, 2026 · 5 min · Rex Coleman

How Many Rewrites to Strip a Watermark? Empirical Paraphrase-Removal Curves for LLM Watermarks

Cross-model paraphrasing drops Kirchenbauer watermark detection from 100% to 60% in a single pass. After ten passes, it plateaus at 40%. The watermark is partially robust — but not enough for adversarial settings where the attacker has access to any LLM. I measured this by watermarking text with GPT-2, paraphrasing with Claude Haiku, and tracking how the z-score decays. Five experiments. Six pre-registered hypotheses. Real green-list watermarking with logit access. ...
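For reference, green-list detection of this kind reduces to a one-proportion z-test on how many scored tokens fall in the green list. A minimal sketch, assuming a typical green-list fraction γ:

```python
# One-proportion z-test used for green-list watermark detection (Kirchenbauer-style).
# s_green = observed green-list tokens, total = scored tokens,
# gamma = green-list fraction used at generation time (0.25 is a common choice).
import math

def watermark_z_score(s_green: int, total: int, gamma: float = 0.25) -> float:
    expected = gamma * total
    variance = total * gamma * (1.0 - gamma)
    return (s_green - expected) / math.sqrt(variance)

# Example: 120 of 300 scored tokens land in the green list with gamma = 0.25
print(watermark_z_score(120, 300))  # z = 6.0, well above a typical z = 4 threshold
```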

March 20, 2026 · 6 min · Rex Coleman

Privilege Escalation Cascades at 98% While Domain-Aligned Attacks Are Invisible

Domain-aligned prompt injections cascade through multi-agent systems with a 0% detection rate. Privilege escalation payloads are detected 97.6% of the time. That’s a 98 percentage-point spread across payload types in the same agent architecture — the single biggest variable determining whether your multi-agent system catches an attack or never sees it. I ran six experiments on real Claude Haiku agents to find out why. Three resistance patterns explain the gap — and each has a quantified bypass condition. ...
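The per-payload-type numbers come down to a simple tally over experiment runs. A sketch of that tally follows, with a hypothetical log schema (the field names are not the actual experiment format):

```python
# Hypothetical tally of detection rate by payload type from experiment logs.
# The record fields ("payload_type", "detected") are placeholders, not the real schema.
from collections import defaultdict

records = [
    {"payload_type": "privilege_escalation", "detected": True},
    {"payload_type": "domain_aligned", "detected": False},
    # ... one record per injected payload per run
]

totals, hits = defaultdict(int), defaultdict(int)
for r in records:
    totals[r["payload_type"]] += 1
    hits[r["payload_type"]] += r["detected"]

rates = {t: hits[t] / totals[t] for t in totals}
spread = max(rates.values()) - min(rates.values())
print(rates, f"spread = {spread:.1%}")
```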

March 20, 2026 · 5 min · Rex Coleman

Your AI Can't Beat EPSS at Vulnerability Triage (But the Ensemble Might)

Can an LLM agent prioritize vulnerabilities better than EPSS? Every security team drowning in CVEs wants to know whether AI can help them triage faster. We tested this empirically: Claude Haiku as a vulnerability triage agent, ranked against EPSS, CVSS, and random baselines, with CISA KEV as ground truth for “actually exploited.” The short answer: no, the agent doesn’t beat EPSS. But the longer answer is more interesting. ...
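One way to score such a ranking against KEV ground truth is precision@k. A minimal sketch with placeholder CVE IDs, not the actual experiment data:

```python
# Illustrative ranking comparison: precision@k against CISA KEV membership.
# `rankings` maps a method to CVE IDs ordered most-urgent-first; IDs are placeholders.
def precision_at_k(ranked_cves: list[str], kev: set[str], k: int) -> float:
    top = ranked_cves[:k]
    return sum(cve in kev for cve in top) / k

kev_ground_truth = {"CVE-2024-0001", "CVE-2024-0002"}   # placeholder IDs
rankings = {
    "agent": ["CVE-2024-0003", "CVE-2024-0001", "CVE-2024-0004"],
    "epss":  ["CVE-2024-0001", "CVE-2024-0002", "CVE-2024-0003"],
}
for method, ranked in rankings.items():
    print(method, precision_at_k(ranked, kev_ground_truth, k=2))
```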

March 20, 2026 · 5 min · Rex Coleman

We Built a Multi-Agent Defense and It Failed — Here's Why That Matters More

We proposed a verified delegation protocol — LLM-as-judge verification, cryptographic signing, adaptive rate limiting — and pre-registered 7 hypotheses predicting it would reduce multi-agent cascade poisoning by 70%. Then we tested it on real Claude agents. Five hypotheses were refuted. The protocol doesn’t work. And that’s the finding. ...
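As a rough illustration of the signing piece of such a protocol (the LLM-as-judge and rate-limiting components are not shown, and the key handling here is a placeholder), a delegation message might be signed and verified like this:

```python
# Minimal sketch of signed delegation messages; HMAC-SHA256 stands in for the
# signing scheme. Key distribution and the other protocol components are omitted.
import hmac, hashlib, json

def sign_task(task: dict, key: bytes) -> dict:
    payload = json.dumps(task, sort_keys=True).encode()
    return {"task": task, "sig": hmac.new(key, payload, hashlib.sha256).hexdigest()}

def verify_task(message: dict, key: bytes) -> bool:
    payload = json.dumps(message["task"], sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["sig"])

key = b"shared-delegation-key"  # placeholder key management
msg = sign_task({"from": "planner", "to": "worker", "instruction": "summarize logs"}, key)
print(verify_task(msg, key))    # True; tampering with msg["task"] flips this to False
```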

March 19, 2026 · 5 min · Rex Coleman

AI Security Has a Shipping Problem

Thesis: The AI security industry produces frameworks and guidelines but almost no one ships working tools that practitioners can deploy today. The gap between “risk identified” and “risk mitigated” in AI security is wider than in any other area of cybersecurity I’ve worked in. We have more frameworks per deployed tool than any domain in the history of information security. And the frameworks keep coming while the tools don’t. The evidence: 1. OWASP published the Agentic Top 10 in late 2025. No tools enforce it. ...

March 19, 2026 · 5 min · Rex Coleman

ICA+GMM improves backdoor cluster detection by 163%

Combining Independent Component Analysis (ICA) with Gaussian Mixture Models (GMM) improved backdoor cluster detection by 163% compared to standard PCA+KMeans approaches in model behavioral fingerprinting experiments. The improvement was consistent across multiple trigger types and model architectures. Why this matters: Backdoor detection in neural networks is an unsupervised problem — you don’t know which models are trojaned, and you don’t know what the trigger looks like. Most existing approaches use PCA for dimensionality reduction and KMeans for clustering, then look for outlier clusters. This works, but it misses subtle backdoors where the behavioral signature is non-Gaussian or where multiple backdoor variants coexist in the same model population. ...
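A sketch of the two pipelines side by side using scikit-learn, with a synthetic feature matrix standing in for the behavioral fingerprints and placeholder component counts:

```python
# Sketch of the two clustering pipelines on behavioral-fingerprint features.
# Component counts and the synthetic feature matrix are placeholders; the real
# experiments used fingerprints extracted from candidate models.
import numpy as np
from sklearn.decomposition import PCA, FastICA
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))   # stand-in for per-model behavioral fingerprints

# Baseline: PCA for dimensionality reduction, KMeans for clustering
pca_features = PCA(n_components=8).fit_transform(X)
pca_labels = KMeans(n_clusters=3, n_init=10).fit_predict(pca_features)

# ICA + GMM: non-Gaussian components with soft, likelihood-based assignments
ica_features = FastICA(n_components=8, random_state=0, max_iter=500).fit_transform(X)
gmm = GaussianMixture(n_components=3, random_state=0).fit(ica_features)
gmm_labels = gmm.predict(ica_features)
outlier_scores = -gmm.score_samples(ica_features)  # low-likelihood models = backdoor candidates
print(outlier_scores.argsort()[-5:])  # indices of the five most suspicious models
```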

March 19, 2026 · 2 min · Rex Coleman
© 2026 Rex Coleman. Content under CC BY 4.0. Code under MIT. GitHub · LinkedIn · Email