Apply Adversarial Control Analysis to Your ML System in 3 Steps

Note (2026-03-19): This was an early exploration in my AI security research. The methodology has known limitations documented in the quality assessment. For the current state of this work, see Multi-Agent Security and Verified Delegation Protocol. Problem Statement You have deployed an ML model and someone asks: “Is it robust to adversarial attack?” You do not have a principled way to answer. You could fuzz every input, but that is expensive and tells you nothing about which attacks are structurally impossible versus which are just untested. You need a method that maps the attack surface before you start testing. ...

March 19, 2026 · 7 min · Rex Coleman

Build Your Own ML Vuln Prioritizer

Problem Statement Your security team triages vulnerabilities by CVSS score. A 9.8 gets patched immediately; a 7.5 waits. But CVSS measures severity, not exploitability. In real-world data, CVSS achieves an AUC of just 0.662 at predicting which CVEs actually get exploited – barely better than a coin flip. You need a model that predicts exploitation likelihood, not just theoretical severity. For the full research behind this tutorial, including SHAP analysis and adversarial robustness evaluation, see Why CVSS Gets It Wrong. ...

March 19, 2026 · 8 min · Rex Coleman

govML Quickstart: Governed ML in 15 Minutes

Note (2026-03-20): govML is now maintained as internal tooling. This tutorial references a public repository that is no longer available. For the current description of govML’s methodology, see How I Govern AI-Assisted ML Projects. Problem Statement You run an ML experiment. You tweak a hyperparameter. Three weeks later you cannot reproduce the original result because you forgot which config, which seed, which preprocessing step you changed. You have no contracts, no phase gates, no leakage tests. Your experiment notebook is a graveyard of dead cells and commented-out code. ...

March 19, 2026 · 7 min · Rex Coleman

How to Detect Backdoored ML Models Without Labeled Examples

Problem Statement Pre-trained models from public registries can pass every accuracy benchmark while hiding backdoors that activate only on attacker-chosen trigger inputs. Static analysis tools miss these because the backdoor lives in learned weights, not code. In 150 detection runs across 6 methods, Local Outlier Factor on raw activations achieved 0.622 AUROC at detecting backdoored models with zero labeled examples — modest but above chance, and the best unsupervised result I measured. ...

March 19, 2026 · 9 min · Rex Coleman

How to Red-Team Your AI Agent in 1 Hour

Note (2026-03-19): This was an early exploration in my AI security research. The methodology has known limitations documented in the quality assessment. For the current state of this work, see Multi-Agent Security and Verified Delegation Protocol. Problem Statement You are deploying an AI agent that can read files, search the web, or call APIs on behalf of users. Before you ship it, you need to know: what happens when someone tries to make it do something it should not? Existing frameworks like OWASP LLM Top 10 cover the language model layer, but agents have attack surfaces that models do not – tool orchestration, multi-step reasoning, persistent memory, and cross-agent delegation. You need a systematic way to test these surfaces. ...

March 19, 2026 · 9 min · Rex Coleman

How to Secure Your OpenClaw in 30 Minutes

A default OpenClaw installation has file system access, API credentials, and code execution — with zero security controls enabled. One in five ClawHub skills is actively malicious. Exposed credentials from VPS-hosted agents are already showing up in public breach lists. A compromised agent isn’t a compromised browser tab — it’s a compromised employee with the keys to everything. For the full analysis of why third-party skills are the biggest agent attack vector and what makes this worse than prompt injection at the architecture level, see the companion research. ...

March 17, 2026 · 8 min · Rex Coleman
© 2026 Rex Coleman. Content under CC BY 4.0. Code under MIT. GitHub · LinkedIn · Email