After four ML projects at Georgia Tech, I’d run 14 manual audit cycles with 30+ findings each. The governance wasn’t the problem — the manual enforcement was. So I built govML.

The Problem

Every ML project needs governance: reproducible experiments, documented decisions, data integrity checks, fair comparisons. But enforcing governance manually is a workflow killer. My unsupervised learning project had 7 audit cycles with 49+ findings. The RL project had 14 cycles with 30+ findings. I was spending more time auditing than experimenting.

The root cause: governance was prose in a document, not executable infrastructure. Contracts told me what to do but didn’t do it for me.

What I Built

govML is an open-source governance framework for ML projects — 32 templates, 7 quickstart profiles, 7 generators, and an agent orchestrator prototype.

The architecture has three layers:

Layer 1: GOVERNANCE (templates)
  Defines WHAT must be true — data contracts, experiment protocols, phase gates
    ↓ generates
Layer 2: SCAFFOLDING (generators)
  Scripts that ENFORCE governance automatically — sweep orchestration,
  manifest verification, phase gate checks
    ↓ orchestrated by
Layer 3: ORCHESTRATION (agent)
  AI-driven workflow that manages the experiment lifecycle with human
  approval at decision points

How It Works in Practice

Initialize a new project:

bash scripts/init_project.sh /path/to/project --profile security-ml --fill

This copies 21 governance templates, pre-fills common placeholders from project.yaml, and gives you a PROJECT_BRIEF to fill before writing any code. The brief forces you to define your thesis, research questions, scope, and publication target upfront.
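To make the --fill step concrete, here is a minimal sketch of how bulk placeholder substitution could work: read key/value pairs from a small project.yaml and replace matching {{tokens}} in a template, leaving unknown tokens for the human. The file names, keys, and token syntax here are illustrative assumptions, not govML's actual format.

```shell
# Hypothetical --fill pass (not govML's real implementation).
# Keys in project.yaml and {{tokens}} in the template are assumed names.
cat > project.yaml <<'EOF'
project_name: demo-ids
author: Rex
EOF

cat > PROJECT_BRIEF.md <<'EOF'
# {{project_name}}
Owner: {{author}}
Thesis: {{thesis}}
EOF

# Substitute every "key: value" pair into the template; placeholders
# with no matching key ({{thesis}}) stay behind for the author to fill.
while IFS=': ' read -r key value; do
  [ -n "$key" ] && sed -i "s/{{$key}}/$value/g" PROJECT_BRIEF.md
done < project.yaml

cat PROJECT_BRIEF.md
```

The useful property is the last line: anything the config cannot answer remains visibly unfilled, which is what forces the thesis-first thinking the brief is for.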

Every experiment runs through phase gates. You can’t advance to the next phase until the current gate passes. Decisions are logged in ADR format at every gate — mandatory, not optional. When the experiments are done, a PUBLICATION_PIPELINE template governs the blog post from draft structure through distribution checklist.
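The gate idea can be sketched as a small check script: advancement fails unless the current phase has both a logged ADR and a results artifact. The directory layout and file names below are assumptions for illustration, not govML's actual generator output.

```shell
# Illustrative phase-gate check (assumed layout, not govML's real generator):
# a phase may only advance if an ADR is logged and its metrics file exists.
mkdir -p decisions results
touch decisions/ADR-001-model-choice.md results/phase1_metrics.json

gate_check() {
  phase="$1"
  # Mandatory decision log: at least one ADR must exist.
  ls decisions/ADR-*.md >/dev/null 2>&1 \
    || { echo "GATE FAIL: no ADR logged for phase $phase"; return 1; }
  # Mandatory evidence: the phase's metrics file must exist.
  [ -f "results/phase${phase}_metrics.json" ] \
    || { echo "GATE FAIL: missing metrics for phase $phase"; return 1; }
  echo "GATE PASS: phase $phase may advance"
}

gate_check 1   # passes: ADR and phase 1 metrics are present
gate_check 2   # fails: phase 2 has no metrics yet
```

Because the check returns a nonzero exit code on failure, it composes with CI or a pre-commit hook, which is what turns the gate from prose into enforcement.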

The Automation Progression

Each project compounded on the last:

Generation   Project                             Manual Steps
Gen 0        Supervised Learning                 ~15 steps
Gen 1        Optimization                        ~12 steps
Gen 2        Unsupervised                        ~10 steps
Gen 3        Reinforcement Learning              ~6 steps
Gen 4        Adversarial IDS (with govML v2.4)   <5 minutes setup

The key accelerators: --fill for bulk placeholder substitution, PROJECT_BRIEF for thesis-first thinking, and PUBLICATION_PIPELINE for governing the highest-leverage activity — publishing.

What I Learned

Governance docs that aren’t executable are decoration. The templates matter, but the generators (sweep orchestration, manifest verification, phase gates) are what actually prevent errors. When I added automated audit generators (G13-G16), manual audit cycles dropped from 14 to zero.
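Manifest verification is the clearest example of governance as executable infrastructure. A minimal sketch, assuming a plain sha256 manifest (the file names and manifest format here are assumptions, not govML's actual output): record a checksum per data file once, then make any silent data change a hard failure.

```shell
# Sketch of manifest verification (assumed format, not govML's generators):
# checksum the data once, then verify before every run.
mkdir -p data
printf 'a,b\n1,2\n' > data/train.csv

# Written once, at experiment time.
sha256sum data/*.csv > MANIFEST.sha256

# Verified on every subsequent run; unchanged data passes.
sha256sum -c MANIFEST.sha256 && echo "manifest OK"

# A silent edit to the data now fails loudly instead of skewing results.
printf 'tampered\n' >> data/train.csv
sha256sum -c MANIFEST.sha256 || echo "manifest FAIL: data changed since manifest"
```

This is the pattern behind "generators prevent errors": the check runs without a human in the loop, so the audit cycle never has to happen.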

The highest-leverage template was the one I built last. PUBLICATION_PIPELINE governs the blog workflow, the single most important brand activity. For months, govML governed everything except publishing; recognizing that irony, a governance framework that didn't govern its own distribution, was the insight that led to v2.4.

PROJECT_BRIEF changes behavior, not just documentation. When you force thesis + research questions + scope BEFORE code, projects start differently. My vulnerability prioritization project (FP-05) went from mkdir to complete FINDINGS.md in a single session — because the brief defined what I was proving before I wrote a line of Python.

The Numbers

  • 32 templates across 4 directories (core, management, report, publishing)
  • 7 quickstart profiles (minimal → full)
  • 7 generators + agent orchestrator prototype
  • 24 issues identified, 15 resolved in v2.4
  • Tested across 7 real projects (4 academic, 2 frontier research, 1 systems benchmark)

Try It

govML is open source: github.com/rexcoleman/govML

git clone https://github.com/rexcoleman/govML.git
cd govML
bash scripts/init_project.sh /your/project --profile supervised --fill

If you run ML experiments and want reproducibility without the overhead, this is the framework I built to solve that problem for myself. Every template was extracted from real project friction, not designed speculatively.


Rex Coleman builds what’s missing between ML research and production security. 9 open-source projects across 4 ML paradigms. Georgia Tech OMSCS (ML). CFA. CISSP. Creator of govML. rexcoleman.dev