Voluntary commitments were always the strategy and the problem simultaneously.
The strategy: build a governance foundation fast, before legislation could, using reputational incentives and industry credibility to make safety evaluations the expected norm. The problem: voluntary means revocable. A lab that signed a memorandum of understanding with CAISI could, in theory, decline to renew it. A government that changed political direction could deprioritize enforcement. The architecture worked as long as everyone agreed to let it work.
What’s changing isn’t that agreement. It’s the legal weight of the documents.
The Anchor Event: What “Formal Agreements” Actually Mean
According to reporting by CNBC, CAISI has signed formal pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI, bringing all five major frontier labs under CAISI’s evaluation umbrella. The earlier engagements with Anthropic and OpenAI were structured as memoranda of understanding. The new agreements are described as “formal”, but the reporting doesn’t specify whether that formality is reflected in the legal structure, the enforcement mechanism, or the process requirements.
That distinction matters enormously. A formal agreement without a legal enforcement backstop is still a voluntary arrangement. It carries more institutional weight, more reputational stakes, and presumably a more defined process, but a lab that decided to walk away from it would face reputational consequences, not legal ones. At least under the current architecture.
The enforcement question is the unresolved center of this story, and it’s worth being precise about what we don’t know: there is no publicly confirmed statutory or executive order authority that requires a frontier lab to submit to CAISI evaluation. The White House has reportedly been drafting a mandatory pre-release review executive order, but as of this writing, that order has not been published.
The Capability Trigger: Why Mythos Changed the Calculus
The governance significance of this expansion isn’t just that more labs joined the program. It’s the mechanism that drove the expansion.
Reporting from Just Security, citing an independent UK AISI assessment, characterizes Anthropic’s Mythos model, a restricted, non-publicly-released system, as representing a step up in zero-day vulnerability identification capability. That assessment, the reporting suggests, directly informed CAISI’s decision to formalize evaluation scope across additional labs.
The precedent that sets is structural. Prior expansions of federal AI safety programs followed policy processes: executive direction, NIST engagement, voluntary commitment frameworks. This expansion reportedly followed a finding. A specific capability assessment identified a threshold that triggered formal governance action.
If that causal chain is accurate and repeatable, it changes what “pre-deployment evaluation” means in practice. It’s no longer a check-the-box process that labs submit to as a condition of maintaining good relationships with the government. It’s a threshold-detection mechanism, one where the findings can directly expand what the government requires of everyone operating in the same capability neighborhood.
US Frontier AI Oversight: Before and After Formal Agreements
Prior TJS coverage of the Mythos access architecture and the control framework around restricted model access provides relevant background on why this model attracted the level of federal attention it did.
The Five-Step Progression
The shift from voluntary to formal didn’t happen overnight. It’s visible in the record of the past eighteen months.
Step 1 was the Frontier Model Forum commitments, voluntary pledges from major labs on safety practices, red-teaming, and incident reporting. No evaluation, no verification, no enforcement. Reputational accountability only.
Step 2 was the early AISI memoranda of understanding (signed under the institute since renamed CAISI), bilateral agreements that created a process for pre-deployment evaluation but left participation at the lab’s discretion.
Step 3 was the CAISI expansion to formal agreements. The process now has a defined structure, covers specific risk domains (cybersecurity, biosecurity, chemical weapons), and involves all five major frontier labs. “Formal” raises the institutional stakes even if the legal architecture hasn’t changed.
Step 4, which hasn’t happened yet, would be a statutory mandate or executive order establishing evaluation as a legal prerequisite for deployment. The White House is reportedly at that stage now.
Step 5 would be international harmonization: aligning US evaluation requirements with UK AISI, EU AI Office, and other national frameworks. That step is visible on the horizon given the Mythos assessment’s cross-Atlantic character.
The current state sits between steps 3 and 4. Formal but not yet mandatory in a legally enforceable sense.
The Enforcement Question
What happens if a lab declines to participate in a CAISI evaluation? Under the current architecture, the answer is: reputational damage, potential exclusion from federal contracting, and political pressure. Those are real consequences. They’re not the same as legal liability.
The reported White House executive order on mandatory pre-release review is the fulcrum. If it is issued, the current agreements become the template: the process that was developed voluntarily becomes the baseline for what mandatory compliance looks like. That’s not an accident. Building the voluntary architecture first, then mandating it, is a governance strategy that avoids the legislative process while creating durable institutional infrastructure.
Unanswered Questions
- What legal authority backs the formal CAISI agreements, and what happens if a lab declines?
- If the White House EO publishes, what evaluation process requirements will it codify?
- Does the Mythos capability-trigger precedent apply to open-source models, or only to labs with institutional CAISI relationships?
Analysis
The most underappreciated element of this shift is the capability-trigger mechanism. If a capability finding in one model can expand evaluation scope for all labs operating in adjacent capability spaces, the governance boundary isn't defined by what labs agreed to, it's defined by what the evaluators find. That's a fundamentally different power structure than voluntary commitments, and it exists independent of whether a statutory mandate ever passes.
There’s a separate claim in the reporting that deserves its own treatment here. Just Security reported, in a single story, that federal agencies circumvented the Trump administration’s ban on Anthropic tools to conduct defensive safety tests. That’s a significant allegation: executive branch agencies breaking with administration policy. It’s one source. This brief includes it as a data point because it potentially illuminates the demand side of the capability access question: even agencies under a procurement ban found the Mythos evaluation results compelling enough to work around official restrictions. If corroborated, that’s a signal about how seriously at least some parts of the federal government take the capability threshold that Mythos represents.
What This Structural Shift Means for Labs Approaching Deployment
For frontier labs with models approaching deployment, the practical implications of the current architecture are clearer than the legal ones. Submitting to CAISI evaluation is the expectation. Declining carries visible reputational costs. The evaluation scope (cybersecurity, biosecurity, chemical weapons) defines the risk domains that will receive scrutiny.
More importantly: if the capability-trigger mechanism holds, a finding in one lab’s model can expand what the government expects from all labs operating in adjacent capability spaces. That’s a different compliance environment than the one that existed eighteen months ago.
The voluntary AI safety era isn’t over. But the ceiling of what “voluntary” can achieve has been reached, and the architecture above it is under active construction.
Here’s what matters for your planning.
What to Watch
Don’t expect a clean transition from voluntary to mandatory. The more likely pattern is layered obligations: formal agreements for current frontier models, statutory requirements for the next generation, and international harmonization requirements after that. Each layer will use the previous one as its template.