AI Safety News: Anthropic Won't Release Mythos-Class Models Until Defenses Catch Up

May 25, 2026 3 min read Anthropic Partial Strong

Tech Jacks Solutions AI News Coverage

Anthropic has stated it intends to make models matching Claude Mythos Preview's performance publicly available, but only after developing significantly more robust defenses that don't yet exist. The condition isn't a timeline. It's a capability threshold the field hasn't reached.

ai-safety anthropic claude-mythos project-glasswing agentic-ai cybersecurity vulnerability-discovery restricted-access

High/critical vulnerabilities found, 6,202

Key Takeaways

Anthropic has stated it will not release Mythos-class models publicly until significantly more robust defensive capabilities exist, no timeline given.
Project Glasswing identified 6,202 high- or critical-severity vulnerabilities out of 23,019 total (per T1 Anthropic source); this is the engine behind the access restriction.
Access is currently limited to US government and allied customers under Project
Glasswing; broader enterprise access is not scheduled.
The release condition is capability-based, not calendar-based, security teams should not plan around a near-term public Mythos launch.

Anthropic’s most capable model isn’t coming to a public API anytime soon. The company has stated that Claude Mythos Preview, currently restricted to a limited group of partners under Project Glasswing, will eventually be made publicly available, but only after defenses are developed that are “significantly more robust” than anything available today. That’s not hedging. It’s a structural admission about where the frontier currently sits.

Don’t expect a launch date. There isn’t one, because the release condition isn’t calendar-based, it’s tied to a defensive capability that Glasswing’s own findings suggest is lagging badly behind offensive potential. Anthropic’s initial Glasswing update reports that the program identified 6,202 high- or critical-severity vulnerabilities out of 23,019 total vulnerabilities found across critical software infrastructure. That’s the scale of the discovery engine they’re holding back from public release.

The model identifier claude-mythos-1-preview reportedly appeared briefly in public versions of Claude Code and Claude Security before being pulled – a signal, per The Register’s reporting, that integration work is already underway even as the model itself stays gated. According to Anthropic’s internal evaluation, Claude Mythos Preview scored 97.6% on USAMO 2026, a self-reported figure that hasn’t been independently verified by Epoch AI or any third-party evaluator.

The part nobody mentions: Glasswing’s vulnerability discovery rate is the more important number than the benchmark score. Finding 23,019 vulnerabilities, 6,202 of them high or critical, across real software infrastructure in the program’s early phase means the model is already operating as a high-throughput security analysis engine. The patching side of that equation moves at human speed. Security researchers and analysts have noted that AI-accelerated vulnerability discovery is outpacing the industry’s capacity to remediate findings, though this assessment reflects analytical inference rather than a single primary data source. Anthropic’s public release restraint isn’t separable from that gap. Releasing Mythos-class capability broadly before defensive tooling catches up means handing the same discovery engine to actors whose interests don’t include responsible disclosure.

For security architects and enterprise AI procurement teams, the practical read is this: Mythos-class capability is in production, just not yours. US government and allied customers under Project Glasswing have access now. The restricted API is reportedly priced at $25 per million input tokens and $125 per million output tokens, according to reports, figures that haven’t been verified against Anthropic’s official pricing documentation. If your threat model already accounts for nation-state actors having access to frontier vulnerability discovery AI, today’s announcement changes nothing. If it doesn’t, that’s the gap to close first.

What to watch

The release condition Anthropic has named, “significantly more robust defenses”, will eventually need to be operationalized. Watch for Anthropic or a third party defining what “robust enough” looks like in measurable terms. Epoch AI’s model database was confirmed live as of today; no independent Mythos evaluation appears there yet. When an Epoch AI evaluation does publish, the benchmark picture will either validate or complicate Anthropic’s internal figures. That’s the trigger worth tracking.

The catch is that “we’ll release it when it’s safe” is both the most responsible position and the least testable one. Anthropic has named a condition without naming the evidence standard for meeting it. Until that standard is defined, by Anthropic, by a regulatory body, or by the security research community, the public release timeline is genuinely open-ended. Security teams shouldn’t plan around a Mythos public launch in the next 12 months. Plan around adversaries who already have equivalent access.