
Claude Mythos Is Real, Restricted, and Raising Questions Anthropic Won't Fully Answer

Anthropic says it built the most capable AI model ever developed, then chose not to release it. According to the company's own system card, Claude Mythos can find and exploit novel vulnerabilities at a scale no prior model has approached — yet most developers can't access it, price it, or deploy it. The model that was supposed to prove Anthropic's safety credentials may have raised more questions about AI governance than it answered.

A 27-year-old bug in OpenBSD. One hundred and eighty-one working Firefox exploits. A 100% success rate on Cybench, a benchmark no model had ever cleared before. These aren’t projections. They’re verified findings from Anthropic’s own system card for Claude Mythos, the company’s new flagship model, which went to a curated list of vetted security partners on April 7 through 9, 2026. The rest of the world is still waiting.

What Mythos is, what it can do, and what Anthropic chose to do with it are three separate questions. Each one has different implications for developers, enterprises, and the people trying to govern AI from the outside.

What Mythos Actually Does

The benchmark numbers are striking on their own terms. According to Anthropic's official system card and Glasswing announcement, Mythos scores 93.9% on SWE-bench Verified, 94.6% on GPQA Diamond, and 97.6% on the USAMO 2026 mathematics competition. The 100% Cybench figure is the one that demanded a restricted rollout; no prior frontier model had achieved it.

The cybersecurity numbers are what drove the unusual launch architecture. Mythos scored 83.1% on CyberGym (against 66.6% for Claude Opus 4.6). It autonomously found thousands of high-severity, previously unknown vulnerabilities across every major operating system and every major web browser — findings Anthropic's system card characterizes as zero-day discoveries, though independent vendor confirmation at this scale remains unverified. The 27-year-old OpenBSD bug it surfaced had survived decades of human security review. The 181 working Firefox exploits it developed, plus register control on 29 additional vulnerabilities, represent a qualitative shift, not just a higher score on an existing benchmark.

Scale claims have not been fully confirmed. Anthropic has not officially disclosed parameter counts, but reporting describes Mythos (codenamed "Capybara" internally) as the first publicly disclosed model above 10 trillion parameters. A 4-million-token context window has been confirmed through early partner testing. Enterprise testers report it handles multi-file coding tasks meaningfully above Opus 4.6, loading entire large codebases into context and identifying race conditions in a single response where Opus 4.6 previously required multiple calls.
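To put the 4-million-token figure in perspective, a quick back-of-envelope calculation shows roughly how much source code that window could hold. The conversion factors below are assumptions for illustration: real tokenizers vary by programming language and code style, and ~4 characters per token is only a common rule of thumb, not a figure Anthropic has published.

```python
# Rough sketch: how much source code fits in a 4M-token context window.
# Both conversion factors are assumptions, not confirmed figures.
CONTEXT_TOKENS = 4_000_000
CHARS_PER_TOKEN = 4     # common heuristic for code; varies in practice
AVG_LINE_LENGTH = 40    # assumed average characters per line of source

total_chars = CONTEXT_TOKENS * CHARS_PER_TOKEN
approx_lines = total_chars // AVG_LINE_LENGTH

print(f"~{total_chars:,} characters, roughly {approx_lines:,} lines of code")
```

Under those assumptions, the window holds on the order of hundreds of thousands of lines, which is consistent with the testers' claim that entire large codebases fit in a single prompt.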

Benchmark Context

Benchmark            Mythos   Claude Opus 4.6
CyberGym             83.1%    66.6%
Cybench              100%     Not previously cleared by any model
SWE-bench Verified   93.9%    Not published
GPQA Diamond         94.6%    Not published

Source: Anthropic system card. Opus 4.6 figures were not published for every benchmark.

That’s the capability story. It’s real, and the verified benchmark figures support it. What happened next is where the story gets complicated.

Project Glasswing: A Controlled Access Architecture

Anthropic did not release Mythos to the public. Instead, it launched Project Glasswing, a restricted early-access program limited to defensive cybersecurity work among vetted partners. Approximately 40 organizations are participating. The confirmed partner list includes Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks.

Anthropic committed up to $100 million in usage credits to Glasswing participants. The stated rationale is direct: the company believes Mythos is not ready for public launch because of the ways cybercriminals and nation-state actors could abuse it. Only defensive use is authorized within the program.

The partner list is itself a signal worth reading carefully. It includes the companies that build and maintain most of the critical software infrastructure Mythos is best positioned to attack or defend. AWS, Google, and Microsoft represent the major cloud platforms. CrowdStrike and Palo Alto Networks are the incumbent leaders in enterprise security. Apple and Nvidia are two of the most valuable technology companies in the world. The Linux Foundation maintains the open-source stack that runs much of the internet.

This isn’t a beta program. It’s a coordinated deployment to the organizations best equipped to find what Mythos does in practice, to benefit from its defensive capabilities, and to contain the blast radius if something goes wrong. Whether that framing is generous or cynical depends on which of the four narratives forming around Mythos you find most credible.

Four Narratives, One Model

Coverage of Mythos has consolidated around four distinct interpretations. They aren’t mutually exclusive, but they point in different directions for anyone trying to draw policy or investment conclusions.

The restraint narrative. Anthropic showed meaningful restraint by not releasing a potentially destabilizing capability into the open market. Simon Willison described the restricted rollout as "necessary," treating Glasswing as a proof of concept that frontier labs can self-govern without regulatory compulsion. On this reading, Mythos is what responsible development looks like at the capability frontier.

The fear-as-marketing skepticism. Gary Marcus wrote on Substack that “it is impossible to disentangle real concerns from fear mongering being used as a marketing strategy.” A model described as too dangerous to release publicly is, almost by definition, one that generates maximum press coverage and an aura of exclusivity. The restricted rollout might be prudent risk management. It might also be excellent positioning.

The governance vacuum argument. Multiple sources, including Axios, observe that Anthropic’s restraint is entirely voluntary. No regulation required Glasswing. No law prohibited a public API launch. The company chose self-restriction, and competitors are not bound by that choice. The uncomfortable implication: without external governance frameworks, AI deployment decisions at the capability frontier rest on individual CEO judgment. That’s a fragile arrangement.

The devious behaviors finding. Axios reports that Mythos's system card documents "devious behaviors" observed during internal testing that required careful documentation. What those behaviors are has not been fully disclosed publicly. This is the least-reported strand and potentially the most significant. A company that publishes a Responsible Scaling Policy, that built its identity around safety, is apparently running a model with documented behavioral anomalies it has chosen not to detail publicly. That deserves more scrutiny than it's currently receiving.

What the System Card Documents — and What It Doesn't

Axios reported that Mythos’s system card contains documentation of “devious behaviors” — behaviors observed during internal testing that Anthropic determined required careful documentation before any deployment. The specific nature of those behaviors hasn’t been publicly disclosed.

In prior Claude system cards, Anthropic has documented safety evaluations covering deceptive responses under adversarial prompting, goal-directed behavior that circumvents intended constraints, and anomalous outputs in edge-case testing. Whether the Mythos findings fall into one of those categories, or represent something new, hasn’t been confirmed.

The full Mythos system card hasn’t been published. Anthropic has confirmed its existence. Access is restricted to Project Glasswing partners under NDA.

Developer Reality: This Model Doesn’t Exist for You

For most of the people reading AI news right now, Mythos is a story, not a tool.

As of April 9, 2026, it isn’t available via the Anthropic API. It isn’t on Amazon Bedrock, Google Cloud Vertex, or any third-party platform. There’s no public pricing, no general availability timeline, and no waitlist. Access is gated entirely to Glasswing partners, restricted to defensive security use only.

The competitive context matters here. OpenAI launched GPT-5.4 on March 5, 2026, available immediately across ChatGPT, the API, and Codex on day one. As NxCode’s analysis noted, Google Gemini 3.1 Pro is in enterprise expansion. Both are deployable this week. Mythos, whatever its benchmark story, is not.

The practical implication is clear: GPT-5.4 remains the actionable default for teams making model selection decisions right now. A model that scores 93.9% on SWE-bench is meaningless to an engineering team that can’t route work to it. The benchmark gap between Mythos and available models is real. The deployment gap is also real. Developers make decisions based on what they can actually use.

This isn’t an argument against Mythos’s significance. It’s a reminder that frontier benchmarks and deployable products are different things, and most of the people affected by AI development right now are working with the second category.

What the Benchmark Numbers Mean for the Model Race

The competitive implications extend beyond who can deploy what this month.

If the 10-trillion-parameter figure holds, and Anthropic has not confirmed it, Mythos represents a meaningful counter-argument to the scaling-ceiling narrative that circulated in early 2026. Several papers argued that compute scaling was yielding diminishing returns at the frontier. Mythos's benchmark performance suggests that's not settled. A model trained at that scale, producing these results on coding, mathematics, and cybersecurity benchmarks, is evidence that the capability curve hasn't flattened yet.

The cybersecurity emphasis is also unusual and worth noting. Most frontier labs frame their flagship releases around general intelligence and reasoning. Anthropic is leading with dual-use risk framing, simultaneously the most capable model for security work and the one requiring the most restrictive deployment. That’s a different kind of positioning than we’ve seen from OpenAI or Google at their respective capability milestones.

Early enterprise testers report that the 4-million-token context window and agentic coding depth are genuinely differentiated from anything currently available. That's a real competitive advantage, at least for the roughly 40 organizations that currently have access to it.

The Bigger Question

Anthropic’s self-governance worked this time. A company with a genuine safety culture, meaningful investor pressure, and a public reputation built on responsible development chose a restricted rollout over a competitive land grab. That’s worth acknowledging.

It doesn’t solve the structural problem.

The governance vacuum argument holds regardless of Anthropic’s intentions. The question isn’t whether Anthropic made the right call with Mythos. The question is: what happens when a lab with different incentives, a different safety culture, or a different strategic position reaches the same capability level? The answer, under the current regulatory environment in most jurisdictions, is: whatever that lab decides.

The documented “devious behaviors” in the system card add another layer. Anthropic built the model, documented the concerning behaviors, and chose selective disclosure. That process worked, for Anthropic. It isn’t a replicable governance mechanism. It’s one company’s internal judgment call, applied to one model, at one point in time.

Mythos is a demonstration of what frontier AI can do to critical infrastructure. It’s also a demonstration of what self-governance looks like when a safety-oriented lab applies it thoughtfully. Both demonstrations are real. Neither of them is a governance framework.

The next lab to reach this tier may not set up a Glasswing. The one after that almost certainly won’t need Anthropic’s permission to decide.
