
What Is Claude Mythos? Anthropic's Gated Cybersecurity Model

Claude Mythos Preview is Anthropic's Capybara-tier model -- a cybersecurity-specialized variant sitting above the Opus 4.6 flagship in the Claude 4 family. It was announced on April 7, 2026, as part of Project Glasswing, and it is not generally available. Twelve founding partners hold launch access, roughly forty additional organizations sit on a waitlist, and an open-source maintainer program is pending. Everyone else is locked out. The Anthropic system card reports 83.1% on the CyberGym vulnerability benchmark -- 16.5 points above Opus 4.6 (66.6%), the previous Anthropic flagship on the same benchmark. During red-team testing, Mythos surfaced a 27-year-old bug in OpenBSD, a 16-year-old FFmpeg flaw that an estimated five million fuzzing runs had missed, and a four-vulnerability Linux kernel chain ending in root. This article explains what Mythos is, what it can do, how Anthropic is handling release, and what to be skeptical about.

Quick verdict. Private gated model -- if you are not on the allow-list, you will not get API access. Plan accordingly. Individuals, most enterprises, and almost all security vendors are outside the access tent. Anthropic has published no public roadmap to general availability.


83.1%
Mythos on CyberGym vulnerability discovery -- 16.5 points above Opus 4.6 (66.6%), the previous Anthropic flagship on the same benchmark. The 22.0% GPT-5 figure in the CyberGym paper is the Level 1 subset ceiling (paper-era), not the current frontier baseline. Source: Anthropic Mythos Preview System Card.

What Is Claude Mythos?

Mythos is a Claude 4 family model positioned above Opus 4.6 on Anthropic's internal capability ladder. Anthropic calls this higher rung the Capybara tier. In public communications, Anthropic has been specific about one thing and vague about others. Specific: Mythos is trained on the same base architecture as Opus 4.6 with additional post-training targeted at cybersecurity reasoning, exploit development, and patch analysis. Vague: the exact parameter count, the training corpus composition, and the benchmarks where Mythos underperforms Opus.

The release pattern is unusual. Anthropic is not shipping Mythos to its consumer chatbot, to Claude Code, or to the general API. Instead, the Claude team built Project Glasswing as a governance wrapper: twelve founding partners (AWS, Anthropic itself, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks), a waitlist of roughly forty additional organizations, an open-source maintainer program in design, $100 million in usage credits to the partners, and $4 million in donations to security foundations ($2.5M to Alpha-Omega/OpenSSF via the Linux Foundation, $1.5M to the Apache Software Foundation).

Anthropic has committed to publishing a public findings report within 90 days of the April 7 launch -- early July 2026. The report should disclose which vulnerability classes Mythos found, which it missed, and how often the model produced false-positive patches. That deadline is the first external checkpoint the public gets. Until then, every claim about Mythos rests on Anthropic's own documentation and a handful of partner statements.

The Capybara Tier

Anthropic's public model lineup is still Opus, Sonnet, and Haiku. Capybara is a separate internal tier name used for Mythos Preview specifically. There is no Capybara chatbot, no Capybara developer tool, and no Capybara landing page on claude.com. The system card describes Capybara as a classification above Opus on Anthropic's Frontier Model capability framework -- the internal rubric that triggers additional safety evaluations before release. Mythos is the first model to receive the Capybara designation.

Base Model Lineage

Anthropic's Frontier Red Team report notes that Mythos's cyber capabilities emerged from general code and reasoning training rather than from explicit security training. In plain English: Anthropic did not set out to build a bug-finding model. They built Opus 4.6 as a coding and reasoning model, applied cyber-focused post-training, and the resulting capability uplift was larger than predicted. That framing matters for two reasons. First, it suggests general-purpose frontier models will keep trending toward offensive-security utility whether vendors intend it or not. Second, it is also a convenient narrative for a vendor that wants to control release -- the claim is unfalsifiable without access to training logs.

  • Apr 7, 2026 -- Preview Release Date
  • $25 / $125 -- Post-Preview API (per MTok)
  • 12 -- Founding Partners
  • 90 days -- Public Findings Report Window (due early July 2026)
  • $104M -- Credits + Donations ($100M + $4M)

What Mythos Can Actually Do

Mythos is narrow. It is not a chat assistant, not a code-writing partner, not a research tool. It is an agent harness wrapped around a model trained to reason about memory-safety bugs, exploit chains, and patch correctness. The Frontier Red Team report describes four notable pre-release discoveries, each verified by Anthropic and the affected project maintainers.

Vulnerability Classes

  • Memory safety bugs -- buffer overflows, use-after-free, out-of-bounds reads and writes. This is the CyberGym scope and Mythos's strongest domain.
  • Privilege-escalation chains -- combining several lower-severity bugs into a root-level exploit. See the Linux example below.
  • Patch analysis -- reading a proposed fix and assessing whether it actually closes the underlying flaw. The CyberGym paper documented 18 incomplete patches across 34 zero-days; Anthropic reports Mythos flagged a similar proportion during internal testing.
  • Authentication and network-protocol bugs -- the FreeBSD NFS finding falls in this category.

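The patch-analysis failure mode is easiest to see in a toy example. The sketch below is a Python stand-in for a C memory-safety bug -- all names are illustrative, not from any real codebase -- showing an "incomplete patch": a bounds check added at the first reported call site while a second caller still reaches the vulnerable primitive unchecked, the pattern the CyberGym paper counts as an incomplete fix.

```python
# Toy illustration of an "incomplete patch": the fix guards one call path
# to the vulnerable read, but a second path still reaches it unchecked.

def read_field(buf: bytes, offset: int, length: int) -> bytes:
    # Vulnerable primitive: no bounds check against len(buf).
    # Python slicing silently truncates; in C this would be an
    # out-of-bounds read.
    return buf[offset:offset + length]

def parse_header_patched(buf: bytes) -> bytes:
    # The "patch": a bounds check added where the bug was first reported.
    if len(buf) < 8:
        raise ValueError("short buffer")
    return read_field(buf, 0, 8)

def parse_extension(buf: bytes) -> bytes:
    # Unpatched second caller: still trusts an attacker-controlled length
    # byte, with no check that 1 + ext_len <= len(buf).
    ext_len = buf[0]
    return read_field(buf, 1, ext_len)

# A patch reviewer -- human or model -- asks: does every caller of the
# vulnerable primitive now enforce the bound? Here the answer is no.
```

Patch analysis in this sense is a whole-program question, not a diff-local one, which is why reasoning-based review catches cases that per-hunk inspection misses.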
Verified Pre-Release Discoveries

OpenBSD -- 27-Year-Old Remote Crash
A bug that sat in the OpenBSD codebase for nearly three decades, unnoticed despite OpenBSD's reputation as one of the most heavily audited security-focused operating systems in existence. Mythos surfaced it during agentic code review. Anthropic reported the finding to the OpenBSD team; a fix has been committed.
FFmpeg -- 16-Year-Old Flaw Missed by ~5M Fuzzing Runs
FFmpeg is continuously fuzzed by Google's OSS-Fuzz program. Per Anthropic, this specific flaw had survived an estimated five million fuzzing runs. Mythos found it by reading the code, not by randomized input generation -- which is the whole point of a reasoning-based approach.
FreeBSD -- CVE-2026-4747, 17-Year-Old Unauth RCE
An unauthenticated remote code execution bug in the FreeBSD NFS server, tracked as CVE-2026-4747. This is the highest-severity finding in the pre-release batch: an attacker reaching an exposed NFS service could execute code without any credential. The bug had existed since 2009. Covered in the Cybersecurity News Center.
Linux Kernel -- 4-Bug Chain, KASLR Bypass to Root
A four-vulnerability exploit chain ending at root privilege. The chain bypasses KASLR (Kernel Address Space Layout Randomization) and uses a heap spray to achieve reliable code execution. See our autonomous exploit chaining guide for how these chains are constructed. This is the finding that most concerns the defensive community -- it demonstrates Mythos can not only find individual bugs but compose them into weaponizable exploits.

The emergent-capability caveat. Anthropic's position is that these capabilities emerged from general training rather than from a deliberate attempt to build an offensive security tool. Accept that framing if you want; it still means the capability exists, a narrow group of organizations has access, and similar capabilities will appear in other frontier models over the next 6 to 24 months. Logan Graham, Anthropic's Frontier Red Team Lead, has been direct about that timeline in press interviews.


Benchmarks: Where Mythos Lands

Benchmarks measure what they measure and no more. All figures below come from Anthropic's Mythos Preview system card published April 7, 2026. Independent replication is limited because independent researchers do not have API access. Treat these numbers as vendor-reported until third parties publish confirmations.

CyberGym (Level 0-3)
Real-world vulnerability discovery across 1,507 task instances drawn from 188 projects -- Wang et al., UC Berkeley.
  • Mythos Preview -- 83.1%
  • GPT-5 + thinking (paper Level 1) -- 22.0%
  • Opus 4.6 -- 66.6%
  • Sonnet 4.6 -- 65.0%
  • Opus 4.5 -- 51.0%
Mythos Preview at 83.1% is 16.5 points above Opus 4.6 (66.6%), the previous Anthropic flagship on this benchmark. The 22.0% GPT-5 figure is the paper's Level 1 subset ceiling (arXiv 2506.02548), not a current frontier score -- include it for paper-era context, not as the live baseline. Anthropic attributes Mythos's gain to agent scaffolding plus cyber-focused post-training. Without independent replication, treat the absolute score as provisional as of April 2026.
Cybench
Capture-the-flag style challenges derived from CTF competitions, 10 trials per challenge, pass@1 metric.
  • Mythos Preview -- 100%
A 100% score is a warning sign for bench saturation, not a capability ceiling. When a model saturates a benchmark, the benchmark stops measuring differences. Cybench needs harder variants before this number tells us anything about Mythos vs Mythos v2.
SWE-bench Verified
Real GitHub issue resolution -- human-verified subset of open-source bug fixes.
  • Mythos Preview -- 93.9%
  • Opus 4.6 -- 80.8%
Mythos is 13.1 points above Opus 4.6 (80.8%, internally codenamed "Fennec"), the previous Anthropic leader on SWE-bench Verified. As of April 2026, Mythos Preview holds the SWE-bench Verified high mark among Anthropic-tested models.
SWE-bench Pro
Harder SWE-bench variant with multi-file edits and longer task horizons.
  • Mythos Preview -- 77.8%
The Pro variant is designed to resist saturation. 77.8% leaves room for future models to show improvement -- unlike Cybench at 100%.
Terminal-Bench 2.0
Long-horizon terminal task completion -- shell, file-system, and tool-use tasks.
  • Mythos (4-hour timeout) -- 92.1%
  • Mythos (default timeout) -- 82.0%
Giving the agent a 4-hour timeout adds 10.1 points. The default-timeout number is what matters for real-world deployment; the 4-hour number is what matters for red-team research.
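The timeout effect is a budget effect on an iterative agent loop: a longer wall-clock allowance simply buys more attempts. A minimal sketch, assuming a stand-in `attempt()` for one model/tool iteration -- nothing here is Anthropic's actual harness:

```python
import time

def run_with_budget(attempt, budget_s: float):
    """Retry an agent step until it succeeds or the wall-clock budget expires.

    `attempt` stands in for one model/tool iteration; it returns a result
    or None. A 4-hour budget buys more iterations than a default one."""
    deadline = time.monotonic() + budget_s
    tries = 0
    while time.monotonic() < deadline:
        tries += 1
        result = attempt()
        if result is not None:
            return result, tries
    return None, tries

# Stand-in attempt that succeeds on the 5th call.
calls = {"n": 0}
def fake_attempt():
    calls["n"] += 1
    return "flag" if calls["n"] >= 5 else None

print(run_with_budget(fake_attempt, budget_s=1.0))  # ('flag', 5)
```

Under this framing, the 10-point gap between the two Terminal-Bench numbers is a statement about how often extra iterations convert near-misses into completions, not about raw model quality.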
GPQA Diamond
Graduate-level physics, chemistry, and biology reasoning.
  • Mythos Preview -- 94.5%
  • Gemini 3.1 Pro -- 94.3%
Mythos edges Gemini 3.1 Pro by 0.2 points -- inside noise range. The useful read here is that a cyber-specialized model did not sacrifice general reasoning to get its vulnerability-finding edge.
USAMO 2026
USA Mathematical Olympiad, 2026 edition -- proof-based competition math.
  • Mythos Preview -- 97.6%
Proof-based math historically resisted language models. 97.6% is approaching human-expert ceiling -- another saturation candidate.
Humanity's Last Exam (with tools)
Expert-level reasoning across multiple domains, crowd-sourced from domain specialists.
  • Mythos (tools) -- 64.7%
  • Mythos (no tools) -- 56.8%
  • Opus 4.6 (tools) -- 53.1%
Mythos posts an 11.6-point gain over Opus 4.6 on the hardest reasoning benchmark. Both figures are with-tools runs (web search, code execution). Raw model scores are lower.
Benchmarks as of April 2026. Source: Anthropic Mythos Preview System Card.

Access and Pricing

Anthropic has published post-preview API pricing but not a preview-to-general-availability schedule. The numbers below are what the company has stated publicly. They may change before any broader release.

Pricing (Post-Preview)

Tier | Input (per MTok) | Output (per MTok) | Notes
Mythos Preview | $25 | $125 | Post-preview rate per Anthropic
Opus 4.6 (reference) | $5 | $25 | Mythos costs 5x more
Sonnet 4.6 (reference) | $3 | $15 | Mythos costs roughly 8x more

During the preview. Approved partners draw down against a shared $100M usage-credit pool rather than paying the listed rate. Anthropic has not published per-partner allocation numbers. The $100M figure is a total program commitment, not a per-seat line item.
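For planning purposes, the listed rates turn into job costs with simple arithmetic. A back-of-envelope sketch using the rates above; the token counts are illustrative assumptions, not published workloads:

```python
# Back-of-envelope job costing at the listed post-preview rates
# (USD per million tokens). Rates come from the pricing table; the
# token counts below are illustrative assumptions for one large
# code-audit run, not published figures.

RATES = {                    # (input, output) per MTok
    "mythos":   (25.0, 125.0),
    "opus46":   (5.0,  25.0),
    "sonnet46": (3.0,  15.0),
}

def job_cost(model: str, in_mtok: float, out_mtok: float) -> float:
    rin, rout = RATES[model]
    return in_mtok * rin + out_mtok * rout

# Example: 40M input tokens (a large codebase) and 2M output tokens.
for m in RATES:
    print(m, job_cost(m, 40, 2))
# mythos: 40*25 + 2*125 = 1250.0; opus46: 250.0; sonnet46: 150.0
```

Note the ratio holds across job shapes where input and output scale together: the same audit costs 5x more on Mythos than on Opus 4.6 at these rates.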

Access Model

  • Allow-list only. Every API call is gated by Anthropic's partner approval system. There is no waitlist form on claude.com for Mythos specifically.
  • Twelve founding partners -- AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks.
  • Roughly forty additional organizations sit on a secondary access list, per Anthropic's launch announcement. Names have not been disclosed.
  • Open-source maintainer program is in design. Critical open-source project maintainers will be able to request access to audit their own code; the application process and eligibility criteria have not been published.

Platforms

Mythos is available through four API paths, all of which enforce the allow-list at the infrastructure layer:

  • Claude API (console.anthropic.com) -- direct partner access.
  • Amazon Bedrock -- US East (N. Virginia) region only. AWS CISO Amy Herzog has stated that the region restriction is a pre-condition of the launch. See the Bedrock announcement.
  • Google Cloud Vertex AI -- gated, US regions at launch.
  • Microsoft Foundry -- gated.

Why Mythos Is Gated

Anthropic's public reasoning for restricting Mythos rests on three pillars: dual-use risk, distillation concern, and alignment confidence. Each deserves a skeptical look.

Dual-Use Risk

Quoting Anthropic's launch statement: "The same improvements that make the model substantially more effective at patching vulnerabilities also make it substantially more effective at exploiting them." That is a defensible position. The Linux kernel 4-bug chain example is not a patch -- it is an exploit. A model that can produce that chain on demand, at API pricing, available to anyone, lowers the offensive-capability floor meaningfully. How meaningfully is debatable, and the debate is happening without the outside world having a test environment.

Distillation Concern

If Mythos's output were publicly available at scale, competing labs could distill the capability into smaller, cheaper, uncontrolled models. Anthropic has not published specific evidence that distillation is feasible at these capability levels, but the concern is plausible enough that other frontier labs have cited it as a restriction rationale for capability releases.

Alignment Metrics

Anthropic reports the following alignment improvements for Mythos relative to Opus 4.6:

Metric | Change vs Opus 4.6 | Notes
Misuse success rate | >50% reduction | Prompts designed to elicit harmful outputs succeed less often
Deceptive behavior | >50% reduction | Internal red-team probes for deception patterns
Over-refusal rate | 0.06% | Near zero -- the model only rarely refuses benign requests
Evaluation awareness | 29% (Anthropic system card) | Model recognizes it is being tested -- no independent replication yet published

The evaluation-awareness figure is vendor-only so far. Anthropic's system card cites 29% -- meaning Mythos recognizes it is being tested in roughly three of every ten evaluation runs. No independent replication has been published as of April 2026. The phenomenon is documented in multiple frontier models, but the exact rate here rests on Anthropic's internal protocol until a third party rechecks.

Ecosystem Posture

The $4M in donations -- $2.5M to Alpha-Omega/OpenSSF via the Linux Foundation, $1.5M to the Apache Software Foundation -- is modest relative to the $100M usage-credit pool and tiny relative to Anthropic's reported run-rate. It reads more as ecosystem goodwill than as material open-source security funding. The real open-source play is the pending maintainer program, which has not yet launched.


Who This Article Is For

Mythos is narrowly useful. So is reading about it. Here are the four audiences who should care:

CISOs and Security Leaders
You will not get Mythos access this year. Your realistic question is: how do I plan for the 6-to-24-month window in which equivalent capabilities reach actors who are less careful with release? Factor AI-accelerated vulnerability discovery into patch-management SLAs and vendor risk reviews. See the Cybersecurity News Center.
Red-Team Leads
Understand what capabilities your tool budget is now competing against. If you are at one of the twelve launch partners, build your internal request process early -- usage credits will be contested. If not, map out which open-source tooling approximates Mythos's workflow (iterative agent harnesses over Claude or GPT).
Open-Source Maintainers
The pending maintainer program is your access path. Watch the Alpha-Omega/OpenSSF and Apache Software Foundation channels for eligibility announcements. If your project handles memory-safety-sensitive code (parsers, kernel modules, network protocols), you are the intended user.
Policy Analysts
Mythos is the first publicly acknowledged case of a frontier-model vendor choosing gated release on capability grounds alone. The governance template -- launch partners, usage credits, 90-day public report, maintainer carve-out -- will shape how other labs release similar models. See AI governance for policy context.

What to Be Skeptical About

A release this tightly controlled demands a skeptical read. Four concerns stand out.

No General Availability and No Public Path
If you are not a launch partner or on the 40-organization secondary list, you have no documented way to get access. Anthropic has not committed to general availability, has not published a waitlist form, and has not set a date for broader release. This article is documenting a capability most readers cannot use.
Benchmark Saturation on Cybench and USAMO
100% on Cybench and 97.6% on USAMO 2026 mean those benchmarks have stopped measuring capability differences. Future Mythos versions, and rival models, will all post near-ceiling numbers. The headline scores look striking but will age quickly. Harder variants are needed.
Evaluation Awareness Is Vendor-Reported Only
Anthropic's system card reports 29% -- Mythos recognizes it is being tested in roughly three of every ten evaluation runs. No independent third-party replication has been published as of April 2026. The phenomenon is documented across frontier models, but the exact rate here rests on Anthropic's internal protocol until an external evaluator rechecks.
Anthropic's Commercial Incentive to Hype
Cybersecurity stocks moved on the launch -- CrowdStrike -7%, Palo Alto Networks -6%, the iShares cyber ETF -4.5% on the related leak day. Anthropic benefits from a narrative that its model is qualitatively ahead of competitors. The +16.5pp CyberGym gain over Opus 4.6, the emergent-capability framing, and the Capybara-tier branding are all consistent with that incentive. None of this means the numbers are wrong -- it means independent replication matters more than usual.
March 31 Claude Code Leak Context
On March 31, 2026 -- seven days before the Mythos announcement -- Anthropic's @anthropic-ai/claude-code 2.1.88 npm package exposed approximately 1,900 files and 512,000+ lines of source via a stray .map file for roughly three hours. The incident is separate from Mythos but informs the broader release-security posture. A vendor gating a cyber-capable model has to hold its own supply chain to the same bar.

Platform Access Paths

Four hosted API routes. All gated. All enforce allow-list access at the infrastructure layer, not just the account layer.

Amazon Bedrock
US East (N. Virginia) region only, gated preview. AWS CISO Amy Herzog is the named executive sponsor. Other AWS regions are not on the announced roadmap. AWS announcement.
Google Cloud Vertex AI
Gated preview on Vertex AI. US regions at launch. Partners with existing Vertex contracts route through standard Vertex IAM plus the Mythos allow-list.
Microsoft Foundry
Gated. Foundry is Microsoft's AI model catalog product, separate from Azure OpenAI. Partners accessing Mythos via Foundry use Azure AD plus Anthropic's approval list.
Claude API (Direct)
console.anthropic.com with an allow-listed organization ID. This is the path the 12 founding partners use. Regular Claude API customers (Opus, Sonnet, Haiku) do not see the Mythos endpoint in their console.
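For readers modeling what partner integration might look like: the sketch below builds a request in the standard Anthropic Messages API payload shape. The model id `claude-mythos-preview` is hypothetical -- Anthropic has not published the actual endpoint identifier -- and nothing is sent over the network; we only construct and inspect the payload.

```python
import json

# Hypothetical partner request. The payload fields (model, max_tokens,
# messages with role/content) follow the standard Anthropic Messages API
# shape; the model id is a placeholder, not a published identifier.
payload = {
    "model": "claude-mythos-preview",   # hypothetical, allow-list gated
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": "Review this diff for incomplete bounds checks.",
        }
    ],
}

# No API call is made; we only serialize the request body for inspection.
print(json.dumps(payload, indent=2))
```

An allow-listed organization would send this body to its approved endpoint (direct API, Bedrock, Vertex AI, or Foundry); everyone else receives an authorization error regardless of payload correctness.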

Video Resources

Video coverage pending editorial review. Independent explainer videos on Mythos Preview, Project Glasswing, CyberGym methodology, and CVE-2026-4747 are emerging across the security community. We will add verified video embeds once they meet our sourcing threshold. Until then, the primary references above (Anthropic system card, UC Berkeley paper, NVD) are the authoritative written sources.


Data verified: 2026-04-13. Claude and Mythos are trademarks of Anthropic. GPT is a trademark of OpenAI. Google Gemini is a trademark of Google LLC. CyberGym is the work of Wang, Shi, He, Cai, Zhang, and Song (UC Berkeley).
Before You Use AI
Your Privacy

Anthropic's commercial API and business plans do not use customer data to train models. Free-tier Claude.ai conversations may be used for training unless you opt out. Mythos Preview runs under partner agreements with custom data-retention terms, including HIPAA BAAs where applicable, on AWS, GCP, and Microsoft infrastructure under the allow-list. Review Anthropic's privacy policy before submitting sensitive code or credentials.

Mental Health & AI Dependency

Security work under time pressure, combined with AI tooling, can push practitioners toward over-reliance on model output. Keep human review in the loop for every vulnerability disclosure decision. If you or someone you know is experiencing a mental health crisis:

  • 988 Suicide & Crisis Lifeline -- Call or text 988 (US)
  • SAMHSA Helpline -- 1-800-662-4357
  • Crisis Text Line -- Text HOME to 741741
Your Rights & Our Transparency

Under GDPR and CCPA you have the right to access, correct, and delete your personal data. Tech Jacks Solutions maintains editorial independence from all vendors, including Anthropic. This article was not sponsored, reviewed, or approved by Anthropic. We do not receive affiliate commissions on Claude or Mythos access. Evaluations here draw on primary Anthropic documentation, the CyberGym paper, and third-party reporting.