Claude Fable 5's Benchmark Record Exists. The Model Doesn't, For Most Teams.

June 16, 2026 | 3 min read | Epoch AI (per published reports); Anthropic API Documentation | Partial | Strong

Tech Jacks Solutions AI News Coverage

According to multiple published reports, the U.S. Commerce Department issued an export control directive requiring Anthropic to suspend global access to Claude Fable 5 and Mythos 5. Per reports citing Epoch AI's June 12 independent evaluation, Fable 5 scored 88% on FrontierMath Tier 4, the highest figure reported for any model on that benchmark, and now that data belongs to a model most teams can't reach.


FrontierMath Tier 4, 88% (per reports)

Key Takeaways

Epoch AI's June 12 independent evaluation reportedly found Claude Fable 5 scored 88% on FrontierMath Tier 4, the highest reported figure on that benchmark, per multiple T3 reports.
Claude Fable 5 and Mythos 5 share confirmed specs: 1M token context window, 128K max output tokens, $10/$50 per million input/output tokens (T1-confirmed via Anthropic API docs).
Per multiple published reports, the Commerce Department issued an export control directive; Anthropic suspended global access to both models, making the Epoch evaluation the only current independent technical record.
Reports indicate the directive cited jailbreak vulnerability concerns; this is a media characterization, not confirmed government directive language.

Model Release

Claude Fable 5 / Claude Mythos 5

Organization: Anthropic

Type: LLM, Flagship

Parameters: Not disclosed

Benchmark: [per reports citing Epoch AI] FrontierMath Tier 4: 88%; Tiers 1–3: 87%

Availability: SUSPENDED, per U.S. export control directive (as of June 2026, per reports)

Verification

Partial. T1: Anthropic API documentation (specs confirmed). T3: Multiple journalism outlets (suspension, benchmark figures). Epoch AI primary source URL not confirmed; benchmark figures from T3 reporting only. Suspension framing from multiple T3 sources; no T1 government directive accessed.

The evaluation arrived before the suspension did.

Per reports citing Epoch AI’s June 12 independent evaluation, Claude Fable 5 scored 88% on FrontierMath Tier 4 (v2) and 87% on Tiers 1–3. Fable 5 reportedly outperformed GPT-5.5 by approximately 13 percentage points on the same benchmark, with GPT-5.5 scoring roughly 75%, according to reports citing that evaluation. Three days later, according to multiple published reports, the Commerce Department issued an export control directive requiring Anthropic to suspend access to both models for foreign nationals. Anthropic suspended global access to both models, per those reports.

The catch: Epoch AI published the only independent technical record of these models in the same week they disappeared from general availability.

What the specs confirm, per Anthropic’s official API documentation, is more grounded. Claude Fable 5 and Claude Mythos 5 share the same underlying specifications: a 1 million token default context window, up to 128,000 output tokens per request, and pricing of $10 per million input tokens and $50 per million output tokens. Both are accessible via the API under the model identifier `claude-fable-5`. These figures are T1-confirmed. The suspension status and the benchmark data are not: both rest on T3 reporting and are qualified accordingly throughout.
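To make the pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from the figures above. The helper name `estimate_cost_usd` and the limit checks are illustrative assumptions, not part of Anthropic's documentation; in particular, treating the 1M token context window as a cap on input tokens alone is a simplification.

```python
# Figures as stated in this brief (attributed to Anthropic's API documentation).
INPUT_USD_PER_MILLION = 10
OUTPUT_USD_PER_MILLION = 50
CONTEXT_WINDOW_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 128_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: the context window is checked against input tokens only.
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("input exceeds the 1M token context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the 128K token maximum")
    # Integer arithmetic first, one division at the end.
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A request that fills the context window and uses the maximum output.
    print(f"${estimate_cost_usd(1_000_000, 128_000):.2f}")
```

At the listed rates, a request that fills the full context window and produces the maximum output comes to $10.00 for input plus $6.40 for output, or $16.40 in total.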

FrontierMath Tier 4 (v2), per reports citing Epoch AI evaluation

Claude Fable 5: 88%

GPT-5.5: ~75% (approx., single T3 source)

Evidence

Claude Fable 5 reportedly outperformed GPT-5.5 by approximately 13 percentage points on FrontierMath Tier 4

Multiple T3 journalism sources reference Epoch AI evaluation; Epoch AI primary URL not confirmed; GPT-5.5 score from one T3 source described as 'roughly 75%'
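Because the GPT-5.5 figure is only "roughly 75%", the 13-point gap is itself approximate. The short Python sketch below shows how the gap moves under a hypothetical band of two points either side of 75%; the band and the helper names are assumptions for illustration, not reported figures.

```python
FABLE_5_SCORE = 88.0     # percent, per reports citing Epoch AI
GPT_5_5_REPORTED = 75.0  # percent, "roughly 75%" per a single T3 source


def point_gap(score_a: float, score_b: float) -> float:
    """Difference between two scores in percentage points."""
    return score_a - score_b


def relative_gap(score_a: float, score_b: float) -> float:
    """Difference between two scores as a percentage of score_b."""
    if score_b == 0:
        raise ValueError("score_b must be non-zero")
    return (score_a - score_b) / score_b * 100


if __name__ == "__main__":
    # Hypothetical band around the reported "roughly 75%".
    for assumed in (GPT_5_5_REPORTED - 2, GPT_5_5_REPORTED, GPT_5_5_REPORTED + 2):
        print(
            f"GPT-5.5 at {assumed:.0f}%: "
            f"{point_gap(FABLE_5_SCORE, assumed):.0f} points, "
            f"{relative_gap(FABLE_5_SCORE, assumed):.1f}% relative"
        )
```

A percentage-point gap and a relative gap are different quantities: at the reported figures, 13 points corresponds to a relative difference of about 17%.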

Why it matters

Most AI benchmark cycles work like this: a model releases, a third-party evaluator publishes independent results, developers run their own tests, adoption decisions follow. That sequence broke here. The evaluation shipped. The adoption phase didn’t. For teams that were using Fable 5 or Mythos 5 in production, the suspension is operational; for teams evaluating whether to adopt, the Epoch data has become their only external reference point for a model they can’t currently test.

Reports indicate the directive was issued in part over concerns about a potential jailbreak vulnerability that could enable the models to be used without their built-in safety restrictions. That framing comes from media coverage, not from confirmed government directive language, and it should be read accordingly.

Context

The Fable 5 suspension isn’t the first AI governance intervention, but it’s the first reported case of a U.S. government directive removing a commercially deployed frontier model from broad public availability. The prior TJS analysis of the directive covered the legal and national security framing; this brief sits on the technology side of that story. The pattern matters independently of this specific model: a government can issue a directive that removes a production AI system from availability with no technical warning.

What to Watch

Anthropic legal challenge outcome (10 USC 3252): Ongoing

Epoch AI primary FrontierMath evaluation page confirmed and sourced: Next pipeline cycle

Access restoration or directive extension announcement: Unknown

What to watch

Anthropic has reportedly cited 10 USC 3252 in contesting the directive. The legal challenge is covered in the regulation pillar. The technology question is simpler: if the suspension lifts, the Epoch evaluation data will have aged by weeks or months, and developer teams will face the same adoption decision with a benchmark record they couldn’t act on in real time. If the suspension holds or extends, the FrontierMath data becomes a historical artifact, useful for competitive analysis, not for deployment planning.

TJS synthesis

Don’t rely on the Epoch benchmark alone to anchor a migration decision. Independent evaluation data is more valuable than vendor claims, but it’s not a substitute for your team’s own testing on your actual workloads. If you were planning to evaluate Fable 5 or Mythos 5, wait for the Epoch AI primary evaluation page to be confirmed and sourced. The figures cited here come from T3 journalism, and the gap between “roughly 75%” for GPT-5.5 and a precise score matters if that comparison is driving your architecture decision. The suspension itself is reason enough to hold evaluation timelines until availability is confirmed.
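As a rough illustration of what testing on your own workloads can look like, the Python sketch below computes a pass rate over a handful of task-specific checks. Everything in it is hypothetical: `pass_rate`, the `stub_model` stand-in, and the sample cases are placeholders, and a real harness would call whichever model your team can actually access.

```python
from typing import Callable, Sequence, Tuple

# A case pairs a prompt with a check that accepts or rejects the model output.
Case = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Callable[[str], str], cases: Sequence[Case]) -> float:
    """Return the fraction of cases whose check accepts the model's output."""
    if not cases:
        raise ValueError("at least one case is required")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call; it answers one known prompt."""
    return "4" if prompt == "What is 2 + 2?" else "unknown"


# Placeholder cases; replace these with prompts and checks from your workload.
CASES = [
    ("What is 2 + 2?", lambda out: out.strip() == "4"),
    ("Name the capital of France.", lambda out: "Paris" in out),
]

if __name__ == "__main__":
    print(f"pass rate: {pass_rate(stub_model, CASES):.0%}")
```

Keeping the model behind a plain callable means the same cases can be rerun against any accessible model, so the comparison rests on your own tasks and not on a single published score.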