Claude AI Pricing: Every Plan and API Cost Explained (2026)
Claude's pricing spans from $0 to custom enterprise contracts. The right plan depends entirely on how you use it: a student exploring AI pays nothing, a solo developer on Pro pays $20/month, a power user burning through tokens on Claude Code pays $100-200/month, and a 70-person enterprise team negotiates a custom deal. The API adds another layer with per-token pricing that ranges from $1 to $25 per million tokens depending on the model. This guide walks through every tier, every pricing lever, and the specific traps that inflate your bill if you are not paying attention.
Understand the Model Costs
If you are building on the API, your costs are determined by three variables: which model you pick, how many tokens you send and receive, and whether you use cost-reduction features like batching and caching. Anthropic released Claude Opus 4.8 on May 28, 2026 at the same $5/$25 per-token rate card as 4.7 and 4.6. The tokenizer change introduced in 4.7 (which can map the same text to 1.0-1.35x more tokens) carries forward. New with 4.8: a fast mode at $10 input / $50 output per MTok delivers 2.5x throughput. On June 9, 2026, Anthropic released Claude Fable 5, the current flagship that sits above Opus 4.8 in capability at $10 input / $50 output per MTok (double the Opus 4.x standard rate, the same headline number as Opus 4.8 fast mode). Opus 4.8 stays on its $5/$25 rate card as the standard, lower-cost choice and the safeguard fallback. Here is the current pricing across the Claude model lineup:
| Model | Input /MTok | Output /MTok | Batch In | Batch Out |
|---|---|---|---|---|
| Fable 5 FLAGSHIP | $10 | $50 | -- | -- |
| Opus 4.8 STANDARD | $5 | $25 | $2.50 | $12.50 |
| Opus 4.8 Fast 2.5x | $10 | $50 | -- | -- |
| Opus 4.7 / 4.6 | $5 | $25 | $2.50 | $12.50 |
| Sonnet 4.6 | $3 | $15 | $1.50 | $7.50 |
| Haiku 4.5 | $1 | $5 | $0.50 | $2.50 |
Opus 4.8, 4.7, and 4.6 all offer 1M token context and 128K max output. The Opus 4.7 tokenizer (which maps input text to 1.0-1.35x more tokens than 4.6) carries forward to 4.8. Fast mode delivers 2.5x throughput at 2x the standard rate. Sonnet 4.6: 1M context / 64K output. Haiku 4.5: 200K context / 64K output.
Claude Fable 5: The New Flagship Above Opus 4.8
Released June 9, 2026, Claude Fable 5 is Anthropic's current flagship and most capable generally-available model. It is priced at $10 input / $50 output per MTok, double the flat $5/$25 Opus 4.x rate card and the same headline number as Opus 4.8 fast mode. It sits above Opus 4.8 in capability; Opus 4.8 stays at $5/$25 as the lower-cost standard and the safeguard fallback that Fable 5 routes risky queries to.
Included on paid plans, then credits: Fable 5 is included on the Pro, Max, and Team plans at no extra cost until June 22, 2026. From June 23, 2026 onward, Fable 5 usage on those plans draws on usage credits.
Mythos 5: the same underlying model as Fable 5 with safeguards lifted in some areas. It is not generally available and not separately priced for the public; access is restricted to vetted Project Glasswing partners (Anthropic's critical-infrastructure cyberdefense collaboration with the US government).
Read our full Claude Fable 5 review for benchmarks, the over-refusal tradeoff, and how it compares to Opus 4.8.
Opus 4.8: Same Rate Card, New Fast Mode, Better Benchmarks
The headline: Opus 4.8 ships at the same $5 input / $25 output per MTok as Opus 4.7 and 4.6. The per-token rate did not move.
What's new: Opus 4.8 introduces a fast mode at $10 input / $50 output per MTok, delivering 2.5x throughput for latency-sensitive workloads. On benchmarks, SWE-Bench Pro jumps from 64.3% (4.7) to 69.2%, and it ranks #1 on the Intelligence Index (score 61, across 218 models). The 4.7 tokenizer (1.0-1.35x more tokens than 4.6 on the same text) carries forward, so the effective-cost considerations from the previous update still apply.
Mitigations that still work:
- Prompt caching: up to 90% off cache reads. The most reliable way to offset the tokenizer change if you send repeat context.
- Batch API: 50% off async workloads. Stacks with caching.
- task_budget (beta): advisory token cap (minimum 20,000) that the model sees as a running countdown across an agentic loop, letting it self-pace. Header:
task-budgets-2026-03-13. - Effort parameter tuning: lower effort reduces tool-call and reasoning-token consumption.
Benchmarks sourced from Anthropic, Claude Opus 4.8 announcement (May 28, 2026). Independent verification pending. The Intelligence Index score of 61 used 110M output tokens versus a 23M median, so verbose reasoning may inflate some metrics.
Additional pricing tiers to know:
- 200K token threshold: When your input exceeds 200K tokens, input pricing doubles for Opus ($5 to $10) and Sonnet ($3 to $6). Output also increases: Opus from $25 to $37.50 and Sonnet from $15 to $22.50 per MTok. This is the single biggest surprise on API bills. The surcharge applies to Opus 4.8 identically to 4.7 and 4.6, and the 4.7 tokenizer change (carried into 4.8) means you may cross the 200K boundary sooner on the same source text.
- Prompt caching reads: Cached input tokens cost 0.1x the standard rate. That is a 90% discount on repeated context. Writes cost 1.25x, but you only pay the write cost once.
- Batch API: 50% discount on both input and output tokens. Results return within 24 hours instead of real-time. Use it for bulk processing, evaluations, and non-interactive workflows.
- Fast mode (Opus 4.8): Higher throughput at premium pricing: Opus 4.8 fast mode costs $10 input / $50 output per MTok at 2.5x normal speed. This is significantly cheaper than the legacy Opus 4.6 fast mode ($30/$150 per MTok), making latency-sensitive production workloads more affordable. Sonnet 4.6 fast mode is not yet publicly priced.
- Model migration: Opus 4.8 (model ID
claude-opus-4-8) replaces 4.7 as the default in consumer Claude apps and Claude Code. Opus 4.7 and 4.6 remain available on the API with no published deprecation date. Rate limits are pooled across Opus 4.8, 4.7, 4.6, 4.5, 4.1, and 4 at your tier, allowing gradual migration without separate quota management.
Pick Your Consumer Plan
Consumer plans are straightforward: pick the tier that matches how heavily you use Claude. The key differentiator is usage volume, measured as a multiple of the free tier's daily limit. Heavy Claude Code users will hit Pro limits by midday and should evaluate Max.
How the 5-hour rolling window works: Token budgets reset on a rolling 5-hour window, not a fixed daily cap. If you burn through your budget at 10am, you will start regenerating capacity around 3pm. Claude Code sessions consume tokens faster than chat because they include system prompts, tool calls, and multi-turn context on every interaction.
Evaluate Business Plans
Business plans add admin controls, SSO, and seat management. The critical fork is between Team Standard (no Claude Code) and Team Premium (Claude Code included). Enterprise is custom pricing for organizations that need compliance certifications and dedicated support.
AI Governance Charter
Establish your organization's AI principles in one document
Download Free →The 200K Token Trap
This is the pricing trap that catches most API users. When your input crosses the 200K token threshold, Opus jumps from $5 to $10 per MTok input, and Sonnet jumps from $3 to $6. Output also rises: Opus from $25 to $37.50, Sonnet from $15 to $22.50 per MTok. This means a single large-document analysis can cost significantly more than you expected if you are not monitoring token counts. The surcharge applies to Opus 4.8 the same as 4.7 and 4.6, and because the 4.7 tokenizer (carried into 4.8) maps the same text to 1.0-1.35x more tokens, you can cross the 200K boundary on source text that stayed under the threshold on 4.6.
The 200K threshold exists because processing long contexts requires proportionally more compute. Anthropic chose to make this a step function rather than a gradual increase, which means the cost jump is sudden and noticeable.
How to avoid the trap:
- Break large inputs into chunks: If you are processing a 300K token document, consider splitting it into segments that stay under 200K each.
- Use prompt caching aggressively: Cached reads cost 0.1x, so even with the 200K surcharge, cached long-context calls are still affordable.
- Switch to Haiku for preprocessing: Haiku at $1/$5 per MTok does not have a 200K surcharge. Use it to summarize or filter long documents before sending the refined output to Opus or Sonnet.
- Monitor via the API usage dashboard: Anthropic's console shows per-request token counts. Set alerts before you get a surprise bill.
How to Save Up to 95% on API Costs
Anthropic offers two independent cost-reduction mechanisms that stack together. Used correctly, they can reduce your per-token cost by up to 95%.
Prompt Caching (90% Savings on Reads)
When you send the same system prompt, document, or context prefix repeatedly, prompt caching stores it on Anthropic's servers. The first call costs 1.25x (the cache write). Every subsequent call using that cached prefix costs 0.1x (the cache read). For applications that send the same instructions or reference documents on every call, this is the single biggest cost lever available.
Batch API (50% Savings)
The Batch API accepts a set of requests and returns results within 24 hours. Every request in a batch costs 50% of the standard per-token rate. Use it for content generation pipelines, evaluation runs, data extraction jobs, and any workflow where you do not need real-time responses.
Combined: Cache + Batch = Up to 95%
When you combine prompt caching reads (0.1x) with batch pricing (0.5x), the effective input cost drops to 0.05x the standard rate. On Haiku, that means input tokens at $0.05 per MTok instead of $1. The math works out to roughly 95% savings on the cached input portion of batch requests.
API Rate Limits by Tier
| Tier | Credit Purchase | Requests/min | Tokens/min |
|---|---|---|---|
| Tier 1 | $5 | 50 | 40,000 |
| Tier 2 | $40 | 1,000 | 80,000 |
| Tier 3 | $200 | 2,000 | 160,000 |
| Tier 4 | $400 | 4,000 | 400,000 |
Rate limits are per-model. Opus, Sonnet, and Haiku each have independent limits at your tier level. Source: Anthropic Docs
What to Tell Your Boss
Forward this box to whoever approves your software budget. It covers the numbers they will ask about.
Claude AI Pricing Summary (June 2026)
What's new (June 9, 2026): Anthropic released Claude Fable 5, the current flagship above Opus 4.8, at $10/$50 per MTok (double the Opus 4.x standard rate). Fable 5 is included on Pro, Max, and Team plans at no extra cost until June 22, 2026, then draws on usage credits. Mythos 5 is the same model with safeguards lifted, restricted to Project Glasswing partners and not generally available. Opus 4.8 stays at $5/$25 as the standard, lower-cost choice and safeguard fallback.
Earlier (May 28, 2026): Opus 4.8 launched at the same $5/$25 per-MTok rate card as 4.7 and 4.6, plus a fast mode at $10/$50 per MTok (2.5x throughput). SWE-Bench Pro reached 69.2% (from 64.3% on 4.7). The 4.7 tokenizer increase (1.0-1.35x more tokens on the same text) still applies. Prompt caching (90% off reads) remains the top mitigation. Free tier remains Sonnet 4.6.
Individual use: Free tier available (Sonnet 4.6 only, daily limits). Pro plan at $20/mo unlocks all models including Opus 4.8 and Claude Code. Max plans ($100 for 5x, $200 for 20x) for heavy coding users.
Team deployment: Team Standard at $25/seat/mo (no Claude Code) or Team Premium at $125-150/seat/mo monthly ($100/seat/mo annual, Claude Code included). 5-150 seats. Annual billing saves 17-33% depending on tier.
Enterprise: Custom pricing, estimated ~$60/seat for 70+ users (~$50K annual minimum). Includes HIPAA, SCIM, SOC 2, ISO 27001, audit logs, and dedicated support.
API: Pay-per-token. Haiku 4.5 ($1/$5 MTok) for cost-sensitive tasks, Sonnet 4.6 ($3/$15) for balanced workloads, Opus 4.8 ($5/$25, or $10/$50 fast mode) for complex reasoning, and the Fable 5 flagship ($10/$50) for the most demanding agentic and reasoning work. Batch API and prompt caching reduce costs up to 95%.
Hidden costs to budget for: Web search ($10/1K searches + tokens), code execution ($0.05/hr after free allotment), US data residency (1.1x multiplier), 200K+ input surcharge ($10 input / $37.50 output for Opus). No published SLA on self-serve plans.
Sources: anthropic.com/pricing, Opus 4.8 announcement, and Fable 5 / Mythos 5 announcement, verified June 9, 2026.
What Claude Pricing Does Not Include
The base subscription and API prices do not cover everything. These additional costs can add up if you are not tracking them.
Learn More: Video Resources
Go Deeper
Resources from across Tech Jacks Solutions
FREEAI Governance Charter
Establish your organization's AI principles in one document
AI Career Paths
Explore roles that work with these tools daily
EU AI Act Guide
Check your compliance obligations under the EU AI Act
FREEAI Risk Management Template
Identify, assess, and mitigate AI deployment risks