Claude AI Pricing: Every Plan and API Cost Explained (2026)
Claude's pricing spans from $0 to custom enterprise contracts. The right plan depends entirely on how you use it: a student exploring AI pays nothing, a solo developer on Pro pays $20/month, a power user burning through tokens on Claude Code pays $100-200/month, and a 70-person enterprise team negotiates a custom deal. The API adds another layer with per-token pricing that ranges from $1 to $25 per million tokens depending on the model. This guide walks through every tier, every pricing lever, and the specific traps that inflate your bill if you are not paying attention.
Understand the Model Costs
If you are building on the API, your costs are determined by three variables: which model you pick, how many tokens you send and receive, and whether you use cost-reduction features like batching and caching. Anthropic released Claude Opus 4.7 on April 16, 2026 at the same per-token rate card as 4.6 — but it uses a new tokenizer, so the same text may cost 0-35% more per request. Here is the current pricing across the Claude model lineup:
| Model | Input /MTok | Output /MTok | Batch In | Batch Out |
|---|---|---|---|---|
| Opus 4.7 NEW | $5 | $25 | $2.50 | $12.50 |
| Opus 4.6 | $5 | $25 | $2.50 | $12.50 |
| Sonnet 4.6 | $3 | $15 | $1.50 | $7.50 |
| Haiku 4.5 | $1 | $5 | $0.50 | $2.50 |
Opus 4.7 and 4.6 both offer 1M token context and 128K max output. Opus 4.7 uses a new tokenizer that maps the same input text to roughly 1.0-1.35x as many tokens as 4.6 — see the tokenizer callout below for the effective-cost impact. Sonnet 4.6: 1M context / 64K output. Haiku 4.5: 200K context / 64K output.
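As a sanity check, the rate card above can be turned into a small per-request cost estimator. This is an illustrative sketch, not an official SDK call: the rates are hard-coded from the table, the model keys are invented labels, and a real integration should read actual token counts from the API response's usage data.

```python
# Illustrative per-request cost estimator using the rate card above.
# Rates are USD per million tokens (MTok); model keys are informal labels.
RATES = {
    "opus-4.7":   {"input": 5.00, "output": 25.00},
    "opus-4.6":   {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5":  {"input": 1.00, "output": 5.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Standard (non-batch, non-cached) cost of one request, in USD."""
    r = RATES[model]
    return (input_tokens / 1_000_000) * r["input"] + \
           (output_tokens / 1_000_000) * r["output"]

# 100K tokens in, 10K out on Sonnet 4.6: $0.30 + $0.15 = $0.45
print(request_cost("sonnet-4.6", 100_000, 10_000))
```

Note this does not model the 200K long-context surcharge or discounts; those are covered below.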
Opus 4.7: Rate Card Unchanged, Effective Cost May Rise
The headline: Opus 4.7 ships at the same $5 input / $25 output per MTok as Opus 4.6. The per-token rate did not move.
The catch: Opus 4.7 uses a new tokenizer. The same input text maps to roughly 1.0x to 1.35x as many tokens as it did on 4.6. In practice, per-request cost can rise 0% to 35% on identical prompts — with the biggest increases on code, structured data (JSON/XML/CSV), and non-English text. Natural-English prose sees the smallest change.
Mitigations that still work:
- Prompt caching — up to 90% off cache reads. The most reliable way to offset the tokenizer change if you send repeat context.
- Batch API — 50% off async workloads. Stacks with caching.
- task_budget (beta) — advisory token cap (minimum 20,000) that the model sees as a running countdown across an agentic loop, letting it self-pace. Header: task-budgets-2026-03-13.
- Effort parameter tuning — lower effort reduces tool-call and reasoning-token consumption.
This is the same failure pattern behind last cycle's "output pricing stays the same" mistake — we flag it explicitly so readers budgeting API costs do not under-shoot. Source: Anthropic, Claude Opus 4.7 announcement (April 16, 2026).
Additional pricing tiers to know:
- 200K token threshold: When your input exceeds 200K tokens, input pricing doubles for Opus ($5 to $10) and Sonnet ($3 to $6). Output also increases: Opus from $25 to $37.50 and Sonnet from $15 to $22.50 per MTok. This is the single biggest surprise on API bills. The surcharge applies to Opus 4.7 identically to 4.6 — the tokenizer change means you may cross the 200K boundary sooner on the same source text.
- Prompt caching reads: Cached input tokens cost 0.1x the standard rate. That is a 90% discount on repeated context. Writes cost 1.25x, but you only pay the write cost once.
- Batch API: 50% discount on both input and output tokens. Results return within 24 hours instead of real-time. Use it for bulk processing, evaluations, and non-interactive workflows.
- Fast mode (historical, Opus 4.6): Higher throughput at premium pricing — Opus 4.6 fast mode costs $30 input / $150 output per MTok. Anthropic has not published a fast-mode rate card for Opus 4.7 as of April 16, 2026. Sonnet 4.6 fast mode is not yet publicly priced.
- Model migration: Opus 4.7 (model ID claude-opus-4-7) is replacing 4.5 and 4.6 in consumer Claude apps and GitHub Copilot over the coming weeks. Opus 4.6 remains available on the API with no published deprecation date. Rate limits are pooled across Opus 4.7, 4.6, 4.5, 4.1, and 4 at your tier, allowing gradual migration without separate quota management.
Pick Your Consumer Plan
Consumer plans are straightforward: pick the tier that matches how heavily you use Claude. The key differentiator is usage volume, measured as a multiple of the free tier's daily limit. Heavy Claude Code users will hit Pro limits by midday and should evaluate Max.
How the 5-hour rolling window works: Token budgets reset on a rolling 5-hour window, not a fixed daily cap. If you burn through your budget at 10am, you will start regenerating capacity around 3pm. Claude Code sessions consume tokens faster than chat because they include system prompts, tool calls, and multi-turn context on every interaction.
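Anthropic has not published the exact accounting behind the rolling window, but the behavior described above matches a standard rolling-window tracker. Here is a minimal sketch of that data structure; the budget figure is invented for illustration, and per-plan token numbers are not public.

```python
from collections import deque

# Minimal sketch of a 5-hour rolling-window token budget.
# The budget number is invented; this illustrates the mechanism only.
WINDOW_SECONDS = 5 * 3600

class RollingBudget:
    def __init__(self, budget_tokens: int):
        self.budget = budget_tokens
        self.events = deque()  # (timestamp_seconds, tokens) pairs

    def _evict(self, now: float) -> None:
        # Usage older than 5 hours no longer counts against the window.
        while self.events and now - self.events[0][0] >= WINDOW_SECONDS:
            self.events.popleft()

    def spend(self, now: float, tokens: int) -> bool:
        """Record usage; returns False if it would exceed the window budget."""
        self._evict(now)
        used = sum(t for _, t in self.events)
        if used + tokens > self.budget:
            return False
        self.events.append((now, tokens))
        return True

b = RollingBudget(budget_tokens=100_000)
b.spend(0, 100_000)                # burn everything at t=0 ("10am")
print(b.spend(3600, 1))            # one hour later: still capped -> False
print(b.spend(WINDOW_SECONDS, 1))  # five hours later ("3pm") -> True
```

This is why capacity "regenerates" gradually rather than resetting at midnight: old usage simply ages out of the window.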
Evaluate Business Plans
Business plans add admin controls, SSO, and seat management. The critical fork is between Team Standard (no Claude Code) and Team Premium (Claude Code included). Enterprise is custom pricing for organizations that need compliance certifications and dedicated support.
The 200K Token Trap
This is the pricing trap that catches most API users. When your input crosses the 200K token threshold, Opus jumps from $5 to $10 per MTok input, and Sonnet jumps from $3 to $6. Output also rises: Opus from $25 to $37.50, Sonnet from $15 to $22.50 per MTok. This means a single large-document analysis can cost significantly more than you expected if you are not monitoring token counts. The surcharge applies to Opus 4.7 the same as 4.6 — and because 4.7's new tokenizer maps the same text to roughly 1.0-1.35x as many tokens, you can cross the 200K boundary on source text that stayed under the threshold on 4.6.
The 200K threshold exists because processing long contexts requires proportionally more compute. Anthropic chose to make this a step function rather than a gradual increase, which means the cost jump is sudden and noticeable.
How to avoid the trap:
- Break large inputs into chunks: If you are processing a 300K token document, consider splitting it into segments that stay under 200K each.
- Use prompt caching aggressively: Cached reads cost 0.1x, so even with the 200K surcharge, cached long-context calls are still affordable.
- Switch to Haiku for preprocessing: Haiku at $1/$5 per MTok does not have a 200K surcharge. Use it to summarize or filter long documents before sending the refined output to Opus or Sonnet.
- Monitor via the API usage dashboard: Anthropic's console shows per-request token counts. Set alerts before you get a surprise bill.
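The chunking advice is easiest to see in numbers. This sketch assumes, as described above, that the surcharge reprices the whole request once input exceeds 200K tokens, and ignores any per-chunk overhead from repeated instructions.

```python
# Sketch: one 300K-token Opus input vs. two sub-200K chunks.
# Assumes the long-context surcharge reprices the entire request
# once input exceeds 200K tokens, per the step function above.
def opus_input_cost(tokens: int) -> float:
    rate = 10.00 if tokens > 200_000 else 5.00  # USD per MTok
    return (tokens / 1_000_000) * rate

single = opus_input_cost(300_000)                              # $3.00
chunked = opus_input_cost(150_000) + opus_input_cost(150_000)  # $1.50
print(single, chunked)
```

Same 300K tokens of input, half the input cost — before even counting caching or batch discounts.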
How to Save Up to 95% on API Costs
Anthropic offers two independent cost-reduction mechanisms that stack together. Used correctly, they can reduce your per-token cost by up to 95%.
Prompt Caching (90% Savings on Reads)
When you send the same system prompt, document, or context prefix repeatedly, prompt caching stores it on Anthropic's servers. The first call costs 1.25x (the cache write). Every subsequent call using that cached prefix costs 0.1x (the cache read). For applications that send the same instructions or reference documents on every call, this is the single biggest cost lever available.
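The break-even point is worth working out: with a 1.25x write and 0.1x reads, caching pays for itself from the second call onward. A sketch in relative units (1.0 = the standard input cost of the cached prefix):

```python
# Sketch: when does prompt caching pay off?
# The cache write costs 1.25x the base input rate (paid once);
# every subsequent read of that prefix costs 0.1x.
def cached_cost(calls: int, base: float = 1.0) -> float:
    """Relative input cost of `calls` requests over one cached prefix."""
    return 1.25 * base + 0.1 * base * (calls - 1)

def uncached_cost(calls: int, base: float = 1.0) -> float:
    return float(calls) * base

print(cached_cost(1), uncached_cost(1))  # 1.25 vs 1.0  -> slight loss
print(cached_cost(2), uncached_cost(2))  # 1.35 vs 2.0  -> already ahead
```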
Batch API (50% Savings)
The Batch API accepts a set of requests and returns results within 24 hours. Every request in a batch costs 50% of the standard per-token rate. Use it for content generation pipelines, evaluation runs, data extraction jobs, and any workflow where you do not need real-time responses.
Combined: Cache + Batch = Up to 95%
When you combine prompt caching reads (0.1x) with batch pricing (0.5x), the effective input cost drops to 0.05x the standard rate. On Haiku, that means input tokens at $0.05 per MTok instead of $1. The math works out to roughly 95% savings on the cached input portion of batch requests.
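The stacking math, spelled out:

```python
# The discounts are multiplicative: a cache read (0.1x) inside a
# batch request (0.5x) costs 0.05x the standard input rate.
CACHE_READ = 0.10
BATCH = 0.50

effective = CACHE_READ * BATCH       # 0.05x the standard rate
haiku_input = 1.00 * effective       # $0.05 per MTok instead of $1
savings = 1 - effective              # 95% off the cached input portion
print(effective, haiku_input, savings)
```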
API Rate Limits by Tier
| Tier | Credit Purchase | Requests/min | Tokens/min |
|---|---|---|---|
| Tier 1 | $5 | 50 | 40,000 |
| Tier 2 | $40 | 1,000 | 80,000 |
| Tier 3 | $200 | 2,000 | 160,000 |
| Tier 4 | $400 | 4,000 | 400,000 |
Rate limits are per-model. Opus, Sonnet, and Haiku each have independent limits at your tier level. Source: Anthropic Docs
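For capacity planning, the tokens-per-minute ceiling sets a floor on how long a bulk job takes. A back-of-envelope sketch (it ignores the requests-per-minute limit, burst behavior, and per-model pooling, so treat it as a lower bound):

```python
# Sketch: minimum wall-clock time to push a bulk job through each
# tier's tokens-per-minute ceiling (requests/min and bursts ignored).
TIER_TOKENS_PER_MIN = {1: 40_000, 2: 80_000, 3: 160_000, 4: 400_000}

def minutes_for_job(total_tokens: int, tier: int) -> float:
    return total_tokens / TIER_TOKENS_PER_MIN[tier]

# A 10M-token evaluation run: 250 min on Tier 1, 25 min on Tier 4.
print(minutes_for_job(10_000_000, 1), minutes_for_job(10_000_000, 4))
```

If that lower bound is too slow for an offline workload, the Batch API above is usually the better fit anyway.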
What to Tell Your Boss
Forward this box to whoever approves your software budget. It covers the numbers they will ask about.
Claude AI Pricing Summary (April 2026)
What's new (April 16, 2026): Opus 4.7 launched at the same $5/$25 per-MTok rate card as 4.6 — but its new tokenizer maps the same text to roughly 1.0-1.35x as many tokens, so effective cost per request can rise 0-35%. Prompt caching (90% off reads) is the most reliable mitigation. Free tier remains Sonnet 4.6.
Individual use: Free tier available (Sonnet 4.6 only, daily limits). Pro plan at $20/mo unlocks all models including Opus 4.7 and Claude Code. Max plans ($100 for 5x, $200 for 20x) for heavy coding users.
Team deployment: Team Standard at $25/seat/mo (no Claude Code) or Team Premium at $125-150/seat/mo billed monthly ($100/seat/mo billed annually, Claude Code included). 5-150 seats. Annual billing saves 17-33% depending on tier.
Enterprise: Custom pricing, estimated ~$60/seat for 70+ users (~$50K annual minimum). Includes HIPAA, SCIM, SOC 2, ISO 27001, audit logs, and dedicated support.
API: Pay-per-token. Haiku 4.5 ($1/$5 MTok) for cost-sensitive tasks, Sonnet 4.6 ($3/$15) for balanced workloads, Opus 4.7 or 4.6 ($5/$25) for complex reasoning. Batch API and prompt caching reduce costs up to 95%.
Hidden costs to budget for: Web search ($10/1K searches + tokens), code execution ($0.05/hr after free allotment), US data residency (1.1x multiplier), 200K+ input surcharge ($10 input / $37.50 output for Opus). No published SLA on self-serve plans.
Sources: anthropic.com/pricing and Opus 4.7 announcement, verified April 16, 2026.
What Claude Pricing Does Not Include
The base subscription and API prices do not cover everything. These additional costs can add up if you are not tracking them.