Top 8 Cheapest Frontier-Class AI APIs in 2026 (Price per 1M Tokens)

Frontier-class capability used to mean frontier-class pricing. In 2026 that link has broken. This ranking lists the 8 cheapest AI APIs that still clear a real capability bar, measured by vendor list price per 1 million tokens. Every price below is paired with a capability marker, so you never confuse cheap with good.

$0.10: cheapest input per 1M (Step-3.5-Flash; Stepfun list price, output $0.30)
$0.14: DeepSeek V4-Flash input per 1M (official list; output $0.28)
$15 / $75: premium anchor, Claude Opus 4.6 (input / output per 1M)
150x: input spread, cheapest to anchor ($0.10 vs $15 per 1M)

The Price Ranking

This is the full ranking, cheapest first, by blended list price (input rate plus output rate) per 1 million tokens. The final row is the premium anchor, included only for contrast; it is not part of the cheapest-8 ranking.

#	Model	Org	Input $/1M	Output $/1M	Capability Marker
1	Step-3.5-Flash	Stepfun	$0.10	$0.30	Arena Elo ~1389
2	DeepSeek V4-Flash	DeepSeek	$0.14	$0.28	Cache-hit input $0.0028
3	DeepSeek R1 / V3.2	DeepSeek	$0.28	$0.42	Arena Elo ~1398 / 1423
4	MiniMax M2.5	MiniMax	$0.30	$1.20	SWE-bench Verified 80.2
5	DeepSeek V4-Pro	DeepSeek	$1.74	$3.48	Flagship reasoning (list)
6	Gemini 3.1 Pro	Google	$2.00	$12.00	Arena Elo ~1492
7	GPT-5.4	OpenAI	$2.50	$15.00	Arena Elo ~1463
8	Grok 3	xAI	$3.00	$15.00	Arena Elo ~1412
--	Claude Opus 4.6	Anthropic	$15.00	$75.00	Premium Anchor

Prices are vendor-published list rates per 1 million tokens as of Feb-Mar 2026, cross-checked against LLM-Stats. List prices change frequently. The anchor row is for contrast only.

How This Ranking Works

Ranked by list price per 1M tokens (input plus output) among models that clear a frontier-capability bar (Arena Elo roughly 1380 or higher, or SWE-bench above 70), as of Feb-Mar 2026. List prices are vendor-published and change frequently; third-party providers (OpenRouter, Together, Fireworks) can be cheaper.

Three caveats carry through every row. First, the DeepSeek V4-Pro launch promotion ($0.435 input / $0.87 output per 1M) ended on May 31, 2026; this ranking uses the current list price of $1.74 / $3.48. Second, cheapest is not best, so each price is paired with a capability marker (Arena Elo or SWE-bench) to keep the tradeoff visible. Third, verify the live pricing page before you budget, because these numbers move month to month.
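The ordering can be reproduced from the table with a few lines of Python. This is a minimal sketch, assuming "blended" means the simple sum of the input and output list rates as described above; the dictionary and function names are illustrative, not part of any vendor SDK.

```python
# List prices in USD per 1M tokens (input, output), copied from the ranking table.
PRICES = {
    "Step-3.5-Flash": (0.10, 0.30),
    "DeepSeek V4-Flash": (0.14, 0.28),
    "DeepSeek R1 / V3.2": (0.28, 0.42),
    "MiniMax M2.5": (0.30, 1.20),
    "DeepSeek V4-Pro": (1.74, 3.48),
    "Gemini 3.1 Pro": (2.00, 12.00),
    "GPT-5.4": (2.50, 15.00),
    "Grok 3": (3.00, 15.00),
}


def blended(name: str) -> float:
    """Input rate plus output rate, rounded to cents (assumed sort key)."""
    input_rate, output_rate = PRICES[name]
    return round(input_rate + output_rate, 2)


def rank_by_blended() -> list[str]:
    """Model names ordered from cheapest to most expensive blended rate."""
    return sorted(PRICES, key=blended)


if __name__ == "__main__":
    for position, name in enumerate(rank_by_blended(), start=1):
        print(f"{position}. {name}: ${blended(name):.2f}")
```

A simple sum weights input and output tokens equally. If your traffic is input-heavy or output-heavy, a weighted sum can reorder the middle of the list, so treat this ordering as a starting point.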

150x

The price-versus-capability gap is real, but smaller than the price gap

Input tokens cost 150 times more on the premium anchor (Claude Opus 4.6 at $15 per 1M) than on the cheapest pick (Step-3.5-Flash at $0.10 per 1M). The capability gap between them, measured on public leaderboards, is nowhere near 150 times. That is the whole point of this list: for a large share of production work, a frontier-adjacent model at a tenth of the cost is the rational default, and you reserve the premium model for the genuinely hard requests.

1. Step-3.5-Flash (Stepfun)

Stepfun's Step-3.5-Flash is the cheapest model on this list that still clears the frontier-capability bar. At $0.10 per 1 million input tokens and $0.30 per 1 million output tokens, it undercuts almost every name-brand API while posting an LMArena Elo around 1389. It is released under Apache 2.0, so you can also self-host the weights if you prefer to avoid the hosted API entirely. The model identifier on the hosted API is step-3.5-flash.

Why It Tops the List

Lowest blended list price of any frontier-bar model
Arena Elo ~1389 clears the capability bar
Apache 2.0 license, so self-host is an option
Output at $0.30 per 1M keeps generation cheap

What to Watch

Smaller ecosystem than DeepSeek or the US labs
Elo near the bar, not at the top of it
Regional availability and docs may be limited
List price can change without notice

Best for: High-volume, cost-sensitive workloads (classification, extraction, summarization, routing) where a strong mid-frontier model at the lowest price wins, and where you can validate quality on your own evals first.

Capability marker: Arena Elo approximately 1389. Cheapest is not best, so benchmark it against your specific task before committing volume.

2. DeepSeek V4-Flash

DeepSeek V4-Flash is the value workhorse of the list. At $0.14 per 1 million input tokens and $0.28 per 1 million output tokens, it costs $0.04 more per 1M on input and $0.02 less on output than the cheapest pick, while carrying DeepSeek's larger ecosystem, tooling, and documentation. The standout detail is cache-hit pricing: repeated input tokens that hit DeepSeek's context cache are billed at roughly $0.0028 per 1 million, about 2% of the list input rate, which makes repetitive prompting (long system prompts, RAG with stable context) far cheaper than the headline rate suggests. The model identifier is deepseek-v4-flash.

Why It Ranks Here

$0.14 / $0.28 list, second cheapest overall
Cache-hit input around $0.0028 per 1M
Mature SDKs, docs, and community support
Output even cheaper than the #1 pick

What to Watch

Data is processed through servers in China
Cache savings only apply to repeated input
Flash trades some depth for speed and price
Compliance review needed for regulated data

Best for: Teams that want rock-bottom pricing plus a real ecosystem. The cache-hit rate makes it especially strong for RAG pipelines and agents with large, stable system prompts.

Capability marker: Cache-hit input around $0.0028 per 1M on repeated context. Note the data-sovereignty consideration: DeepSeek processes data through servers in China, which matters for regulated workloads.
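To see how much the cache changes the bill, you can blend the two input rates by the share of input tokens that hit the cache. The sketch below assumes billing is a simple linear mix of the $0.14 list rate and the roughly $0.0028 cache-hit rate quoted above; the function name and the 80% hit share are hypothetical.

```python
def effective_input_rate(cache_hit_share: float,
                         list_rate: float = 0.14,
                         cached_rate: float = 0.0028) -> float:
    """Effective USD per 1M input tokens for a given cache-hit share (0.0 to 1.0).

    Assumes a linear blend of the cache-hit and cache-miss rates.
    """
    if not 0.0 <= cache_hit_share <= 1.0:
        raise ValueError("cache_hit_share must be between 0 and 1")
    return cache_hit_share * cached_rate + (1.0 - cache_hit_share) * list_rate


if __name__ == "__main__":
    # Hypothetical workload: 80% of input tokens are a stable, cached system prompt.
    print(f"${effective_input_rate(0.8):.5f} per 1M input tokens")
```

With no cache hits the function returns the list rate, and with every token cached it returns the cache-hit rate, so your real number sits somewhere between the two depending on how stable your prompts are.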

3. DeepSeek R1 / V3.2

This pairing covers DeepSeek's reasoning-tuned R1 and its V3.2 general model, both priced at $0.28 per 1 million input tokens and $0.42 per 1 million output tokens. They post some of the highest capability markers in the cheap tier, with Arena Elo figures around 1398 for R1 and 1423 for V3.2. R1 exposes its chain-of-thought, which is useful for transparency and debugging; V3.2 is the broader general-purpose option. Reasoning output is more verbose, so the slightly higher output rate matters when the model thinks at length.

Why It Ranks Here

Arena Elo ~1398 (R1) and ~1423 (V3.2)
Visible chain-of-thought reasoning on R1
Still well under a dollar per 1M either way
Strong on math, logic, and code tasks

What to Watch

Reasoning models emit more output tokens
Same China data-sovereignty consideration
Latency higher than the Flash variants
Verbose output can raise effective cost

Best for: Tasks that genuinely need step-by-step reasoning (complex math, multi-step code, logic) where the higher Elo earns its keep and you still want a sub-dollar price.

Capability marker: Arena Elo approximately 1398 (R1) and 1423 (V3.2), the strongest markers in the under-a-dollar tier. Watch output volume on long reasoning chains.
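The effect of verbose reasoning on cost is easier to judge per request than per 1M tokens. The following is a rough sketch using the list rates above; the token counts are hypothetical and chosen only to illustrate how a longer reasoning trace widens the gap beyond what the rate card implies.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD of one request; rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical token counts, for illustration only.
PROMPT_TOKENS = 2_000
DIRECT_OUTPUT_TOKENS = 500       # short direct answer (assumed)
REASONING_OUTPUT_TOKENS = 4_000  # answer plus visible reasoning (assumed)

# V4-Flash at $0.14 / $0.28 versus R1 at $0.28 / $0.42, both from the table.
flash_cost = request_cost(PROMPT_TOKENS, DIRECT_OUTPUT_TOKENS, 0.14, 0.28)
r1_cost = request_cost(PROMPT_TOKENS, REASONING_OUTPUT_TOKENS, 0.28, 0.42)

if __name__ == "__main__":
    print(f"V4-Flash: ${flash_cost:.5f}  R1: ${r1_cost:.5f}  "
          f"ratio: {r1_cost / flash_cost:.1f}x")
```

Under these assumed counts the per-request gap is several times larger than the gap between the two rate cards, which is why output volume deserves its own line in the budget.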

4. MiniMax M2.5

MiniMax M2.5 is the coding standout of the cheap tier. Input is $0.30 per 1 million tokens, which is competitive, but output jumps to $1.20 per 1 million, so the blended cost depends heavily on how much the model generates. What earns its place is the capability marker: a SWE-bench Verified score of 80.2, a strong result on that software-engineering benchmark and one that beats several pricier models. Arena Elo lands around 1404. If your workload is heavy on reading and light on writing, the high output rate stings less.

Why It Ranks Here

SWE-bench Verified 80.2, strong on real coding tasks
Arena Elo ~1404
Cheap $0.30 input for read-heavy pipelines
Competitive on agentic coding workflows

What to Watch

Output at $1.20 per 1M is 4x the input rate
Generation-heavy use raises blended cost fast
Smaller ecosystem than the US labs
Estimate your input/output ratio before budgeting

Best for: Coding agents and software-engineering tasks where the 80.2 SWE-bench Verified score pays off, and where inputs (large codebases) dominate outputs (focused edits).

Capability marker: SWE-bench Verified 80.2. The asymmetric pricing ($0.30 in versus $1.20 out) means your blended cost rises with output volume, so model your token mix.
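Modeling the token mix can be as simple as weighting the two rates by the share of tokens that are output. This is a minimal sketch using the $0.30 and $1.20 list rates; the function name and the example shares are assumptions for illustration.

```python
def blended_rate(output_share: float,
                 input_rate: float = 0.30,
                 output_rate: float = 1.20) -> float:
    """Average USD per 1M tokens when `output_share` of all tokens are output."""
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    return (1.0 - output_share) * input_rate + output_share * output_rate


if __name__ == "__main__":
    # Hypothetical mixes: a read-heavy coding agent versus a generation-heavy one.
    for share in (0.1, 0.5):
        print(f"{share:.0%} output tokens: ${blended_rate(share):.2f} per 1M")
```

Measuring the real output share from a week of logs is usually more reliable than guessing it, since agent loops often generate more than expected.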

5. DeepSeek V4-Pro

Expired promotion: DeepSeek V4-Pro launched with a promotional rate of $0.435 input / $0.87 output per 1M. That promotion ended on May 31, 2026. This ranking uses the current list price of $1.74 / $3.48. If you see the old promo numbers quoted elsewhere, they are out of date.

V4-Pro is DeepSeek's flagship reasoning model, and it shows the difference between a launch promo and a sustainable list price. At the current list rate of $1.74 per 1 million input tokens and $3.48 per 1 million output tokens, it is about 12x pricier than V4-Flash on both input and output, but it is still a fraction of what the US-lab flagships charge. It belongs on this list because it remains cheap relative to its frontier reasoning capability, even after the promotion ended. The model identifier is deepseek-v4-pro.

Why It Ranks Here

Flagship-grade reasoning at $1.74 / $3.48
Far below US-lab flagship pricing
Same mature DeepSeek tooling and docs
Step up in depth from the Flash variants

What to Watch

Launch promo ($0.435 / $0.87) ended May 31, 2026
About 12x pricier than V4-Flash on input
Same China data-sovereignty consideration
Reserve for tasks that need the extra depth

Best for: Hard reasoning and agentic tasks where V4-Flash falls short, but where the US-lab flagships are overkill on price. A sensible middle tier.

Capability marker: Flagship reasoning class at list price. Critical caveat: the launch promotion of $0.435 / $0.87 per 1M ended on May 31, 2026; budget against the current $1.74 / $3.48 list rate.
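If a budget was drafted against the promotional numbers, the size of the correction is easy to compute. The sketch below uses the promo and list rates quoted above; the monthly token volumes are hypothetical.

```python
def monthly_cost(input_millions: float, output_millions: float,
                 input_rate: float, output_rate: float) -> float:
    """Monthly USD cost; volumes are in millions of tokens, rates in USD per 1M."""
    return input_millions * input_rate + output_millions * output_rate


# Hypothetical monthly volume: 100M input tokens and 20M output tokens.
INPUT_M, OUTPUT_M = 100.0, 20.0

promo_budget = monthly_cost(INPUT_M, OUTPUT_M, 0.435, 0.87)  # expired promo rates
list_budget = monthly_cost(INPUT_M, OUTPUT_M, 1.74, 3.48)    # current list rates

if __name__ == "__main__":
    print(f"promo: ${promo_budget:.2f}  list: ${list_budget:.2f}  "
          f"multiple: {list_budget / promo_budget:.1f}x")
```

Both list rates are four times their promotional counterparts, so the multiple is the same whatever the input/output mix.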

6. Gemini 3.1 Pro

Gemini 3.1 Pro is where this list crosses from challenger pricing into name-brand frontier territory. At $2.00 per 1 million input tokens and $12.00 per 1 million output tokens, it is the cheapest of the top-tier US-lab flagships here, and it carries the highest capability marker on the entire ranking: an Arena Elo around 1492. The input rate is genuinely competitive; the output rate is where it stops being cheap. Google's long-context strength and Workspace integration are the practical sweeteners.

Why It Ranks Here

Highest capability marker on the list (Elo ~1492)
Input at $2.00 per 1M is competitive
Strong long-context handling
Cheapest of the US-lab flagships shown

What to Watch

Output at $12.00 per 1M is 6x the input rate
Generation-heavy use gets expensive quickly
Best value when inputs dominate outputs
Confirm current rate on the pricing page

Best for: Workloads that need the top capability marker but stay input-heavy (long-document analysis, large-context retrieval) so the $12 output rate is a smaller share of the bill.

Capability marker: Arena Elo approximately 1492, the highest on this ranking. The asymmetric pricing means output-heavy generation erodes the value fast.
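One way to check whether a workload is input-heavy enough is to compute what fraction of the bill comes from output tokens. This is a small sketch using the $2.00 and $12.00 list rates; the function name and example shares are illustrative assumptions.

```python
def output_share_of_bill(output_token_share: float,
                         input_rate: float = 2.00,
                         output_rate: float = 12.00) -> float:
    """Fraction of total spend attributable to output tokens.

    `output_token_share` is the fraction of all tokens that are output (0.0 to 1.0).
    """
    if not 0.0 <= output_token_share <= 1.0:
        raise ValueError("output_token_share must be between 0 and 1")
    output_cost = output_token_share * output_rate
    input_cost = (1.0 - output_token_share) * input_rate
    return output_cost / (input_cost + output_cost)


if __name__ == "__main__":
    # Hypothetical mix: 10% of tokens are output.
    print(f"{output_share_of_bill(0.10):.0%} of spend is output")
```

At a 6x rate ratio, output reaches half of the bill once it is one seventh of the tokens, which gives a concrete threshold for "input-heavy".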

7. GPT-5.4

OpenAI's GPT-5.4 is the safe-default flagship for a huge number of teams, and at $2.50 per 1 million input tokens and $15.00 per 1 million output tokens it is priced like one. It earns a spot on a cheapest list only because frontier pricing has compressed so far that a top US-lab flagship now clears the bar for inclusion. The capability marker is an Arena Elo around 1463, just below Gemini 3.1 Pro. The draw here is less about price and more about the maturity of the ecosystem, tooling, and reliability that surround it.

Why It Ranks Here

Arena Elo ~1463, frontier capability
Deepest ecosystem and tooling support
Reliable, well-documented production API
Input at $2.50 still beats the premium anchor

What to Watch

Output at $15.00 per 1M ties Grok 3 for the priciest in the 8
25x the input cost of the #1 pick
Value comes from ecosystem, not price
Route easy traffic to a cheaper model

Best for: Teams that prioritize ecosystem maturity and reliability over raw cost, ideally paired with a cheaper model for high-volume, low-difficulty traffic.

Capability marker: Arena Elo approximately 1463. At $15 output per 1M it is tied with Grok 3 for the highest output rate of the eight, so it belongs in a routing strategy, not as a blanket default.
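The payoff from routing can be estimated before building any router. The sketch below compares sending everything to GPT-5.4 against sending a share of requests to Step-3.5-Flash, using the list rates from the table; the per-request token counts and the 80% routing share are hypothetical, and it assumes both models see the same token counts per request.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD of one request; rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def routed_cost(cheap_share: float, cheap_cost: float, premium_cost: float) -> float:
    """Average cost per request when `cheap_share` of traffic goes to the cheap model."""
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    return cheap_share * cheap_cost + (1.0 - cheap_share) * premium_cost


# Hypothetical request shape: 1,000 input tokens and 300 output tokens.
gpt_cost = request_cost(1_000, 300, 2.50, 15.00)
step_cost = request_cost(1_000, 300, 0.10, 0.30)
average = routed_cost(0.8, step_cost, gpt_cost)  # 80% easy traffic (assumed)

if __name__ == "__main__":
    print(f"all GPT-5.4: ${gpt_cost:.5f}  routed: ${average:.6f}  "
          f"saving: {1 - average / gpt_cost:.0%}")
```

The estimate says nothing about quality, so the share you can safely route is something to establish with your own evals.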

8. Grok 3

xAI's Grok 3 rounds out the eight at $3.00 per 1 million input tokens and $15.00 per 1 million output tokens. It is the most expensive on input in this group, which is why it lands at the bottom of the ranking, but it still clears the frontier bar with an Arena Elo around 1412. Grok's differentiator is real-time access to X data, which gives it a freshness edge for tasks that depend on current events and social signals. On pure price-per-capability it is outclassed by the cheaper picks above it.

Why It Ranks Here

Arena Elo ~1412 clears the frontier bar
Real-time X data access for fresh signals
Still cheaper than the premium anchor
Competitive for current-events workloads

What to Watch

Most expensive input rate of the eight
Output at $15.00 per 1M matches GPT-5.4
Lower price-per-capability than picks above
Niche value tied to X data freshness

Best for: Applications that specifically need real-time X data and current-events grounding, where the freshness advantage outweighs the higher price.

Capability marker: Arena Elo approximately 1412. It clears the bar, but at $3.00 input it is the weakest price-per-capability pick in the eight unless you need the live X data.

The Expensive Anchor: Claude Opus 4.6

To make the cheap end of the list legible, you need to see the expensive end. Anthropic's Claude Opus 4.6 (model identifier claude-opus-4-6) is the premium anchor, priced at $15.00 per 1 million input tokens and $75.00 per 1 million output tokens. It is a frontier reasoning model that many teams reach for on the hardest tasks, and its quality is not in question. The point of including it here is the spread.

On input, Opus 4.6 costs 150 times as much as Step-3.5-Flash ($15.00 versus $0.10 per 1M). On output, it costs 250 times as much as Step-3.5-Flash ($75.00 versus $0.30 per 1M), and slightly more than that against the lowest output rate in the table, DeepSeek V4-Flash at $0.28 per 1M. No public leaderboard shows a capability gap anywhere near those multiples. That is the central lesson of this ranking: the premium tier is real and sometimes worth it, but for most production traffic, a frontier-adjacent model at a fraction of the price is the rational default. Reserve the premium anchor for the genuinely hard requests and route everything else cheaper.
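The spread figures can be recomputed directly from the table. This short sketch uses only the list rates shown on this page; the variable names are illustrative.

```python
# (input, output) list rates in USD per 1M tokens, from the ranking table.
ANCHOR = (15.00, 75.00)    # Claude Opus 4.6
STEP_FLASH = (0.10, 0.30)  # Step-3.5-Flash
V4_FLASH = (0.14, 0.28)    # DeepSeek V4-Flash, lowest output rate in the table


def spread(anchor_rate: float, cheap_rate: float) -> float:
    """How many times the cheap rate fits into the anchor rate."""
    return anchor_rate / cheap_rate


input_spread = spread(ANCHOR[0], STEP_FLASH[0])
output_vs_step = spread(ANCHOR[1], STEP_FLASH[1])
output_vs_lowest = spread(ANCHOR[1], V4_FLASH[1])

if __name__ == "__main__":
    print(f"input: {input_spread:.0f}x  output vs Step: {output_vs_step:.0f}x  "
          f"output vs lowest: {output_vs_lowest:.0f}x")
```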

The open-weight axis: Self-hosting open-weight models (GLM-5, Qwen 3.5, Kimi K2.5) costs $0 per token, because there is no metered API charge at all. What you pay instead is GPU hardware or cloud GPU rental, plus the engineering time to run the stack. Self-hosting trades per-token cost for fixed infrastructure cost, so it wins at high, steady volume and loses at low or bursty volume. It is a different axis from the list-price ranking above, not a row in it.
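The high-volume versus low-volume tradeoff comes down to a break-even point. The sketch below is a deliberately simplified model: the $2.00 hourly GPU rental rate is a hypothetical placeholder, not a quote, and the calculation ignores engineering time and whether a single GPU could actually serve the break-even volume.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def fixed_monthly_cost(gpu_hourly_usd: float, gpu_count: int = 1) -> float:
    """Monthly rental cost for GPUs that run around the clock."""
    return gpu_hourly_usd * HOURS_PER_MONTH * gpu_count


def breakeven_million_tokens(fixed_monthly_usd: float,
                             api_usd_per_million: float) -> float:
    """Monthly volume (in millions of tokens) where self-hosting matches the API bill."""
    if api_usd_per_million <= 0:
        raise ValueError("api_usd_per_million must be positive")
    return fixed_monthly_usd / api_usd_per_million


if __name__ == "__main__":
    fixed = fixed_monthly_cost(gpu_hourly_usd=2.00)  # hypothetical rental rate
    # Hypothetical hosted-API comparison point: $0.40 per 1M tokens on average.
    volume = breakeven_million_tokens(fixed, api_usd_per_million=0.40)
    print(f"fixed: ${fixed:.0f}/month  break-even: {volume:.0f}M tokens/month")
```

Below the break-even volume the hosted API is cheaper; above it, self-hosting starts to win, provided the hardware can sustain that throughput.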

How We Ranked These APIs

We ranked by list price per 1 million tokens (input plus output) among models that clear a frontier-capability bar, defined as an LMArena Elo of roughly 1380 or higher, or a SWE-bench Verified score above 70. The pricing snapshot is Feb-Mar 2026, cross-checked against vendor pricing pages and the LLM-Stats benchmark aggregator. Three rules shaped the list:

Capability bar first, price second: A model has to clear the frontier bar before its price counts. This is why cheap-but-weak models are absent. Every row carries a capability marker (Arena Elo or SWE-bench Verified) so the tradeoff is always visible.
List price, not promo price: We use sustainable vendor list rates. The DeepSeek V4-Pro launch promotion ($0.435 input / $0.87 output per 1M) ended on May 31, 2026, so we rank it at its current $1.74 / $3.48 list price, not the expired promo.
List rates are a reference, not a floor: Third-party providers (OpenRouter, Together, Fireworks) can undercut these list prices, especially for open-weight models. Treat the numbers here as a stable reference point and verify the live pricing page before you budget.
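The first rule can be written down as a simple filter. This is a sketch of the bar as stated on this page (Arena Elo of roughly 1380 or higher, or SWE-bench Verified above 70); treating "roughly 1380" as a hard cutoff is a simplification, and the function name and the failing example scores are hypothetical.

```python
from typing import Optional

ELO_BAR = 1380      # "roughly 1380 or higher", treated here as a hard cutoff
SWE_BENCH_BAR = 70  # SWE-bench Verified must be above this


def clears_bar(arena_elo: Optional[float] = None,
               swe_bench_verified: Optional[float] = None) -> bool:
    """True if either capability marker clears the frontier bar."""
    elo_ok = arena_elo is not None and arena_elo >= ELO_BAR
    swe_ok = swe_bench_verified is not None and swe_bench_verified > SWE_BENCH_BAR
    return elo_ok or swe_ok


if __name__ == "__main__":
    print(clears_bar(arena_elo=1389))                             # Step-3.5-Flash
    print(clears_bar(swe_bench_verified=80.2))                    # MiniMax M2.5
    print(clears_bar(arena_elo=1300, swe_bench_verified=60.0))    # hypothetical weak model
```

Only models that pass this check are then sorted by price, which is what keeps cheap-but-weak models out of the ranking.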

Prices move fast. AI API pricing changed repeatedly through 2026. Every number on this page is a Feb-Mar 2026 snapshot. Before you commit volume or sign a budget, confirm the current rate on the vendor's own pricing page and against an independent aggregator such as LLM-Stats.

This ranking reflects our independent analysis. Tech Jacks Solutions has no affiliate or advertising relationship with any provider listed. Rankings are editorial, based on vendor-published list prices and independent benchmark aggregators, not paid placements.

Frequently Asked Questions

What is the cheapest frontier-class AI API in 2026?

By list price, Stepfun Step-3.5-Flash is the cheapest model that clears the frontier-capability bar, at $0.10 per 1M input and $0.30 per 1M output, with an Arena Elo around 1389. DeepSeek V4-Flash is a close second at $0.14 input and $0.28 output. These are vendor-published list prices as of Feb-Mar 2026 and change frequently, so verify before budgeting.

How is a frontier-class API defined here?

We only ranked models that clear a frontier-capability bar: an LMArena Elo of roughly 1380 or higher, or a SWE-bench Verified score above 70. That filter keeps cheap-but-weak models off the list, so every price is paired with a real capability marker. Cheapest is not the same as best, which is why each row shows its Elo or SWE-bench result alongside the price.

Is the DeepSeek V4-Pro promotional price still available?

No. DeepSeek V4-Pro had a launch promotional price of $0.435 per 1M input and $0.87 per 1M output, but that promotion ended on May 31, 2026. The current list price is $1.74 per 1M input and $3.48 per 1M output. If you see the old promo numbers quoted anywhere, they are out of date. Always check the live pricing page before budgeting.

Are open-weight models cheaper than these APIs?

Open-weight models such as GLM-5, Qwen 3.5, and Kimi K2.5 can be self-hosted at $0 per token, because there is no metered API charge. The catch is that you pay for GPU hardware or cloud GPU rental, plus the engineering time to run the stack. Self-hosting trades per-token cost for fixed infrastructure cost, so it only wins at high, steady volume. It is a different cost axis from the list-price ranking on this page.

Can I get these models even cheaper through third-party providers?

Often yes. Aggregators and inference providers such as OpenRouter, Together, and Fireworks sometimes undercut the vendor list price, especially for open-weight models. The prices in this ranking are vendor-published list rates, which serve as a stable reference point. Compare against third-party quotes before committing to a provider.

Video Resources

Cheapest Frontier LLM APIs by Price per Token (2026)

YouTube Search

Walkthroughs comparing list prices across the major and challenger model APIs

How LLM API Token Pricing Actually Works

YouTube Search

Why input and output tokens are priced differently and how to estimate your bill

Self-Hosting Open-Weight Models: GPU Cost vs API

YouTube Search

When the zero-per-token open-weight path beats paying for a hosted API

Go Deeper

Resources from across Tech Jacks Solutions

AI Risk Management Template (free)

Identify, assess, and mitigate AI deployment risks before you scale spend

Prompt Engineering Library

Techniques that cut token use and get better results from any model

AI Glossary

Definitions for tokens, Elo, SWE-bench, and other terms used here

Pricing from vendor list rates and LLM-Stats, as of Feb-Mar 2026. Prices change often; verify before budgeting.

Step and Step-3.5 are trademarks of Stepfun. DeepSeek is a trademark of DeepSeek. MiniMax is a trademark of MiniMax. Gemini is a trademark of Google LLC. GPT is a trademark of OpenAI. Grok is a trademark of xAI. Claude is a trademark of Anthropic. GLM is a trademark of Zhipu AI. Qwen is a trademark of Alibaba. Kimi is a trademark of Moonshot AI. All other trademarks belong to their respective owners. Tech Jacks Solutions is not affiliated with or endorsed by any of the companies mentioned.
