Gallery

Contacts

411 University St, Seattle, USA

engitech@oceanthemes.net

+1 -800-456-478-23

Skip to content
Technology Daily Brief Vendor Claim

Gemini 3.5 Flash Costs 3x More Than Its Predecessor, and Outperforms the Model It Was Supposed to Be Below

2 min read Google Blog / Google Cloud Blog Partial
Google's mid-tier Gemini 3.5 Flash launched at $1.50/M input and $9.00/M output tokens, three times the price of Gemini 3 Flash Preview, and with benchmark scores that exceed the prior flagship Gemini 3.1 Pro. That combination breaks the logic enterprise teams have used to route workloads across Google's model tiers.
Pricing increase, 3x vs. prior Flash tier

Key Takeaways

  • Gemini 3.5 Flash launched at $1.50/M input and $9.00/M output, confirmed at 3x the cost of Gemini 3 Flash Preview
  • Google reports 76.2% on Terminal-Bench 2.1, above prior flagship Gemini 3.1 Pro's 70.3%; independent Epoch AI evaluation is pending
  • The 4x speed claim couldn't be independently corroborated, exclude it from production planning until verified
  • Any team with an existing Flash-vs-Pro routing decision needs to revisit it; the price and performance tiers no longer map the same way

Model Release

Gemini 3.5 Flash
OrganizationGoogle DeepMind
TypeLLM — Mid-tier
ParametersNot disclosed
Benchmark[SELF-REPORTED] Terminal-Bench 2.1: 76.2% (partial third-party corroboration via Artificial Analysis; Epoch AI evaluation pending)
AvailabilityGoogle AI Studio, Vertex AI, GitHub Copilot

Pricing: Input tokens per 1M (confirmed)

Gemini 3.5 Flash
$1.50
Gemini 3 Flash Preview (prior)
$0.50 (approx. 3x less)
Gemini 3.1 Flash-Lite
Not disclosed in this package

Verification

Partial Vendor benchmarks with partial third-party corroboration (Artificial Analysis, blog.google) Epoch AI independent evaluation pending. Finance Agent v2 cross-model comparison is vendor-reported only and has not been independently verified.

Pricing confirmed. $1.50 per million input tokens, $9.00 per million output tokens. Google announced Gemini 3.5 Flash at I/O 2026 on May 19 alongside a price point that is three times what Gemini 3 Flash Preview carried. That’s not a minor revision to the Flash tier. That’s a repositioning.

The performance numbers complicate the picture further. According to Google’s benchmark data, corroborated by third-party evaluation indices including Artificial Analysis, Gemini 3.5 Flash scores 76.2% on Terminal-Bench 2.1, outperforming Gemini 3.1 Pro’s 70.3% on the same evaluation. Google’s reported evaluations place MCP Atlas tool-use performance at 83.6%, above Gemini 3.1 Pro’s reported 78.2%. Independent evaluation by Epoch AI is pending, treat these figures as vendor-reported with partial third-party corroboration, not confirmed benchmarks.

A mid-tier model that outscores the prior flagship raises an architectural question most teams haven’t had to ask before: what does “mid-tier” mean when performance tiers and price tiers no longer align?

Disputed Claim

Output speeds up to 4x faster than comparable frontier models
Could not be independently corroborated via cross-reference at publication time
Exclude from production performance estimates until independent evaluation confirms this figure

The catch is that the pricing increase lands unevenly depending on your use case. Teams running high-throughput agentic workflows, coding assistants, multi-step document processing, API orchestration, will feel a 3x token cost increase acutely. Teams that were already routing premium workloads to Gemini 3.1 Pro may find the switch economically neutral or even favorable given the benchmark improvement. The question isn’t whether Gemini 3.5 Flash is good. It’s whether it’s good enough at this price relative to what you were paying before.

Google claims output speeds are up to 4x faster than comparable frontier models, a figure that couldn’t be independently corroborated at publication time. Don’t build that number into your production estimates until independent evaluation confirms it. Context window is specified at 1,048,576 input tokens per Google’s documentation. The model is available via Google AI Studio and Vertex AI, with GitHub Copilot integration also announced.

The comparison most teams will run is against Gemini 3.1 Flash-Lite, which remains available at a lower price point. Per Google’s internal Finance Agent v2 benchmark, Gemini 3.5 Flash reportedly scores 57.9%, above GPT-5.5’s 51.8% and below Claude Opus 4.7’s 66.1% on the same evaluation, figures that haven’t been independently verified. The three-way comparison is useful context but treat it as Google’s own benchmark framing, not an independent assessment.

What to Watch

Epoch AI independent benchmark evaluation for Gemini 3.5 FlashWeeks to months post-launch
Gemini 3.1 Flash-Lite pricing update or repositioning in responseNext Google developer update cycle
Enterprise cost-routing announcements from teams currently on Gemini 3.1 Pro30-60 days post-GA

The part nobody mentions: the Flash tier was positioned as the cost-efficient alternative for teams that didn’t need flagship performance. That positioning is gone. Gemini 3.5 Flash now costs what Gemini 3.1 Pro used to justify, while offering performance that exceeds it. Every team with a current routing decision between Flash and Pro-tier models needs to revisit that architecture before the next billing cycle.

Wait for Epoch AI’s independent evaluation before migrating workloads based on the benchmark claims. The pricing is confirmed, use that for budget planning now. The performance advantage over GPT-5.5 on Finance Agent v2 is vendor-reported and requires independent verification before it influences procurement.

View Source
More Technology intelligence
View all Technology

More from May 21, 2026

Stay ahead on Technology

Get verified AI intelligence delivered daily. No hype, no speculation, just what matters.

Explore the AI News Hub