Pricing confirmed. $1.50 per million input tokens, $9.00 per million output tokens. Google announced Gemini 3.5 Flash at I/O 2026 on May 19 alongside a price point that is three times what Gemini 3 Flash Preview carried. That’s not a minor revision to the Flash tier. That’s a repositioning.
The performance numbers complicate the picture further. According to Google’s benchmark data, corroborated by third-party evaluation indices including Artificial Analysis, Gemini 3.5 Flash scores 76.2% on Terminal-Bench 2.1, outperforming Gemini 3.1 Pro’s 70.3% on the same evaluation. Google’s reported evaluations place MCP Atlas tool-use performance at 83.6%, above Gemini 3.1 Pro’s reported 78.2%. Independent evaluation by Epoch AI is pending. Treat these figures as vendor-reported with partial third-party corroboration, not confirmed benchmarks.
A mid-tier model that outscores the prior flagship raises an architectural question most teams haven’t had to ask before: what does “mid-tier” mean when performance tiers and price tiers no longer align?
Disputed Claim
The catch is that the pricing increase lands unevenly depending on your use case. Teams running high-throughput agentic workflows (coding assistants, multi-step document processing, API orchestration) will feel a 3x token cost increase acutely. Teams that were already routing premium workloads to Gemini 3.1 Pro may find the switch economically neutral or even favorable given the benchmark improvement. The question isn’t whether Gemini 3.5 Flash is good. It’s whether it’s good enough at this price relative to what you were paying before.
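To make the budget impact concrete, here is a minimal Python sketch of the arithmetic. The two per-million-token prices are the confirmed figures quoted above. Everything else is an assumption for illustration: the monthly token volumes are a hypothetical workload, and the prior cost is only implied by dividing by the "three times" multiplier, since the earlier list prices are not quoted here.

```python
# Confirmed list prices for Gemini 3.5 Flash, in USD per million tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00

# Assumption: the new price is exactly three times the previous Flash price,
# applied uniformly to input and output tokens.
PRICE_MULTIPLIER = 3.0


def monthly_cost(input_tokens, output_tokens,
                 input_price=INPUT_PRICE_PER_M,
                 output_price=OUTPUT_PRICE_PER_M):
    """Return the USD cost for a given monthly token volume."""
    return (input_tokens / 1_000_000) * input_price \
        + (output_tokens / 1_000_000) * output_price


# Hypothetical workload: replace with figures from your own usage logs.
monthly_input_tokens = 2_000_000_000   # 2B input tokens per month
monthly_output_tokens = 300_000_000    # 300M output tokens per month

new_cost = monthly_cost(monthly_input_tokens, monthly_output_tokens)
implied_old_cost = new_cost / PRICE_MULTIPLIER
increase = new_cost - implied_old_cost

print(f"Cost at new prices:    ${new_cost:,.2f}")
print(f"Implied previous cost: ${implied_old_cost:,.2f}")
print(f"Monthly increase:      ${increase:,.2f}")
```

Output-heavy workloads deserve the closest look: at $9.00 per million, output tokens account for nearly half of this hypothetical bill despite being a small fraction of total volume.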
Google claims output speeds are up to 4x faster than comparable frontier models, a figure that couldn’t be independently corroborated at publication time. Don’t build that number into your production estimates until independent evaluation confirms it. Context window is specified at 1,048,576 input tokens per Google’s documentation. The model is available via Google AI Studio and Vertex AI, with GitHub Copilot integration also announced.
The comparison most teams will run is against Gemini 3.1 Flash-Lite, which remains available at a lower price point. Per Google’s internal Finance Agent v2 benchmark, Gemini 3.5 Flash reportedly scores 57.9%, above GPT-5.5’s 51.8% and below Claude Opus 4.7’s 66.1% on the same evaluation. None of these figures has been independently verified. The three-way comparison is useful context, but treat it as Google’s own benchmark framing, not an independent assessment.
What to Watch
The part nobody mentions: the Flash tier was positioned as the cost-efficient alternative for teams that didn’t need flagship performance. That positioning is gone. Gemini 3.5 Flash now costs what Gemini 3.1 Pro used to justify, while offering performance that exceeds it. Every team with a current routing decision between Flash and Pro-tier models needs to revisit that architecture before the next billing cycle.
Wait for Epoch AI’s independent evaluation before migrating workloads based on the benchmark claims. The pricing is confirmed, so use that for budget planning now. The performance advantage over GPT-5.5 on Finance Agent v2 is vendor-reported and requires independent verification before it influences procurement.