Google’s Gemini 2.5 Flash gives you speed when you need it and reasoning when you can afford it

Apr 18, 2025


Google has released an early preview of Gemini 2.5 Flash, a faster, more flexible version of its lightweight AI model.

Developers can try it now through the Gemini API in Google AI Studio and Vertex AI. The model is also available to users in the Gemini app.

Built on the 2.0 Flash foundation, the new version is designed to offer stronger reasoning while optimizing for speed and cost.

Google describes it as a hybrid model that gives developers more control over how much "thinking" the system does. Developers can set a thinking budget, a cap on the tokens the model may spend reasoning before it answers, to balance quality, response time, and cost.
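As a rough illustration of what setting a budget could look like, here is a minimal sketch that builds a request body for the Gemini API. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) and the 0 to 24,576 token range are assumptions based on Google's public API documentation rather than anything stated in this article, and `build_request` is a hypothetical helper, so check the current reference before relying on it.

```python
import json

# Assumed upper bound for the 2.5 Flash thinking budget (not from this article).
MAX_THINKING_BUDGET = 24_576


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent-style request body with a thinking budget.

    A budget of 0 is assumed to turn thinking off entirely.
    """
    if not 0 <= thinking_budget <= MAX_THINKING_BUDGET:
        raise ValueError(
            f"thinking_budget must be between 0 and {MAX_THINKING_BUDGET}"
        )
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            # Assumed field names for the thinking controls.
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Fast, cheap answer with thinking off versus a capped reasoning budget.
    print(json.dumps(build_request("Summarize this changelog.", 0), indent=2))
    print(json.dumps(build_request("Plan a database migration.", 1024), indent=2))
```

The sketch only constructs the payload; sending it would additionally require an API key and an HTTP client.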

Even with "thinking" turned off, Gemini 2.5 Flash still outperforms its predecessor. Turning it on improves output quality, but the price of output rises from $0.60 to $3.50 per million tokens.
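To put the thinking surcharge in concrete terms, the following sketch compares output cost with thinking off and on. It assumes Google's preview list prices of $0.60 (non-thinking) and $3.50 (thinking) per million output tokens; it ignores input-token charges, and the function name and token counts are purely illustrative.

```python
# Assumed preview list prices in USD per million output tokens.
NON_THINKING_PRICE = 0.60
THINKING_PRICE = 3.50


def output_cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Return the output-token cost in USD for one response."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    tokens = 2_000  # hypothetical response length, including any thinking tokens
    off = output_cost_usd(tokens, NON_THINKING_PRICE)
    on = output_cost_usd(tokens, THINKING_PRICE)
    print(f"thinking off: ${off:.4f}  thinking on: ${on:.4f}")
```

In practice a thinking response also tends to emit more tokens, since the reasoning itself is billed as output, so the real gap per response can be wider than the per-token ratio suggests.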

Despite the higher cost with thinking enabled, the model is still cheaper than comparable systems. Only OpenAI's o4-mini comes close in terms of price-performance.

Flash extends Google’s hybrid Gemini 2.5 lineup

The release of Flash complements Google’s broader Gemini 2.5 series of hybrid reasoning models. While Flash is designed for speed and affordability, Gemini 2.5 Pro targets more demanding tasks with full-scale reasoning and multimodal support.

Gemini 2.5 Pro is Google’s most capable model to date, leading several benchmark tests. It performs well across math, science, and programming tasks, scoring 18.8% on "Humanity’s Last Exam" and 63.8% on SWE-Bench Verified. Pro is available now in Google AI Studio and to Gemini Advanced subscribers.

But Gemini 2.5 Pro comes with higher pricing. Input tokens cost $1.25 per million for prompts up to 200,000 tokens and $2.50 beyond that. Output tokens—including "thinking"—are priced at $10 per million under 200k tokens, and $15 above.
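The tiered Pro pricing is easier to reason about with a small calculator. The sketch below uses the rates quoted above and assumes that the tier is decided by prompt length and then applies to the whole request, input and output alike; that billing rule is an interpretation, not something the article spells out, and the function name is hypothetical.

```python
# Rates in USD per million tokens, as quoted in the article.
TIER_THRESHOLD = 200_000          # prompt tokens
INPUT_RATES = (1.25, 2.50)        # (up to threshold, above threshold)
OUTPUT_RATES = (10.00, 15.00)     # output, including "thinking" tokens


def pro_request_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one Gemini 2.5 Pro request.

    Assumption: a prompt longer than the threshold moves the entire
    request (input and output) to the higher rate.
    """
    if prompt_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = 1 if prompt_tokens > TIER_THRESHOLD else 0
    total = prompt_tokens * INPUT_RATES[tier] + output_tokens * OUTPUT_RATES[tier]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workloads: a short prompt and a long-context prompt.
    print(f"${pro_request_cost_usd(100_000, 10_000):.3f}")
    print(f"${pro_request_cost_usd(300_000, 10_000):.3f}")
```

Under these assumptions, crossing the 200,000-token line raises the cost of every token in the request, so trimming a prompt to just under the threshold can matter more than the raw token count suggests.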

Together, Gemini 2.5 Flash and Pro offer developers more flexibility across speed, cost, and reasoning power—part of Google’s broader strategy to deliver scalable AI options for a wide range of use cases.
