Google’s Gemini 2.5 Flash gives you speed when you need it and reasoning when you can afford it

Key Points

  • Google has released an early preview of Gemini 2.5 Flash, a faster and more flexible version of its lightweight AI model. Developers can access it through the Gemini API in Google AI Studio and Vertex AI, and end users can try it in the Gemini app.
  • Built on the 2.0 Flash foundation, Gemini 2.5 Flash is a hybrid model that allows developers to control the amount of "thinking" the system does, balancing quality, response time, and cost.
  • The release of Flash complements Google's Gemini 2.5 series, which includes the more powerful Gemini 2.5 Pro model for demanding tasks requiring full-scale reasoning and multimodal support.

Google has released an early preview of Gemini 2.5 Flash, a faster, more flexible version of its lightweight AI model.

Developers can try it now through the Gemini API using Google AI Studio and Vertex AI. The model is also available to users in the Gemini app.

Built on the 2.0 Flash foundation, the new version is designed to offer stronger reasoning while optimizing for speed and cost.

Google describes it as a hybrid model that gives developers more control over how much "thinking" the system does. With this control, users can set budgets to balance quality, response time, and cost.
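
To illustrate what setting a thinking budget could look like in practice, here is a minimal Python sketch that only builds a request body and does not call any service. The field names `generationConfig`, `thinkingConfig`, and `thinkingBudget` are assumptions based on Google's public Gemini API documentation, not details from this article, and the helper `build_request` is hypothetical.

```python
import json


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a JSON-serializable request body with a thinking budget.

    A budget of 0 is meant to switch "thinking" off; larger values allow
    more reasoning tokens. Field names are assumed, see the note above.
    """
    if thinking_budget < 0:
        raise ValueError("thinking_budget must be zero or positive")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


# Two variants of the same prompt: one fast and cheap, one with room to reason.
fast = build_request("Summarize this paragraph.", thinking_budget=0)
careful = build_request("Summarize this paragraph.", thinking_budget=1024)
print(json.dumps(careful, indent=2))
```

The same prompt can be sent with different budgets, so an application could reserve larger budgets for requests where answer quality matters more than latency or cost.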

Even with "thinking" turned off, Gemini 2.5 Flash still outperforms its predecessor. Turning it on improves output quality, but the output price rises from $0.60 to $3.50 per million tokens.

Despite the higher cost, the model is still cheaper than comparable systems. Only OpenAI's o4-mini comes close in terms of price-performance.

Flash extends Google’s hybrid Gemini 2.5 lineup

The release of Flash complements Google’s broader Gemini 2.5 series of hybrid reasoning models. While Flash is designed for speed and affordability, Gemini 2.5 Pro targets more demanding tasks with full-scale reasoning and multimodal support.

Gemini 2.5 Pro is Google’s most capable model to date and leads several benchmarks. It performs well across math, science, and programming tasks, scoring 18.8% on "Humanity’s Last Exam" and 63.8% on SWE-Bench Verified. Pro is available now in Google AI Studio and to Gemini Advanced subscribers.

But Gemini 2.5 Pro comes with higher pricing. Input tokens cost $1.25 per million for prompts up to 200,000 tokens and $2.50 beyond that. Output tokens—including "thinking"—are priced at $10 per million under 200k tokens, and $15 above.
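
As a rough worked example of these rates, the following Python sketch estimates the cost of a single Gemini 2.5 Pro request. It assumes that the 200,000-token threshold refers to the prompt length and that a long prompt moves both input and output tokens to the higher rate. The function name `gemini_25_pro_cost` is hypothetical.

```python
TIER_LIMIT = 200_000  # prompt size, in tokens, at which the higher rates apply


def gemini_25_pro_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from the per-million-token rates.

    output_tokens is expected to include "thinking" tokens, since they are
    billed as output.
    """
    long_prompt = prompt_tokens > TIER_LIMIT
    input_rate = 2.50 if long_prompt else 1.25    # USD per million input tokens
    output_rate = 15.0 if long_prompt else 10.0   # USD per million output tokens
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# A 100,000-token prompt with 10,000 output tokens:
# 100,000 * 1.25 / 1e6 + 10,000 * 10 / 1e6 = 0.125 + 0.10
print(f"${gemini_25_pro_cost(100_000, 10_000):.3f}")  # $0.225

# A 300,000-token prompt with the same output falls into the higher tier:
# 300,000 * 2.50 / 1e6 + 10,000 * 15 / 1e6 = 0.75 + 0.15
print(f"${gemini_25_pro_cost(300_000, 10_000):.3f}")  # $0.900
```

Under these assumptions, tripling the prompt length from 100,000 to 300,000 tokens quadruples the bill, because the longer prompt also crosses into the more expensive tier.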

Together, Gemini 2.5 Flash and Pro offer developers more flexibility across speed, cost, and reasoning power—part of Google’s broader strategy to deliver scalable AI options for a wide range of use cases.

Source: Google