Content
summary Summary

Google has released an early preview of Gemini 2.5 Flash, a faster, more flexible version of its lightweight AI model.

Ad

Developers can try it now through the Gemini API using Google AI Studio and Vertex AI. The model is also available to users in the Gemini app.

Built on the 2.0 Flash foundation, the new version is designed to offer stronger reasoning while optimizing for speed and cost.

Google describes it as a hybrid model that gives developers more control over how much "thinking" the system does. With this control, users can set budgets to balance quality, response time, and cost.

Ad
Ad

Even with "thinking" turned off, Gemini 2.5 Flash still outperforms its predecessor. Turning it on improves output quality, but the price increases—from $0.004 to $3.50 per response.

Despite the higher cost, the model is still cheaper than comparable systems. Only OpenAI's o4-mini comes close in terms of price performance.

Image: Google

Flash extends Google’s hybrid Gemini 2.5 lineup

The release of Flash complements Google’s broader Gemini 2.5 series of hybrid reasoning models. While Flash is designed for speed and affordability, Gemini 2.5 Pro targets more demanding tasks with full-scale reasoning and multimodal support.

Gemini 2.5 Pro is Google’s most capable model to date, leading several benchmark tests. It performs well across math, science, and programming tasks, scoring 18.8% on the "Humanity’s Last Exam" and 63.8% on SWE-Bench Verified. Pro is available now in Google AI Studio and to Gemini Advanced subscribers.

But Gemini 2.5 Pro comes with higher pricing. Input tokens cost $1.25 per million for prompts up to 200,000 tokens and $2.50 beyond that. Output tokens—including "thinking"—are priced at $10 per million under 200k tokens, and $15 above.

Recommendation

Together, Gemini 2.5 Flash and Pro offer developers more flexibility across speed, cost, and reasoning power—part of Google’s broader strategy to deliver scalable AI options for a wide range of use cases.

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • Google has released Gemini 2.5 Flash, a faster and more flexible version of its lightweight AI model, which developers can now access through the Gemini API using Google AI Studio, Vertex AI, and the Gemini app for end users.
  • Built on the 2.0 Flash foundation, Gemini 2.5 Flash is a hybrid model that allows developers to control the amount of "thinking" the system does, balancing quality, response time, and cost.
  • The release of Flash complements Google's Gemini 2.5 series, which includes the more powerful Gemini 2.5 Pro model for demanding tasks requiring full-scale reasoning and multimodal support.
Sources
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.