Google has released an early preview of Gemini 2.5 Flash, a faster, more flexible version of its lightweight AI model.

Developers can try it now through the Gemini API in Google AI Studio and Vertex AI. The model is also available to users in the Gemini app.

Built on the 2.0 Flash foundation, the new version is designed to offer stronger reasoning while optimizing for speed and cost.

Google describes it as a hybrid model that gives developers more control over how much "thinking" the system does. Developers can set a thinking budget, a cap on the tokens the model may spend reasoning before it answers, to balance quality, response time, and cost.
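As a rough illustration of what setting such a budget might look like, the sketch below builds a JSON request body for the Gemini API's generateContent method. The `thinkingConfig.thinkingBudget` field follows Google's REST documentation. The helper name `build_request`, the 0 to 24,576 token range, and the example prompt are assumptions made for illustration and are not taken from this article.

```python
import json

# Assumed upper bound for the thinking budget, in tokens (not stated in the article).
MAX_THINKING_BUDGET = 24_576


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a generateContent request body with a capped thinking budget.

    A budget of 0 is meant to turn thinking off; larger values allow the
    model to spend more tokens reasoning before it answers.
    """
    if not 0 <= thinking_budget <= MAX_THINKING_BUDGET:
        raise ValueError(
            f"thinking_budget must be between 0 and {MAX_THINKING_BUDGET}"
        )
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Hypothetical prompt; the body would be POSTed to the generateContent endpoint.
    body = build_request("Summarize the trade-offs of caching.", 1024)
    print(json.dumps(body, indent=2))
```

A lower budget should favor speed and cost, a higher one quality. The right value depends on the task and is best found by experiment.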

Even with "thinking" turned off, Gemini 2.5 Flash still outperforms its predecessor. Turning it on improves output quality but raises the output price from $0.60 to $3.50 per million tokens.

Despite the higher cost, the model is still cheaper than comparable systems. Only OpenAI's o4-mini comes close in terms of price-performance.

Flash extends Google’s hybrid Gemini 2.5 lineup

The release of Flash complements Google’s broader Gemini 2.5 series of hybrid reasoning models. While Flash is designed for speed and affordability, Gemini 2.5 Pro targets more demanding tasks with full-scale reasoning and multimodal support.

Gemini 2.5 Pro is Google’s most capable model to date, leading several benchmark tests. It performs well across math, science, and programming tasks, scoring 18.8% on "Humanity’s Last Exam" and 63.8% on SWE-Bench Verified. Pro is available now in Google AI Studio and to Gemini Advanced subscribers.

But Gemini 2.5 Pro comes with higher pricing. Input tokens cost $1.25 per million for prompts up to 200,000 tokens and $2.50 per million beyond that. Output tokens, including "thinking" tokens, cost $10 per million for prompts up to 200,000 tokens and $15 per million above that.
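To make the tiers concrete, here is a minimal sketch of a per-request cost estimate using the rates quoted above. It assumes that the prompt length alone selects the tier and that the chosen tier's rates then apply to every token in the request. The function name `estimate_pro_cost` is hypothetical, and actual billing may differ.

```python
# Rates in USD per million tokens, as quoted in the article.
TIER_LIMIT = 200_000
INPUT_RATE = {"short": 1.25, "long": 2.50}
OUTPUT_RATE = {"short": 10.00, "long": 15.00}


def estimate_pro_cost(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one Gemini 2.5 Pro request.

    Assumption: the prompt length picks the tier, and that tier's rates
    apply to all input and output tokens. output_tokens should include
    any "thinking" tokens, since those are billed as output.
    """
    tier = "short" if prompt_tokens <= TIER_LIMIT else "long"
    total = prompt_tokens * INPUT_RATE[tier] + output_tokens * OUTPUT_RATE[tier]
    return total / 1_000_000


if __name__ == "__main__":
    print(f"${estimate_pro_cost(10_000, 2_000):.4f}")
    print(f"${estimate_pro_cost(300_000, 2_000):.4f}")
```

Under these assumptions, a 10,000-token prompt with 2,000 output tokens comes to roughly $0.03, while a 300,000-token prompt with the same output comes to roughly $0.78.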

Together, Gemini 2.5 Flash and Pro offer developers more flexibility across speed, cost, and reasoning power—part of Google’s broader strategy to deliver scalable AI options for a wide range of use cases.

