Google's Gemini 2.0 model family expands with Flash-Lite and Pro

Feb 5, 2025


Key Points

Google is expanding its Gemini AI model family with two new versions of Gemini 2.0, each designed for different use cases and price points. The low-cost Flash-Lite is optimized for price/performance, and the experimental Gemini 2.0 Pro targets complex queries and coding.
Benchmark data shows that Gemini 2.0 Pro outperforms its predecessors in areas such as math and factual accuracy. It scores 91.8% on the MATH benchmark and 44.3% on OpenAI's SimpleQA test. The general Gemini 2.0 Flash falls between the Lite and Pro versions.
While Gemini 2.0 Flash costs more than its predecessor, the new Flash-Lite is priced the same as Gemini 1.5 Flash while delivering better performance in most benchmarks. All models are available through Google's AI platforms and its Gemini Advanced chatbot.

Google is expanding its AI model family to three Gemini 2.0 variants, each designed for different use cases and offering a different balance of performance and cost.

The basic Gemini 2.0 Flash model, introduced in December, is now generally available with higher rate limits and improved performance, Google says. Google is also launching Gemini 2.0 Flash-Lite, a cost-effective variant for developers that's currently in public preview through the API.

Completing the lineup is Gemini 2.0 Pro, which Google describes as still experimental. Designed for complex prompts and coding tasks, it features an extended context window of 2 million tokens, twice that of the Flash versions.

While the models only support text output for now, Google plans to add image and audio output as well as live video to Flash and Pro in the coming months. All three models can process image and audio as input.

Feature comparison table: Gemini 2.0 variants (Flash, Flash-Lite, Pro Experimental) showing availability and capabilities across 11 key features. Flash-Lite provides essential features, while Flash and Pro Experimental include advanced tools like code execution and search functionality. Pro Experimental stands out with a doubled context window of 2 million tokens. | Image: Google

Google is also testing Flash Thinking models with Gemini 2.0, which function similarly to OpenAI's o3 and DeepSeek-R1 by running through additional reasoning steps before generating answers. These models can access YouTube, Maps and Google Search. Notably absent from the announcement is the flagship "Gemini 2.0 Ultra" model.

Gemini 2.0 Pro leads the benchmarks

Google's benchmark data shows that Gemini 2.0 Pro outperforms its predecessors in almost all areas. For mathematical tasks, it scores 91.8 percent on the MATH benchmark and 65.2 percent on HiddenMath, significantly outperforming the Flash variants. Gemini 2.0 Flash scores between Flash-Lite and Pro and outperforms the older 1.5 Pro model.

Comparison table: Performance metrics of Gemini 2.0 Flash (GA) and Pro Experimental across 12 categories, with mathematics as standout strength. Results of the new Gemini 2.0 Flash and Pro Experimental models in common benchmarks. Advances have been made in mathematics, multilingual tasks and factuality. | Image: Google

In OpenAI's SimpleQA test, the Pro model reached 44.3 percent, while Gemini 2.0 Flash achieved 29.9 percent. DeepSeek-R1 (30.1 percent) and o3-mini-high (13.8 percent) trail the Pro model by a wide margin in this test, likely due to smaller training datasets. The test requires models to answer difficult factual questions without internet access, though this may be less relevant for real-world applications.

For API pricing, Google has removed the previous distinction between short and long context queries. This means that mixed workloads (text and images) may cost less than with Gemini 1.5 Flash, despite performance improvements.
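
To see why dropping the short/long context split can lower the bill for some workloads even when the headline rate goes up, here is a minimal Python sketch. Every number in it is a made-up placeholder, not a Google price: the 128,000-token threshold, the per-million-token rates and the sample prompt sizes are all assumptions for illustration. It also counts input tokens only and assumes a long prompt is billed entirely at the higher rate.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class TieredRate:
    """Old-style pricing: one rate for short prompts, a higher one for long prompts."""
    threshold_tokens: int  # hypothetical cutoff between "short" and "long"
    short_rate: float      # placeholder price per million input tokens
    long_rate: float       # placeholder price per million input tokens

    def cost(self, prompt_tokens: int) -> float:
        # Assumption: the whole prompt is billed at the tier it falls into.
        rate = self.short_rate if prompt_tokens <= self.threshold_tokens else self.long_rate
        return prompt_tokens / TOKENS_PER_MILLION * rate


@dataclass(frozen=True)
class FlatRate:
    """New-style pricing: a single rate regardless of prompt length."""
    rate: float  # placeholder price per million input tokens

    def cost(self, prompt_tokens: int) -> float:
        return prompt_tokens / TOKENS_PER_MILLION * self.rate


def workload_cost(pricing, prompt_sizes):
    """Sum the input cost of every prompt in a workload."""
    return sum(pricing.cost(n) for n in prompt_sizes)


# Placeholder rates in arbitrary currency units, NOT real list prices.
tiered = TieredRate(threshold_tokens=128_000, short_rate=1.0, long_rate=2.0)
flat = FlatRate(rate=1.5)

mixed = [10_000, 50_000, 500_000, 900_000]  # short and long prompts together
short_only = [10_000, 50_000]               # nothing above the threshold

for name, sizes in (("mixed", mixed), ("short only", short_only)):
    print(f"{name}: tiered={workload_cost(tiered, sizes):.2f} "
          f"flat={workload_cost(flat, sizes):.2f}")
```

With these placeholder values, the mixed workload comes out cheaper under the flat rate (about 2.19 versus 2.86 units), while the short-only workload comes out more expensive (about 0.09 versus 0.06). Whether a real workload lands on one side or the other depends on the actual rates and on how many prompts are long.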

Overall, Gemini 2.0 Flash is pricier than its predecessor. However, the new Flash-Lite is designed to compete with the older 1.5 Flash - it costs the same and performs better in most benchmarks. Only real-world tests will show whether the two models deliver comparable quality.

Pricing table: Detailed cost breakdown for Gemini 2.0 Flash, showing prices per million tokens for various input and output formats. Gemini 2.0 Flash is pricier than 1.5 Flash, and Flash-Lite is designed to fill the price-performance gap. | Image: Google

All models are available through Google AI Studio and Vertex AI, as well as Google's premium Gemini Advanced chatbot on desktop and mobile devices.


Source: Google for Developers | Google Blog