Gemini 2.5 Pro: Google has finally caught up
Key Points
- Google DeepMind has unveiled Gemini 2.5 Pro, its most powerful AI model to date, which the company says leads a number of benchmark tests, including Chatbot Arena, which measures human preferences.
- The multimodal model can process text, audio, images, video and code. It scores well on mathematical and scientific tests such as GPQA and AIME, and reaches 18.8% on the challenging "Humanity's Last Exam" without any special optimizations.
- Gemini 2.5 Pro has a context window of 1 million tokens (with plans to expand to 2 million) and is already available to developers in Google AI Studio and to users of Gemini Advanced, with availability in Vertex AI to follow in the coming weeks.
Google DeepMind has introduced Gemini 2.5 Pro, which the company describes as its most capable AI model to date.
According to Google, the new model already leads numerous benchmark tests by significant margins, including Chatbot Arena, which measures human preferences.
The model is Google's first major reasoning model, following initial experiments with Gemini 2.0 Flash Thinking. Google intends to integrate these reasoning capabilities directly into all its future models.
Performance across multiple domains
Gemini 2.5 Pro demonstrates strong capabilities in various areas, Google says. Without specialized optimization, the model achieves solid results in mathematical and scientific tests like GPQA and AIME. It scores 18.8% on the challenging "Humanity's Last Exam" - the highest score among models without additional tools.
For programming tasks, Google claims Gemini 2.5 Pro excels particularly in web app development and code transformation. With a customized agent configuration, it achieves 63.8% on SWE-Bench Verified. However, Anthropic's Claude 3.7 Sonnet Thinking still outperforms Google's model in this benchmark. To demonstrate the model's coding ability, Google shows it generating functional game code from a single-line instruction.
First true multimodal reasoning model
Like its predecessors, Gemini 2.5 Pro processes text, audio, images, video, and code, a range of inputs that competing models have not yet matched. The model keeps Google's characteristic large context window of 1 million tokens, with plans to expand this to 2 million.
Developers and businesses can already experiment with Gemini 2.5 Pro in Google AI Studio. Gemini Advanced subscribers can select the model from the dropdown menu on both desktop and mobile devices. Google plans to announce availability in Vertex AI and pricing details in the coming weeks.