Date: 2025-03-26
AI in practice
Maximilian Schreiner

Gemini 2.5 Pro: Google has finally caught up

Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.

Google DeepMind has introduced Gemini 2.5 Pro, which the company describes as its most capable AI model to date.

According to Google, the new model already leads numerous benchmarks by significant margins, including Chatbot Arena, which measures human preferences.

The model is Google's first major reasoning model, following initial experiments with Gemini 2.0 Flash Thinking. Google intends to build these reasoning capabilities directly into all of its future models.

Performance across multiple domains

Gemini 2.5 Pro demonstrates strong capabilities in various areas, Google says. Without specialized optimization, the model achieves solid results on mathematical and scientific tests such as GPQA and AIME. It scores 18.8% on the challenging "Humanity's Last Exam", the highest score among models without additional tools.

For programming tasks, Google claims Gemini 2.5 Pro excels particularly in web app development and code transformation. Google demonstrates this capability by showing how the model can generate functional game code from a single-line instruction. With a customized agent configuration, the model achieves 63.8% on SWE-Bench Verified. However, Anthropic's Claude 3.7 Sonnet Thinking still outperforms Google's model on that benchmark.

First true multimodal reasoning model

Like its predecessors, Gemini 2.5 Pro processes text, audio, images, video, and code - a diversity of inputs not yet matched by competing models. The model maintains Google's characteristic large context window of 1 million tokens, with plans to expand this to 2 million.

Developers and businesses can already experiment with Gemini 2.5 Pro in Google AI Studio. Gemini Advanced subscribers can select the model from the dropdown menu on both desktop and mobile devices. Google plans to announce availability in Vertex AI and pricing details in the coming weeks.

Summary
  • Google DeepMind has unveiled Gemini 2.5 Pro, its most powerful AI model to date, which the company says leads a number of benchmarks, including Chatbot Arena, which measures human preferences.
  • The multimodal model can process text, audio, images, video and code. Without any special optimizations, it scores well on mathematical and scientific tests such as GPQA and AIME, and it reaches 18.8% on the challenging "Humanity's Last Exam" without additional tools.
  • Gemini 2.5 Pro has a context window of 1 million tokens (with plans to expand to 2 million) and is already available to developers in Google AI Studio and to users of Gemini Advanced, with availability in Vertex AI to follow in the coming weeks.
Sources
Google
