Google's latest version of its Gemini AI model (Exp-1114) has achieved top scores across most test categories in the Chatbot Arena, now sharing the leading position with OpenAI's GPT-4o, according to testing platform lmarena.ai.
Based on more than 6,000 community evaluations, Gemini-Exp-1114 leads the rankings in the mathematics, image processing, and creative writing categories; for programming tasks, it ranks third. In head-to-head comparisons, Gemini wins 50 percent of matches against GPT-4o, 56 percent against o1-preview, and 62 percent against Claude 3.5 Sonnet.
The picture changes significantly under style control, a metric that assesses content quality while discounting formatting elements such as response length and headings. With this adjustment, which aims to prevent models from scoring higher simply by producing longer or visually enhanced answers, Gemini drops to fourth place.
The experimental Gemini version is publicly available through Google's AI Studio platform.
Gemini 2 or just a minor update?
Gemini, first introduced in December 2023 and updated to version 1.5 in February 2024, currently offers a Pro variant with a context window of up to one million tokens, and a beta version handling up to ten million. The system works with text, images, audio, video, and code. Google integrates Gemini across various products, including Workspace, Google Search, and the Gemini app.
Reports suggest Google plans to introduce Gemini 2 in December, though its performance reportedly falls short of internal expectations. Whether this new experimental version is a variant of Gemini 2 remains unclear.