Google rolls out AI model "Gemini Pro", "Gemini Ultra" to beat GPT-4

Google's long-awaited Gemini Pro AI model finally debuts in Bard, albeit in a smaller version with fewer capabilities. With Ultra, Google is teasing a larger Gemini model for early 2024 that is supposed to beat OpenAI's GPT-4.

According to Google, Gemini Pro is a competitor to OpenAI's year-old GPT-3.5 AI model. It is supposed to outperform OpenAI's model in six out of eight benchmarks. An even more compact version, Nano (1.8B parameters and 3.25B parameters), is optimized for Android app development. The Nano models are distilled from the larger Gemini models.

The Pro and Nano will be available through the Google Cloud starting December 13, and Google says they run on its own TPU AI chips. Google does not specify parameters for the larger models. Like other providers' LLMs, Google says Gemini is still struggling with hallucinations.

Google is offering three sizes of the Gemini model. Nano is optimized for mobile devices, Pro is the GPT-3.5 competitor, Ultra is supposed to surpass GPT-4 and will be available in early 2024. | Bild: Google

The largest version of Gemini, Ultra, is expected to outperform OpenAI's GPT-4 on popular benchmarks for text and image understanding and code generation. Ultra will be released in early 2024 and will also be integrated into an "advanced" version of the Bard chatbot (see below).

According to Google, Gemini Ultra should outperform OpenAI GPT-4 on popular benchmarks. An independent study has yet to be done. | Image: Google AI

Google Gemini is allegedly good enough for state-of-the-art results in many tasks, especially in audio and video understanding. | Image: Google AI

Google's benchmark results need to be confirmed by independent, third-party testers. More benchmark results are available from Deepmind.

Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks — notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined.

Google Deepmind, Technical Report

As expected, Gemini is multimodal, meaning it can handle text, images, audio, video, and code. Gemini does not currently offer image generation, but according to the technical documentation, this feature is available and will probably be introduced over time. Gemini can be prompted with images, text, or a combination of the two.

Google Gemini can also generate images based on text, images, or both. | Bild: Image: Google

The following video demonstrates Gemini's multimodal capabilities.

Try Gemini Pro in Google Bard

Google is integrating Gemini with Bard in two phases. Beginning today, Bard will use a customized version of Gemini Pro English that provides enhanced features for understanding, summarizing, planning, and coding. Gemini Pro English is available in more than 170 countries and territories, according to Google.

According to Google, Gemini Pro outperformed GPT-3.5 in six out of eight benchmarks, including Massive Multitask Language Understanding (MMLU) and GSM8K, which measures elementary school-level math problem-solving skills. In independent third-party blind tests, Bard was rated the preferred free chatbot over ChatGPT, according to Google.

Recommendation

AI in practice

OpenAI launches new reasoning model o3-mini for free ChatGPT and API

The second upgrade phase for Bard will introduce Bard Advanced early next year, giving users access to the most advanced models and features, starting with Gemini Ultra. It is not known if Google will charge for this, as OpenAI does for ChatGPT Plus.

Over the next year, Gemini models will be rolled out to other Google products such as search, ads, and the Workspace productivity app.

The Nano model for smartphones will be used in the Pixel 8 Pro. For example, it will create summaries of voice memos.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Google rolls out AI model "Gemini Pro", "Gemini Ultra" to beat GPT-4

Try Gemini Pro in Google Bard

OpenAI launches new reasoning model o3-mini for free ChatGPT and API

Perplexity's valuation soared to $18 billion after its latest funding round

OpenAI CEO Sam Altman warns users not to trust ChatGPT agent with sensitive or personal data

Anthropic appears to tighten the usage limits for Claude code

OpenAI launches new ChatGPT agent that automates complex tasks for Pro, Plus, and Team

Kimi-K2 is the next open-weight AI milestone from China after Deepseek

New Energy-Based Transformer architecture aims to bring better "System 2 thinking" to AI models

Google rolls out AI model "Gemini Pro", "Gemini Ultra" to beat GPT-4

Try Gemini Pro in Google Bard

Share

Bank details