Google's long-awaited Gemini AI model finally debuts in Bard, albeit as Gemini Pro, a smaller version with fewer capabilities. With Ultra, Google is teasing a larger Gemini model for early 2024 that is supposed to beat OpenAI's GPT-4.
According to Google, Gemini Pro is a competitor to OpenAI's year-old GPT-3.5 AI model. It is supposed to outperform OpenAI's model in six out of eight benchmarks. An even more compact version, Nano (in 1.8B- and 3.25B-parameter variants), is optimized for Android app development. The Nano models are distilled from the larger Gemini models.
Pro and Nano will be available through Google Cloud starting December 13, and Google says they run on its own TPU AI chips. Google does not specify parameter counts for the larger models. Google says that Gemini, like other providers' LLMs, still struggles with hallucinations.
The largest version of Gemini, Ultra, is expected to outperform OpenAI's GPT-4 on popular benchmarks for text and image understanding and code generation. Ultra will be released in early 2024 and will also be integrated into an "advanced" version of the Bard chatbot (see below).
Google's benchmark results still need to be confirmed by independent third-party testers. More benchmark results are available from DeepMind.
Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks — notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined.
Google DeepMind, Technical Report
As expected, Gemini is multimodal, meaning it can handle text, images, audio, video, and code. Gemini does not currently offer image generation to users, but according to the technical report, the model is capable of it, and the feature will probably be introduced over time. Gemini can be prompted with images, text, or a combination of the two.
The following video demonstrates Gemini's multimodal capabilities.
Try Gemini Pro in Google Bard
Google is integrating Gemini with Bard in two phases. Beginning today, Bard will use a customized, English-language version of Gemini Pro that provides enhanced features for understanding, summarizing, planning, and coding. This English version is available in more than 170 countries and territories, according to Google.
According to Google, Gemini Pro outperformed GPT-3.5 in six out of eight benchmarks, including Massive Multitask Language Understanding (MMLU) and GSM8K, which measures elementary school-level math problem-solving skills. In independent third-party blind tests, Bard was rated the preferred free chatbot over ChatGPT, according to Google.
The second upgrade phase will introduce Bard Advanced early next year, giving users access to Google's most advanced models and features, starting with Gemini Ultra. It is not known whether Google will charge for this, as OpenAI does for ChatGPT Plus.
Over the next year, Gemini models will be rolled out to other Google products such as Search, Ads, and the Workspace productivity suite.
The Nano model for smartphones will be used in the Pixel 8 Pro. For example, it will create summaries of voice memos.