With the new multimodal AI model Gemini, Google wants to at least catch up with OpenAI's GPT-4. First tests are underway.
According to The Information, which cites three anonymous sources with direct knowledge of the matter, Google has given a small group of selected companies access to a stripped-down chat version of Gemini. The largest version of Gemini is still being developed internally.
The first test with external customers could be an indication that the launch of Gemini is getting closer. Google had previously announced the model for this year, and earlier reports said it would launch in the fall.
Gemini will be offered to businesses via cloud access and integrated into Google's consumer products. Google plans to use Gemini for all of its AI applications, from the Bard chatbot to the new AI features in Workspace.
Through the Vertex AI service, Google plans to offer different model sizes. Smaller models could perform simpler tasks at a lower cost.
Google user data could give Gemini an edge
A big advantage, according to one tester, is that Google can process data from its products, such as Google Search, in addition to public information from the Web. This could result in the model understanding user intent better than GPT-4. It could also result in fewer incorrect answers, according to the source.
Gemini's code generation is reportedly good enough that Google hopes to compete with Microsoft's GitHub Copilot. Other features under discussion include analyzing graphs, interpreting data, and using voice commands to perform actions on the computer, for example in the browser.
Multiple Gemini models
Gemini, according to The Information, is "a set of large language models" that can handle a variety of tasks, such as powering chatbots, summarizing text, writing code, or generating new text. It is unclear whether Gemini will rely on networked expert models, as OpenAI reportedly does with its GPT-4 architecture.
Demis Hassabis, who leads the Gemini project, said in late June that Gemini would combine some strengths of the AlphaGo system with the language capabilities of large models.