Content
summary Summary

Google's multimodal AI model Gemini will compete with OpenAI's GPT-4 starting this fall and will also be available to AI app developers.

This is reported by The Information, citing an anonymous person involved in the development of Gemini.

Gemini is "a group of large AI models," the source said, suggesting that, similar to OpenAI, Google could use GPT-4's approach to model architecture consisting of multiple AI expert models with specific capabilities. It could also mean that Google wants to make Gemini available in different sizes, which is likely to be cost-efficient.

Gemini reportedly can generate images as well as text. Since Gemini has also been trained on YouTube video transcripts, it could also be able to generate simple videos, similar to RunwayML Gen-2 or Pika Labs. Gemini is also said to have significantly improved coding capabilities.

Ad
Ad

Google plans to gradually integrate Gemini into its products, such as the Bard chatbot and Google Docs or Slides. Later this year, Gemini will also be available to external developers in the Google Cloud.

Huge launch requires huge staffing

According to The Information, at least two dozen executives are involved in developing the model. The Gemini team, which consists of Google Brain and Deepmind, is said to include several hundred employees.

Google Deepmind was recently merged and is still finding the right balance, such as remote work policies and the technology used to train the models, according to The Information. Deepmind reportedly abandoned its ChatGPT competitor, codenamed "Goodall" and based on an unannounced model called "Chipmunk," in favor of Gemini.

The Gemini team is led by Deepmind founder Demis Hassabis, with support from two Deepmind executives, Oriol Vinyals and Koray Kavukcuoglu, and former Google Brain chief Jeff Dean. Even Google founder Sergey Brin is involved in the development of Gemini, reportedly helping to train and evaluate the model.

The training materials for Gemini are closely monitored by Google's legal department. For example, the development team has had to remove training data from copyrighted books. According to The Information's source, Gemini was also inadvertently trained on "offensive" content, which likely led to a (partial) re-training of the model.

Recommendation

Gemini was officially unveiled in May. Earlier rumors said the model would have at least a trillion parameters. Training is said to use tens of thousands of Google's TPU AI chips.

Gemini CEO Demis Hassabis said in late June that Gemini will be "combining some of the strengths of AlphaGo-type systems with the amazing language capabilities of the large models. We also have some new innovations that are going to be pretty interesting."

Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • Google is training the Gemini multimodal AI model to compete with OpenAI's GPT-4 starting this fall. It will generate text and images and have advanced programming capabilities, according to The Information.
  • The Gemini team consists of several hundred people from Google Brain and DeepMind, and the model will be gradually integrated into Google products and offered as a cloud product to external developers by the end of the year.
  • Training materials for Gemini will be closely monitored by Google's legal department to avoid copyrighted and offensive content.
Sources
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.