Google's next-generation AI model "Gemini" to launch this fall

Midjourney prompted by THE DECODER

Google's multimodal AI model Gemini will compete with OpenAI's GPT-4 starting this fall and will also be available to AI app developers.

This is reported by The Information, citing an anonymous person involved in the development of Gemini.

Gemini is "a group of large AI models," the source said, suggesting that, similar to OpenAI, Google could use GPT-4's approach to model architecture consisting of multiple AI expert models with specific capabilities. It could also mean that Google wants to make Gemini available in different sizes, which is likely to be cost-efficient.

Gemini reportedly can generate images as well as text. Since Gemini has also been trained on YouTube video transcripts, it could also be able to generate simple videos, similar to RunwayML Gen-2 or Pika Labs. Gemini is also said to have significantly improved coding capabilities.

Google plans to gradually integrate Gemini into its products, such as the Bard chatbot and Google Docs or Slides. Later this year, Gemini will also be available to external developers in the Google Cloud.

Huge launch requires huge staffing

According to The Information, at least two dozen executives are involved in developing the model. The Gemini team, which consists of Google Brain and Deepmind, is said to include several hundred employees.

Google Deepmind was recently merged and is still finding the right balance, such as remote work policies and the technology used to train the models, according to The Information. Deepmind reportedly abandoned its ChatGPT competitor, codenamed "Goodall" and based on an unannounced model called "Chipmunk," in favor of Gemini.

The Gemini team is led by Deepmind founder Demis Hassabis, with support from two Deepmind executives, Oriol Vinyals and Koray Kavukcuoglu, and former Google Brain chief Jeff Dean. Even Google founder Sergey Brin is involved in the development of Gemini, reportedly helping to train and evaluate the model.

The training materials for Gemini are closely monitored by Google's legal department. For example, the development team has had to remove training data from copyrighted books. According to The Information's source, Gemini was also inadvertently trained on "offensive" content, which likely led to a (partial) re-training of the model.

Recommendation

AI in practice

OpenAI's Operator and Computer-Using Agent bring autonomous AI agents closer to reality

Gemini was officially unveiled in May. Earlier rumors said the model would have at least a trillion parameters. Training is said to use tens of thousands of Google's TPU AI chips.

Gemini CEO Demis Hassabis said in late June that Gemini will be "combining some of the strengths of AlphaGo-type systems with the amazing language capabilities of the large models. We also have some new innovations that are going to be pretty interesting."

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Google's next-generation AI model "Gemini" to launch this fall

Huge launch requires huge staffing

OpenAI's Operator and Computer-Using Agent bring autonomous AI agents closer to reality

Deepmind expert says trimming documents improves accuracy despite large context windows

Google DeepMind open-sources AI text watermarking for Gemini

Google restructures AI efforts, moves Gemini back to DeepMind

"Cat attack" on reasoning model shows how important context engineering is

Apple's claims about large reasoning models face fresh scrutiny from a new study

Cloudflare CEO Matthew Prince sees trouble ahead for the open web

Google's next-generation AI model "Gemini" to launch this fall

Huge launch requires huge staffing

Share

Bank details