
Oriol Vinyals, Google DeepMind's Vice President of Drastic Research and co-technical lead of Gemini, describes how artificial intelligence is moving from narrowly focused systems toward autonomous agents, and what challenges remain ahead.


According to Vinyals, AI is going through a fundamental transformation away from highly specialized systems and toward autonomous agents. Speaking on a company podcast, he explained that early AI systems like AlphaStar, which focused on playing StarCraft, were just the beginning of this development.

Today's large language models (LLMs) and multimodal systems serve as a kind of "CPU," Vinyals says: a foundation for more complex capabilities. The next major step is giving these systems a "digital body" that allows them to interact independently with the (digital) world.

The limitations of scaling

A key challenge lies in the limitations of scaling, according to Vinyals. Simply building larger models is no longer enough, as improvements become exponentially harder to achieve. Vinyals compares it to cleaning a room: "The first 10 minutes that you spend tidying, it's going to make a massive difference. But once you're like 7 hours in, that extra 10 minutes, it's not going to make any difference at all."


Training data is also becoming scarce. Vinyals says DeepMind is experimenting with synthetic data and untapped data sources like videos: "There's a lot of it. And we haven't quite seen a moment of take all the video data where you probably can derive a lot of knowledge, a lot of laws of physics, a lot of how the world works, even if there are no words associated with the videos necessarily, and extract that knowledge."

First steps with Gemini 2.0

With Gemini 2.0, Google DeepMind has introduced initial capabilities toward autonomous agents. According to Google's demos, the system can navigate browsers, write code, and act as a "companion" in games. But these abilities are just the beginning: "There are a lot of steps. But if you just fast-forward, anything a human can do on a browser, these things can do in principle. And then if you make them really understand what you want and really good through thinking and other techniques, they'll get better and better," says Vinyals.

The vision extends further: DeepMind is working to give agents capabilities like planning, logical thinking, and different types of memory. While Vinyals draws parallels to the human brain, he emphasizes that artificial systems might take entirely different approaches better suited to computers.

AGI, Agents and AlphaFold

On the development of Artificial General Intelligence (AGI), Vinyals takes a measured view: "If 10 years ago, 5 years ago even, I would have been given the models today, and I would say, look, there's a secret lab, this is a model, play with it and tell me if you think this is actually close to a general intelligence; I would have claimed, oh, yeah, that comes from a future where AGI basically either has happened or I can see that this is very close to it. So the closer you are, the more you find, oh, but it hallucinates. Of course, that's very important. But I think, just zooming out, it just feels like, ok, it's getting pretty close."

He expects initial breakthroughs mainly in scientific areas with clear success criteria, as was the case with AlphaFold. "And we've seen a good example very recently, of course, with AlphaFold. So in that sense, from a domains perspective, we honestly have seen some examples already of narrow but superintelligent system. AlphaFold was only doing that. And I think probably that's the domains to think about where we're going to start seeing superintelligence, even from the general sort of capabilities these models have. You might need to do some specialization. And again, it might be worth it," says Vinyals. "Was it worth it to solve protein folding? Absolutely, right? But I think that's a good test to use."

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • DeepMind's Oriol Vinyals says AI is moving from narrowly focused systems toward autonomous agents that can independently interact with the digital world, using large language models and multimodal systems as a foundation.
  • Scaling AI systems is becoming increasingly difficult due to the limitations of training data and the exponential effort required for improvements. DeepMind is exploring synthetic data and untapped sources like videos to overcome these challenges.
  • With Gemini 2.0, DeepMind has introduced initial capabilities for autonomous agents, such as navigating browsers, writing code, and acting as a companion in games. The company is working on further capabilities like planning, logical thinking, and different types of memory, which may take approaches quite different from the human brain.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.