Google DeepMind's RoboCat improves itself

Google DeepMind's RoboCat can control many different robotic arms and continually improves using self-generated data.

RoboCat is a self-improving AI agent for robotics that learns a variety of tasks across multiple robotic arms and autonomously generates new training data to improve itself. In doing so, the team aims to address a key problem in robotics: advances in AI could lead to general-purpose robots, but development is slow because collecting the necessary real-world data takes a long time.

With Robotic Transformer 1 and projects such as PaLM-SayCan, Google is also trying to apply its experience in other areas of AI to robotics. However, according to Google DeepMind, RoboCat is the first AI agent capable of solving multiple tasks and adapting to different real-world robots.

Google DeepMind's RoboCat is based on DeepMind's Gato

RoboCat also learns much faster than other models: the AI agent can learn a new task from 100 to 1,000 demonstrations, and other models can't match RoboCat's success rate with so few demonstrations.

"This capability will help accelerate robotics research, as it reduces the need for human-supervised training, and is an important step towards creating a general-purpose robot," the team said.

RoboCat is based on DeepMind's Gato, which can process language, images, and actions in both simulated and real-world environments. The team made some adjustments to Gato and trained the model on a large dataset of image and action sequences from different robotic arms performing hundreds of tasks.

After this training, RoboCat's self-improvement phase begins, during which the system learns to perform previously unknown tasks. Training takes place in five stages:

1. Collecting 100 to 1,000 demonstrations of a new task or robot with a robotic arm controlled by a human.
2. Fine-tuning RoboCat to the new task/arm, creating a specialized spin-off agent.
3. The spin-off agent practices the new task/arm an average of 10,000 times, generating more training data.
4. The demonstration data and the self-generated data are integrated into the existing RoboCat training dataset.
5. A new version of RoboCat is then trained using the new training data set.
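
The five stages can be sketched as a simple loop. The following Python is only an illustration of the data flow described above, not DeepMind's implementation: every class, function, and task name (`Agent`, `Dataset`, `self_improvement_cycle`, `stack_blocks`, and so on) is hypothetical, the "training" steps are placeholders that only count trajectories, and the defaults of 500 demonstrations and 10,000 practice episodes are example values taken from the ranges mentioned in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Stand-in for a trained policy; records only its version and data size."""
    version: int
    trained_on: int  # number of trajectories seen during training


@dataclass
class Dataset:
    trajectories: list = field(default_factory=list)


def collect_demonstrations(task: str, n: int) -> list:
    # Stage 1: human-controlled demonstrations (placeholder records here).
    return [{"task": task, "source": "human"} for _ in range(n)]


def fine_tune(generalist: Agent, task: str, demos: list) -> dict:
    # Stage 2: specialize the generalist into a spin-off agent for one task.
    return {"base_version": generalist.version, "task": task, "demos": len(demos)}


def practice(spin_off: dict, episodes: int) -> list:
    # Stage 3: the spin-off agent practices and logs its own trajectories.
    return [{"task": spin_off["task"], "source": "self"} for _ in range(episodes)]


def train_generalist(dataset: Dataset, version: int) -> Agent:
    # Stage 5: train a new generalist on the enlarged dataset (not modeled).
    return Agent(version=version, trained_on=len(dataset.trajectories))


def self_improvement_cycle(agent, dataset, task, num_demos=500, episodes=10_000):
    demos = collect_demonstrations(task, num_demos)
    spin_off = fine_tune(agent, task, demos)
    self_data = practice(spin_off, episodes)
    dataset.trajectories.extend(demos + self_data)  # Stage 4: merge both sources
    return train_generalist(dataset, agent.version + 1)


# Two hypothetical new tasks, each adding 500 + 10,000 trajectories.
dataset = Dataset()
agent = Agent(version=0, trained_on=0)
for task in ["stack_blocks", "insert_gear"]:
    agent = self_improvement_cycle(agent, dataset, task)
print(agent)  # Agent(version=2, trained_on=21000)
```

The point of the sketch is that each cycle feeds both the human demonstrations and the much larger set of self-generated trajectories back into the shared dataset, so every new generalist version is trained on more data than the last.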

RoboCat can improve itself. | Image: Google DeepMind

RoboCat's ability to learn improves with experience

Taken together, these training rounds give RoboCat a dataset of millions of trajectories from real and simulated robot arms, including self-generated data. Based on this, RoboCat can learn to control new robot arms, even ones with different grippers, in a matter of hours. The more RoboCat learns, the better the AI agent gets at learning the next task. For example, the first version of RoboCat, after learning from 500 demonstrations, solved new tasks only 36 percent of the time; the latest version, trained on significantly more tasks, doubled that success rate.

"These improvements were due to RoboCat's growing breadth of experience, similar to how people develop a more diverse range of skills as they deepen their learning in a given domain," the company said. "RoboCat’s ability to independently learn skills and rapidly self-improve, especially when applied to different robotic devices, will help pave the way toward a new generation of more helpful, general-purpose robotic agents."
