RoboCasa is a new simulation framework for training robots in everyday environments. It uses AI tools to create a variety of scenes, tasks, and 3D objects.

Researchers at the University of Texas at Austin and Nvidia have introduced RoboCasa, a simulation framework for training household robots in everyday environments. The goal is to scale up simulation so that it produces diverse training data, which in turn helps develop robot models that adapt to new situations.

RoboCasa builds on the RoboSuite framework, which is optimized for physical realism and high speed and is based on DeepMind's MuJoCo physics engine. RoboCasa extends RoboSuite to support mobile robots, humanoid robots, and quadruped robots with arms.

Video: Nasiriany, Maddukuri, Zhang et al.

Generative AI enables a diverse training environment

The framework initially focuses on kitchen environments. It offers 120 realistic kitchen scenes in ten layouts and twelve styles, from simple to luxurious. Textures for walls, floors, and furniture are created using AI image generators such as Midjourney to increase visual variety.
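
The scene count is consistent with pairing every layout with every style. A minimal Python sketch of that arithmetic, assuming such a pairing; the layout and style identifiers are hypothetical placeholders, since the article gives only the counts:

```python
from itertools import product

# Hypothetical placeholder identifiers; only the counts come from the article.
layouts = [f"layout_{i}" for i in range(1, 11)]  # ten layouts
styles = [f"style_{j}" for j in range(1, 13)]    # twelve styles

# Assumption: each layout is combined with each style to form one scene.
scenes = list(product(layouts, styles))
print(len(scenes))  # 120
```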

More than 2,500 high-quality 3D objects in 153 categories, from food to kitchen utensils, have been compiled from 3D model databases and text-to-3D services. There are also dozens of interactive kitchen appliances such as microwaves and stoves.

Video: Nasiriany, Maddukuri, Zhang et al.

The researchers defined 100 representative tasks for kitchen activities. Of these, 25 are atomic tasks that cover basic skills such as reaching and pressing buttons. The remaining 75 are composite tasks suggested by language models such as GPT-4; they include more complex activities such as cooking and setting the table.

For each task, human operators first collected 50 teleoperation demonstrations. The dataset was then expanded to over 100,000 synthetic demonstrations using MimicGen, an automated data-generation system.

Generated training data also improves real-world robots

Experiments have shown that robot models trained with machine-generated data generalize significantly better than those trained only with human demonstrations. The success rate increased with the amount of data generated.

Training with simulated data also helped in the real world: in a real kitchen, robots trained with RoboCasa data had a 79% higher success rate than those trained only with real data.
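
If the figure is read as a relative increase, "79% higher" means the baseline success rate is multiplied by 1.79, not raised by 79 percentage points. A small Python sketch of that reading; the 20% baseline is made up for illustration and is not a number from the study:

```python
def apply_relative_increase(baseline: float, relative_increase: float) -> float:
    """Return the rate after a relative increase (0.79 means 79% higher)."""
    return baseline * (1.0 + relative_increase)

baseline = 0.20  # hypothetical success rate with real data only
improved = apply_relative_increase(baseline, 0.79)
print(f"{improved:.3f}")  # 0.358
```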

The researchers see the combination of simulation, generative AI, and robot datasets as a promising approach to training foundation models for robotics.

The researchers plan to further improve the quality of the generated data and to extend the framework to other environments and tasks.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

More information and examples can be found on the RoboCasa project page.

Summary
  • Researchers at the University of Texas at Austin and Nvidia have unveiled RoboCasa, a simulation framework for training household robots in a variety of everyday environments.
  • RoboCasa uses generative AI such as Midjourney and GPT-4 to generate 120 realistic kitchen scenes, over 2,500 3D objects, and 100 representative tasks. Human demonstrations are extended to over 100,000 synthetic demonstrations.
  • Experiments show that robot models additionally trained with machine-generated RoboCasa data generalize significantly better and achieve a 79% higher success rate in the real world than those trained only with real data.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.