
Nvidia researchers have built a small neural network that controls humanoid robots more effectively than specialized systems, despite using far fewer computational resources. The system works with multiple input methods, from VR headsets to motion capture.


The new system, called HOVER, needs only 1.5 million parameters to handle complex robot movements. For context, typical large language models use hundreds of billions of parameters.

The team trained HOVER in Nvidia's Isaac simulation environment, which runs robot movements 10,000 times faster than real time. According to Nvidia researcher Jim Fan, this means a full year of virtual training takes just around 50 minutes of actual computing time on a single GPU.
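That claim is easy to sanity-check with back-of-the-envelope arithmetic. The only input below is the 10,000x real-time factor cited above; the rest is our own calculation:

```python
# Back-of-the-envelope check of the Isaac speedup claim.
# The only assumed input is the 10,000x real-time factor cited in the article.

SPEEDUP = 10_000                       # simulation speed relative to real time
minutes_per_year = 365.25 * 24 * 60    # ~525,960 minutes of simulated experience

wall_clock_minutes = minutes_per_year / SPEEDUP
print(f"One simulated year ~= {wall_clock_minutes:.1f} minutes of GPU time")
# -> One simulated year ~= 52.6 minutes of GPU time, roughly the 50 minutes cited
```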

Small and versatile

HOVER transfers zero-shot from simulation to physical robots without the need for fine-tuning, says Fan. The system accepts input from multiple sources, including head and hand tracking from XR devices such as the Apple Vision Pro, full-body poses from motion capture or RGB cameras, joint angles from exoskeletons, and standard joystick controls.
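The article doesn't detail how a single network ingests such different signals, but the general idea can be sketched as a masked command space: each control mode fills in the slots it provides, and a binary mask tells the one policy which slots to track. Here is a minimal sketch, where all names, dimensions, and the packing scheme are illustrative assumptions rather than Nvidia's actual interface:

```python
import numpy as np

# Per-slot dimensionality (illustrative values, not HOVER's real layout):
# head/hand targets, full-body pose, joint angles.
DIMS = {"head_hands": 9, "body_pose": 24, "joint_angles": 19}

def make_command(**inputs):
    """Pack whichever control inputs are present into a fixed-size vector plus mask.

    Unused slots are zero-filled; the binary mask tells the single policy
    which parts of the command to track and which to ignore.
    """
    parts, mask = [], []
    for name, dim in DIMS.items():
        value = inputs.get(name)
        parts.append(np.asarray(value, dtype=np.float32) if value is not None
                     else np.zeros(dim, dtype=np.float32))
        mask.append(1.0 if value is not None else 0.0)
    return np.concatenate(parts), np.array(mask, dtype=np.float32)

# VR teleoperation supplies only head and hand targets ...
cmd, mask = make_command(head_hands=np.zeros(9))
# ... while motion capture could fill the body_pose slot instead, with the
# same policy consuming both. (Hypothetical usage, for illustration only.)
```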


The HOVER model allows a robot to be remotely controlled via a VR headset without any specific fine-tuning. | Video: Nvidia

On each control method, the system outperforms controllers built specifically for that single input type. Lead author Tairan He speculates that this may be due to the system's broad understanding of physical concepts such as balance and precise limb control, which it applies across all control types.

The system builds on the open-source H2O and OmniH2O projects and works with any humanoid robot that can run in the Isaac simulator. Nvidia has posted examples and code on GitHub.

Summary
  • Nvidia researchers have developed HOVER, a compact neural network with only 1.5 million parameters that can control complex movements of humanoid robots.
  • The system supports multiple control modes, including head and hand tracking from XR devices, full-body poses from motion capture or cameras, and joint angles from exoskeletons.
  • HOVER was trained in Nvidia's Isaac GPU-accelerated simulation environment, where one year of intensive training equates to about 50 minutes of real time on a GPU, and can be applied directly to real robots without fine-tuning.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.