
Google DeepMind has introduced Gemini Robotics On-Device, a version of its robotics model that runs directly on robot hardware, removing the need for a cloud connection. Because everything is processed locally, robots can operate in environments with unreliable or no internet access.


The Vision-Language-Action (VLA) model is built on a variant of Gemini Robotics-ER. Its architecture features a VLA backbone that acts as the "brain," interpreting what the robot sees and deciding on the right actions, while a local action decoder translates those decisions into real-world movements. The entire perception-to-action cycle takes just 250 milliseconds, fast enough for smooth, responsive control.
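The two-stage design described above can be illustrated with a minimal sketch. The class and method names here are hypothetical stand-ins, not Google DeepMind's actual API: a backbone produces a high-level plan from an image and an instruction, and a local decoder turns that plan into a motor command, all within the roughly 250-millisecond cycle budget.

```python
class VLABackbone:
    """Hypothetical stand-in for the on-device vision-language-action "brain"."""

    def plan(self, image, instruction):
        # A real backbone would run a neural network over the camera image;
        # here we just return a dummy high-level action token.
        return {"action": "grasp", "target": instruction}


class ActionDecoder:
    """Hypothetical stand-in for the local decoder that emits joint commands."""

    def decode(self, plan):
        # A real decoder would translate the plan into a trajectory;
        # here we return a fixed dummy joint-velocity command.
        return [0.1, -0.2, 0.05]


def perception_action_cycle(backbone, decoder, image, instruction):
    """One local perception-to-action cycle: perceive, plan, decode."""
    plan = backbone.plan(image, instruction)
    return decoder.decode(plan)
```

Because both stages run on the robot itself, nothing in this loop requires a network round-trip, which is what keeps the cycle latency predictable.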

Solid performance, even without the cloud

In tests, Gemini Robotics On-Device handled tasks like unzipping bags, folding clothes, and pouring salad dressing - all without connecting to external servers. Google says it outperformed other locally run systems on seven different manipulation tasks.

Running the model locally does require some trade-offs. For especially complex reasoning tasks, the cloud-based version achieves higher success rates. However, Google says the on-device model delivers strong enough performance for many practical scenarios.


Google DeepMind is providing a developer kit to make adaptation easier. Instead of requiring millions of training examples, the robot can learn new tasks from just 50 to 100 demonstrations. Developers can also run tests in a simulator without needing physical hardware.
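Learning from a few dozen demonstrations is the general territory of behavior cloning: fit a policy so its predicted actions match what the demonstrator did. The sketch below is illustrative only, assuming a toy one-dimensional linear policy rather than anything from the actual developer kit, whose API the article does not describe.

```python
def behavior_clone(demos, lr=0.01, epochs=200):
    """Fit a toy 1-D linear policy (action = w * observation) to demonstrations.

    demos: list of (observation, action) pairs recorded from a demonstrator.
    Uses plain stochastic gradient descent on squared error.
    """
    w = 0.0
    for _ in range(epochs):
        for obs, action in demos:
            pred = w * obs
            w -= lr * (pred - action) * obs  # gradient of 0.5 * (pred - action)**2
    return w


# 50 synthetic demonstrations of a task whose true policy is action = 2 * obs,
# mirroring the small demonstration counts the article mentions.
demos = [(x / 10, 2 * x / 10) for x in range(50)]
w = behavior_clone(demos)  # converges close to 2.0
```

A real VLA fine-tune replaces the scalar weight with a large network and the pairs with camera frames and robot trajectories, but the data efficiency argument is the same: a narrow task needs far fewer examples than general pretraining did.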

One model for many types of robots

Although the base model was originally trained on ALOHA robots, it can be adapted to a wide range of systems. On a Franka industrial robot, for example, it achieved a 63 percent success rate on familiar tasks, and the model can also control humanoid robots such as Apollo.

Multiple safety layers are built in. The system checks commands for potential hazards and works with hardware safeguards to prevent collisions. Even so, Google DeepMind recommends thorough testing before deploying the system in real-world settings.

Access to Gemini Robotics On-Device is currently limited to a closed testing program. Developers can apply for the Trusted Tester Program as Google DeepMind gathers feedback and gradually improves the system.

Summary
  • Google DeepMind has launched Gemini Robotics On-Device, a robotics model that runs entirely on robot hardware, allowing robots to function smoothly without an internet connection.
  • The system can handle tasks like folding clothes and pouring salad dressing with a quick 250-millisecond perception-to-action cycle, outperforming other locally-run models on seven manipulation tasks, though cloud-based versions still excel at more complex reasoning.
  • Developers can adapt the model to different robots with as few as 50 to 100 demonstrations, test it in simulation, and benefit from built-in safety checks, but real-world deployment requires thorough testing, and access is limited to a closed Trusted Tester Program for now.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.