
Google Deepmind has introduced two new AI models, Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, built to enable robots to understand, plan, and execute complex tasks on their own. Both models combine multimodal perception, language processing, and motor control with an internal decision-making system.


Plan first, then act

Gemini Robotics-ER 1.5 serves as a high-level "brain" for robots. It handles task planning, uses digital tools like Google Search, communicates in natural language, and monitors progress and success rates. According to Google Deepmind, the model delivers state-of-the-art results on 15 embodied reasoning benchmarks, including Point-Bench, ERQA, and MindCube.

The second model, Gemini Robotics 1.5, translates these plans into physical actions. Unlike previous vision-language-action models, it reasons before acting: it builds internal logic chains, plans intermediate steps, breaks down complex tasks, and can explain its decisions. For example, when sorting laundry, the model identifies the goal - such as "light-colored clothes in the white bin" - then plans the grip and executes the movement.
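To make that division of labor concrete, the sketch below shows what such a plan-first, then-act loop could look like in code. It is purely illustrative: the Subtask class and the plan_subtasks and execute_subtask functions are hypothetical stand-ins for the roles of the two models, not part of any Google API.

```python
# Purely illustrative sketch of the "plan first, then act" split described
# in the article. Nothing here is Google's API; the planner and executor
# are stubbed out so the control loop itself can run.

from dataclasses import dataclass


@dataclass
class Subtask:
    description: str         # e.g. "pick up the white shirt"
    success_criterion: str   # e.g. "shirt is inside the white bin"


def plan_subtasks(goal: str) -> list[Subtask]:
    """Planner role (Gemini Robotics-ER 1.5 in the article): decompose the
    goal into ordered steps. A fixed plan stands in for the model here."""
    return [
        Subtask("locate light-colored clothes", "all light items identified"),
        Subtask("pick up the white shirt", "shirt held in gripper"),
        Subtask("place the shirt in the white bin", goal),
    ]


def execute_subtask(step: Subtask) -> bool:
    """Executor role (Gemini Robotics 1.5 in the article): reason about the
    step, plan the grasp, run the motion, report success. Stubbed here."""
    print(f"executing: {step.description}")
    return True  # a real system would verify step.success_criterion


def run(goal: str = "light-colored clothes in the white bin") -> None:
    for step in plan_subtasks(goal):
        if not execute_subtask(step):
            print(f"failed: {step.description}; the planner would re-plan here")
            break


if __name__ == "__main__":
    run()
```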

Adapts to different robot platforms

Both models can generalize their abilities across different robot types. Google says movement patterns learned with the ALOHA 2 robot also work on platforms like Apptronik's Apollo or the two-armed Franka robot, with no extra fine-tuning required.


The models include built-in safety checks. Before executing an action, Gemini Robotics 1.5 checks if the move is safe and can trigger features like collision avoidance if needed.

Both models are based on the broader Gemini multimodal family and have been specifically adapted for robotics. Gemini Robotics-ER 1.5 is now available through the Gemini API in Google AI Studio, while Gemini Robotics 1.5 is currently limited to select partners. More technical details are available in Deepmind's developer blog.
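Because Gemini Robotics-ER 1.5 sits behind the regular Gemini API, a first experiment could look roughly like the sketch below. It assumes the google-genai Python SDK, an API key in the GEMINI_API_KEY environment variable, and the preview model ID gemini-robotics-er-1.5-preview; the exact model name and prompt conventions should be checked against Deepmind's developer documentation.

```python
# Rough sketch of querying Gemini Robotics-ER 1.5 through the Gemini API.
# Assumes the google-genai SDK (pip install google-genai), an API key in the
# GEMINI_API_KEY environment variable, and the preview model ID shown below;
# verify the current identifier and prompt format against Google's docs.

from google import genai
from google.genai import types

client = genai.Client()  # picks up GEMINI_API_KEY from the environment

with open("workbench.jpg", "rb") as f:  # any scene image from the robot's camera
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed preview identifier
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "List the light-colored garments in this scene and describe, step by "
        "step, how to move them into the white bin.",
    ],
)

print(response.text)  # the model replies with a natural-language plan
```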

Google first introduced the Gemini Robotics family in March 2025 to give robots multimodal understanding. In June, the company followed up with Gemini Robotics On-Device, a local version optimized for fast adaptation and robust dexterity on robotic hardware.

The newest models extend these advances with stronger planning, better tool use, and the capacity to operate as agentic systems, mirroring progress made with agentic AI on computers.

Summary
  • Google Deepmind has introduced Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, two new AI models that allow robots to plan, understand, and carry out complex tasks on their own.
  • Gemini Robotics-ER 1.5 serves as the planning "brain," using digital tools like Google Search, communicating in natural language, and evaluating progress and chances of success.
  • Gemini Robotics 1.5 carries out these instructions, thinks before acting, plans intermediate steps, and can explain its decisions, with both models able to generalize across different robot types.