
Google DeepMind brings agentic AI capabilities to robots with two new Gemini models

Image: Google DeepMind (screenshot)

Key Points

  • Google DeepMind has introduced Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, two new AI models that allow robots to plan, understand, and carry out complex tasks on their own.
  • Gemini Robotics-ER 1.5 serves as the planning "brain," using digital tools like Google Search, communicating in natural language, and evaluating progress and chances of success.
  • Gemini Robotics 1.5 carries out these instructions, thinks before acting, plans intermediate steps, and can explain its decisions, with both models able to generalize across different robot types.

Google DeepMind has introduced two new AI models, Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, built to enable robots to plan, understand, and execute complex tasks on their own. Both models combine multimodal perception, language processing, and motor control with an internal decision-making system.

Plan first, then act

Gemini Robotics-ER 1.5 serves as a high-level "brain" for robots. It handles task planning, uses digital tools like Google Search, communicates in natural language, and monitors progress and success rates. According to Google DeepMind, the model delivers state-of-the-art results on 15 embodied reasoning benchmarks, including Point-Bench, ERQA, and MindCube.

The second model, Gemini Robotics 1.5, translates these plans into physical actions. Unlike previous vision-language-action models, it reasons before acting: it builds internal logic chains, plans intermediate steps, breaks down complex tasks, and can explain its decisions. For example, when sorting laundry, the model identifies the goal, such as "light-colored clothes in the white bin," then plans the grip and executes the movement.
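
To make the two-model split concrete, here is a minimal, purely illustrative Python sketch of the described plan-then-act loop. Every name in it (PlannerER, ExecutorVLA, run_task) is invented for this example and is not part of any published Gemini Robotics interface; the canned plan stands in for the model's actual reasoning.

```python
# Illustrative sketch only: hypothetical names, not a real Gemini Robotics API.
from dataclasses import dataclass

@dataclass
class Step:
    description: str  # natural-language subtask, e.g. "pick up the white shirt"
    done: bool = False

class PlannerER:
    """Stands in for Gemini Robotics-ER 1.5: decomposes a goal into steps."""
    def plan(self, goal: str) -> list[Step]:
        # A real planner would query the model (and tools like Google Search);
        # here we return a canned decomposition of the laundry example.
        return [
            Step("locate light-colored clothes"),
            Step("plan a grip on the nearest light-colored item"),
            Step("move the item into the white bin"),
        ]

class ExecutorVLA:
    """Stands in for Gemini Robotics 1.5: reasons first, then acts on a step."""
    def execute(self, step: Step) -> bool:
        # A real executor would emit motor commands; we just log the
        # "think before acting" trace and report success.
        print(f"reasoning about: {step.description}")
        print(f"executing:       {step.description}")
        return True

def run_task(goal: str) -> None:
    planner, executor = PlannerER(), ExecutorVLA()
    for step in planner.plan(goal):
        step.done = executor.execute(step)
        if not step.done:
            break  # the planner would normally re-plan on failure

run_task("light-colored clothes in the white bin")
```

The key design point this mirrors is the separation of concerns: the planner never touches actuators, and the executor only ever sees one natural-language subtask at a time.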

Adapts to different robot platforms

Both models can generalize their abilities across different robot types. Google says movement patterns learned with the ALOHA 2 robot also work on platforms like Apptronik's Apollo or the two-armed Franka robot, with no extra fine-tuning required.

The models include built-in safety checks. Before executing an action, Gemini Robotics 1.5 checks if the move is safe and can trigger features like collision avoidance if needed.
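
A rough sketch of what such a pre-execution gate looks like in code follows; the Action fields and the check are invented for illustration and do not reflect how Gemini Robotics 1.5 implements its safety layer internally.

```python
# Hypothetical sketch of a pre-execution safety gate.
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    collision_risk: bool  # would this trajectory intersect an obstacle?

def execute_safely(action: Action) -> None:
    if action.collision_risk:
        # a real system would trigger collision avoidance or re-plan
        print(f"{action.name}: collision risk detected, avoidance triggered")
        return
    print(f"{action.name}: executing")

execute_safely(Action("reach_to_bin", collision_risk=False))
execute_safely(Action("sweep_arm", collision_risk=True))
```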

Both models are based on the broader Gemini multimodal family and have been specifically adapted for robotics. Gemini Robotics-ER 1.5 is now available through the Gemini API in Google AI Studio, while Gemini Robotics 1.5 is currently limited to select partners. More technical details are available in DeepMind's developer blog.
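
For developers, a call through the Gemini API might look like the snippet below, using Google's google-genai Python SDK. The model identifier shown is an assumption based on the announcement and may differ from the name listed in Google AI Studio.

```python
# Minimal sketch of querying the model via the Gemini API
# (pip install google-genai). Model identifier is an assumption.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed identifier; check AI Studio
    contents="Outline the steps to sort the laundry in front of you so that "
             "light-colored clothes end up in the white bin.",
)
print(response.text)
```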

Google first introduced the Gemini Robotics family in March 2025 to give robots multimodal understanding. In June, the company followed up with Gemini Robotics On-Device, a local version optimized for fast adaptation and robust dexterity on robotic hardware.

The newest models extend these advances with stronger planning, better tool use, and the capacity to operate as agentic systems, mirroring progress made with agentic AI on computers.

Source: Google