
Google Deepmind has introduced two new AI models, Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, built to enable robots to understand, plan, and execute complex tasks on their own. Both models combine multimodal perception, language processing, and motor control with an internal decision-making system.


Plan first, then act

Gemini Robotics-ER 1.5 serves as a high-level "brain" for robots. It handles task planning, uses digital tools like Google Search, communicates in natural language, and monitors progress and success rates. According to Google Deepmind, the model delivers state-of-the-art results on 15 embodied reasoning benchmarks, including Point-Bench, ERQA, and MindCube.

The second model, Gemini Robotics 1.5, translates these plans into physical actions. Unlike previous vision-language-action models, it reasons before acting: it builds internal logic chains, plans intermediate steps, breaks down complex tasks, and can explain its decisions. For example, when sorting laundry, the model identifies the goal - such as "light-colored clothes in the white bin" - then plans the grip and executes the movement.
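To make that division of labor concrete, the sketch below shows what such a plan-first, then-act loop could look like in code. It is purely illustrative: the Subtask class and the plan_subtasks and execute_subtask functions are hypothetical stand-ins for the roles of the two models, not part of any Google API.

```python
# Purely illustrative sketch of the "plan first, then act" split described
# in the article. Nothing here is Google's API; the planner and executor
# are stubbed out so the control loop itself can run.

from dataclasses import dataclass


@dataclass
class Subtask:
    description: str         # e.g. "pick up the white shirt"
    success_criterion: str   # e.g. "shirt is inside the white bin"


def plan_subtasks(goal: str) -> list[Subtask]:
    """Planner role (Gemini Robotics-ER 1.5 in the article): decompose the
    goal into ordered steps. A fixed plan stands in for the model here."""
    return [
        Subtask("locate light-colored clothes", "all light items identified"),
        Subtask("pick up the white shirt", "shirt held in gripper"),
        Subtask("place the shirt in the white bin", goal),
    ]


def execute_subtask(step: Subtask) -> bool:
    """Executor role (Gemini Robotics 1.5 in the article): reason about the
    step, plan the grasp, run the motion, report success. Stubbed here."""
    print(f"executing: {step.description}")
    return True  # a real system would verify step.success_criterion


def run(goal: str = "light-colored clothes in the white bin") -> None:
    for step in plan_subtasks(goal):
        if not execute_subtask(step):
            print(f"failed: {step.description}; the planner would re-plan here")
            break


if __name__ == "__main__":
    run()
```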

Adapts to different robot platforms

Both models can generalize their abilities across different robot types. Google says movement patterns learned with the ALOHA 2 robot also work on platforms like Apptronik's Apollo or the two-armed Franka robot, with no extra fine-tuning required.


The models include built-in safety checks. Before executing an action, Gemini Robotics 1.5 checks if the move is safe and can trigger features like collision avoidance if needed.

Both models are based on the broader Gemini multimodal family and have been specifically adapted for robotics. Gemini Robotics-ER 1.5 is now available through the Gemini API in Google AI Studio, while Gemini Robotics 1.5 is currently limited to select partners. More technical details are available in Deepmind's developer blog.
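Because Gemini Robotics-ER 1.5 sits behind the regular Gemini API, a first experiment could look roughly like the sketch below. It assumes the google-genai Python SDK, an API key in the GEMINI_API_KEY environment variable, and the preview model ID gemini-robotics-er-1.5-preview; the exact model name and prompt conventions should be checked against Deepmind's developer documentation.

```python
# Rough sketch of querying Gemini Robotics-ER 1.5 through the Gemini API.
# Assumes the google-genai SDK (pip install google-genai), an API key in the
# GEMINI_API_KEY environment variable, and the preview model ID shown below;
# verify the current identifier and prompt format against Google's docs.

from google import genai
from google.genai import types

client = genai.Client()  # picks up GEMINI_API_KEY from the environment

with open("workbench.jpg", "rb") as f:  # any scene image from the robot's camera
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed preview identifier
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "List the light-colored garments in this scene and describe, step by "
        "step, how to move them into the white bin.",
    ],
)

print(response.text)  # the model replies with a natural-language plan
```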

Google first introduced the Gemini Robotics family in March 2025 to give robots multimodal understanding. In June, the company followed up with Gemini Robotics On-Device, a local version optimized for fast adaptation and robust dexterity on robotic hardware.

The newest models extend these advances with stronger planning, better tool use, and the capacity to operate as agentic systems, mirroring progress made with agentic AI on computers.

Summary
  • Google Deepmind has introduced Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, two new AI models that allow robots to plan, understand, and carry out complex tasks on their own.
  • Gemini Robotics-ER 1.5 serves as the planning "brain," using digital tools like Google Search, communicating in natural language, and evaluating progress and chances of success.
  • Gemini Robotics 1.5 carries out these instructions, thinks before acting, plans intermediate steps, and can explain its decisions, with both models able to generalize across different robot types.