summary Summary

Google's SayCan controls robots through a large language model. IBM developer James Weaver is now applying this approach to non-player characters in the sandbox game Roblox.

Large language models such as OpenAI's GPT-3 or Google's PaLM can translate and complete texts, conduct dialogs, or generate code. Although far from all the models' emerging abilities have been explored, AI researchers are already combining them with external modules for accessing Python, with physics simulators, and using them in CICERO for diplomacy games or for inner monologue in robots.

Google's AI division combined current robotics technology with PaLM in the summer of 2022. In PaLM-SayCan, a language model converts text input from users into robot instructions, enabling follow-up questions and promoting more flexible robot behavior.

Google's SayCan uses a language model for more robot flexibility

SayCan from Google uses chain of thought prompting to generate potential steps for carrying out an instruction and then evaluates their probability of success. The robot executes the chain of actions that the language model assigns the highest probability of success.


The advantage here is that the robot accepts instructions in natural language. If someone asks for an energy-rich snack, the robot will preferentially give this person an energy bar. If this is not available, the robot will recognize an apple or a sugar drink as a suitable alternative.

Roblox demo is Google's SayCan for NPCs

IBM developer James Weaver now shows in a Roblox demo how the concept behind SayCan can be used for more flexible NPCs. Weaver uses GPT-3 for his implementation, so anyone who wants to test the demo must have an OpenAI API key ready.

In the sandbox, players can then use a chat window to give instructions to an NPC, such as, "Put the rice chips in the bowl and then put the tea on the table.

All interactive objects are uniquely named and identifiable by the NPC via natural language. | Image: Dr. Alan D. Thompson, YouTube, Roblox

All objects in the demo are simple boxes or round tables, named according to their role. The NPC follows the object names and executes the commands. For this purpose, GPT-3 converts the natural language commands into robot commands in the background, for example into

find(rice chips), pick(rice chips), find(bowl), put(rice chips), find(tea), pick(tea), find(table), put(tea), done() .

The demo is available on the Roblox website. You can find additional information on Weaver's GitHub.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
  • SayCan from Google controls robots via a large language model. This allows the robots to interact more flexibly with the environment and with humans.
  • A new Roblox demo shows how this approach can also contribute to more flexible non-player characters in games.
  • IBM developer James Weaver implements the SayCan concept in Roblox: Players can have an NPC perform object-related tasks via a text message.
  • GPT-3 automatically converts these natural language requests into robot commands.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.