Voyager uses GPT-4 to guide a learning Minecraft agent through the pixel world. Instead of reinforcement learning, Voyager relies on code generation.
Researchers from Nvidia, Caltech, UT Austin, Stanford, and ASU introduce Voyager, which they describe as the first LLM-powered lifelong learning agent in Minecraft. Unlike other Minecraft agents, which typically rely on classic reinforcement learning techniques, Voyager uses GPT-4 to continuously improve itself. It does this by writing, improving, and transferring code stored in an external skill library.
This results in small programs that help navigate, open doors, mine resources, craft a pickaxe, or fight a zombie. "GPT-4 unlocks a new paradigm," says Nvidia researcher Jim Fan, who advised the project. In this paradigm, "training" is the execution of code and the "trained model" is the code base of skills that Voyager iteratively assembles.
Voyager consists of three main components:
- An iterative prompting mechanism that incorporates feedback from the game, execution errors, and self-checking to refine programs.
- A skill library with code for storing and retrieving complex behaviors.
- An automated curriculum to maximize exploration.
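How the first two components could interact can be shown in a few lines of Python. This is a minimal sketch and not Voyager's actual code: the names `SkillLibrary`, `refine_until_success`, `fake_generate`, and `fake_execute` are hypothetical, the word-overlap retrieval is a simple stand-in for whatever similarity search a real system would use, and the two "fake" functions take the place of GPT-4 and the game.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SkillLibrary:
    """Stores working programs under a short task description (hypothetical design)."""
    skills: Dict[str, str] = field(default_factory=dict)

    def add(self, description: str, code: str) -> None:
        self.skills[description] = code

    def retrieve(self, task: str, top_k: int = 2) -> List[str]:
        # Toy relevance score: number of words shared with the task description.
        task_words = set(task.lower().split())

        def overlap(description: str) -> int:
            return len(task_words & set(description.lower().split()))

        ranked = sorted(self.skills, key=overlap, reverse=True)
        return [self.skills[d] for d in ranked[:top_k] if overlap(d) > 0]


def refine_until_success(
    task: str,
    generate: Callable[[str, List[str], str], str],
    execute: Callable[[str], Tuple[bool, str]],
    library: SkillLibrary,
    max_rounds: int = 4,
) -> Optional[str]:
    """Generate code, run it, and feed the result back until the task succeeds."""
    feedback = ""
    for _ in range(max_rounds):
        code = generate(task, library.retrieve(task), feedback)
        succeeded, feedback = execute(code)
        if succeeded:
            library.add(task, code)  # only working programs become skills
            return code
    return None


# Stand-in for the language model: it reacts to the error message from the last round.
def fake_generate(task: str, related_skills: List[str], feedback: str) -> str:
    if "pickaxe" in feedback:
        return "craft_pickaxe(); mine('iron_ore')"
    return "mine('iron_ore')"


# Stand-in for the game: mining fails unless a pickaxe is crafted first.
def fake_execute(code: str) -> Tuple[bool, str]:
    if "craft_pickaxe" in code:
        return True, "mined iron ore"
    return False, "error: a pickaxe is required"


library = SkillLibrary()
result = refine_until_success("mine iron ore", fake_generate, fake_execute, library)
print(result)
```

In this toy run, the first attempt fails, the error message is passed back to the generator, and the corrected program is stored so that a later, similar task such as "mine gold ore" can retrieve it.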
Voyager Minecraft agent learns in context
To explore the diverse world of Minecraft, the team uses an automated curriculum that suggests appropriate exploration tasks based on the agent's current skills and the current state of the world. For example, the agent learns to collect sand and cactus in a desert before digging for iron.
Together, this creates an agent that is constantly learning and can perform a variety of tasks. The team runs all experiments in the MineDojo environment.
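The idea behind the curriculum, picking the next task from what the agent can already do and where it currently is, can be illustrated with a small Python sketch. It rests on strong assumptions: the task table, the `Task` fields, and `propose_next_task` are invented for illustration, and a fixed rule replaces the language model that proposes tasks in Voyager.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence, Set


@dataclass(frozen=True)
class Task:
    name: str
    biomes: FrozenSet[str]    # where the task is feasible; empty means anywhere
    requires: FrozenSet[str]  # tasks that must already be mastered


# Hypothetical task table, ordered from easy to hard. Not taken from Voyager.
TASKS = (
    Task("collect sand", frozenset({"desert"}), frozenset()),
    Task("collect cactus", frozenset({"desert"}), frozenset()),
    Task("collect wood", frozenset({"forest"}), frozenset()),
    Task("craft pickaxe", frozenset(), frozenset({"collect wood"})),
    Task("mine iron", frozenset(), frozenset({"craft pickaxe"})),
)


def propose_next_task(
    learned: Set[str], biome: str, tasks: Sequence[Task] = TASKS
) -> Optional[Task]:
    """Return the first unlearned task that fits the current biome and skills."""
    for task in tasks:
        if task.name in learned:
            continue  # already mastered
        if task.biomes and biome not in task.biomes:
            continue  # not feasible in the current surroundings
        if not task.requires <= learned:
            continue  # prerequisites still missing
        return task
    return None


first = propose_next_task(set(), "desert")
print(first.name if first else "nothing to do")
```

With this table, an agent that starts in a desert is sent to collect sand and cactus first, while mining iron is only proposed once a pickaxe has been crafted.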
Voyager can currently only build houses with human feedback.
The team compares Voyager to other language model-based agents such as ReAct, Reflexion, and Auto-GPT in Minecraft. Voyager discovered 63 unique items within 160 prompting iterations, 3.3 times more than the next best approach, the team says.
The automated search for previously unknown objects causes Voyager to travel extensively: Overall, the Minecraft agent travels more than twice the distance and visits more biomes. Auto-GPT and other methods, on the other hand, often get stuck in their local area.
The skill library built by Voyager is also compatible with Auto-GPT: With it, Auto-GPT achieves significantly better results in Minecraft, but still lags behind Voyager.
Currently, Voyager is text-only and cannot see what is happening in the block world, so it cannot build houses on its own. However, in an early experiment, the team had humans give the agent visual feedback, which allowed Voyager to learn to build houses and Nether portals, for example.