PokéLLMon is a language model-based AI agent that can beat humans at Pokémon.
PokéLLMon combines large language models, wiki entries, and a form of reinforcement learning to create an AI agent that performs comparably to human players.
The Georgia Institute of Technology team sees the project as a test bed for developing agents that behave like humans in virtual worlds. According to the team, tactical battle games, and Pokémon battles in particular, are a suitable format because win rates are measurable and consistent opponents, whether AI or human players, are always available.
Pokémon battles are strategically challenging, requiring players to consider a wide range of factors, from the characteristics of each Pokémon to the environmental conditions of the game.
PokéLLMon reads Pokédex and learns in battle
Without assistance, even the best language models, such as GPT-4, fall far short of the human level. So the team developed a method based on three key elements:
In-Context Reinforcement Learning (ICRL)
In ICRL, PokéLLMon iteratively improves its strategy based on text-based feedback from previous battles. This feedback serves as a kind of "reward" and includes information about changes in a Pokémon's HP, the effectiveness of attacks, and the order in which moves were executed. According to the team, this allows the agent to continually refine its strategies and correct mistakes.
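To make the idea concrete, here is a minimal sketch of how such text feedback could be assembled and placed in front of the next prompt. It is not the team's implementation: the `TurnOutcome` record, its field names, the prompt wording, and the `max_turns` cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TurnOutcome:
    """Hypothetical record of what happened on one previous turn."""
    move: str
    own_hp_before: int   # HP in percent
    own_hp_after: int
    foe_hp_before: int
    foe_hp_after: int
    effectiveness: str   # e.g. "super effective"
    moved_first: bool


def feedback_text(o: TurnOutcome) -> str:
    """Turn one outcome into a plain-text 'reward' sentence."""
    order = "before" if o.moved_first else "after"
    return (
        f"Last turn you used {o.move}: it was {o.effectiveness}. "
        f"Opponent HP changed from {o.foe_hp_before}% to {o.foe_hp_after}%, "
        f"your HP changed from {o.own_hp_before}% to {o.own_hp_after}%. "
        f"You moved {order} the opponent."
    )


def build_prompt(state: str, history: list[TurnOutcome], max_turns: int = 3) -> str:
    """Prepend feedback from the most recent turns to the current state."""
    lines = [feedback_text(o) for o in history[-max_turns:]]
    return "\n".join([
        "Feedback from previous turns:",
        *lines,
        "",
        "Current state:",
        state,
        "Choose the next action.",
    ])
```

The point of the sketch is that no model weights change: the "learning" happens only through what the model reads in its context on the next turn.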
Knowledge Augmented Generation (KAG)
KAG allows PokéLLMon to incorporate external knowledge, such as type advantages and the effects of moves or abilities, into its decision-making. This knowledge comes from the Pokédex, an encyclopedia of Pokémon. The team believes that KAG reduces the problem of hallucinations.
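A rough sketch of the type-advantage part of this idea follows. The tiny `TYPE_CHART` excerpt, the function names, and the sentence format are assumptions for illustration, not the team's data or code; descriptions of move and ability effects could be looked up and appended in the same way.

```python
# Small excerpt of the type chart: (attacking type, defending type) -> multiplier.
# Pairs that are not listed are treated as neutral (1x) in this sketch.
TYPE_CHART = {
    ("Electric", "Water"): 2.0,
    ("Electric", "Flying"): 2.0,
    ("Electric", "Ground"): 0.0,
    ("Fire", "Water"): 0.5,
    ("Water", "Fire"): 2.0,
}


def effectiveness(move_type: str, defender_types: list[str]) -> float:
    """Multiply the per-type multipliers for a single- or dual-type defender."""
    mult = 1.0
    for t in defender_types:
        mult *= TYPE_CHART.get((move_type, t), 1.0)
    return mult


def knowledge_lines(moves: dict[str, str], defender_types: list[str]) -> list[str]:
    """Render looked-up facts as sentences that can be added to the prompt."""
    return [
        f"{name} ({mtype}) deals {effectiveness(mtype, defender_types):g}x "
        "damage to the opponent."
        for name, mtype in moves.items()
    ]


# Example: an Electric move against a Water/Flying opponent.
print(knowledge_lines({"Thunderbolt": "Electric"}, ["Water", "Flying"]))
```

Because the multiplier is stated in the prompt rather than recalled by the model, the agent has less room to invent a wrong type matchup.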
Consistent Action Generation (CAG)
CAG is used to mitigate the phenomenon of "panic switching", where the agent tends to generate inconsistent actions when facing a strong opponent because it wants to avoid fighting. By selecting the most consistent action as its output, the agent avoids acting rashly in a state of "panic."
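One plausible way to pick "the most consistent action" is to sample several candidate actions independently and keep the one that appears most often. The sketch below assumes that reading; `sample_action` is a hypothetical stand-in for a call to the language model, and the default of three samples is arbitrary.

```python
from collections import Counter
from typing import Callable


def consistent_action(sample_action: Callable[[], str], k: int = 3) -> str:
    """Sample k candidate actions and return the most frequent one.

    sample_action is a hypothetical stand-in for one LLM call that
    returns an action string such as "move Thunderbolt".
    """
    votes = Counter(sample_action() for _ in range(k))
    action, _count = votes.most_common(1)[0]
    return action


# Example with canned samples instead of a real model call:
samples = iter(["switch Gyarados", "move Thunderbolt", "move Thunderbolt"])
print(consistent_action(lambda: next(samples), k=3))
```

A single outlier such as a sudden switch is outvoted, which is the behaviour the article describes: one "panicked" sample does not decide the turn.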
PokéLLMon beats humans, but is inferior to good players
In online battles against human players, PokéLLMon has a 49% win rate in ladder battles and a 56% win rate in one-on-one battles. This puts the Pokémon agent on par with human players on average.
Although PokéLLMon is on par with human players in many areas, it still has weaknesses. According to the researchers, it tends to favour actions that offer short-term advantages and is susceptible to the long-term strategies of human players. It can also be tricked into unfavourable actions by the deceptive manoeuvres of experienced players. The team is now working to address these weaknesses.