This AI understands video games, and it could make cheating easier than ever
Key Points
- Researchers have developed an AI model called VideoGameBunny that specializes in understanding video games. It is based on the open-source Bunny architecture and was trained on more than 185,000 screenshots and nearly 390,000 image-text pairs.
- In a benchmark of multiple-choice questions about video game images, VideoGameBunny reached 85.1 percent accuracy, outperforming the larger, general-purpose LLaVA model. It performed particularly well at recognizing game-specific anomalies and understanding HUD information.
- The researchers see potential for AI game assistants but are also aware of the risk of abuse for cheating. To enable further research, they have made the source code, training data, and models publicly available.
Anonymous researchers have presented a specialized AI model called VideoGameBunny in a new paper. VideoGameBunny is a vision-language model that can understand images and answer questions about video games based on screenshots. While this technology could make gaming more accessible, it also has significant potential for abuse in competitive settings.
The open-source multimodal model is based on the Bunny architecture and was trained on an extensive dataset of over 185,000 screenshots from 413 games collected from YouTube using the search term "gameplay walkthroughs." Bunny was developed by an AI research group at the Beijing Academy of Artificial Intelligence and presented in a paper in February.
Hundreds of thousands of text-image pairs for training
For training, the researchers generated nearly 390,000 image-text pairs using Gemini 1.0 Pro, Gemini 1.5 Pro, GPT-4V, LLaMA-3, and GPT-4o, including long and short captions, question-answer sets, and structured JSON descriptions of visual elements.
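To make the idea of an image-text pair concrete, here is a minimal Python sketch of how such records could be represented and written out as JSON Lines. The `ImageTextPair` class, its field names, the file path, and the sample texts are all hypothetical illustrations; the article does not describe the actual schema of the dataset.

```python
import json
from dataclasses import dataclass, asdict
from typing import Iterable


@dataclass
class ImageTextPair:
    """One training record (hypothetical schema, not the paper's format)."""
    image_path: str  # screenshot the text refers to
    kind: str        # e.g. "short_caption", "long_caption", "qa", "json"
    prompt: str      # instruction or question shown to the model
    response: str    # caption, answer, or JSON description


def to_jsonl(pairs: Iterable[ImageTextPair]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(p)) for p in pairs)


# Made-up sample records for a single screenshot.
samples = [
    ImageTextPair("shots/example_0001.jpg", "short_caption",
                  "Describe the image briefly.",
                  "A player character stands on a bridge at night."),
    ImageTextPair("shots/example_0001.jpg", "qa",
                  "How much health does the player have?",
                  "The health bar in the HUD is about half full."),
]

print(to_jsonl(samples))
```

Storing each pair as its own line keeps captions, question-answer sets, and JSON descriptions for the same screenshot in one file while letting a training script stream them one record at a time.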

In a benchmark of multiple-choice questions about video game images, VideoGameBunny achieved an accuracy of 85.1 percent, compared to 83.9 percent for LLaVA-1.6-34b, a much larger open-source model trained for general-purpose use. VideoGameBunny showed particular strengths in recognizing game-specific anomalies and understanding heads-up display (HUD) information.
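Accuracy on a multiple-choice benchmark is the share of questions for which the model picked the correct option. The following Python sketch shows that calculation under simple assumptions: the function name and the sample answers are hypothetical and are not taken from the researchers' evaluation code.

```python
from typing import Sequence


def multiple_choice_accuracy(predictions: Sequence[str],
                             answers: Sequence[str]) -> float:
    """Return the percentage of predictions that match the answer key.

    Option letters are compared case-insensitively, ignoring whitespace.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


# Made-up example: the model answers three of four questions correctly.
predicted = ["A", "c", "B", "D"]
expected = ["A", "C", "D", "D"]
score = multiple_choice_accuracy(predicted, expected)
print(f"{score:.1f} percent")  # prints "75.0 percent"
```

By this measure, the gap between 85.1 and 83.9 percent corresponds to roughly one additional correct answer per 100 questions.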
When asked whether this game scene shows any glitches or errors, only VideoGameBunny correctly answered that it does not. The unmodified Bunny model, by contrast, flagged the glowing ball in the left half of the screen, while LLaVA claimed that the download bar at the top right was stuck.

VideoGameBunny could help cheaters
To encourage further research, the researchers have made VideoGameBunny's source code, training data, and logs publicly available. In addition to the 8 billion parameter model, there is also an even smaller one with only 4 billion parameters.
Recently, there have been a number of efforts to have AI models play games on their own or assist human players with commentary. In May, Microsoft demonstrated how its Copilot can assist inexperienced players in Minecraft.
However, VideoGameBunny seems to take a more holistic approach than previous solutions due to its extensive training material. Instead of specializing in just one game, it could become a general gaming assistant.
The researchers see their model as a first step toward an AI assistant that can perform tasks such as playing, commenting on, and debugging games. However, they are also aware that they could enable cheating: "As AI models becomes more adept at understanding game contents, there is a risk that they could be used to create sophisticated cheating tools." Many such tools already exist, but models like VideoGameBunny could open up new use cases.