Nvidia wants to create universal AI agents for all worlds with NitroGen
Nvidia has released a new base model for gaming agents. NitroGen is an open vision-action model trained on 40,000 hours of gameplay videos from more than 1,000 games. The researchers tapped into a previously overlooked resource: YouTube and Twitch videos with visible controller overlays, which show the buttons a player presses on screen. Using template matching and a fine-tuned SegFormer model, they extracted player inputs directly from these recordings.
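To make the template matching step more concrete, here is a minimal sketch of the general technique: slide a small reference image over a video frame and keep the position where the pixels differ least. This is not NitroGen's code. The function name `match_template`, the sum-of-absolute-differences score, and the toy pixel grids are assumptions chosen for illustration; a production pipeline would more likely use an optimized library routine on real frames.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major (hypothetical toy format)


def match_template(frame: Image, template: Image) -> Tuple[int, int, int]:
    """Return (row, col, score) of the best match.

    The score is the sum of absolute pixel differences, so lower is better
    and 0 means the window is identical to the template.
    """
    fh, fw = len(frame), len(frame[0])
    th, tw = len(template), len(template[0])
    if th > fh or tw > fw:
        raise ValueError("template must not be larger than the frame")

    best_row, best_col, best_score = 0, 0, None
    # Try every top-left position where the template fits inside the frame.
    for r in range(fh - th + 1):
        for c in range(fw - tw + 1):
            score = sum(
                abs(frame[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if best_score is None or score < best_score:
                best_row, best_col, best_score = r, c, score
    return best_row, best_col, best_score


# Toy data: a 3x3 "overlay icon" pasted into an otherwise black 6x8 frame.
icon: Image = [[9, 0, 9], [0, 9, 0], [9, 0, 9]]
frame: Image = [[0] * 8 for _ in range(6)]
for i in range(3):
    for j in range(3):
        frame[2 + i][3 + j] = icon[i][j]

row, col, score = match_template(frame, icon)
print(f"overlay found at row={row}, col={col}, score={score}")
```

Once an overlay has been located this way, a second step still has to read which buttons are pressed inside that region; according to the article, a fine-tuned SegFormer model is involved in that extraction.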
NitroGen builds on Nvidia's GR00T N1.5 robotics model. According to the researchers, it's the first model to demonstrate that robotics foundation models can work as universal agents across virtual environments with different physics engines and visual styles. The model handles a range of genres, including action RPGs, platformers, and roguelikes. When fine-tuned on unfamiliar games, it achieves up to a 52 percent relative improvement in task success rates over models trained from scratch.
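A relative figure like "52 percent better" is easy to misread as a gain of 52 percentage points, so a short worked example may help. The success rates below are hypothetical numbers chosen to make the arithmetic clean; they are not results from the NitroGen paper, and the helper name `relative_improvement` is likewise an assumption.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline success rate must be positive")
    return (improved - baseline) / baseline


# Hypothetical success rates, NOT taken from the paper.
scratch_rate = 0.25      # model trained from scratch
pretrained_rate = 0.38   # model initialized from a pretrained checkpoint

gain = relative_improvement(scratch_rate, pretrained_rate)
print(f"relative improvement: {gain:.0%}")
print(f"absolute improvement: {(pretrained_rate - scratch_rate) * 100:.0f} percentage points")
```

With these made-up numbers, a 52 percent relative improvement corresponds to only 13 percentage points of absolute gain, which is why the baseline matters when reading such claims.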
The team, which includes researchers from Nvidia, Stanford, Caltech, and other universities, has made the dataset, model weights, paper, and code publicly available.