
Web world models could give AI agents consistent environments to explore

Image: Feng et al.

Key Points

  • Researchers at Princeton University, UCLA, and the University of Pennsylvania have developed "Web World Models," a system that combines program code defining virtual world rules with a language model generating stories and descriptions to fill them.
  • The system replaces traditional databases with hash functions that recalculate content from fixed parameters on demand, guaranteeing consistent results each time the same content is accessed.
  • Seven demo applications demonstrate the technology in practice, ranging from a travel atlas to a science fiction universe and a card game featuring AI-generated playing cards.

Researchers at Princeton University, UCLA, and the University of Pennsylvania have developed an approach that gives AI agents persistent worlds to explore. Standard web code defines the rules, while a language model fills these worlds with stories and descriptions.

Web world models split the world into two layers. The first is pure code written in TypeScript. This code defines what exists, how things connect, and which actions are allowed. It enforces logical consistency, preventing players from walking through locked doors or spending money they don't have.

Three-column comparison: Left: “Web Framework” with server icon (limited by database, controllable); center: “Web World Model” with LLM cloud and agent figure (unlimited context, code-based, controllable); right: “World Model” with generated cityscape (unlimited context, but not controllable). Green check marks and red crosses indicate the properties.
Web frameworks offer controllability but limited context. Pure world models allow unlimited context but lose controllability. Web world models combine both: code defines the rules, and LLMs fill in the content. | Image: Feng et al.

The second layer comes from a language model. It brings the framework to life, generating environment descriptions, NPC dialogs, and aesthetic details. Crucially, the model can only work within the boundaries the code sets. When a player tries something, the code first checks whether the action is allowed. Only then does the AI describe what happens. This means the language model can't break any rules, no matter how creative it gets.
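This check-then-narrate split can be sketched in a few lines of TypeScript. Everything here is invented for illustration (room names, the `tryMove` helper, the stubbed narrator); it is not the authors' actual code, only a minimal sketch of the pattern the paper describes: code validates the action first, and the language model only describes outcomes the rules have already approved.

```typescript
// Layer 1: typed world state and rules, enforced in plain code.
interface Room { id: string; exits: Record<string, string>; locked: boolean }
interface WorldState { rooms: Record<string, Room>; player: { room: string } }

// The code alone decides whether an action is legal and mutates state.
function tryMove(state: WorldState, direction: string): { ok: boolean; reason?: string } {
  const here = state.rooms[state.player.room];
  const target = here.exits[direction];
  if (!target) return { ok: false, reason: "there is no exit that way" };
  if (state.rooms[target].locked) return { ok: false, reason: "the door is locked" };
  state.player.room = target;
  return { ok: true };
}

// Layer 2: narration happens only after the rules have spoken.
// Stubbed with a template here; a real system would call an LLM.
function narrate(state: WorldState, result: { ok: boolean; reason?: string }): string {
  if (!result.ok) return `You can't do that: ${result.reason}.`;
  return `You step into ${state.player.room}.`;
}
```

Because `narrate` never touches the state, even the most creative model output cannot unlock a door or move the player anywhere the rules forbid.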

Hash functions create "infinite" universes without storage

Perhaps the cleverest idea involves storage. How do you save a nearly infinite universe? The researchers skip storage entirely and recalculate each location from its coordinates on demand. When a player visits a particular planet, its coordinates run through a hash function, a formula that always produces the same output for the same input. That output then sets the random parameters for the language model, ensuring the planet looks identical every time.
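The mechanism can be sketched as follows. The hash and generator here (FNV-1a plus a mulberry32-style seeded PRNG) are stand-ins for whatever the authors actually use, and the planet attributes are invented; the point is only that the same coordinates always seed the same parameters, with no database in sight.

```typescript
// FNV-1a string hash: same input, same 32-bit output, every time.
function hash(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Deterministic PRNG (mulberry32) seeded by the hash.
function rng(seed: number): () => number {
  return () => {
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const TERRAINS = ["desert", "ocean", "jungle", "ice", "volcanic"];

// Recompute a planet from its coordinates on demand; nothing is stored.
function planetAt(x: number, y: number, z: number) {
  const next = rng(hash(`${x},${y},${z}`));
  return {
    terrain: TERRAINS[Math.floor(next() * TERRAINS.length)],
    gravity: +(0.5 + next() * 1.5).toFixed(2), // fixed parameters fed to the LLM
  };
}
```

Calling `planetAt(3, 4, 5)` today and next week returns an identical object, so the language model receives identical parameters and describes the same world.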


Three-step flowchart: On the left, an interactive globe with illuminated markers for selecting locations; in the middle, AI-supported generation of a travel guide and visual theme; on the right, the finished presentation of a structured itinerary for Innsbruck with daily planning.
Users select a location on the globe (1), the system uses an LLM to generate a themed travel guide (2), which appears as a structured multi-day itinerary (3). Geographic data comes from the code, descriptions from the language model. | Image: Feng et al.

A player can visit a planet, leave, come back later, and find the same planet. Not because anyone saved it, but because the math always works out the same way. The researchers call this "object permanence with no storage cost."

The system also works without AI. If the language model responds slowly or crashes, the system falls back on pre-made templates. The world loses its descriptive richness, but the rules still work. This sets the approach apart from purely generative systems, where a language model outage would take down the entire application.
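One plausible way to implement such a fallback is to race the model call against a timeout and catch any failure, as in this sketch (the timeout value, template, and `describe` helper are all assumptions, not the authors' code):

```typescript
// Pre-made template used whenever the LLM is slow or unavailable.
const TEMPLATE = (place: string) => `You arrive at ${place}.`;

async function describe(
  place: string,
  llm: (prompt: string) => Promise<string>,
): Promise<string> {
  try {
    // Race the model against a 2-second timeout.
    const timeout = new Promise<never>((_, reject) =>
      setTimeout(() => reject(new Error("llm timeout")), 2000),
    );
    return await Promise.race([llm(place), timeout]);
  } catch {
    // Model crashed or timed out: fall back to the template.
    // The world loses richness, but the rules layer keeps running.
    return TEMPLATE(place);
  }
}
```

A model outage thus degrades the experience instead of taking the application down.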

From travel guides to card games

To show the concept works across different domains, the researchers built seven applications. An "Infinite Travel Atlas" turns Earth into an explorable globe. Click anywhere, and the system generates information about places, routes, and stories. Geographic data comes from the code, and descriptions come from the language model.

A "Galaxy Travel Atlas" does the same for a fictional sci-fi universe. The code generates galaxies, star systems, and planets according to defined rules. The language model adds missions, characters, and educational content. Visitors to any planet get a briefing on terrain, sky, signals, and dangers.


A card game called "AI Spire" lets players request custom cards. Type "a fireball that does a lot of fire damage but also freezes your opponent," and the system generates a matching card. The code uses schema validation to make sure the card follows game rules and stays within limits for things like costs and card types.

Architectural diagram of a card game: On the left is the combat phase with player and opponent status, on the right is the reward phase with LLM card generation. Below that is validation through typed interfaces and a symbolic game engine that translates effect codes into rule execution and updates the game state.
After winning a battle, an LLM (Gemini 2.5 Flash) generates reward cards, either as a standard selection or based on player input. TypeScript interfaces validate the output before the game engine applies the effects. | Image: Feng et al.
Screenshot of the game interface: At the top, an elite relic “Ephemeral Might” with a collect button; below it, a text field for card requests with a wish button; at the bottom, three selectable cards (Chrono-Sync, Volatile Aura, Unleashed Fury). On the right, an attack card “Inferno Cataclysm” with a fire symbol, generated by request.
Players can collect an AI-generated elite relic, request a wish card through free text input (e.g., "a massive fireball"), or choose from three generated standard cards. The system turns natural language wishes into playable card effects. | Image: Feng et al.
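The validation step can be sketched like this. The card fields, limits, and `validateCard` function are invented for illustration (the paper only says typed interfaces and schema validation are used); the pattern is that untrusted LLM output is checked against a typed schema before the game engine applies it.

```typescript
// Typed schema the game engine expects.
interface Card {
  name: string;
  cost: number;
  type: "attack" | "skill";
  damage: number;
}

// Hypothetical game limits.
const MAX_COST = 5;
const MAX_DAMAGE = 30;

// Reject anything the LLM produced that violates the schema or limits.
function validateCard(raw: unknown): Card | null {
  if (typeof raw !== "object" || raw === null) return null;
  const c = raw as Record<string, unknown>;
  if (typeof c.name !== "string" || c.name.length === 0) return null;
  if (typeof c.cost !== "number" || c.cost < 0 || c.cost > MAX_COST) return null;
  if (c.type !== "attack" && c.type !== "skill") return null;
  if (typeof c.damage !== "number" || c.damage < 0 || c.damage > MAX_DAMAGE) return null;
  return c as unknown as Card;
}
```

A request for "a fireball that does a lot of fire damage" might tempt the model into an overpowered card, but anything exceeding the limits simply never reaches the game engine.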

Other demos include a sandbox simulation called "AI Alchemy" where elements react with each other and the AI suggests new reaction rules, a 3D planet explorer ("Cosmic Voyager") with ongoing AI commentary, a generator for Wikipedia-style articles ("WWMPedia"), and a system for generative long-form literature ("Bookshelf").

What this could mean for AI agent training

The researchers position their work as a middle ground between rigid database applications and uncontrollable generative systems. Web world models aim to combine the reliability of classic web development with the flexibility of language models.

This could prove relevant for AI agent development. Agents that perform tasks on their own need training environments that are consistent enough for meaningful learning, yet flexible enough to handle unforeseen situations.

How well the approach scales with more complex interactions remains unclear. The demos are impressive but relatively straightforward. The researchers don't show whether web world models work when many agents act at the same time or when rules need to change on the fly.

Research into training environments for AI agents is picking up steam. A recent study by Microsoft Research and US universities shows that fine-tuned LLMs can fill this role and predict environmental conditions with over 99 percent accuracy. Turing Award winner Richard Sutton sees such world models as key to experience-based learning for future AI agents.


Source: arXiv