A new AI system called Skyfall-GS can generate walkable 3D city models using only standard satellite images. Unlike older methods that require expensive 3D scanners or fleets of camera cars, Skyfall-GS builds cityscapes entirely from aerial views.
Typical 3D city models built from satellite images share one big limitation: they capture only rooftops. Facades and other street-level details are missing, which leads to blurry, distorted, or blocky buildings.

Skyfall-GS solves this with a two-step process. First, it creates a rough 3D outline from satellite images. Then, an AI model fills in missing details like facades and street-level textures, similar to how an image generator completes unfinished pictures.
The name “Skyfall” describes the learning strategy: the system starts with high-altitude views and works step-by-step down to street level, refining the model as if a camera were falling from the sky.
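As a rough sketch of that two-step idea, the snippet below shows how a coarse reconstruction could be handed to an image-generation step for refinement. All function and variable names are illustrative placeholders, not the project's actual code, and the real pipeline is considerably more involved.

```python
# A minimal sketch of the two-step idea described above, not the authors' code.
# All function and variable names here are illustrative placeholders.

def fit_coarse_geometry(satellite_images):
    """Step 1: recover a rough 3D model from the satellite views alone."""
    # In Skyfall-GS this step is a 3D Gaussian splatting reconstruction;
    # here it is stubbed out as a simple dictionary.
    return {"geometry": "coarse", "views": list(satellite_images)}

def hallucinate_missing_detail(rendered_view, prompt):
    """Step 2: let an image-generation model fill in facades and textures."""
    # A real system would call a text-guided diffusion model here.
    return f"refined({rendered_view}, prompt='{prompt}')"

def build_city(satellite_images, prompt):
    scene = fit_coarse_geometry(satellite_images)
    scene["refined_views"] = [
        hallucinate_missing_detail(view, prompt) for view in scene["views"]
    ]
    return scene

city = build_city(["tile_001.png", "tile_002.png"],
                  "clear satellite image with sharp buildings")
print(city["refined_views"][0])
```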
How Skyfall-GS builds a city
Skyfall-GS combines two AI techniques. It builds the basic 3D structure with 3D Gaussian splatting, which represents a scene as a cloud of translucent, colored 3D blobs rather than a traditional polygon mesh. Then it uses diffusion models—the same kind that power popular image generators—to fill in realistic details.
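For readers unfamiliar with the representation, the sketch below shows the kind of primitive that standard 3D Gaussian splatting optimizes: each splat stores a position, shape, orientation, opacity, and color. This reflects the general 3DGS formulation, not Skyfall-GS's specific implementation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianSplat:
    position: np.ndarray   # 3D center of the splat (x, y, z)
    scale: np.ndarray      # per-axis extent of the ellipsoid
    rotation: np.ndarray   # orientation as a unit quaternion (w, x, y, z)
    opacity: float         # how strongly the splat occludes what lies behind it
    color: np.ndarray      # color, often stored as spherical-harmonic coefficients

# A scene is simply a large collection of such splats whose parameters are
# optimized until rendering them reproduces the input images.
scene = [
    GaussianSplat(
        position=np.random.uniform(-1, 1, size=3),
        scale=np.full(3, 0.05),
        rotation=np.array([1.0, 0.0, 0.0, 0.0]),
        opacity=0.8,
        color=np.random.rand(3),
    )
    for _ in range(1_000)
]
print(len(scene), "splats")
```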

The process runs in five passes. With each pass, the virtual camera drops to a lower viewing angle, stepping down from 85 degrees to 45 degrees over the course of training. The AI generates 54 different views per pass, using text prompts to guide the improvements.
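A camera schedule like the one described could be laid out along these lines. The even spacing of elevation angles and the circular azimuth layout are assumptions for illustration; the article only gives the endpoints, the number of passes, and the number of views.

```python
# Sketch of the coarse-to-fine camera schedule described above:
# five passes, elevation dropping from 85 to 45 degrees, 54 views per pass.
# The exact angle spacing and azimuth layout are assumptions for illustration.
import numpy as np

num_passes = 5
elevations = np.linspace(85, 45, num_passes)            # degrees above the horizon
views_per_pass = 54
azimuths = np.linspace(0, 360, views_per_pass, endpoint=False)

for pass_idx, elevation in enumerate(elevations, start=1):
    # Each pass renders the current 3D model from 54 viewpoints at this
    # elevation, refines the renders with the diffusion model, and feeds
    # the refined images back into the Gaussian splatting optimization.
    cameras = [(elevation, az) for az in azimuths]
    print(f"pass {pass_idx}: elevation {elevation:.0f} deg, {len(cameras)} views")
```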

These prompts tell the AI what to fix, turning a “satellite image of an urban area with distorted areas and blurring artifacts” into a “clear satellite image with sharp buildings, smooth edges, and natural lighting.”
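Written out as a simple configuration, the prompt pair quoted above might be handed to a text-guided image-to-image step roughly like this. The dictionary structure and key names are illustrative, not taken from the project.

```python
# The prompt pair quoted above, arranged as a configuration dict the way a
# text-guided image-to-image refinement step might consume it.
refinement_prompts = {
    # Describes what the current renders look like (the problem).
    "source": "satellite image of an urban area with distorted areas "
              "and blurring artifacts",
    # Describes what the diffusion model should turn them into (the target).
    "target": "clear satellite image with sharp buildings, smooth edges, "
              "and natural lighting",
}
```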

Outperforming older city modeling methods
Researchers tested Skyfall-GS with real satellite images from Jacksonville, Florida, and New York City. The system consistently beat previous methods, creating more realistic buildings and cleaner textures.

In a user study with 89 participants, Skyfall-GS was rated best in 97 percent of comparisons for both geometry and overall quality.
It’s also fast: Skyfall-GS runs at 11 frames per second on a standard graphics card and up to 40 frames per second on a MacBook Air. For comparison, CityDreamer, a previous system, manages just 0.18 frames per second on more expensive hardware.
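Put in terms of per-frame time, the quoted numbers work out as follows; this is simple arithmetic on the figures above, nothing more.

```python
# Quick back-of-the-envelope comparison of the frame rates quoted above.
fps_skyfall_gpu = 11      # frames per second on a standard graphics card
fps_citydreamer = 0.18    # frames per second reported for CityDreamer

print(f"Skyfall-GS frame time:  {1000 / fps_skyfall_gpu:.0f} ms")   # ~91 ms
print(f"CityDreamer frame time: {1000 / fps_citydreamer:.1f} ms")   # ~5555.6 ms
print(f"Speedup: ~{fps_skyfall_gpu / fps_citydreamer:.0f}x")        # ~61x
```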
What this could mean for games, film, and robotics
Skyfall-GS could be useful in a range of areas. Game developers might use it to create city environments more efficiently. Film productions could generate digital backgrounds, and robotics teams may find it helpful for simulating real-world spaces.
There’s a large amount of satellite data available: for example, WorldView-3 collects around 680,000 square kilometers of imagery per day at a resolution of up to 31 centimeters per pixel. This could make large-scale automated 3D modeling possible.
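For a sense of scale, those two figures imply on the order of trillions of pixels of imagery per day. The calculation below is simple arithmetic on the quoted values, not an official specification.

```python
# Rough sense of scale for the WorldView-3 figures quoted above.
area_km2_per_day = 680_000          # collection capacity per day
pixel_size_m = 0.31                 # best-case ground resolution, 31 cm

area_m2 = area_km2_per_day * 1_000_000
pixels_per_day = area_m2 / (pixel_size_m ** 2)
print(f"{pixels_per_day:.2e} pixels per day")   # roughly 7.1e12, i.e. ~7 trillion
```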
The researchers acknowledge that Skyfall-GS still needs a lot of computing power and doesn’t always handle highly detailed street scenes well. They aim to improve performance and scalability in future versions. The code is available as open source on GitHub, and demos can be found on the project website.