Google's new Bolt3D AI system can generate complete 3D scenes from photos in just 6 seconds

A new AI system developed by Google Research and Google DeepMind transforms photos into realistic 3D scenes in a matter of seconds, as long as it knows where the camera was positioned.

The system, called Bolt3D, processes photos into complete three-dimensional scenes in just 6.25 seconds on an Nvidia H100 GPU, a task that typically takes other systems minutes or hours to complete.

Bolt3D first figures out where each pixel belongs in 3D space and what color it should be. A second model then determines how transparent each point should be and how it extends through space.

Figure: Overview of the Bolt3D method. Multiple input images and target camera poses pass through latent diffusion models for appearance and geometry, then a VAE decoder and a geometry decoder; the resulting splatter images of Gaussians form a 3D Gaussian scene. Bolt3D combines diffusion models, VAE decoders, and trained geometry decoding to create a renderable 3D scene from images. | Image: Szymanowicz et al.

The system relies on a technique called "Gaussian splatting" to store its data, organizing the 3D scene using three-dimensional Gaussian functions laid out in 2D grids. Each function tracks position, color, transparency, and spatial information, letting users view the scene from any angle in real time. To keep files manageable, the system strips out transparent areas and compresses the remaining data efficiently.
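To make the storage idea concrete, here is a minimal sketch of Gaussians laid out in 2D grids, with the transparent ones stripped afterwards. It is not the Bolt3D code: the field names, the array layout, the random toy data, and the `min_opacity` threshold are all assumptions chosen for illustration.

```python
import numpy as np


def make_splatter_image(height, width, rng):
    """Build a toy splatter image: one Gaussian per pixel, stored as 2D grids.

    The field names and shapes are hypothetical, not Bolt3D's actual format.
    """
    return {
        "position": rng.normal(size=(height, width, 3)),           # xyz center per pixel
        "color": rng.uniform(size=(height, width, 3)),              # RGB in [0, 1]
        "opacity": rng.uniform(size=(height, width)),               # 0 = fully transparent
        "scale": rng.uniform(0.01, 0.1, size=(height, width, 3)),   # extent along each axis
    }


def prune_transparent(splats, min_opacity=0.05):
    """Drop Gaussians below an (assumed) opacity threshold.

    Returns flat arrays with one row per surviving Gaussian.
    """
    keep = splats["opacity"] >= min_opacity  # boolean mask of shape (height, width)
    return {name: grid[keep] for name, grid in splats.items()}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    splats = make_splatter_image(64, 64, rng)
    pruned = prune_transparent(splats, min_opacity=0.05)
    print("Gaussians before pruning:", splats["opacity"].size)
    print("Gaussians after pruning: ", pruned["opacity"].size)
```

Because every attribute shares the same grid shape, a single boolean mask removes a Gaussian from all fields at once, which is one plausible reason a grid layout keeps pruning and compression simple.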

Video: Szymanowicz et al.

Breaking new ground in 3D generation

Tests show Bolt3D performing significantly better than existing fast methods like Flash3D and DepthSplat. While those systems can only blur areas they can't see, Bolt3D actually generates realistic content for hidden parts of scenes.

This capability comes from an AI model designed specifically for handling spatial data. The researchers found that regular models trained on photos alone couldn't handle the complexities of 3D information.

To build this capability, the team trained Bolt3D on about 300,000 3D scenes, using a mix of photo-based reconstructions and computer-generated models. This extensive dataset helps the system make educated guesses about parts of scenes it can't fully see.

Video: Szymanowicz et al.

The system still has its limitations. It struggles with very fine details (anything less than eight pixels wide), transparent materials like glass, and highly reflective surfaces. The quality of results also depends heavily on how the photos were taken and how large the final scene needs to be.

Even with these limitations, Bolt3D appears to be a step forward in 3D content creation. The paper suggests that its speed could make large-scale 3D scene generation practical for the first time. While there's no word yet on public availability, interested users can find more information and interactive demos on the project's website.

The development comes as Stability AI releases its own SPAR3D system, which can also generate 3D objects from single images very quickly. The key difference: while SPAR3D works with individual objects, Bolt3D can handle entire scenes.
