A new AI system developed by Google Research and Google DeepMind transforms photos into realistic 3D scenes in a matter of seconds, as long as it knows where the camera was positioned.
The system, called Bolt3D, processes photos into complete three-dimensional scenes in just 6.25 seconds on an Nvidia H100 GPU - a task that typically takes other systems minutes or hours to complete.
Bolt3D first figures out where each pixel belongs in 3D space and what color it should be. A second model then determines how transparent each point should be and how it extends through space.
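To see why the camera position matters, it helps to look at the geometry that links a pixel to a point in space. The sketch below is a standard pinhole-camera back-projection, not Bolt3D's method: Bolt3D predicts each pixel's 3D position with a learned model, while this example assumes a depth value is already known. The function names, the intrinsics (fx, fy, cx, cy) and the sample numbers are illustrative assumptions.

```python
def unproject_pixel(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) at a given depth into camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def camera_to_world(point, rotation, translation):
    """Apply a camera-to-world rigid transform: R * p + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


# Hypothetical camera: 640x480 image, focal length 500 px,
# placed one unit along the world x-axis with no rotation.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
CAMERA_POSITION = (1.0, 0.0, 0.0)

# The center pixel, assumed to be 2 units in front of the camera.
point_cam = unproject_pixel(320, 240, 2.0, 500.0, 500.0, 320.0, 240.0)
point_world = camera_to_world(point_cam, IDENTITY, CAMERA_POSITION)
print(point_world)
```

The same pixel lands somewhere else in the world if the camera is moved or rotated, which is why points from several photos can only be placed in one shared scene when each camera's pose is known.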

The system relies on a technique called "Gaussian splatting" to store its data, organizing the 3D scene using three-dimensional Gaussian functions laid out in 2D grids. Each function tracks its position, color, transparency, and spatial extent, letting users view the scene from any angle in real time. To keep files manageable, the system strips out transparent areas and compresses the remaining data.
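A rough sketch of that storage idea: a 2D grid whose cells each hold one Gaussian, followed by a pass that drops the near-transparent ones. The field names, the `min_opacity` threshold and the demo values are assumptions made for illustration. They are not taken from the Bolt3D paper, and the compression step is left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gaussian:
    position: Tuple[float, float, float]  # 3D center of the Gaussian
    color: Tuple[float, float, float]     # RGB, each channel in [0, 1]
    opacity: float                        # 0 = fully transparent, 1 = opaque
    scale: Tuple[float, float, float]     # spatial extent along each axis


def prune_transparent(grid: List[List[Gaussian]],
                      min_opacity: float = 0.01) -> List[Gaussian]:
    """Flatten a 2D grid of Gaussians and drop the near-transparent ones."""
    return [g for row in grid for g in row if g.opacity >= min_opacity]


def make_demo_grid() -> List[List[Gaussian]]:
    """Build a 2x3 grid in which the middle column is fully transparent."""
    grid = []
    for y in range(2):
        row = []
        for x in range(3):
            row.append(Gaussian(
                position=(float(x), float(y), 1.0),
                color=(0.5, 0.5, 0.5),
                opacity=0.0 if x == 1 else 0.8,
                scale=(0.1, 0.1, 0.1),
            ))
        grid.append(row)
    return grid


kept = prune_transparent(make_demo_grid())
print(len(kept))  # 4 of the 6 Gaussians survive pruning
```

Keeping the Gaussians in image-like grids means each cell lines up with a pixel of an input or generated view, while pruning removes entries that would contribute nothing to the rendered image.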
Video: Szymanowicz et al.
Breaking new ground in 3D generation
Tests show that Bolt3D performs significantly better than existing fast methods such as Flash3D and DepthSplat. While those systems can only blur areas they can't see, Bolt3D generates realistic content for hidden parts of scenes.
This capability comes from an AI model designed specifically for handling spatial data - the researchers found that regular models trained on photos alone couldn't handle the complexities of 3D information.
To build this capability, the team trained Bolt3D on about 300,000 3D scenes, using a mix of photo-based reconstructions and computer-generated models. This extensive dataset helps the system make educated guesses about parts of scenes it can't fully see.
The system still has its limitations. It struggles with very fine details (anything less than eight pixels wide), transparent materials like glass, and highly reflective surfaces. The quality of results also depends heavily on how the photos were taken and how large the final scene needs to be.
Even with these limitations, Bolt3D appears to be a step forward in 3D content creation. The paper suggests that its speed could make large-scale 3D scene generation practical for the first time. While there's no word yet on public availability, interested users can find more information and interactive demos on the project's website.
The development comes as Stability AI releases its own SPAR3D system, which can also generate 3D objects from single images very quickly. The key difference: while SPAR3D works with individual objects, Bolt3D can handle entire scenes.