Dreamfusion: Google AI creates 3D models from text

Dreamfusion combines Google's large AI image model Imagen with NeRF's 3D capabilities.

Dreamfusion is the evolution of Dream Fields, a generative 3D AI system that Google unveiled in late 2021. For Dream Fields, Google combined OpenAI's image analysis model CLIP with Neural Radiance Fields (NeRF), which allow a neural network to store 3D models.

Dream Fields leveraged NeRF's ability to generate 3D views and combined it with CLIP's ability to evaluate the content of images. Given a text prompt, an untrained NeRF model renders a view from a single random viewpoint, which CLIP then evaluates against the text. The feedback is used as a correction signal for the NeRF model. This process is repeated up to 20,000 times from different viewpoints until a 3D model matching the text description is generated. Dreamfusion further develops this approach.
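The render, score, and correct loop described above can be sketched in a few lines of Python. This is only a toy illustration of the loop structure, not Google's implementation: `render` and `score` are hypothetical stand-ins for the NeRF renderer and the CLIP text-image score, the viewpoint ranges are assumptions, and a finite-difference gradient replaces backpropagation.

```python
import random


def sample_viewpoint(rng):
    """Pick a random camera direction (azimuth, elevation) in degrees."""
    return rng.uniform(0.0, 360.0), rng.uniform(-30.0, 30.0)


def optimize_scene(params, render, score, steps=200, lr=0.1, eps=1e-3, seed=0):
    """Repeatedly render from a random viewpoint, score the image,
    and nudge the scene parameters toward a higher score."""
    rng = random.Random(seed)
    params = list(params)
    for _ in range(steps):
        view = sample_viewpoint(rng)
        base = score(render(params, view))
        # Finite differences stand in for the gradient a real system
        # would obtain by backpropagating through renderer and scorer.
        grad = []
        for i in range(len(params)):
            bumped = params[:]
            bumped[i] += eps
            grad.append((score(render(bumped, view)) - base) / eps)
        params = [p + lr * g for p, g in zip(params, grad)]
    return params


# Toy stand-ins (assumptions): the "image" is just the parameter list,
# and the "score" is highest when it matches a fixed target.
TARGET = [0.2, 0.8]


def toy_render(params, view):
    return list(params)


def toy_score(image):
    return -sum((p - t) ** 2 for p, t in zip(image, TARGET))


result = optimize_scene([0.0, 0.0], toy_render, toy_score)
```

In the toy setup the parameters drift toward `TARGET`; in the real system the same loop shape drives a NeRF toward renderings that match the text.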

From 2D images to 3D models

Dreamfusion performs text-to-3D synthesis based on Imagen, Google's pretrained 2D text-to-image diffusion model. For Dreamfusion, Google replaces OpenAI's CLIP, which can also be used for 3D generation, with a new loss based on Imagen, which Google says "could enable many new applications of pre-trained diffusion models."
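The article does not spell out this loss. In the Dreamfusion paper it is called Score Distillation Sampling (SDS), and its gradient can be written roughly as follows. Here θ denotes the NeRF parameters, x = g(θ) the image rendered from a random viewpoint, y the text prompt, t the noise level, w(t) a weighting function, and ε̂_φ the noise predicted by the frozen diffusion model; the notation follows the paper as best recalled and should be checked against it.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}\bigl(\phi,\, x = g(\theta)\bigr)
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(z_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad z_t = \alpha_t\, x + \sigma_t\, \epsilon,
\quad \epsilon \sim \mathcal{N}(0, I)
```

Read informally: the rendered image is noised, the diffusion model predicts that noise given the text, and the difference between predicted and actual noise tells the NeRF how to change its rendering. The diffusion model itself is not updated.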

Because this loss relies only on a pretrained 2D image model, 3D generation does not require training on 3D data, which is not available at the required scale. Instead, Dreamfusion learns the 3D representation using 2D images of an object generated with Imagen from different perspectives. The research team used view-dependent prompts such as "front view" or "rear view" for this purpose. The process runs automatically.
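A minimal sketch of how such view-dependent prompts might be built from the camera position. The function name, the angle thresholds, and the "side view" and "overhead view" labels are assumptions for illustration; the article only mentions "front view" and "rear view".

```python
def view_prompt(prompt, azimuth_deg, elevation_deg=0.0):
    """Append a view label to the prompt based on the camera angles.

    Azimuth 0 is assumed to face the front of the object.
    All thresholds are illustrative, not taken from the source.
    """
    azimuth = azimuth_deg % 360.0
    if elevation_deg > 60.0:
        suffix = "overhead view"
    elif azimuth < 45.0 or azimuth >= 315.0:
        suffix = "front view"
    elif 135.0 <= azimuth < 225.0:
        suffix = "rear view"
    else:
        suffix = "side view"
    return f"{prompt}, {suffix}"


front = view_prompt("a corgi wearing a top hat", azimuth_deg=10.0)
rear = view_prompt("a corgi wearing a top hat", azimuth_deg=180.0)
```

Each rendered viewpoint then gets a prompt that tells the image model which side of the object it should expect to see.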


Compared to Dream Fields, Dreamfusion creates higher-quality, relightable 3D objects from text input, complete with depth and surface normals. Multiple 3D models created with Dreamfusion can also be merged into one scene.


"Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors," Google's research team writes.


Exporting generated 3D models for standard 3D tools

The generated NeRF models can be exported into meshes using the Marching Cubes algorithm and then integrated into popular 3D renderers or modeling software.
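As a rough illustration of that export step, the sketch below samples a density field on a grid, extracts a surface with the `marching_cubes` function from scikit-image, and serializes it as Wavefront OBJ text. It assumes NumPy and scikit-image are installed. `sphere_density` is a toy stand-in for a trained NeRF's density, and the grid resolution and iso-level are arbitrary choices, not values from Google's pipeline.

```python
def sphere_density(x, y, z, radius=0.6):
    """Toy stand-in for a NeRF density: positive inside a sphere."""
    return radius - (x * x + y * y + z * z) ** 0.5


def density_to_mesh(density_fn, resolution=32, level=0.0):
    """Sample density_fn on a grid over [-1, 1]^3 and extract the
    iso-surface at `level` with Marching Cubes."""
    # Imported here so the OBJ helper below works without them.
    import numpy as np
    from skimage.measure import marching_cubes

    axis = np.linspace(-1.0, 1.0, resolution)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    volume = density_fn(x, y, z)
    spacing = (axis[1] - axis[0],) * 3
    verts, faces, _normals, _values = marching_cubes(
        volume, level=level, spacing=spacing
    )
    # Vertices come back relative to the grid origin; shift to [-1, 1]^3.
    return verts - 1.0, faces


def mesh_to_obj(verts, faces):
    """Serialize vertices and triangle indices as Wavefront OBJ text."""
    lines = [f"v {x:.6f} {y:.6f} {z:.6f}" for x, y, z in verts]
    # OBJ face indices are 1-based.
    lines += ["f " + " ".join(str(int(i) + 1) for i in face) for face in faces]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    verts, faces = density_to_mesh(sphere_density)
    with open("sphere.obj", "w") as handle:
        handle.write(mesh_to_obj(verts, faces))
```

The resulting OBJ file can be opened in common 3D tools; a real export would query the trained NeRF for densities instead of the toy sphere.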

"We're excited to incorporate our methods with open-source models and enable a new future for 3D generation," wrote contributing Google Brain researcher Ben Poole on Twitter.

An overview of 3D models generated with Dreamfusion is available on GitHub.
