
Dreamfusion combines Google's large AI image model Imagen with NeRF's 3D capabilities.

Dreamfusion is the evolution of Dream Fields, a generative 3D AI system that Google unveiled in late 2021. For Dream Fields, Google combined OpenAI's image analysis model CLIP with Neural Radiance Fields (NeRF), which allow a neural network to store 3D models.

Dream Fields leveraged NeRF's ability to generate 3D views and combined it with CLIP's ability to evaluate the content of images. After a text input, an untrained NeRF model renders a view from a single random viewpoint, and CLIP evaluates how well that view matches the text. This feedback serves as a correction signal for the NeRF model. The process is repeated up to 20,000 times from different viewpoints until a 3D model matching the text description emerges. Dreamfusion further develops this approach.
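The render, score, and correct loop described above can be illustrated with a deliberately tiny toy. This is only a sketch under strong assumptions, not Google's implementation: the "scene" is a single 3D point instead of a NeRF, and the scorer is a hypothetical stand-in for CLIP that simply knows what the ideal 2D view looks like from each angle. The names `render` and `optimize` are made up for this example. What it does show is that feedback on 2D views from many random viewpoints is enough to recover a 3D quantity.

```python
import math
import random


def render(scene, angle):
    """Orthographic 2D view of one 3D point, camera rotated by `angle` around the y-axis."""
    x, y, z = scene
    return (x * math.cos(angle) + z * math.sin(angle), y)


def optimize(target_scene, steps=2000, lr=0.1, seed=0):
    """Toy loop: sample a viewpoint, render, score the 2D view, correct the 3D scene."""
    rng = random.Random(seed)
    scene = [0.0, 0.0, 0.0]  # "untrained" starting point
    for _ in range(steps):
        angle = rng.uniform(0.0, 2.0 * math.pi)  # a new random viewpoint each step
        u, v = render(scene, angle)
        # Stand-in for the image critic (assumption): it only ever judges 2D views.
        target_u, target_v = render(target_scene, angle)
        err_u, err_v = u - target_u, v - target_v
        # Gradient of err_u**2 + err_v**2 with respect to the scene parameters.
        scene[0] -= lr * 2.0 * err_u * math.cos(angle)
        scene[1] -= lr * 2.0 * err_v
        scene[2] -= lr * 2.0 * err_u * math.sin(angle)
    return scene


if __name__ == "__main__":
    print(optimize((0.5, -0.2, 0.8)))
```

A single viewpoint cannot pin down the point's depth; only the accumulation of corrections across many viewpoints does, which is why the real system also keeps changing the camera position.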

From 2D images to 3D models

Dreamfusion performs text-to-3D synthesis based on Imagen, Google's pre-trained 2D text-to-image diffusion model. For Dreamfusion, Google is replacing OpenAI's CLIP, which can also be used for 3D generation, with a new loss based on Imagen. According to Google, this loss "could enable many new applications of pre-trained diffusion models."


As a result, 3D generation does not require training on 3D data, which is not available at the required scale. Instead, Dreamfusion learns the 3D representation from 2D images of an object that Imagen generates from different perspectives. The research team used view-dependent prompts such as "front view" or "rear view" for this purpose. The process runs automatically.
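How such view-dependent prompts might be chosen automatically can be sketched in a few lines. This is a minimal illustration, not the research team's code: the function name `view_dependent_prompt`, the "side view" and "overhead view" labels, and all angle thresholds are assumptions made for this example. Only the idea of appending phrases like "front view" or "rear view" to the text comes from the description above.

```python
def view_dependent_prompt(prompt, azimuth_deg, elevation_deg):
    """Append a view phrase to `prompt` based on the camera position.

    Angle thresholds are illustrative assumptions, not values from the paper.
    """
    if elevation_deg > 60.0:
        view = "overhead view"  # camera looks down on the object
    else:
        azimuth = azimuth_deg % 360.0  # normalize to [0, 360)
        if azimuth < 45.0 or azimuth >= 315.0:
            view = "front view"
        elif 135.0 <= azimuth < 225.0:
            view = "rear view"
        else:
            view = "side view"
    return f"{prompt}, {view}"


if __name__ == "__main__":
    for azimuth in (0, 90, 180, 270):
        print(view_dependent_prompt("a DSLR photo of a squirrel", azimuth, 15))
```

Tying the prompt to the camera angle means the image model is asked for the side of the object that the 3D model is currently showing.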


Compared to Dream Fields, Dreamfusion creates higher-quality, re-lightable 3D objects with depth and surface normals from text input. Multiple 3D models created with Dreamfusion can also be merged into one scene.


"Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors," Google's research team writes.


Exporting generated 3D models for standard 3D tools

The generated NeRF models can be exported into meshes using the Marching Cubes algorithm and then integrated into popular 3D renderers or modeling software.
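To give a feel for what Marching Cubes does with such a model, the sketch below covers only its first step: sampling a density field on a grid and classifying each grid cell by which of its eight corners lie inside the surface. It is an illustration under assumptions, not a mesh exporter. `sphere_density` is a hypothetical stand-in for querying a trained NeRF, and the grid size and threshold are arbitrary. The full algorithm would go on to use each 8-bit case index to look up a triangle configuration in a 256-entry table.

```python
import math

# Corner offsets of one grid cell (a cube), in a fixed order.
CORNERS = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]


def sphere_density(x, y, z, radius=0.6):
    """Hypothetical stand-in for a NeRF density query: positive inside a sphere."""
    return radius - math.sqrt(x * x + y * y + z * z)


def cube_index(density, i, j, k, n, level=0.0):
    """8-bit case index of cell (i, j, k): bit c is set if corner c is inside the surface."""
    index = 0
    for bit, (di, dj, dk) in enumerate(CORNERS):
        # Map integer grid coordinates to the cube [-1, 1]^3.
        x = -1.0 + 2.0 * (i + di) / n
        y = -1.0 + 2.0 * (j + dj) / n
        z = -1.0 + 2.0 * (k + dk) / n
        if density(x, y, z) > level:
            index |= 1 << bit
    return index


def surface_cells(density, n=16, level=0.0):
    """Return (i, j, k, case_index) for every cell the surface passes through."""
    cells = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                index = cube_index(density, i, j, k, n, level)
                if index not in (0, 255):  # 0: fully outside, 255: fully inside
                    cells.append((i, j, k, index))
    return cells


if __name__ == "__main__":
    print(len(surface_cells(sphere_density)), "cells contain part of the surface")
```

Only the cells returned by `surface_cells` would contribute triangles to the exported mesh; cells that are entirely inside or entirely outside the object are skipped.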

"We're excited to incorporate our methods with open-source models and enable a new future for 3D generation," wrote contributing Google Brain researcher Ben Poole on Twitter.

An overview of 3D models generated with Dreamfusion is available on GitHub.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Google's new Dreamfusion AI model can generate 3D objects from text input.
  • The 3D models are high quality, re-lightable, and exportable. They can be processed further in common 3D tools.
  • Dreamfusion generates the 3D models from 2D images produced by the generative image model Imagen. The model therefore does not require scarce 3D training data.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.