Date: 2024-11-25
Nvidia's Edify 3D turns text and images into 3D assets

Nvidia
Matthias Bastian
Nvidia announced a new AI model that creates 3D objects from text descriptions or images in about two minutes. The technology aims to transform how developers create 3D assets for games, films, and extended reality applications.

The system, called Edify 3D, produces both detailed 3D geometry and high-resolution 4K textures with physically based rendering (PBR) materials. Nvidia says these assets work directly in production environments with little or no additional processing.

The team trained Edify 3D using non-public datasets of high-resolution images, pre-rendered multi-angle views and 3D shape data. Multiple pre-processing steps ensured quality and consistency across training assets.

How it works

The system works in several steps. First, a diffusion model generates several matching views of the object from different angles. A specialized reconstruction model then merges these views into a complete 3D model.
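The two stages described above can be illustrated with a minimal Python sketch. It is not Nvidia's code or API: the names `generate_views`, `reconstruct`, and `text_to_3d`, the default of four views, and the evenly spaced camera angles are all assumptions made for illustration, and both stages are stand-ins that only pass placeholder data along.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class View:
    azimuth_deg: float  # camera angle around the object (assumed layout)
    image: str          # placeholder standing in for pixel data


@dataclass
class Asset:
    prompt: str
    source_views: List[View]  # the views the reconstruction was built from


def generate_views(prompt: str, n_views: int = 4) -> List[View]:
    """Stage 1 stand-in: a multi-view diffusion model would render images here."""
    step = 360.0 / n_views
    return [
        View(azimuth_deg=i * step, image=f"<{prompt} @ {i * step:.0f} deg>")
        for i in range(n_views)
    ]


def reconstruct(prompt: str, views: List[View]) -> Asset:
    """Stage 2 stand-in: a reconstruction model would merge the views into a 3D model."""
    if len(views) < 2:
        raise ValueError("reconstruction needs at least two views")
    return Asset(prompt=prompt, source_views=views)


def text_to_3d(prompt: str, n_views: int = 4) -> Asset:
    """Chain the two stages: text -> consistent views -> 3D asset."""
    views = generate_views(prompt, n_views)
    return reconstruct(prompt, views)


asset = text_to_3d("steampunk robot")
```

The point of the sketch is the data flow: the reconstruction step never sees the text prompt's raw output directly, only a set of views that are meant to be consistent with one another.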

Six PBR renderings: backpack, phonograph, robotic arm, knight armor, pilot seat, and adobe house in isometric view.
A collection of six PBR (Physically Based Rendering) renderings generated with Edify 3D. The models appear to be of higher quality than those from previous methods and align well with their prompts. | Image: Nvidia

A key feature of Edify 3D is that it produces quad meshes: 3D models built from four-sided faces, with clean structures and UV maps ready for 3D artists to modify. The output appears more refined than that of other 3D creation tools currently available.
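For readers unfamiliar with the term, here is a small Python sketch of what a quad mesh with a UV map looks like as data. It is a generic illustration, not Edify 3D's output format: the single-square mesh and the `triangulate` helper are assumptions chosen to keep the example short.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]
Quad = Tuple[int, int, int, int]
Triangle = Tuple[int, int, int]

# One square face in the XY plane.
vertices: List[Vec3] = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]

# UV map: one 2D texture coordinate per vertex.
uvs: List[Vec2] = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Each face lists four vertex indices, which is what makes this a quad mesh.
quads: List[Quad] = [(0, 1, 2, 3)]


def triangulate(faces: List[Quad]) -> List[Triangle]:
    """Split every quad into two triangles along the a-c diagonal."""
    triangles: List[Triangle] = []
    for a, b, c, d in faces:
        triangles.append((a, b, c))
        triangles.append((a, c, d))
    return triangles


triangles = triangulate(quads)
```

Converting quads to triangles is straightforward, as shown, while recovering clean quads from an arbitrary triangle mesh is much harder, which is one reason artists tend to prefer quad output for editing.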

Flowchart: AI pipeline for generating a 3D steampunk robot character using diffusion, ControlNet, and 3D reconstruction.
Edify 3D uses various AI models for image generation, 3D reconstruction and texturing to go from a textual description to a finished 3D asset. | Image: Nvidia

Beyond single objects

The technology goes beyond creating individual objects. Nvidia shows how Edify 3D can generate complete 3D environments by creating multiple objects from a textual description and combining them into cohesive scenes.

The researchers emphasize that the generated assets can be integrated into existing 3D workflows and are suitable for applications such as video games, augmented reality and film production. Nvidia's technical report does not make clear whether or when Edify 3D will be publicly available.

