Nvidia announced a new AI model that creates 3D objects from text descriptions or images in about two minutes. The technology aims to transform how developers create 3D assets for games, films, and extended reality applications.
The system, called Edify 3D, produces both detailed 3D geometry and 4K textures with physically based rendering (PBR) materials. Nvidia says these assets can be used directly in production environments with little or no additional processing.
The team trained Edify 3D using non-public datasets of high-resolution images, pre-rendered multi-angle views and 3D shape data. Multiple pre-processing steps ensured quality and consistency across training assets.
How it works
The system works in stages. First, a diffusion model generates several matching views of the object from different angles. A specialized reconstruction model then merges these views into a complete 3D model.
A key feature of Edify 3D is that it produces quad meshes: 3D models built from four-sided polygons, with clean structures and UV maps that are ready for 3D artists to modify. The output appears more refined than that of other 3D creation tools currently available.
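For readers unfamiliar with the terms, here is a minimal Python sketch of what a quad mesh with UV coordinates looks like as data. It is an illustration only, not Edify 3D's output format or any Nvidia API. The names QuadMesh, triangulate and unit_plane are hypothetical, and storing one UV coordinate per vertex is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QuadMesh:
    """Hypothetical container for a quad mesh (illustration only)."""
    vertices: List[Tuple[float, float, float]]  # 3D positions
    uvs: List[Tuple[float, float]]              # one texture coordinate per vertex (assumption)
    quads: List[Tuple[int, int, int, int]]      # four vertex indices per face

    def triangulate(self) -> List[Tuple[int, int, int]]:
        """Split each quad (a, b, c, d) into two triangles along the a-c diagonal."""
        triangles = []
        for a, b, c, d in self.quads:
            triangles.append((a, b, c))
            triangles.append((a, c, d))
        return triangles


def unit_plane() -> QuadMesh:
    """A single square face whose UV map covers the whole texture."""
    vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    uvs = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    return QuadMesh(vertices, uvs, [(0, 1, 2, 3)])
```

Each face references four vertices, and each UV coordinate says where that vertex lands on the 2D texture. Renderers usually work with triangles, so a quad face is split into two of them when the mesh is drawn.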
Beyond single objects
The technology goes beyond creating individual objects. Nvidia shows how Edify 3D can generate complete 3D environments by creating multiple objects from a textual description and combining them into cohesive scenes.
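Nvidia's report is not summarized here in enough detail to show how objects are arranged, so the following Python sketch only illustrates the general idea of combining separately created objects into one scene. It assumes each object is reduced to an axis-aligned bounding box. The names Placement, overlaps and scene_bounds are hypothetical and are not part of Edify 3D.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Placement:
    """Hypothetical record for one object placed in a scene."""
    name: str
    size: Vec3      # extent of the object's bounding box along x, y, z
    position: Vec3  # minimum corner of the box in scene coordinates


def overlaps(a: Placement, b: Placement) -> bool:
    """True if the two bounding boxes intersect on all three axes."""
    return all(
        a.position[i] < b.position[i] + b.size[i]
        and b.position[i] < a.position[i] + a.size[i]
        for i in range(3)
    )


def scene_bounds(placements: List[Placement]) -> Tuple[Vec3, Vec3]:
    """Smallest axis-aligned box that contains every placed object."""
    lo = tuple(min(p.position[i] for p in placements) for i in range(3))
    hi = tuple(max(p.position[i] + p.size[i] for p in placements) for i in range(3))
    return lo, hi
```

A layout step could use a check like overlaps to reject arrangements in which two objects intersect, for example a chair placed inside a table, before the objects are merged into a single scene.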
The researchers emphasize that the generated assets can be integrated into existing 3D workflows and are suitable for applications such as video games, augmented reality and film production. Nvidia's technical report does not say whether or when Edify 3D will be publicly available.