LATTE3D generates 3D models almost in real time

Nvidia's LATTE3D turns text input into detailed 3D objects in less than a second, making it the fastest generative AI model for 3D content available today.

LATTE3D can generate three-dimensional representations of objects and animals from text input in less than a second. The model was developed at Nvidia's AI lab in Toronto under the direction of Sanja Fidler, vice president of AI research. The ideas behind it could significantly accelerate design and development in the video game industry, advertising, and other fields.

A year ago, comparable AI models took an hour to produce 3D visualizations of this quality. Today, the fastest models have reduced this time to a few minutes, sometimes less than a minute at medium quality. With LATTE3D, this young technology now achieves near real-time 3D generation.

Comprehensive pretraining enables the speed of LATTE3D

Like other models, LATTE3D generates objects in two steps. In the first step, a rough 3D shape is created from the text. In the second step, this shape is refined to add details and textures. This split allows for efficient and detailed generation of 3D models.

LATTE3D gets its speed from training on a large number of tasks simultaneously. The model learns to recognize general patterns and structures that enable it to respond more quickly to new, similar tasks. The team uses 3D datasets as well as prompts generated by ChatGPT so that the model learns, for example, that prompts for different dog breeds all start from the same basic shape.

This means that LATTE3D does not have to start from scratch with each prompt, but can draw on the basic understanding it has acquired during training. In effect, the team shifts where the computing power is spent: instead of spending several minutes on inference for each prompt, more time is invested in training.
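
The trade-off between inference and training compute can be illustrated with a deliberately tiny toy problem. The sketch below is not LATTE3D's code and says nothing about its architecture. Everything in it is a hypothetical stand-in: a "prompt" is a single number, the "ideal shape" is one parameter given by a made-up function `target`, and the "model" is a two-parameter linear map. It only shows the general pattern: optimizing from scratch for every prompt costs many steps at inference time, while a model trained once across many prompts answers a new, similar prompt in a single evaluation.

```python
# Toy illustration only: all names and numbers here are hypothetical.

def target(prompt_feature):
    # Stand-in for "the ideal shape parameter for this prompt".
    return 3.0 * prompt_feature + 1.0

def optimize_per_prompt(prompt_feature, steps=200, lr=0.1):
    """Start from scratch for each prompt: many gradient steps at inference."""
    y = 0.0
    for _ in range(steps):
        grad = 2.0 * (y - target(prompt_feature))  # d/dy of (y - target)^2
        y -= lr * grad
    return y, steps

def train_amortized(train_prompts, epochs=500, lr=0.05):
    """Pay the cost once: fit a shared model w*x + b across many prompts."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x in train_prompts:
            err = (w * x + b) - target(x)
            w -= lr * 2.0 * err * x
            b -= lr * 2.0 * err
    return w, b

def infer_amortized(model, prompt_feature):
    """Inference is a single cheap evaluation of the trained model."""
    w, b = model
    return w * prompt_feature + b

train_prompts = [i / 10 for i in range(-10, 11)]  # 21 training "prompts"
model = train_amortized(train_prompts)            # expensive, done once

new_prompt = 0.37                                 # not in the training set
slow, slow_steps = optimize_per_prompt(new_prompt)
fast = infer_amortized(model, new_prompt)

print(f"per-prompt optimization: {slow:.4f} after {slow_steps} steps")
print(f"amortized model:         {fast:.4f} after 1 evaluation")
```

Both routes reach essentially the same answer for the new prompt. The difference is when the work happens: the per-prompt optimizer repeats its 200 steps for every request, while the trained model has already paid that cost up front.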

Results obtained in seconds can be refined in minutes through further inference to obtain more detailed objects. Finished models can then be animated using other methods such as Align Your Gaussians.

More information and examples can be found on the LATTE3D project page.
