Generative AI for 3D models is making steady progress. The latest system comes from researchers in China, who say it is the fastest yet.
Researchers from MetaApp AI Research and several Chinese universities have developed MetaDreamer, a new tool for rapidly creating 3D models from text descriptions. The method is designed to overcome common problems in creating 3D models, such as inconsistencies from different angles and slow processing times.
To do this, the team splits the generative process into two stages. In the geometry phase, the tool shapes the 3D object so that it looks correct from all angles. In the texture phase, MetaDreamer then adds details and textures to make the object look realistic.
MetaDreamer generates a model in 20 minutes
Specifically, in the geometry phase, the MetaDreamer team optimizes a rough 3D model in Instant-NGP with a reference image generated by a diffusion model and several images generated by a multi-view diffusion model from different angles. In the second phase, the resulting model is further refined in Instant-NGP with additional AI-generated detailed images.
According to the team, this method produces higher-quality 3D models in less time. MetaDreamer can create detailed 3D objects from text in just 20 minutes on an Nvidia A100 GPU, which the researchers say is currently the fastest time in the field.
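The two-stage schedule described above can be illustrated with a toy sketch. This is not MetaDreamer's code: the parameter vectors, the mean-squared-error loss, and the names `ToyModel`, `optimize`, and `two_stage` are all assumptions made for illustration, and plain lists of numbers stand in for the Instant-NGP model and the diffusion-generated images. It only shows the ordering: geometry is fitted first against a reference view plus several multi-view targets, then texture is fitted against detail targets.

```python
from dataclasses import dataclass


@dataclass
class ToyModel:
    # Hypothetical stand-ins for the geometry and texture parameters
    # of a 3D representation; the real system uses Instant-NGP.
    geometry: list
    texture: list


def optimize(params, targets, steps, lr):
    """Gradient descent on the average MSE between params and each target."""
    params = list(params)
    n = len(params)
    for _ in range(steps):
        grad = [0.0] * n
        for target in targets:
            for i, (p, t) in enumerate(zip(params, target)):
                # d/dp of mean squared error, averaged over all targets
                grad[i] += 2.0 * (p - t) / (n * len(targets))
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def two_stage(reference_view, multi_views, detail_views, steps=200, lr=0.5):
    """Toy coarse-to-fine schedule: geometry first, then texture."""
    model = ToyModel(
        geometry=[0.0] * len(reference_view),
        texture=[0.0] * len(detail_views[0]),
    )
    # Stage 1 (geometry): supervise with the reference view and the
    # multi-view targets so the shape agrees with every viewpoint.
    model.geometry = optimize(
        model.geometry, [reference_view] + list(multi_views), steps, lr
    )
    # Stage 2 (texture): refine appearance against the detail targets.
    # Leaving geometry untouched here is an assumption of this sketch.
    model.texture = optimize(model.texture, detail_views, steps, lr)
    return model
```

Calling `two_stage` with a reference view, a list of multi-view targets, and a list of detail targets returns a `ToyModel` whose geometry has settled on a compromise between all viewpoints before any texture fitting begins, which is the point of separating the two phases.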
MetaDreamer shows jump in quality compared to older methods
In tests, the researchers compared MetaDreamer with other text-to-3D methods such as Dreamfusion and Magic3D. MetaDreamer outperformed these methods in terms of speed, quality and the extent to which the models matched the text descriptions. MetaDreamer also achieved the highest score on T3Bench, a benchmark for measuring the quality of text-to-3D models.
However, the tool is not yet perfect; for example, it struggles to create scenes with multiple objects. The team plans to solve this problem in the future by improving the model's understanding of how objects interact in 3D space.
Other examples of generative AI models for 3D include Google's Dream Fields, CLIP-Mesh, OpenAI's Point-E and Shap-E, Tencent's Dream3D, and, most recently, approaches built on 3D Gaussian Splatting. Luma AI also has a commercial offering called Genie.
More information and examples can be found on the MetaDreamer project page.