InstantMesh, an AI framework developed by researchers from Tencent PCG ARC Lab and ShanghaiTech University, can generate a high-quality 3D mesh from a single 2D image in just ten seconds, according to a preprint published by the research team.

The open-source framework consists of two main components: a multi-view diffusion model and a reconstruction model that builds a 3D mesh from a small number of views. The multi-view diffusion model takes a single input image and synthesizes 3D-consistent views of the object from different angles. These views then serve as input for the reconstruction model.
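
The two-stage data flow described above can be sketched in a few lines of Python. This is only an illustration of how the stages connect, not the InstantMesh code: the names `View`, `Mesh`, `synthesize_views`, `reconstruct_mesh`, and `image_to_mesh` are hypothetical, the default of six views is an arbitrary choice, and both stages are trivial stand-ins for the actual neural networks.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class View:
    azimuth_deg: float  # camera angle around the object, in degrees
    image: object       # placeholder for a synthesized 2D image


@dataclass
class Mesh:
    vertices: List[Tuple[float, float, float]]
    faces: List[Tuple[int, int, int]]  # indices into `vertices`


def synthesize_views(image: object, num_views: int = 6) -> List[View]:
    """Stage 1 stand-in: a real diffusion model would render a new image
    per angle; here the input is simply reused for evenly spaced angles."""
    return [View(360.0 * i / num_views, image) for i in range(num_views)]


def reconstruct_mesh(views: List[View]) -> Mesh:
    """Stage 2 stand-in: a real reconstruction model would infer geometry
    from the views; here a fixed tetrahedron is returned."""
    if not views:
        raise ValueError("at least one view is required")
    vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
    return Mesh(vertices, faces)


def image_to_mesh(image: object) -> Mesh:
    """Chain the two stages: single image -> multiple views -> mesh."""
    views = synthesize_views(image)
    return reconstruct_mesh(views)


if __name__ == "__main__":
    mesh = image_to_mesh("input.png")  # hypothetical file name
    print(len(mesh.vertices), "vertices,", len(mesh.faces), "faces")
```

The point of the sketch is the interface between the stages: the reconstruction step never sees the original image directly, only the views produced by the first step.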

In a first step, InstantMesh generates multiple 2D views of the 3D object from different perspectives. | Image: Xu et al.

InstantMesh relies on meshes instead of the triplane NeRF representation used in previous methods, resulting in smoother meshes and easier post-processing, according to the researchers. The framework achieves significantly better results than current reference methods such as TripoSR, LGM, and Stable Video 3D, both in terms of perceived quality of synthesized new views and geometric accuracy.

Image: Xu et al.

Demo available on Hugging Face

Along with the paper, the researchers have made all the code, trained model variants, and a demo available on Hugging Face. Users can choose from predefined sample images or upload their own, including both photographs and AI-generated images.

Image: Screenshot / THE DECODER

If the result is poor, the developers recommend changing the seed, which determines the multi-view perspectives and can significantly affect the quality of the 3D object.
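
Trying several seeds can also be automated if some quality measure is available. The following is a minimal sketch under that assumption: `best_over_seeds`, `fake_generate`, and `fake_score` are hypothetical names, the generator is a toy stand-in rather than a call into InstantMesh, and in practice the "score" may simply be a person looking at the result.

```python
import random
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def best_over_seeds(
    generate: Callable[[int], T],
    score: Callable[[T], float],
    seeds: Iterable[int],
) -> Tuple[int, T, float]:
    """Run `generate` once per seed and keep the highest-scoring result."""
    best: Optional[Tuple[int, T, float]] = None
    for seed in seeds:
        result = generate(seed)
        value = score(result)
        if best is None or value > best[2]:
            best = (seed, result, value)
    if best is None:
        raise ValueError("seeds must not be empty")
    return best


def fake_generate(seed: int) -> float:
    """Toy stand-in for a seeded generation run: deterministic per seed."""
    return random.Random(seed).random()


def fake_score(result: float) -> float:
    """Toy stand-in for a quality measure: higher is better."""
    return result


if __name__ == "__main__":
    seed, result, value = best_over_seeds(fake_generate, fake_score, range(5))
    print("best seed:", seed)
```

Because each run is fully determined by its seed, a good result can be reproduced later by reusing the seed that produced it.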

Plans for InstantMesh include increasing the resolution of the generated 3D meshes and using more advanced multi-view diffusion architectures to further improve consistency between views.

Technologies like these could significantly increase productivity in the 3D industry, especially in video game development. It remains an open question, however, whether these models can improve to the point where they can be used without extensive post-processing, which, in their current state, could result in more work.

Summary
  • Researchers from Tencent and ShanghaiTech University have developed InstantMesh, an open-source framework that can generate high-quality 3D meshes from single images in seconds.
  • The architecture of InstantMesh consists of a multi-view diffusion model, which synthesizes 3D-consistent views from different angles based on a single input image, and a reconstruction model that turns these views into a 3D mesh.
  • In experiments, InstantMesh outperforms comparable current methods. All code, trained model variants, and a demo are available on Hugging Face. The technology could boost productivity, especially in video game development.
Jonathan works as a freelance tech journalist for THE DECODER, focusing on AI tools and how GenAI can be used in everyday work.