
Stability AI has released a new image model and workflow for creating better 3D models.

The new model is called Stable Zero123 and is the latest version in the Zero123 model series. Stable Zero123 does not generate 3D models directly; rather, it is a central building block in a generative workflow that starts with a text prompt and ends with a 3D model. Specifically, Stable Zero123 takes an image of an object and generates multiple new images of that object from different viewing angles.

These multi-view images can then be used by another model, for example to condition a NeRF (neural radiance field) on them and finally generate a 3D model.
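To make "different view angles" concrete, here is a minimal sketch of how a set of target camera positions around an object could be laid out. It is an illustration only: the function name `orbit_viewpoints`, the default elevation and radius, and the even azimuth spacing are assumptions for this example, not details taken from Stability AI's release.

```python
import math
from typing import List, Tuple


def orbit_viewpoints(
    num_views: int,
    elevation_deg: float = 20.0,  # assumed default, purely illustrative
    radius: float = 1.5,          # assumed default, purely illustrative
) -> List[Tuple[float, float, float]]:
    """Return camera positions evenly spaced in azimuth around the origin.

    The object is assumed to sit at the origin with the z-axis pointing up.
    Every camera has the same elevation and the same distance to the object.
    """
    if num_views < 1:
        raise ValueError("num_views must be at least 1")

    elevation = math.radians(elevation_deg)
    positions = []
    for i in range(num_views):
        # Spread the views over a full 360 degree turn.
        azimuth = 2.0 * math.pi * i / num_views
        x = radius * math.cos(elevation) * math.cos(azimuth)
        y = radius * math.cos(elevation) * math.sin(azimuth)
        z = radius * math.sin(elevation)
        positions.append((x, y, z))
    return positions


if __name__ == "__main__":
    for position in orbit_viewpoints(4):
        print(position)
```

In a workflow like the one described above, each such viewpoint would be one request to the view-synthesis model, and the resulting images would be passed on to the 3D reconstruction step.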

Stable Zero123 was trained with a huge 3D data set

According to Stability AI, Stable Zero123 achieves significantly better results than its predecessor Zero123-XL. This is primarily due to a better training data set: the start-up filtered the Objaverse data set to keep only high-quality 3D models. During training and inference, Stable Zero123 receives not only the images but also estimated camera angles that support the model's predictions.
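As a rough illustration of what passing camera angles to such a model can look like, the sketch below encodes the change in camera pose between an input view and a target view as a small vector. The encoding (elevation difference, sine and cosine of the azimuth difference, radius difference) is modeled on the relative-pose conditioning described for the original Zero123. Whether Stable Zero123 uses exactly this form is an assumption, and the name `relative_pose_embedding` is hypothetical.

```python
import math
from typing import Tuple

# A camera pose given as (elevation in degrees, azimuth in degrees, radius).
Pose = Tuple[float, float, float]


def relative_pose_embedding(source: Pose, target: Pose) -> Tuple[float, float, float, float]:
    """Encode the pose change from the source view to the target view.

    Returns (delta_elevation_rad, sin(delta_azimuth), cos(delta_azimuth), delta_radius).
    Using sine and cosine keeps the azimuth encoding continuous across the
    0/360 degree wrap-around.
    """
    src_elev, src_azim, src_radius = source
    tgt_elev, tgt_azim, tgt_radius = target

    delta_elev = math.radians(tgt_elev - src_elev)
    delta_azim = math.radians(tgt_azim - src_azim)
    delta_radius = tgt_radius - src_radius

    return (delta_elev, math.sin(delta_azim), math.cos(delta_azim), delta_radius)


if __name__ == "__main__":
    # Same elevation and distance, camera rotated a quarter turn around the object.
    print(relative_pose_embedding((10.0, 0.0, 1.5), (10.0, 90.0, 1.5)))
```

An estimated elevation for the input image matters in a scheme like this because the pose differences are measured relative to it; a poor estimate shifts every requested target view.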

Stable Zero123 produces more consistent results than Zero123-XL. | Image: Stability AI

Together with other improvements, such as the ability to train with larger batches, Stability AI says this has led to a 40-fold increase in training efficiency compared to Zero123-XL.

Stable Zero123 plus threestudio for 3D generation

Stable Zero123 is released for research purposes only and is not intended for commercial use. Those interested in using Stability AI's 3D solutions for commercial products or purposes should contact the company directly.

To create 3D objects with Stable Zero123, the team is releasing the model with instructions on Hugging Face. Both the model and the threestudio framework are required. While the VRAM requirements for generating the new views are at the level of Stable Diffusion 1.5, generating the 3D objects takes significantly more time, and 24 gigabytes of VRAM is recommended.

Stable Zero123 is also available via the Stable 3D Private Preview for text-to-3D generation.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Stability AI releases Stable Zero123, a new image model that generates multiple images of an object from different viewpoints, which can then be further processed to create a 3D model.
  • Stable Zero123 achieves better results than its predecessor, Zero123-XL, thanks to an improved training dataset and a 40x increase in training efficiency.
  • The model is released for research purposes only and is not intended for commercial use.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.