Content
newsletter Newsletter

Stable Diffusion is an open-source alternative to OpenAI DALL-E 2 that runs on your graphics card.

You have a whole range of products available if you want to use artificial intelligence to generate images via text input. Besides the forerunners, DALL-E 2 from OpenAI and the weaker Craiyon, especially Midjourney is very popular.

Recently, startup StabilityAI announced the release of Stable Diffusion, a DALL-E 2-like system initially available through a closed Discord server.

What makes Stable Diffusion special is that the powerful generative model was jointly developed by researchers from Stability AI, RunwayML, LMU Munich, EleutherAI and LAION. It is open-source and runs on a standard graphics card.

Ad
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer

Stable Diffusion is available on many platforms

To train the Stable Diffusion model, Stability AI used 4,000 Nvidia A100 GPUs and a variant of the LAION-5B dataset. Stable Diffusion can therefore generate images of prominent people and other subjects that OpenAI does not allow with DALL-E 2.

After testing via Discord, Stability AI released access via the web interface Dreamstudio. Here, however, there is a NSFW filter (“not safe for work”) and some restrictions on input. HuggingFace also offers a rudimentary web interface for Stable Diffusion.

Now the Stable Diffusion team has published the fully trained model on HuggingFace including the related code on Github. Less than a week ago, an older version of the model had already been leaked on 4chan.

Stable Diffusion runs on a local graphics card with a graphical user interface

With the nearly four gigabyte model and associated repos, anyone with an Nvidia graphics card with more than 4 gigabytes of VRAM can run Stable Diffusion locally. However, higher image resolutions require more VRAM.

AMD graphics cards are not officially supported, but can still be used with a few tricks. Apple’s M1 chips are also supposed to be supported in the future.

Recommendation

Thanks to the open-source community, there is an option to run Stable Diffusion with a few lines of code in a local browser window with a functional interface (instructions for running Stable Diffusion locally). If you need help with prompts, you can use the excellent prompt builder.

Without fear of a bit of code, you can already use Stable Diffusion with GUI in your browser.

Those without a suitable graphics card can continue to use Dreamstudio or resort to one of the numerous Google Collabs. More information and instructions can be found in the Reddit thread on running Stable Diffusion.

Midjourney’s new beta model also uses Stable Diffusion as part of its graphics pipeline.

Stable diffusion ushers in a media revolution

Many people will use Stable Diffusion to create interesting images. Some will create questionable material with appropriately specialized Stable Diffusion variants. But Stable Diffusion is more than that: the enhancements added in the few hours after release, such as the user interface described above, are just the beginning.

Ad
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Join our communit
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

The open-source community will develop numerous applications that will surprise and reach new media formats. After the first version of Stable Diffusion, there will be others that surpass the capabilities of the current version. Stability AI is already working on other open-source models, for example, for HarmonAI generative audio tools.

There’s a sense of optimism in open-source AI research because until now, powerful generative AI systems have been limited by filters, access, and hardware requirements. The release of Stable Diffusion marks the beginning of a new era in which the open-source community has a free hand, as noted German AI researcher Joscha Bach points out on Twitter.

The researcher predicts nothing less than a media revolution. Anyone who has followed the development of generative AI in recent years knows that Bach has good reasons for his prediction.

Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.
Join our communit
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.