A new method speeds up diffusion models by up to 256 times. This could be a step towards real-time AI image generation.

Diffusion models have outpaced alternative image generation systems such as GANs. They generate high-quality, high-resolution images, can modify existing images, and can even generate 3D shapes. However, generating an image requires dozens to hundreds of denoising steps, which is compute-intensive and thus time-consuming.

Nevertheless, the time from prompt input to finished image is already impressive for generative AI models such as DALL-E 2, Midjourney, or Stable Diffusion: depending on the computing power and the model, it takes only a few seconds.

To further reduce the computational effort, and possibly enable real-time image generation in the near future, researchers are investigating how to reduce the number of denoising steps.

Distilled diffusion dramatically speeds up AI image generation

Researchers from Stanford University, Stability AI, and Google Brain are now showing progress, reducing the number of denoising steps required by a factor of at least 20.

Building on previous work by the contributing authors, the team uses progressive network distillation. In this process, a student model first learns to reproduce the output of the original diffusion model. The procedure is then repeated in stages, each stage producing a model that needs significantly fewer steps to denoise an image.

In network distillation, a large AI model acts as a teacher and a small one as a student. During training, the large model passes on its knowledge: in the case of a language AI, for example, the 20 most likely words that complete an incomplete sentence. The small model thus learns to reproduce the results of the large model without adopting its size.
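
To make this concrete, here is a rough sketch of what progressive distillation can look like in code. It is not the authors' implementation: the denoiser, the noising step, and the update rule below are simplified placeholders, and the real method uses DDIM-style updates and a specific parameterization of the training target. What the sketch illustrates is the core idea of one student step matching two teacher steps, with the trained student then becoming the next teacher.

```python
import copy
import torch

class TinyDenoiser(torch.nn.Module):
    """Placeholder denoiser: 16 'pixels' plus one timestep feature in, 16 values out."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(17, 16)

    def forward(self, x, t):
        t_feat = torch.full((x.shape[0], 1), float(t))
        return self.net(torch.cat([x, t_feat], dim=-1))

def denoise_step(model, x, t_from, t_to):
    """Schematic stand-in for one deterministic (DDIM-style) denoising update."""
    return x - model(x, t_from) * (t_from - t_to)

def distill_stage(teacher, student, steps, data, optimizer):
    """Train the student so that ONE of its steps matches TWO consecutive teacher steps."""
    timesteps = torch.linspace(1.0, 0.0, steps + 1)
    for x0 in data:
        i = 2 * torch.randint(0, steps // 2, ()).item()
        t, t_mid, t_next = timesteps[i], timesteps[i + 1], timesteps[i + 2]
        xt = x0 + torch.randn_like(x0) * t                  # crude stand-in for forward noising
        with torch.no_grad():                               # two teacher steps define the target
            target = denoise_step(teacher, denoise_step(teacher, xt, t, t_mid), t_mid, t_next)
        pred = denoise_step(student, xt, t, t_next)         # the student covers both steps at once
        loss = (pred - target).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def progressive_distillation(model, data, start_steps=64, end_steps=4):
    """Repeatedly distill, halving the number of sampling steps in each round."""
    teacher, steps = model, start_steps
    while steps > end_steps:
        student = copy.deepcopy(teacher)
        distill_stage(teacher, student, steps, data,
                      torch.optim.Adam(student.parameters(), lr=1e-4))
        teacher, steps = student, steps // 2                # the student becomes the next teacher
    return teacher

# Toy usage with random "images" of 16 pixels.
data = [torch.randn(8, 16) for _ in range(4)]
fast_model = progressive_distillation(TinyDenoiser(), data, start_steps=16, end_steps=4)
```

Because each round halves the number of sampling steps, a handful of rounds is enough to take a model from hundreds of denoising steps down to a single-digit count.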

According to the paper, the Distilled Diffusion model speeds up inference by "at least ten times" compared to existing methods on the ImageNet 256x256 and LAION datasets. For smaller images, the speedup reaches a factor of 256.

Distilled Diffusion is extremely fast - even on Apple hardware

With only four sampling steps, Distilled Diffusion can produce images of a quality comparable to standard diffusion models. Whereas diffusion models such as Stable Diffusion require dozens to hundreds of steps to produce a good image, Distilled Diffusion could produce "highly realistic images" in as few as one to four denoising steps. Image manipulations such as AI-assisted image editing also work in as few as two to four steps.
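
For a sense of what this means in practice, the number of denoising steps is simply a sampling parameter. The snippet below is a hedged illustration using the open-source diffusers library with a standard, non-distilled Stable Diffusion checkpoint; the checkpoint name and prompt are only examples, and with a vanilla model a four-step setting degrades quality badly. Only a distilled model of the kind described here would make such a low step count viable.

```python
import torch
from diffusers import StableDiffusionPipeline

# Standard Stable Diffusion checkpoint; the distilled weights from this research are not assumed here.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of an astronaut riding a horse"

# Typical setting today: dozens of denoising steps for a good image.
image_slow = pipe(prompt, num_inference_steps=50).images[0]

# A distilled model would only need a handful of steps for the same call;
# with a vanilla checkpoint, quality collapses at this setting.
image_fast = pipe(prompt, num_inference_steps=4).images[0]

image_slow.save("astronaut_50_steps.png")
image_fast.save("astronaut_4_steps.png")
```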

Stability AI founder Emad Mostaque is optimistic that this research will soon be applied in practice. Combined with native support for the Neural Engine in Apple Silicon chips, Stable Diffusion could cut the image generation process from eight seconds to less than one.

Summary
  • Generative AI models such as Stable Diffusion produce high-quality images, but require dozens to hundreds of denoising steps.
  • The researchers demonstrate a method to generate high-quality images in as few as one to four steps. AI images could thus be generated in less than one second instead of eight seconds.
  • According to Emad Mostaque, founder of Stability AI, this research advance could soon be integrated into products.
Jonathan works as a technology journalist who focuses primarily on how easily AI can already be used today and how it can support daily life.