Large image models like Stable Diffusion can generate many graphics in a very short time. But that's not all, as developer Matthias Bühlmann shows.

Bühlmann experimented with uses for Stable Diffusion beyond image generation. He found that at a resolution of 512 x 512 pixels and high compression, the AI model can deliver better image quality than the JPG and WebP web standards.

Image: Matthias Bühlmann

According to Bühlmann, Stable Diffusion compression offers "vastly superior image quality" at a smaller file size compared to JPG and WebP.
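
To put "smaller file size" at 512 x 512 pixels into numbers, the following minimal Python sketch converts a file size into bits per pixel and a compression ratio relative to uncompressed 8-bit RGB. The 5 KiB budget is a hypothetical value chosen for illustration, not a figure from Bühlmann's experiments, and the function names are made up for this example.

```python
def raw_rgb_bytes(width: int, height: int) -> int:
    """Size in bytes of an uncompressed image with 8 bits per RGB channel."""
    return width * height * 3


def bits_per_pixel(file_size_bytes: int, width: int, height: int) -> float:
    """Average number of bits the compressed file spends on each pixel."""
    return file_size_bytes * 8 / (width * height)


def compression_ratio(file_size_bytes: int, width: int, height: int) -> float:
    """How many times smaller the file is than the raw RGB data."""
    return raw_rgb_bytes(width, height) / file_size_bytes


if __name__ == "__main__":
    width = height = 512
    budget = 5 * 1024  # hypothetical target size in bytes, for illustration only

    print(raw_rgb_bytes(width, height))              # 786432
    print(bits_per_pixel(budget, width, height))     # 0.15625
    print(compression_ratio(budget, width, height))  # 153.6
```

Any codec, whether JPG, WebP or a model-based approach, can be compared on the same footing by fixing the byte budget and judging the reconstructed image.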

Bühlmann compares his compression method to an artist with a photographic memory who sees the uncompressed image and then reproduces it as accurately, and as compactly, as possible. The process can preserve even fine details such as camera grain.
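
The article does not spell out the mechanism behind this "compact reproduction". As a rough illustration only, and not taken from Bühlmann's code, the sketch below shows uniform 8-bit quantization, one generic way to store a learned representation compactly. It assumes the 4 x 64 x 64 latent that Stable Diffusion v1 uses for a 512 x 512 image and substitutes random numbers for a real latent; the helper names are hypothetical.

```python
import random

# Assumption: Stable Diffusion v1 represents a 512 x 512 image as a 4 x 64 x 64 latent.
LATENT_SIZE = 4 * 64 * 64


def quantize(values, lo, hi):
    """Map floats in [lo, hi] to one byte each (values outside are clipped)."""
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((v - lo) * scale))) for v in values)


def dequantize(data, lo, hi):
    """Approximate inverse of quantize(): bytes back to floats in [lo, hi]."""
    step = (hi - lo) / 255.0
    return [lo + b * step for b in data]


if __name__ == "__main__":
    rng = random.Random(0)
    # Random stand-in for a latent; a real one would come from the model's encoder.
    latent = [rng.uniform(-4.0, 4.0) for _ in range(LATENT_SIZE)]
    lo, hi = min(latent), max(latent)

    packed = quantize(latent, lo, hi)
    restored = dequantize(packed, lo, hi)
    max_error = max(abs(a - b) for a, b in zip(latent, restored))

    print(len(packed))  # 16384 bytes, versus 786432 bytes of raw RGB
    print(max_error)    # at most half a quantization step
```

The round trip is lossy: each value comes back with a small error, and a decoder has to turn that slightly perturbed representation into a plausible picture.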

Deceptively real AI artifacts

However, Bühlmann's method has one crucial drawback: it can alter the content of the image, such as the shape of buildings. The deceptive aspect of this is that the compressed image still looks high-quality and thus authentic.

Typical compression artifacts in JPG and WebP can also significantly alter the image, but usually, these are clearly identifiable as artifacts. Bühlmann illustrates the problem with the following image.

Image: Matthias Bühlmann

Stable Diffusion 1.4, the model version Bühlmann used, also has trouble compressing faces and text. Version 1.5 is expected to handle faces better, and Bühlmann intends to keep updating his method.

He sees Stable Diffusion as "very promising as a basis for a lossy image compression scheme" with "a lot more potential" beyond his current experiments.

Bühlmann emphasizes that a major advantage of his approach is that it builds on the already trained Stable Diffusion model. This means there are no additional training costs of the kind a dedicated image compression model would incur, even though such a model might deliver even better results. Training Stable Diffusion cost about $600,000.

Bühlmann describes his method in detail on his Medium blog and makes his code available on Google Colab.

Summary
  • Programmer Matthias Bühlmann shows that there are more applications for large image AI models like Stable Diffusion than image generation.
  • He uses Stable Diffusion for image compression at a resolution of 512 x 512 pixels to achieve higher image quality at a smaller file size than JPG or WebP.
  • A disadvantage of the method is that Stable Diffusion can change the content of the compressed image.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.