
UC Berkeley and Google researchers demonstrate a new method for generative AI that could replace diffusion models.

Generative AI models, such as GANs, diffusion models, or more recently consistency models, generate images by mapping an input, such as random noise, a sketch, or a low-resolution or otherwise corrupted image, to outputs that correspond to a given target data distribution, usually natural images. Diffusion models, for example, do this by "denoising" an image in several steps, learning the target data distribution during training.

Researchers from UC Berkeley and Google now present a new generative model, called "Idempotent Generative Networks" (IGNs), that learns through training to generate a suitable image from any form of input, ideally in a single step. The proposed method is intended to be a "global projector" that projects any input data onto the target data distribution and, unlike other methods, is not limited to specific inputs.

Incidentally, the team cites a scene from Seinfeld as inspiration for the work; the scene sums up the concept of idempotent operators that gives the method its name.
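
Formally, an operator f is idempotent if f(f(x)) = f(x) for every input x: applying it a second time changes nothing. The short Python sketch below illustrates this with a toy "projector" that maps any 2D point onto the unit circle. The function names and the unit-circle target are assumptions chosen for illustration; this is not the researchers' network or code.

```python
import math


def project_to_circle(point):
    """Toy projector (illustrative assumption): map a nonzero 2D point onto the unit circle."""
    x, y = point
    norm = math.hypot(x, y)
    if norm == 0.0:
        raise ValueError("the origin has no unique projection onto the circle")
    return (x / norm, y / norm)


def close(a, b, tol=1e-12):
    """Compare two points coordinate by coordinate, allowing for rounding error."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


# Arbitrary inputs: some far from the target set, some near it.
inputs = [(3.0, 4.0), (-0.2, 0.1), (0.0, -5.0)]

once = [project_to_circle(p) for p in inputs]
twice = [project_to_circle(p) for p in once]

# Idempotence: projecting an already projected point leaves it (numerically) unchanged.
idempotent = all(close(a, b) for a, b in zip(once, twice))
```

In this toy case every input lands on the circle after one application, and points already on the circle stay where they are, which is the behavior the article describes for a "global projector".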

Idempotent generative networks show potential in first study

IGNs differ from GANs and diffusion models in two important ways. Unlike GANs, which require separate generator and discriminator models, IGNs are "self-antagonistic": a single model fulfils both roles. Unlike diffusion models, which build an image over many incremental steps, IGNs attempt to map the input to the data distribution in a single step.
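
One way to read the idea of a single model in both roles is sketched below in Python. The same function `f` generates an output from an arbitrary input, and the distance by which `f` moves a sample serves as a critic-like score. This is an interpretation under stated assumptions, not the training procedure from the paper: `f` is a hypothetical stand-in that clamps numbers onto the interval [0, 1], and `generate` and `drift` are names invented for this example.

```python
def f(x: float) -> float:
    """Hypothetical stand-in for a trained model: clamp onto the 'data' interval [0, 1]."""
    return min(1.0, max(0.0, x))


def generate(z: float) -> float:
    """Generator role: one application of f to an arbitrary input."""
    return f(z)


def drift(y: float) -> float:
    """Critic-like role: how far f moves y. Zero means f already treats y as data."""
    return abs(f(y) - y)


real_sample = 0.4   # lies inside the toy data interval
noise_sample = 3.0  # lies far outside it

scores = {
    "real": drift(real_sample),                  # f leaves real data unchanged
    "noise": drift(noise_sample),                # f has to move noise a long way
    "generated": drift(generate(noise_sample)),  # outputs of f are fixed points of f
}
```

In this toy setting the score is zero for real and generated samples and large for noise, so no second network is needed to tell them apart.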

The researchers demonstrate the potential of IGNs using the MNIST and CelebA datasets. The team shows applications such as converting a sketch into a photorealistic image, generating an image from noise, or repairing a damaged image.

Image: Shocher et al.

Although the image quality is not yet state-of-the-art, the examples show that the method works, allows simple manipulations such as adding a headset to a face, and can handle any input such as sketches or damaged images.

Team wants to scale up new generative AI method

IGNs could be much more efficient at inference because they produce their results in a single step after training. They could also produce more consistent results, which could be beneficial for certain applications such as medical image repair.
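
The efficiency argument comes down to the number of model evaluations per generated sample. The Python sketch below counts calls for a single-step sampler and for an iterative sampler. Both "models" are toy functions on numbers, and the step count of 50 is an arbitrary illustrative value, not a figure from the paper or from any particular diffusion model.

```python
def count_calls(fn):
    """Wrap fn so the number of invocations can be inspected afterwards."""
    def wrapper(x):
        wrapper.calls += 1
        return fn(x)
    wrapper.calls = 0
    return wrapper


@count_calls
def one_step_model(x):
    # Toy single-step projector: jump straight onto the interval [0, 1].
    return min(1.0, max(0.0, x))


@count_calls
def refine_step(x):
    # Toy incremental step: move halfway toward the interval [0, 1].
    target = min(1.0, max(0.0, x))
    return x + 0.5 * (target - x)


def sample_single_step(z):
    return one_step_model(z)


def sample_iterative(z, num_steps):
    x = z
    for _ in range(num_steps):
        x = refine_step(x)
    return x


single = sample_single_step(5.0)
iterative = sample_iterative(5.0, num_steps=50)  # 50 is an arbitrary choice
```

Both samplers end up at essentially the same point, but the iterative one needs 50 model calls where the single-step one needs one. With a large neural network in place of the toy functions, that ratio would translate into inference cost.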

"We see this work as a first step towards a model that learns to map arbitrary inputs to a target distribution, a new paradigm for generative modeling."

From the paper.

Next, the team plans to scale up IGNs with significantly more data, hoping to realise the full potential of the new generative AI model. The code will soon be available on GitHub.

Summary
  • Researchers at UC Berkeley and Google present Idempotent Generative Networks (IGNs), a new method for generative AI that can generate matching images from different input forms in a single step.
  • Unlike GANs, IGNs are self-antagonistic, fulfilling the roles of both generator and discriminator; unlike diffusion models, IGNs attempt to generate images in a single step.
  • The team plans to scale up IGNs with more data to realise the full potential of the new generative AI model.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.