
Watermarked AI-generated images can make people think of cats without them knowing why, according to a new study by Deepmind.


Watermarks in AI-generated images can be an important safety measure, for example, to quickly establish whether an image is real or generated without the need for forensic investigation.

Watermarking often involves adding features to the image that are invisible to the human eye, such as slightly altered pixel structures, also known as adversarial perturbations. They cause a machine learning model to misinterpret what it sees: For example, an image might show a vase, but the machine labels it a cat.

Until now, researchers believed that these image distortions, which are intended for computer vision systems, would not affect humans.


Watermarks can affect human perception

Researchers at Deepmind have now tested this assumption in an experiment and shown that subtle changes to digital images also affect human perception.

Gamaleldin Elsayed and his Deepmind team showed human test subjects pairs of images that had been subtly altered with pixel modifications.

In a sample image showing a vase, an AI model incorrectly identified the vase as a cat or a truck after manipulation. The human subjects still saw only the vase.

However, when asked which of the two images looked more like a cat, they tended to choose the image that had been manipulated to look like a cat to the AI model, even though both images looked the same to them.

According to the Deepmind team, this is no coincidence. The study showed that for many pairs of manipulated images, the selection rate for the manipulated image was reliably above chance, even when no pixel was changed by more than two levels on the scale from 0 to 255.

Subjects reliably selected the manipulated image from two images identical to the human eye. | Image: Deepmind
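
To make the "two levels on the scale from 0 to 255" bound concrete, here is a minimal Python sketch of a perturbation that never moves a pixel by more than two intensity levels. It is an illustration only, not the method used in the study: the function names `perturb` and `max_abs_change` and the sample pixel values are hypothetical, and a random direction stands in for the model-derived direction a real adversarial perturbation would use.

```python
import random

EPSILON = 2  # maximum change per pixel, in levels on the 0-255 scale


def perturb(pixels, epsilon=EPSILON, seed=0):
    """Return a copy of `pixels` in which each value moves by at most `epsilon`.

    Assumption: a random shift replaces the model-derived shift of a real attack.
    """
    rng = random.Random(seed)
    out = []
    for p in pixels:
        delta = rng.randint(-epsilon, epsilon)  # inclusive on both ends
        # Clamp to the valid 8-bit range; clamping can only shrink the change.
        out.append(min(255, max(0, p + delta)))
    return out


def max_abs_change(original, modified):
    """Largest per-pixel difference between two images of equal length."""
    return max(abs(a - b) for a, b in zip(original, modified))


# Hypothetical flattened grayscale image, including both ends of the range.
image = [0, 17, 128, 200, 254, 255]
altered = perturb(image)
print(max_abs_change(image, altered) <= EPSILON)
```

A change this small is at most 2/255, or less than one percent of the full intensity range per pixel, which is why the two images look identical to a viewer.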

Concerns about subtle crowd manipulation

These manipulations affect machines far more dramatically than humans, especially in the case of negative influence. Nevertheless, they can nudge people toward the same decisions a machine would make.

The study emphasizes that these small changes can have a large impact when applied at scale.

"Even if the effect on any individual is weak, the effect at a population level can be reliable, as our experiments may suggest," the team writes.

For example, if image manipulation makes a politician's photo seem more like a cat or a dog, this could affect people's perceptions of the politician.
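
The population-level point in the quote above can be illustrated with a small calculation. The Python sketch below computes the exact probability of seeing a given number of "manipulated image" choices if every choice were a coin flip. The 52 percent selection rate and the trial counts are made-up numbers for illustration, not figures from the study, and `p_value_above_chance` is a hypothetical helper name.

```python
from math import comb


def p_value_above_chance(k, n):
    """Probability of k or more 'hits' in n fair coin flips (one-sided, exact).

    Integer arithmetic is used up to the final division to avoid float underflow.
    """
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


# Assumed, illustrative numbers: the same weak 52% preference at two scales.
few = p_value_above_chance(52, 100)          # 52 of 100 choices
many = p_value_above_chance(5200, 10_000)    # 5,200 of 10,000 choices

print(f"100 trials:    p = {few:.3f}")
print(f"10,000 trials: p = {many:.6f}")
```

With only 100 choices, a 52 percent rate is easily explained by chance. With 10,000 choices, the same rate would be very unlikely under pure guessing, which is the sense in which a weak individual effect can still be reliable across a population.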

The Deepmind scientists recommend that AI safety research should be based on experimentation, rather than relying on intuition and self-reflection. Cognitive science and neuroscience also need to develop a more profound understanding of AI systems and their potential impact.

With SynthID, Deepmind is also developing a watermarking system that it uses in image generation systems such as Imagen 2, although it may work differently from the adversarial perturbations discussed in the research above.

Summary
  • A study by Deepmind shows that watermarks in AI-generated images can influence human perception, for example, by making an otherwise unremarkable image of a vase seem more cat-like to viewers.
  • Until now, researchers believed that such watermarks, designed to influence computer vision systems, did not affect human perception.
  • The findings underscore the need for AI safety research to rely on experiments rather than solely on intuition and self-reflection, in order to better understand possible effects on human perception.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.