CLIPN: New computer vision method teaches AI to say 'no'

Midjourney prompted by THE DECODER

CLIPN teaches CLIP the "semantics of negations". This should help computer vision to recognize classes that were not part of the training data.

Computer vision models recognize objects in the images on which they were trained. In real-world applications, however, these models often encounter unknown objects outside their training data, leading to poor results. AI researchers have proposed several techniques to enable AI models to recognize when input is "out-of-distribution" (OOD)-that is, from unknown classes not seen during training. However, existing methods have limitations in identifying OOD examples that resemble known classes.

Researchers at the Hong Kong University of Science and Technology have now developed a new technique called CLIPN that aims to improve OOD detection by teaching the well-known CLIP model to reject unknown inputs. The basic idea is to use both positive and negative text cues along with user-defined training techniques to give CLIP a sense of when an input is OOD.

The challenge: hard-to-distinguish unknowns

Suppose a model has been trained on images of cats and dogs. If it is asked to process an image of a squirrel, the squirrel is an out-of-distribution class because it does not belong to the known classes of cats and dogs.

Many OOD detection methods evaluate how well an input matches known classes. However, these methods may misclassify the image of the squirrel as a cat or a dog because it has (some) visual similarities.

CLIPN therefore extends CLIP with new learnable "no" prompts and "no" text encoders to capture the semantics of negations. In this way, CLIP learns when and how to say "no" to recognize when an image falls outside its known classes. For example, the CLIPN technique teaches the model to say "No, that's not a cat/dog" in the case of the squirrel, marking the class as OOD.

In experiments, the team shows that CLIPN identifies OOD examples that standard CLIP does not. According to the researchers, CLIPN improves OOD detection in 9 reference datasets by up to nearly 12 percent compared to existing methods.

However, they say it is still unclear whether the method works on specialized datasets, such as medical or satellite images, and whether it is suitable for other applications, such as image segmentation.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

CLIPN: New computer vision method teaches AI to say 'no'

The challenge: hard-to-distinguish unknowns

SciArena lets scientists compare LLMs on real research questions

Microsoft’s MAI-DxO boosts AI diagnostic accuracy and cuts costs by nearly 70 percent

Researchers say they may have found a ladder to climb the "data wall"

Cloudflare CEO Matthew Prince sees trouble ahead for the open web

New Othello experiment supports the world model hypothesis for large language models

ChatGPT might be draining your brain, MIT warns - what ‘cognitive debt’ means for you

CLIPN: New computer vision method teaches AI to say 'no'

The challenge: hard-to-distinguish unknowns

SciArena lets scientists compare LLMs on real research questions

Microsoft’s MAI-DxO boosts AI diagnostic accuracy and cuts costs by nearly 70 percent

Researchers say they may have found a ladder to climb the "data wall"