OpenAI's DALL-E 2 produces impressive images, but the AI system is not perfect. Early experiments reveal its current limitations.
A few weeks ago, OpenAI demonstrated the impressive capabilities of DALL-E 2. The multimodal AI model sets a new standard for AI-generated images: from text descriptions, some of them quite complex, DALL-E 2 generates images in a variety of styles, from oil paintings to photorealism.
OpenAI CEO Sam Altman sees DALL-E 2 as an early example of AI's impact on the labor market. A decade ago, he says, physical and cognitive labor were singled out as the first victims of AI systems, while creative work was expected to be affected last. Now it looks like that order is reversing, Altman says.
OpenAI's DALL-E 2 makes mistakes
In the scientific paper accompanying the unveiling of DALL-E 2, OpenAI points out some limitations of the system. For example, the researchers tested DALL-E 2's handling of compositionality, i.e. the meaningful combination of multiple object properties such as color, shape, and position in an image.
The tests show that DALL-E 2 does not understand the logical relationships given in the descriptions and therefore, for example, arranges colored cubes incorrectly. The following images show DALL-E 2's attempts to place a red cube on top of a blue cube.
Meanwhile, the first applicants have gained access to the system's closed beta and are revealing further limitations of DALL-E 2.
Twitter user Benjamin Hilton reports in a thread that he often needs numerous prompt variants to get a good result. As an example, he shares an image for the prompt "A renaissance-style painting of a modern supermarket aisle. In the aisle is a crowd of shoppers with shopping trolleys trying to get reduced items".
#dalle pic.twitter.com/2kuoVnaqR2
- Benjamin Hilton (@benjamin_hilton) April 28, 2022
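Hilton's trial-and-error workflow of cycling through many prompt variants can, in principle, be scripted. The sketch below is illustrative only: it assumes the OpenAI Python SDK and its Images endpoint (client.images.generate), which was not part of the closed beta Hilton used, and the reworded prompt variants are invented for the example.

```python
# Illustrative sketch only: assumes the OpenAI Python SDK and the Images API
# (client.images.generate), which was not available in the closed beta
# described in the article. The reworded prompts are hypothetical examples.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

base = "A renaissance-style painting of a modern supermarket aisle"
variants = [
    f"{base}. In the aisle is a crowd of shoppers with shopping trolleys "
    "trying to get reduced items",
    f"{base} with bright fluorescent lighting and branded packaging",       # hypothetical rewording
    f"{base}, crowded with shoppers pushing carts toward a discount shelf",  # hypothetical rewording
]

# Generate one image per prompt variant and print the result URLs,
# so the outputs can be compared side by side.
for prompt in variants:
    response = client.images.generate(
        model="dall-e-2",
        prompt=prompt,
        n=1,
        size="1024x1024",
    )
    print(prompt, "->", response.data[0].url)
```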
Although shopping carts and customers appear in the image, the supermarket looks anything but modern. According to Hilton, misunderstandings also occur in other cases, for example when the English word "operated" is interpreted in the medical sense rather than as operating a machine.
Didn't do the captions - it's not great at text #dalle pic.twitter.com/4YvxdAqZPZ
- Benjamin Hilton (@benjamin_hilton) April 28, 2022
In some cases, complex prompts produced no meaningful results at all. As an example, Hilton cites the description "Two dogs dressed like roman soldiers on a pirate ship looking at New York City through a spyglass."
DALL-E 2 mixes concepts - and has a very positive attitude
In some cases, DALL-E 2 also mixes concepts: in one image, a skeleton and a monk are supposed to sit side by side, but even after multiple attempts the monk still looks rather bony.
#dalle pic.twitter.com/rsCKclmtD0
- Benjamin Hilton (@benjamin_hilton) April 28, 2022
DALL-E 2 also had trouble with faces, with coherent layouts such as a site plan or a maze, and with text. The system could not handle negations at all: a prompt like "A spaceship without an apple" results in a spaceship with an apple.
Incidentally, DALL-E 2 depicts apples very well; it is just not accurate when counting them. The system only counts correctly up to four.
6) "10 apples" pic.twitter.com/MoGgpNFSn4
- Benjamin Hilton (@benjamin_hilton) April 30, 2022