Update
  • Added details about alternatives
  • Prompt engineering tips added

OpenAI's DALL-E 2 shows impressive AI creativity - if you know how to control it. A little tour of DALL-E 2 in 2023.

OpenAI's DALL-E 2 pioneered generative AI models and was the first text-to-image offering on the market. A lot has happened since then: alternatives such as Midjourney have emerged that usually produce better results with simpler prompts, and their underlying models are improved regularly. Stable Diffusion and Stable Diffusion XL are also available as open-source alternatives.

But with the right prompts and for special applications like inpainting, DALL-E can still make sense. An example: DALL-E converts my prompt "an antique bust of a Greek philosopher wearing a vr headset, realistic, photography, 2023" into a suitable - albeit low-resolution - image, but Midjourney refuses to add a VR headset to the much higher-resolution bust.

Below, I give a brief overview of the functions of DALL-E 2 and the basics of prompt engineering.

OpenAI DALL-E 2 can create, edit or modify images

The user interface of DALL-E 2 is kept simple: you enter your text command, the so-called "prompt", in an input field and send it to the AI system by pressing "Generate". After a short wait, four generated images are displayed.

Generating AI images is simple: You put text into a text field. The input can be short or detailed. Your prompt has a strong impact on output.

Below the input field, you can alternatively upload your own picture - as long as it does not show a real person. From uploaded and newly created images, DALL-E 2 can generate variants. This makes it relatively easy to create images inspired by existing subjects that can then be further edited. In this way, the AI system can be controlled even more precisely.

A click on an image opens the detailed view. Here, variations can be created, or the image can be edited.

In addition, the edit function can be used to mark an area in the image, which DALL-E 2 then changes. To do this, you simply describe the desired result in another text prompt.

The area to be edited can be marked with a brush.

DALL-E 2 then generates three variants of the original containing the corresponding changes. Here I have added a fancy mustache to the statue.

A mustache for a Greek philosopher? No problem for DALL-E 2.
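The brush selection described above can be thought of as a mask over the image. The following is a minimal sketch, not DALL-E 2's actual implementation: it assumes the convention that fully transparent pixels (alpha 0) mark the area to be regenerated, and it uses a simple rectangle instead of a brush stroke. The function name `rectangle_mask` and the 8x8 size are hypothetical and chosen only for illustration.

```python
def rectangle_mask(width, height, box):
    """Return a height x width grid of alpha values (0 or 255).

    Pixels inside box = (left, top, right, bottom) get alpha 0, meaning
    "repaint this area"; all other pixels get 255, meaning "keep".
    The right and bottom edges are exclusive.
    """
    left, top, right, bottom = box
    return [
        [0 if left <= x < right and top <= y < bottom else 255
         for x in range(width)]
        for y in range(height)
    ]


# Assumed example: mark a small strip below the nose of a tiny 8x8 image.
mask = rectangle_mask(8, 8, (2, 5, 6, 7))
editable = sum(row.count(0) for row in mask)
print(editable)  # 8 pixels are marked for editing
```

The smaller the marked area, the more of the original image is preserved, which is why a narrow selection is enough for a detail like a mustache.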

OpenAI DALL-E 2 and prompt engineering

As is already clear from the example of the ancient bust of the Greek VR pioneer, DALL-E 2 can be controlled via text input. OpenAI has trained the AI system with over 650 million images - so DALL-E 2 has seen and can reproduce numerous subjects, styles, exposures, and other image properties.

Using so-called prompt engineering - the design of the appropriate text description - DALL-E 2 can, for example, generate photorealistic images with different lens specifications to simulate small focal lengths or motion blur.

DALL-E 2 can reproduce the image style of different cameras, here Polaroid.

With the right descriptions, you can also capture moods, define structures or proportions, reproduce styles such as steampunk or cyberpunk, determine camera angles and exposure, or use the design of TV series or movies as a template.
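One way to keep such descriptions manageable is to treat a prompt as a subject plus a list of modifiers for style, camera, or mood. The sketch below only assembles the text; it does not call DALL-E 2. The helper name `build_prompt` is hypothetical, and the comma-separated layout is an assumption based on the example prompt used earlier in this article.

```python
def build_prompt(subject, modifiers=()):
    """Join a subject and optional modifiers into one comma-separated prompt.

    Empty modifiers and case-insensitive duplicates are skipped.
    """
    parts = [subject.strip()]
    seen = {parts[0].lower()}
    for modifier in modifiers:
        cleaned = modifier.strip()
        if cleaned and cleaned.lower() not in seen:
            parts.append(cleaned)
            seen.add(cleaned.lower())
    return ", ".join(parts)


prompt = build_prompt(
    "an antique bust of a Greek philosopher wearing a vr headset",
    ["realistic", "photography", "2023"],
)
print(prompt)
```

Swapping the modifier list, for example for "steampunk" or "Polaroid", changes the look while the subject stays the same, which makes it easier to compare results.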

DALL-E 2 can imitate numerous illustration styles as well as 3D art and historical paintings. The same applies to artistic styles, individual artists, and specific works.

Thanks to extensive training, DALL-E 2 can also reproduce styles such as steampunk.
DALL-E 2 can also imitate the style of individual artists or paintings.

If you want to capture the style of a particular work of art or artist, you can also use AI help: with so-called unbundling, you ask models like ChatGPT or GPT-4 to describe the characteristics and style of a painting. The AI response can then be used for prompt engineering.

In addition to antique busts, DALL-E 2 can also create other objects - from embroidery to statues, bodies, stuffed animals, architecture, or designer chairs, it's all there.

Half dog, half Jedi, half Greek philosopher - DALL-E 2 impresses with meaningful interpretations.

DALL-E 2: Six tips for prompt engineering

Prompt aspect | Explanation
Precision | Use precise descriptions for the desired objects or scenes, e.g., "a white husky playing in a snowy forest."
Adjectives and adverbs | Add adjectives and adverbs to provide more detail, e.g., "a sparkling blue road bike on an empty path."
Creativity | Be creative with your prompts, e.g., "a dog made of clouds."
Comparison | Use comparisons to make your ideas clearer, e.g., "a house whose color is as yellow as ripe bananas."
Context | Consider the context in which the images are used, e.g., pictures of colorful butterflies for a children's book.
Simplicity | Keep your prompts concise and focus on one or two key elements, e.g., the main character and the setting.

DALL-E 2: External image editing and outpainting

With the editing function introduced above, details in the image can be changed, such as adding a mustache, replacing objects, or swapping the entire background.

Since the generated images can also be downloaded, an external image editing program can be used to get even more out of DALL-E 2. In the simplest version, our bust of the Greek philosopher can be reduced in size and used as the basis for a new image.

With simple tricks, the pictures can be edited further. Here, for example, you can generate a statue to go with the head.

Paintings can be added using the same method. DALL-E 2 can give Mona Lisa a body, and our Greek VR philosopher gets company.

DALL-E 2 adds the VR philosopher's torso and environment, matching the desired style. With further adjustments, the results can be refined.

If you repeat this process often, you can zoom out further and further - some artists already create impressive journeys through DALL-E 2 worlds or giant murals.
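The geometry behind this zoom-out trick is simple arithmetic. The sketch below assumes a square canvas of 1024 pixels and a shrink factor of 0.5 per step; both values and the function names are illustrative assumptions, and the actual resizing would be done in an external image editor as described above.

```python
def outpaint_layout(canvas_size, scale):
    """Return (new_size, offset) for centering a shrunken square image.

    new_size is the edge length of the reduced image; offset is the
    distance from the canvas edge to the image on each side.
    """
    if not 0 < scale < 1:
        raise ValueError("scale must be between 0 and 1")
    new_size = round(canvas_size * scale)
    offset = (canvas_size - new_size) // 2
    return new_size, offset


def zoom_factor(scale, steps):
    """Fraction of the canvas width the original image fills after `steps` repeats."""
    return scale ** steps


new_size, offset = outpaint_layout(1024, 0.5)
print(new_size, offset)     # 512 256
print(zoom_factor(0.5, 3))  # 0.125
```

With these assumed values, the original image fills only one eighth of the canvas width after three rounds, which is how long zoom-out sequences come about.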

By combining external image processing, intelligent prompt engineering, and the editing function of DALL-E 2, many other applications are possible.

If you want to dig deeper, check out the DALL-E 2 Prompt Book by Guy Parsons. It gives a comprehensive overview of many of the prompt engineering tips discovered so far, along with additional methods for getting the most out of DALL-E 2. Many of these tips can also be applied to Midjourney or Stable Diffusion.

Will there be a DALL-E 3? We don't know for sure yet, but OpenAI is already researching alternative architectures for generative AI models, such as consistency models.

Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.