Google has released Imagen 3, a new AI model for generating images from text descriptions. According to Google, it surpasses all previous models in quality and attention to detail.

Google introduced Imagen 3 in May and made it available to initial testers in June. The model is now freely available via ImageFX, at least in some countries.

The release comes with a paper stating that Imagen 3 sets a new standard for text-to-image models in terms of quality and detail.

Imagen 3 outperforms competitors in Google's evaluations

Imagen 3 was trained on a large dataset of images, texts, and annotations. The data went through a multi-step filtering process that removed unsafe, violent, or low-quality content, as well as AI-generated images. Duplicates were also removed, and similar images were down-weighted.
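
To make the filtering steps concrete, here is a minimal Python sketch of such a pipeline. It is not Google's implementation: the `Sample` fields, the `filter_samples` function, the quality threshold, and the weight given to similar images are all hypothetical, chosen only to illustrate dropping unwanted items, removing exact duplicates, and giving similar images a lower weight.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Sample:
    # All fields are illustrative assumptions, not taken from Google's paper.
    caption: str
    unsafe: bool = False        # flagged as unsafe or violent
    ai_generated: bool = False  # flagged as AI-generated
    quality: float = 1.0        # hypothetical quality score in [0, 1]
    fingerprint: str = ""       # hypothetical key identifying exact duplicates
    cluster: str = ""           # hypothetical id grouping similar images
    weight: float = 1.0         # sampling weight used during training


def filter_samples(samples, min_quality=0.5, similar_weight=0.5):
    """Drop unwanted samples, remove exact duplicates, lower the weight of similar ones."""
    seen_fingerprints = set()
    seen_clusters = set()
    kept = []
    for sample in samples:
        # Step 1: remove unsafe, AI-generated, or low-quality items.
        if sample.unsafe or sample.ai_generated or sample.quality < min_quality:
            continue
        # Step 2: remove exact duplicates.
        if sample.fingerprint in seen_fingerprints:
            continue
        seen_fingerprints.add(sample.fingerprint)
        # Step 3: keep similar images, but with a reduced weight.
        weight = similar_weight if sample.cluster in seen_clusters else 1.0
        seen_clusters.add(sample.cluster)
        kept.append(replace(sample, weight=weight))
    return kept


samples = [
    Sample("cat on a sofa", fingerprint="a", cluster="cats"),
    Sample("violent scene", unsafe=True, fingerprint="b", cluster="other"),
    Sample("cat on a sofa (copy)", fingerprint="a", cluster="cats"),
    Sample("cat on a couch", fingerprint="d", cluster="cats"),
    Sample("synthetic render", ai_generated=True, fingerprint="e", cluster="art"),
    Sample("blurry photo", quality=0.1, fingerprint="f", cluster="misc"),
]
kept = filter_samples(samples)
for sample in kept:
    print(sample.caption, sample.weight)
```

In this toy run only the two distinct cat images survive, and the second one carries a reduced weight because it falls into a cluster that has already been seen.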

In Google's human and automated evaluations, Imagen 3 performed better than Imagen 2, DALL-E 3, Midjourney v6, Stable Diffusion 3, and Stable Diffusion XL 1.0. Imagen 3 was particularly strong at matching generated images to their text descriptions and at handling detailed prompts. The paper does not include a comparison with the recently released FLUX model.

Informal comparisons have since appeared on X, where user Dogan Ural shared side-by-side examples from Midjourney, Imagen, and FLUX.

According to Google, weaknesses remain, for example in tasks that require numerical reasoning, such as generating an exact number of objects. Prompts involving spatial reasoning and complex language also remain challenging.

Imagen 3 is available in the US via ImageFX.

Summary
  • Google has released Imagen 3, a new AI model for generating images from text descriptions that Google claims outperforms all previous models in terms of quality and attention to detail.
  • Imagen 3 was trained on a large dataset filtered in multiple steps and outperformed Imagen 2, DALL-E 3, Midjourney v6, Stable Diffusion 3, and Stable Diffusion XL 1.0 in Google's evaluations, particularly in matching generated images to text descriptions and in handling detailed prompts.
  • Despite improvements, weaknesses remain in tasks requiring numerical and spatial reasoning.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.