Google has released Imagen 3, a new AI model for generating images from text descriptions. According to Google, it surpasses all previous models in quality and attention to detail.
Google introduced Imagen 3 in May and made it available to initial testers in June. The model is now freely available via ImageFX, at least in some countries.
The release comes with a paper stating that Imagen 3 sets a new standard for text-to-image models in terms of quality and detail.
Imagen 3 outperforms competitors in Google's evaluations
Imagen 3 was trained on a large dataset of images, texts, and annotations. The data underwent a multi-step filtering process to remove unsafe, violent, or low-quality content, as well as AI-generated images. Duplicates were also removed, and similar images were down-weighted.
In Google's human and automated evaluations, Imagen 3 performed better than Imagen 2, DALL-E 3, Midjourney v6, Stable Diffusion 3, and Stable Diffusion XL 1.0. Imagen 3 was particularly strong at matching generated images to their text descriptions and at handling detailed prompts. The evaluations do not include the recently released FLUX model.
However, there are now comparisons on X, where user Dogan Ural shared side-by-side examples of Midjourney, Imagen, and FLUX.
"Google just released Imagen 3! Their latest text-to-image generator. Here's a couple of side-by-side with Midjourney & Flux" pic.twitter.com/7b8XrjP2BI
Dogan Ural (@doganuraldesign), August 9, 2024
According to Google, Imagen 3 still has weaknesses, including tasks that require numerical reasoning, such as generating an exact number of objects. Prompts involving spatial reasoning and complex language also remain challenging.
Imagen 3 is available in the US via ImageFX.