A team of ex-Googlers is positioning itself as a competitor to Midjourney with a new text-to-image system.
Ideogram AI is doing what Google isn't: turning the search giant's high-quality generative AI research into a product. The startup has raised $16.5 million in seed funding from investors including a16z and Index Ventures.
Ideogram's team includes alumni of Google Brain, UC Berkeley, Carnegie Mellon University, and the University of Toronto. They have worked on projects such as Google's image AI Imagen and Imagen Video, as well as many other AI technologies.
Ideogram v0.1 enters beta phase
With "Ideogram v0.1", Ideogram AI now presents the first beta version of its text-to-image software, which runs directly in the browser. Midjourney, by contrast, uses the chat software Discord as its interface.
The Ideogram community has already generated plenty of images, which can be viewed on the platform along with their prompts. Like Midjourney, Ideogram AI supports many styles, from photorealistic to fantastically abstract.
At first glance, the platform seems to be at least on par with Midjourney in terms of variety, accuracy, and level of detail. Another strength is rendering legible text within images, a capability that Google has demonstrated in its text-to-image prototypes and that existing text-to-image systems largely lack.
It's also interesting that Ideogram AI treats the web platform as a social network with profiles and handles. Users can take another person's generation, both image and prompt, and create a new image from it. This could further simplify image creation.
Details on the technology used and the pricing model are not yet known. Interested users can sign up for a waiting list, but will need a Google login. You can also find more images on Twitter by searching for the hashtag #ideogram.
Google and commercial generative AI systems - it's complicated
In May 2022, Google presented Imagen, a capable text-to-image system that significantly outperformed OpenAI's DALL-E 2. It was followed by Parti, Re-Imagen, and Muse, which could generate images that were even more detailed, produced faster, and very closely matched to the prompts.
One special characteristic is the ability to render text accurately, where most other text-to-image AI systems fail. Google's systems can take text from a prompt and place it as legible lettering in the image. The Ideogram model also shows this capability.
However, Google has not yet succeeded in turning one of its text-to-X research projects into a commercial product comparable to Midjourney or DALL-E. There is a tentative beta test with Imagen in the US, but that's it so far.
Presumably, the business is too small for the search giant, and the technology is more likely to be integrated into existing software such as Android image processing. Ideogram, like Midjourney, could successfully position itself in the image creation niche.