Image AI startup Recraft has released a new text-to-image model that sets new performance standards according to independent testing.
Recraft says its latest model, v3, excels at generating text within images, maintaining anatomical accuracy, understanding prompts, and producing high-quality visuals. A key advancement is the ability to correctly render long passages of text in a single generation, while other models typically struggle with anything more than a few words.
The model takes first place in Hugging Face's text-to-image benchmark with an Elo score of 1172, outperforming recent competitors Flux and Ideogram. In this benchmark, users compare pairs of images from different models in blind tests, and the results feed a chess-style Elo rating system.
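For readers unfamiliar with Elo ratings, the following is a minimal sketch of how a single blind comparison could shift two models' scores. It uses the conventional chess formula; the K-factor of 32, the 400-point scale, and the function names are assumptions for illustration, since the benchmark's actual parameters are not stated here.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float,
                   score_a: float, k: float = 32.0) -> tuple[float, float]:
    """Return new ratings after one comparison.

    score_a is 1.0 if A's image was preferred, 0.0 if B's was,
    and 0.5 for a tie. k (assumed value) controls how far a single
    vote can move a rating.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Whatever A gains, B loses, so the total rating pool is conserved.
    return rating_a + delta, rating_b - delta


# Two equally rated models: the winner of one vote gains k/2 points.
new_a, new_b = update_ratings(1000.0, 1000.0, score_a=1.0)
print(new_a, new_b)
```

Under this scheme, a win against a higher-rated model moves the score more than a win against a lower-rated one, which is why a leaderboard built from many blind votes can rank models that were never compared directly.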
Controlling the uncontrollable
Recraft v3 introduces precise control features aimed at professional designers. Users can specify exact text placement and dimensions within images, and use multiple reference images to maintain consistent brand styles. These adjustments don't require retraining of models, according to Recraft.
The model also generates vector graphics ranging from simple icons to complex illustrations. Additional features include AI-based image editing tools such as AI eraser, modify area, inpaint, outpaint, AI mockuper, creative and clarity upscaler, AI fine-tuning, and background remover.
The web interface offers 50 free credits daily, with a basic plan providing 1,000 credits for €10 per month. Recraft also provides an API for developers and enterprise customers.
Midjourney and OpenAI could update their models soon
Meanwhile, AI image champion Midjourney is preparing to release its v7 model. While Midjourney's current version v6.1 produces highly regarded aesthetic results, it falls behind in prompt following and text rendering capabilities compared to more recent models. The company recently added a robust image editor that works with uploaded images.
OpenAI CEO Sam Altman teased an upcoming update to DALL-E 3 or a new image tool at a recent OpenAI event in London. The company's new multimodal GPT-4o can already generate high-quality images with precise prompt following, and demos show capabilities beyond DALL-E 3, although these features haven't been released yet. OpenAI may be waiting so that its image generation doesn't interfere with the U.S. election, or it may simply lack the necessary computing power to bring this feature to market.