Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.

Content Summary

ByteDance introduces Seedream 3.0, a new text-to-image model. Benchmarks suggest improvements over GPT-4o and Midjourney in speed, accuracy, and visual quality.

ByteDance has released Seedream 3.0, a new text-to-image generation model that, according to internal and external evaluations, outperforms its predecessor Seedream 2.0 and rivals or exceeds the quality of current systems like GPT-4o, Midjourney v6.1, and Imagen 3.

The model was trained on twice as much data as Seedream 2.0. This includes images previously excluded from training because of visual defects; the defective regions are now masked during preprocessing. New training techniques, such as resolution-aware sampling and mixed-resolution training, aim to improve output fidelity across different image sizes. Seedream 3.0 supports native 2K resolution and can generate a 1K image in approximately three seconds.

Seedream 3.0 ranks ahead of GPT-4o in image quality benchmarks

In benchmarks such as the Artificial Analysis Arena, where users compare outputs from different models, Seedream 3.0 ranked first when its paper was released. It now sits one point behind GPT-4o (Arena Elo 1156 vs. 1157). The model performs especially well on text-heavy prompts, achieving a text rendering rate of 94 percent in both English and Chinese, even with dense typography.
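
To put the one-point gap in perspective, the short sketch below converts the two Arena ratings quoted above into an expected head-to-head win share. It assumes the classic Elo logistic formula with a 400-point scale; the Arena's exact rating method is not described here and may differ, and the function name `expected_score` is purely illustrative.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the classic Elo logistic model.

    Assumption: standard Elo with a 400-point scale, not necessarily
    the exact method used by the Artificial Analysis Arena.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings as quoted in the article.
seedream_rating = 1156
gpt4o_rating = 1157

share = expected_score(seedream_rating, gpt4o_rating)
print(f"Expected Seedream 3.0 win share vs. GPT-4o: {share:.4f}")  # 0.4986
```

Under that assumption, a one-point difference corresponds to an expected win share of roughly 49.9 percent versus 50.1 percent, so the two models are effectively tied in user preference.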

To support this performance, the model was trained on datasets with detailed aesthetic and stylistic descriptions. According to ByteDance, the results not only outperform GPT-4o but can also surpass design-focused platforms like Canva in tasks such as poster and sticker creation. These comparisons focus on typography quality and the integration of text within images.

For photorealistic portraits, ByteDance claims Seedream 3.0 also outperforms Midjourney v6.1. According to the company, the model produces more realistic skin textures and finer details, including wrinkles and hair, while avoiding the overly smooth appearance typical of many AI-generated portraits. Unlike some competing systems, Seedream 3.0 does not require post-processing upscaling to produce high-resolution images.

SeedEdit enhances in-image editing capabilities

ByteDance has also introduced SeedEdit, a companion tool to Seedream that enables both image and text editing within generated visuals. According to the company, SeedEdit performs better than GPT-4o and Gemini 2.0 Flash in making precise edits without compromising the overall identity of the image. The system reportedly achieves more accurate results with fewer visible artifacts in tasks such as text removal, replacement, or insertion.

The Seedream 3.0 paper includes numerous visual comparisons with outputs from other models, which appear to support ByteDance’s claims. While the examples shown represent favorable use cases, the model appears competitive at the highest level. ByteDance plans to integrate Seedream 3.0 into its chatbot platform Doubao.

