Black Forest Labs has introduced Flux 2, a new family of image generation models that can handle high-resolution output up to four megapixels, process multiple reference images at once, and use a hybrid architecture powered by a vision language model.

The lineup includes options for a wide range of use cases, from API-only access to fully open weights. One of the main upgrades, according to the company, is its new multi-reference system.

Users can feed in up to ten reference images at the same time to keep characters, products, or visual styles consistent across generations. Flux 2 also supports creating and editing images at up to four megapixels.
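The two limits mentioned above can be expressed as a simple pre-flight check. This is a minimal sketch, not part of any Black Forest Labs SDK: `GenerationRequest` and `validate_request` are hypothetical names, and reading "four megapixels" as 4 × 1024 × 1024 pixels (so that 2048×2048 just fits) is an assumption, since the exact pixel budget is not stated.

```python
from dataclasses import dataclass

# Limits taken from the article. The exact pixel budget is an assumption:
# "four megapixels" is read here as 4 * 2**20 pixels.
MAX_REFERENCE_IMAGES = 10
MAX_PIXELS = 4 * 1024 * 1024


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical request container, not a BFL API type."""
    width: int
    height: int
    reference_images: tuple = ()  # e.g. file paths or URLs


def validate_request(req: GenerationRequest) -> list:
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if req.width <= 0 or req.height <= 0:
        problems.append("width and height must be positive")
    elif req.width * req.height > MAX_PIXELS:
        problems.append(
            f"{req.width}x{req.height} is {req.width * req.height} pixels, "
            f"above the assumed budget of {MAX_PIXELS}"
        )
    if len(req.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"{len(req.reference_images)} reference images given, "
            f"at most {MAX_REFERENCE_IMAGES} allowed"
        )
    return problems
```

Under these assumptions, a 2048×2048 request with three references passes, while a 4096×4096 request or one with eleven references is flagged before anything is sent to a model.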

The model's text rendering has also been reworked. It now aims to generate more reliable typography, infographics, and UI mockups. Black Forest Labs says prompt adherence has improved as well, especially for structured instructions and complex compositions.

All Flux 2 models support text-based editing and multi-reference input in a single system. | Image: Black Forest Labs

Hybrid architecture with Mistral vision language model

Flux 2 combines two core components. A vision-language model, "Mistral-3 24B," interprets both text and image inputs, while a second module ("Rectified Flow Transformer") handles the logical layout and ensures that details like shapes and materials appear correctly.

Flux 2 also uses a VAE (variational autoencoder) to compress images into a compact latent representation and reconstruct them with little loss of quality. These components work together to let the model create new content or edit existing images. A technical report is available.

Cost-performance comparison: Flux 2 variants score high on Elo benchmarks while keeping inference costs low. BFL positions the system as a cost-efficient alternative to Google's Nano Banana. | Image: BFL

Four models for different users

The Flux 2 family includes four main versions, each tuned for different performance needs and levels of control:

  • Flux 2 [pro]: The highest-quality model, intended to match leading closed-source systems. It is available through the BFL Playground, the BFL API, and launch partners.
  • Flux 2 [flex]: Designed for developers who want to adjust parameters like step count or guidance scale to trade speed for quality. It is also available through the Playground and API.
  • Flux 2 [dev]: A 32-billion-parameter model released with open weights. It unifies text-to-image generation and image editing in a single checkpoint. Weights are on Hugging Face, and reference code is on GitHub. An fp8-optimized build created with NVIDIA and ComfyUI runs efficiently on consumer GPUs such as the GeForce RTX series. API access is available through providers including FAL, Replicate, Runware, Verda, TogetherAI, Cloudflare, and DeepInfra. Commercial use requires a license obtained through the BFL website.
  • Flux 2 [klein]: A distilled, not-yet-released model that will be open-sourced under Apache 2.0. It aims to outperform other models of similar size. Interested users can join the beta.

Flux 2 arrives just one week after Google's Nano Banana Pro, one of the most discussed image models of recent years, making comparisons unavoidable. Even so, Flux 2 handles the following highly constrained test prompt surprisingly well:

A hyper-realistic DSLR photo. A monkey holding a pink banana is sitting on a tiger in the foreground. In the background, a HORSE is RIDING AN ASTRONAUT. The astronaut is underneath like a living "spacesuit horse saddle," and the HORSE is clearly on top, in control, as the rider. Make it 100% unambiguous: the HORSE is the rider and the ASTRONAUT is being ridden, NOT the other way around. High-resolution, sharp focus, realistic lighting.

Flux 2 prompted by THE DECODER / Best result out of two generations
NBPro prompted by THE DECODER / Best result out of two generations
Sora prompted by THE DECODER / Best result out of two generations
Midjourney prompted by THE DECODER / Best result out of four generations
Summary
  • Black Forest Labs has introduced Flux 2, a new series of image generation models capable of creating high-resolution images up to four megapixels.
  • The models allow users to include up to ten reference images for maintaining consistent characters, products, or styles, and they also feature improved text rendering.
  • Flux 2 comes in four versions: a high-end model, a developer edition, an open-weights model, and an efficient open-source version that will be released soon.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.