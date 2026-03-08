

Luma AI introduces Uni-1, its first model to combine image understanding and image generation in a single architecture.

Like Google's Nano Banana Pro and OpenAI's GPT Image 1.5, Uni-1 is built on an autoregressive transformer, a model that generates content token by token in sequence rather than gradually denoising an image from random noise, as traditional diffusion models do. Text and images share the same processing pipeline.
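To make "token by token" concrete, here is a minimal Python sketch of an autoregressive decoding loop. It is not Luma's implementation: the scoring function `toy_next_token_scores` is a hypothetical stand-in for a trained transformer, the greedy selection rule is an assumption (real systems usually sample), and `to_grid` only illustrates how a flat sequence of image tokens could be arranged into rows.

```python
from typing import Callable, List, Sequence


def toy_next_token_scores(prefix: Sequence[int], vocab_size: int) -> List[float]:
    """Hypothetical stand-in for a transformer.

    Scores every token in the vocabulary given the tokens so far.
    This toy version simply prefers (last token + 1) mod vocab_size.
    """
    target = (prefix[-1] + 1) % vocab_size if prefix else 0
    return [1.0 if tok == target else 0.0 for tok in range(vocab_size)]


def generate(
    prompt: Sequence[int],
    num_new_tokens: int,
    vocab_size: int,
    score_fn: Callable[[Sequence[int], int], List[float]] = toy_next_token_scores,
) -> List[int]:
    """Greedy autoregressive decoding: each new token depends on all earlier ones."""
    tokens = list(prompt)
    for _ in range(num_new_tokens):
        scores = score_fn(tokens, vocab_size)
        # Pick the highest-scoring token (greedy choice, an assumption here).
        next_token = max(range(vocab_size), key=scores.__getitem__)
        tokens.append(next_token)
    return tokens


def to_grid(tokens: Sequence[int], width: int) -> List[List[int]]:
    """Arrange a flat token sequence into rows, as an image tokenizer's layout might."""
    return [list(tokens[i:i + width]) for i in range(0, len(tokens), width)]


if __name__ == "__main__":
    sequence = generate(prompt=[2], num_new_tokens=4, vocab_size=4)
    image_tokens = sequence[1:]  # drop the prompt token
    print(to_grid(image_tokens, width=2))
```

The key point is the loop: every step feeds the full sequence so far back into the model, which is what lets text tokens and image tokens share one pipeline. A diffusion model, by contrast, would refine all pixels of the image at once over several denoising steps.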

Luma says the model can reason through prompts before and during generation, breaking down complex instructions and planning out scenes. This approach typically leads to much more accurate prompt following, and Uni-1 is no exception. It can, for example, take several photos and merge them into an entirely new composition.

Beyond basic generation, Luma says Uni-1 can refine subjects across multiple conversation turns while keeping context intact, convert images into over 76 art styles, accept sketches and visual instructions as input, and transfer identities, poses, and compositions into new images from reference photos. In one demo, the model generated an entire sequence from a single reference image, gradually aging a pianist from childhood to old age.

According to Luma, Uni-1 scores highest on the RISEBench test for logic-based image processing, narrowly beating both Nano Banana 2 and GPT Image 1.5. The image generation capability also boosts the model's visual understanding. In object recognition, for instance, it nearly matches Google's Gemini 3 Pro. The model supports multiple languages.

Uni-1 will soon be available through Luma Agents, a newly launched creative assistant, and the Luma API. No pricing has been announced yet.