Luma AI's new Uni-1 image model tops Nano Banana 2 and GPT Image 1.5 on logic-based benchmarks

Image: Luma AI

Luma AI introduces Uni-1, its first model to combine image understanding and image generation in a single architecture.

Like Google's Nano Banana Pro and OpenAI's GPT Image 1.5, Uni-1 is built on an autoregressive transformer, a type of AI model that generates content token by token in sequence instead of refining random noise into an image the way traditional diffusion models do. Text and images share the same processing pipeline.

Luma says the model can reason through prompts before and during generation, breaking down complex instructions and planning out scenes. This approach typically leads to much more accurate prompt following, and Uni-1 is no exception. It can, for example, take several photos and merge them into an entirely new composition.

Multiple ordinary pet photos were combined into a single AI-generated scene showing a dog, cat, and Boston Terrier wearing academic regalia in front of a whiteboard with scientific diagrams and the Luma AI logo.
Multiple ordinary pet photos were combined into the scene above. Prompt: "Combine the black and white curly-haired dog with pink bandana, the Boston Terrier in plaid harness, and the black-and-white cat into a single scene where they are dressed in academic regalia, standing before a whiteboard filled with scientific diagrams and text, with the Luma AI logo placed in the top-left corner." | Image: Luma

Beyond basic generation, Luma says Uni-1 can refine subjects across multiple conversation turns while keeping context intact, convert images into over 76 art styles, accept sketches and visual instructions as input, and transfer identities, poses, and compositions into new images from reference photos. In one demo, the model generated an entire sequence from a single reference image, gradually aging a pianist from childhood to old age.

Screenshot of the Luma AI website showing six keyframes from an AI-generated image sequence: a boy at a piano ages through the stages of child, teenager, newlywed, young parent, middle age, and elderly. The corresponding text prompt and a description of the fifth keyframe are shown alongside.
From a single reference image, Uni-1 generates a sequence showing a pianist aging from childhood to old age, keeping the same camera angle and a consistent scene throughout. | Image: Luma AI

According to Luma, Uni-1 scores highest on the RISEBench test for logic-based image processing, narrowly beating both Nano Banana 2 and GPT Image 1.5. Luma also says the image generation capability boosts the model's visual understanding: in object recognition, for instance, it nearly matches Google's Gemini 3 Pro. The model supports multiple languages.

Bar chart showing RISEBench benchmark results for Uni-1, Nano Banana 2, Nano Banana Pro, GPT Image 1.5, GPT Image, and Qwen-Image-2 across the categories Overall, Causal, Spatial, Temporal, and Logical. Uni-1 achieves the highest overall score of 0.51.
Uni-1 tops the overall RISEBench ranking, just ahead of Nano Banana 2 and GPT Image 1.5, the current image model powering ChatGPT. | Image: Luma AI

Uni-1 will soon be available through Luma Agents, a newly launched creative assistant, and the Luma API. No pricing has been announced yet.

Source: Lumalabs