Date: 2025-08-26

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.

Google DeepMind is adding a new image editing model to the Gemini app that can make dramatic changes to photos on demand while keeping people and animals recognizable.

The new "Gemini 2.5 Flash Image Generation" model builds on Gemini's earlier native image generation tools but delivers much sharper prompt handling. Google says it often outperforms the GPT-4o model used in ChatGPT, especially when it comes to following text prompts for image edits. While many pure image models still struggle with prompt accuracy, Gemini 2.5 Flash gets it right more often.

A key feature is "character consistency": the model can keep a person, animal, or object visually consistent across multiple images, even as poses, backgrounds, or lighting change.

This opens up new possibilities for creating image series or product shots from multiple angles. Google says the model is ideal for generating consistent brand assets and product catalogs, and claims Gemini 2.5 Flash outperforms other image systems on a wide range of editing tasks.

The model also supports precise, localized edits through text prompts. Users can blur backgrounds, remove blemishes, add colors, or erase entire objects without manual selection. A template app called "PixShop" shows off these editing features with a simple interface and prompt controls.

Image composition, style transfer, and real-world reasoning

Gemini 2.5 Flash can blend up to three images at once. For example, you can combine a product photo and a room photo to create a realistic interior scene. Complex compositions with several elements can be generated from a single prompt. Google also offers an interactive canvas tool for multi-image fusion.

The model handles style transfer too, moving patterns, colors, or textures from one object to another while keeping the shape and details intact. Typical examples include dresses with butterfly patterns or boots with floral textures.

Gemini 2.5 Flash can also visualize simple cause-and-effect, which Google calls "real-world reasoning." In one demo, the model generates an image of a balloon drifting toward a cactus, then another image showing what happens next.

These semantic features draw on Gemini 2.5's world knowledge, Google says. You can try them yourself using a painting app that follows text instructions.

Available to users and developers

The Gemini 2.5 Flash image tools are now available in the Gemini app. To use the new features, don't select the "Imagen" image model in the chat bar; instead, switch to the "Flash" language model at the top left. The setup might be a little confusing at first, but it makes sense given Gemini's language-based editing approach.

After picking the right model, you can upload an image and give Gemini editing instructions. Every image comes with both a visible watermark and an invisible SynthID digital watermark.

Gemini 2.5 Flash Image is also available in preview through the Gemini API, Google AI Studio, and Vertex AI. Pricing is $30 per million output tokens. Each image uses about 1,290 tokens, or roughly $0.039 per image, the same as Gemini 2.0 Flash Image.
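The per-image figure follows directly from the two numbers quoted above. The short Python sketch below reproduces that arithmetic; the function names and the budget helper are illustrative assumptions, not part of any Google SDK, and the token count is the approximate value Google cites.

```python
from decimal import Decimal

# Figures quoted in the article (preview pricing; may change).
PRICE_PER_MILLION_OUTPUT_TOKENS = Decimal("30")  # USD
TOKENS_PER_IMAGE = 1290  # approximate output tokens per image


def cost_per_image(tokens=TOKENS_PER_IMAGE,
                   price_per_million=PRICE_PER_MILLION_OUTPUT_TOKENS):
    """Return the USD cost of one image at the given token count."""
    return price_per_million * tokens / Decimal(1_000_000)


def images_within_budget(budget_usd):
    """Hypothetical helper: whole images affordable for a USD budget."""
    return int(Decimal(str(budget_usd)) / cost_per_image())


if __name__ == "__main__":
    print(cost_per_image())          # 0.0387, which rounds to $0.039
    print(images_within_budget(10))  # 258
```

Using `Decimal` rather than floats keeps the result exact, which matters when small per-image costs are multiplied across large batches.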

