Date: 2024-12-16
AI in practice
Jonathan Kemper

Google launches Whisk, an AI tool that combines multiple images for generation

Jonathan works as a freelance tech journalist for THE DECODER, focusing on AI tools and how GenAI can be used in everyday work.

Google Labs has released Whisk, its latest generative AI experiment, in the United States. Unlike traditional image generators, which mostly rely on text prompts, Whisk focuses on using images as the primary input method.


Users can either upload images directly into Whisk or generate them within the tool, specifying elements for the subject, scene, and style. The system allows users to mix and match these components and fine-tune the results with additional text prompts if needed.

Digital illustration: pink walrus with flower wreath and polka dot dress in a pastel-colored spring landscape with falling petals.
Image: Google

Behind the scenes, Google's language model (likely the recently released Gemini 2.0 Flash) automatically creates detailed descriptions of the input images. These descriptions then feed into Google's latest image generation model, Imagen 3, which captures the essential features of the subject rather than creating exact copies.
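
To make the flow described above concrete, here is a minimal Python sketch of a caption-then-generate pipeline. It is not Google's implementation or API: `WhiskInputs`, `build_prompt`, `run_pipeline`, and the prompt layout are hypothetical names invented for illustration, and the captioning and image models are passed in as plain callables so that no real service call is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WhiskInputs:
    """Hypothetical container for the three image slots the article names."""
    subject: str  # path or ID of the subject image
    scene: str    # path or ID of the scene image
    style: str    # path or ID of the style image


def build_prompt(captions: dict[str, str], extra_text: Optional[str] = None) -> str:
    """Merge the per-slot captions into one text prompt (layout is an assumption)."""
    parts = [
        f"Subject: {captions['subject']}",
        f"Scene: {captions['scene']}",
        f"Style: {captions['style']}",
    ]
    if extra_text:
        parts.append(f"Additional instructions: {extra_text}")
    return "\n".join(parts)


def run_pipeline(
    inputs: WhiskInputs,
    describe_image: Callable[[str], str],   # stand-in for the language model
    generate_image: Callable[[str], object],  # stand-in for the image model
    extra_text: Optional[str] = None,
    edit_prompt: Optional[Callable[[str], str]] = None,
) -> object:
    """Caption each input image, build a prompt, optionally edit it, then generate."""
    captions = {
        slot: describe_image(getattr(inputs, slot))
        for slot in ("subject", "scene", "style")
    }
    prompt = build_prompt(captions, extra_text)
    # Optional hook for a user who wants to review and change the prompt first.
    if edit_prompt is not None:
        prompt = edit_prompt(prompt)
    return generate_image(prompt)
```

Because only text descriptions reach the image model in this sketch, details the captions omit cannot be reproduced in the output, which is one way to picture why results capture the gist of a subject rather than an exact copy.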

Creative tool, not a perfect copier

Since Whisk only pulls out a few key elements from each source image, Google warns that the results might not match what users expect. The generated images could come out with different heights, weights, hairstyles, or skin tones than the originals. Google knows these details can make or break a project, so it lets users see and edit the text prompts that drive the image generation process.

Digital illustration: mystical cat with galaxy-like fur rests on lily pad, surrounded by water lilies and magical lights.
Image: Google

Early testers, including artists and creative professionals, say Whisk feels more like a new kind of creative tool than a standard image editor. Google built it for quick visual brainstorming rather than pixel-perfect editing, letting users quickly generate and sort through dozens of options before saving their favorites.

According to The Verge's hands-on testing, while Whisk is enjoyable to use, you'll need to wait a few seconds for each new image to generate. These delays could be temporary, though, likely caused by high traffic as people rush to try the new tool.

Where to try Whisk

Right now, Google is only letting users in the United States test Whisk. If you're in the US, you can try it for free at labs.google/whisk and share your feedback. Users outside the US won't be able to access the tool.

Whisk lives in Google Labs, the company's testing ground for experimental AI projects. This is where Google tries out practical applications for its AI models like Gemini, Imagen, and Veo, including its latest video model, Veo 2.

While most projects stay in the experimental phase, a few graduate to become full products - like NotebookLM, the company's AI research assistant that recently made the jump to general release.

Summary
  • Google Labs has launched a new generative AI experiment called Whisk in the USA, which allows users to create images primarily using visual inputs for subject, scene, and style, rather than relying on lengthy text prompts.
  • Whisk utilizes Google's language model to automatically generate a detailed description of the input images, which is then processed by Google's Imagen 3 image generation model to capture the essence of the subject without creating an exact replica.
  • Initial tests suggest that Whisk is a unique creative tool for rapid visual design. Users in the USA can now access it for free at labs.google/whisk, although Google is currently restricting access from other countries.
Sources
Google