Google launches Imagen 2.0 with image animations and opens access to Gemini 1.5 Pro

Google Cloud unveiled new models and features for Vertex AI at its Next conference, including public test versions of its LLM with the largest context window to date, Gemini Pro 1.5, and "living images" with Imagen 2.0.

Gemini 1.5 Pro, now available as a public test version in Vertex AI, offers up to one million token context windows, exceeding the largest commercially available context window of 200K in Claude 3 by a factor of five. However, these large context windows still have significant weaknesses in seamlessly processing input information.

The large context window enables native multimodal inference over large amounts of data. Google says customers can use it to develop new use cases such as AI-powered customer service agents and academic tutors, analyze large collections of complex financial documents, identify documentation gaps, and explore entire code bases or natural language data collections.

Video: Google Deepmind

To improve language model response accuracy, Google is expanding Vertex AI's grounding capabilities, including the ability to ground answers directly from Google search or enterprise data, giving users access to current, high-quality information and improving model response accuracy.

Grounding in specific data is also key to developing the next generation of AI agents that go beyond chat to proactively search for information and perform user tasks, the company said.

Google has also expanded Vertex AI's MLOps capabilities with a new prompt management service and large model evaluation tools to help companies move from experimentation to production faster, and customers can now limit ML processing to the U.S. or European Union when using Gemini 1.0 Pro and Imagen.

Imagen 2.0 animates images

The Imagen 2.0 family of image-generating models can now create short, four-second "live images" from prompts at 24 frames per second, with a resolution of 360x640 pixels. It is suitable for subjects such as nature, food and animals, and can produce a range of camera angles and movements while maintaining visual consistency, Google said.

Imagen 2.0 also offers advanced image editing features such as inpainting and outpainting to remove unwanted elements, add new ones, and extend image edges with text input.

Recommendation

AI in practice

The great AI scaling debate continues into 2025

The digital watermarking feature, based on Google's DeepMind SynthID, is now generally available, allowing customers to create invisible watermarks and verify images and live images generated by the Imagen model family.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Google launches Imagen 2.0 with image animations and opens access to Gemini 1.5 Pro

Imagen 2.0 animates images

The great AI scaling debate continues into 2025

Google unveils MedGemma, an open-source AI model suite for medical applications

Google adds three new Gemini-powered AI modes to Firebase Studio

Google launches image-to-video feature for Veo 3 in Gemini

AI coding can make developers slower even if they feel faster

Musk unveils Grok 4 as xAI’s new AI model that beats OpenAI and Google on major benchmarks

"Cat attack" on reasoning model shows how important context engineering is

Google launches Imagen 2.0 with image animations and opens access to Gemini 1.5 Pro

Imagen 2.0 animates images

Share

Bank details