
Google unveiled two new AI models today: Veo 2 for video generation and Imagen 3 for images. According to human evaluators, both models set new benchmarks in their respective fields.


The new Veo 2 model can generate 4K videos and responds to specific filmmaking instructions, including different types of lenses and camera effects. Unlike previous models, which were limited to short clips, Veo 2 produces videos that can be "extended to minutes in length."

One improvement in Veo 2, Google says, is how it handles common AI generation problems. The model produces fewer "hallucinations" - weird artifacts like extra fingers or random objects that often pop up in AI-generated content. Google also says the model has gotten better at representing realistic physics.

The company conducted direct comparison tests using 1,003 prompts from Meta's MovieGenBench dataset, with human raters evaluating eight-second video clips at 720p resolution. In these head-to-head comparisons, Veo 2 came out ahead of its competitors, including OpenAI's Sora Turbo, both in overall video quality and in how well it followed the given instructions.

Two bar graphs: Comparison of preference for Veo versus other AI video models (Meta, Kling, Minimax, Sora). Green: Veo preferred; white: no preference; pink: other model preferred. Left: overall preference; right: prompt adherence.
In the comparison charts, green bars indicate the percentage of times evaluators preferred Veo 2's output over its competitors. | Image: Google DeepMind
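
To make the three categories in the charts concrete, here is a minimal sketch of how pairwise rater votes could be tallied into percentages. The function name, the vote labels, and the sample votes are all hypothetical and chosen for illustration only; they are not Google's data or its evaluation code.

```python
from collections import Counter

def preference_shares(votes):
    """Return the percentage of votes for each outcome, rounded to one decimal.

    Each vote is one of the hypothetical labels "veo", "tie", or "other",
    mirroring the green, white, and pink segments in the charts.
    """
    total = len(votes)
    if total == 0:
        raise ValueError("no votes to tally")
    counts = Counter(votes)
    return {label: round(100 * counts[label] / total, 1)
            for label in ("veo", "tie", "other")}

# Made-up votes for illustration only, not results from Google's study.
sample_votes = ["veo"] * 5 + ["tie"] * 2 + ["other"] * 3
print(preference_shares(sample_votes))
```

In a setup like this, each bar in the chart would correspond to one competitor, with the three shares adding up to 100 percent.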

Despite these improvements, Google is upfront about Veo 2's limitations. The company admits that creating consistently realistic and dynamic videos remains a significant challenge. In particular, the model still struggles with complex scenes and motion sequences - suggesting there's still plenty of room for improvement in future versions.

For now, Google is taking a cautious approach with Veo 2's rollout. Veo 2 will be limited to select products including VideoFX, YouTube, and the Vertex AI platform. The system will expand to YouTube Shorts and other products in 2025. All videos generated by the system will include an invisible SynthID watermark identifying them as AI-generated.

Imagen 3 update from Google brings more vibrant AI images with better color balance and detail

Along with Veo 2, Google announced a major update to its image generation AI. The new Imagen 3 model produces more vibrant images with better color balance, thanks to several under-the-hood improvements.

Google says Imagen 3 can now handle a broader range of artistic styles. Whether you're looking for photorealistic images, impressionist paintings, abstract art, or anime-style illustrations, the model can adapt accordingly. The company also highlights Imagen 3's ability to create more detailed images with improved textures and finer elements.

Google is making Imagen 3 widely available through its ImageFX tool, launching in more than 100 countries. While users can already access Imagen through Google's Gemini Chat, the company hasn't announced when this platform will get the upgraded Imagen 3 model.

Summary
  • Google introduces Veo 2 and Imagen 3, two AI models for video and image generation that, according to human evaluators, set new benchmarks in their respective fields.
  • Veo 2 generates 4K resolution videos, understands cinematographic instructions, and produces fewer unwanted artifacts. In direct comparison tests, Veo 2 outperforms competitors, including OpenAI's Sora Turbo.
  • Imagen 3 offers enhanced color balance, more vibrant images, and improved detail through various technical advancements. The model has also been optimized to showcase different art styles.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.