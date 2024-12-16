AI in practice
Matthias Bastian

Google's Veo 2 outperforms OpenAI's Sora Turbo in head-to-head AI video generation tests

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.

Google unveiled two new AI models today: Veo 2 for video generation and Imagen 3 for images. According to human evaluators, both models set new benchmarks in their respective fields.


The new Veo 2 model can generate 4K videos and responds to specific filmmaking instructions, including different types of lenses and camera effects. Unlike previous models limited to short clips, Veo 2 videos can be "extended to minutes in length."

One improvement in Veo 2, Google says, is how it handles common AI generation problems. The model produces fewer "hallucinations" - weird artifacts like extra fingers or random objects that often pop up in AI-generated content. Google also says the model has gotten better at representing realistic physics.

The company ran direct comparison tests using 1,003 prompts from Meta's MovieGenBench dataset, with human raters evaluating eight-second clips at 720p resolution. In these head-to-head comparisons, Veo 2 came out ahead of its competitors, including OpenAI's Sora Turbo, both in overall video quality and in how well it followed the given instructions.

Figure: Two bar graphs comparing preference for Veo versus other AI video models (Meta, Kling, Minimax, Sora). Green: Veo preferred; white: no preference; pink: other model preferred. Left: overall preference; right: prompt adherence.
In the comparison charts, green bars indicate the percentage of times evaluators preferred Veo 2's output over its competitors. | Image: Google DeepMind

Despite these improvements, Google is upfront about Veo 2's limitations. The company admits that creating consistently realistic and dynamic videos remains a significant challenge. In particular, the model still struggles with complex scenes and motion sequences - suggesting there's still plenty of room for improvement in future versions.

For now, Google is taking a cautious approach with Veo 2's rollout. Veo 2 will be limited to select products including VideoFX, YouTube, and the Vertex AI platform. The system will expand to YouTube Shorts and other products in 2025. All videos generated by the system will include an invisible SynthID watermark identifying them as AI-generated.

Imagen 3 update from Google brings more vibrant AI images with better color balance and detail

Along with Veo 2, Google announced a major update to its image generation AI. The new Imagen 3 model produces more vibrant images with better color balance, thanks to several under-the-hood improvements.

Google says Imagen 3 can now handle a broader range of artistic styles. Whether you're looking for photorealistic images, impressionist paintings, abstract art, or anime-style illustrations, the model can adapt accordingly. The company also highlights Imagen 3's ability to create more detailed images with improved textures and finer elements.

Google is making Imagen 3 widely available through its ImageFX tool, launching in more than 100 countries. While users can already access Imagen through Google's Gemini Chat, the company hasn't announced when this platform will get the upgraded Imagen 3 model.

Summary
  • Google introduces Veo 2 and Imagen 3, two AI models for video and image generation that, according to human evaluators, set new benchmarks in their respective fields.
  • Veo 2 generates 4K videos, follows cinematographic instructions such as lens types and camera effects, and produces fewer unwanted artifacts. In direct comparison tests, Veo 2 outperforms competitors, including OpenAI's Sora Turbo.
  • Imagen 3 produces more vibrant images with better color balance and finer detail. The model also handles a broader range of artistic styles.
Sources
Google AI, Google DeepMind