
Google’s Veo 3 video generation model is now available through the Gemini API, with a price point that puts it among the more expensive options for AI video.

The Gemini API integration targets developers looking to bring advanced video generation into their own apps or build production-ready prototypes. For now, the API is limited to text-to-video, but image-to-video support—already live in the Gemini app—is on the way. Veo 3 is Google’s first model that can generate high-resolution video and synchronized audio from a single text prompt. It creates visuals, dialog, music, and sound effects all at once.

To help developers get started, Google AI Studio offers an SDK template and a starter app for quick prototyping. Access requires an active Google Cloud project with billing enabled. Google says Veo 3 has already been used millions of times across the Gemini app, Flow, and Vertex AI.

$0.75 per second for video with audio

Veo 3 access through the Gemini API is only available on Google Cloud’s paid tier. Pricing is $0.75 per second for 720p, 24 fps video with audio in 16:9 format, which is 25 cents more per second than Veo 2, which did not include sound. Google has also announced a "Veo 3 Fast" mode that’s both faster and cheaper, but it’s not yet available through the API.

At current rates, an eight-second video costs $6, and a five-minute video costs $225. Because getting a usable result often takes multiple tries, costs can rise quickly. For example, if you have to generate ten times as much footage to end up with five minutes of usable video, the total cost could reach $2,250. Still, Google is likely betting that for some use cases, this is cheaper than traditional video production.
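
The arithmetic behind these figures is easy to reproduce for other clip lengths. The following is a minimal sketch, not an official pricing tool: the function name `generation_cost` and the `footage_multiplier` parameter are hypothetical names chosen for illustration, and the only input taken from the article is the $0.75 per second rate.

```python
from decimal import Decimal

# Rate quoted in the article for 720p, 24 fps, 16:9 video with audio.
VEO3_USD_PER_SECOND = Decimal("0.75")


def generation_cost(usable_seconds, footage_multiplier=1, rate=VEO3_USD_PER_SECOND):
    """Return the USD cost of generating enough footage to keep `usable_seconds`.

    `footage_multiplier` is a hypothetical planning factor: how many seconds
    are generated, on average, for every second of footage that is kept.
    """
    if usable_seconds < 0 or footage_multiplier < 1:
        raise ValueError("usable_seconds must be >= 0 and footage_multiplier >= 1")
    # Convert via str() so float inputs do not carry binary rounding noise.
    generated_seconds = Decimal(str(usable_seconds)) * Decimal(str(footage_multiplier))
    return generated_seconds * rate


print(generation_cost(8))        # eight-second clip
print(generation_cost(300))      # five minutes, every take usable
print(generation_cost(300, 10))  # five minutes, ten times as much footage generated
```

The three calls correspond to the $6, $225, and $2,250 figures above. `Decimal` is used instead of floats so that dollar amounts are not affected by binary rounding.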

Real-world examples

Google says Cartwheel uses Veo 3 to turn 2D videos into realistic 3D character animations, mapping the generated movements onto rigged models for client projects.

Game studio Volley uses Veo 3 to create cutscenes for its role-playing game "Wit's End", allowing developers to quickly experiment with new story ideas and visuals. So far, these examples point to fairly specialized use cases, which could suggest that Google doesn't have larger integrations to highlight yet. It's also possible that some companies are using Veo 3 behind the scenes but aren't ready to go public.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Google has made its Veo 3 AI video model available through the Gemini API, allowing users to generate high-resolution videos with synchronized soundtracks, including dialog, music, and sound effects, from text descriptions.
  • Access to Veo 3 via the Gemini API is priced at $0.75 per second for 720p videos with audio under the "Paid Tier."
  • Developers can integrate Veo 3 into their own apps, with Google AI Studio offering an SDK template and a starter application to help with implementation.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.