Update
  • Added Veo 3 Fast and image-to-video API integration

Update from July 31, 2025:

Google has rolled out the Veo 3 Fast version and image-to-video support through its API. According to Google, Veo 3 Fast is engineered for speed and cost efficiency, aimed at developers who need to iterate quickly or generate content at scale, such as for programmatic advertising or rapid A/B testing. Google says Veo 3 Fast still delivers "high quality."

Both Veo 3 and Veo 3 Fast accept text and image prompts, generate videos at 720p and 24 fps, and output eight-second clips by default, with one video per request. They share the same technical specs, including a maximum of 1,024 tokens per text input and native audio generation.

Veo 3 Fast is priced at $0.40 per second of video with audio, while standard Veo 3 costs $0.75 per second. That makes the standard model 87.5 percent more expensive, or, put the other way, Veo 3 Fast roughly 47 percent cheaper. An eight-second video clip runs $3.20 with Veo 3 Fast or $6.00 with Veo 3, and a five-minute video costs $120 with Veo 3 Fast compared to $225 with Veo 3. Google doesn't specify exactly how the two models differ in output quality.
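
The price comparison above can be reproduced with a few lines of Python. This is only a sketch built on the per-second rates quoted in this article. The dictionary keys `veo-3` and `veo-3-fast` are illustrative labels chosen here, not official API model identifiers, and the helper names are hypothetical.

```python
from decimal import Decimal

# Per-second prices in USD for video with audio, as quoted in the article.
# The keys are illustrative labels, not official API model identifiers.
PRICE_PER_SECOND = {
    "veo-3": Decimal("0.75"),
    "veo-3-fast": Decimal("0.40"),
}


def clip_cost(model: str, seconds: int) -> Decimal:
    """Return the cost in USD of generating `seconds` of video with `model`."""
    return PRICE_PER_SECOND[model] * seconds


def percent_premium(cheap: Decimal, expensive: Decimal) -> Decimal:
    """How much more the expensive option costs, relative to the cheap one."""
    return (expensive - cheap) / cheap * 100


def percent_discount(cheap: Decimal, expensive: Decimal) -> Decimal:
    """How much less the cheap option costs, relative to the expensive one."""
    return (expensive - cheap) / expensive * 100


if __name__ == "__main__":
    fast = PRICE_PER_SECOND["veo-3-fast"]
    standard = PRICE_PER_SECOND["veo-3"]
    for model in PRICE_PER_SECOND:
        print(model, "8 s:", clip_cost(model, 8), "| 300 s:", clip_cost(model, 300))
    print("standard premium over fast (%):", percent_premium(fast, standard))
    print("fast discount vs. standard (%):", round(percent_discount(fast, standard), 1))
```

The two percentage helpers show why the same price gap can be quoted as either 87.5 percent or about 47 percent, depending on which model serves as the baseline.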

Image-to-Video Feature

A new image-to-video feature is now available in both Veo 3 and Veo 3 Fast via the API. Users can combine a single image with a text prompt to generate dynamic videos with audio. According to Google, this feature helps maintain stylistic consistency and allows for more precise control over movement, narrative structure, and sound through the prompt.

Integration happens through the same Gemini API as before. Google says videos generated from images are billed at the same rate as text-to-video outputs for each model. These new features are available now in a paid preview through its API. Developers can use the API documentation and the Veo Cookbook to build their own applications.

Article from July 17, 2025:

Google's Veo 3 video generation model launches on Gemini API with a hefty price tag

Google’s Veo 3 video generation model is now available through the Gemini API, with a price point that puts it among the more expensive options for AI video.

The Gemini API integration targets developers looking to bring advanced video generation into their own apps or build production-ready prototypes. For now, the API is limited to text-to-video, but image-to-video support—already live in the Gemini app—is on the way. Veo 3 is Google’s first model that can generate high-resolution video and synchronized audio from a single text prompt. It creates visuals, dialog, music, and sound effects all at once.

To help developers get started, Google AI Studio offers an SDK template and a starter app for quick prototyping. Access requires an active Google Cloud project with billing enabled. Google says Veo 3 has already been used millions of times across the Gemini app, Flow, and Vertex AI.

$0.75 per second for video with audio

Veo 3 access through the Gemini API is only available on Google Cloud’s paid tier. Pricing is $0.75 per second for 720p, 24fps video with audio in 16:9 format, which is 25 cents more per second than Veo 2, which did not include sound. Google has also announced a "Veo 3 Fast" mode that’s both faster and cheaper, but it’s not yet available through the API.

At current rates, an eight-second video costs $6, and a five-minute video costs $225. Because generating the perfect result often takes multiple tries, costs can rise quickly. For example, if you need ten times as much footage to end up with five minutes of usable video, the total cost could reach $2,250. Still, Google is likely betting that for some use cases, this might be cheaper than traditional video production.
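
To make the retry arithmetic concrete, here is a rough sketch of how such a budget could be estimated. It assumes the $0.75 per-second rate quoted above and a fixed eight-second clip per request. The function names and the `footage_multiplier` parameter are hypothetical and exist only for illustration.

```python
import math
from decimal import Decimal

VEO3_PRICE_PER_SECOND = Decimal("0.75")  # USD per second, as quoted in the article
CLIP_SECONDS = 8  # assumed fixed clip length per request


def footage_cost(usable_seconds: int, footage_multiplier: int,
                 price: Decimal = VEO3_PRICE_PER_SECOND) -> Decimal:
    """Cost in USD when `footage_multiplier` seconds are generated per usable second."""
    return price * usable_seconds * footage_multiplier


def requests_needed(usable_seconds: int, footage_multiplier: int,
                    clip_seconds: int = CLIP_SECONDS) -> int:
    """Number of fixed-length clips needed to cover the generated footage."""
    total_seconds = usable_seconds * footage_multiplier
    return math.ceil(total_seconds / clip_seconds)


if __name__ == "__main__":
    five_minutes = 5 * 60
    print("first-try cost:", footage_cost(five_minutes, 1))
    print("cost at 10x footage:", footage_cost(five_minutes, 10))
    print("requests at 10x footage:", requests_needed(five_minutes, 10))
```

Because five minutes is not a whole multiple of eight seconds, covering it with fixed eight-second clips takes 38 requests even when every clip is usable on the first try, so the real bill would land slightly above the per-second estimate.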

Real-world examples

Google says Cartwheel uses Veo 3 to turn 2D videos into realistic 3D character animations, mapping the generated movements onto rigged models for client projects.

Game studio Volley uses Veo 3 to create cutscenes for its role-playing game "Wit's End", allowing developers to quickly experiment with new story ideas and visuals. So far, these examples point to fairly specialized use cases, which could suggest that Google doesn't have larger integrations to highlight yet. It's also possible that some companies are using Veo 3 behind the scenes but aren't ready to go public.

Summary
  • Google has made its Veo 3 AI video model available through the Gemini API, allowing users to generate high-resolution videos with synchronized soundtracks, including dialog, music, and sound effects, from text descriptions.
  • Access to Veo 3 via the Gemini API is priced at $0.75 per second for 720p videos with audio under the "Paid Tier."
  • Developers can integrate Veo 3 into their own apps, with Google AI Studio offering an SDK template and a starter application to help with implementation.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.