About 100 hours after the initial launch of Veo 3, Google is opening up its AI video model to users in 71 additional countries.
The news came from Josh Woodward, Vice President of Gemini at Google, who made the announcement on X. For now, EU countries are not included in the rollout.
Gemini Pro subscribers get a one-time trial package of ten Veo 3 generations through the web interface.
Users with an Ultra subscription at $250 per month get the maximum number of generations Google allows, including daily refreshes. In Flow mode, which targets AI filmmakers specifically, Ultra users get 125 generations monthly, and Pro subscribers receive ten generations per month.
There are still some limitations. For now, Veo 3 works only in the web version of Gemini Pro, and currently supports only English audio output—though other languages may occasionally surface. Flow mode doesn’t support voice output when users upload their own images.
Precise prompts, viral videos
Following last year's NotebookLM Audio Overviews, Veo 3 is shaping up to be Google's second viral AI sensation, even with limited access. Users are flooding social media with demo videos that showcase how the combination of audio and video sets a new quality standard for AI-generated content. Google's model also follows prompts with impressive precision.
Prompt: "The camera follows a dachshund running through a living room and out of an open front door and onto a porch. It stands on the top stair overlooking the neighborhood as an ice cream truck drives by." | Video: Nick Matarese via X
But Veo 3 also makes it easier to create fake content that looks and sounds real. In one example, users generated a fake interview on a fictional car show. The same approach could just as easily produce fabricated protest footage or other misleading material.
Video: László Gaál via X
Veo 3 confirms concerns about generative AI's role in spreading disinformation while demonstrating just how far the technology has come. A few years ago, creating a "deepfake", even one that just replaced someone's face in a video, required hours of work and serious technical skills. Today, a single line of text generates realistic scenes with both images and sound. This makes it more important than ever for people to scrutinize visual content from unverified sources.