Gemini 3.1 Flash Live is Google's most natural-sounding AI voice model yet

Mar 26, 2026

Google has unveiled Gemini 3.1 Flash Live, its best voice and audio AI model yet. It delivers faster responses, more natural conversations, and configurable thinking levels for developers. Google says it's better at detecting pitch and emotions and more reliable in noisy environments. The model now powers live mode in the Gemini app.

According to Artificial Analysis, the model scores 95.9 percent on the Big Bench Audio benchmark at the "High" thinking level with a 2.98-second response time, second only to Step-Audio R1.1 Realtime (97.0 percent). At "Minimal," the score drops to 70.5 percent, but response time falls to 0.96 seconds.

Gemini 3.1 Flash Live scores 95.9 percent on Big Bench Audio at its highest thinking level, just behind Step-Audio R1.1 Realtime. | Image: Artificial Analysis

The model is available through the Gemini Live API, Google AI Studio, Gemini Live, and Search Live in over 200 countries. Pricing matches its Gemini 2.5 predecessor at $0.35 per hour of audio input and $1.40 per hour of audio output, making it one of the cheapest audio AI models available. The slightly better-performing Step-Audio model is cheaper on input but pricier on output.
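To put those rates in perspective, here is a rough back-of-the-envelope sketch of what a voice session might cost. It uses only the two hourly rates quoted above. The function name `session_cost` is hypothetical, and the sketch assumes billing scales linearly with audio duration, with no minimums, rounding, or additional charges, which the article does not confirm.

```python
# Hourly audio rates as quoted in the article (USD per hour of audio).
INPUT_RATE_PER_HOUR = 0.35
OUTPUT_RATE_PER_HOUR = 1.40


def session_cost(input_minutes: float, output_minutes: float) -> float:
    """Estimate the audio cost of one session in USD.

    Assumption: cost is strictly proportional to audio duration.
    Actual billing rules may differ.
    """
    if input_minutes < 0 or output_minutes < 0:
        raise ValueError("durations must be non-negative")
    input_cost = (input_minutes / 60) * INPUT_RATE_PER_HOUR
    output_cost = (output_minutes / 60) * OUTPUT_RATE_PER_HOUR
    return input_cost + output_cost


if __name__ == "__main__":
    # A conversation with 30 minutes of user audio and 30 minutes of model audio.
    print(f"${session_cost(30, 30):.4f}")  # prints $0.8750
```

Under these assumptions, a full hour each of input and output audio would come to $1.75.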

Source: Google