Google's Gemini 2.0 can hold real-time conversations over video

Google has released a new streaming API for its Gemini 2.0 multimodal model that enables real-time interactions through audio, video, and text. Developer Simon Willison demonstrated the technology in a one-minute iPhone video, showing a live conversation with Gemini about objects it could see through the camera. The API is now available in preview for developers who want to test it, though some technical setup is required. The release coincides with OpenAI's introduction of a similar capability for ChatGPT that lets the AI discuss smartphone video content in real time.
