Google's Gemini 2.0 can hold real-time conversations over video

Google has released a new streaming API for its Gemini 2.0 multimodal model that enables real-time interactions through audio, video, and text. Developer Simon Willison demonstrated the technology in a one-minute iPhone video, showing a live conversation with Gemini about objects it could see through the camera. The API is now available in preview for developers who want to test it, though some technical setup is required. The release coincides with OpenAI's introduction of a similar capability for ChatGPT that lets the AI discuss smartphone video content in real time.
