YouTuber builds Google's staged Gemini demo in real time with GPT-4 Vision

Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.

YouTuber "Greg Technology" has recreated Google's discredited multimodal Gemini demo with OpenAI's GPT-4 Vision (GPT-4V), this time responding to voice and vision prompts in real time. Google's original demo video was criticized as staged: it was not recorded in real time, and the voice interactions were dubbed in afterward. In response, Greg Technology released a video in which he discusses a drawing with GPT-4V, asks it about emoticons, and has it identify a game. The result is less polished than Google's demo, but it is real and runs in real time. Greg has published his demo code on GitHub.

Sources

YouTube

Matthias Bastian
