YouTuber "Greg Technology" has recreated Google's discredited multimodal Gemini AI demo using OpenAI's GPT-4 Vision to demonstrate real-time voice and vision prompts. The original Gemini demo video was criticized for being staged: it was not recorded in real time, and its voice interactions were dubbed in later. In response, Greg Technology released a video using GPT-4V in which he discussed a drawing, asked about emoticons, and had the AI identify a game. It's not as polished as Google's demo, of course, but it's real-time and real. Greg has published his demo code on GitHub.

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.