YouTuber "Greg Technology" has recreated Google's discredited multimodal Gemini AI demo using OpenAI's GPT-4 Vision (GPT-4V) to demonstrate real-time voice and vision prompts. The original Gemini demo video was criticized for being staged: it was not recorded in real time, and its voice interactions were dubbed in afterward. In response, Greg Technology released a video using GPT-4V in which he discussed a drawing, asked about emoticons, and had the AI identify a game. It's not as polished as Google's demo, of course, but it's real-time and real. Greg has published his demo code on GitHub.