OpenAI's GPT-4 now accepts visual input, and this capability will let it serve as a Virtual Volunteer for Be My Eyes, assisting people who are blind or have low vision.
Generative AI models like ChatGPT and DALL-E have demonstrated many practical use cases for artificial intelligence. While OpenAI's new GPT-4 can't generate images, it combines visual and language skills: it accepts images as input, augmenting computer vision with the power of a large language model.
Be My Eyes is an app and service for people who are blind or have low vision that launched in 2015. Sighted volunteers guide and assist by watching through the user's phone, verbally describing what they can see, and answering questions.
With 6.3 million volunteers, Be My Eyes aims to minimize wait times and provide assistance as quickly as possible. The goal is to answer calls within seconds, but volunteers aren't always available, which can lead to delays of several minutes.
GPT-4 becomes a Virtual Volunteer
Since GPT-4 can understand the world through both text and images, it can act as a Virtual Volunteer for Be My Eyes and provide specialized help in ways that could be challenging for a human volunteer.
For example, with the GPT-4 integration, translation is easy and doesn't require waiting for a person who knows two particular languages. It could be used to navigate a train system while traveling through a foreign country, browse websites and social media platforms, or shop online - "the possibilities are limitless, and we're just getting started," Be My Eyes writes.
The user sends an image to GPT-4 in the enhanced Be My Eyes app. The response is in text form and can be read aloud by the app at the user's preferred speed. This could make getting help faster and more versatile.
Be My Eyes beta tester Lucy Edwards shared examples of how the AI-enhanced app works on her YouTube channel.
Be My Eyes GPT-4 availability
The GPT-4 integration is in beta testing for the next few weeks. The Virtual Volunteer will then be rolled out to more users over the coming months, depending on feedback from early testers.