OpenAI won't be showing a new LLM-based search engine or GPT-5 on Monday. Instead, the company might focus on agents, voice input, and a set of "quality of life" features for ChatGPT.
OpenAI is reportedly developing an AI voice assistant that aims to surpass Google's and Apple's efforts in voice AI. The assistant is expected to offer better speech and image recognition and improved reasoning compared to current products, The Information reports.
What makes this new assistant different from the existing audio feature is the integration of all these features in a single model, which is said to outperform GPT-4 Turbo in certain areas. In addition, this new combined model is expected to be less expensive than GPT-4 Turbo. OpenAI also plans to offer discounts of up to 50% to API users who pre-pay for tokens.
The new audio features could help customer service agents, for example, by better understanding a caller's voice intonation or detecting sarcasm in queries. These features will eventually be integrated into the free version of ChatGPT. The Information did not report a timeline for the rollout.
OpenAI CEO Sam Altman considers assistants to be a technology as transformative as smartphones. Altman has repeatedly emphasized that personal assistance systems should be the next stage in AI development, and the company has been consistently working towards this goal for months.
AI assistants could potentially serve as tutors for students or provide visual information to the visually impaired. Altman's long-term vision is to have personal audio bots similar to those in the sci-fi film "Her," The Information reports.
According to The Information's sources, GPT-5 could be completed and made available to the public by the end of the year. The new model should also expand the capabilities of the agents.
Many improvements for ChatGPT in the pipeline
ChatGPT app developer Tibor Blaho summarized a number of potential new features for ChatGPT at the end of April, which he extracted from a publicly available test environment.
These features include a redesigned user interface, an improved audio mode, new features for GPTs such as memory and automatic GPT interactions ("Contacts"), an improved text editor based on ProseMirror with prompt assistance, a GPT optimized for writing (similar to DALL-E 3 for images), contextual connectors for Google Drive, Microsoft 365, and Notion, search capabilities with web search and citations, an improved data analysis tool, and better chat sharing options.
Blaho's predictions based on these web findings have usually been accurate in the past, although not all of these features may be unveiled on Monday or rolled out immediately.
The combination of LLMs and web search, which Altman finds very exciting and which is reportedly under development at OpenAI, could be unveiled at Apple's WWDC developer conference this summer if the partnership between the two companies comes to fruition.