OpenAI announced new features for app developers at its DevDay conference. The company is now making its advanced voice technology available for integration into third-party applications.
The new "Realtime API" lets developers add six AI voices to their apps. These voices are different from those used in ChatGPT. To avoid legal issues, developers can't use third-party voices.
OpenAI showed off a travel planning app built on the Realtime API. Users could talk to an AI assistant about an upcoming London trip and get low-latency spoken answers, and the assistant could annotate a map with restaurant suggestions.
The technology also works for phone calls, for example to place orders. The API does not automatically disclose that callers are hearing an AI voice; for now, OpenAI leaves that disclosure up to developers.
New GPT-4o features and cost savings
OpenAI also announced that developers can now fine-tune GPT-4o with images. According to the company, as few as 100 example images are enough to improve the model's performance on specific visual tasks.
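As an illustration, the sketch below prepares a tiny vision fine-tuning dataset as JSONL and starts a job with the openai Python SDK. The message format, the base model name "gpt-4o-2024-08-06", and the image URLs are assumptions drawn from the documentation at the time, not a definitive recipe.

```python
# Hedged sketch: build a small vision fine-tuning dataset and start a job.
import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Each training example pairs an image (by URL) with the expected answer.
examples = [
    {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What traffic sign is shown?"},
                    {"type": "image_url",
                     "image_url": {"url": "https://example.com/signs/stop.jpg"}},
                ],
            },
            {"role": "assistant", "content": "A stop sign."},
        ]
    },
    # ... around 100 such examples can already improve a narrow visual task,
    # according to OpenAI.
]

with open("vision_train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")

# Upload the file and kick off the fine-tuning job.
training_file = client.files.create(
    file=open("vision_train.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id, model="gpt-4o-2024-08-06"
)
print(job.id, job.status)
```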
A new prompt caching feature aims to cut costs and latency. When a request reuses recently seen input tokens, developers get a 50 percent discount on those tokens along with faster processing times.
Prompt caching is automatically applied to the latest versions of GPT-4o, GPT-4o mini, o1-preview and o1-mini, as well as fine-tuned versions of these models.
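Caching requires no code changes, but it rewards prompts that keep the long, static portion at the front so repeated requests share a prefix. The sketch below assumes the chat completions usage fields reported at the time (prompt_tokens_details.cached_tokens); verify the field names against current docs.

```python
# Sketch of prompt caching in practice: static prefix first, varying tail last.
from openai import OpenAI

client = OpenAI()

# A long, unchanging system prompt; caching kicks in once the shared prefix
# exceeds the minimum length (roughly 1,024 tokens).
SYSTEM_PROMPT = "You are a travel assistant. " + "Detailed policy text... " * 300

for question in ["Best museums in London?", "Where to eat near Soho?"]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},  # shared prefix
            {"role": "user", "content": question},          # varying suffix
        ],
    )
    # cached_tokens > 0 on repeat requests indicates the discounted prefix.
    details = getattr(response.usage, "prompt_tokens_details", None)
    cached = getattr(details, "cached_tokens", 0) if details else 0
    print(question, "cached prompt tokens:", cached)
```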
"Model distillation" allows smaller models like GPT-4o mini to be optimized using outputs from larger models. OpenAI is providing new integrated tools for this, including saved completions and evaluation options.
OpenAI is doubling the rate limit for its new o1 model. To help developers get started, the company is offering free training quotas for GPT-4o and GPT-4o mini until the end of October.