Content
summary Summary

OpenAI announced new features for app developers at its DevDay conference. The company is now offering its advanced speech synthesis technology for integration into third-party applications.

Ad

The new "Realtime API" lets developers add six AI voices to their apps. These voices are different from those used in ChatGPT. To avoid legal issues, developers can't use third-party voices.

OpenAI showed off a travel planning app using the Realtime API. Users could talk to an AI assistant about a London trip and get quick responses. The API can also add restaurant suggestions to maps.

The technology works for phone calls too, like placing orders. OpenAI doesn't automatically disclose it's an AI voice, leaving that up to developers for now.

Ad
Ad

New GPT-4o features and cost savings

OpenAI also announced that developers can now use images to fine-tune GPT-4o. With just 100 example images, the model's performance can be improved for specific visual tasks.

A new prompt caching feature aims to reduce costs and latency. By reusing recently seen input tokens, developers can get a 50 percent discount and faster processing times.

Prompt caching is automatically applied to the latest versions of GPT-4o, GPT-4o mini, o1-preview and o1-mini, as well as fine-tuned versions of these models.

"Model distillation" allows smaller models like GPT-4o mini to be optimized using outputs from larger models. OpenAI is providing new integrated tools for this, including saved completions and evaluation options.

OpenAI is doubling the rate limit for its new o1 model. To help developers get started, the company is offering free training quotas for GPT-4o and GPT-4o mini until the end of October.

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Recommendation
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • OpenAI has introduced new features for developers, including integrating realistic AI voices into applications and fine-tuning GPT-4o with images. The aim is to make interaction with AI systems more natural.
  • The Realtime API offers six AI voices to choose from and can be integrated into applications such as travel planning apps or phone calls. OpenAI leaves it up to developers to disclose the use of AI voices.
  • Other new features include immediate caching to reduce costs, model distillation to optimize smaller models, and new evaluation tools. OpenAI also doubles the rate limit for the o1 model.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.