
OpenAI has updated ChatGPT's voice features for subscribers, aiming to make the AI's speech more natural and expressive.

According to OpenAI, the revamped "Advanced Voice Mode" now delivers smoother, more emotionally nuanced speech, with improvements to intonation, pauses, and the ability to convey empathy or sarcasm in a more lifelike way.

The update also adds real-time translation capabilities. Users can now ask ChatGPT to translate between specific language pairs. The AI will then continuously interpret both sides of a conversation until instructed otherwise. OpenAI suggests this could be useful for scenarios like restaurant orders or multilingual workplace discussions.

Paying users can access these voice improvements across all platforms by clicking the voice icon in the chat interface.

Known limitations

Some issues remain. OpenAI notes that users may still encounter occasional drops in audio quality, such as unexpected changes in pitch or volume, which can be more noticeable with certain voices.

In addition, so-called hallucinations continue to occur: ChatGPT sometimes produces odd sounds—including snippets that resemble ads, random noises, or even background music—without being prompted.

In one recent case, a user reported that ChatGPT suddenly played an advertisement in the middle of a conversation, even though OpenAI doesn't actually serve ads.

OpenAI first introduced Advanced Voice Mode in May 2024 with a gradual rollout, and expanded availability to the EU in October 2024. The goal is to enable natural, real-time interaction with the AI, including the ability to interrupt and express emotion during conversations. If users also turn on their cameras, ChatGPT can comment live on objects or surroundings. Google offers similar features in its Gemini app.

Summary
  • OpenAI has updated ChatGPT's voice capabilities for paying users, making its speech sound more fluent, emotionally nuanced, and natural, with improved handling of empathy and sarcasm, as well as real-time translation of conversations between chosen language pairs.
  • The new features are now available on all platforms and include continuous translation between chosen languages, which stays active until users turn it off or switch languages.
  • Some issues remain: OpenAI notes that users may still encounter unexpected shifts in pitch and volume, as well as occasional "hallucinations" where the AI generates unprompted noises, ad-like snippets, or background music.
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.