ChatGPT's voice is now more natural and can continuously translate conversations in real time

Key Points

  • OpenAI has updated ChatGPT's voice capabilities for paying users, making its speech sound more fluent, emotionally nuanced, and natural, with improved handling of empathy and sarcasm, as well as the ability to translate conversations between chosen languages in real time.
  • The new features are now available on all platforms and include continuous translation between chosen languages, which stays active until users turn it off or switch languages.
  • Some issues continue: OpenAI notes unexpected shifts in pitch and volume, as well as occasional "hallucinations" where the AI generates unprompted noises or background music.

OpenAI has updated ChatGPT's voice features for subscribers, aiming to make the AI's speech more natural and expressive.

According to OpenAI, the revamped "Advanced Voice Mode" now delivers smoother, more emotionally nuanced speech, with improvements to intonation, pauses, and the ability to convey empathy or sarcasm in a more lifelike way.

The update also adds real-time translation capabilities. Users can now ask ChatGPT to translate between specific language pairs. The AI will then continuously interpret both sides of a conversation until instructed otherwise. OpenAI suggests this could be useful for scenarios like restaurant orders or multilingual workplace discussions.

Paying users can access these voice improvements across all platforms by clicking the voice icon in the chat interface.

Known limitations

Some issues remain. OpenAI notes that users may still encounter occasional drops in audio quality, such as unexpected changes in pitch or volume, which can be more noticeable with certain voices.

In addition, so-called hallucinations continue to occur: ChatGPT sometimes produces odd sounds without being prompted, including snippets that resemble ads, random noises, or even background music.

In one recent case, a user reported that ChatGPT suddenly played an advertisement in the middle of a conversation, even though OpenAI doesn't actually serve ads.

OpenAI first introduced Advanced Voice Mode in May 2024 with a gradual rollout, and expanded availability to the EU in October 2024. The goal is to enable natural, real-time interaction with the AI, including the ability to interrupt and express emotion during conversations. If users also turn on their cameras, ChatGPT can comment live on objects or surroundings. Google offers similar features in its Gemini app.

Source: OpenAI