After months of anticipation, ChatGPT has begun rolling out its new voice mode, which features remarkably human-like speech output. Here are some fascinating examples from early users.

One fun way to test the new feature is to ask ChatGPT to mimic animal sounds. In one test, a user asked the AI to bark like a dog, grunt like a pig, and cluck like a chicken. ChatGPT complied, then unexpectedly laughed. Some see this as a sign of consciousness, while skeptics argue that it is more likely the result of mixed audio samples in the training data.

Animal imitations seem popular with users. Another video shows ChatGPT attempting to sing as a frog, a cat, and a dog, both individually and in chorus, with mixed results.

Breathless Counting

ChatGPT's voice output impressively adjusts speed and intonation. In one demo, the AI counts quickly, takes an audible breath after "30," and then sounds slightly winded. Cristiano Giardina, who shared the video, noted: "Interestingly, the transcript has no interruptions or notations – the voice model has simply learned natural speaking patterns, which includes breathing pauses. Uncanny."


Video: @CrisGiardina/X

However, the new voice mode struggles when asked to imitate various US dialects.

Video: @CrisGiardina/X

"This is your captain speaking"

AI influencer Nick St. Pierre asked ChatGPT to role-play as a pilot, complete with intercom distortions and background turbine noise. After several tries, the AI began promisingly but stopped, citing policy violations.

Video: @nickfloats/X

St. Pierre jokingly suggests the new voice feature could replace human interaction: "I won't need friends anymore. AI will tell me whatever I need to hear in any voice I want, and it won't talk back or get mad when I interrupt it."

Although St. Pierre is probably being somewhat ironic, he raises a valid point. Even with the old ChatGPT voice, social media posts frequently showed young women flirting with the voice model.

Users have popularized the nickname "Dan" to give the system a more human touch. DAN stands for "Do Anything Now" and refers to a prompt designed to remove the model's built-in limitations.

This touches on concerns about how AI chatbots might affect our understanding of relationships and emotional attachment. Young people reportedly use AI chatbot platforms for therapeutic purposes. With a deceptively human-sounding voice, ChatGPT and similar AI systems could exert an even stronger pull on users.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • The new ChatGPT voice output is being rolled out and is much faster and more versatile than the previous system. Users are having a lot of fun with it.
  • For example, the voice can imitate animal sounds such as barking, grunting, and clucking, and add a spontaneous laugh. When counting quickly to 50, ChatGPT adjusts its speed and even imitates breathing pauses, which sounds impressively natural.
  • The human-sounding voice could increase users' emotional attachment to AI chatbots and make them more entrenched as companions in everyday life, assuming the systems prove to be truly useful.
Jonathan works as a freelance tech journalist for THE DECODER, focusing on AI tools and how GenAI can be used in everyday work.