A 15-second voice sample is all OpenAI's Voice Engine needs to clone your voice

OpenAI gives a glimpse into Voice Engine, a model that creates voice clones from short 15-second voice samples. The results sound very realistic - and that poses a risk, the company says.

OpenAI has shared early findings from its Voice Engine AI model. Given a short text input and a 15-second voice sample, the model can generate natural-sounding speech that is almost identical to the original speaker's voice.

English reference audio (15 seconds)

Generated voice based on reference audio

Voice Engine was developed in late 2022 and is already used for the predefined voices in the Text-to-Speech API, as well as for ChatGPT Voice and Read Aloud. At the same time, OpenAI is cautious about a wider release due to the potential for abuse.

Since the end of last year, OpenAI has been privately testing Voice Engine with a small group of partners. Some early application examples include:

Providing reading assistance to children and people who can't read, using natural and expressive voices.
Translating videos and podcasts so creators can reach a wider audience in the listeners' native languages (HeyGen, see demo below).
Improving the delivery of basic services in remote areas.
Helping people who cannot speak, for example in speech therapy applications.
Recreating the voices of patients with sudden or gradual voice loss.

OpenAI recognizes the significant risks of Voice Engine, especially the potential for voter manipulation in an election year. Current test partners have to follow usage guidelines that prohibit impersonation without consent. They must get explicit permission from the original speaker and can't allow users to create their own voices. AI-generated voices must be clearly labeled.

English reference audio

Voice clone in German language (HeyGen)

OpenAI calls for restrictions on voice authentication

OpenAI says it is sharing its Voice Engine findings to demonstrate what is possible with AI voice cloning technology. According to the company, it is important for the world to understand where this technology is headed, whether or not OpenAI ends up deploying it on a large scale.

OpenAI advocates phasing out voice authentication for access to sensitive data, protections for the use of individuals' voices, education about the capabilities and limitations of AI, and better techniques for tracking the origin of content. It also calls for authentication processes and blocklists for known voices. For its own model, the company uses security measures such as watermarking for traceability and proactive usage monitoring.
