Ad
Skip to content

Resemble AI drops Chatterbox Turbo, an open-source text-to-speech model that clones voices in five seconds

AI startup Resemble AI is taking on Elevenlabs with "Chatterbox Turbo," an open text-to-speech model that can clone voices from just five seconds of audio. The company claims its new model beats both Elevenlabs and Cartesia on voice quality while delivering first audio output in under 150 milliseconds. That speed could make it attractive for developers building real-time agents, customer support systems, games, avatars, and social platforms. Companies in regulated industries might also find the model's built-in "PerTh" watermark useful for verifying that speech was AI-generated.

Resemble AI released Chatterbox Turbo under an MIT license, meaning anyone can use, tweak, and redistribute it for free, even for commercial projects. The model is available to try on Hugging Face, RunPod, Modal, Replicate, and Fal, with the full code available on GitHub. Resemble AI also offers a hosted service, with a low-latency version on the way.

Ad
DEC_D_Incontent-1

AI News Without the Hype – Curated by Humans

As a THE DECODER subscriber, you get ad-free reading, our weekly AI newsletter, the exclusive "AI Radar" Frontier Report 6× per year, access to comments, and our complete archive.

Source: Dev Shah via X