Resemble AI drops Chatterbox Turbo, an open-source text-to-speech model that clones voices in five seconds

AI startup Resemble AI is taking on ElevenLabs with "Chatterbox Turbo," an open-source text-to-speech model that can clone a voice from just five seconds of audio. The company claims the new model beats both ElevenLabs and Cartesia on voice quality while delivering its first audio output in under 150 milliseconds. That speed could make it attractive for developers building real-time agents, customer support systems, games, avatars, and social platforms. Companies in regulated industries might also find the model's built-in "PerTh" watermark useful for verifying that speech was AI-generated.

Resemble AI released Chatterbox Turbo under an MIT license, meaning anyone can use, modify, and redistribute it for free, including in commercial projects. The model can be tried on Hugging Face, RunPod, Modal, Replicate, and Fal, and the full code is on GitHub. Resemble AI also offers a hosted service, with a low-latency version on the way.
