
Hume AI open-sources TADA, a speech model five times faster than rivals with zero hallucinated words

Hume AI has open-sourced TADA, an AI system for speech generation that processes text and audio in sync. Unlike previous systems, which generate many audio frames per text token, TADA maps exactly one audio token to each text token. The result, according to Hume AI: TADA is over five times faster than comparable systems and produced zero transcription hallucinations (no made-up or skipped words relative to the source text) across tests with more than 1,000 samples. In human evaluations, the system scored 3.78 out of 5 for naturalness.
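The one-to-one mapping can be illustrated with a toy decoding loop. This is a sketch of the general idea only, with hypothetical function and token names, not the released TADA API: pairing each text token with exactly one audio token means the audio stream cannot add or drop words relative to the text, while a conventional frame-based mapping emits many audio units per token.

```python
def synchronized_decode(text_tokens):
    """Toy 1:1 scheme: each step emits one (text token, audio token) pair."""
    interleaved = []
    for t in text_tokens:
        # Stand-in for the model's predicted audio token at this step.
        a = f"audio<{t}>"
        interleaved.append((t, a))
    return interleaved


def frame_based_decode(text_tokens, frames_per_token=6):
    """Toy conventional scheme: many audio frames per text token."""
    return [(t, [f"frame<{t}:{i}>" for i in range(frames_per_token)])
            for t in text_tokens]


pairs = synchronized_decode(["hello", "world"])
assert len(pairs) == 2  # exactly one audio token per text token

frames = frame_based_decode(["hello", "world"])
total_frames = sum(len(f) for _, f in frames)
assert total_frames == 12  # 6x more audio units for the same text
```

Because the 1:1 scheme produces far fewer audio units per sentence, it also suggests where the claimed speedup comes from: fewer decoding steps for the same text.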

Hume AI says TADA is compact enough to run on smartphones, though longer texts can cause the voice to occasionally drift. The system comes in two sizes—1B and 3B parameters—both based on Llama. The smaller model supports English, while the 3B version covers seven additional languages. All code and models are available on GitHub and Hugging Face under the MIT license, and the full technical details can be found in the paper.



Source: Hume AI Blog