Resemble AI's open-source model transforms noisy audio into crystal-clear speech


Resemble Enhance is an open-source AI model that can significantly improve the quality of audio recordings.

The startup Resemble AI offers several AI tools for voice cloning, blending, and localization, as well as text-to-speech, speech-to-speech, and voice dubbing capabilities for various applications.

Now, the company has released Resemble Enhance, an AI model that converts noisy audio into clear speech. Unlike the company's other models, Resemble Enhance is open source.

Resemble Enhance for podcasts and historical recordings

Resemble sees applications for the technology in areas such as podcasting, the general entertainment industry, and the restoration of historical audio documents. The company shows what this sounds like with an example of an old lecture.


The model consists of two main components: a denoiser and an enhancer. The denoiser uses a UNet model to separate speech from background noise to improve intelligibility. The enhancer uses a latent conditional flow matching (CFM) model to correct audio distortion and expand audio bandwidth.
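To make the two-stage structure concrete, here is a minimal toy sketch in plain Python. It is not Resemble's code: the function names (`toy_denoise`, `toy_flow_enhance`, `toy_pipeline`) are hypothetical, a moving average stands in for the UNet denoiser, and an "oracle" velocity field that already knows the clean target stands in for the trained CFM network. Starting the flow from the denoised signal is also a simplification assumed here, since the article does not describe the enhancer's inputs. The sketch only illustrates the order of the stages and the general idea that a flow-matching model produces its output by integrating a velocity field from t=0 to t=1.

```python
def toy_denoise(signal, window=3):
    """Stand-in for the denoiser stage: a centered moving average.

    The article's denoiser is a UNet; this smoothing filter is NOT that model.
    It only marks where the denoising stage sits in the pipeline.
    """
    half = window // 2
    out = []
    for i in range(len(signal)):
        # Average over the neighbors that exist (shorter window at the edges).
        chunk = signal[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def toy_flow_enhance(start, target, steps=10):
    """Stand-in for the enhancer stage: Euler integration of a flow ODE.

    Flow matching samples by integrating dx/dt = v(x, t) from t=0 to t=1.
    A trained network would predict v. Here an oracle field that already
    knows the target, v(x, t) = (target - x) / (1 - t), is used instead.
    """
    x = list(start)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays below 1, so the division is safe
        x = [xi + dt * (ti - xi) / (1.0 - t) for xi, ti in zip(x, target)]
    return x


def toy_pipeline(noisy, target, steps=10):
    """Run the two stages in the order the article describes."""
    return toy_flow_enhance(toy_denoise(noisy), target, steps)


if __name__ == "__main__":
    noisy = [0.0, 1.2, -0.3, 0.9, 0.1]
    clean = [0.0, 1.0, 0.0, 1.0, 0.0]  # hypothetical "clean" reference
    print(toy_pipeline(noisy, clean))
```

In a real system the clean target is unknown at inference time, which is why the velocity field has to be learned from data rather than computed directly as it is in this sketch.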

The development team plans to continue improving Resemble Enhance, including optimizing processing times and extending control over individual speech elements to further improve audio quality. In the long run, the model should also be able to improve audio recordings that are more than 75 years old.

Resemble offers a demo of Resemble Enhance on Hugging Face. The code is available on GitHub.

