AudioCraft is Meta's free model kit for text-to-audio tasks

With AudioCraft, Meta is releasing three AI tools for music and audio generation for research purposes.

AudioCraft includes MusicGen, an AI model introduced in June 2023 that can generate melodies and musical pieces from text and other music. Also part of AudioCraft is AudioGen, a Transformer-based generative AI model introduced in October 2022 that can generate sounds matching a text prompt from scratch or extend existing audio files.

Meta's audio tokenizer EnCodec, which breaks audio files into smaller pieces for AI processing, is the third part of AudioCraft. It is now available in an enhanced version that Meta says produces higher-quality music with fewer artifacts.

Model kit for AI audio experiments

According to Meta, the AudioCraft family of models can produce high-quality, consistent, and longer audio using only natural-language input. The release, the company says, provides full access to its generative audio research from the past several years.

"There are nearly limitless possibilities once you give people access to the models to tune them to their needs," Meta writes.

Musicians and sound designers, for example, could use AudioCraft as a professional tool for quicker inspiration, for brainstorming, or for refining existing compositions.

MusicGen example: Earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves

AudioGen example: Whistling with wind blowing

Generative audio to lower entry barriers to music and audio

The Meta research team continues to work on generative audio, specifically high-quality audio based on diffusion models, the same technique that has led to huge quality improvements in image generation.

The goal, for example, is to let musicians create new compositions without playing a single note on an instrument, or to help indie developers on a shoestring budget fill virtual worlds with believable and varied sound effects. For Instagram, generative audio AI could provide the right soundtrack for posts. However, AudioCraft does not yet permit commercial use, so this is not possible for now.

Following the release, Meta once again stresses the importance of open-source models: "Responsible innovation can’t happen in isolation. Open sourcing our research and resulting models helps ensure that everyone has equal access."

AudioCraft's code is available on GitHub.
