
With Audiocraft, Meta releases three AI tools for music and audio generation for research purposes.

Audiocraft consists of Meta's MusicGen, an AI model introduced in June 2023 that can generate melodies and musical pieces from text and other music. Also part of Audiocraft is AudioGen, a Transformer-based generative AI model introduced in October 2022 that can generate sounds to match text input from scratch or extend existing audio files.

Meta's audio tokenizer EnCodec, which breaks audio files into smaller pieces for AI processing, is the third part of Audiocraft and is now available in an enhanced version that Meta says produces higher-quality music with fewer artifacts.
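To make the idea of an audio tokenizer more concrete, here is a minimal toy sketch of turning a waveform into discrete tokens and back. It is not EnCodec's actual method: EnCodec learns its representation with a neural network, whereas this example assumes a tiny hand-picked codebook, and all names (`CODEBOOK`, `encode`, `decode`) are hypothetical and chosen only for illustration.

```python
def frames(samples, frame_size):
    """Split a list of samples into non-overlapping frames, dropping any short tail."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def nearest(frame, codebook):
    """Return the index of the codebook entry closest to the frame (squared distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(frame, entry))
    return min(range(len(codebook)), key=lambda k: dist(codebook[k]))


def encode(samples, codebook):
    """Map each frame of audio to one integer token."""
    frame_size = len(codebook[0])
    return [nearest(f, codebook) for f in frames(samples, frame_size)]


def decode(tokens, codebook):
    """Rebuild an approximate waveform by looking each token up in the codebook."""
    out = []
    for t in tokens:
        out.extend(codebook[t])
    return out


# Hypothetical codebook of 4-sample shapes (hand-picked here, not learned).
CODEBOOK = [
    [0.0, 0.0, 0.0, 0.0],     # near silence
    [0.5, 1.0, 0.5, 0.0],     # positive bump
    [-0.5, -1.0, -0.5, 0.0],  # negative bump
    [1.0, -1.0, 1.0, -1.0],   # fast oscillation
]

# Twelve made-up samples, i.e. three frames of four samples each.
signal = [0.0, 0.1, 0.0, -0.1, 0.4, 0.9, 0.6, 0.1, -0.6, -0.9, -0.4, 0.0]

tokens = encode(signal, CODEBOOK)
reconstructed = decode(tokens, CODEBOOK)
print(tokens)
```

A generative model can then work on the short token sequence instead of the raw samples, and the decoder turns generated tokens back into audio. The reconstruction here is only approximate, which is the trade-off any lossy tokenizer makes.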

Model kit for AI audio experiments

According to Meta, the Audiocraft family of models can produce high-quality, consistent, and longer audio using only natural language interaction. The release provides full access to Meta's research in generative audio AI over the past several years, according to the company.

"There are nearly limitless possibilities once you give people access to the models to tune them to their needs," Meta writes.

With Audiocraft, musicians or sound designers, for example, would have professional tools for faster inspiration, brainstorming, or refining existing compositions.

MusicGen example: Earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves

AudioGen example: Whistling with wind blowing

Generative audio to lower entry barriers to music and audio

The Meta research team continues to work on generative audio, specifically high-quality audio based on diffusion models, the same technique that has led to huge quality improvements in image generation.

The goal, for example, is to enable musicians to create new compositions without having to play a single note on an instrument or to help indie developers on a shoestring budget fill virtual worlds with believable and varied sound effects. For Instagram, generative audio AI could provide the right soundtrack for posts. However, Audiocraft does not yet allow commercial use, so this will not happen just yet.

Following the release, Meta once again stresses the importance of open-source models: "Responsible innovation can’t happen in isolation. Open sourcing our research and resulting models helps ensure that everyone has equal access."

Audiocraft's code is available on GitHub.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Meta releases the Audiocraft framework, a set of three AI tools for music and audio generation, consisting of MusicGen, AudioGen, and an improved version of the audio tokenizer EnCodec.
  • Audiocraft enables the creation of high-quality, consistent, and longer audio content using only language interaction, and provides researchers with full access to Meta's previous research in generative AI for audio.
  • The technology is expected to open new creative possibilities for musicians and sound designers, and provide developers with believable sound effects. Commercial use is not permitted at this time.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.