
Adobe Research and Northwestern University have developed Sketch2Sound, an AI system that turns vocal imitations and text descriptions into professional sound effects and atmospheres.


Adobe Research and Northwestern University have created an AI system that could transform how sound designers work. Called Sketch2Sound, the tool lets users create professional audio by humming, making sound effects with their voice, and describing what they want in plain text.

The system analyzes three key elements of vocal input: loudness, timbre (which determines how bright or dark a sound is), and pitch. It then combines these characteristics with text descriptions to generate the desired sounds.
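
To make that concrete, the sketch below shows one way such per-frame control curves could be extracted from a vocal recording. It assumes the open-source librosa library and a hypothetical helper named extract_control_signals; it is an illustration of the idea, not the feature pipeline Adobe Research actually uses.

```python
# Sketch: extracting loudness, brightness (spectral centroid), and pitch
# curves from a vocal imitation. Illustrative only; the actual Sketch2Sound
# pipeline may differ. Assumes librosa and numpy are installed.
import librosa
import numpy as np

def extract_control_signals(path, sr=22050, hop_length=512):
    y, sr = librosa.load(path, sr=sr)

    # Loudness proxy: root-mean-square energy per frame, converted to dB.
    rms = librosa.feature.rms(y=y, hop_length=hop_length)[0]
    loudness_db = librosa.amplitude_to_db(rms, ref=np.max)

    # Timbre proxy: spectral centroid, roughly "how bright or dark" each frame sounds.
    centroid = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=hop_length)[0]

    # Pitch: fundamental frequency via probabilistic YIN; unvoiced frames come back as NaN.
    f0, voiced_flag, _ = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("C2"),
        fmax=librosa.note_to_hz("C6"),
        sr=sr,
        hop_length=hop_length,
    )

    return loudness_db, centroid, f0
```

In Sketch2Sound, curves like these are combined with the text prompt to condition the audio generator.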

Video: García et al., Adobe Research


What makes Sketch2Sound interesting is how it understands context. For example, if someone enters "forest atmosphere" and makes short vocal sounds, the system automatically recognizes these should become bird calls - without needing specific instructions.

The same intelligence applies to music. When creating drum patterns, users can input "bass drum, snare drum" and hum a rhythm using low and high notes. The system automatically places bass drums on the low notes and snare drums on the high ones.

Fine-tuned control for professionals

The research team built in a filtering mechanism that lets users adjust how strictly the generated audio follows their vocal input. Sound designers can choose between exact, detailed control and a looser, more approximate interpretation, depending on their needs.
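
One way to picture that trade-off: the control curve taken from the voice can either be fed to the generator as-is or smoothed first, so that only its broad shape constrains the output. The sketch below illustrates the idea with a median filter and a hypothetical relax_control helper; it is a simplified stand-in, not Adobe's implementation.

```python
# Sketch: loosening a control signal before it conditions the generator.
# A larger filter window keeps only the broad shape of the curve, giving the
# model more freedom; a window of 1 passes the curve through unchanged.
# Hypothetical simplification, not Adobe's actual method.
import numpy as np
from scipy.signal import medfilt

def relax_control(curve: np.ndarray, window: int = 1) -> np.ndarray:
    if window <= 1:
        return curve                            # exact, detailed control
    if window % 2 == 0:
        window += 1                             # medfilt requires an odd window
    return medfilt(curve, kernel_size=window)   # looser, approximate control

# Example: a loudness curve followed precisely vs. only in broad strokes.
loudness = np.random.rand(256)
exact = relax_control(loudness, window=1)
loose = relax_control(loudness, window=31)
```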

This flexibility could make Sketch2Sound particularly valuable for Foley artists - the professionals who create sound effects for films and TV shows. Instead of manipulating physical objects to make sounds, they could potentially create effects more quickly through voice and text descriptions.

The researchers note that spatial audio characteristics from the input recordings can sometimes affect the generated sound in unwanted ways, but they are working to solve this issue. Adobe hasn't announced when or if Sketch2Sound will become a commercial product.

Summary
  • Adobe Research and Northwestern University have developed an AI system called "Sketch2Sound" that assists sound designers in creating sounds using vocal imitations and text descriptions.
  • The AI analyzes the volume, timbre, and pitch of the vocal imitation and combines this information with text instructions to generate the desired sound, understanding the intention behind the imitation.
  • Sketch2Sound could be particularly useful for Foley artists who specialize in creating sounds for film and television productions.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.