
Until now, AI videos have been silent movies. But that is about to change: Pika Labs is introducing a new generative audio model.


Pika Labs has introduced text prompt-based sound effect generation for its generative AI videos. With this feature, users can add simple sounds to their videos: sizzling bacon, screeching eagles, or roaring engines.

Audio generation is still independent of the video content: by default, the same text prompt that guides the video generation also drives the audio, which is then overlaid on the video. Alternatively, you can enter a separate audio prompt for finer control over the result. Pika Labs says it has trained its own model for audio generation.

Future models could perform audio and video generation in a single step, analyzing individual video frames to generate matching sounds automatically and insert them in the right places. The visual comprehension of large language models such as GPT-4 is already good enough for detailed image descriptions.


The new feature is currently only available to subscribers on the Pro plan, but will be rolled out on a larger scale soon. Just a few days ago, Pika Labs introduced a lip-syncing tool that allows users to add lip-synced voices to characters in AI-generated videos.

Pika Labs is an AI startup focused on generative AI for video. It was founded by Stanford graduate students Demi Guo and Chenlin Meng. With their product, the Pika Video Generator, users can create and edit videos in various styles such as 3D animation, anime, cartoon, and film.

Pika Labs offers text-to-video and image-to-video generation. With the latter, existing images can be animated. The startup has raised about $55 million in its first rounds of funding.

Summary
  • Pika Labs has introduced text prompt-based sound effect generation for its generative AI videos, allowing users to add simple sounds such as sizzling bacon, screeching eagles, or roaring engines.
  • The audio generation currently works independently of the video content and is based on a proprietary audio model developed by Pika Labs; future models could perform audio and video generation in one step.
  • The new feature is currently available to Pro Plan subscribers only, but will be rolled out more widely in the near future.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.