CAT4D from Google DeepMind turns videos into simple 3D scenes

Image: Google DeepMind

A new AI system from Google DeepMind can turn ordinary videos into dynamic 3D scenes. The team, which includes researchers from Columbia University and UC San Diego, calls its creation CAT4D.

The system uses a diffusion model to take a video shot from a single angle and generate views from multiple perspectives. It then builds these different viewpoints into a dynamic 3D scene. The end result? A video where you can look at the subject from many angles.

Video: Google DeepMind
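
One way to picture the two steps described above is as a grid of frames indexed by viewpoint and time: the input video fills a single row, and the generative model has to supply every other cell before a 3D scene can be reconstructed. The sketch below only illustrates that bookkeeping. The names `Cell`, `build_view_time_grid`, and `cells_to_generate` are hypothetical, and it assumes the input camera stays fixed, which is a simplification rather than a statement about how CAT4D itself is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    view: int       # index of the camera viewpoint
    time: int       # index of the video frame (moment in time)
    observed: bool  # True if this frame comes from the input video


def build_view_time_grid(num_views: int, num_frames: int, input_view: int = 0):
    """Return a num_views x num_frames grid; only the input view's row is observed."""
    return [
        [Cell(v, t, observed=(v == input_view)) for t in range(num_frames)]
        for v in range(num_views)
    ]


def cells_to_generate(grid):
    """List the (view, time) pairs that a generative model would have to fill in."""
    return [(c.view, c.time) for row in grid for c in row if not c.observed]


grid = build_view_time_grid(num_views=3, num_frames=4)
missing = cells_to_generate(grid)
print(len(missing))  # 8 of the 12 cells are unobserved
```

With three viewpoints and four frames, only the four frames of the input row are known, so two thirds of the grid has to be generated.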

Until now, capturing something like this required elaborate setups with multiple cameras recording the same scene simultaneously. CAT4D simplifies the process by working with regular video footage.

Training challenges and solutions

The team faced a key obstacle: very little existing data was suitable for training the model. To work around this, they combined real-world footage with computer-generated content. The training data included multi-view images of static scenes, single-perspective videos, and synthetic 4D data.
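
Mixing several data sources during training is often done by drawing each training example from one source at random according to fixed weights. The following is a minimal sketch of that idea, not the researchers' actual procedure: the article names the three sources but gives no ratios, so the values in `SOURCE_WEIGHTS` and the helper `sample_sources` are assumptions made purely for illustration.

```python
import random
from collections import Counter

# Hypothetical sampling weights; the article names the three sources but gives no ratios.
SOURCE_WEIGHTS = {
    "multi_view_static_images": 0.4,
    "single_view_videos": 0.4,
    "synthetic_4d": 0.2,
}


def sample_sources(batch_size: int, rng: random.Random) -> list[str]:
    """Pick, for each item in a batch, which data source it is drawn from."""
    names = list(SOURCE_WEIGHTS)
    weights = [SOURCE_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights, k=batch_size)


rng = random.Random(0)  # seeded so the run is repeatable
batch = sample_sources(8, rng)
print(Counter(batch))   # how many items came from each source
```

Changing the weights shifts how often each kind of data appears in a batch, which is one simple lever for balancing scarce real footage against plentiful synthetic content.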

Image: Google DeepMind

The diffusion model learns to create images from specific angles at specific moments in time. According to the researchers, CAT4D produces higher quality results than similar systems, though it still struggles with generating videos longer than the original footage.

Technology like CAT4D could find its way into several industries, the researchers say. Game developers might use it to create virtual environments, while filmmakers and AR developers could incorporate it into their workflows.

Anyone interested in seeing more examples can check out the project's GitHub page.
