YouTube CEO's warning to OpenAI over Sora training data could backfire spectacularly

Apr 4, 2024

YouTube CEO Neal Mohan said that using YouTube videos to train OpenAI's text-to-video generator Sora would violate the platform's rules, Bloomberg reports.

In an interview, Mohan said that creators have "certain expectations" when they upload "their hard work" to YouTube. One of those expectations is that "the terms of service is going to be abided by." Those terms do not allow transcripts or parts of videos to be used elsewhere. Mohan said that using YouTube videos to train Sora would therefore be a "clear violation" of YouTube's rules.

AI tools like Sora work because the companies that build them scrape various types of content from the web, both licensed and unlicensed. They use that data to train AI models that, through that training process, learn to create new content similar to the training data.

Mohan said he didn't know if OpenAI used YouTube videos to train Sora. OpenAI CTO Mira Murati wouldn't discuss Sora's training data in a recent interview.

Mohan does a massive disservice to Google's AI strategy

Mohan's warning to OpenAI could spell trouble for Google, which is already fighting numerous lawsuits from artists and authors who claim Google has taken their data, including text, images, music, videos, and code, from the Internet without permission to train its AI models.

But Google argues that scraping data for AI training is "fair use" because it's transformative, meaning the model only uses the data to learn, not to reproduce the data itself.

Mohan said Google has used YouTube videos to train its own AI models only in ways that follow YouTube's rules. But in other cases, Google has used data from other platforms, probably from hundreds of thousands of creators, and it hasn't been completely candid about that.

If used against Google in court, Mohan's comments could cause real problems for the company.
