Content
summary Summary

A federal judge in New York has allowed The Intercept to move forward with parts of its lawsuit against OpenAI over alleged violations of the Digital Millennium Copyright Act (DMCA) related to AI training data.

Ad

The case centers on allegations that OpenAI removed copyright management information like titles and author names from news articles when creating training data for ChatGPT. The Intercept argues this removal violates DMCA provisions protecting authorship documentation.

Judge Jed S. Rakoff of the U.S. District Court for the Southern District of New York dismissed some claims, including all allegations against Microsoft, but allowed the main DMCA complaint against OpenAI to proceed.

"This decision shows that the DMCA provides critical safeguards for news organizations against encroachment by AI companies," said Matt Topic, The Intercept's attorney, calling it a "first-of-its kind decision" with potential wider implications.

Ad
Ad

Courts wrestle with AI copyright interpretation

The courts face a significant challenge in applying existing copyright laws to AI systems using protected material for training. Recently, another New York federal judge dismissed a similar lawsuit against OpenAI filed by news sites Raw Story and AlterNet.

This case also involved potential DMCA violations for removing copyright information. However, Judge Colleen McMahon clarified that the real issue wasn't the removal of copyrighted information from OpenAI's training datasets. Instead, the plaintiffs' main concern was OpenAI's use of their articles to develop ChatGPT without compensation.

On this basis, Judge McMahon found that the plaintiffs had failed to show sufficient concrete harm from the use of their articles as AI training data. She sided with OpenAI's position that ChatGPT produces AI-generated responses based on training content, with a low probability of exact copying of articles, and noted that if copying occurred, it would be an error rather than an intended feature. She also emphasized that pure facts aren't copyrighted. This is Big AI's fair use argument in a nutshell.

However, in The Intercept's case, Judge Rakoff found the plaintiff could potentially prove concrete damage from the copyright information removal. This aspect of the case will now be subject to further legal review.

The Intercept filed its lawsuit in February, part of a growing wave of media companies taking legal action against OpenAI and Microsoft over various copyright issues in AI development. These initial rulings likely mark the beginning of a longer legal battle over the use of copyrighted content to train AI models.

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Recommendation
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • The Intercept, a US news site, has been granted partial permission to continue its lawsuit against OpenAI for alleged violations of the Digital Millennium Copyright Act (DMCA) in relation to AI training data.
  • The lawsuit accuses OpenAI of removing copyright information, including titles and author names, from articles used to train ChatGPT without obtaining proper permission.
  • The legal landscape remains complex, as another judge recently dismissed a similar lawsuit, siding with Big AI's fair use argument.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.