A federal judge in New York has allowed The Intercept to move forward with parts of its lawsuit against OpenAI over alleged violations of the Digital Millennium Copyright Act (DMCA) related to AI training data.
The case centers on allegations that OpenAI removed copyright management information like titles and author names from news articles when creating training data for ChatGPT. The Intercept argues this removal violates DMCA provisions protecting authorship documentation.
Judge Jed S. Rakoff of the U.S. District Court for the Southern District of New York dismissed some claims, including all allegations against Microsoft, but allowed the main DMCA complaint against OpenAI to proceed.
"This decision shows that the DMCA provides critical safeguards for news organizations against encroachment by AI companies," said Matt Topic, The Intercept's attorney, calling it a "first-of-its kind decision" with potential wider implications.
Courts wrestle with AI copyright interpretation
The courts face a significant challenge in applying existing copyright laws to AI systems using protected material for training. Recently, another New York federal judge dismissed a similar lawsuit against OpenAI filed by news sites Raw Story and AlterNet.
This case also involved potential DMCA violations for removing copyright information. However, Judge Colleen McMahon clarified that the real issue wasn't the removal of copyrighted information from OpenAI's training datasets. Instead, the plaintiffs' main concern was OpenAI's use of their articles to develop ChatGPT without compensation.
On this basis, Judge McMahon found that the plaintiffs had failed to show sufficient concrete harm from the use of their articles as AI training data. She sided with OpenAI's position that ChatGPT produces AI-generated responses based on training content, with a low probability of exact copying of articles, and noted that if copying occurred, it would be an error rather than an intended feature. She also emphasized that pure facts aren't copyrighted. This is Big AI's fair use argument in a nutshell.
However, in The Intercept's case, Judge Rakoff found the plaintiff could potentially prove concrete damage from the copyright information removal. This aspect of the case will now be subject to further legal review.
The Intercept filed its lawsuit in February, part of a growing wave of media companies taking legal action against OpenAI and Microsoft over various copyright issues in AI development. These initial rulings likely mark the beginning of a longer legal battle over the use of copyrighted content to train AI models.