A New York federal judge has dismissed a lawsuit brought by news sites Raw Story and AlterNet against OpenAI for using their articles to train AI systems. The ruling could affect similar ongoing cases.
Judge Colleen McMahon dismissed the case after finding that the plaintiffs failed to show concrete harm from OpenAI's use of their content as training data. Unlike other lawsuits targeting AI companies, this case focused on the removal of copyright management information rather than direct copyright violations—though Judge McMahon noted the underlying issue remained the same.
The judge's decision supported the fair use defense of OpenAI and other AI companies, noting that ChatGPT synthesizes responses from its training data rather than copying content directly. She emphasized that the likelihood of ChatGPT reproducing exact copies of articles is minimal, and pointed out that the factual information in articles is not protected by copyright anyway.
The ruling also addressed past instances in which ChatGPT had reproduced text verbatim, noting that these instances cannot be replicated with current versions. That supports OpenAI's position that such copying, when it occurs, is a rare bug rather than intended functionality.
While dismissing the current case, Judge McMahon left the door open for the plaintiffs to file an amended complaint. However, she expressed doubt about their ability to "allege a cognizable injury," according to Reuters. "Whether there is another statute or legal theory that does elevate this type of harm remains to be seen," McMahon said.
Raw Story's lawyer, Matt Topic, expressed confidence that the court's concerns would be addressed in a revised complaint. OpenAI has not yet commented on the decision.
Broader implications for AI copyright battles
McMahon's decision could set the tone for further copyright lawsuits against AI companies, particularly the New York Times lawsuit against OpenAI and cases filed by music companies against AI music generators. The Times case specifically challenges OpenAI's unauthorized use of its articles to create what it considers a competing product.
The ruling strengthens OpenAI's position by supporting its argument that AI-generated content is a synthesis of training data rather than a direct copy, a distinction that could prove crucial in future legal battles over AI training practices.