OpenAI apparently going all-in on ChatGPT copyright lawsuit

Midjourney prompted by THE DECODER

Key Points

  • Authors Sarah Silverman, Chris Golden, and Richard Kadrey are suing OpenAI for copyright infringement, alleging that their works were part of the training material for OpenAI's AI models.
  • OpenAI denies the allegations on all counts and bases its defense on fair use, arguing that copyright law should not impede technological innovation. At the same time, the company is not asking the court to dismiss the direct copyright infringement allegation, even though it makes clear that it considers the allegation false. This suggests it wants the court to clarify the question.
  • If the case goes to trial, the ruling could bring clarity to the copyright debate surrounding AI training data, which also affects other major AI companies such as Meta and Google.

Authors are suing OpenAI, alleging that their copyrighted works became part of the training material for GPT models without their consent. The company denies the allegations on all counts, but still seems to be seeking fundamental legal clarification.

In early July, news broke that comedian Sarah Silverman and authors Chris Golden and Richard Kadrey had filed a lawsuit against OpenAI, alleging that their works had become part of the training material for OpenAI's AI models. The allegations are:

  • direct copyright infringement
  • vicarious copyright infringement
  • removal of copyright management information (DMCA)
  • unfair competition
  • unjust enrichment
  • and negligence.

OpenAI neither disputes nor confirms that the books by the named authors were used for AI training. Nevertheless, OpenAI moves to dismiss allegations two through six, but not the first. More on that later.

In its motion to dismiss, OpenAI invokes fair use, arguing that copyright should not impede technological innovation. It also points to how large language models work: according to OpenAI, they generate substantially new content that does not directly incorporate specific copyrighted passages from the training data, and they are trained on large amounts of text rather than on a single, specific text.

OpenAI cites several cases in which the use of copyrighted material in innovative and transformative ways was found not to infringe copyright. The company also says that claims that copyright management information, such as the author's name, was removed are simply false and unsubstantiated.

OpenAI seeks clarity from court ruling

As copyright specialist Andres Guadamuz points out on Twitter, despite this line of argument, OpenAI is explicitly not asking for the first claim, the claim of direct copyright infringement, to be dismissed.

Guadamuz calls the move "surprising," but suggests it is tactical: OpenAI may be hoping for a ruling that AI training falls under fair use.

If the direct copyright infringement charge actually goes to trial, "that would be big," Guadamuz says. He gives OpenAI a good chance of getting the other charges dismissed, as requested.

That would make direct copyright infringement the focus of the trial. Guadamuz also says that OpenAI might think it has a good chance of winning this case, and that "many copyright lawyers I've talked to in the last couple of months seem to agree."

The ruling could bring some clarity to the copyright debate over text and image data for AI training, which goes far beyond this case and involves other major AI companies such as Meta and Google.
