The next copyright lawsuit against OpenAI is underway. This time, the big names in the author community are weighing in.
Seventeen authors, including John Grisham, George R.R. Martin and Jodi Picoult, have filed a copyright infringement lawsuit against OpenAI in New York federal court. They filed the suit together with the Authors Guild, which had previously threatened to sue OpenAI.
AI copyright expert Andres Guadamuz calls the case arguably the most important of the many ongoing proceedings. "This was always going to be settled by the intervention of the large copyright holders with good lawyers," Guadamuz writes.
The fact that the lawsuit was filed in New York rather than California, as many others have been, is a strategic move in case the California litigation ends in favor of the rights holders, Guadamuz says.
New lawsuit, same content
The substance of the case is similar to that of the ongoing lawsuits: The plaintiffs allege that OpenAI used copyrighted books for AI training without permission, pointing to the books2 dataset. As evidence, they cite ChatGPT's own statements about its training data, its ability to summarize the books, and its ability to generate works in the style of the originals.
For example, the plaintiffs cite a generated outline of a "Game of Thrones" prequel called "A Dawn of Direwolves," which features characters from George R.R. Martin's "A Song of Ice and Fire."
"For fiction writers, OpenAI's unauthorized use of their work is identity theft on a grand scale. Fiction authors create entirely new worlds from their imaginations—they create the places, the people, and the events in their stories," writes the Authors Guild.
It's mostly about fair use
All of this is weak evidence, Guadamuz points out, because ChatGPT may not be able to name its training data or may be hallucinating about it. Book summaries can also come from Internet sources such as Wikipedia. And even if OpenAI did use the books for AI training, that could still qualify as fair use, which is what this case is really about.
A quick resolution is unlikely: Guadamuz expects the case to drag on for years, with many appeals. He does not expect an out-of-court settlement. OpenAI also seems to want to settle the matter once and for all in court.
Meanwhile, Big AI is likely to continue training large AI models while learning from potential mistakes in the selection of training data for its initial models. This is already evident in newer image models such as DALL-E 3, for which OpenAI lets artists opt their work out of the training data.