"Napster-style" piracy allegations put Anthropic at risk of a billion-dollar class action lawsuit

Jul 19, 2025

A California federal court has cleared the way for a billion-dollar class action lawsuit against Anthropic, the company behind the Claude language model, over claims of large-scale copyright infringement.

The suit alleges that Anthropic downloaded as many as seven million books from pirate sites like LibGen and PiLiMi between 2021 and 2022. This puts the company in the crosshairs for potentially massive damages, even after a partial win on fair use grounds just weeks earlier.

A "Napster-style" piracy case

According to the court order of July 17, 2025, Anthropic is accused of using the BitTorrent protocol to download pirated books from LibGen and PiLiMi. These files, typically in .epub, .pdf, or .txt format, were stored in a central internal database, regardless of whether they were later used to train AI models.

Judge William Alsup described the company's actions as "Napster-style downloading of millions of works." The order details how, between January 2021 and July 2022, an Anthropic co-founder first downloaded about 200,000 books from the Books3 collection, followed by roughly five million from LibGen and another two million from PiLiMi, targeting titles not already in LibGen.

The court decided the case should move forward as a class action, given the sheer volume and complexity of the evidence. Only works sourced from LibGen and PiLiMi are included; Books3 was left out due to missing metadata.

The financial risk for Anthropic is significant. Under US law, statutory damages for willful copyright infringement can reach $150,000 per work. With millions of titles potentially at issue, even a much smaller amount per title could still total billions.
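To make the scale concrete, here is a rough back-of-the-envelope sketch of that arithmetic. It is illustrative only: the $750 floor is the statutory minimum per work under 17 U.S.C. § 504(c), which the court order summary above does not cite, and the work counts are assumptions. The final class will likely cover fewer than the roughly seven million downloaded files, since only registered, identifiable titles count. The function names are hypothetical.

```python
# Illustrative arithmetic only. The class size and the per-work award are
# assumptions, not figures set by the court.

STATUTORY_MIN_PER_WORK = 750          # statutory floor per work (17 U.S.C. § 504(c))
WILLFUL_MAX_PER_WORK = 150_000        # ceiling for willful infringement, as cited in the article


def total_exposure(works: int, per_work_usd: int) -> int:
    """Return total damages in USD for a given number of works and per-work award."""
    if works < 0 or per_work_usd < 0:
        raise ValueError("works and per_work_usd must be non-negative")
    return works * per_work_usd


def format_billions(amount_usd: int) -> str:
    """Format a USD amount in billions with two decimal places."""
    return f"${amount_usd / 1e9:,.2f} billion"


if __name__ == "__main__":
    # Hypothetical class sizes: all ~7 million downloads, and a much smaller subset.
    for works in (7_000_000, 500_000):
        for per_work in (STATUTORY_MIN_PER_WORK, WILLFUL_MAX_PER_WORK):
            total = total_exposure(works, per_work)
            print(f"{works:>9,} works x ${per_work:>7,} = {format_billions(total)}")
```

Under these assumptions, seven million works at the $750 minimum already comes to $5.25 billion, which is why even a far smaller class or a low per-title award could still reach into the billions.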

Anthropic must turn over a complete metadata list of its LibGen and PiLiMi downloads by August 1, 2025, while plaintiffs are required to submit a detailed list of titles and registrations by September 1, 2025.

Fair use doesn't apply to piracy

In June, the same court ruled that training AI models on legally obtained books may qualify as fair use, especially if the use is "transformative" and no copies are distributed. But the court also made it clear: storing pirated works in an internal library doesn't qualify as fair use.

While the legal status of mass web scraping and the use of public data for AI training is still up in the air, the court’s ruling sets a clear boundary: pirated content can't be justified as fair use, even for AI research or innovation.

The Anthropic case could set a major precedent for the industry, making it clear that AI companies can't sidestep copyright laws when sourcing training data, regardless of how they use it later. The decision could ripple out to ongoing lawsuits against Meta, OpenAI, and others accused of using copyrighted material to train language models.
