Ziff Davis pushes to erase ChatGPT models trained on its articles as AI-media tensions rise

Apr 26, 2025

The media company Ziff Davis has filed a lawsuit against OpenAI in the U.S. District Court in Delaware, alleging massive copyright violations, breaches of the Digital Millennium Copyright Act (DMCA), unlawful enrichment, and trademark dilution.

According to the complaint, OpenAI systematically copied material from Ziff Davis websites, bypassed technical safeguards, and removed copyright notices to build training datasets for products like ChatGPT. Ziff Davis owns major digital brands including IGN, Mashable, CNET, ZDNET, PCMag, and Everyday Health, publishing nearly two million articles and updates annually.

Complaint details web scraping and copyright removal

The lawsuit documents how OpenAI's GPTBot web crawler allegedly ignored robots.txt files that prohibit content collection. Court filings show crawler activity increased after Ziff Davis sent a cease-and-desist letter in May 2024.

The complaint alleges OpenAI used specialized software to strip copyright information from articles. As evidence, it points to the public "WebText" dataset, which contains unattributed Ziff Davis content. Analysis shows Ziff Davis material appears millions of times in training data, at rates exceeding its normal web presence.

Content reproduction and attribution issues

Tests by Ziff Davis show OpenAI's models can reproduce exact passages from its articles, even offline, suggesting direct copying of training data. With internet access enabled, the models actively pull from Ziff Davis content.

Image comparison: AI (ChatGPT, OpenAI) reproduces copyrighted articles by Ziff Davis (Dicebreaker, Mashable). — Examples from the Ziff Davis v. OpenAI lawsuit: ChatGPT reproduces protected content and OpenAI training data includes copyrighted material. | Image: via PCMag

The lawsuit also details how OpenAI's products can misrepresent Ziff Davis material by generating incorrect summaries, misquoting sources, creating fake article links, or "hallucinating" facts and attributing them to Ziff Davis publications. The company argues that these practices dilute the media group's brand and damage its reputation as a trustworthy source of information.

Davis v. OpenAI lawsuit demonstrates how ChatGPT cites outdated articles and overlooks newer, more relevant content from the same source, which, according to Ziff Davis, harms its brands. | Image: via CourtListener

Independent studies have also found that chatbots may misrepresent news content or cite incorrect sources.

Ziff Davis describes AI-generated content as a fundamental threat to its business, arguing that it competes directly with original articles and diverts user traffic—thereby undermining advertising and commission-based revenue streams. The company alleges that OpenAI profits from Ziff Davis's investment and labor without compensation, while at the same time entering license agreements with other publishers.

The lawsuit seeks damages, attorneys' fees, and a permanent injunction. Notably, Ziff Davis requests the destruction of all OpenAI training datasets and AI models containing or derived from its copyrighted works. According to the filing, OpenAI rejected attempts by Ziff Davis to negotiate a license agreement prior to the lawsuit.

Lack of transparency in OpenAI’s media licensing agreements

OpenAI and other AI companies have recently entered into largely opaque licensing agreements with selected media organizations. These deals, often retroactive, appear designed to preempt potential litigation over the use of content in AI training. Journalism researcher Jeff Jarvis described such payments as hush money.

These practices effectively position AI companies as gatekeepers, giving them discretion over which publishers benefit financially. The consolidation of such power may pose a greater threat to media diversity than the current reliance on search engines like Google, since users seldom leave chatbot platforms.

The Ziff Davis case is one of many ongoing lawsuits between content providers and AI companies. OpenAI is currently involved in more than 15 such disputes. Within the sector, the litigation between The New York Times and OpenAI is viewed as a defining case regarding the use of copyrighted text in AI training.

AI News Without the Hype – Curated by Humans

Subscribe to THE DECODER for ad-free reading, a weekly AI newsletter, our exclusive "AI Radar" frontier report six times a year, full archive access, and access to our comment section.

AI news without the hype
Curated by humans.

More than 16% discount.
Read without distractions – no Google ads.
Access to comments and community discussions.
Weekly AI newsletter.
6 times a year: “AI Radar” – deep dives on key AI topics.
Up to 25 % off on KI Pro online events.
Access to our full ten-year archive.
Get the latest AI news from The Decoder.

Subscribe to The Decoder