Content
summary Summary

Reddit has filed a lawsuit against Anthropic in Superior Court in San Francisco, accusing the AI startup of systematically scraping Reddit posts to train its Claude language models without permission.

Ad

According to the platform's user agreement, commercial use of Reddit content requires an explicit license. Reddit says Anthropic ignored that rule, bypassed technical safeguards like robots.txt files and IP-based rate limits, and never connected to Reddit's compliance API—the tool that tells licensees when a user deletes a post so it can be removed from their systems.

According to the lawsuit, Anthropic has publicly acknowledged using Reddit data in past research and even listed more than 40 subreddits—including r/science, r/IAmA, and r/relationship_advice—as "high-quality" sources for training Claude. Reddit says Anthropic gathered that data without consent and despite those protective measures.

The lawsuit states an Anthropic spokesperson claimed in July 2024 that Reddit had been on ClaudeBot's blocklist since May. Reddit's internal logs tell a different story, showing more than 100,000 hits from Anthropic bots on Reddit's servers in the months after that claim.

Ad
Ad

Reddit seeks injunction and damages

Reddit's lawsuit accuses Anthropic of multiple legal violations, from breach of contract to unfair competition. The platform is seeking damages for lost licensing revenue, demanding Anthropic delete all AI models and datasets containing Reddit content, and asking the court to prevent Anthropic from commercially using Claude or any AI models trained on Reddit data.

Reddit argues that Anthropic's actions threaten both the company's business interests and its users' privacy. Without a license or a connection to the compliance API, there is no way to confirm whether deleted or sensitive posts are still embedded in Claude.

"If parties such as Anthropic scrape Reddit content without a licensing agreement, Reddit users enjoy none of the protections present in Reddit’s Public Content Policy and Privacy Policy, in part, because Reddit users have no way of knowing which parties have scraped and obtained their data," the lawsuit states.

The platform notes that other AI companies have chosen a different path. Google reportedly pays Reddit $60 million a year for training data, and the partnership has given Reddit a boost in Google Search visibility in recent months.

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • Reddit has taken legal action against Anthropic, alleging that the company systematically collected and used content from subreddits to train its AI models without authorization.
  • The complaint states that Anthropic violated Reddit's terms of use, technical safeguards, and failed to secure a license, putting both economic interests and the privacy of deleted or sensitive user data at risk.
  • Reddit is seeking damages, the removal of all AI models trained with Reddit data, and a stop to Anthropic's further commercial use of these systems.
Sources
Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.