Reddit says AI companies misused the Wayback Machine to scrape its content
Reddit is heavily restricting the Internet Archive's access to its platform.
Going forward, the Wayback Machine will only be able to index Reddit's homepage; individual posts, user comments, and profile pages will be off limits. According to company spokesperson Tim Rathschmidt, the move responds to incidents in which AI companies scraped Reddit content via the Wayback Machine, in violation of the platform's rules. Reddit notified the Internet Archive ahead of time, and the new restrictions are being rolled out immediately.
The change is part of a broader push by Reddit to crack down on data scraping and unpaid use of its data by AI firms. In 2024, Reddit signed licensing deals with Google and OpenAI for access to its data, while blocking search engines that don't pay. The company has also filed a lawsuit against Anthropic over alleged unauthorized data scraping for AI training.