Google gathers triple OpenAI's AI data through its search monopoly

Dec 5, 2025


Cloudflare data underscores how Google's combined search and AI crawling gives it a massive data advantage over OpenAI and Anthropic.

Cloudflare CEO Matthew Prince argues that Google benefits from an unusually privileged level of access to the web, driven by the way it links its search crawler with its AI data collection systems.

Prince says internal Cloudflare measurements show that Google currently sees 3.2 times more pages than OpenAI. The gap widens even further with other competitors: Google captures 4.6 times more content than Microsoft and 4.8 times more than Anthropic or Meta. According to Prince, this imbalance stems from Google's decision to bundle its search crawler with its AI crawler. Site owners cannot block AI training without also disappearing from Google Search, creating a dilemma that effectively gives Google exclusive access to vast amounts of data.
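The multiples Prince cites can be turned into rough relative figures with simple arithmetic. The sketch below is only an illustration: it assumes "X times more" means "X times as many" and that all four multiples measure the same quantity, neither of which the article confirms. The function names are made up for this example.

```python
# Multiples from the article: how many times more pages Google sees
# than each company, per Cloudflare's measurements as cited by Prince.
GOOGLE_MULTIPLE = {
    "OpenAI": 3.2,
    "Microsoft": 4.6,
    "Anthropic": 4.8,
    "Meta": 4.8,
}


def share_of_google(company: str) -> float:
    """Fraction of Google's page volume that `company` sees (assumption:
    'N times more' is read as 'N times as many')."""
    return 1.0 / GOOGLE_MULTIPLE[company]


def relative_reach(a: str, b: str) -> float:
    """Derived estimate of how many times as many pages `a` sees as `b`."""
    return GOOGLE_MULTIPLE[b] / GOOGLE_MULTIPLE[a]


if __name__ == "__main__":
    for name in GOOGLE_MULTIPLE:
        print(f"{name}: {share_of_google(name):.1%} of Google's volume")
    print(f"OpenAI vs. Microsoft: {relative_reach('OpenAI', 'Microsoft'):.2f}x")
```

Under those assumptions, OpenAI would see a little under a third of what Google sees, and the gap between OpenAI and the remaining competitors would be far smaller than the gap between Google and OpenAI.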

Prince frames this as a misuse of long-standing market dominance, suggesting that Google's behavior lets it extend its historical monopoly into the emerging AI landscape.

How search lock-in limits publishers' ability to block AI scraping

The scale of the imbalance becomes clearer when looking at how aggressively site owners are trying to push back. Since July 1, Cloudflare has already blocked 416 billion AI requests for its customers. These blocks mainly affect companies that follow standards or identify their crawlers separately. Google, however, avoids that barrier through the tight coupling of its search and AI systems.
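The difference between a separately identified crawler and a bundled one can be illustrated with robots.txt rules and Python's standard `urllib.robotparser`. This is a minimal sketch, not a description of how Cloudflare blocks requests or of Google's actual crawler architecture: the `CRAWLER_USES` mapping simply encodes the article's claim that one Google crawler serves both search and AI, and the robots.txt contents and the `example.com` URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumption taken from the article's description: OpenAI's crawler is
# identified separately, while Google's search crawl also feeds its AI.
CRAWLER_USES = {
    "GPTBot": {"ai"},
    "Googlebot": {"search", "ai"},
}

# Hypothetical robots.txt that blocks only the separately identified AI crawler.
BLOCK_AI_ONLY = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical robots.txt that also blocks Google's crawler.
BLOCK_GOOGLE_TOO = BLOCK_AI_ONLY.replace(
    "User-agent: *", "User-agent: Googlebot\nDisallow: /\n\nUser-agent: *"
)


def allowed_uses(robots_txt: str, url: str = "https://example.com/article") -> set:
    """Return the set of uses still served by crawlers that robots.txt admits."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    uses = set()
    for agent, agent_uses in CRAWLER_USES.items():
        if parser.can_fetch(agent, url):
            uses |= agent_uses
    return uses


if __name__ == "__main__":
    print(allowed_uses(BLOCK_AI_ONLY))     # Googlebot still admitted
    print(allowed_uses(BLOCK_GOOGLE_TOO))  # no crawler admitted
```

In this toy model, blocking GPTBot alone leaves both search visibility and AI use in place because the bundled crawler is still admitted, while blocking the bundled crawler removes AI use and search visibility together. That is the dilemma the article describes.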

Publishers face a binary choice: allow their content to be used to train Google's AI models or lose visibility in Search, a trade-off that could be financially ruinous for many.

Prince told WIRED that Google will remain the central obstacle to progress unless it is pressured or persuaded to separate its search and AI crawlers. Without that split, publishers have almost no practical way to protect their content or to negotiate the licensing models that will be critical in the era of generative AI.
