Google is shifting from using its AI chips purely in-house to selling them externally, a move that directly challenges Nvidia's market dominance. A new analysis suggests the mere existence of Google's latest TPUs is already driving down prices for AI computing power.
For years, Google kept its Tensor Processing Units (TPUs) almost exclusively for its own AI models. That strategy changes with the new TPUv7 "Ironwood." According to an analysis by the chip experts at SemiAnalysis, Google is now aggressively selling its silicon to third parties, positioning itself as a direct rival to Nvidia.
Anthropic headlines the customer list. The analysis indicates the startup's deal involves around one million TPUs, split between direct hardware purchases and cloud rentals via the Google Cloud Platform (GCP). The infrastructure required to run this hardware reportedly consumes more than one gigawatt of power.
The market is already feeling the impact. SemiAnalysis reports that OpenAI negotiated a roughly 30 percent discount on its Nvidia fleet simply by credibly threatening to switch to TPUs or other alternatives.
"The more (TPU) you buy, the more (NVIDIA GPU capex) you save," analysts Dylan Patel, Myron Xie, and Daniel Nishball write, a playful riff on Nvidia CEO Jensen Huang's famous "the more you buy, the more you save" catchphrase.
TPUs prove they can handle top-tier AI models
Usage data shows that TPUs are no longer a second-tier alternative. Two of the most powerful AI models released recently run mostly off Nvidia hardware: Google's Gemini 3 Pro was trained entirely on TPUs, while Anthropic's Claude Opus 4.5 relies predominantly on Google TPUs and Amazon's Trainium chips.
Technically, the TPUv7 "Ironwood" nearly matches Nvidia's Blackwell generation in theoretical computing power (FLOPs) and memory bandwidth, according to SemiAnalysis. But the real killer feature is the price tag.
For Google, the total cost of ownership (TCO) per chip is roughly 44 percent lower than for a comparable Nvidia GB200 system. Even for external customers like Anthropic, who pay a markup, the cost per effective compute unit could be 30 to 50 percent lower than on Nvidia systems, based on the analysts' model.
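The arithmetic behind this claim can be sketched in a few lines. The 44 percent TCO gap comes from the report; the 1.2x markup and the assumption of roughly equal effective compute per chip are placeholder figures for illustration, not numbers from the analysis.

```python
# Illustrative sketch of the TCO comparison, normalized to an Nvidia
# GB200 baseline of 1.0. All specific inputs beyond the 44 percent
# internal-TCO gap are assumptions for the example.

def cost_per_effective_compute(tco_per_chip: float, effective_flops: float) -> float:
    """Relative dollars per unit of effective compute for one chip."""
    return tco_per_chip / effective_flops

nvidia = cost_per_effective_compute(tco_per_chip=1.00, effective_flops=1.0)

# Google's internal TPU TCO is ~44 percent lower per the analysis;
# assume roughly comparable effective compute per chip.
google_internal = cost_per_effective_compute(tco_per_chip=0.56, effective_flops=1.0)

# An external customer pays a markup (an assumed 1.2x here) but can
# still land well below the Nvidia baseline.
external_customer = cost_per_effective_compute(tco_per_chip=0.56 * 1.2, effective_flops=1.0)

print(f"Google internal vs Nvidia:   {google_internal / nvidia:.2f}x")
print(f"External customer vs Nvidia: {external_customer / nvidia:.2f}x")
```

Under these assumptions the external customer lands at about 0.67x the Nvidia baseline, i.e. roughly a third cheaper, which sits inside the 30 to 50 percent range the analysts model.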
This advantage scales for teams that optimize their software. Google's system can link up to 9,216 chips into a single, densely networked domain. This architecture makes distributing massive AI training runs easier compared to conventional Nvidia systems, which typically cluster just 64 to 72 chips closely together.
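The scaling point above is easy to see with the domain sizes quoted: a job that fits in a handful of TPU "islands" spans thousands of Nvidia rack-scale domains, and every island boundary crosses a slower network tier. The 131,072-chip job size below is a made-up example; only the 9,216 and 72 domain sizes come from the text.

```python
import math

# Hypothetical training job size, chosen only for illustration.
job_chips = 131_072

tpu_domain = 9_216   # TPUv7 "Ironwood" scale-up domain, per the analysis
nvidia_domain = 72   # typical NVL72-style rack domain

# Each "island" is one densely networked domain; traffic between
# islands must cross a slower scale-out network.
tpu_islands = math.ceil(job_chips / tpu_domain)
nvidia_islands = math.ceil(job_chips / nvidia_domain)

print(f"TPU islands:    {tpu_islands}")
print(f"Nvidia islands: {nvidia_islands}")
```

Fewer, larger islands mean fewer slow boundaries for the parallelism strategy to work around, which is why well-optimized software can extract more of the hardware's paper advantage.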
Software updates aim to break the CUDA lock-in
Software has long been the biggest hurdle for TPU adoption, with Nvidia's CUDA platform serving as the industry standard. Google is investing heavily to remove this barrier. The report notes the company is working on native support for the popular PyTorch framework and integration with inference libraries like vLLM.
The goal is to make TPUs a viable alternative without forcing developers to rebuild their entire toolchain. However, the core of the TPU software stack—the XLA compiler—remains proprietary. SemiAnalysis views this as a missed opportunity, as open-sourcing it could have accelerated adoption by the broader community.
To deploy this massive amount of silicon, Google is using creative financing. The company is partnering with "neoclouds" like Fluidstack and crypto miners like TeraWulf. In these deals, Google often acts as a financial backstop: if the operator fails, Google guarantees the rental payments. This strategy allows for the rapid conversion of existing crypto mining data centers into AI facilities.
Nvidia's next generation could wipe out the price advantage
Facing pressure from Google's success, Nvidia is preparing a technological counterattack. Its next-generation "Vera Rubin" chips, expected in 2026 or 2027, will feature aggressive design choices like HBM4 memory and extremely high bandwidths.
Google's planned response, the TPUv8, follows a dual strategy according to SemiAnalysis. The company plans to release two variants: one developed with longtime partner Broadcom (codenamed "Sunfish") and another with MediaTek (codenamed "Zebrafish"). Despite this diversification, the designs appear conservative. Analysts note the project is suffering from delays and relies on an architecture that forgoes the aggressive adoption of TSMC's 2nm process and HBM4 seen in competing designs.
The stakes are high for Google. If Nvidia executes well on the performance gains for Rubin, the current cost advantage of TPUs could evaporate. SemiAnalysis warns of a scenario where Nvidia's Rubin systems—specifically the "Kyber Rack"—become more economical than Google's own TPUv8, even for internal workloads.
"The cards have been shown by Google, and now Nvidia has to execute to remain the lion at the top of the food chain," SemiAnalysis concludes. If the market leader executes its roadmap flawlessly, it stays on top. But if Nvidia stumbles on performance or misses the Rubin schedule, its dominance could be in serious trouble.