Nvidia CEO Jensen Huang: The idea that AI will destroy software is "ridiculous"
Jensen Huang explains why AI agents will use software rather than replace it. Nvidia has redesigned its entire rack architecture accordingly.
"A lot of people would say, 'You know AI is gonna completely destroy software. We don't need software anymore. We don't even need tools anymore.' That's ridiculous," Jensen Huang says on the Lex Fridman Podcast.
His counterargument is a thought experiment: even the most impressive agent we can imagine in the next ten years, a humanoid robot, would most likely use the existing microwave rather than beam microwaves out of its fingers. The first time it walks up to the microwave, it probably doesn't know how to use it. "But that's okay. It's connected to the internet. It reads the manual of this microwave, reads it, instantly becomes an expert." With that, Huang says, he had essentially described "almost all of the properties of OpenClaw." He adds that he sketched the concept for such agents two years earlier on the GTC stage.
Huang compares OpenClaw's impact to ChatGPT
Huang sees OpenClaw as a turning point on par with ChatGPT. According to Huang, the framework "did for agentic systems what ChatGPT did for generative systems." He explains the breakthrough in practical terms: OpenClaw went viral "because consumers could reach it." He calls it "the iPhone of tokens" and "the fastest-growing application in history."
Behind this lies a broader economic argument. According to Huang, tokens are becoming a commodity with differentiated price tiers, from free tokens to premium tokens. The idea that someone will be willing to pay $1,000 per million tokens is "just around the corner. It's not if, it's only when," he says. In his view, data centers are transforming from warehouses for data into factories for tokens whose revenue directly correlates with token production.
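Huang's "factories for tokens" framing reduces to simple arithmetic: revenue scales linearly with tokens produced, at whatever price tier the buyer accepts. A minimal sketch, with hypothetical tier prices (only the $1,000-per-million figure comes from Huang, and he presents it as a near-future scenario, not a current price):

```python
def token_revenue(tokens_generated: int, price_per_million_usd: float) -> float:
    """Revenue for a batch of generated tokens at a given price per million."""
    return tokens_generated / 1_000_000 * price_per_million_usd

# Hypothetical price tiers, free through premium; the $1,000 tier is
# Huang's "just around the corner" scenario, not a quoted market price.
tiers = {"free": 0.0, "standard": 2.0, "premium": 1000.0}

# What a single 10,000-token agent response earns at each tier:
for name, price in tiers.items():
    print(f"{name}: ${token_revenue(10_000, price):.2f}")
```

At the premium tier, one 10,000-token response is worth $10, which is the scale of revenue per request that would make a data center's output look like factory production rather than storage.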
Nvidia's new rack architecture is built for agents, not just LLMs
For Nvidia, this conviction has real consequences. The Grace Blackwell racks were optimized purely for LLM inference. The new Vera Rubin platform instead consists of five specialized rack types. These include dedicated Vera CPU racks for agent sandboxing, BlueField-4 storage racks for massive KV-cache context, and the Groq 3 LPX rack for ultra-low-latency inference. "This entire rack system is completely different than the previous one," Huang says. The last one was designed to run MoE large language models for inference; this one is built to run agents. And agents, as Huang puts it, "bang on tools."