Date: 2026-03-12

An SEC filing reveals that Nvidia plans to spend $26 billion on open-weight AI models over the next five years. The move doubles as a strategic response to the growing dominance of Chinese open-source models and as a way to keep developers locked into Nvidia's hardware ecosystem.

Nvidia will invest $26 billion over five years to build open-source AI models, according to a financial filing with the US Securities and Exchange Commission. Company executives confirmed the plans in interviews with WIRED.

Alongside the announcement, Nvidia released Nemotron 3 Super, its most capable open-weight model to date, with 128 billion parameters. On the Artificial Analysis Index benchmark suite, the model narrowly outperforms OpenAI's GPT-OSS and lands roughly on par with Anthropic's Claude 4.5 Haiku, but it still falls short of competitors such as Alibaba's Qwen3.5 122B A10B.

Nvidia used several technical innovations during training to improve reasoning capabilities and long-context handling. Like its smaller variants, Nemotron 3 Super is a hybrid model that combines the Transformer and Mamba architectures.

Chinese providers have taken the lead in open-source AI

The investment comes at a time when the balance of power in open AI models is shifting. Meta led the way with Llama, but CEO Mark Zuckerberg recently signaled that future models may not be fully open. OpenAI's GPT-OSS remains significantly weaker than the company's proprietary offerings. Anthropic doesn't offer open models at all.

Meanwhile, Chinese providers like DeepSeek, Alibaba, Moonshot AI, and MiniMax release nearly all of their model weights for free. There have been recent signs of change here too, with departures from Alibaba's Qwen team. But Chinese models remain the best open alternative for many use cases. That holds even though newer benchmarks repeatedly show that the practical gap to the best Western models is often larger than older benchmarks, which all vendors optimize for, would suggest.

Still, broad adoption of these models in Western industry hasn't materialized. The trend points toward closed models from providers like Anthropic or OpenAI.

Open models double as a tool to sell more Nvidia hardware

In January 2025, DeepSeek caused turmoil on the stock market with an efficient open-source model that called into question the perceived lead of Western AI labs and the amount of hardware needed to compete. The next shock may be imminent: a new DeepSeek model has reportedly been trained exclusively on chips from Chinese manufacturer Huawei, which is under US sanctions. If true, more companies and researchers could shift to Huawei hardware, particularly in China.

Other reports suggest that DeepSeek also has access to sanctioned Nvidia Blackwell GPUs and continues to train on them. Under pressure from the Chinese government, DeepSeek has been trying for some time to train on Huawei chips, but according to reports from last year, the effort failed due to technical problems including unstable performance and an immature software toolchain. At the same time, Nvidia has received permission to export more powerful AI chips to China again after years of strict sanctions. Chinese companies are eager to buy, but China's leadership wants to prevent renewed dependency.

Releasing its own open models optimized specifically for Nvidia hardware creates a counterweight here. Provided Nvidia can produce truly competitive models, it also gives Western companies an alternative. Those who use Nemotron and related models stay within the Nvidia ecosystem. The company is also targeting a market where the major AI labs have been largely absent: robotics and other edge AI applications.

Bryan Catanzaro, VP of Applied Deep Learning Research at Nvidia, put it diplomatically to WIRED: "We're an American company, but we work with companies across the world. It's in our interest to make the ecosystem diverse and strong everywhere." Nvidia has already pretrained a 550-billion-parameter model and released specialized models for robotics, climate modeling, and protein folding.

Kari Briski, VP of Generative AI Software, pointed to another strategic dimension: the models also serve to stress-test Nvidia's own supercomputer-scale data centers and push its hardware roadmap forward. "We build it to stretch our systems and test not just the compute but also the storage and networking, and to kind of build out our hardware architecture roadmap," she told WIRED.