CES 2026: Nvidia promises five times the AI performance and ten times cheaper inference with Vera Rubin
Key Points
- Nvidia presented the Vera Rubin platform at CES 2026: The new Rubin GPU is set to deliver three times as much AI training compute and five times as much inference compute as its predecessor Blackwell, with availability from the second half of 2026.
- With Alpamayo, the company is releasing new models for autonomous driving. Mercedes-Benz will introduce Nvidia DRIVE in the new CLA in 2026.
- DLSS 4.5 brings a new Transformer model for super resolution and, exclusively for the RTX 50 series, a 6x multi-frame generation mode; a new dynamic mode automatically matches frame generation to the display's refresh rate.
The chipmaker announced a new AI supercomputer, open-source autonomous driving software, and improved graphics upscaling. The message is clear: Nvidia wants to dominate the entire AI value chain.
Jensen Huang used CES 2026 in Las Vegas to showcase Nvidia's ambitions. The CEO unveiled three major product lines: the Vera Rubin AI computing platform, an open-source autonomous driving platform called Alpamayo, and the next generation of graphics upscaling with DLSS 4.5.
Six chips form an AI supercomputer
The Vera Rubin platform, named after the American astronomer, consists of six different chips: the Vera CPU, Rubin GPU, NVLink-6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch. Nvidia says the system is already in "full production" and will be available through partners starting in the second half of 2026.
The performance claims are impressive: The Rubin GPU is said to deliver three times the AI training compute and five times the AI inference compute of its predecessor, Blackwell. Those figures refer specifically to NVFP4, Nvidia's 4-bit floating-point format. CEO Huang emphasized that the third-generation Transformer Engine with hardware-accelerated adaptive compression plays a major role in these gains. Inference token costs should drop by a factor of ten, and training large mixture-of-experts models should require only a quarter of the GPUs.
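To make the NVFP4 claim concrete: the published format stores values as 4-bit floating point (E2M1) with a shared scale per small block of elements. The sketch below illustrates that idea in plain NumPy. It assumes 16-element blocks and keeps the block scale in full precision, whereas real NVFP4 stores it in FP8, so this is an approximation of the scheme rather than Nvidia's implementation.

```python
import numpy as np

# Positive values representable in FP4 E2M1 (plus their negatives and zero).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(block: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize one 16-element block to FP4 values with a shared scale.

    Real NVFP4 stores the scale in FP8 (E4M3); we keep it in float
    for simplicity.
    """
    scale = np.abs(block).max() / FP4_GRID[-1]  # map the block max onto 6.0
    if scale == 0.0:
        return np.zeros_like(block), 0.0
    scaled = block / scale
    # Round each element to the nearest representable FP4 magnitude.
    idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
    return np.sign(scaled) * FP4_GRID[idx], scale

weights = np.random.randn(16).astype(np.float32)
q, s = quantize_nvfp4_block(weights)
print("max abs error after dequantization:", np.abs(weights - q * s).max())
```

Storing 4-bit values plus one small scale per block is what makes the format cheap: memory traffic and compute per value drop sharply, which is where claims like ten-times-cheaper inference tokens come from.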
The complete Vera Rubin NVL72 rack reaches 260 terabytes per second of aggregate bandwidth, according to Nvidia. The sixth generation of NVLink delivers 3.6 terabytes per second per GPU, which lines up with the rack-level figure: 72 GPUs × 3.6 TB/s ≈ 259 TB/s.
What "full production" actually means
The emphasis on Vera Rubin being in "full production" raises some questions, though. According to Wired, it's unclear what Nvidia means by that. Typically, production of such complex chips starts at low volume while testing and validation are still underway.
Austin Lyons, an analyst at Creative Strategies, told Wired: “This CES announcement around Rubin is to tell investors, ‘We’re on track.’” Wall Street had been buzzing with rumors of delays to the Rubin GPU. In 2024, Nvidia had to delay shipments of Blackwell chips due to a design flaw that caused overheating in server racks.
Open-source push for autonomous driving
With the Alpamayo family, Nvidia is also bringing new models into the self-driving car space. The platform includes open-source AI models, simulation tools, and driving datasets. At its core is a vision-language-action model that uses a chain-of-thought approach: The vehicle works through complex scenarios step by step, similar to how a human driver thinks.
Alpamayo 1 with ten billion parameters is available on Hugging Face. The model is complemented by AlpaSim, an open-source simulation framework on GitHub, along with a dataset containing more than 1,700 hours of driving data.
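Since the weights are public, trying the model should follow the usual Hugging Face pattern. A minimal sketch, with the caveat that the repository id below is a guess and the actual name and interface are whatever Nvidia publishes on its Hugging Face page:

```python
from transformers import AutoModel, AutoProcessor

# Hypothetical repository id; check Nvidia's Hugging Face page for the
# actual name of the Alpamayo 1 release.
repo = "nvidia/Alpamayo-1"

# Vision-language-action models typically ship custom code, hence
# trust_remote_code; the exact input/output interface is model-specific.
processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
model = AutoModel.from_pretrained(repo, trust_remote_code=True)
```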
So-called "long-tail" scenarios - rare but critical driving situations - remain one of the biggest challenges for autonomous systems. Traditional architectures separate perception and planning, which can cause problems in unusual situations. Nvidia's approach aims to tackle this through its reasoning model.
The Alpamayo models aren't designed for direct deployment in vehicles. They serve as large "teacher models" that developers can fine-tune and distill for their own AV stacks.
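Distillation here means the standard teacher-student recipe: a smaller in-vehicle model is trained to match the teacher's softened output distribution. A minimal PyTorch sketch of that loss, after Hinton et al.; Nvidia's actual pipeline for Alpamayo is not public.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the student's and the teacher's softened
    output distributions; added to (or blended with) the task loss."""
    t = temperature
    soft_targets = F.softmax(teacher_logits / t, dim=-1)
    log_probs = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * (t * t)
```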
Mercedes-Benz brings the technology to the road
Mercedes-Benz will roll out Nvidia's DRIVE AV software in the new CLA in the United States in 2026. The system uses a dual-stack architecture: An AI end-to-end stack handles core driving functions, while a parallel classical safety stack based on Nvidia's Halos system provides redundancy.
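The point of the dual-stack design is that the learned planner never has the final word. The sketch below shows one common arbitration pattern under that assumption: the classical stack independently checks the AI stack's plan and supplies a fallback. It is purely illustrative; Nvidia has not published how Halos actually arbitrates.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    waypoints: list       # planned path for the next few seconds
    min_clearance: float  # closest predicted distance to any obstacle, meters

def select_trajectory(ai_plan: Trajectory, fallback: Trajectory,
                      safety_margin: float = 1.0) -> Trajectory:
    """Illustrative arbitration: accept the end-to-end AI plan only if it
    passes an independent safety check; otherwise hand control to the
    classical stack's fallback plan."""
    if ai_plan.min_clearance >= safety_margin:
        return ai_plan
    return fallback
```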
A journalist from The Verge got to test the system in a Mercedes CLA. During a 40-minute drive, the vehicle reportedly handled traffic lights, four-way stops, double-parked cars, and unprotected left turns. Ali Kani, Vice President of Automotive at Nvidia, compared the system to Tesla's Full Self-Driving. In head-to-head tests over long distances, the number of driver interventions was comparable, he said.
An interesting dynamic emerges from Nvidia's relationship with Tesla: The electric carmaker is one of Nvidia's biggest customers and uses tens of thousands of GPUs to train its AI models. Even if Tesla wins the autonomous driving race, Nvidia wins in a sense too, The Verge notes.
Ambitious roadmap through 2028
Nvidia's automotive division is still modest compared to the rest of the company. In the third quarter, it brought in $592 million in revenue out of a total of $51.2 billion - just 1.2 percent.
Still, Nvidia laid out a detailed timeline. Level 2 features for highway and city driving should be available in the first half of 2026. By the end of 2026, the L2++ system is expected to cover the entire United States. Nvidia also plans a "small" Level 4 trial in 2026, similar to Waymo's robotaxis. By 2028, Nvidia expects Level 4 technology in private vehicles and Level 3 highway driving.
Safety experts remain skeptical of Level 3 systems, though. At Level 3, drivers can take their hands off the wheel and eyes off the road, but who is responsible when accidents happen remains unclear.
DLSS 4.5 brings dynamic multi-frame generation
For gamers, Nvidia delivered the next major update to Deep Learning Super Sampling. DLSS 4.5 features an overhauled Transformer model for super resolution and a new 6x multi-frame generation mode developed exclusively for the RTX 50 series.
DLSS 4.5 aims to address a long-standing problem with temporal anti-aliasing and earlier super-resolution models. According to Nvidia, these techniques compressed brightness values to reduce flickering in game scenes. The downside: Extreme differences between bright and dark areas got washed out, leading to muted lighting, clipped details, and crushed shadows in high-contrast scenes.
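A small numerical example shows why that compression costs detail. The curve below uses Reinhard-style range compression as a stand-in, since Nvidia hasn't said which function earlier models used: two very different highlight intensities land almost on top of each other after compression.

```python
# Reinhard-style range compression as an illustrative stand-in for the
# brightness compression Nvidia describes (the exact curve isn't public).
def compress(luminance: float) -> float:
    return luminance / (1.0 + luminance)

print(compress(10.0))  # ~0.909
print(compress(50.0))  # ~0.980
# A 5x difference in highlight intensity shrinks to a few percent after
# compression, which is why highlight detail gets washed out when the
# model only ever sees compressed values.
```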
Nvidia says the new model was trained on the original brightness values from game engines and designed to work without this compression during use. Since the AI model is powerful enough to control flickering without brightness compression, glowing neon signs and bright reflections can retain their full color range and detail. Performance mode should now match or even exceed native image quality.
DLSS 4.5 also introduces Dynamic Multi-Frame Generation, which Nvidia describes as an "automatic transmission for the GPU." The system automatically switches between different frame multipliers - including the new 6x multi-frame generation mode - to match the frame rate to the monitor's refresh rate. In graphics-intensive scenes, it increases frame generation; when the load is lighter, it reduces the multiplier.
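In other words, the system acts as a feedback controller over the frame-generation multiplier. A minimal sketch of such a controller, assuming a mode set of 2x/3x/4x/6x and a simple "smallest multiplier that reaches the refresh rate" rule; Nvidia hasn't disclosed its actual heuristic.

```python
def pick_multiplier(render_fps: float, refresh_hz: float,
                    modes=(2, 3, 4, 6)) -> int:
    """Choose the smallest frame-generation multiplier whose output rate
    reaches the display's refresh rate, capped at the largest mode."""
    for m in modes:
        if render_fps * m >= refresh_hz:
            return m
    return modes[-1]

# Heavy scene: 40 rendered fps on a 240 Hz display -> 6x generation.
print(pick_multiplier(40, 240))   # 6
# Light scene: 120 rendered fps -> 2x is enough.
print(pick_multiplier(120, 240))  # 2
```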
To train the new second-generation Transformer model, Nvidia says it used five times the computing power of the original model. It was trained on a significantly expanded, high-resolution dataset and should enable a deeper understanding of game scenes.