AI in practice
Matthias Bastian

AMD's software woes leave Nvidia unchallenged in AI chip market, study finds

Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.

A five-month investigation by SemiAnalysis reveals that AMD's new MI300X AI chips fall short of their potential due to major software problems, leaving Nvidia's market dominance unchallenged.

The research found that AMD's software is plagued with bugs that make training AI models nearly impossible without significant debugging. While AMD struggles with quality assurance and ease of use, Nvidia keeps widening the gap by rolling out new features, libraries, and performance updates.

The analysts ran extensive tests, including GEMM benchmarks and single-node training, only to find that AMD can't overcome what they call the "CUDA moat" - Nvidia's strong software advantage.
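For readers unfamiliar with the term, a GEMM benchmark times a general matrix multiplication and converts the result into achieved FLOPS, which can then be set against the chip's advertised peak. The following is only a minimal CPU sketch of that arithmetic in plain Python, not the SemiAnalysis test harness; the helper names (`gemm_flops`, `achieved_tflops`, `naive_gemm`, `best_time`) and the matrix size are assumptions chosen for illustration.

```python
import time


def gemm_flops(m, n, k):
    """FLOPs for C = A @ B with A (m x k) and B (k x n):
    one multiply and one add per inner-product term."""
    return 2 * m * n * k


def achieved_tflops(m, n, k, seconds):
    """Convert a measured GEMM runtime into TFLOPS."""
    return gemm_flops(m, n, k) / seconds / 1e12


def naive_gemm(a, b):
    """Reference triple-loop matrix multiply on lists of lists (slow, CPU only)."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)] for row in a]


def best_time(fn, repeats=3):
    """Run fn several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# Illustrative size only; real GPU benchmarks use far larger matrices.
size = 48
a = [[1.0] * size for _ in range(size)]
b = [[2.0] * size for _ in range(size)]
seconds = best_time(lambda: naive_gemm(a, b))
print(f"{size}x{size}x{size} GEMM: {achieved_tflops(size, size, size, seconds):.6f} TFLOPS")
```

On real accelerators the same calculation is applied to GPU kernels, and the gap between achieved and advertised TFLOPS is what such a benchmark is meant to expose.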

On paper, the MI300X looks impressive: it offers 1,307 TFLOPS of FP16 compute and 192 GB of HBM3 memory, compared with 989 TFLOPS and 80 GB for Nvidia's H100. Nvidia's newer H200 narrows the memory gap with 141 GB. AMD systems also promise a lower total cost of ownership, thanks to lower prices and cheaper Ethernet networking.
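To put the on-paper gap in proportion, the quoted figures can be turned into simple ratios. This is a small sketch using only the numbers cited above; the `SPECS` table and the `ratio` helper are illustrative names, and the values are vendor peak figures rather than measured performance.

```python
# Peak figures as quoted in the article (on-paper numbers, not measurements).
SPECS = {
    "MI300X": {"fp16_tflops": 1307, "memory_gb": 192},
    "H100": {"fp16_tflops": 989, "memory_gb": 80},
    "H200": {"memory_gb": 141},  # the article cites only the H200's memory size
}


def ratio(a, b, key):
    """Return how many times larger chip a's value is than chip b's, to two decimals."""
    return round(SPECS[a][key] / SPECS[b][key], 2)


print("FP16 compute, MI300X vs H100:", ratio("MI300X", "H100", "fp16_tflops"))  # 1.32
print("Memory, MI300X vs H100:", ratio("MI300X", "H100", "memory_gb"))  # 2.4
print("Memory, MI300X vs H200:", ratio("MI300X", "H200", "memory_gb"))  # 1.36
```

By these figures the MI300X has roughly a third more peak FP16 compute than the H100 and 2.4 times the memory, while its memory lead over the H200 shrinks to about 1.36 times.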

Hardware advantages overshadowed by software problems

However, these advantages mean little in practice. According to SemiAnalysis, comparing these specs is like "comparing cameras by merely examining megapixel count" - suggesting that AMD is merely playing a numbers game without delivering enough real-world performance.

The analysts had to work directly with AMD engineers to fix numerous bugs just to get usable benchmark results. In contrast, Nvidia's systems worked smoothly right out of the box.

"AMD's Out of the Box Experience is very difficult to work with and can require considerable patience and elbow grease to move towards a usable state," they write.

In a particularly telling detail, SemiAnalysis revealed that Tensorwave, the largest cloud provider of AMD GPUs, had to give AMD's own team free access to GPUs (the same hardware Tensorwave had purchased from AMD) just to fix software issues.

SemiAnalysis recommends that AMD CEO Lisa Su invest heavily in software development and testing. Specifically, they suggest allocating thousands of MI300X chips for automated testing - following Nvidia's approach - and simplifying the complex environment variables while implementing better default settings. "Make the out-of-the-box experience usable!" they write.

While SemiAnalysis wants to see AMD succeed as a competitor to Nvidia, they say "unfortunately, there is still much work to be done." Without major improvements to its software, AMD risks falling further behind as Nvidia prepares to launch its next-generation Blackwell chips, though reports suggest Nvidia's next-gen rollout isn't going entirely smoothly either.

Summary
  • AMD's new MI300X AI chips fall short of their potential because of software issues, including bugs and a difficult "out-of-the-box" experience that make AI model training nearly impossible without significant debugging, according to an analysis by SemiAnalysis.
  • Despite the MI300X's theoretical advantages in computing power, memory, and total cost of ownership, AMD is struggling to close the gap with Nvidia's CUDA platform, which continues to strengthen its lead with new features and updates.
  • SemiAnalysis recommends that AMD CEO Lisa Su urgently invest more resources in software development and testing, provide thousands of MI300X chips for automated testing, and improve the out-of-box experience to address these issues and compete effectively in the AI chip market.
Sources
SemiAnalysis