Summary

Huawei is advancing work on a new AI processor and a large-scale cloud system, aiming to challenge Nvidia’s dominance in AI infrastructure. The company has achieved a technical milestone despite significant restrictions imposed by US sanctions.

Huawei Technologies is currently testing a new AI processor, the Ascend 910D, which is intended to replace Nvidia’s more powerful products in the future. According to the Wall Street Journal, Huawei expects to receive the first samples of the 910D in May 2025. The chip is designed to outperform Nvidia’s H100, which has been the industry standard for AI training since 2022 and is now being replaced in Western markets by successors from the Blackwell generation. The most powerful Nvidia chips are no longer permitted for sale in China due to export restrictions.

However, compared with Nvidia’s H100, the Ascend 910D is less energy efficient and has significantly higher power consumption. Huawei is employing new packaging technologies to connect multiple silicon dies and increase performance, the Wall Street Journal reports. Development remains at an early stage, and comprehensive testing will determine when the chip is ready for the market.

CloudMatrix 384: Huawei’s response to Nvidia’s rack-scale systems

In parallel, Huawei has introduced a new rack-scale system called CloudMatrix 384, which is still based on the earlier Ascend 910C chip. The system interconnects 384 of these chips. According to SemiAnalysis, CloudMatrix 384 achieves approximately 300 PFLOPs of BF16 compute performance, nearly double that of Nvidia’s GB200 NVL72 system. Nvidia recently introduced its successor, the GB300 NVL72.

Huawei’s system also leads in memory, with 3.6 times the aggregate capacity and 2.1 times the memory bandwidth of Nvidia’s offering. However, CloudMatrix 384 draws about 4.1 times as much power as Nvidia’s comparable system. SemiAnalysis notes that it also consumes about 2.5 times as much energy per FLOP.
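
The power and efficiency figures above imply a compute ratio that can be checked with simple arithmetic. The sketch below assumes that both ratios refer to the same workload and precision, and that energy per FLOP equals power draw divided by throughput. The variable and function names are illustrative and do not come from SemiAnalysis.

```python
# Ratios quoted in the article: CloudMatrix 384 relative to Nvidia's GB200 NVL72.
POWER_RATIO = 4.1            # system power draw
ENERGY_PER_FLOP_RATIO = 2.5  # energy spent per FLOP (inverse of efficiency)


def implied_compute_ratio(power_ratio: float, energy_per_flop_ratio: float) -> float:
    """Return the throughput ratio implied by the two quoted ratios.

    Assumption: energy per FLOP = power / throughput, so
    throughput ratio = power ratio / energy-per-FLOP ratio.
    """
    return power_ratio / energy_per_flop_ratio


compute_ratio = implied_compute_ratio(POWER_RATIO, ENERGY_PER_FLOP_RATIO)
print(f"Implied compute ratio: {compute_ratio:.2f}x")
```

Under these assumptions the implied compute ratio is about 1.64, which is closer to the 70 percent advantage attributed to SemiAnalysis than to a full doubling.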

A central feature of the CloudMatrix 384 is its fully optical interconnect: Huawei has eliminated copper cables entirely, instead using 6,912 400G transceivers, according to SemiAnalysis. Each of the 384 chips uses seven transceivers for internal scaling. SemiAnalysis notes that the architecture resembles concepts Nvidia previously abandoned due to cost. Analysts consider Huawei to be “one generation” ahead of Nvidia and AMD in this respect.
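
The two transceiver figures can be related with a quick calculation. This is a sketch under the assumption that the seven transceivers per chip are counted within the 6,912 total. Where the remaining transceivers sit (for example on the switch side of the fabric) is not stated in the article and is left open here.

```python
# Figures quoted in the article, attributed to SemiAnalysis.
TOTAL_TRANSCEIVERS = 6912   # 400G optical transceivers in one CloudMatrix 384
CHIPS = 384                 # Ascend 910C chips per system
PER_CHIP_SCALE_UP = 7       # transceivers each chip uses for internal scaling

# Transceivers directly attached to the chips.
chip_side = CHIPS * PER_CHIP_SCALE_UP

# Remainder that must sit elsewhere in the system (location is an assumption).
remaining = TOTAL_TRANSCEIVERS - chip_side

# Total transceiver count averaged over all chips.
per_chip_total = TOTAL_TRANSCEIVERS / CHIPS

print(f"Chip-side transceivers: {chip_side}")
print(f"Remaining transceivers: {remaining}")
print(f"Transceivers per chip, system-wide: {per_chip_total:.0f}")
```

The chip-side count comes to 2,688, leaving 4,224 transceivers elsewhere in the system, or 18 transceivers per chip when averaged across the whole system.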

Ongoing reliance on foreign suppliers despite sanctions

Despite sweeping US sanctions, Huawei remains dependent on foreign suppliers for chip manufacturing. SemiAnalysis reports that the previously shipped Ascend 910B and 910C chips were produced by TSMC in Taiwan. Huawei is said to have arranged for these chips through intermediary firms such as Sophgo. TSMC could face a penalty of up to $1 billion as a result.

Huawei also relied on foreign sources for high bandwidth memory (HBM). According to SemiAnalysis, the company acquired large volumes of HBM stacks from Samsung, stockpiling around 13 million units. Despite export controls, HBM reached China via intermediaries such as Faraday and CoAsia.

China’s largest chip manufacturer, SMIC, has expanded its 7-nanometer production capacity, according to SemiAnalysis, but continues to lag behind leading manufacturers in both yield and technology. This capacity could grow in the medium term, provided export controls are not tightened further.

Prioritizing system-level optimization over single-chip performance

Given these structural limitations, Huawei is focusing on system-level optimization rather than maximizing individual chip performance. The company is building large, interconnected systems to achieve scale. SemiAnalysis observes that CloudMatrix 384 leverages China’s nearly unlimited power supply to provide competitive AI infrastructure, despite high energy consumption.

According to SemiAnalysis, the system delivers 70 percent more FLOPS than Nvidia’s current rack, even though its energy efficiency is substantially lower. In China, the additional energy demand is considered acceptable in light of the political priority attached to technological independence.

In 2023, Huawei had already demonstrated its continued ability to deploy high-performance chips despite US sanctions with the Mate 60 smartphone, which used a processor manufactured in China.

Summary
  • Huawei is currently testing the new Ascend 910D AI processor, which is said to be more powerful than Nvidia's H100, but consumes significantly more power and is still in the early stages of development.
  • With its CloudMatrix 384 rack-scale system, Huawei optically interconnects 384 Ascend 910C chips to deliver nearly twice the computing power of Nvidia's current NVL72 system - but with 4.1 times the power consumption.
  • Despite U.S. sanctions, Huawei remains dependent on foreign manufacturing and memory supplies, but is increasingly turning to large-scale systems to remain competitive through sheer size and available power.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.