The Chinese AI company Deepseek reportedly delayed the launch of its newest AI model after a failed attempt to train it using chips from domestic tech giant Huawei.

According to the Financial Times, Chinese regulators encouraged Deepseek to switch from Nvidia's leading chips to Huawei's Ascend processors after the release of its R1 model in January. That plan ran into trouble. Deepseek faced persistent technical issues with the Ascend chips while training its R2 model, the FT reports. Even with Huawei engineers on-site, the team couldn't complete a successful training run.

These problems forced Deepseek to fall back on Nvidia chips for the compute-heavy training process. The delays pushed the model's launch back from its planned May date and gave competitors a head start. As a workaround, Deepseek now trains its models on Nvidia hardware and uses Huawei's Ascend chips only for less demanding inference tasks. Industry sources told the FT that Chinese chips still lag behind Nvidia in stability, connectivity, and software quality.

Deepseek V3.1 targets new Chinese hardware

Despite these setbacks, Deepseek has released an updated version of its V3 model. As The Register notes, the new V3.1 was trained using a special data type called UE8M0 FP8. In a WeChat post, Deepseek said this data type was designed for a next generation of domestically produced chips, which would be released soon.

This suggests that more powerful Chinese accelerators could be on the way. Huawei's current top chip, the Ascend 910C, doesn't natively support the FP8 data type. The move from the previously used E4M3 format seems less about efficiency and more about future hardware compatibility. V3.1 builds on an earlier V3 checkpoint, but adds a hybrid reasoning mode.
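
For readers unfamiliar with the format names, the following is a minimal sketch of what the two 8-bit encodings can represent. It assumes the bit layouts published in the Open Compute Project's microscaling (MX) and 8-bit floating point specifications: UE8M0 as an unsigned value with 8 exponent bits and no mantissa, and E4M3 as a sign bit, 4 exponent bits, and 3 mantissa bits. The function names are hypothetical, and the article does not describe how Deepseek actually applies the format in training.

```python
import math


def decode_ue8m0(byte: int) -> float:
    """Decode one UE8M0 byte (assumed OCP MX layout): a pure power of two.

    No sign bit, 8 exponent bits, 0 mantissa bits, exponent bias 127.
    The all-ones pattern 0xFF is reserved for NaN.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    if byte == 0xFF:
        return math.nan
    return math.ldexp(1.0, byte - 127)  # 2 ** (byte - 127)


def decode_e4m3(byte: int) -> float:
    """Decode one E4M3 byte (assumed OCP FP8 layout).

    1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 7.
    There are no infinities; exponent 1111 with mantissa 111 is NaN.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in the range 0..255")
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0xF and mantissa == 0x7:
        return math.nan
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed exponent of -6.
        return sign * math.ldexp(mantissa / 8.0, -6)
    return sign * math.ldexp(1.0 + mantissa / 8.0, exponent - 7)


if __name__ == "__main__":
    print(decode_ue8m0(127), decode_ue8m0(130))  # powers of two only
    print(decode_e4m3(0x38), decode_e4m3(0x7E))  # fractional steps, narrow range
```

Under these assumptions, a UE8M0 byte can only encode powers of two across a very wide range, while E4M3 has finer steps within a much narrower range. That difference is why an exponent-only format is typically used as a shared scale factor for a block of lower-precision values rather than for the individual values themselves.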

Ritwik Gupta, an AI researcher at the University of California, Berkeley, told the Financial Times that Huawei is likely experiencing some "growing pains" with its chips, but said it's only "a matter of time" before the company catches up.

Summary
  • Deepseek delayed the launch of its latest AI model after experiencing technical problems with Huawei’s Ascend chips, despite regulator encouragement to switch away from Nvidia hardware.
  • The company ultimately used Nvidia chips for training and shifted to Huawei’s Ascend processors only for inference, as industry sources said Chinese chips still lag behind Nvidia in stability, connectivity, and software quality.
  • Deepseek’s new V3.1 model was trained using a data type designed for upcoming domestic chips, signaling future hardware improvements, although Huawei’s current top chip does not yet support this format.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.