
Nvidia unveiled its new H200 AI accelerator at the SC23 supercomputing conference. Thanks to its faster memory, it is said to almost double the inference speed of AI models.

Nvidia's H200 is the first GPU to use the fast HBM3e memory. Compared with its predecessor, the H100, it offers 141 gigabytes of HBM3e instead of 80 gigabytes of HBM3, and a memory bandwidth of 4.8 terabytes per second instead of 3.35. Compared with the older A100, these figures are around 2 to 2.5 times higher.
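As a rough sanity check on that last comparison (the A100's published figures of 80 gigabytes and roughly 2 terabytes per second are assumptions here, not numbers from Nvidia's announcement):

\[
\frac{4.8\ \text{TB/s}}{2.0\ \text{TB/s}} \approx 2.4\times \ \text{(bandwidth)}, \qquad \frac{141\ \text{GB}}{80\ \text{GB}} \approx 1.8\times \ \text{(capacity)}
\]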

The faster memory and other optimizations should be particularly noticeable in inference compared to the H100: according to Nvidia, the 70-billion-parameter variant of Meta's Llama 2, for example, runs almost twice as fast on the H200. The GPU is also said to be better suited to scientific HPC applications, and the company expects further improvements from software optimizations.
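To see why memory bandwidth matters so much for inference, here is a minimal back-of-the-envelope sketch, assuming FP16 weights and batch size 1 (the figures and method are illustrative, not from Nvidia's announcement): at batch size 1, each generated token requires streaming all model weights from memory once, so bandwidth caps the achievable tokens per second.

```python
# Back-of-the-envelope estimate of bandwidth-bound decoding speed.
# Assumption: batch size 1, FP16 weights, every token streams all
# weights from memory once (illustrative only, not Nvidia's method).

PARAMS = 70e9          # Llama 2 70B parameter count
BYTES_PER_PARAM = 2    # FP16 = 2 bytes per weight

weight_bytes = PARAMS * BYTES_PER_PARAM  # ~140 GB of weights

for name, bw_tb_per_s in [("H100", 3.35), ("H200", 4.8)]:
    # Upper bound: memory bandwidth divided by bytes read per token
    tokens_per_s = bw_tb_per_s * 1e12 / weight_bytes
    print(f"{name}: ~{tokens_per_s:.0f} tokens/s upper bound")
```

The bandwidth ratio alone works out to about 1.4x (4.8 / 3.35), so it cannot explain the claimed near-doubling by itself, which fits with Nvidia crediting other optimizations for the rest.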

Systems and cloud instances with the H200 are expected to be available from the second quarter of 2024, including HGX H200 systems. The H200 can be deployed in a range of data center environments: on-premises, cloud, hybrid cloud, and edge.


Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure will be among the first cloud service providers to offer H200-based instances starting next year.

Nvidia's H200 is the GPU in the GH200 with 144 gigabytes of HBM3e

According to Nvidia, the H200 GPU will also be available in a 144-gigabyte version of the GH200 Grace Hopper super chip starting in 2024. The GH200 connects Nvidia's GPUs directly to its Grace CPUs. In the recently released MLPerf benchmark version 3.1, the current GH200 variant, with lower bandwidth and 96 gigabytes of HBM3 memory, showed a speed advantage of almost 17 percent over the H100 when training AI models.

This version will be replaced by the HBM3e version in 2024. GH200 chips will be used in more than 40 supercomputers worldwide, including the Jülich Supercomputing Centre (JSC) in Germany and the Joint Center for Advanced High Performance Computing in Japan, the company said.

JUPITER supercomputer will use nearly 24,000 GH200 chips

JSC will also operate the JUPITER supercomputer, which is based on the GH200 architecture and is designed to accelerate AI models in areas such as climate and weather research, materials science, pharmaceutical research, industrial engineering, and quantum computing.

JUPITER is the first system to use a quad configuration of the Nvidia GH200 Grace Hopper super chip, with four chips per node.

In total, nearly 24,000 GH200 chips will be installed, making JUPITER the fastest AI supercomputer in the world. JUPITER is due to be installed in 2024 and is one of the supercomputers being built as part of the EuroHPC Joint Undertaking.

There was also news at SC23 about Nvidia's recently unveiled Eos supercomputer. German chemical company BASF plans to use Eos to run 50-qubit simulations on Nvidia's CUDA Quantum platform.

The aim is to study the properties of the compound NTA (nitrilotriacetic acid), which is used to remove toxic metals from urban wastewater.

Summary
  • Nvidia has unveiled its new H200 AI accelerator, designed to nearly double AI model inference speed thanks to faster HBM3e memory.
  • Systems and cloud instances with the H200 are expected to be available starting in Q2 2024, including HGX H200 systems, which can be deployed in various data center environments.
  • The H200 will also be incorporated into the GH200 super chip starting in 2024, which will also be used by the new JUPITER supercomputer in Germany.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.