Summary

As Cerebras, Inflection, Amazon, and others bring new hardware online, this next generation of AI supercomputers will be available to train new generative AI models.

The use of large language models and other types of generative AI is growing rapidly. With ChatGPT, Bard, Copilot, and others, there are now chatbots in nearly every digital ecosystem, with models like GPT-4 still leading the race. But the next generation of generative AI models is on the horizon. To train these complex AI systems, technology companies are building a new generation of AI supercomputers.

Cerebras unveils 2 Exaflop supercomputer

An example of this new generation comes from Cerebras, which just unveiled its Condor Galaxy 1 system, capable of 2 Exaflops. It was assembled and put into operation in just 10 days and consists of 32 Cerebras CS-2 computers. According to the company, Condor Galaxy is on track to double in size within the next 12 weeks. It is owned by G42, an Abu Dhabi-based holding company that owns nine AI-based companies, including G42 Cloud, one of the largest cloud computing providers in the Middle East.

Over the next 18 months, Cerebras plans to install additional systems to reach a total of 36 Exaflops across nine installations, which would make the combined network one of the largest supercomputing clusters focused on AI workloads.
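Taken together, the quoted figures imply a per-system and per-installation throughput that is easy to check. A quick back-of-the-envelope sketch using only the numbers stated above (these are the article's figures, not official Cerebras specifications):

```python
# Sanity-check the Cerebras throughput figures quoted in the article.
EXAFLOP = 1e18  # FLOP/s

cg1_flops = 2 * EXAFLOP        # Condor Galaxy 1 as unveiled
cs2_count = 32                 # CS-2 systems in the initial build
per_cs2 = cg1_flops / cs2_count
print(f"Implied throughput per CS-2: {per_cs2 / 1e15:.1f} petaFLOPs")

planned_total = 36 * EXAFLOP   # stated 18-month target
installations = 9
per_site = planned_total / installations
print(f"Implied throughput per installation: {per_site / EXAFLOP:.0f} Exaflops")
```

This works out to roughly 62.5 petaFLOPs per CS-2 and 4 Exaflops per planned installation, i.e. each future site would be twice the size of the initial Condor Galaxy 1 deployment.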


Each CS-2 is powered by the Wafer Scale Engine 2 (WSE-2), an AI-specific processor built from a single wafer of silicon, with 2.6 trillion transistors and 850,000 AI cores.

Cerebras CEO Andrew Feldman noted that "the number of companies training neural network models with 50 billion or more parameters went from 2 in 2021 to more than 100 this year."

Inflection AI's 22,000 GPU supercomputer

Like many others, startup Inflection AI is betting on Nvidia, building an AI supercomputer powered by 22,000 Nvidia H100 GPUs. With the supply of AI chips limited, Inflection's position as an Nvidia investment target likely helped it secure such a large GPU allocation. The startup recently raised $1.3 billion in funding from Microsoft, Nvidia, and others.

Inflection recently introduced its first proprietary language model, Inflection-1, which it says is on par with GPT-3.5, Chinchilla, and PaLM-540B, and which powers its personal AI assistant, Pi.

Inflection AI focuses on natural language generation and understanding. People should be able to talk to computers and give them complex tasks without having to write code, says CEO Mustafa Suleyman, a co-founder of DeepMind. He expects a breakthrough in conversational AI within the next five years, which should open up "a whole new suite of things that we can do in the product space."


Cloud providers offer access to Nvidia H100 GPUs

Aside from Inflection, every major cloud platform has introduced new instances powered by Nvidia's H100 GPUs. Just yesterday, AWS launched its P5 instances, which come with eight H100 GPUs each and can be scaled in EC2 UltraClusters to up to 20,000 H100 GPUs. Similar hardware is available through Microsoft Azure, Google Cloud, and CoreWeave.

With easy access to resources at this scale, developers can quickly prototype generative AI applications, since the H100's Transformer Engine can greatly reduce training times. In the MLPerf benchmark, CoreWeave and Nvidia recently trained a GPT-3 model with 175 billion parameters on about 3,500 GPUs in less than 11 minutes.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
  • The next generation of AI supercomputers is being developed by companies like Cerebras and Inflection AI to train increasingly complex generative AI models, with Cerebras unveiling its 2 Exaflops Condor Galaxy 1 system.
  • Inflection AI is building an AI supercomputer powered by 22,000 Nvidia H100 GPUs focused on natural language generation and understanding, with its CEO predicting a breakthrough in conversational AI within the next five years.
  • Major cloud platforms have launched new instances powered by Nvidia H100 GPUs, enabling developers to quickly prototype generative AI applications and reduce training times for large language models.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.