
IBM brings Groq's ultra-fast AI inference to watsonx platform

Image: Sora, prompted by THE DECODER

IBM is integrating Groq's inference technology into its watsonx platform, aiming to deliver faster and more affordable AI for enterprise customers.

The partnership gives IBM clients access to GroqCloud through watsonx Orchestrate. Groq claims its proprietary Language Processing Unit (LPU) architecture can process workloads over five times faster and more cost-efficiently than traditional GPU-based systems.

IBM highlights potential use cases like healthcare, where thousands of patient questions need to be processed simultaneously, and HR automation in retail. The companies also plan to combine Red Hat's open-source vLLM technology with Groq's LPU hardware, and IBM's Granite models will be supported on GroqCloud as well. IBM clients can access GroqCloud's capabilities immediately.

Founded in 2016, Groq says it now has over two million developers using its platform. The company positions itself as a GPU alternative and part of the "American AI Stack." The partnership aims to help customers scale AI agents from pilot projects to production, with a focus on industries like healthcare, finance, government, retail, and manufacturing, where speed, cost, and reliability are essential.

