IBM is integrating Groq's inference technology into its watsonx platform, aiming to deliver faster and more affordable AI for enterprise customers.
The partnership gives IBM clients access to GroqCloud through watsonx Orchestrate. Groq claims its proprietary Language Processing Unit (LPU) architecture can run inference workloads more than five times faster, and at lower cost, than traditional GPU-based systems.
IBM highlights potential use cases such as healthcare, where thousands of patient questions need to be processed simultaneously, and HR automation in retail. The companies also plan to combine Red Hat's open-source vLLM technology with Groq's LPU hardware, and IBM's Granite models will be supported on GroqCloud as well. IBM clients can access GroqCloud's capabilities immediately.
Founded in 2016, Groq says it now has over two million developers using its platform. The company positions itself as a GPU alternative and part of the "American AI Stack." The partnership aims to help customers scale AI agents from pilot projects to production, focusing on industries such as healthcare, finance, government, retail, and manufacturing, where speed, cost, and reliability are essential.