IBM and Groq have announced a strategic go-to-market and technology partnership designed to give clients immediate access to Groq’s inference technology, GroqCloud, through watsonx Orchestrate. The partnership will provide clients with high-speed, cost-effective AI inference to help accelerate agentic AI deployment. As part of the agreement, Groq and IBM plan to integrate and enhance Red Hat’s open source vLLM technology with Groq’s LPU architecture. IBM Granite models are also planned to be supported on GroqCloud for IBM clients.
Enterprises moving AI agents from pilot to production still face challenges with speed, cost, and reliability, especially in mission-critical sectors like healthcare, finance, government, retail, and manufacturing. This partnership combines Groq’s inference speed, cost efficiency, and access to the latest open-source models with IBM’s agentic AI orchestration to deliver the infrastructure needed to help enterprises scale.
Groq pioneered the LPU (Language Processing Unit) in 2016, the first chip purpose-built for AI inference, and every design choice focuses on keeping intelligence fast and affordable. While GPU architectures are optimized for training workloads, the LPU, as hardware purpose-built for inference, preserves quality while eliminating the architectural bottlenecks that create latency in the first place.
Groq’s custom LPU was built to run Large Language Models (LLMs) and other AI models faster. As a result, GroqCloud delivers inference that is more than five times faster and more cost-efficient than traditional GPU systems. The result is consistently low latency and dependable performance, even as workloads scale globally. This is especially powerful for agentic AI in regulated industries.
For example, IBM’s healthcare clients receive thousands of complex patient questions simultaneously. With Groq, IBM’s AI agents can analyze information in real-time and deliver accurate answers immediately to enhance customer experiences and allow organizations to make faster, smarter decisions.
This technology is also being applied in non-regulated industries. IBM clients across retail and consumer packaged goods are using Groq for HR agents to help enhance automation of HR processes and increase employee productivity.
“Many large enterprise organizations have a range of options with AI inferencing when they’re experimenting, but when they want to go into production, they must ensure complex workflows can be deployed successfully to ensure high-quality experiences,” said Rob Thomas, SVP, Software and Chief Commercial Officer at IBM. “Our partnership with Groq underscores IBM’s commitment to providing clients with the most advanced technologies to achieve AI deployment and drive business value.”
Thomas stressed that Groq’s technology was exactly what clients were looking for.
“We looked at every possibility in the market, and clients were looking for significant performance,” he said. “There’s something that changes how your call centre operates or how your supply chain runs, and then you combine that with a fraction of the cost and suddenly the economics make sense. AI does have a cost problem, and we think this breaks through that. And we think the combination of IBM and Groq can make this a reality for any company.”
“With Groq’s speed and IBM’s enterprise expertise, we’re making agentic AI real for business,” said Jonathan Ross, CEO & Founder at Groq. “Together, we’re enabling organizations to unlock the full potential of AI-driven responses with the performance needed to scale. Beyond speed and resilience, this partnership is about transforming how enterprises work with AI, moving from experimentation to enterprise-wide adoption with confidence, and opening the door to new patterns where AI can act instantly and learn continuously.”
IBM will offer access to GroqCloud’s capabilities starting immediately, and the joint teams will focus on delivering the following capabilities to IBM clients. These include high-speed, high-performance inference that unlocks the full potential of AI models and agentic AI, powering use cases such as customer care, employee support, and productivity enhancement. In addition, security- and privacy-focused AI deployment is designed to support the most stringent regulatory and security requirements, enabling effective execution of complex workflows. Finally, seamless integration with IBM’s agentic product, watsonx Orchestrate, provides clients with the flexibility to adopt purpose-built agentic patterns tailored to diverse use cases.
The partnership also plans to integrate and enhance Red Hat’s open source vLLM technology with Groq’s LPU architecture to offer different approaches to common AI challenges developers face during inference. The solution is expected to enable watsonx to use vLLM’s capabilities in a familiar way and let customers stay in their preferred tools while accelerating inference with GroqCloud. This integration will address key AI developer needs, including inference orchestration, load balancing, and hardware acceleration, ultimately streamlining the inference process.
