VMware Private AI Foundation with NVIDIA is built on VMware Cloud Foundation, optimized for AI, and will let enterprises run fully secure Generative AI applications.
LAS VEGAS – On Tuesday, the big product announcements from the VMware Explore event were unfurled here, with virtually all of them revolving around two key themes: multi-cloud and Generative AI. The highlight of the Generative AI news was the announcement of VMware Private AI Foundation, the latest fruit of the partnership between VMware and NVIDIA, and the product of four years of joint research between the two companies.
“We are excited to announce VMware Private AI Foundation with NVIDIA, which is a giant offering which NVIDIA and VMware have put together,” said Raghu Raghuram, VMware’s CEO, during the event’s opening keynote. “Private AI is an architectural approach which balances business gains from AI with privacy and compliance needs.”
The joint offering is built on VMware Cloud Foundation and optimized for AI. It will let enterprises customize models and run generative AI applications, including intelligent chatbots, assistants, search and summarization.
“With Private AI Foundation, enterprises will be able to do private AI in multi-cloud environments and know that it is fully secure,” said Jensen Huang, NVIDIA’s CEO, who joined Raghuram live onstage for the announcement. “This ability to train and fine-tune models will lead to the reinvention of the future of enterprise computing. It should be easy peasy.”
“This is the most transformational event of our lifetimes,” stated Justin Boitano, VP of Enterprise AI at NVIDIA. “We see AI being infused into every firm over the next decade.”
Paul Turner, Vice President of Product Management for vSphere at VMware, stressed that Private AI overcomes several issues that have slowed the growth of this technology in the past.
“There have been several issues to overcome,” he said. “First, how do we manage data so that you have confidence in it? Second, there are cost and complexity issues, which we address by building from open base models and focusing on fine-tuning. And the third has been performance.” Private AI Foundation runs on NVIDIA accelerated infrastructure to provide performance equal to, and in some use cases even exceeding, bare metal.
“Generative AI is a new class of workloads which is unique in some ways, such as the scale-out workloads,” said Kit Colbert, VMware’s CTO. “It may need quite a bit of hardware and compute. We see that as a huge opportunity for us to increase the footprint we have while improving customers’ communication.”
Colbert also stressed that with this, VMware is focused on avoiding broad industry mistakes around cloud in the past.
“AI is another type of workload that we have to support in a multi-cloud environment,” he said. “How do we architect them and how do we learn from the past. Initially, with Cloud First, cloud architectures were single cloud. This led to Cloud Chaos. This time, we need to architect things smarter from the beginning. We need to skip the Cloud Chaos phase and go straight to Cloud Smart.”
“VMware Private AI Foundation with NVIDIA is a full-stack platform for private Generative AI with toolkits,” Turner emphasized.
“Private AI is critical to protect your data, and fine tuning a large language model will run most efficiently through NVIDIA NeMo and Triton,” Boitano added.
Enterprises will also have a wide choice about where to build and run their models, from the NVIDIA NeMo cloud-native enterprise services toolkit to Meta’s Llama 2 model, as well as leading OEM hardware configurations and, in the future, public cloud and service provider offerings.
“AI in the past has been very use case specific, where you had to build a model for each use case,” Raghuram said, noting that this drove up costs exponentially. “With Generative AI and its foundational Large Language Models, they have general applicability, not just to a specific use case.”
Out of the gate, VMware Private AI Foundation with NVIDIA will be supported by Dell Technologies, Hewlett Packard Enterprise and Lenovo — which will be among the first to offer systems that supercharge enterprise LLM customization and inference workloads with NVIDIA L40S GPUs, NVIDIA BlueField-3 DPUs and NVIDIA ConnectX-7 SmartNICs. But these are just the first, with many more offerings from many more OEMs coming on their heels.
“We are announcing a broad ecosystem of OEMs, and we expect over 100 servers by the end of the year, running on over 20 OEMs,” Boitano said.
Pricing will be GPU-based. Reference architectures are immediately available, with availability of the full integrated product scheduled for early 2024. It will be available through the MSP program as well.