Rancher Labs puts support for over one million Kubernetes clusters on roadmap

With Rancher 2.4, the company is scaling up from managing a few dozen clusters in the very recent past to 2,000 today, with a million clusters or more planned as IoT edge use cases ramp up demand.

Sheng Liang, Rancher Labs’ CEO

Today Rancher Labs is announcing the general availability of the 2.4 release of its Kubernetes management platform. The release has four central themes. The one likely to get the most attention is a large jump in scalability: from managing a few dozen clusters, to 2,000 now, with a million or more on the roadmap. The other enhancements are Zero Downtime Maintenance, CIS scanning for improved security, and the availability of Rancher as a hosted option for customers on its 24/7 support tier.

The 2.4 announcement comes on the heels of Rancher’s Series D funding announcement two weeks back, which indicates their strong momentum within the Kubernetes management space.

“We raised $40 million in our Series D round, bringing the total funding to date to $95 million,” said Peter Smails, Rancher Labs’ CMO.

The biggest of the four news items in the 2.4 release is support for far greater scale in the Rancher platform than has been needed in the past, a change driven by edge use cases.

“We see the future as many more clusters running in many places,” Smails said. “Customers are increasingly moving to multi-cloud, and there is a lot of interest around the edge and K3s. It’s still early, and a lot of this is not in production yet. Potentially, however, it means managing millions of clusters. Rancher 2.4 puts in the architectural foundations for managing this kind of scale, and establishes a path to one million clusters and beyond.”

Today, with 2.4, the numbers are still far short of that. The product will now support 2,000 clusters and 100,000 nodes. That’s still a huge jump over the requirements of the very recent past.

“In our 2.0 release, no one was doing much multi-cluster management, and when we started, we supported just a few clusters,” said Sheng Liang, Rancher’s CEO. “Dozens has been the norm. We haven’t seen an interest beyond a couple hundred clusters until very recently. Even today, 2,000 will be more for big users, like telco, retail or automotive.”

It’s the IoT use cases that will warrant the million-cluster support going forward, Liang noted.

“It’s all use case specific,” he said. “Kubernetes for branch offices likely won’t go beyond 10,000 clusters, because few organizations have that many branches. The million cluster support is for edge and IoT scenarios. IoT gateways like surveillance cameras and ATM machines – that’s what’s driving the 1 million number.”

Those IoT environments are potentially problematic because they will have limited connectivity for maintenance.

Peter Smails, CMO at Rancher Labs

“Those environments will have limited connectivity online,” Smails said. “We are announcing limited connectivity maintenance with K3s, where the K3s clusters themselves can manage with limited or no connectivity. Limited connectivity speaks to limited ability to do upgrades, but here it is managed on the local K3s clusters.”

Rancher 2.4 also adds Zero Downtime Maintenance for RKE, providing a non-disruptive way to upgrade Kubernetes clusters and nodes without application interruption.

“This was quite a technical feat for us,” Smails said. “It lets organizations maintain their underlying Kubernetes infrastructure by managing and updating the underlying Kubernetes software without maintenance downtime.”

“This is like an HA [High Availability] feature,” Liang said. “Historically, Kubernetes was designed for cloud native applications which were resilient, but people now are moving more mission-critical apps which require more robust infrastructure. This requires a reliable cluster upgrade strategy, and with this, when people maintain their clusters, the app does not get impacted at all.”

The third new bucket is enhanced security with CIS Benchmark Scanning, to address continued security concerns around Kubernetes containers.

CIS Scan Report in Rancher 2.4

“We addressed this issue in Rancher 2.3 with the introduction of cluster templates,” Smails said. “Rancher 2.4 adds the ability to run CIS scans, which are ad hoc security scans of their RKE clusters against over 100 CIS benchmarks from the Center for Internet Security.” The feature also allows custom test configurations, and it generates reports that show corrective actions for any checks that fail, to ensure the clusters meet all security requirements.

Finally, for customers who don’t want to deal with managing infrastructure, Hosted Rancher is now available as an option. Each customer gets a dedicated AWS instance of a Rancher Server management control plane, with a 99.9 percent SLA.

“Support tickets for many customers who run on-prem are related to environmental issues, rather than Rancher, and hosting it on the cloud for a customer minimizes that,” Smails said.

Customers on Platinum, Rancher’s highest support tier with 24/7 support, are eligible for the hosted option. Liang indicated that they make up about half of Rancher’s customer base.

“Our lower-grade support today is not 24/7, and if we made those customers eligible for hosting it would be de facto 24/7 support, rather than business hours,” he said. “The lower-grade option is for companies that are more cash-strapped or where Kubernetes is not in mission-critical use.”