While Splunk Enterprise 8.0 delivers manageability at scale, Splunk Data Fabric Search provides the ability to search enormous data sets, including ones outside Splunk in stores like AWS and Hadoop. Splunk Data Stream Processor will also speed up analysis by performing it on data in motion, before it reaches a Splunk index.
LAS VEGAS – Splunk’s platform enhancements at their .conf19 event here this week were, primarily, about delivery on past promises – but they still created considerable excitement. Two of the major announcements were the general availability of Splunk Data Fabric Search [DFS] and Splunk Data Stream Processor [DSP] – both of which were announced a year ago at .conf18 in Orlando. The general availability of the flagship SIEM Splunk Enterprise 8.0 was also announced.
Josh Klahr, Splunk’s VP, Product Management for Core Products, got to introduce all these new products in the product keynote on Wednesday. He emphasized that the big news with the new Splunk Enterprise 8.0 release is about scale.
“A big piece of 8.0 is manageability at scale,” Klahr said. “We are seeing larger and larger deployments, with hundreds of thousands of Splunk nodes. It has become increasingly hard for these customers to manage content, performance and workloads.”
This focus on scalability contrasted with the focus of last year’s 7.0 release.
“Much of 7.0 was analytics, adding support for metrics and its visualization,” Klahr noted. “I almost view it as a pendulum. Last year was mainly about metrics and their visualization, and Dark Mode was the big hit. But even as people applauded Dark Mode, our Splunk Trust [Splunk’s Community MVPs] meetings last year told us not to forget about manageability at scale.”
Klahr said that 8.0’s achievements in this area were considerable.
“It was a significant engineering challenge,” he said. “In a distributed systems workload, it involved splitting query workloads across multiple nodes, making sure the data is distributed evenly. Our DSP processor also handles a challenge on the ingest side, and we use Kubernetes on the back end to scale up and down resource consumption for indexing.”
Klahr noted that the 8.0 release also migrated from Python 2.7 to 3.x, and said that this was a significant shift.
“Python 2.x support ends in January, so it was necessary to upgrade,” he said. “End of support means that they would stop making security patches, and those are critical for us. There was also strong customer demand for Python 3.x.”
The Splunk Analytics Workspace, which provides a UI that allows monitoring and analysis of metrics without using SPL, gets an upgrade in 8.0.
“You can drag and drop from the Workspace into a dashboard,” Klahr said. “The new dashboards are fully customizable, and you can drag and drop onto them and do the work in a fraction of the time.”
The 5.0 version of the Machine Learning Toolkit in Splunk Enterprise has some significant enhancements too.
“I think the Machine Learning Toolkit is really the core of Splunk Enterprise, since analytics is at the heart of what Splunk provides,” Klahr said. “We have added new Smart Assistants to build workflows and detect anomalies faster. These also make machine learning more accessible to a broader set of Splunk users.”
Klahr noted that of the 100 or so customer talks at .conf19, over half related to some kind of machine learning.
“We are seeing a lot of adoption, mainly prediction and anomaly detection,” he said. “We have had support for knowledgeable customers building and applying machine learning models using SPL, but the actual process of data selection and model maintenance wasn’t provided before. It was a customer responsibility. Smart Assistants allow users to be handled through the process of building and maintaining the machine learning model.”
Klahr said this is really aimed at what today is called ‘citizen data scientists.’
“Our internal data scientist team builds industrial grade projects, and they scoff at this, but our customers aren’t doing complicated data science, and they do get insight out of this,” he stated.
Another addition is field-level access controls, which Klahr called this year’s Dark Mode: a feature that proved very popular with attendees.
“It provides access to roles with different security privileges, so that they access the index and only see the data they need,” he said. “You define a role, like Windows Admin Analyst, and now we dynamically attach a clause to show them only the conditions they need to see. It’s a data visibility control thing, and it’s something that we never had before.”
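The idea Klahr describes, a condition attached automatically to every search a role runs, can be illustrated with a minimal Python sketch. This is not Splunk's implementation or API; the role names, field names and the `run_search` helper are all hypothetical, chosen only to show the mechanism.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Predicate = Callable[[Event], bool]

# Hypothetical mapping from role name to the extra condition attached to
# every search that role runs. Role and field names are illustrative only.
ROLE_FILTERS: Dict[str, Predicate] = {
    "windows_admin_analyst": lambda e: e.get("os") == "windows",
    "admin": lambda e: True,  # unrestricted
}


def run_search(events: List[Event], query: Predicate, role: str) -> List[Event]:
    """Return events matching the user's query AND the role's attached condition."""
    role_filter = ROLE_FILTERS.get(role)
    if role_filter is None:
        return []  # unknown roles see nothing
    return [e for e in events if query(e) and role_filter(e)]


# Sample data, invented for the example.
EVENTS: List[Event] = [
    {"os": "windows", "level": "error", "msg": "service crashed"},
    {"os": "linux", "level": "error", "msg": "disk full"},
    {"os": "windows", "level": "info", "msg": "login ok"},
]


def is_error(event: Event) -> bool:
    """The query the user actually typed: errors only."""
    return event.get("level") == "error"
```

In this sketch the analyst and the admin run the same query, but the analyst only ever receives the Windows events, because the role's condition is applied on every search without the user adding it.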
New workload management capabilities create support for fine-grained rules and auto-remediation of runaway searches.
“You can also now kill queries by mobile phones,” Klahr said. “Killing bad queries is often done when they might block completion of other queries. So admins kill them or move them to a lower priority queue.”
Data Fabric Search, which was announced a year ago and is now available after a long beta, is incredibly important. It makes data analytics much easier by quickly bringing together the capability to get insights from massive datasets in different data stores, even if they are not Splunk based. DFS provides a single view of all this data for complete visibility across the entire organization.
“It changes the game of large volume search, with the capability to cover billions or tens of billions of events in a single search,” Klahr said. “AWS and Hadoop are the two main sources outside of Splunk where this will be used now.”
While DFS was announced last year, and the magnitude of its search power was emphasized, the ability to reach into these non-Splunk sources was not stressed.
“It was not emphasized strongly last year because the feature set was further away,” Klahr indicated. “Now, the data landscape of where customers want to put data is increasingly heterogeneous. S3 has become the new Data Lake. Customers want to put data into Splunk Enterprise to monitor the data, but they want to store the raw logs in S3. And they want to be able to merge that data together to look at old data to see trends. DFS will make that a seamless experience.”
The other product now available is Splunk Data Stream Processor, a real-time stream processing solution that continuously collects high-volume data from diverse sources, analyzes it, and makes it available in milliseconds.
“Splunk DSP provides instant access to data in motion, while it’s on the fly, before it gets to a Splunk index,” Klahr said. “Customers have been screaming that managing scalable data ingestion pipelines for Splunk is complicated. Doing analytics in a stream instead of in a Splunk index creates a scalable and easy to manage business processing tier. Customers see this as the future. The response after the General Session where this was announced was that the booth on the floor was swamped. There were lines down the hall. This will let customers run alerts faster, and in real time, which will let them find threats faster.”