[00:00:01] Hi there. My name is Greg Ainslie-Malik, and I'd like to take you on a really brief tour through Splunk's Machine Learning Toolkit.

[00:00:10] Originally developed for what Gartner termed "citizen data scientists," the Machine Learning Toolkit presents a whole host of features for customers, mostly focused around Assistants and Experiments, to help users who aren't familiar with data science train and test machine learning models and deploy them into production. Most of these Assistants present as guided interfaces where you can input some SPL (something our users are very familiar with), select some algorithms and do some pre-processing (things our users are less familiar with), and then view a set of dashboards and reports that tell them about their model's performance.
[00:01:00] However, what we see from the telemetry is that these Experiments are generally used almost as pseudo-training to help users familiarize themselves with MLTK; of the monthly active users, actually more than 95% run MLTK searches straight from the search bar.

[00:01:23] So here you can see an example of that, where we're using the fit command that ships with MLTK to apply an anomaly detection search. And you can see that this is actually just two lines of SPL. So for our NOC and SOC personas, who are very familiar to us at Splunk, this is quite a simple thing to do.

[00:01:47] Now, while the search bar and the Experiments can help our users develop and deploy simple techniques like this for finding anomalies or making predictions, what we're starting to see is a trend towards use-case-focused workflows. Here we have one for ITSI, where more complex techniques can be run against data without having to see the details of the ML that's being applied at all.
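The two lines of SPL shown on screen aren't captured in the transcript, so as a minimal sketch, a fit-based anomaly detection search of that shape might look like the following (the index, field names, time span, and DensityFunction threshold are illustrative assumptions, not taken from the video):

```spl
| tstats count AS event_count where index=web_logs by _time span=10m
| fit DensityFunction event_count threshold=0.01 into my_anomaly_model
```

Here the first line builds a per-interval event count, and the MLTK fit command fits a probability density to it, flagging low-probability values in an IsOutlier column; `into` saves the trained model so it can later be reused with the apply command.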
[00:02:15] So here we have a list of episodes, that is, incidents in ITSI. When I click on an incident, a technique called causal inference gets run in the background to determine the root cause of that incident, and you can see here a graph structure that has mapped out those root cause relationships. Up here you can see a table showing, for the service that was impacted by the incident, all the KPIs that were affected. Clicking into this, we can quickly drill down and see what the raw data looked like, and I could draw the conclusion that, in this case, it was perhaps disk space used that was the reason behind the incident.