0:00:01.040,0:00:03.280 Hi there. My name is Greg Ainslie-Malik, 0:00:03.280,0:00:05.040 and I'd like to take you on a really 0:00:05.040,0:00:06.319 brief tour 0:00:06.319,0:00:08.320 through Splunk's Machine Learning 0:00:08.320,0:00:10.160 Toolkit. 0:00:10.160,0:00:14.240 Originally developed for what Gartner 0:00:14.240,0:00:17.279 termed citizen data scientists, 0:00:17.279,0:00:19.520 the Machine Learning Toolkit presents a 0:00:19.520,0:00:20.720 whole host of 0:00:20.720,0:00:24.240 features for customers, 0:00:24.240,0:00:26.800 mostly focused around assistants and 0:00:26.800,0:00:27.840 experiments 0:00:27.840,0:00:29.519 to help users who aren't familiar with 0:00:29.519,0:00:31.359 data science 0:00:31.359,0:00:34.000 train and test machine learning models 0:00:34.000,0:00:36.640 and deploy them into production. 0:00:36.640,0:00:38.879 And most of these assistants present as 0:00:38.879,0:00:41.600 kind of guided interfaces where you can 0:00:41.600,0:00:44.000 input some SPL, something that our users 0:00:44.000,0:00:46.000 are very familiar with, 0:00:46.000,0:00:47.760 select some algorithms, do some 0:00:47.760,0:00:49.200 pre-processing, 0:00:49.200,0:00:50.879 things that our users are less familiar 0:00:50.879,0:00:53.840 with, and then view a set of dashboards, a 0:00:53.840,0:00:56.000 set of reports that tell them about 0:00:56.000,0:00:59.840 their model's performance. 0:01:00.000,0:01:03.359 However, what we see from the telemetry 0:01:03.359,0:01:06.240 is that these experiments are generally 0:01:06.240,0:01:09.439 used as almost like pseudo-training to help 0:01:09.439,0:01:13.680 users familiarize themselves with MLTK, but of 0:01:13.680,0:01:15.840 the monthly active users, 0:01:15.840,0:01:19.680 actually more than 95% of them run 0:01:19.680,0:01:22.400 MLTK searches straight from the search 0:01:22.400,0:01:23.439 bar. 
0:01:23.439,0:01:25.840 So here you can see an example of that 0:01:25.840,0:01:27.600 where we're using the fit command 0:01:27.600,0:01:30.799 that ships with MLTK to apply an anomaly 0:01:30.799,0:01:32.880 detection search. 0:01:32.880,0:01:34.720 And you can see that this is actually 0:01:34.720,0:01:37.119 just two lines of SPL. 0:01:37.119,0:01:40.000 So for our NOC and SOC personas, those 0:01:40.000,0:01:41.439 who are very familiar to us 0:01:41.439,0:01:44.720 at Splunk, this is quite a simple thing 0:01:44.720,0:01:47.040 to do. 0:01:47.280,0:01:50.159 Now, while the search bar and the 0:01:50.159,0:01:52.399 experiments can help our users develop 0:01:52.399,0:01:53.520 and deploy 0:01:53.520,0:01:55.439 simple techniques like this for finding 0:01:55.439,0:01:58.399 anomalies or making predictions, 0:01:58.399,0:02:01.360 what we're starting to see is a trend 0:02:01.360,0:02:02.079 towards 0:02:02.079,0:02:04.479 use-case-focused workflows. Here we have 0:02:04.479,0:02:07.670 one for ITSI, 0:02:07.670,0:02:08.560 where 0:02:08.560,0:02:10.399 more complex techniques can be run 0:02:10.399,0:02:11.840 against data without 0:02:11.840,0:02:14.319 having to see the details of the ML 0:02:14.319,0:02:15.760 that's being applied at all. 0:02:15.760,0:02:17.840 So here we have a list of episodes, 0:02:17.840,0:02:20.239 incidents in ITSI. 
0:02:20.239,0:02:24.000 When I click on an incident, 0:02:24.000,0:02:26.160 a technique called causal inference gets 0:02:26.160,0:02:27.360 run in the background 0:02:27.360,0:02:29.360 to determine the root cause of that 0:02:29.360,0:02:31.040 incident, and you can see here a graph 0:02:31.040,0:02:33.040 structure that has mapped out 0:02:33.040,0:02:36.080 those root cause relationships, and up 0:02:36.080,0:02:38.080 here you can see a table where, 0:02:38.080,0:02:40.400 for the service that was impacted by the 0:02:40.400,0:02:41.200 incident, 0:02:41.200,0:02:43.200 here are all the KPIs that affected 0:02:43.200,0:02:45.120 it. And clicking into this, 0:02:45.120,0:02:48.319 we can quickly drill down and see what 0:02:48.319,0:02:50.720 the raw data looked like, 0:02:50.720,0:02:52.400 and I could draw the conclusion that 0:02:52.400,0:02:54.720 perhaps it was disk space used 0:02:54.720,0:02:57.120 that was the reason behind this incident 0:02:57.120,0:03:01.840 in this case.