Hi there. My name is Greg Ainslie-Malik, and I'd like to take you on a really brief tour through Splunk's Machine Learning Toolkit.

Originally developed for what Gartner termed citizen data scientists, the Machine Learning Toolkit presents a whole host of features for customers, mostly focused around assistants and experiments, to help users who aren't familiar with data science train and test machine learning models and deploy them into production. Most of these assistants present as guided interfaces where you can input some SPL, something our users are very familiar with, select algorithms and do some pre-processing, things our users are less familiar with, and then view a set of dashboards and reports that tell them about their model's performance.

However, what we see from the telemetry is that these experiments are generally used almost like pseudo training to help users familiarize themselves with MLTK, but of the monthly active users, actually more than 95% of them run MLTK searches straight from the search bar.

Here you can see an example of that, where we're using the fit command that ships with MLTK to apply an anomaly detection search. You can see that this is actually just two lines of SPL, so for our NOC and SOC personas, those who are very familiar to us at Splunk, this is quite a simple thing to do.
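For reference, a two-line anomaly detection search of this kind might look something like the sketch below; the index, time span, and threshold are placeholder assumptions rather than the exact search shown on screen, and DensityFunction is used here as one of the anomaly detection algorithms that ships with MLTK:

    index=_internal sourcetype=splunkd | timechart span=10m count
    | fit DensityFunction count threshold=0.01

The first line builds a simple time series of event counts, and the fit command on the second line then flags unusually high or low values as outliers in the search results.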
Now, while the search bar and the experiments can help our users develop and deploy simple techniques like this for finding anomalies or making predictions, what we're starting to see is a trend towards use case focused workflows. Here we have one for ITSI, where more complex techniques can be run against data without having to see the details of the ML that's being applied at all.

So here we have a list of episodes, incidents in ITSI. When I click on an incident, a technique called causal inference gets run in the background to determine the root cause of that incident, and you can see here a graph structure that has mapped out those root cause relationships. Up here you can see a table where, for the service that was impacted by the incident, here are all the KPIs that were affected. By clicking on this, we can quickly drill down and see what the raw data looked like, and I could draw the conclusion that perhaps it was disk space used that was the reason behind this incident in this case.