Hi there. My name is Greg Ainslie-Malik, and I'd like to take you on a really brief tour through Splunk's Machine Learning Toolkit.

Originally developed for what Gartner termed citizen data scientists, the Machine Learning Toolkit presents a whole host of features for customers, mostly focused around assistants and experiments, to help users who aren't familiar with data science train and test machine learning models and deploy them into production. Most of these assistants present as guided interfaces where you can input some SPL, something our users are very familiar with, select algorithms and do some pre-processing, things our users are less familiar with, and then view a set of dashboards and reports that tell them about their model's performance.

However, what we see from the telemetry is that these experiments are generally used almost like pseudo training to help users familiarize themselves with MLTK, but of the monthly active users, actually more than 95% of them run MLTK searches straight from the search bar.

Here you can see an example of that, where we're using the fit command that ships with MLTK to apply an anomaly detection search. You can see that this is actually just two lines of SPL, so for our NOC and SOC personas, those who are very familiar to us at Splunk, this is quite a simple thing to do.
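For reference, a two-line anomaly detection search of this kind might look something like the sketch below; the index, time span, and threshold are placeholder assumptions rather than the exact search shown on screen, and DensityFunction is used here as one of the anomaly detection algorithms that ships with MLTK:

    index=_internal sourcetype=splunkd | timechart span=10m count
    | fit DensityFunction count threshold=0.01

The first line builds a simple time series of event counts, and the fit command on the second line then flags unusually high or low values as outliers in the search results.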
Now, while the search bar and the experiments can help our users develop and deploy simple techniques like this for finding anomalies or making predictions, what we're starting to see is a trend towards use case focused workflows. Here we have one for ITSI, where more complex techniques can be run against data without having to see the details of the ML that's being applied at all.

So here we have a list of episodes, incidents in ITSI. When I click on an incident, a technique called causal inference gets run in the background to determine the root cause of that incident, and you can see here a graph structure that has mapped out those root cause relationships. Up here you can see a table where, for the service that was impacted by the incident, here are all the KPIs that were affected. By clicking on this, we can quickly drill down and see what the raw data looked like, and I could draw the conclusion that perhaps it was disk space used that was the reason behind this incident in this case.