Splunk MLTK : Implementation Of Linear Regression In Splunk MLTK

0:00 - 0:02

Okay. In this video, we'll be discussing
0:02 - 0:05

about how we can implement linear
0:05 - 0:08

regression in Splunk MLTK, okay? So
0:08 - 0:10

in my previous video, we have seen how we
0:10 - 0:13

can install Splunk MLTK and its
0:13 - 0:15

related packages, right? And also if you
0:15 - 0:18

remember when I was discussing about the
0:18 - 0:21

machine learning core algorithm, I was
0:21 - 0:26

also introducing the core data set we'll
0:26 - 0:27

be using for our linear regression
0:27 - 0:29

modeling, okay?
0:29 - 0:31

That's the graduate admission dataset
0:31 - 0:35

where we have for various students we
0:35 - 0:37

have their GRE score, TOEFL score,
0:37 - 0:40

university rating, statement of purpose
0:40 - 0:41

rating okay,
0:41 - 0:45

reference rating, CGPA, and whether
0:45 - 0:47

they have done research or not. Based on
0:47 - 0:51

all these fields, we will try to
0:51 - 0:54

predict the chances of admit okay so now
0:54 - 0:59

to implement linear regression so we
0:59 - 1:02

will be implementing linear
1:02 - 1:05

regression for this one and see how best
1:05 - 1:07

the model is fitting the particular data
1:07 - 1:11

okay, so to implement linear
1:11 - 1:13

regression what you have to do you have
1:13 - 1:15

to go to a Splunk machine learning
1:15 - 1:20

toolkit okay as I stated before the
1:20 - 1:21

landing page of the machine learning
1:21 - 1:24

toolkit app is this Showcase tab,
1:24 - 1:27

right where it has basically a lot of
1:27 - 1:30

examples based on the
1:30 - 1:32

different machine learning
1:32 - 1:35

algorithms Splunk supports. Okay, now to
1:35 - 1:39

implement the machine learning on your
1:39 - 1:42

own data set, what you need to do is come
1:42 - 1:47

to the Experiments tab. OK, so now if you do
1:47 - 1:50

not have any other models or if it is
1:50 - 1:52

the first time you are coming to this
1:52 - 1:54

particular dashboard this will be the
1:54 - 1:56

default view okay but if you have
1:56 - 1:58

already experimented on different models
1:58 - 2:00

the view will be slightly different
2:00 - 2:03

which we'll see later ok so now as in
2:03 - 2:06

linear regression we are trying to do a
2:06 - 2:08

prediction on the numeric fields right
2:08 - 2:11

so we will go over here okay
2:11 - 2:13

the Predict Numeric Fields,
2:13 - 2:15

we'll click over here. Now it is
2:15 - 2:18

asking me for an experiment title and a
2:18 - 2:22

description so I will say graduate
2:23 - 2:30

admission prediction, let's give the
2:30 - 2:33

experiment title like this one
2:33 - 2:37

prediction okay now you can give some
2:37 - 2:38

description as well meaningful
2:38 - 2:44

description so I'll click on create okay
2:44 - 2:47

so now this particular view comes up
2:47 - 2:50

over here now if you see here here we
2:50 - 2:52

have two tabs experiment settings and
2:52 - 2:55

experiment history initially the
2:55 - 2:56

experiment history will be blank
2:56 - 2:59

there is nothing over here okay now
2:59 - 3:01

based on the experiment settings
3:01 - 3:03

experiment history will be updated
3:03 - 3:06

accordingly, which we'll see later. Okay,
3:06 - 3:08

now the first thing is it is asking me
3:08 - 3:13

for a search right so now let me let me
3:13 - 3:16

show you the data so this this
3:16 - 3:20

particular data I already indexed in my
3:20 - 3:24

main index okay so I'll just write the
3:24 - 3:30

query, index=main, and just
3:30 - 3:33

tabling all my different
3:33 - 3:39

features and chance of admit. OK, so this
3:39 - 3:42

is my data set so this data set I will
3:42 - 3:45

be using it for my training purpose not
3:45 - 3:47

the full data set, not all the 500
3:47 - 3:50

records maybe some of the data I will be
3:50 - 3:53

using it for training purpose and rest
3:53 - 3:55

of the data I will be using it for the
3:55 - 3:57

prediction purpose just to see how my
3:57 - 3:59

model is working ok so I'll give this
3:59 - 4:06

query over here and then I'll click on
4:06 - 4:11

search ok so by default it is showing me
4:11 - 4:14

this is my initial data preview,
4:14 - 4:18

right now let's go to the next one so
4:18 - 4:20

here if you see there are a lot of
4:20 - 4:23

pre-processing steps over here right so
4:23 - 4:27

now in machine learning when you train
4:27 - 4:30

a particular model, right, there
4:30 - 4:32

may be some need to pre-process
4:32 - 4:35

the data so that you reduce a lot
4:35 - 4:37

of noise from the data. Now there are a
4:37 - 4:38

lot of preprocessing algorithms
4:38 - 4:41

present over there so when we will
4:41 - 4:44

discuss those algorithms we'll come back
4:44 - 4:47

to this page again and work on it okay
4:47 - 4:50

so for now I will not be doing any kind
4:50 - 4:52

of pre-processing because this data is
4:52 - 4:56

clean enough data okay so now the
4:56 - 4:59

algorithm I will be choosing linear
4:59 - 5:00

regression now there are a lot of
5:00 - 5:02

regression algorithms, so currently we
5:02 - 5:05

studied only about regression and we'll
5:05 - 5:07

be implementing linear regression only
5:07 - 5:09

in this video so I will be choosing the
5:09 - 5:11

linear regression over here okay so now
5:11 - 5:14

Field to predict: that means which field
5:14 - 5:16

you want to predict. So as I will be
5:16 - 5:18

predicting my chance of admit, I will
5:18 - 5:22

be choosing that. And then Fields to use
5:22 - 5:24

for predicting that means here basically
5:24 - 5:27

you are choosing your features right so
5:27 - 5:30

I will be choosing all my columns so
5:30 - 5:34

here if you see the concept of simple
5:34 - 5:36

linear regression and multiple linear
5:36 - 5:38

regression comes up right if I choose a
5:38 - 5:41

single feature it will become a simple
5:41 - 5:43

linear regression if I choose multiple
5:43 - 5:45

feature it will become a multiple linear
5:45 - 5:48

regression so for now I will be choosing
5:48 - 5:52

for all okay now here if you see the
5:52 - 5:55

split for training right so here
5:55 - 5:57

basically it is what is happening is you
5:57 - 6:00

are splitting the whole dataset between
6:00 - 6:03

a training and test data set. Here
6:03 - 6:04

currently it is 50 percent 50 percent
6:04 - 6:07

that means the first 50 percent data
6:07 - 6:09

will be used for training and the rest
6:09 - 6:11

50 percent data will be used for testing
6:11 - 6:15

purpose I'll slide this one it goes like
6:15 - 6:19

this one I'll keep 70 and 30 okay now
6:19 - 6:23

fit intercept okay that means if you
6:23 - 6:25

remember from my machine learning video
6:25 - 6:30

not only we have the slope value for
6:30 - 6:33

each and every feature, we also have a
6:33 - 6:36

y-intercept, the y-axis intercept value,
6:36 - 6:39

right so by this option you are
6:39 - 6:41

basically choosing whether
6:41 - 6:44

your model should include an implicit
6:44 - 6:47

intercept term or not. Okay, now Notes:
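To make the fit intercept option concrete, here is a minimal sketch for a single feature, assuming ordinary least squares. The function `fit_line` and the toy numbers are hypothetical and are not taken from the toolkit; the point is only to show how the fitted slope changes when the line is forced through the origin.

```python
def fit_line(xs, ys, fit_intercept=True):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    if fit_intercept:
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
    else:
        # Without an intercept the line is forced through the origin.
        slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        intercept = 0.0
    return slope, intercept


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
with_intercept = fit_line(xs, ys, fit_intercept=True)
without_intercept = fit_line(xs, ys, fit_intercept=False)
print(with_intercept, without_intercept)
```

With the intercept enabled the fit recovers slope 2 and intercept 1; without it the slope has to absorb the offset and comes out larger.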
6:47 - 6:49

you can you can give some meaningful
6:49 - 6:52

notes maybe the notes could be like what
6:52 - 6:53

are the fields you are using for
6:53 - 6:56

prediction purpose so and some
6:56 - 6:58

meaningful note which will be useful in
6:58 - 7:00

later when we'll see the history of the
7:00 - 7:05

model okay so I will say using all the
7:05 - 7:11

features using all the features okay now
7:11 - 7:13

after all is done you need to click on
7:13 - 7:17

Fit Model. So basically, behind the
7:17 - 7:20

scenes, what it does is run a Splunk
7:20 - 7:22

custom command, which is basically
7:22 - 7:25

implemented, I think, with scikit-learn. So
7:25 - 7:28

using that particular command it is
7:28 - 7:30

trying to come up with the equation of
7:30 - 7:32

that line right which we discussed
7:32 - 7:37

before and and if you remember from my
7:37 - 7:39

multiple linear regression video we come
7:39 - 7:42

up with a linear algebra solution over
7:42 - 7:46

there, right, with matrix inversion
7:46 - 7:49

and matrix transpose right so behind the
7:49 - 7:50

scene it is doing the same thing over
7:50 - 7:51

there okay
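The linear algebra solution referred to here is usually written as the normal equation. The notation is an assumption for this note, since the earlier video's formula is not part of this transcript: X is the feature matrix with a leading column of ones for the intercept, and y is the vector of observed chance-of-admit values. The underlying library may use a numerically different solver that reaches the same least-squares solution.

```latex
\hat{\beta} = \left( X^{\top} X \right)^{-1} X^{\top} y
```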
7:51 - 7:54

so now if you see the result came up
7:54 - 7:56

right after clicking on the fit model
7:56 - 8:00

now if you see apart from our own data
8:00 - 8:04

it's actually added two new columns over
8:04 - 8:06

here: one is the predicted chance of
8:06 - 8:09

admit and the other is the residual column, right? Now
8:09 - 8:11

predicted chance of admit is actually
8:11 - 8:13

the actual prediction happen on the data
8:13 - 8:17

right so if you see for the first row
8:17 - 8:20

the actual chance of admit is 0.73,
8:20 - 8:23

that means 73%. Now the predicted
8:23 - 8:26

was 0.70, that means 70 percent. So
8:26 - 8:28

the residual column is the difference
8:28 - 8:30

between the actual chance of admit and
8:30 - 8:34

the predicted chance of admit. Okay, so
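The residual column can be reproduced with a few lines of arithmetic. In this sketch only the first pair (0.73 actual, 0.70 predicted) comes from the video; the other values and the function name `residuals` are made up for illustration.

```python
def residuals(actual, predicted):
    """Residual = actual value minus predicted value, row by row."""
    return [a - p for a, p in zip(actual, predicted)]


actual = [0.73, 0.90, 0.65]     # first value is from the video, the rest are hypothetical
predicted = [0.70, 0.88, 0.69]  # first value is from the video, the rest are hypothetical
rounded = [round(r, 2) for r in residuals(actual, predicted)]
print(rounded)
```

A positive residual means the model predicted too low, a negative one means it predicted too high.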
8:34 - 8:37

so this is how after fitting the model
8:37 - 8:40

it it came up with this kind of
8:40 - 8:40

visualization
8:40 - 8:44

it also shows up there are other five to
8:44 - 8:46

six charts over here okay now let us
8:46 - 8:49

discuss one by one this one the first
8:49 - 8:52

chart shows me the actual versus
8:52 - 8:54

predicted line chart that means
8:54 - 8:57

if you see the chance of admit the blue
8:57 - 9:00

color graph is the actual one and the
9:00 - 9:02

predicted chance of admit, the yellow
9:02 - 9:04

color one is the prediction one right
9:04 - 9:07

and if you see by seeing this one we can
9:07 - 9:10

at least see this particular model is
9:10 - 9:13

okay fit to this particular data
9:13 - 9:15

somewhere it is lagging over here if you
9:15 - 9:18

see it right but somehow it's it's
9:18 - 9:22

actually fitting good over there now the
9:22 - 9:24

residual chart whatever you are seeing
9:24 - 9:26

it over here the line chart it is
9:26 - 9:29

showing up over here okay so now the
9:29 - 9:32

more that particular chart is
9:32 - 9:35

close to zero that means the model is
9:35 - 9:37

fitting really really good but over here
9:37 - 9:40

if you see the latter part of this one
9:40 - 9:43

the residuals are more right because it
9:43 - 9:46

is more sparse more distance from the
9:46 - 9:48

zeroth line and the same thing is
9:48 - 9:50

reflecting over here as well the model
9:50 - 9:53

has some kind of lagging over here right
9:53 - 9:56

so so this kind of analysis you can do
9:56 - 9:59

it from there how the model is fitting
9:59 - 10:02

your data and this particular graph is
10:02 - 10:04

showing me the scatter plot of the
10:04 - 10:06

actual and the predicted one and here
10:06 - 10:09

basically you can see the how the line
10:09 - 10:12

is fitting your data over here through
10:12 - 10:16

this chart. Okay, now it also provides a
10:16 - 10:20

residual histogram where let us
10:20 - 10:22

understand this one as well so what we
10:22 - 10:24

have the zeroth line over here if you
10:24 - 10:28

see, it basically shows, for each and
10:28 - 10:30

every residual value how many counts are
10:30 - 10:33

there if you see so if you if you just
10:33 - 10:37

think about it if for all my data points
10:37 - 10:41

this residual is zero that that's the
10:41 - 10:43

ideal scenario right that means I am
10:43 - 10:45

predicting the exact value, right?
10:45 - 10:49

so from this histogram if you see that
10:49 - 10:52

means if you see the residual error
10:52 - 10:54

equals to zero the sample count is 24
10:54 - 10:57

over here right if the more and more
10:57 - 11:00

samples are very close to this zero that
11:00 - 11:04

means my model is doing well, that is, it's
11:04 - 11:06

actually good fit model and if it is
11:06 - 11:08

more sparse if
11:08 - 11:12

that means if we have more number of big
11:12 - 11:14

lines over here that means somehow the
11:14 - 11:16

model is not good not a good fit for
11:16 - 11:18

that particular data so this kind of
11:18 - 11:20

interpretation you can do it from this
11:20 - 11:24

particular diagram okay so now there are
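The residual histogram is essentially a count of residuals per bin. A minimal sketch, assuming a fixed bin width and made-up residual values (the name `residual_histogram` and the bin width of 0.05 are choices for this example, not the toolkit's settings):

```python
from collections import Counter


def residual_histogram(residuals, bin_width=0.05):
    """Count how many residuals fall nearest to each bin centre."""
    counts = Counter()
    for r in residuals:
        bin_index = round(r / bin_width)            # index of the nearest bin centre
        counts[round(bin_index * bin_width, 10)] += 1
    return dict(sorted(counts.items()))


# Hypothetical residuals: most are close to zero, two are further away.
sample = [0.0, 0.01, -0.02, 0.06, -0.11, 0.02]
hist = residual_histogram(sample)
print(hist)
```

The taller the bar at zero compared with the bars further out, the closer the predictions are to the actual values.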
11:24 - 11:27

another two things over here, called the R
11:27 - 11:30

squared statistic and root mean squared error.
11:30 - 11:33

okay so these two are actually a measure
11:33 - 11:37

about how accurate the model is okay so
11:37 - 11:41

I'll be discussing this measurement in
11:41 - 11:43

great detail in a separate video.
11:43 - 11:45

There we'll be discussing the R squared
11:45 - 11:47

statistic, root mean squared error, and also some
11:47 - 11:51

other ways to determine how
11:51 - 11:53

accurate the model is, such as bias and
11:53 - 11:55

variance there are a lot of other
11:55 - 11:56

measurement as well
11:56 - 11:59

we'll discuss in detail over there okay
11:59 - 12:00

but for now just just try to remember
12:00 - 12:03

that this is a measurement of fit.
12:03 - 12:05

For R squared, let's just say we
12:05 - 12:09

can think that the closer it is to 1, the better
12:09 - 12:11

the fit, something like this. Okay,
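For reference, both measures can be computed directly from the actual and predicted values using their standard definitions. The function names and the toy numbers below are illustrative only and do not come from the video.

```python
import math


def r_squared(actual, predicted):
    """R squared = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


actual = [1.0, 2.0, 3.0, 4.0]     # hypothetical values
predicted = [1.1, 1.9, 3.2, 3.8]  # hypothetical values
r2 = r_squared(actual, predicted)
error = rmse(actual, predicted)
print(round(r2, 3), round(error, 3))
```

An R squared near 1 and an RMSE near 0 both point to a close fit; RMSE is expressed in the same units as the field being predicted.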
12:11 - 12:15

so we will see how to best
12:15 - 12:18

judge a model based on that okay but
12:18 - 12:20

still, even for the R squared statistic,
12:20 - 12:25

it all depends on the context, the field
12:25 - 12:28

in which you are implementing
12:28 - 12:30

linear regression. We'll discuss
12:30 - 12:32

those stuff as well in future ok and now
12:32 - 12:35

if you see the last graph it is showing
12:35 - 12:38

me the model parameters if you remember
12:38 - 12:40

the big equation we have written it over
12:40 - 12:43

there, right? So let me open the Bamboo
12:43 - 12:56

Paper here. If you remember, when we
12:56 - 13:00

talked about multiple linear regression
13:02 - 13:05

we defined we started our discussion
13:05 - 13:07

with a big equation right so let me go
13:07 - 13:11

back go back over there
13:21 - 13:26

yes, so this one, right? So here beta 1,
13:26 - 13:30

beta 2, up to beta p are our slope values, the
13:30 - 13:31

coefficient of each and every feature,
13:31 - 13:35

and beta 0 is my intercept right and
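Written out, the equation being described has this general form. The symbols x_1 to x_p for the p features and the error term are assumed notation, since the handwritten equation itself is not part of the transcript.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```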
13:35 - 13:37

what what we are doing basically at the
13:37 - 13:39

end of the day we came up with a big
13:39 - 13:42

equation to determine this whole beta
13:42 - 13:46

vector right so this is the same stuff
13:46 - 13:49

over here it is representing so it is
13:49 - 13:51

basically giving me like for each and
13:51 - 13:53

every feature what is the coefficient
13:53 - 13:57

value okay so and the intercept value as
13:57 - 14:00

well if you see this is my beta 0 and my
14:00 - 14:02

beta 1 to beta p are these,
14:02 - 14:04

the other guys. Now if you see it closely,
14:04 - 14:07

there are some coefficients which
14:07 - 14:10

have a very large value and some
14:10 - 14:11

coefficients which have a very small value
14:11 - 14:14

over here like the way to interpret the
14:14 - 14:19

coefficient is like how much it is
14:19 - 14:22

influencing the end result.
14:22 - 14:25

so to understand that let us see this
14:25 - 14:29

one. Let's say I have a variable called x
14:29 - 14:34

and I am writing something like x = 0.9y.
14:34 - 14:36

Now what do I mean by this particular
14:36 - 14:41

equation, 0.9 into y, right? So that means
14:41 - 14:45

if I if I give y equals to 1 that means
14:45 - 14:50

my x will become 0.9 right so what do
14:50 - 14:52

you mean by that that means one unit
14:52 - 14:53

change in Y
14:53 - 14:56

is basically 0.9 units that we are
14:56 - 15:01

changing in X right so this kind of
15:01 - 15:03

interpretation you can do it so that
15:03 - 15:09

means how Y is influencing X right so
15:09 - 15:11

this is how we are interpreting this
15:11 - 15:14

kind of coefficients as well in linear
15:14 - 15:17

regression so that means we will know
15:17 - 15:19

from the coefficient itself which
15:19 - 15:22

particular feature is mostly influencing
15:22 - 15:24

that one and now if you see it over here
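The x = 0.9y example can be checked numerically: raising y by one unit moves the prediction by exactly the coefficient. The `predict` helper below is a hypothetical stand-in for a fitted linear model. One caveat that the sketch does not address: comparing raw coefficients across features assumes the features are on comparable scales.

```python
def predict(coefficients, intercept, features):
    """Linear model: intercept plus the sum of coefficient * feature value."""
    return intercept + sum(coefficients[name] * value for name, value in features.items())


coefficients = {"y": 0.9}  # the 0.9 from the example in the video
base = predict(coefficients, 0.0, {"y": 1.0})
bumped = predict(coefficients, 0.0, {"y": 2.0})
change = bumped - base
print(change)
```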
15:24 - 15:24

I think
15:24 - 15:27

CGPA is the most influential factor to
15:27 - 15:31

determine whether my chance of
15:31 - 15:32

admit is higher
15:32 - 15:36

or not right considering we are
15:36 - 15:38

implementing a linear regression there
15:38 - 15:40

could be a better fit of this particular
15:40 - 15:43

data which we need to experiment and see
15:43 - 15:45

but for the current linear
15:45 - 15:47

regression implementation we we can
15:47 - 15:50

conclude this kind of stuff over here
15:50 - 15:58

right okay so so this is how the model
15:58 - 16:02

parameters summary visualization table
16:02 - 16:03

visualization is telling me those
16:03 - 16:06

different those details right so now if
16:06 - 16:09

you see we actually fit our model right
16:09 - 16:12

so we have still not created our model,
16:12 - 16:14

until and unless we save it. That's
16:14 - 16:17

why it is showing me a draft status for
16:17 - 16:22

your model right and you can now go to
16:22 - 16:27

experiment history to see what you have
16:27 - 16:29

done till now so it will be maintaining
16:29 - 16:32

a history over there so now I can see
16:32 - 16:36

using using this all these features my R
16:36 - 16:39

square statistic is somewhere around 78%
16:39 - 16:41

and these are my coefficient and I am
16:41 - 16:43

coming up with a conclusion that maybe
16:43 - 16:46

CGPA is the most influential factor over
16:46 - 16:49

here okay so let us do another
16:49 - 16:53

experiment okay so in here I'll keep my
16:53 - 16:57

CGPA over here just to see whether it is
16:57 - 17:00

actually true or not okay so now what I
17:00 - 17:03

will do here is I will keep cgpa
17:03 - 17:06

I'll give the stat of 5:1 I will keep
17:06 - 17:09

the LOR, okay, I'll
17:09 - 17:14

keep the research 1 and I will keep the
17:14 - 17:19

GRE score okay so I'll click over here
17:19 - 17:22

again I will keep the GRE score I will
17:22 - 17:24

remove the TOEFL score I will remove the
17:24 - 17:26

university rating I will remove the SOP
17:26 - 17:29

CGPA, Research, and LOR I will keep. So
17:29 - 17:32

now I am trying to do this experiment
17:32 - 17:35

with four features which I am thinking
17:35 - 17:40

maybe most influential one so maybe the
17:40 - 17:43

other feature may not have much impact
17:43 - 17:46

on on this particular prediction okay
17:46 - 17:50

so now using only I'll keep a note using
17:50 - 17:55

only four features so this is how this
17:55 - 17:58

particular note comes in handy
17:58 - 18:01

over here right so it is when I will see
18:01 - 18:03

the history I will come to know what I
18:03 - 18:05

have done over there okay so I will
18:05 - 18:09

click on fit model again let's see how
18:09 - 18:14

it's how it's working now so similar
18:14 - 18:15

stuff is happening over there it's
18:15 - 18:19

running that custom command. In
18:15 - 18:19

later videos we will discuss
18:19 - 18:22

this custom command
18:22 - 18:24

in detail as well. Okay, so now if
18:34 - 18:38

you see it again predicted that one now
18:38 - 18:41

if you see from the actual versus predicted line
18:41 - 18:43

chart it's more or less keeping same
18:43 - 18:45

even though I removed three features
18:45 - 18:48

right, even this one as well, more or
18:48 - 18:48

less
18:48 - 18:52

okay now if you see my R square
18:52 - 18:54

statistic has improved a lot, to 82%,
18:54 - 18:58

right so by this one at least I am
18:58 - 19:01

confident that really those three
19:01 - 19:04

features are not impacting it much.
19:04 - 19:08

And if you see from this residual
19:08 - 19:11

histogram, more
19:11 - 19:13

and more samples are very close to zero,
19:13 - 19:16

right, in terms of the residual:
19:16 - 19:18

more and more residuals are very, very
19:18 - 19:20

close to zero, right?
19:20 - 19:23

so by this kind of analysis we can say
19:23 - 19:25

this particular model is better
19:25 - 19:28

compared to my previous model right so
19:28 - 19:31

now what I will do is I will save this
19:31 - 19:33

particular model okay so I will say I
19:33 - 19:38

will give the experiment title as
19:39 - 19:47

graduate date predictor okay I will
19:47 - 19:51

click on Save. So now a model will
19:51 - 19:55

be created. Okay, so now we have
19:55 - 19:56

two options over here after you save the
19:56 - 19:58

model: either you can go to
19:58 - 20:00

the listing page
20:00 - 20:02

or you continue editing okay let us
20:02 - 20:04

continue editing to see how experiment
20:04 - 20:06

history is looking now now experiment
20:06 - 20:09

history has two rows over there okay
20:09 - 20:12

the first row is my current
20:12 - 20:14

experiment with my four features right
20:14 - 20:18

with R square value of 82% the second
20:18 - 20:20

row is telling with my older one right
20:20 - 20:22

so at any point of time you can load
20:22 - 20:24

this corresponding settings and
20:24 - 20:26

experiment with it okay
20:26 - 20:27

it will also show you the data
20:27 - 20:30

corresponding to each and every experiment.
20:30 - 20:32

okay so now let's go back to our
20:32 - 20:35

experiment tab and see what is happening
20:35 - 20:37

over there okay now if you see my
20:37 - 20:40

experiment tab it's not showing with
20:40 - 20:43

those big big blocks right and it is
20:43 - 20:44

showing with this kind of view where I
20:44 - 20:47

have a Predict Numeric Fields, a single
20:47 - 20:50

experiment I have done I have given the
20:50 - 20:53

experiment name like this one, right, and
20:53 - 20:54

the algorithm I have chosen is linear
20:54 - 20:56

regression there are lot of actions you
20:56 - 20:59

can do on this particular model so
20:59 - 21:01

before publishing, let us talk about
21:01 - 21:03

that one okay you can create an alert
21:03 - 21:07

from this model just to see so suppose
21:07 - 21:10

the model is predicting data right so
21:10 - 21:12

you can choose an alert create an alert
21:12 - 21:15

something like when my predicted chance
21:15 - 21:16

of admit is greater than 90 percent, that
21:16 - 21:20

means 0.9, okay, or 0.99 maybe, that means
21:20 - 21:23

the model is really really working good
21:23 - 21:25

over there right so this kind of alert
21:25 - 21:30

you can do okay next you can edit the
21:30 - 21:32

title and description it's a simple
21:32 - 21:36

enough. Now you can see Schedule
21:36 - 21:37

Training. This is an interesting feature
21:37 - 21:41

where, whatever we have done till now,
21:41 - 21:43

we have done a manual training over
21:43 - 21:45

there right now in the scheduled
21:45 - 21:47

training feature, you can create a
21:47 - 21:49

schedule which will run the training
21:49 - 21:52

based on the data now if you see there
21:52 - 21:54

is a time range over there so you can
21:54 - 21:56

choose the time range of the data you
21:56 - 21:59

want to use for training purpose okay
21:59 - 22:01

That's a really interesting feature you
22:01 - 22:03

have so that means the more and more
22:03 - 22:06

data coming to your system you can use
22:06 - 22:09

that particular data for training
22:09 - 22:11

purposes as well automatically using
22:11 - 22:13

this scheduled training okay
22:13 - 22:17

similarly for other scheduling stuff the
22:17 - 22:18

schedule priority and schedule window
22:18 - 22:21

you can set it up as well even you can
22:21 - 22:23

trigger an action as well when the
22:23 - 22:24

scheduling is happening: either you
22:24 - 22:27

can log an event, or you can send the
22:27 - 22:30

output to a lookup, everything. This is
22:30 - 22:32

normal scheduling purposes okay that is
22:32 - 22:34

also you can do over here so this is a
22:34 - 22:36

very versatile feature as well with the
22:36 - 22:39

model you can do and now you can delete
22:39 - 22:43

it as well that's fine so now we will
22:43 - 22:45

publish this model okay
22:45 - 22:56

let's say chances of admit model okay
22:56 - 22:58

this is the model name and the
22:58 - 23:00

destination app you will be choosing
23:00 - 23:02

over here the model will be saved over
23:02 - 23:04

there okay I will be choosing my search
23:04 - 23:08

and reporting app I will click on submit
23:08 - 23:13

okay so the model is created now so how
23:13 - 23:15

the model is created in the background
23:15 - 23:17

it's basically a lookup file, so let us
23:17 - 23:23

see that. Okay, so from the Splunk home,
23:23 - 23:28

etc, apps, search, okay,
23:28 - 23:31

lookups. Okay, so currently if you see it
23:31 - 23:33

over here,
23:33 - 23:36

by default the model is saved
23:36 - 23:39

in the user context, so that's why it
23:39 - 23:41

is not coming up under search. So for that,
23:41 - 23:43

what I need to do is go to etc,
23:43 - 23:47

then I need to go to users. Currently I'm
23:47 - 23:50

the admin user, so I'll go to admin, and
23:50 - 23:53

I'll go to the Search app and here in
23:53 - 23:55

the lookups folder, this is how the model
23:55 - 23:57

is getting stored over there okay so I
23:57 - 24:00

think this lookup is in read-only format
24:00 - 24:04

so if I just open in notepad so this is
24:04 - 24:06

how it looks. So this is
24:06 - 24:09

basically saving a lot of information,
24:09 - 24:11

the metadata related information about
24:11 - 24:13

the model over here okay what are the
24:13 - 24:16

feature variables whatever the columns I
24:16 - 24:19

have in my data okay all of these things
24:19 - 24:22

and apart from that, there are other
24:22 - 24:24

features, which we do not have any
24:24 - 24:26

control over, that it is saving over there.
24:26 - 24:32

okay so now we created our own model
24:32 - 24:34

right? Now, to apply this, how we
24:34 - 24:36

are going to apply this? There is a
24:36 - 24:40

command called apply in Splunk MLTK.
24:40 - 24:43

okay so by using that command you can
24:43 - 24:46

apply that particular model on any data
24:46 - 24:49

set. Okay, or specifically we'll be
24:49 - 24:53

doing it on this data set itself. Otherwise, if you apply
24:53 - 24:56

that model on any irrelevant data set,
24:56 - 24:59

it is anyhow not
24:59 - 25:01

going to give you proper results. So
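The idea of fitting once and applying later, and why the model only makes sense on data with the same fields, can be sketched outside Splunk like this. The model dictionary, its coefficient values, and the `apply_model` function are all hypothetical; they only illustrate the concept and do not reflect how the toolkit stores or applies a model.

```python
def apply_model(model, record):
    """Apply stored linear-model parameters to one record with the same fields."""
    missing = [name for name in model["coefficients"] if name not in record]
    if missing:
        # An unrelated data set will not have the fields the model was trained on.
        raise KeyError(f"record is missing fields: {missing}")
    return model["intercept"] + sum(
        weight * record[name] for name, weight in model["coefficients"].items()
    )


# Hypothetical saved model: these numbers are made up, not taken from the video.
model = {"intercept": -1.0, "coefficients": {"CGPA": 0.2, "Research": 0.05}}
prediction = apply_model(model, {"CGPA": 9.0, "Research": 1})
print(round(prediction, 2))
```

Calling `apply_model` on a record without the training fields raises an error in this sketch, which mirrors the point that the model should be applied to the same kind of data it was trained on.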
25:01 - 25:04

this is how you will be applying the
25:04 - 25:08

model. So this is my data
25:08 - 25:10

set, my base data set, right? I'll just
25:10 - 25:13

choose, say, the last hundred records,
25:13 - 25:20

okay, let's say the last 200 records. Okay, now I
25:20 - 25:23

will be using the apply command don't
25:23 - 25:24

worry about it, I will be discussing these
25:24 - 25:27

Splunk MLTK commands in detail in
25:27 - 25:31

my next video so here we will just see
25:31 - 25:34

how we are just applying the model so
25:34 - 25:37

now I will say my apply command, then my
25:37 - 25:41

model name right so we have given our
25:41 - 25:45

model name as chances of admit model
25:45 - 25:55

I'll just copy it and I will just run it
25:55 - 25:58

so what it should do basically it will
25:58 - 26:01

apply this particular model on that,
26:01 - 26:05

whatever. Okay, so it is saying permission
26:05 - 26:08

denied. So for that, what I need
26:08 - 26:15

to do is settings lookups okay lookup
26:15 - 26:24

table files I'll choose this one search
26:24 - 26:27

and reporting. Okay, this is my chances of
26:27 - 26:29

admit model. Currently it is in private
26:29 - 26:32

mode, that's why I am not able to apply
26:32 - 26:35

it from the Search app. So I choose
26:35 - 26:38

This app only, and read and write I will
26:38 - 26:40

give. I'll click on Save.
26:40 - 26:47

okay, an internal error was detected,
26:47 - 26:50

okay, so let me see what's
26:50 - 26:56

going on over there okay so I think
26:56 - 26:58

there was some technical glitch so I
26:58 - 27:02

just did the permission again and I just
27:02 - 27:05

I have chosen All apps. I think it works
27:05 - 27:09

now so now let us see whether our search
27:09 - 27:11

is working or not
27:11 - 27:15

okay so I have taken the last 200
27:15 - 27:18

records and I'm just applying
27:18 - 27:20

the machine
27:20 - 27:23

learning model. So if you see,
27:23 - 27:24

it is applying that model on this
27:24 - 27:27

particular two hundred records two
27:27 - 27:29

hundred events over there and it has
27:29 - 27:31

created a new column called predicted
27:31 - 27:34

chances of admit. Okay, so this is how we
27:34 - 27:36

are applying that model even you can
27:36 - 27:39

create your own alert using this
27:39 - 27:41

particular command as well so that
27:41 - 27:43

whenever you want something
27:43 - 27:46

like chance of admit is more than 90
27:46 - 27:48

percent, eighty percent, or any other
27:48 - 27:50

threshold you want, you can use this
27:50 - 27:53

particular command to achieve that
27:53 - 27:55

same thing over there okay so this is
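The alert condition described here boils down to a threshold check on the predicted value. A small sketch, assuming the predictions are available as rows with a predicted field; the field name `predicted_chance`, the student ids, and the values are placeholders for this example.

```python
def rows_to_alert(rows, field, threshold=0.9):
    """Return the rows whose predicted value is above the alert threshold."""
    return [row for row in rows if row[field] > threshold]


# Hypothetical prediction output; the field name is a placeholder.
predictions = [
    {"student": 1, "predicted_chance": 0.93},
    {"student": 2, "predicted_chance": 0.71},
    {"student": 3, "predicted_chance": 0.95},
]
alerts = rows_to_alert(predictions, "predicted_chance", threshold=0.9)
print([row["student"] for row in alerts])
```

Lowering the threshold to 0.8 or any other value only changes the comparison, not the structure of the check.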
27:55 - 28:00

how you can experiment with machine
28:00 - 28:02

learning specifically the linear
28:02 - 28:07

regression, in Splunk MLTK, and we
28:07 - 28:09

saw a lot of experiments we have
28:09 - 28:12

done regarding this one, right? So this
28:12 - 28:13

is how you experiment with your data as
28:13 - 28:17

well and see how it best fits
28:17 - 28:19

your data and you can achieve a lot of
28:19 - 28:22

other stuff like automatically training
28:22 - 28:24

creating alerts from these things as
28:24 - 28:27

well okay in next video we will talk
28:27 - 28:29

more details we will basically deep dive
28:29 - 28:31

into what is basically happening internally
28:31 - 28:34

over here. We will talk about the different
28:34 - 28:36

Splunk commands running internally, the
28:36 - 28:37

custom commands running internally, and
28:37 - 28:40

whatever we have done this experiment we
28:40 - 28:42

have done from the UI the same thing can
28:42 - 28:45

be achieved from the from the search
28:45 - 28:46

command
28:46 - 28:49

as well, from Splunk SPL as well. Okay, see
28:49 - 28:52

you in next video

Title:: Splunk MLTK : Implementation Of Linear Regression In Splunk MLTK
Video Language:: English
Duration:: 28:51
