Hi there. My name is Greg Ainslie-Malik, and I'd like to take you on a really brief tour through Splunk's Machine Learning Toolkit.
Originally developed for what Gartner termed citizen data scientists, the Machine Learning Toolkit presents a whole host of features for customers, mostly focused around assistants and experiments, to help users who aren't familiar with data science train and test machine learning models and deploy them into production.
And most of these assistants present as kind of guided interfaces where you can input some SPL, something that our users are very familiar with, select some algorithms, do some pre-processing, things that our users are less familiar with, and then view a set of dashboards, a set of reports that tell them about their model's performance.
However, what we see from the telemetry is that these experiments are generally used as almost like pseudo-training to help users familiarize themselves with MLTK, but of the monthly active users, actually more than 95% of them run MLTK searches straight from the search bar.
So here you can see an example of that, where we're using the fit command that ships with MLTK to apply an anomaly detection search. And you can see that this is actually just two lines of SPL.
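[Editor's sketch: the search shown on screen is not captured in this transcript, so the following is only a rough illustration of the general idea behind fitting an anomaly detection model, written in Python rather than SPL. It assumes a simple normal-distribution model with a z-score cutoff; the function names, the sample counts, and the threshold of 3.0 are all hypothetical and are not MLTK's implementation.]

```python
from statistics import mean, stdev


def fit_normal(values):
    """Learn a very simple model: the mean and sample standard deviation."""
    return mean(values), stdev(values)


def flag_outliers(values, model, z_threshold=3.0):
    """Flag each value whose distance from the mean exceeds z_threshold sigmas."""
    mu, sigma = model
    if sigma == 0:
        # Degenerate training data: anything that differs from the mean is unusual.
        return [v != mu for v in values]
    return [abs(v - mu) / sigma > z_threshold for v in values]


# Hypothetical hourly event counts used as training history.
history = [100, 98, 103, 97, 101, 99, 102, 100]
model = fit_normal(history)

# Hypothetical new observations to score against the fitted model.
new_counts = [101, 250, 99]
flags = flag_outliers(new_counts, model)
print(flags)  # [False, True, False]
```

[The two steps here, fitting on history and then flagging new values, loosely mirror the fit-then-review pattern described above, under the stated assumptions.]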
So for our NOC and SOC personas, those who are very familiar to us at Splunk, this is quite a simple thing to do.
Now, while the search bar and the experiments can help our users develop and deploy simple techniques like this for finding anomalies or making predictions, what we're starting to see is a trend towards use-case-focused workflows. Here we have one for ITSI, where more complex techniques can be run against data without having to see the details of the ML that's being applied at all.
So here we have a list of episodes, incidents in ITSI. When I click on an incident, a technique called causal inference gets run in the background to determine the root cause of that incident, and you can see here a graph structure that has mapped out those root cause relationships, and up here you can see a table where, for the service that was impacted by the incident, here are all the KPIs that affected it. And clicking on this, we can quickly drill down and see what the raw data looked like, and I could draw the conclusion that perhaps it was disk space used that was the reason behind this incident in this case.