All right, welcome to another LAME Creations video. This is going to be more or less a revisit of the Splunk .conf address I gave this previous June 2024 in Vegas, so let's go give it. It was entitled "Anomaly Detection So Easy Your Grandma Could Do It: No ML Degree Required." I'm not going to spend much time on this if you don't know who I am: I'm Troy Moore from Log Analysis Made Easy, and here's my contact information.
All right, what we're going to discuss in this conference breakout session: common baselines you might need, what are some things you might need in your company, and how to make those baselines. Then let's give a demo; I'm not a death-by-PowerPoint person, so we're going to go into a live demo on this. Finally, what you should do after seeing this presentation, and some gotchas to baselining.
Let's discuss what baselining is. A baseline is the expected values or conditions against which all performances are compared. I've given a definition, but what is that in practice? Let's go the other way: common baselines. Let's discuss what some are. Maybe a hardware baseline. Can I get a software baseline? Can I get network ports and protocol baselines? User baselines, user behavior baselines.
I know I've been working in the cyber world for many, many years, and as an auditor I will ask, "Do you by chance have an inventory?" And you know what, it's a funny answer. Ask yourself: does your company have a network inventory? How thorough is it? How accurate is that network inventory? Well, if you don't have that, can you give me a baseline of what is on your network? A lot of people will tell you it's really difficult to give a baseline if you don't have a network inventory. And so you'll ask these questions, and often you don't get the answers. I know I've told auditors year after year at the places I've worked, "I don't have an inventory; I don't have those kinds of things."
What we're going to do in this presentation is show how Splunk and statistical models make you that hero. You can be the person who provides that inventory. You can be the person who provides baselines and can show what is normal in our environment.
In order to know what's normal, we need to look to the past. Hopefully this makes sense: if you don't know what happened in the past, you won't be able to know what's normal; the past is what defines normality. So we can look at historical IP logs that have the connections, and that can help us build a baseline. We can track the processes that have been running and build a baseline. We can look at the ports used by systems historically and build a baseline. We can track historical login events and build a login-event baseline. Splunk is a logging system, so if you've been getting those logs, then you now have a tool to build a baseline.
So here is the concept I'll need you to understand to be able to understand everything else. There are two methods for baselining: what I call the rolling window, and allow listing. The rolling window is the easiest way to start a baseline; in my opinion it's the simplest method. The concept uses this little bar here: we've got an X, and we're going to say this full line is a timeline. It might be one day, a week, a month, three months, a year, whatever the case may be. X is the historical part of that time: if the timeline is one day, X could be 23 hours; if it's a week, X could be six days; a month, 29 days; a year, 11 months, as the case may be. Y is the portion of that time we're going to look at. The X portion will be our baseline, the historical events. Y is what we're going to examine: we're going to say, hey, looking at all these events that have occurred, are there any events in Y that were not in the baseline, not in X? If we do that, that's the definition of an anomaly: anomalies are things that are not in your baseline. So we can use a rolling window, and I will actually demo how to do that.

The other method is allow listing, in which case we build a list of our baseline. Again, you've got to figure out how you're going to build that baseline, but once you do, you put that list into a lookup file; we can do that using the outputlookup command. Then we use a lookup command against our logs: we look at all the logs coming in and ask, do any of these logs lack a matching pair in our lookup? That would be an indication that this is a new, anomalous event.
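Before the demo, here's the rolling-window idea as a bare SPL sketch. Everything in it is a placeholder, not from the talk: the index, sourcetype, the pair of fields you tuple on, and the one-day cutoff.

```
index=your_index sourcetype=your_sourcetype earliest=-90d
| stats min(_time) AS earliest_time BY field1 field2
| where earliest_time > now() - 86400
```

Any field1/field2 pair whose first appearance falls inside the last day (the Y window) is, by this definition, an anomaly against the preceding 89 days (the X window).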
All right, I've given the PowerPoint presentation of that; we're now going to go into demo time. We want to demo this and show how it works in actual practice. Again, the queries are at the end of the presentation; I'm going to have to give a link to the PDF so you can just grab them if you want to use them, or just slow down the video and, I'm sorry, you'll have to record them that way. But anyway, that is what we're going to do.
All right, for this demo I wanted to make sure that any of you could go home and use this very same thing, the same data set. So I went and grabbed Splunk's freely available Boss of the SOC data, referred to as BOTS v3. You could go grab v2 or v3; these things are going to work on your own data, but I wanted to make sure you could do these very same scenarios when you went back home after this conference. So I went to Google, typed in "BOTS v3" and "GitHub," and that brings back this little link here. It's just an app you can go download. It's a relatively large app because it contains all the data in an index, already pre-indexed for you, so when you install it you will have all the exact same data that I'm using in these queries, allowing you to easily run the exact same thing in your own environment; and as you learn these queries, you'll be able to use them elsewhere. All the documentation is right here if you want to use any of these source types; they're all available there, along with any of the required software you need, any of the TAs, in order to get your data to parse correctly. We're going to primarily use the stream data and some network and host logs.
I'm going to jump over here into my environment. If I do a head 100 on this little command here, this botsv3 stream:tcp, this is TCP network traffic, and I can see the traffic: I can see connections going through with bytes in and bytes out, destination IP, source IP, dest port, source port. What I want to do is baseline what the normal IP traffic on my network is, so that when I see abnormal IP traffic I can be alerted to it. This has varying levels of success based on how many new machines your systems go out and visit. Workstations browsing the internet are going to hit a lot of new IP addresses on a daily basis. Servers probably aren't going to go out and talk to a whole lot of different devices. Specialized devices, such as OT (operational technology) devices, won't talk to a lot of machines; their communication is pretty standard. So we can actually use that to understand what's going on.
If I come in here, let's run that query. The concept is I'm using an all-time query, but that's only because I'm using this BOTS v3 data; I have notes in my PowerPoint on how to turn this into a 90-day rolling window, but to make this work on the BOTS data I had to set my time a little differently, so I'll show you how that looks. What we're going to do is index=botsv3 sourcetype=stream:tcp, and here is the magic: we're just going to use a stats command. If I just did stats count by source IP, destination IP, that would give me every tuple I've seen during this window. But if I instead take the stats min of _time, it gives me the earliest time, the smallest time value seen for each tuple. So this is giving me the earliest time each tuple popped up: I've got a 90-day rolling window of tuples, each tagged with the earliest value seen. If I run it just like this, we'll see the earliest time come back.
If I undo that, now I'm going to come in here and change it: I want to set a time, and I want to know any time this earliest time is greater than it. In a normal setup I might go back 86400, the number of seconds in a day, so I'd be looking for any new tuple within a day. Here I had to use this specific value to move the boundary to a new day based on the BOTS v3 data: there are only about two and a half days' worth of data in the BOTS data set, so to make it work I had to put this specific timestamp in. Normally you would use something like now() minus 86400, and I'll show that. But we're going to come down here: where earliest time is greater than that cutoff. If the first time a tuple was seen is greater than the cutoff, we get the values back; if not, they won't show up. And if I set the cutoff one day back, it will only show me the new tuples, ones I've never seen in 90 days, that have shown up today. So if I run this, flipping it to fast mode, it comes back with all the brand-new tuples. I'll tell you, this is still too large a list, but part of this list would normally drop out. The fact is, the bigger the window you make, the fewer values you will have. If I'm looking at one day of history against the new values, you're going to have more results; if I go 90 days, the number of new tuples will shrink. The bigger the window over here, the smaller the set of results that comes back, because more of those every-now-and-then destinations will be included in my list.
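Put together, the query from that demo looks roughly like this. The cutoff timestamp is illustrative only: the BOTS v3 events sit in a fixed span in 2018, so you have to pin the "day" boundary yourself rather than using now().

```
index=botsv3 sourcetype=stream:tcp
| stats min(_time) AS earliest_time BY src_ip dest_ip
| eval new_time = strptime("2018-08-20 00:00:00", "%Y-%m-%d %H:%M:%S")
| where earliest_time > new_time
```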
All right, this works. Let's grab something even a little easier to grasp. Now I'm showing processes: when I look at this data I'm looking at processes firing off. This is calculator being run, Application Frame Host, CrashPlan desktop; these are processes on a machine. I want to know if new processes have fired off. We're going back to the exact same query; this time we're grouping by instance, which is like this process name here (sorry, instance is Application Frame Host), and the host. So we're going to group by instance and host, again grab the earliest time each pair was seen, do an eval of the cutoff time, and then run it. We're basically asking: what are the newest instances that have fired up in the last 24 hours that I've not seen over my whole time period? I run that, and you can see, the processes running on a system are going to be far fewer. We get back the new processes that ran on this machine in the last 24 hours, and you can see what happened: scp and ssh are brand-new processes. If I was doing an investigation, and all of a sudden machines that have never done it start invoking scp and ssh, that's probably something I want to be looking at. So by baselining and knowing what your systems run, when new processes fire we can look at them and ask, do I want to look at this? We can build alerts off of it.
Let's jump to another example, this time listening ports. The set of ports your machine is listening on should be very static; it's not going to change a ton. But if someone's opened up new applications, they might be opening up new listening ports, so you want to look at that. We can see the data coming back here: what machine, what ports are being opened, what they're listening on. We use this very same concept: min(_time) as earliest time, and this time we're looking at host and dest port; that's my pairing, that's what I'm looking for anomalies in. Grab the earliest time seen, grab the window I want to track, and say where the earliest time is greater than the cutoff. On this one, make sure I flip it to verbose mode, because ports are really static. What a surprise: I get zero results back, and that's actually what I'm looking for; that works out really well for me. So I've shown three examples of how you can grab any form of data, look for what you want to find, group it by what's normal, grab a big window, and then set your cutoff to say: show me any new tuple that I've seen since that time.
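For the listening-ports example the shape is identical; the sourcetype and field names below are assumptions on my part (the talk doesn't name them on screen), so substitute whatever your port-monitoring data actually uses:

```
index=botsv3 sourcetype=netstat
| stats min(_time) AS earliest_time BY host dest_port
| eval new_time = strptime("2018-08-20 00:00:00", "%Y-%m-%d %H:%M:%S")
| where earliest_time > new_time
```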
So if we jump over here, we can quickly see what this looks like in the real world; this is how we do it at my place: now() minus 86400, over the last 90 days. We do the now() minus 86400, and that says: give me a 90-day window, go back one day. Very simple. We don't change much, and we just have that working.
Now if we come over here, we can do the exact same thing with our Splunk searches but take it to the next level, another way. Instead of saying I want a 90-day window (there's a problem with a 90-day window: as soon as an anomalous event occurs, it's not going to be anomalous tomorrow), what we can do is actually build a lookup. I'm going to do the same concept, tuple things together, but this time I don't need a time: I'm just going to grab all of my tuples and outputlookup them into a CSV, or I could use a KV store, and that becomes my baseline. New anomalous events will not repopulate it unless I rerun this outputlookup. So when I run this, this builds my baseline. Then I'd have a scheduled search, or whatever, that would search, again doing a stats count, then a lookup matching on source IP and dest IP with an output field I'll call "matched," and then a where isnull(matched), meaning I've got a source IP and destination IP pair that is not in this lookup; that's what makes it null, and this will alert me. And if tomorrow the same source IP and destination IP appears, it will also alert me, because as long as I don't update the lookup table it will always be anomalous.
So there are pros and cons. The rolling-window versions I did over here, with the "where earliest greater than," give a dynamic, growing list; over here we're building a lookup list and doing a match. Same principle: over here we can take the perfmon process data, exact same thing, output it to a CSV, then set up a search to run every day that does the lookup on instance and host and checks whether it matched. Or we can go to listening ports: output the lookup and do the same. One thing you could do is actually take an evaluation of the two: take all the alerts that are popping each day and compare them to this list, and see how much variance there is. You could grab a 90-day table and compare it to this outputlookup. There are a lot of ways to evaluate how much is changing in your environment, but the big key is: use your historical data to create a baseline, and search on it.
All right, a basic summary: in that video we showed how we can use the stats command to baseline normal behavior from historical data, and use that baseline to identify new events. We were able to detect anomalous network connections, anomalous processes, and anomalous open ports. We then did those very same things with a CSV: we baselined normal behavior and were able to use that CSV to detect the same things: network connections, processes, hosts.
So, there are some gotchas you need to be aware of. This is a really cool process, but as you start in on it, don't let the gotchas get you, and don't let the quest for perfection get in the way of getting something done or having a good product. The rolling window and allow list will get you a good answer. It's not perfect, and there will be some gotchas along the road, but it will get you most of the way. Now that you've got those baselines, here are some things you want to be careful of.
Rolling window: you're going to be alerted one time that the anomalous connection occurred, and then, if you remember that X and Y, the X being the baseline and Y being the new events, the new events from Y are going to roll into X, and that anomalous event becomes part of your baseline. So you'll detect once, and then your anomaly is part of your baseline; you need to be aware of that, and of how often you run the alert each day, because the frequency interacts with that. Also remember that you can have a small Y window: say I'm going to look at a 90-day window and I just want to look at one second, so Y will be one second and the baseline will be 89 days, 23 hours, 59 minutes, and 59 seconds. The fact is, it's still going to look at 90 days' worth of data. No matter how small your Y window is, the search always takes the time required to scan the entire X and Y window together, so be aware that this alert can take some time to run. It sounds great to run a really long window, all time, a year, two years, but recognize that if you run it every day, you're still running that full query every day. It's going to take time, and you want to make sure it doesn't impact the rest of the stuff you're doing.
Allow listing, on the other hand, runs against whatever window you give it: if you look at the last 10 seconds, it only runs on a 10-second window; if you look at the last hour, it runs on a one-hour window. So it will run faster, but you need to remember a few things. One: how am I going to build that baseline, and how do I get new items into the baseline? You'll need to address that. And remember that a baseline, whether it's a CSV or a KV store, is going to occupy space on your search head, and you can run out of disk space; you only have so much. Typically you build out a lot of space on your indexers; search heads are not huge on disk space. Just be aware as you start to build large baselines: one, you'll have performance issues, because the more values the search has to check against, the slower it will run, and two, it's going to take up physical disk space on your machine. That's something you need to be aware of.
I'm going to recommend a hybrid approach, and that's the ability to combine both: we're going to do a rolling window and allow listing. The basic concept is "your query goes here": you're going to write your query, and then you're going to use this collect command. This is not a talk about all the different syntax in Splunk, but just know that if you use this pipe-collect command, you will write to a summary index. Summary indexes are a form of indexing that do not count against your ingestion license; you can write to a summary index and then query that index just like you could query any other index. So you can save your results in an index. The concept is, if I want to build these, I might run a query every day and write it to the index; it will timestamp it with today's information, tomorrow will have tomorrow's information, yesterday has yesterday's information, and we can query and search it. So you'll basically write your query, then run the collect command: index=summary, source= and give it a name.
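A minimal sketch of that daily job, using the process example; the source name is whatever you choose, and the cutoff again stands in for the demo's fixed timestamp:

```
index=botsv3 sourcetype=PerfmonMk:Process
| stats min(_time) AS earliest_time BY instance host
| where earliest_time > now() - 86400
| collect index=summary source=new_process
```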
Then, now that you've done that (that's building your alert), you're going to come in here, look at that summary data, and append to it the results that fired for that day. You look at the last span of time it ran; maybe you run this once a day, so you look at yesterday's results, put that in here, and then use this append command, which will append the lookup. I said "allow list"; it should really be a disallow list. That was bad writing on my part; gotta be careful with the descriptions you use. This is a list of things you consider to be anomalous, so if you see these, you want to flag on them. It's not like what I did before, where the CSV was my normal baseline; these are bad events, events I don't want. So I'm going to do the inputlookup of the allow-list CSV, and I'm going to do a table on the matching fields, whatever matched from the query over here, and then a stats count by the matching fields. That basically dedups: for those who know it, the dedup command removes duplicates, but stats does the same thing and is more efficient. You can write dedup if you want, but I recommend you learn the power of the stats command; it's fast, and it's just the right command to use. So stats count by the matching fields removes the duplicates: if a tuple was in the summary index and it's in my lookup, we're not going to write it in twice. Then we write it back out to the allow-list CSV, which updates it, meaning all the new things that were found get written into the lookup; you'll have an updated lookup with the results combined.
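That maintenance step, sketched; new_process.csv is an illustrative filename, and inputlookup append=true pulls the existing lookup rows in alongside the summary results before the stats dedups them:

```
index=summary source=new_process
| table instance host
| inputlookup append=true new_process.csv
| stats count BY instance host
| fields instance host
| outputlookup new_process.csv
```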
So an example of that would be: index=botsv3, sourcetype=PerfmonMk:Process, stats min(_time) as earliest time and max(_time). This is all the exact same query; nothing's really changed. The difference is, after I do this eval of the cutoff time, I'm going to do a lookup using the name of that CSV: instance as instance, host as host, output instance as recurring. I need a value that shows, hey, I matched the CSV. And I actually changed this, because I'd forgotten max time: I want a min time and a max time. The reason is that the min time is used to see whether the value falls in the X of the X/Y on my rolling window, and the max time is used to find whether it's in the Y area; we'll explain that. So I still have the same "where earliest time is greater than the cutoff," which says, hey, I've never seen this event before; or "recurring equals star," meaning, hey, I got a match on this value, and the latest time is greater than the cutoff, meaning it's in the Y section. That means this alert, which is on my list of things I don't want to keep seeing, things I don't want to allow, just showed up again. That way you'll be notified again that it occurred. You're updating your lookup file, and you're using a rolling window, so you kind of get the best of both worlds, and you can use this as a method to automate keeping up to date on any of your alerts. And now we're going to demo that; I've got a video of it, we're going to go watch that, and then we'll come back.
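Assembled, that hybrid detection search looks roughly like this. I've written the "recurring equals star" existence test as isnotnull(recurring), which is the where-clause equivalent, and used now() - 86400 where the demo pins a fixed timestamp:

```
index=botsv3 sourcetype=PerfmonMk:Process
| stats min(_time) AS earliest_time max(_time) AS latest_time BY instance host
| lookup new_process.csv instance, host OUTPUT instance AS recurring
| where earliest_time > now() - 86400
    OR (isnotnull(recurring) AND latest_time > now() - 86400)
```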
All right, so this is the hybrid approach we've been talking about. It's going to look exactly like what we did before: you've got yourself the normal query; we're going to make some form of query here that builds our processes, and then we're going to write a collect command. This collect command will write our results into a summary index; that's the index=summary, and then source=new_process, which defines the name of the source for this summary index. So we do that, run the results, and we can see that we got two values coming back. If we jump over here, this is where we take that very same data: we can see the summary index being written, there are my two results, and we can query them just like any other index. You'll notice index=summary, source=new_process; I'm going to use this append command, which adds this lookup, this new_process.csv, and then we put in the table of the instance and the host. I'm just going to do a stats count by instance and host; that's basically going to dedup, taking the summary-index results and the inputlookup and making them one if there are any duplicates, so I don't get them twice. That's what that command does, and it's faster than dedup. Then I'm going to outputlookup to new_process.csv, which writes over the original new_process.csv, updating it with any new values coming from this summary index.
So we can see that being run: if I go over here and run this, what it does is grab the summary-index content, grab what was already in my CSV, and write them together, so now I have four values, and those all got written to the CSV. Now I'm going to write the query I've been doing, the rolling window, all over again. The difference is I'm going to add a max time in there: it's not just a min time, it's also a max time, so I can look at the Y side of the equation. Then I do this lookup: new_process.csv, instance as instance, host as host, output instance as recurring. I need the output instance to show me what matched on this lookup; it's kind of like a join command, joining the CSV to the previous values, and whatever matches gets output. Then I do the same thing I've always been doing, earliest time greater than the cutoff, that's the normal part, and then we add this "recurring equals star": recurring equals star means it matched on something, I have a value, and latest time greater than the cutoff. That checks: is there a value in the Y window, and is it a recurring field? If it is, that's going to alert. We can see that being run, and we're going to see those two fields come back: these two fields occurred during the new window, and they'll keep showing up as often as they occur.
All right, we basically showed how we can combine those two approaches in a hybrid approach. We created our lookup of anomalous behavior so those events don't get excluded, and we used it to keep validating those events when they recur. And there are other things we can do: we can look at the results of one against the other and see whether there are changes in our environment, and how much change is going on. There are a lot of things this gives you the ability to gain more insights into.
All right, so what's next? I've shown you how you can build baselines, I've shown you examples of them, and I've given you multiple methods. What I want you to do now is look at your environment and think, right now: what do I have in my environment that I could baseline? What could I grab, what logs could I use, where could I take that very same approach? I want you to think about it right now, write it down, and let's go do it. This video is great, but if you don't take action on it, this video will not have served its full purpose. So take the time right now to think: what logs do I have that I can use, and what approach can I take to make a baseline and check for anomalous events? Thank you so much for your time, and I now open it up to questions.