Welcome to another L.A.M.E. (Log Analysis Made Easy) tutorial. In this one, we're going to talk about stats, eventstats, and streamstats. Basically, this tutorial will brief you on the differences between the three, because they are slightly different. I'm going to try a few different ways to show it, and hopefully, by the end, you'll have a good idea of how each can be used. I'll put out another video after this one with use cases, for example analytic hunting, where you might actually use these different queries. But let's start.

First off, the stats command. I started here with index=_internal, a table of source and sourcetype, and then stats giving me the distinct count of source by sourcetype (dc means distinct count), plus a plain count. I'm looking at an internal log just because it's something you can run anywhere; I'm simply getting all the distinct sources per sourcetype. When I ran it, I saw that this sourcetype, Splunk_assist_internal_log, has two sources, this one has three, most of them have just one, and this one, splunkd, has four sources. What you'll note is that it takes 55,151 events and collapses them down.
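Pulled together, the search described above looks something like this (a sketch; the exact sourcetype names on your system may differ):

```spl
index=_internal
| stats dc(source) AS distinct_sources count BY sourcetype
```

dc(source) returns the number of distinct source values per sourcetype, and the whole result collapses down to one row per sourcetype.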
It is a transforming command; I use these terms in case you ever want to get Splunk certified or run into them elsewhere. Transforming commands take logs and change them, primarily into tables: they take the raw log format and turn it into a table, and with stats that table collapses, like here, a massive reduction. Anyway, we've done that, so that's stats.

Before moving to eventstats, here's another example of the stats command. This is my Corelight index, and I'm looking at my connection logs with source IP and destination IP, still staying with stats. Here I'm asking for the distinct count of destination IPs per source IP: how many different IP addresses did each source IP go to? There were 31,800 total events, but it only displays 81 rows because stats collapses them down. I can see that 192.168.0.103 went to 33 different addresses, and others went to 25, 7, 10, 40, 43, and so on. And that is stats.
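The Corelight search above would look roughly like this (a sketch; the index and sourcetype names are assumptions, and in Zeek/Corelight conn logs the fields are often id.orig_h and id.resp_h, so adjust the field names to your environment):

```spl
index=corelight sourcetype=corelight_conn
| stats dc(dest_ip) AS distinct_dests BY src_ip
```

One row comes back per src_ip, each carrying the number of distinct destinations it talked to.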
Now let's look at eventstats. Going back to my original example on the internal index, here is the exact same query, but with eventstats giving me the distinct count. What you'll notice is that 155,118 results came back, and how many does it display? 155,118 (close enough; the exact counts shift slightly depending on when each search ran). All of your statistics show up on the individual lines for the entire group: eventstats looks at the entire dataset and writes the statistical numbers back onto each line.

As we move down the results, we'll see the value change when the sourcetype changes, for example at the Splunk metrics log. Somewhere down the line it changes: now we have this access log, and it shows one unique source, then two unique, and each of those lines carries the two, and so on down the lines we go. So basically, it's the same statistics, but each event gets the computed value added to it as a field.
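Swapping stats for eventstats in the earlier search looks like this (a sketch; distinct_sources is just a name I chose):

```spl
index=_internal
| eventstats dc(source) AS distinct_sources BY sourcetype
| table _time source sourcetype distinct_sources
```

Unlike stats, every original event survives, each annotated with the distinct_sources value for its sourcetype.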
Another example, using my Corelight logs; hopefully this drives it home. Here's my source IP, and here's my destination IP. One thing you'll notice: be careful, because you can lose values. Here it stopped at 10,000 results, and I would need to do something different with the limits to bring back more than 10,000 events. Just so you know; we're going to move on and ignore the fact that if I let the limits be as big as needed, it would be 31,780 events.

So the results come back, and we can ask: how many different IP addresses did 0.0.0.0 talk to? One. This is it, and it doesn't matter how many times it shows up in the events; it only ever talked to one address. Now here we can see another source where the count says two. We can see the first destination, 192.168.0.125, again and again, still the same. But somewhere around here there's going to be... oh, there it is. This one here is my second destination, and that's why we have two. But note that it marks two on every one of these events. The same goes for something with three or more, like here, 44: if I counted them all, there would be 44 distinct IP addresses across all the pairings for that source. Here I've got two distinct destinations, and that's why each of these lines says two.
So eventstats takes your entire dataset, beginning to end, does its mathematical analysis, and writes that value into every log that came back. streamstats does things slightly differently. streamstats actually behaves very similarly to eventstats, but it takes each event as it comes through the stream from the indexer, computes on it, and the statistic keeps growing.

For example, here I did a head 100, and I'm not going to use any of the field values; I'm just going to say streamstats count, because I just want to watch the number. If I'd done head 100 followed by a plain stats count, guess what the count would be? 100, or less if fewer than 100 events come back. But if I do streamstats, naming the count event_count and tabling it, I can see it growing. When the very first event comes back, how many total events are there? One. When the second event comes back, how many? Two. When the third one comes in line, how many? Three. Then 4, 5, 6, 7, and so on, until I reach the back, where it's 100. So the statistical number keeps growing as the items come through the stream. eventstats totals the entire bundle from beginning to end and puts the statistical numbers on each line; streamstats takes each event as it comes through and does the math on it as it arrives.
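The head-100 example sketched as a search (event_count is just a name I chose):

```spl
index=_internal
| head 100
| streamstats count AS event_count
| table _time event_count
```

The first row shows event_count=1, the second 2, and the last row 100: a running tally rather than a single total.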
So let's put this into practice. These are my internal logs with source and the distinct count, and it's just 1, 1, 1, 1, 1; nothing's changing, and there are too many of these to hunt through for a place where it does change. Let's make it easier and go over to my Bro log instead. I did streamstats here, and I'm doing it on the IPs.
So all these results come back. How many distinct IPs has this source talked to here? One. When the next event comes through, is it seeing anything new? Nope, so it's one. Anything new? Nope, it's one. Anything new? Nope, still one. Oh, wait, this is a new IP pairing, so the number jumps to two. Then it flips back to the earlier destination, but it's already seen that one, so it stays at two: 2, 2, 2, 2, 2. And then when it reaches a brand new pair, how many times has this source talked to this destination? It goes back to one, and then it grows again because, oh, there's a new communication there. So 2, 2, 2... brand new communication, and it starts back at one.

That's what streamstats will do. When you have a by clause, each by group keeps its own running count, so when the by field changes to a value it hasn't tracked yet, the count starts over at one for that group. If I didn't put a by in there, this number would just keep growing each time it finds a new distinct destination IP.
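That behavior, sketched as a search (again, the index and field names are assumptions; Zeek/Corelight conn logs may use id.orig_h and id.resp_h):

```spl
index=corelight sourcetype=corelight_conn
| streamstats dc(dest_ip) AS running_dests BY src_ip
| table _time src_ip dest_ip running_dests
```

Each src_ip gets its own running distinct count: the first event for a source shows 1, and the number climbs only when that source hits a destination it hasn't talked to before.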
It just keeps adding up. So: you've got stats, which aggregates all of your events into a very simplified form and computes statistics for the entire summarized set of data. Then you have eventstats, which grabs the entire dataset from beginning to end, does the mathematical statistics on it, and adds that value to each line, repeating it; if there were seven distinct values here, all seven lines would carry the exact same value. And streamstats? It's ordered: each item coming through the pipe, through the stream, changes your statistics. So all three are different ways of looking at statistics, different ways of understanding the data as it flows through. That's the basic principle.
If you want it quick and dirty, just a summarized bit of data, stats is your key. streamstats is the one to reach for when you're looking for anomalies, or for averages over time or over a period. I'll be showing another tutorial right after this one with useful queries where you can change the windows and change how it groups things together, but streamstats is an amazing command for knowing whether previous values have an effect on future values, especially when looking for anomalies.
Anyway, I hope this helps you on your journey from being a L.A.M.E. analyst to a Splunk ninja. If you liked this, feel free to subscribe to my channel. Please put down below any comments or questions you might have, or any content you'd like me to do a video on; I love to hear from you guys, and I like to make the content you want to see. Anyway, I hope you'll keep coming back and keep watching these videos.