Welcome to another L.A.M.E. (Log Analysis Made Easy) tutorial. In this one, we're going to talk about stats, eventstats, and streamstats. This tutorial will brief you on the differences between the three, because they are slightly different. I'm going to try a few different ways to show it, and hopefully, by the end of this tutorial, you'll have a good idea of how they can be used. I'll put out another video after this one with use cases, for example analytic hunting and queries you might actually use. But let's start. First off, the stats command. I started here with index=_internal, tabled source and sourcetype, and then asked stats for the distinct count of source by sourcetype. dc is distinct count. This is just something simple you can do anywhere: get all the distinct sources per sourcetype. When I ran that, I see that this sourcetype, splunk_assist_internal_log, has two sources. This one has three. Most of these have just one. This one, splunkd, has four sources. And what you'll note is that it takes 55,151 events and collapses them down. stats is a transforming command. I use these terms in case you ever want to get Splunk certified and hear these things. Transforming commands take logs and change them primarily into tables: they take the raw log format and turn it into a table, and stats collapses the events along the way, like here, a massive reduction. So that's stats. Before we move to eventstats, here's another example of the same command, this time on my Corelight index, looking at my connection logs with source IP and destination IP. I'm still staying with stats: give me the distinct count of destination IPs per source IP. In other words, how many different IP addresses did each source IP talk to?
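The two searches described above can be sketched roughly like this. This is a sketch, not the exact searches from the video: the Corelight index name and the src_ip/dest_ip field names are assumptions and will vary with your environment.

```spl
index=_internal
| stats dc(source) AS distinct_sources BY sourcetype

index=corelight sourcetype=conn
| stats dc(dest_ip) AS distinct_dests BY src_ip
```

Each search returns one row per by-group, which is why tens of thousands of events collapse into a handful of table rows.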
There were 31,800 total events, but it only displays 81 rows because stats collapses them down. I can see that 192.168.0.103 went to 33 different addresses, then 25, 7, 10, 40, 43, etc. And that is stats. Now let's look at eventstats. Going back to my original example, we had 155,118 events. Here, the exact same query with eventstats gave me a distinct count on the internal index. What you'll notice is I had 155,118 results come back (close enough; the difference is just when the searches ran), and how many rows display? 155,118. With eventstats, the statistics show up on the individual lines of the entire group: it looks at the whole dataset, computes the statistical numbers, and writes them onto each event. As we scroll down, when the sourcetype changes to the Splunk metrics log, the value changes too. Now we have this access log, and there's one unique source, then two unique, and each event carries its group's value: two, two, and so on down the lines. So basically, eventstats computes the statistic and each line gets it added to it. Another example using my Corelight logs. Here's my source IP, here's my destination IP. One thing you'll notice: be careful here, because you can lose events to limits. This search capped out at 10,000; I would need to do something different to bring back more than 10,000 events. But we're just going to move on and ignore that; if I let the limits be bigger, it would be 31,780 events. And so I come back, and we can see: how many different IP addresses did 0.0.0.0 talk to? One. It doesn't matter how many times it shows up in the results; it only ever talked to one address. Now here, for .133, it says there were two. We can see the first one, 192.168.0.125: 125, 125, 125, still the same. But somewhere down here, there's going to be... oh, there it is.
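The eventstats version of the Corelight search would look something like the sketch below. Again, the index and field names (corelight, src_ip, dest_ip) are assumptions; the key difference from stats is that every original event is kept and simply gains a new field.

```spl
index=corelight sourcetype=conn
| eventstats dc(dest_ip) AS distinct_dests BY src_ip
| table src_ip dest_ip distinct_dests
```

If a source IP talked to two distinct destinations, every one of its events shows distinct_dests=2, no matter how many events that source produced.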
This one here is my second one, and that's why we have two. But it marks two on every one of these events. Same if I had something with three or more, like here, 44: if I counted them all, there would be 44 distinct IP addresses across all the pairings for that source. Here, I've got two for 251119; that's why I've got two. So eventstats takes your entire dataset from beginning to end, does its mathematical analysis, and every event that came back gets that value written into it. streamstats does it slightly differently. Let me show my last example here. This one's my last example. Nope. Where did I put that? Okay, this one here. streamstats actually works very similarly to eventstats, but it takes each event as it comes through the stream from the indexer, computes the statistic, and the value keeps growing. For example, here I did a head 100. I'm not going to use any field values; I'm just going to say streamstats count. So if I'd done head 100 and then stats count, guess what the count is going to be? 100, or less if there aren't 100 events that come back. But if I do streamstats count as an event count, I can see it growing. And I'm going to table it. The very first event that comes back: how many total events are there so far? One. When the second event comes back, how many are there? Two. When the third one comes in? Three. 4, 5, 6, 7, etc., until I reach the end, and it's 100. So the statistical number keeps growing as items come through the stream. eventstats totals the entire bundle from beginning to end and puts the numbers on each line; streamstats takes each event as it comes through and does the math cumulatively. So let's put this into practice. These are my internal logs.
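The running-count example just described can be sketched like this (the field name event_count is my own label, not from the video):

```spl
index=_internal
| head 100
| streamstats count AS event_count
| table _time event_count
```

The first row shows event_count=1, the second 2, and so on up to 100, whereas `| stats count` on the same 100 events would return a single row with count=100.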
Source: we're doing the distinct count, and it's 1, 1, 1, 1, 1. Nothing's changing. Is there a place where something changes? Too much to scroll through here. Alright, let's see. We might go to my Bro logs; that'll make it easier. Yeah, there are too many of these to mess around with, so we'll go to Bro. I did streamstats. Not that one. streamstats here, on the IP pairings. And we can see: one. How many distinct destination IPs has 468 talked to so far? One. When it gets to the next event, is it seeing anything new? Nope, so it's still one. Seeing anything new? Nope, it's one. Anything new? Nope, one. Oh wait, this is a new IP pairing, so the number jumps to two. Now it flips back to the earlier destination, but it's already seen that one, so it stays at two: 2, 2, 2, 2, 2. Then when it reaches a brand-new pair, how many times has it seen this source talk to this destination? It goes back to one, and then it grows again because, oh, there's a new communication there. So 2, 2, 2... and on a brand-new pairing, it starts back at one. That's what streamstats will do: when you have a by clause, each distinct value of the by field gets its own running count, so a new pairing starts over at one. If I didn't put a by in there, this number would just keep growing each time it finds a new distinct destination IP; it would just keep adding up. So: you've got stats, which aggregates all of your events into a very summarized form and computes statistics over the entire collapsed set. Then you have eventstats, which grabs the entire dataset from beginning to end, does the mathematical statistics on it, and adds that value to each event, repeating it; if there were seven distinct values in a group, all seven events would carry the exact same number. And streamstats? It's ordered: each event coming through the pipe, through the stream, updates your statistics.
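A sketch of the streamstats-with-by search just walked through; the index name (bro) and field names (src_ip, dest_ip, seen_dests) are assumptions for illustration:

```spl
index=bro sourcetype=conn
| streamstats dc(dest_ip) AS seen_dests BY src_ip
| table src_ip dest_ip seen_dests
```

Each source IP keeps its own running distinct count, so the column reads 1, 1, 2, 2... and starts at 1 again when a new source appears. Dropping the `BY src_ip` would give one global running count that only ever grows.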
And so it's a different way of looking at it. All three are different ways of looking at your statistics, different ways of getting some understanding of the data as it flows through. But that's the basic principle. If you want it quick and dirty, just a summarized bit of data, stats is your key. streamstats is what you reach for when you're looking for anomalies or averages over time, over a period. I'll be showing another tutorial right after this one with useful queries where you can change the windows and change how things group together. streamstats is an amazing command for knowing whether previous values have an effect on future values, especially when looking for anomalies. Anyway, I hope this helps you on your journey from being a L.A.M.E. analyst to a Splunk ninja. If you liked this, feel free to subscribe to my channel. Please put down below any comments or questions you might have, or any content you want me to do a video on. I love to hear from you, and I like to make content you want to see. Anyway, I hope you'll keep coming back and keep watching these videos.