
Splunk Tutorial For Beginners | Stats vs Eventstats vs Streamstats Command in Splunk

  • 0:00 - 0:09
    [Music].
  • 0:10 - 0:13
    Welcome to another L.A.M.E. (Log Analysis
  • 0:13 - 0:15
    Made Easy) tutorial. In this one, we're going
  • 0:15 - 0:19
    to talk about stats, event stats, and
  • 0:19 - 0:22
    stream stats. And we're... Basically,
  • 0:22 - 0:24
    this tutorial will brief you on
  • 0:24 - 0:27
    the difference between the three, and
  • 0:27 - 0:28
    they are slightly different. I'm going to
  • 0:28 - 0:29
    try a few different ways to
  • 0:29 - 0:31
    show it, and hopefully, by the end of this
  • 0:31 - 0:34
    tutorial, you'll have a good idea of
  • 0:34 - 0:36
    how they can be used. I'll put another
  • 0:36 - 0:39
    video after this one with use cases, for
  • 0:39 - 0:41
    example, analytic hunting and stuff that
  • 0:41 - 0:43
    you might actually use the different
  • 0:43 - 0:46
    queries for. But let's start. First off, the
  • 0:46 - 0:48
    stats command. I just started here with
  • 0:48 - 0:51
    index=_internal, table source
  • 0:51 - 0:53
    and sourcetype, then stats, give me the
  • 0:53 - 0:55
    distinct count of the source by the
  • 0:55 - 0:59
    sourcetype. DC is distinct count.
  • 0:59 - 1:01
    Looking at an internal log here. I
  • 1:01 - 1:02
    just want to do something you can do
  • 1:02 - 1:05
    anywhere you want, and I'm just
  • 1:05 - 1:07
    getting all the distinct sources by
  • 1:07 - 1:09
    sourcetype. When I ran that, I see that
  • 1:09 - 1:13
    this sourcetype, Splunk_assist_internal_log,
  • 1:13 - 1:15
    has two sources. This one
  • 1:15 - 1:17
    has three. Most of these just have one.
  • 1:17 - 1:20
    This one, splunkd, has four sources. And
  • 1:20 - 1:25
    what you'll note is it takes 55,151 events
  • 1:25 - 1:27
    and collapses them down.
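
As a rough sketch, the search being described here probably looks something like this; the exact query isn't shown in the transcript, so treat it as an assumption built from what's said:

    index=_internal | table source sourcetype | stats dc(source) by sourcetype

Run over the 55,151 raw events, that collapses everything down to one row per sourcetype with its distinct source count next to it.
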
  • 1:27 - 1:30
    It is a transforming command. I use these terms
  • 1:30 - 1:31
    in case you ever want to get Splunk
  • 1:31 - 1:33
    certified or hear these things. These are
  • 1:33 - 1:35
    transformation commands. Transformation
  • 1:35 - 1:38
    commands take logs and change them,
  • 1:38 - 1:41
    primarily, into tables. Stats takes the
  • 1:41 - 1:43
    raw log format, turns it into a table, and
  • 1:43 - 1:45
    collapses it, like
  • 1:45 - 1:48
    here, a massive reduction. Anyway,
  • 1:49 - 1:53
    we've done that. So let's show stats.
  • 1:53 - 1:56
    Let's show event stats. Event stats is
  • 1:56 - 2:01
    going to take--oh, here’s another
  • 2:01 - 2:02
    example of that command we’re going to
  • 2:02 - 2:05
    just use. This is my Corelight index. I'm
  • 2:05 - 2:07
    looking at my connection logs. I’m doing
  • 2:07 - 2:08
    source IP, destination IP. I'm still staying
  • 2:08 - 2:11
    with the stats command. Here, I'm asking it to
  • 2:11 - 2:12
    give me all the distinct
  • 2:12 - 2:15
    counts of destination IPs for each source IP.
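
A guess at what that search looks like; the index name and the src_ip and dest_ip field names are assumptions, since Corelight data can land under different field names depending on how it's onboarded:

    index=corelight sourcetype=conn | stats dc(dest_ip) by src_ip

One row comes back per source IP, with its distinct destination count next to it.
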
  • 2:15 - 2:18
    So how many different IP addresses did
  • 2:18 - 2:21
    each source IP go to? There were 31,800
  • 2:21 - 2:23
    total events, but it only displays 81
  • 2:23 - 2:27
    because it collapses them down. I can see
  • 2:27 - 2:31
    that 192.168.0.103 went to 33 different
  • 2:31 - 2:36
    addresses, 25, 7, 10, 40, 43, etc. And that is
  • 2:36 - 2:39
    stats. Now look, let's look at event stats.
  • 2:39 - 2:41
    Event stats, going back to my original
  • 2:41 - 2:45
    example, we had 155,118 events shown.
  • 2:46 - 2:48
    Here, the exact same query gave me a
  • 2:48 - 2:50
    distinct count on this
  • 2:50 - 2:54
    internal. What you'll notice is I had
  • 2:54 - 2:58
    155,118 results come back, close enough;
  • 2:58 - 3:01
    clearly, the small difference is just based on when they ran.
  • 3:01 - 3:05
    And how many does it display? 155,118.
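
The eventstats version of that first search might look like this; the rename to source_count is my own addition so the new field has an obvious name:

    index=_internal | eventstats dc(source) as source_count by sourcetype

Unlike stats, it keeps every one of the 155,118 events and simply writes the group's distinct count into a new field on each of them.
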
  • 3:29 - 3:32
    All your statistics show up on the individual lines for the
  • 3:32 - 3:34
    entire group. So, it's going to go look at
  • 3:34 - 3:36
    this entire dataset and come back with
  • 3:36 - 3:39
    the statistical numbers on each line.
  • 3:39 - 3:41
    And so, we can... if we move on we'll see
  • 3:41 - 3:43
    when the Splunk metrics log
  • 3:43 - 3:45
    changes. Somewhere down the line, we'll
  • 3:45 - 3:48
    eventually get there. It changes. Now we
  • 3:48 - 3:50
    have this access log, and there's just
  • 3:50 - 3:53
    one unique, two unique. And so each--here
  • 3:53 - 3:55
    you’ve got the two. You
  • 3:55 - 3:59
    can see two, blah blah blah. And down the lines
  • 3:59 - 4:00
    we go.
  • 4:01 - 4:03
    So basically, this is just stating the
  • 4:03 - 4:05
    statistics, and each line gets its values
  • 4:05 - 4:06
    added to it.
  • 4:06 - 4:10
    Another example using my Corelight logs,
  • 4:10 - 4:13
    hopefully, this pushes out. Here's my
  • 4:13 - 4:16
    source IP. Here's my destination IP.
  • 4:16 - 4:17
    One of the things you’ll notice: be
  • 4:17 - 4:20
    careful with stats. You lose values when
  • 4:20 - 4:24
    you use stats. So here, it stops at 10,000.
  • 4:24 - 4:25
    I would need to do something
  • 4:25 - 4:27
    different to allow me to
  • 4:27 - 4:29
    bring back more than 10,000 events. But
  • 4:29 - 4:31
    just so you know, we're just going to move
  • 4:31 - 4:33
    on and ignore the fact that if I let
  • 4:33 - 4:35
    the limits be bigger, it would be
  • 4:35 - 4:39
    31,780 events. And so I come back, and we
  • 4:39 - 4:42
    can see how many times did zero, how many
  • 4:42 - 4:45
    different IP addresses did 0.0.0.0 talk to?
  • 4:45 - 4:48
    One. This is it, and it doesn’t matter how
  • 4:48 - 4:50
    many times it shows up. It only talked
  • 4:50 - 4:52
    to one. Now here, we can see
  • 4:52 - 4:56
    133. It says there were two. We can
  • 4:56 - 5:04
    see the first one, 192.168.0.125. 125, 125, 125, still the same.
  • 5:04 - 5:05
    But somewhere
  • 5:05 - 5:07
    around here, there's going to be--oh, there
  • 5:07 - 5:10
    it is. This one here, there's
  • 5:10 - 5:11
    my second one, and that's why we have
  • 5:11 - 5:14
    two. But it marks two for every one of these events.
  • 5:16 - 5:18
    And same if I had something with
  • 5:18 - 5:21
    three or more, like here 44. If I count it
  • 5:21 - 5:24
    all, there will be 44 distinct IP
  • 5:24 - 5:27
    addresses in all these pairings that
  • 5:27 - 5:32
    go together. Here, I've got two, which is 251119.
  • 5:32 - 5:34
    That's why I've got two.
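
A sketch of the kind of eventstats search that produces the table being walked through above; again, the index and field names are placeholders I'm inferring from the video:

    index=corelight sourcetype=conn
    | eventstats dc(dest_ip) as dc_dest by src_ip
    | table src_ip dest_ip dc_dest

Every row keeps its own src_ip and dest_ip pair, and dc_dest carries the same group-wide number, which is why a source with two distinct destinations shows a 2 on every one of its rows.
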
  • 5:34 - 5:38
    So event stats, it'll take the entire
  • 5:38 - 5:41
    beginning to end of all your
  • 5:41 - 5:43
    data, do its mathematical analysis, and
  • 5:43 - 5:46
    every log that came back will get that
  • 5:46 - 5:49
    value written into it. Stream stats does it
  • 5:49 - 5:52
    slightly differently. Stream stats, I'm
  • 5:52 - 5:55
    going to show my last example here.
  • 5:55 - 5:59
    This one’s my last example. Nope.
  • 5:59 - 6:02
    Where did I put that? Okay. This one
  • 6:02 - 6:05
    here, I’m just going to show--stream stats
  • 6:05 - 6:07
    actually works very similarly to what event
  • 6:07 - 6:10
    stats does, but it takes each line as it
  • 6:10 - 6:13
    comes through the stream from the
  • 6:13 - 6:15
    indexer and computes it and keeps
  • 6:15 - 6:19
    growing. So for example here, I did a head
  • 6:19 - 6:21
    100. I’m not going to use any of the
  • 6:21 - 6:22
    values. I’m just gonna say stream
  • 6:22 - 6:25
    stats count. I just want to know. So
  • 6:25 - 6:28
    if you’d done stats count, if I’d done a head
  • 6:28 - 6:30
    100 and I do a stats count, guess what the
  • 6:30 - 6:33
    count’s going to be? 100 or less if
  • 6:33 - 6:35
    there aren’t 100 values that come back.
  • 6:35 - 6:37
    But if I do stream stats, my
  • 6:37 - 6:39
    count, I call it event count, so I can see it
  • 6:39 - 6:42
    growing. And I’m going to table it. And
  • 6:42 - 6:43
    the very first value that comes back, it
  • 6:43 - 6:45
    says, how many total events are there?
  • 6:45 - 6:47
    Well, when the first event comes back,
  • 6:47 - 6:49
    there’ll be one. Then when the second
  • 6:49 - 6:50
    event comes back, how many will
  • 6:50 - 6:53
    there be? Two. When the third one comes in
  • 6:53 - 6:55
    line, how many will there be? 3.
  • 6:55 - 6:58
    4, 5, 6, 7, etc., until I reach
  • 6:58 - 7:02
    the back, and it’s 100. So what happens is
  • 7:02 - 7:05
    the statistical number keeps growing as
  • 7:05 - 7:08
    the items come through the stream.
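
Roughly, the running-count example being described; the _internal index is an assumption, since the base search isn't stated:

    index=_internal | head 100 | streamstats count as event_count | table _time event_count

Where stats count would collapse those 100 events into a single row that says 100, streamstats writes 1, 2, 3, and so on into event_count as the events flow past, reaching 100 on the last one.
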
  • 7:08 - 7:09
    Event stats takes the
  • 7:09 - 7:12
    entire bundle from beginning to end,
  • 7:12 - 7:14
    computes the statistical numbers, and puts them on
  • 7:14 - 7:16
    each line. Stream stats takes each line
  • 7:16 - 7:19
    as it comes through and does the math on
  • 7:19 - 7:21
    them. So let’s show another, kind of
  • 7:21 - 7:25
    putting this into practice here. These are my
  • 7:25 - 7:28
    internal logs. Source--we’re doing the distinct count.
  • 7:28 - 7:31
    1, 1, 1, 1, 1. And we could basically... okay.
  • 7:31 - 7:33
    So, 1, 1, 1, 1, 1. Nothing's changing.
  • 7:34 - 7:36
    Is there a place where we get
  • 7:36 - 7:39
    something that changes?
  • 7:43 - 7:46
    Too much. Alright. Let’s see.
  • 7:46 - 7:48
    We might go
  • 7:48 - 7:52
    to my bro log. Make it easier.
  • 7:53 - 7:55
    Yeah. Too many of these to mess
  • 7:55 - 7:58
    around with. We’ll go to bro. I did
  • 7:58 - 8:01
    stream stats. Not that one. Stream stats here.
  • 8:01 - 8:03
    I’m doing IPs.
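
A guess at that search; the bro index and the src_ip and dest_ip fields are assumptions, since only the behavior is described:

    index=bro sourcetype=conn
    | streamstats dc(dest_ip) as running_dc by src_ip
    | table _time src_ip dest_ip running_dc

With the by clause, each source IP keeps its own running distinct count, so the number only climbs when that source is seen talking to a destination it hasn't been seen with before.
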
  • 8:03 - 8:06
    And so we can see here, one.
  • 8:06 - 8:09
    So all these come back. Well, it talked. How
  • 8:09 - 8:12
    many times has 468 talked here? How many
  • 8:12 - 8:16
    distinct IPs? One. Still, when it comes
  • 8:16 - 8:18
    here, is it seeing anything new? Nope. So
  • 8:18 - 8:20
    it's one. Seeing anything new? Nope. It's
  • 8:20 - 8:22
    one. So is it seeing anything new? Nope.
  • 8:22 - 8:25
    It's one. Oh, wait. This is a new IP
  • 8:25 - 8:28
    pairing. So the number jumps to two. Now
  • 8:28 - 8:30
    it flips back, but it’s already seen that
  • 8:30 - 8:34
    one, so it stays at two. 2, 2, 2, 2,
  • 8:34 - 8:37
    2. And then when it reaches a brand new
  • 8:37 - 8:39
    pair, how many times has it seen this one
  • 8:39 - 8:42
    talk to this one? It goes back to one and then
  • 8:42 - 8:44
    it grows again because, oh, there’s a new--
  • 8:44 - 8:46
    there’s a new communication there. So 2,
  • 8:46 - 8:49
    2, 2... Oh, brand new communication, so it
  • 8:49 - 8:52
    resets back to one. And so that’s what
  • 8:52 - 8:55
    stream stats will do. It will, based off
  • 8:55 - 8:58
    your pairing, restart the count each time
  • 8:58 - 9:00
    the field in your by clause
  • 9:00 - 9:04
    changes to a brand-new value. If I didn't
  • 9:04 - 9:05
    put a by in there,
  • 9:05 - 9:08
    this number would just keep
  • 9:08 - 9:09
    growing each time it finds a new
  • 9:09 - 9:11
    distinct count on the destination
  • 9:11 - 9:14
    IP. And basically, it’s just going to keep
  • 9:14 - 9:17
    adding up. So you’ve got stats, which
  • 9:17 - 9:19
    aggregates all of your
  • 9:19 - 9:22
    events into very simplified forms, and
  • 9:22 - 9:24
    it does statistics for the
  • 9:24 - 9:28
    entire, the entire summarized set of
  • 9:28 - 9:32
    data there. Then you have event stats,
  • 9:32 - 9:34
    which grabs the entire data set from beginning
  • 9:34 - 9:36
    to end, does the mathematical statistics
  • 9:36 - 9:38
    on it, and adds that value to each line,
  • 9:38 - 9:42
    repeating it. So if there were seven
  • 9:42 - 9:44
    distinct values here, all seven would
  • 9:44 - 9:47
    have the exact same value. And stream
  • 9:47 - 9:49
    stats? It goes in order. Basically, each
  • 9:49 - 9:51
    item coming through the pipe, through the
  • 9:51 - 9:54
    stream, will change your statistics. And
  • 9:54 - 9:56
    so it’s a
  • 9:56 - 9:58
    different way of looking at it. All three
  • 9:58 - 10:00
    are different ways of looking at
  • 10:00 - 10:02
    statistics on your data. So, just getting
  • 10:02 - 10:04
    some understanding of the data as it
  • 10:04 - 10:06
    flows through. But that's the basic principle.
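
Putting the three side by side on the same hypothetical base search (field names assumed, as above):

    ... | stats dc(dest_ip) by src_ip        (one summary row per source IP)
    ... | eventstats dc(dest_ip) by src_ip   (keeps every event; stamps the group total on each)
    ... | streamstats dc(dest_ip) by src_ip  (keeps every event; the count grows as events stream through)
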
  • 10:06 - 10:08
    If you want it quick and dirty, you want
  • 10:08 - 10:10
    just a summarized bit of data on there,
  • 10:10 - 10:13
    stats is your
  • 10:13 - 10:16
    key. Stream stats is the other
  • 10:16 - 10:17
    example where you're basically looking
  • 10:17 - 10:21
    for anomalies or averages over time, over
  • 10:21 - 10:24
    the period. And I will be showing another
  • 10:24 - 10:26
    tutorial right after this one with useful
  • 10:26 - 10:28
    queries where you can change the windows
  • 10:28 - 10:31
    and change how it groups things together.
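
As a preview of that windowing, streamstats takes a window option that limits the running calculation to the most recent events. This is purely an illustrative sketch, not a query from the video, and the field names are assumptions:

    index=corelight sourcetype=conn
    | streamstats window=10 avg(bytes) as rolling_avg_bytes by src_ip
    | where bytes > 3 * rolling_avg_bytes

Each event gets compared against the rolling average of the last ten events from the same source, which is the kind of previous-values-affecting-future-values check being described.
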
  • 10:31 - 10:33
    But stream stats is an
  • 10:33 - 10:36
    amazing query for being able to know
  • 10:36 - 10:39
    if previous values have an effect on
  • 10:39 - 10:41
    future values, especially when looking for anomalies.
  • 10:41 - 10:44
    Anyway, I hope this helps you in your journey
  • 10:44 - 10:46
    from being a L.A.M.E. analyst to a Splunk
  • 10:46 - 10:49
    ninja. If you like this, feel free
  • 10:49 - 10:52
    to subscribe to my channel. Please, put
  • 10:52 - 10:54
    down below any comments or questions you
  • 10:54 - 10:56
    might have, or any content you want me to
  • 10:56 - 10:58
    do a video on. I love to hear from you
  • 10:58 - 11:00
    guys. I like to do content you
  • 11:00 - 11:02
    want to see. Anyway, I hope you’ll keep
  • 11:02 - 11:05
    coming back, and keep watching these videos.