Welcome to another L.A.M.E. (Log Analysis Made Easy) tutorial. In this one, we're going to talk about stats, eventstats, and streamstats. This tutorial will brief you on the differences between the three, because they are slightly different. I'm going to try a few different ways to show it, and hopefully, by the end of this tutorial, you'll have a good idea of how they can be used. I'll put out another video after this one with use cases, for example analytic hunting and queries you might actually use. But let's start. First off, the stats command. I started here with index=_internal, tabled source and sourcetype, and then asked stats for the distinct count of source by sourcetype. dc is distinct count. This is just something simple you can do anywhere: get all the distinct sources per sourcetype. When I ran that, I see that this sourcetype, splunk_assist_internal_log, has two sources. This one has three. Most of these have just one. This one, splunkd, has four sources. And what you'll note is that it takes 55,151 events and collapses them down. stats is a transforming command. I use these terms in case you ever want to get Splunk certified and hear these things. Transforming commands take logs and change them primarily into tables: they take the raw log format and turn it into a table, and stats collapses the events along the way, like here, a massive reduction. So that's stats. Before we move to eventstats, here's another example of the same command, this time on my Corelight index, looking at my connection logs with source IP and destination IP. I'm still staying with stats: give me the distinct count of destination IPs per source IP. In other words, how many different IP addresses did each source IP talk to?
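The two searches described above can be sketched roughly like this. This is a sketch, not the exact searches from the video: the Corelight index name and the src_ip/dest_ip field names are assumptions and will vary with your environment.

```spl
index=_internal
| stats dc(source) AS distinct_sources BY sourcetype

index=corelight sourcetype=conn
| stats dc(dest_ip) AS distinct_dests BY src_ip
```

Each search returns one row per by-group, which is why tens of thousands of events collapse into a handful of table rows.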
There were 31,800 total events, but it only displays 81 rows because stats collapses them down. I can see that 192.168.0.103 went to 33 different addresses, then 25, 7, 10, 40, 43, etc. And that is stats. Now let's look at eventstats. Going back to my original example, we had 155,118 events. Here, the exact same query with eventstats gave me a distinct count on the internal index. What you'll notice is I had 155,118 results come back (close enough; the difference is just when the searches ran), and how many rows display? 155,118. With eventstats, the statistics show up on the individual lines of the entire group: it looks at the whole dataset, computes the statistical numbers, and writes them onto each event. As we scroll down, when the sourcetype changes to the Splunk metrics log, the value changes too. Now we have this access log, and there's one unique source, then two unique, and each event carries its group's value: two, two, and so on down the lines. So basically, eventstats computes the statistic and each line gets it added to it. Another example using my Corelight logs. Here's my source IP, here's my destination IP. One thing you'll notice: be careful here, because you can lose events to limits. This search capped out at 10,000; I would need to do something different to bring back more than 10,000 events. But we're just going to move on and ignore that; if I let the limits be bigger, it would be 31,780 events. And so I come back, and we can see: how many different IP addresses did 0.0.0.0 talk to? One. It doesn't matter how many times it shows up in the results; it only ever talked to one address. Now here, for .133, it says there were two. We can see the first one, 192.168.0.125: 125, 125, 125, still the same. But somewhere down here, there's going to be... oh, there it is.
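The eventstats version of the Corelight search would look something like the sketch below. Again, the index and field names (corelight, src_ip, dest_ip) are assumptions; the key difference from stats is that every original event is kept and simply gains a new field.

```spl
index=corelight sourcetype=conn
| eventstats dc(dest_ip) AS distinct_dests BY src_ip
| table src_ip dest_ip distinct_dests
```

If a source IP talked to two distinct destinations, every one of its events shows distinct_dests=2, no matter how many events that source produced.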
This one here is my second one, and that's why we have two. But it marks two on every one of these events. Same if I had something with three or more, like here, 44: if I counted them all, there would be 44 distinct IP addresses across all the pairings for that source. Here, I've got two for 251119; that's why I've got two. So eventstats takes your entire dataset from beginning to end, does its mathematical analysis, and every event that came back gets that value written into it. streamstats does it slightly differently. Let me show my last example here. This one's my last example. Nope. Where did I put that? Okay, this one here. streamstats actually works very similarly to eventstats, but it takes each event as it comes through the stream from the indexer, computes the statistic, and the value keeps growing. For example, here I did a head 100. I'm not going to use any field values; I'm just going to say streamstats count. So if I'd done head 100 and then stats count, guess what the count is going to be? 100, or less if there aren't 100 events that come back. But if I do streamstats count as an event count, I can see it growing. And I'm going to table it. The very first event that comes back: how many total events are there so far? One. When the second event comes back, how many are there? Two. When the third one comes in? Three. 4, 5, 6, 7, etc., until I reach the end, and it's 100. So the statistical number keeps growing as items come through the stream. eventstats totals the entire bundle from beginning to end and puts the numbers on each line; streamstats takes each event as it comes through and does the math cumulatively. So let's put this into practice. These are my internal logs.
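The running-count example just described can be sketched like this (the field name event_count is my own label, not from the video):

```spl
index=_internal
| head 100
| streamstats count AS event_count
| table _time event_count
```

The first row shows event_count=1, the second 2, and so on up to 100, whereas `| stats count` on the same 100 events would return a single row with count=100.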
Source: we're doing the distinct count, and it's 1, 1, 1, 1, 1. Nothing's changing. Is there a place where something changes? Too much to scroll through here. Alright, let's see. We might go to my Bro logs; that'll make it easier. Yeah, there are too many of these to mess around with, so we'll go to Bro. I did streamstats. Not that one. streamstats here, on the IP pairings. And we can see: one. How many distinct destination IPs has 468 talked to so far? One. When it gets to the next event, is it seeing anything new? Nope, so it's still one. Seeing anything new? Nope, it's one. Anything new? Nope, one. Oh wait, this is a new IP pairing, so the number jumps to two. Now it flips back to the earlier destination, but it's already seen that one, so it stays at two: 2, 2, 2, 2, 2. Then when it reaches a brand-new pair, how many times has it seen this source talk to this destination? It goes back to one, and then it grows again because, oh, there's a new communication there. So 2, 2, 2... and on a brand-new pairing, it starts back at one. That's what streamstats will do: when you have a by clause, each distinct value of the by field gets its own running count, so a new pairing starts over at one. If I didn't put a by in there, this number would just keep growing each time it finds a new distinct destination IP; it would just keep adding up. So: you've got stats, which aggregates all of your events into a very summarized form and computes statistics over the entire collapsed set. Then you have eventstats, which grabs the entire dataset from beginning to end, does the mathematical statistics on it, and adds that value to each event, repeating it; if there were seven distinct values in a group, all seven events would carry the exact same number. And streamstats? It's ordered: each event coming through the pipe, through the stream, updates your statistics.
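A sketch of the streamstats-with-by search just walked through; the index name (bro) and field names (src_ip, dest_ip, seen_dests) are assumptions for illustration:

```spl
index=bro sourcetype=conn
| streamstats dc(dest_ip) AS seen_dests BY src_ip
| table src_ip dest_ip seen_dests
```

Each source IP keeps its own running distinct count, so the column reads 1, 1, 2, 2... and starts at 1 again when a new source appears. Dropping the `BY src_ip` would give one global running count that only ever grows.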
And so it's a different way of looking at it. All three are different ways of looking at your statistics, different ways of getting some understanding of the data as it flows through. But that's the basic principle. If you want it quick and dirty, just a summarized bit of data, stats is your key. streamstats is what you reach for when you're looking for anomalies or averages over time, over a period. I'll be showing another tutorial right after this one with useful queries where you can change the windows and change how things group together. streamstats is an amazing command for knowing whether previous values have an effect on future values, especially when looking for anomalies. Anyway, I hope this helps you on your journey from being a L.A.M.E. analyst to a Splunk ninja. If you liked this, feel free to subscribe to my channel. Please put down below any comments or questions you might have, or any content you want me to do a video on. I love to hear from you, and I like to make content you want to see. Anyway, I hope you'll keep coming back and keep watching these videos.