WEBVTT 00:00:00.000 --> 00:00:08.700 [Music]. 00:00:10.200 --> 00:00:13.040 Welcome to another L.A.M.E. (Log Analysis 00:00:13.040 --> 00:00:15.080 Made Easy) tutorial. In this one, we're going 00:00:15.080 --> 00:00:19.199 to talk about stats, event stats, and 00:00:19.199 --> 00:00:21.720 stream stats. Basically, 00:00:21.720 --> 00:00:24.320 this tutorial will brief you on 00:00:24.320 --> 00:00:26.519 the difference between the three, and 00:00:26.519 --> 00:00:27.920 they are slightly different. I'm going to 00:00:27.920 --> 00:00:29.240 try a few different ways to 00:00:29.240 --> 00:00:31.320 show it, and hopefully, by the end of this 00:00:31.320 --> 00:00:33.760 tutorial, you'll have a good idea of 00:00:33.760 --> 00:00:36.239 how they can be used. I'll put another 00:00:36.239 --> 00:00:39.079 video after this one with use cases, for 00:00:39.079 --> 00:00:40.600 example, analytic hunting and stuff that 00:00:40.600 --> 00:00:42.600 you might actually use the different 00:00:42.600 --> 00:00:45.879 queries for. But let's start. First off, the 00:00:45.879 --> 00:00:48.480 stats command. I just started here with 00:00:48.480 --> 00:00:50.920 index=_internal, table source 00:00:50.920 --> 00:00:53.280 and sourcetype, and stats gives me the 00:00:53.280 --> 00:00:55.320 distinct count of the source by the 00:00:55.320 --> 00:00:59.079 sourcetype. DC is distinct count. 00:00:59.079 --> 00:01:00.559 I'm looking at an internal log. I 00:01:00.559 --> 00:01:01.640 just want to do something you can run 00:01:01.640 --> 00:01:04.720 anywhere, and I'm just 00:01:04.720 --> 00:01:06.560 getting all the distinct sources by 00:01:06.560 --> 00:01:09.479 sourcetype. When I run that, I see that 00:01:09.479 --> 00:01:12.670 this sourcetype, splunk_assist_internal_log, 00:01:12.670 --> 00:01:14.840 has two sources. This one 00:01:14.840 --> 00:01:17.000 has three. Most of these just have one. 
00:01:17.000 --> 00:01:20.119 This one, splunkd, has four sources. And 00:01:20.119 --> 00:01:25.270 what you'll note is it takes 55,151 events 00:01:25.270 --> 00:01:26.660 and collapses them down. 00:01:26.660 --> 00:01:29.880 It is a transforming command. I use these terms 00:01:29.880 --> 00:01:31.360 in case you ever want to get Splunk 00:01:31.360 --> 00:01:33.119 certified or hear these things. These are 00:01:33.119 --> 00:01:34.840 transforming commands. Transforming 00:01:34.840 --> 00:01:38.040 commands take logs and change them into 00:01:38.040 --> 00:01:40.840 primarily tables. A command like this takes the 00:01:40.840 --> 00:01:43.320 raw log format and turns it into a table, and 00:01:43.320 --> 00:01:45.320 with stats it will often collapse, like 00:01:45.320 --> 00:01:48.402 here, into a massive reduction. Anyway, 00:01:49.479 --> 00:01:52.510 we've done that. So let's show stats. 00:01:53.079 --> 00:01:56.200 Let's show event stats. Event stats is 00:01:56.200 --> 00:02:00.759 going to take--oh, here’s another 00:02:00.759 --> 00:02:02.200 example of that command we’re going to 00:02:02.200 --> 00:02:04.960 just use. This is my Corelight index. I’m 00:02:04.960 --> 00:02:06.520 looking at my connection logs. I’m doing 00:02:06.520 --> 00:02:08.479 source IP, destination IP. I’m still staying 00:02:08.479 --> 00:02:10.599 with the stats command. Here, it's going to 00:02:10.599 --> 00:02:12.360 give me all the distinct 00:02:12.360 --> 00:02:14.959 counts of destination IPs per source IP. 00:02:14.959 --> 00:02:17.920 So how many different IP addresses did 00:02:17.920 --> 00:02:21.319 each source IP go to? There were 31,800 00:02:21.319 --> 00:02:23.440 total events, but it only displays 81 00:02:23.440 --> 00:02:26.840 because it collapses them down. I can see 00:02:26.840 --> 00:02:31.040 that 192.168.0.103 went to 33 different 00:02:31.040 --> 00:02:35.959 addresses, and others went to 25, 7, 10, 40, 43, etc. And that is 00:02:35.959 --> 00:02:39.480 stats. 
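The collapsing behavior of stats described above can be modeled outside Splunk. What follows is a minimal Python sketch, not Splunk's implementation: the sample events, the field values, and the helper name `stats_dc` are all made up for illustration. It mimics a query of the shape `stats dc(source) by sourcetype` and shows that many input events come back as one row per group.

```python
from collections import defaultdict

# Hypothetical sample events; the sources and sourcetypes are illustrative only.
events = [
    {"sourcetype": "splunkd", "source": "metrics.log"},
    {"sourcetype": "splunkd", "source": "splunkd.log"},
    {"sourcetype": "splunkd", "source": "metrics.log"},
    {"sourcetype": "access", "source": "access.log"},
    {"sourcetype": "access", "source": "access.log"},
]


def stats_dc(events, field, by):
    """Model of `stats dc(field) by <by>`: one output row per group."""
    seen = defaultdict(set)
    for event in events:
        # Collect the distinct values of `field` for each group.
        seen[event[by]].add(event[field])
    # The original events are gone; only the summary rows remain.
    return [
        {by: group, f"dc({field})": len(values)}
        for group, values in sorted(seen.items())
    ]


rows = stats_dc(events, "source", "sourcetype")
for row in rows:
    print(row)
```

Five events go in and two rows come out, which is the same kind of reduction as the 31,800 events collapsing to 81 rows in the walkthrough.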
Now look, let's look at event stats. 00:02:39.480 --> 00:02:41.280 Event stats, going back to my original 00:02:41.280 --> 00:02:45.390 example, we had 155,118 events shown. 00:02:46.000 --> 00:02:48.120 Here, the exact same query gave me a 00:02:48.120 --> 00:02:49.959 distinct count on this 00:02:49.959 --> 00:02:53.920 internal. What you'll notice is I had 00:02:53.920 --> 00:02:58.200 155,118 results come back--close enough. 00:02:58.200 --> 00:03:01.440 Clearly, it was based on when they ran, 00:03:01.440 --> 00:03:05.050 and how many displays? 155,118. 00:03:28.680 --> 00:03:32.080 All your statistics show up as individual lines of the 00:03:32.080 --> 00:03:33.799 entire group. So, it's going to go look at 00:03:33.799 --> 00:03:35.959 this entire dataset and come back with 00:03:35.959 --> 00:03:38.640 the statistical numbers for each line. 00:03:38.640 --> 00:03:40.560 And so, we can... if we move on we'll see 00:03:40.560 --> 00:03:42.879 when Splunk metric log 00:03:42.879 --> 00:03:44.959 changes. Somewhere down the line, we'll 00:03:44.959 --> 00:03:47.840 eventually get there. It changes. Now we 00:03:47.840 --> 00:03:50.159 have this access log, and there's just 00:03:50.159 --> 00:03:53.120 one unique, two unique. And so each--here 00:03:53.120 --> 00:03:55.120 you’ve got the two. You 00:03:55.120 --> 00:03:59.079 can and two, blah blah blah. And down the lines 00:03:59.079 --> 00:03:59.816 we go. 00:04:00.760 --> 00:04:02.760 So basically, this is just statistic, 00:04:02.760 --> 00:04:05.120 stating, and each line gets its stuff 00:04:05.120 --> 00:04:06.159 added to it. 00:04:06.159 --> 00:04:10.360 Another example using my Corelight logs, 00:04:10.360 --> 00:04:12.640 hopefully, this pushes out. Here's my 00:04:12.640 --> 00:04:15.879 source IP. Here's my destination IP. 00:04:15.879 --> 00:04:17.079 One of the things you’ll notice: be 00:04:17.079 --> 00:04:19.720 careful with stats. 
You lose values when 00:04:19.720 --> 00:04:23.520 you use stats. So here it shows stats, 10,000. 00:04:23.520 --> 00:04:24.680 I would need to do something 00:04:24.680 --> 00:04:26.560 different to allow me to 00:04:26.560 --> 00:04:28.880 bring back more than 10,000 events. But 00:04:28.880 --> 00:04:30.759 just so you know, we're just going to move 00:04:30.759 --> 00:04:33.240 on and ignore the fact that if I let 00:04:33.240 --> 00:04:35.199 the limits be big enough, it would be 00:04:35.199 --> 00:04:38.600 31,780 events. And so I come back, and we 00:04:38.600 --> 00:04:41.720 can see, how many 00:04:41.720 --> 00:04:45.280 different IP addresses did 0.0.0.0 talk to? 00:04:45.280 --> 00:04:47.840 One. This is it, and it doesn’t matter how 00:04:47.840 --> 00:04:49.520 many times it shows up. It only talked 00:04:49.520 --> 00:04:52.240 to one address. Now here, we can see 00:04:52.240 --> 00:04:56.199 133. It says there were two. We can 00:04:56.199 --> 00:05:03.850 see the first one, 192.168.0.125. 120.5, 120.5, 120.5, still the same. 00:05:03.850 --> 00:05:05.240 But somewhere 00:05:05.240 --> 00:05:06.520 around here, there's going to be--oh, there 00:05:06.520 --> 00:05:09.560 it is. This one here, there's 00:05:09.560 --> 00:05:10.919 my second one, and that's why we have 00:05:10.919 --> 00:05:14.480 two. But it marks two on every one of these events. 00:05:15.840 --> 00:05:18.440 And it's the same if I had something with 00:05:18.440 --> 00:05:21.400 three or more, like here, 44. If I count them 00:05:21.400 --> 00:05:24.280 all, there will be 44 distinct IP 00:05:24.280 --> 00:05:27.199 addresses in all these pairings that 00:05:27.199 --> 00:05:31.560 go together. Here, I've got two, which is 251119. 00:05:31.560 --> 00:05:34.000 That's why I've got two. 
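The event stats examples above can be modeled in the same way. This is a minimal Python sketch under stated assumptions: the field names `src_ip` and `dest_ip`, the sample addresses, and the helper `eventstats_dc` are hypothetical and are not taken from the Corelight logs in the video. It mimics a query of the shape `eventstats dc(dest_ip) by src_ip`: the whole data set is read first, and then every event gets its group's total written onto it.

```python
from collections import defaultdict

# Hypothetical connection events; the field names and IPs are illustrative only.
events = [
    {"src_ip": "10.0.0.1", "dest_ip": "10.0.0.9"},
    {"src_ip": "10.0.0.2", "dest_ip": "10.0.0.9"},
    {"src_ip": "10.0.0.2", "dest_ip": "10.0.0.9"},
    {"src_ip": "10.0.0.2", "dest_ip": "10.0.0.7"},
]


def eventstats_dc(events, field, by, out="dc_dest"):
    """Model of `eventstats dc(field) by <by>`: keep every event, add the total."""
    seen = defaultdict(set)
    # Pass 1: look at the entire data set and gather distinct values per group.
    for event in events:
        seen[event[by]].add(event[field])
    # Pass 2: copy each event and attach the final number for its group.
    return [{**event, out: len(seen[event[by]])} for event in events]


annotated = eventstats_dc(events, "dest_ip", "src_ip")
for row in annotated:
    print(row)
```

No events are dropped here. All three events for 10.0.0.2 carry the value 2, including the ones that come before its second destination appears, which matches "it marks two on every one of these events".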
00:05:34.000 --> 00:05:37.919 So event stats, it'll take the entire 00:05:37.919 --> 00:05:40.520 beginning to end of all your 00:05:40.520 --> 00:05:43.160 data, do its mathematical analysis, and 00:05:43.160 --> 00:05:45.880 every log that came back will get that 00:05:45.880 --> 00:05:49.160 value written into it. Stream stats does 00:05:49.160 --> 00:05:52.160 slightly differently. Stream stats, I'm 00:05:52.160 --> 00:05:55.199 going to show my last example here. 00:05:55.199 --> 00:05:58.880 This one’s my last example. Nope. 00:05:58.880 --> 00:06:02.199 Where did I put that? Okay. This one 00:06:02.199 --> 00:06:05.479 here, I’m just going to show--stream stats 00:06:05.479 --> 00:06:07.479 actually does very similarly to what event 00:06:07.479 --> 00:06:10.080 stats does, but it takes each line as it 00:06:10.080 --> 00:06:12.639 comes through the stream from the 00:06:12.639 --> 00:06:15.080 indexer and computes it and keeps 00:06:15.080 --> 00:06:18.680 growing. So for example here, I did a head 00:06:18.680 --> 00:06:21.000 100. I’m not going to use any of the 00:06:21.000 --> 00:06:22.120 values. I’m just gonna say stream 00:06:22.120 --> 00:06:24.759 stats count. I just want to know. So 00:06:24.759 --> 00:06:27.919 if you’d done stats count, if I’d done a head 00:06:27.919 --> 00:06:29.599 100 and I do a stats count, guess what the 00:06:29.599 --> 00:06:32.840 count’s going to be? 100 or less if 00:06:32.840 --> 00:06:35.039 there aren’t 100 values that come back. 00:06:35.039 --> 00:06:36.919 But if I do stream stats, my 00:06:36.919 --> 00:06:38.840 count has event count so I can see it 00:06:38.840 --> 00:06:41.520 growing. And I’m going to table it. And 00:06:41.520 --> 00:06:42.960 the very first value that comes back, it 00:06:42.960 --> 00:06:45.120 says, how many total events are there? 00:06:45.120 --> 00:06:46.680 Well, when the first event comes back, 00:06:46.680 --> 00:06:48.800 there’ll be one. 
Then when the second 00:06:48.800 --> 00:06:50.160 event comes back, how many will 00:06:50.160 --> 00:06:52.599 there be? Two. When the third one comes in 00:06:52.599 --> 00:06:54.840 line, how many will there be? 3. 00:06:54.840 --> 00:06:57.639 4, 5, 6, 7, etc., until I reach 00:06:57.639 --> 00:07:01.800 the end, and it’s 100. So what happens is 00:07:01.800 --> 00:07:04.800 the statistical number keeps growing as 00:07:04.800 --> 00:07:07.639 the items come through the stream. Event 00:07:07.639 --> 00:07:09.440 stats takes the 00:07:09.440 --> 00:07:12.360 entire bundle from beginning to end, computes the 00:07:12.360 --> 00:07:13.840 statistical numbers, and puts them on 00:07:13.840 --> 00:07:16.360 each line. Stream stats takes each line 00:07:16.360 --> 00:07:19.280 as it comes through and does the math on 00:07:19.280 --> 00:07:21.280 it. So let’s show another, kind of 00:07:21.280 --> 00:07:24.800 putting this into practice here. These are my 00:07:24.800 --> 00:07:28.130 internal logs. Source--we’re doing the distinct count. 00:07:28.130 --> 00:07:30.840 1, 1, 1, 1, 1. And we could basically... okay. 00:07:30.840 --> 00:07:33.080 So, 1, 1, 1, 1, 1. Nothing's changing. 00:07:34.400 --> 00:07:36.440 Is there a place where we get 00:07:36.440 --> 00:07:38.550 something that changes? 00:07:42.759 --> 00:07:45.680 Too much. Alright. Let’s see. 00:07:45.680 --> 00:07:47.520 We might go 00:07:47.520 --> 00:07:51.683 to my Bro log. Make it easier. 00:07:52.510 --> 00:07:55.280 Yeah. Too many of these to mess 00:07:55.280 --> 00:07:57.560 around with. We’ll go to Bro. I did 00:07:57.560 --> 00:08:00.590 stream stats. Not that one. Stream stats here. 00:08:00.590 --> 00:08:02.880 I’m doing IPs. 00:08:02.880 --> 00:08:06.479 And so we can see here, one. 00:08:06.479 --> 00:08:09.479 So all these come back. Well, it talked. How 00:08:09.479 --> 00:08:12.280 many times has 468 talked here? How many 00:08:12.280 --> 00:08:16.280 distinct IPs? One. 
Still, when it comes 00:08:16.280 --> 00:08:17.879 here, is it seeing anything new? Nope. So 00:08:17.879 --> 00:08:20.120 it's one. Seeing anything new? Nope. It's 00:08:20.120 --> 00:08:21.919 one. So is it seeing anything new? Nope. 00:08:21.919 --> 00:08:25.080 It's one. Oh, wait. This is a new IP 00:08:25.080 --> 00:08:28.479 pairing. So the number jumps to two. Now 00:08:28.479 --> 00:08:30.319 it flips back, but it’s already seen that 00:08:30.319 --> 00:08:33.680 one, so it stays at two. 2, 2, 2, 2, 00:08:33.680 --> 00:08:37.000 2. And then when it reaches a brand new 00:08:37.000 --> 00:08:39.360 pair, how many times has it seen this one 00:08:39.360 --> 00:08:42.279 talk to this one? It goes back to one and then 00:08:42.279 --> 00:08:44.120 it grows again because, oh, there’s a new-- 00:08:44.120 --> 00:08:46.320 there’s a new communication there. So 2, 00:08:46.320 --> 00:08:49.440 2, 2... Oh, brand new communication, so it 00:08:49.440 --> 00:08:52.480 resets back to one. And so that’s what 00:08:52.480 --> 00:08:55.000 stream stats will do. It will, based off 00:08:55.000 --> 00:08:57.720 your pairing, by each time you have 00:08:57.720 --> 00:09:00.200 a by on there, the field 00:09:00.200 --> 00:09:03.800 changes, and the count restarts. If I didn’t 00:09:03.800 --> 00:09:05.320 put a by in there, 00:09:05.320 --> 00:09:07.880 this number would just keep 00:09:07.880 --> 00:09:09.360 growing each time it finds a new 00:09:09.360 --> 00:09:11.320 distinct count on the destination 00:09:11.320 --> 00:09:13.760 IP. And basically, it’s just going to keep 00:09:13.760 --> 00:09:16.920 adding up. So you’ve got stats, which 00:09:16.920 --> 00:09:18.720 aggregates all of your 00:09:18.720 --> 00:09:22.200 events into very simplified forms, and 00:09:22.200 --> 00:09:24.480 it does statistics for the 00:09:24.480 --> 00:09:27.920 entire, the entire summarized set of 00:09:27.920 --> 00:09:31.519 data there. 
Then you have event stats, 00:09:31.519 --> 00:09:33.640 which grabs the entire data set from beginning 00:09:33.640 --> 00:09:36.079 to end, does the mathematical statistics 00:09:36.079 --> 00:09:38.399 on it, and adds that value to each line, 00:09:38.399 --> 00:09:41.959 repeating it. So if there were seven 00:09:41.959 --> 00:09:43.720 distinct values here, all seven would 00:09:43.720 --> 00:09:46.519 have the exact same value. And stream 00:09:46.519 --> 00:09:49.160 stats? It orders it. Basically, each 00:09:49.160 --> 00:09:50.560 item coming through the pipe, through the 00:09:50.560 --> 00:09:54.000 stream, will change your statistics. And 00:09:54.000 --> 00:09:55.880 so it’s a 00:09:55.880 --> 00:09:57.600 different way of looking at it. All three 00:09:57.600 --> 00:09:59.720 are different ways of looking at 00:09:59.720 --> 00:10:02.360 statistical packages. So, just getting 00:10:02.360 --> 00:10:03.920 some understanding of the data as it 00:10:03.920 --> 00:10:06.200 flows through. But that's the basic principle. 00:10:06.200 --> 00:10:08.200 If you want it quick and dirty, you want 00:10:08.200 --> 00:10:10.440 just a summarized bit of data on there, 00:10:10.440 --> 00:10:12.920 stats is your 00:10:12.920 --> 00:10:15.519 key. Stream stats is the other 00:10:15.519 --> 00:10:17.440 example where you're basically looking 00:10:17.440 --> 00:10:21.200 for anomalies or averages over time, over 00:10:21.200 --> 00:10:24.160 the period. And I will be showing another 00:10:24.160 --> 00:10:26.240 tutorial right after this one with useful 00:10:26.240 --> 00:10:27.920 queries where you can change the windows 00:10:27.920 --> 00:10:30.720 and change how it groups things together. 00:10:30.720 --> 00:10:32.880 But stream stats is an 00:10:32.880 --> 00:10:36.160 amazing query for being able to know 00:10:36.160 --> 00:10:38.600 if previous values have an effect on 00:10:38.600 --> 00:10:41.360 future values, especially when looking for anomalies. 
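The stream stats behavior described in this tutorial, a number that grows as each event comes through, can be sketched the same way. This is a rough Python model, not Splunk's implementation: the field names, the sample addresses, and the helper `streamstats` are hypothetical. It assumes each by-group keeps its own running state, so when events for one source IP arrive together, the number looks like it restarts at one each time a new source IP begins.

```python
from collections import defaultdict

# Hypothetical connection events in arrival order; the values are illustrative only.
events = [
    {"src_ip": "10.0.0.2", "dest_ip": "10.0.0.9"},
    {"src_ip": "10.0.0.2", "dest_ip": "10.0.0.9"},
    {"src_ip": "10.0.0.2", "dest_ip": "10.0.0.7"},
    {"src_ip": "10.0.0.2", "dest_ip": "10.0.0.9"},
    {"src_ip": "10.0.0.3", "dest_ip": "10.0.0.9"},
    {"src_ip": "10.0.0.3", "dest_ip": "10.0.0.8"},
]


def streamstats(events, field, by=None):
    """Model of `streamstats count as event_count, dc(field) as dc [by <by>]`."""
    seen = defaultdict(set)  # distinct values seen so far, kept per group
    out = []
    for position, event in enumerate(events, start=1):
        group = event[by] if by else None  # no `by` means one shared group
        seen[group].add(event[field])
        # Each event only knows about itself and the events before it.
        out.append({**event, "event_count": position, "dc": len(seen[group])})
    return out


rows_by = streamstats(events, "dest_ip", by="src_ip")
rows_all = streamstats(events, "dest_ip")

print([row["event_count"] for row in rows_by])  # running count of events
print([row["dc"] for row in rows_by])           # running distinct count per source IP
print([row["dc"] for row in rows_all])          # running distinct count with no by clause
```

With the by clause, the distinct count runs 1, 1, 2, 2 for the first source IP and starts again at 1 for the second. Without it, the number only ever grows, which is the "it's just going to keep adding up" case.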
00:10:41.360 --> 00:10:44.040 Anyway, I hope this helps you in your journey 00:10:44.040 --> 00:10:46.480 from being a L.A.M.E. analyst to a Splunk 00:10:46.480 --> 00:10:49.440 ninja. If you like this, feel free 00:10:49.440 --> 00:10:51.800 to subscribe to my channel. Please, put 00:10:51.800 --> 00:10:53.600 down below any comments or questions you 00:10:53.600 --> 00:10:56.360 might have, or any content you want me to 00:10:56.360 --> 00:10:58.320 do a video on. I love to hear from you 00:10:58.320 --> 00:10:59.760 guys. I like to do content you 00:10:59.760 --> 00:11:01.839 want to see. Anyway, I hope you’ll keep 00:11:01.839 --> 00:11:04.939 coming back, and keep watching these videos.