1 00:00:00,000 --> 00:00:08,700 [Music]. 2 00:00:10,200 --> 00:00:13,040 Welcome to another L.A.M.E. (Log Analysis 3 00:00:13,040 --> 00:00:15,080 Made Easy) tutorial. In this one, we're going 4 00:00:15,080 --> 00:00:19,199 to talk about stats, eventstats, and 5 00:00:19,199 --> 00:00:21,720 streamstats. Basically, 6 00:00:21,720 --> 00:00:24,320 this tutorial will brief you on 7 00:00:24,320 --> 00:00:26,519 the difference between the three, and 8 00:00:26,519 --> 00:00:27,920 they are slightly different. I'm going to 9 00:00:27,920 --> 00:00:29,240 try a few different ways to 10 00:00:29,240 --> 00:00:31,320 show it, and hopefully, by the end of this 11 00:00:31,320 --> 00:00:33,760 tutorial, you'll have a good idea of 12 00:00:33,760 --> 00:00:36,239 how they can be used. I'll put another 13 00:00:36,239 --> 00:00:39,079 video after this one with use cases, for 14 00:00:39,079 --> 00:00:40,600 example, analytic hunting and other things 15 00:00:40,600 --> 00:00:42,600 you might actually use the different 16 00:00:42,600 --> 00:00:45,879 queries for. But let's start. First off, the 17 00:00:45,879 --> 00:00:48,480 stats command. I just started here with 18 00:00:48,480 --> 00:00:50,920 index=_internal, then table source 19 00:00:50,920 --> 00:00:53,280 and sourcetype, then stats: give me the 20 00:00:53,280 --> 00:00:55,320 distinct count of the source by the 21 00:00:55,320 --> 00:00:59,079 sourcetype. dc is distinct count. 22 00:00:59,079 --> 00:01:00,559 Space count, looking at an internal log. I 23 00:01:00,559 --> 00:01:01,640 just wanted to use something you can run 24 00:01:01,640 --> 00:01:04,720 anywhere you want, and I'm just 25 00:01:04,720 --> 00:01:06,560 getting all the distinct sources by 26 00:01:06,560 --> 00:01:09,479 sourcetype. When I run that, I see that 27 00:01:09,479 --> 00:01:12,670 this sourcetype, Splunk_assist_internal_log, 28 00:01:12,670 --> 00:01:14,840 has two sources. This one 29 00:01:14,840 --> 00:01:17,000 has three. 
Most of these just have one. 30 00:01:17,000 --> 00:01:20,119 This one, splunkd, has four sources. And 31 00:01:20,119 --> 00:01:25,270 what you'll note is it takes 55,151 events 32 00:01:25,270 --> 00:01:26,660 and collapses them down. 33 00:01:26,660 --> 00:01:29,880 It is a transforming command. I use these terms 34 00:01:29,880 --> 00:01:31,360 in case you ever want to get Splunk 35 00:01:31,360 --> 00:01:33,119 certified or hear these things. These are 36 00:01:33,119 --> 00:01:34,840 transforming commands. Transforming 37 00:01:34,840 --> 00:01:38,040 commands take logs and change them into 38 00:01:38,040 --> 00:01:40,840 primarily tables. They take the 39 00:01:40,840 --> 00:01:43,320 raw log format and turn it into a table, and 40 00:01:43,320 --> 00:01:45,320 stats will often collapse it, like 41 00:01:45,320 --> 00:01:48,402 here, into a massive reduction. Anyway, 42 00:01:49,479 --> 00:01:52,510 we've done that. So that's stats. 43 00:01:53,079 --> 00:01:56,200 Let's show eventstats. Eventstats is 44 00:01:56,200 --> 00:02:00,759 going to take--oh, here's another 45 00:02:00,759 --> 00:02:02,200 example of that command we're just going to 46 00:02:02,200 --> 00:02:04,960 use. This is the Corelight index. I'm 47 00:02:04,960 --> 00:02:06,520 looking at my connection logs. I'm doing 48 00:02:06,520 --> 00:02:08,479 source IP, destination. I'm still staying 49 00:02:08,479 --> 00:02:10,599 with the stats command. Here, I'm saying 50 00:02:10,599 --> 00:02:12,360 give me the distinct 51 00:02:12,360 --> 00:02:14,959 count of destination IPs for each source IP. 52 00:02:14,959 --> 00:02:17,920 So how many different IP addresses did 53 00:02:17,920 --> 00:02:21,319 each source IP go to? There were 31,800 54 00:02:21,319 --> 00:02:23,440 total events, but it only displays 81 55 00:02:23,440 --> 00:02:26,840 because it collapses them down. 
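A minimal sketch of the collapsing behavior described here, written in Python rather than SPL. It models a distinct count of one field grouped by another; the function name `stats_dc`, the field names `src` and `dest`, and the sample addresses are all hypothetical, and this is an illustration of the idea, not how Splunk implements stats.

```python
from collections import defaultdict

def stats_dc(events, field, by):
    """Model of a distinct count of `field` grouped by `by`.

    Returns one row per group, so many events collapse to a few rows.
    """
    seen = defaultdict(set)
    for event in events:
        seen[event[by]].add(event[field])
    return {group: len(values) for group, values in seen.items()}

# Hypothetical connection events (made-up addresses).
events = [
    {"src": "10.0.0.1", "dest": "10.0.0.9"},
    {"src": "10.0.0.1", "dest": "10.0.0.8"},
    {"src": "10.0.0.1", "dest": "10.0.0.9"},  # repeat, not counted twice
    {"src": "10.0.0.2", "dest": "10.0.0.9"},
    {"src": "10.0.0.2", "dest": "10.0.0.9"},
]

rows = stats_dc(events, field="dest", by="src")
print(f"{len(events)} events collapsed to {len(rows)} rows")
for src, distinct_dests in rows.items():
    print(src, distinct_dests)
```

Five input events become two output rows, one per source address, which is the same shape of reduction as 31,800 events displaying as 81 rows.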
I can see 56 00:02:26,840 --> 00:02:31,040 that 192.168.0.103 went to 33 different 57 00:02:31,040 --> 00:02:35,959 addresses; others went to 25, 7, 10, 40, 43, etc. And that is 58 00:02:35,959 --> 00:02:39,480 stats. Now let's look at eventstats. 59 00:02:39,480 --> 00:02:41,280 Eventstats: going back to my original 60 00:02:41,280 --> 00:02:45,390 example, we had 155,118 events shown. 61 00:02:46,000 --> 00:02:48,120 Here, the exact same query gives me a 62 00:02:48,120 --> 00:02:49,959 distinct count on this 63 00:02:49,959 --> 00:02:53,920 internal index. What you'll notice is I had 64 00:02:53,920 --> 00:02:58,200 155,118 results come back--close enough. 65 00:02:58,200 --> 00:03:01,440 Any difference is just based on when the searches ran. 66 00:03:01,440 --> 00:03:05,050 And how many does it display? 155,118. 67 00:03:28,680 --> 00:03:32,080 All your statistics show up on the individual lines for the 68 00:03:32,080 --> 00:03:33,799 entire group. So, it's going to go look at 69 00:03:33,799 --> 00:03:35,959 this entire dataset and come back with 70 00:03:35,959 --> 00:03:38,640 the statistical numbers for each line. 71 00:03:38,640 --> 00:03:40,560 And so, if we move on, we'll see 72 00:03:40,560 --> 00:03:42,879 where the Splunk metrics log 73 00:03:42,879 --> 00:03:44,959 changes. Somewhere down the line, we'll 74 00:03:44,959 --> 00:03:47,840 eventually get there. It changes. Now we 75 00:03:47,840 --> 00:03:50,159 have this access log, and there's just 76 00:03:50,159 --> 00:03:53,120 one unique, two unique. And so each--here 77 00:03:53,120 --> 00:03:55,120 you've got the two, 78 00:03:55,120 --> 00:03:59,079 and two, and two, blah blah blah. And down the line 79 00:03:59,079 --> 00:03:59,816 we go. 80 00:04:00,760 --> 00:04:02,760 So basically, it just computes the statistic, 81 00:04:02,760 --> 00:04:05,120 and each line gets that value 82 00:04:05,120 --> 00:04:06,159 added to it. 
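A minimal sketch of that behavior, again as a Python model and not Splunk's own implementation. The function name `eventstats_dc`, the output field `dc_source`, and the sample sources are hypothetical. It makes one pass over the whole data set to compute the distinct count per group, then a second pass to write that number onto every event.

```python
from collections import defaultdict

def eventstats_dc(events, field, by, out="dc_source"):
    """Model of a group-wide distinct count written onto each event.

    No events are dropped: the output has as many rows as the input.
    """
    seen = defaultdict(set)
    for event in events:            # pass 1: look at the entire data set
        seen[event[by]].add(event[field])
    # pass 2: every event gets its group's final number
    return [{**event, out: len(seen[event[by]])} for event in events]

# Hypothetical events; the source paths are made up.
events = [
    {"sourcetype": "splunkd", "source": "a.log"},
    {"sourcetype": "splunkd", "source": "b.log"},
    {"sourcetype": "splunkd", "source": "a.log"},
    {"sourcetype": "access", "source": "c.log"},
]

annotated = eventstats_dc(events, field="source", by="sourcetype")
for event in annotated:
    print(event)
```

All three splunkd events carry the value 2, including the very first one, because the number is computed from the whole set before it is written back.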
83 00:04:06,159 --> 00:04:10,360 Another example, using my Corelight logs. 84 00:04:10,360 --> 00:04:12,640 Hopefully, this pushes out. Here's my 85 00:04:12,640 --> 00:04:15,879 source IP. Here's my destination IP. 86 00:04:15,879 --> 00:04:17,079 One of the things you'll notice: be 87 00:04:17,079 --> 00:04:19,720 careful with stats. You lose values when 88 00:04:19,720 --> 00:04:23,520 you use stats. So here the stats show 10,000. 89 00:04:23,520 --> 00:04:24,680 I would need to do something 90 00:04:24,680 --> 00:04:26,560 different to allow me to 91 00:04:26,560 --> 00:04:28,880 bring back more than 10,000 events. But 92 00:04:28,880 --> 00:04:30,759 just so you know, we're just going to move 93 00:04:30,759 --> 00:04:33,240 on and ignore the fact that if I let 94 00:04:33,240 --> 00:04:35,199 the limit be big enough, it would be 95 00:04:35,199 --> 00:04:38,600 31,780 events. And so I come back, and we 96 00:04:38,600 --> 00:04:41,720 can see how many 97 00:04:41,720 --> 00:04:45,280 different IP addresses did 0.0.0.0 talk to? 98 00:04:45,280 --> 00:04:47,840 One. This is it, and it doesn't matter how 99 00:04:47,840 --> 00:04:49,520 many times it shows up. It only talked 100 00:04:49,520 --> 00:04:52,240 to one address. Now here, we can see 101 00:04:52,240 --> 00:04:56,199 133. It says there were two. We can 102 00:04:56,199 --> 00:05:03,850 see the first one, 192.168.0.125. 125, 125, 125, still the same. 103 00:05:03,850 --> 00:05:05,240 But somewhere 104 00:05:05,240 --> 00:05:06,520 around here, there's going to be--oh, there 105 00:05:06,520 --> 00:05:09,560 it is. This one here, there's 106 00:05:09,560 --> 00:05:10,919 my second one, and that's why we have 107 00:05:10,919 --> 00:05:14,480 two. But it marks two for every one of these events. 108 00:05:15,840 --> 00:05:18,440 And the same if I had something with 109 00:05:18,440 --> 00:05:21,400 three or more, like here, 44. 
If I count it 110 00:05:21,400 --> 00:05:24,280 all, there will be 44 distinct IP 111 00:05:24,280 --> 00:05:27,199 addresses in all these pairings that 112 00:05:27,199 --> 00:05:31,560 go together. Here, I've got two, which is 251119. 113 00:05:31,560 --> 00:05:34,000 That's why I've got two. 114 00:05:34,000 --> 00:05:37,919 So eventstats will take all of 115 00:05:37,919 --> 00:05:40,520 your data, from beginning to end, 116 00:05:40,520 --> 00:05:43,160 do its mathematical analysis, and 117 00:05:43,160 --> 00:05:45,880 every log that came back will get that 118 00:05:45,880 --> 00:05:49,160 value written into it. Streamstats works 119 00:05:49,160 --> 00:05:52,160 slightly differently. Streamstats, I'm 120 00:05:52,160 --> 00:05:55,199 going to show my last example here. 121 00:05:55,199 --> 00:05:58,880 This one's my last example. Nope. 122 00:05:58,880 --> 00:06:02,199 Where did I put that? Okay. This one 123 00:06:02,199 --> 00:06:05,479 here, I'm just going to show--streamstats 124 00:06:05,479 --> 00:06:07,479 actually works very similarly to 125 00:06:07,479 --> 00:06:10,080 eventstats, but it takes each line as it 126 00:06:10,080 --> 00:06:12,639 comes through the stream from the 127 00:06:12,639 --> 00:06:15,080 indexer, computes on it, and keeps 128 00:06:15,080 --> 00:06:18,680 growing. So for example here, I did a head 129 00:06:18,680 --> 00:06:21,000 100. I'm not going to use any of the 130 00:06:21,000 --> 00:06:22,120 values. I'm just going to say 131 00:06:22,120 --> 00:06:24,759 streamstats count. I just want to know. So 132 00:06:24,759 --> 00:06:27,919 if I'd done a head 133 00:06:27,919 --> 00:06:29,599 100 and I do a stats count, guess what the 134 00:06:29,599 --> 00:06:32,840 count's going to be? 100, or less if 135 00:06:32,840 --> 00:06:35,039 there aren't 100 events that come back. 
136 00:06:35,039 --> 00:06:36,919 But if I do streamstats, with my 137 00:06:36,919 --> 00:06:38,840 count as event count, I can see it 138 00:06:38,840 --> 00:06:41,520 growing. And I'm going to table it. And 139 00:06:41,520 --> 00:06:42,960 for the very first value that comes back, it 140 00:06:42,960 --> 00:06:45,120 asks, how many total events are there? 141 00:06:45,120 --> 00:06:46,680 Well, when the first event comes back, 142 00:06:46,680 --> 00:06:48,800 there'll be one. Then when the second 143 00:06:48,800 --> 00:06:50,160 event comes back, how many will 144 00:06:50,160 --> 00:06:52,599 there be? Two. When the third one comes in 145 00:06:52,599 --> 00:06:54,840 line, how many will there be? 3. 146 00:06:54,840 --> 00:06:57,639 4, 5, 6, 7, etc., until I reach 147 00:06:57,639 --> 00:07:01,800 the end, and it's 100. So what happens is 148 00:07:01,800 --> 00:07:04,800 the statistical number keeps growing as 149 00:07:04,800 --> 00:07:07,639 the items come through the stream. 150 00:07:07,639 --> 00:07:09,440 Eventstats computes the 151 00:07:09,440 --> 00:07:12,360 statistical numbers over the entire bundle, 152 00:07:12,360 --> 00:07:13,840 from beginning to end, and puts them on 153 00:07:13,840 --> 00:07:16,360 each line. Streamstats takes each line 154 00:07:16,360 --> 00:07:19,280 as it comes through and does the math on 155 00:07:19,280 --> 00:07:21,280 it. So let's show another one, kind of 156 00:07:21,280 --> 00:07:24,800 putting this into practice here. These are my 157 00:07:24,800 --> 00:07:28,130 internal logs. Source--we're doing the distinct count. 158 00:07:28,130 --> 00:07:30,840 1, 1, 1, 1, 1. And we could basically... okay. 159 00:07:30,840 --> 00:07:33,080 So, 1, 1, 1, 1, 1. Nothing's changing. 160 00:07:34,400 --> 00:07:36,440 Is there a place where we get 161 00:07:36,440 --> 00:07:38,550 something that changes? 162 00:07:42,759 --> 00:07:45,680 Too much. Alright. Let's see. 
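The head 100 walkthrough above can be sketched in Python as a running count. This is a model of the idea, not Splunk's implementation; the function name `streamstats_count`, the output field `event_count`, and the generated events are hypothetical stand-ins.

```python
from itertools import islice

def streamstats_count(events, out="event_count"):
    """Model of a running count: each event is stamped as it streams by."""
    for n, event in enumerate(events, start=1):
        yield {**event, out: n}

# Hypothetical stream of 250 events; islice plays the role of head 100.
stream = ({"_raw": f"event {i}"} for i in range(250))
first_100 = list(islice(stream, 100))

# A plain count over the same events gives a single number.
total = len(first_100)

# The running version gives one growing number per event.
counted = list(streamstats_count(first_100))
counts = [event["event_count"] for event in counted]

print("single count:", total)
print("running count, first five:", counts[:5])
print("running count, last:", counts[-1])
```

The single count is only known once every event has been seen, while the running count is available on each event the moment it arrives, which is what makes it useful for "what had happened so far" questions.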
163 00:07:45,680 --> 00:07:47,520 We might go 164 00:07:47,520 --> 00:07:51,683 to my Bro log. Make it easier. 165 00:07:52,510 --> 00:07:55,280 Yeah. Too many of these to mess 166 00:07:55,280 --> 00:07:57,560 around with. We'll go to Bro. I did 167 00:07:57,560 --> 00:08:00,590 streamstats. Not that one. Streamstats here. 168 00:08:00,590 --> 00:08:02,880 I'm doing IPs. 169 00:08:02,880 --> 00:08:06,479 And so we can see here, one. 170 00:08:06,479 --> 00:08:09,479 So all these come back. Well, it talked. How 171 00:08:09,479 --> 00:08:12,280 many has 468 talked to here? How many 172 00:08:12,280 --> 00:08:16,280 distinct IPs? One. Still, when it comes 173 00:08:16,280 --> 00:08:17,879 here, is it seeing anything new? Nope. So 174 00:08:17,879 --> 00:08:20,120 it's one. Seeing anything new? Nope. It's 175 00:08:20,120 --> 00:08:21,919 one. So is it seeing anything new? Nope. 176 00:08:21,919 --> 00:08:25,080 It's one. Oh, wait. This is a new IP 177 00:08:25,080 --> 00:08:28,479 pairing. So the number jumps to two. Now 178 00:08:28,479 --> 00:08:30,319 it flips back, but it's already seen that 179 00:08:30,319 --> 00:08:33,680 one, so it stays at two. 2, 2, 2, 2, 180 00:08:33,680 --> 00:08:37,000 2. And then when it reaches a brand new 181 00:08:37,000 --> 00:08:39,360 pair, how many times has it seen this one 182 00:08:39,360 --> 00:08:42,279 talk to this one? It goes back to one, and then 183 00:08:42,279 --> 00:08:44,120 it grows again because, oh, there's a 184 00:08:44,120 --> 00:08:46,320 new communication there. So 2, 185 00:08:46,320 --> 00:08:49,440 2, 2... Oh, brand new communication, so it 186 00:08:49,440 --> 00:08:52,480 starts back at one. And so that's what 187 00:08:52,480 --> 00:08:55,000 streamstats will do. Based off 188 00:08:55,000 --> 00:08:57,720 your grouping, when you have 189 00:08:57,720 --> 00:09:00,200 a by on there, each time the by field 190 00:09:00,200 --> 00:09:03,800 changes to a new value, the count starts over for that value. 
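A minimal Python sketch of the running distinct count with a by field, as walked through above. It is a model under stated assumptions, not Splunk's implementation: the function name `streamstats_dc`, the field names `src` and `dest`, the output field `dc_dest`, and the addresses are hypothetical. It assumes each by value keeps its own running state, so a value seen for the first time starts at one; whether a count is also forced to reset when the by value merely changes back and forth depends on options not shown in the video.

```python
from collections import defaultdict

def streamstats_dc(events, field, by, out="dc_dest"):
    """Model of a running distinct count kept separately per `by` value."""
    seen = defaultdict(set)
    result = []
    for event in events:
        seen[event[by]].add(event[field])   # update only this group's state
        result.append({**event, out: len(seen[event[by]])})
    return result

# Hypothetical connection events, in the order they stream through.
pairs = [
    ("10.0.0.1", "10.0.0.9"),
    ("10.0.0.1", "10.0.0.9"),  # nothing new, stays at 1
    ("10.0.0.1", "10.0.0.8"),  # new destination, jumps to 2
    ("10.0.0.1", "10.0.0.9"),  # already seen, stays at 2
    ("10.0.0.2", "10.0.0.9"),  # brand new source, starts at 1
    ("10.0.0.2", "10.0.0.7"),  # new destination for it, grows to 2
]
events = [{"src": s, "dest": d} for s, d in pairs]

running = [e["dc_dest"] for e in streamstats_dc(events, "dest", "src")]
print(running)
```

Compared with the whole-set version, the first event here carries 1 rather than the group's final total, because only the events seen so far contribute to each number.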
If I didn't 191 00:09:03,800 --> 00:09:05,320 put a by in there, 192 00:09:05,320 --> 00:09:07,880 this number would just keep 193 00:09:07,880 --> 00:09:09,360 growing each time it finds a new 194 00:09:09,360 --> 00:09:11,320 distinct destination 195 00:09:11,320 --> 00:09:13,760 IP. Basically, it's just going to keep 196 00:09:13,760 --> 00:09:16,920 adding up. So you've got stats, which 197 00:09:16,920 --> 00:09:18,720 aggregates all of your 198 00:09:18,720 --> 00:09:22,200 events into very simplified forms, and 199 00:09:22,200 --> 00:09:24,480 it does statistics for the 200 00:09:24,480 --> 00:09:27,920 entire summarized set of 201 00:09:27,920 --> 00:09:31,519 data there. Then you have eventstats, 202 00:09:31,519 --> 00:09:33,640 which grabs the entire data set from beginning 203 00:09:33,640 --> 00:09:36,079 to end, does the mathematical statistics 204 00:09:36,079 --> 00:09:38,399 on it, and adds that value to each line, 205 00:09:38,399 --> 00:09:41,959 repeating it. So if there were seven 206 00:09:41,959 --> 00:09:43,720 distinct values here, all seven would 207 00:09:43,720 --> 00:09:46,519 have the exact same value. And 208 00:09:46,519 --> 00:09:49,160 streamstats? It goes in order. Basically, each 209 00:09:49,160 --> 00:09:50,560 item coming through the pipe, through the 210 00:09:50,560 --> 00:09:54,000 stream, will change your statistics. And 211 00:09:54,000 --> 00:09:55,880 so it's a 212 00:09:55,880 --> 00:09:57,600 different way of looking at it. All three 213 00:09:57,600 --> 00:09:59,720 are different ways of computing 214 00:09:59,720 --> 00:10:02,360 statistics, of getting 215 00:10:02,360 --> 00:10:03,920 some understanding of the data as it 216 00:10:03,920 --> 00:10:06,200 flows through. But that's the basic principle. 
217 00:10:06,200 --> 00:10:08,200 If you want it quick and dirty, if you want 218 00:10:08,200 --> 00:10:10,440 just a summarized bit of data, 219 00:10:10,440 --> 00:10:12,920 stats is your 220 00:10:12,920 --> 00:10:15,519 key. Streamstats is the other 221 00:10:15,519 --> 00:10:17,440 case, where you're basically looking 222 00:10:17,440 --> 00:10:21,200 for anomalies or averages over time, over 223 00:10:21,200 --> 00:10:24,160 the period. And I will be showing another 224 00:10:24,160 --> 00:10:26,240 tutorial right after this one with useful 225 00:10:26,240 --> 00:10:27,920 queries where you can change the windows 226 00:10:27,920 --> 00:10:30,720 and change how it groups things together. 227 00:10:30,720 --> 00:10:32,880 But streamstats is an 228 00:10:32,880 --> 00:10:36,160 amazing command for being able to know 229 00:10:36,160 --> 00:10:38,600 if previous values have an effect on 230 00:10:38,600 --> 00:10:41,360 future values, especially when looking for anomalies. 231 00:10:41,360 --> 00:10:44,040 Anyway, I hope this helps you in your journey 232 00:10:44,040 --> 00:10:46,480 from being a L.A.M.E. analyst to a Splunk 233 00:10:46,480 --> 00:10:49,440 ninja. If you like this, feel free 234 00:10:49,440 --> 00:10:51,800 to subscribe to my channel. Please put 235 00:10:51,800 --> 00:10:53,600 down below any comments or questions you 236 00:10:53,600 --> 00:10:56,360 might have, or any content you want me to 237 00:10:56,360 --> 00:10:58,320 do a video on. I love to hear from you 238 00:10:58,320 --> 00:10:59,760 guys. I like to do content you 239 00:10:59,760 --> 00:11:01,839 want to see. Anyway, I hope you'll keep 240 00:11:01,839 --> 00:11:04,939 coming back, and keep watching these videos.