WEBVTT 00:00:00.120 --> 00:00:02.560 And welcome back. Not every journey to 00:00:02.560 --> 00:00:04.279 the world of troubleshooting ends the 00:00:04.279 --> 00:00:05.960 same way. Some things are easier to 00:00:05.960 --> 00:00:07.799 troubleshoot than others, and also, if you 00:00:07.799 --> 00:00:09.719 and I have been working on, like, one large 00:00:09.719 --> 00:00:11.240 network and we know it like the back of our 00:00:11.240 --> 00:00:13.440 hand, it's a lot easier to troubleshoot 00:00:13.440 --> 00:00:14.839 because we know what the subnets are and 00:00:14.839 --> 00:00:16.800 the interfaces involved. Whereas, on the 00:00:16.800 --> 00:00:17.920 other hand, if we go to a brand new 00:00:17.920 --> 00:00:19.920 network or we're doing consulting, it may 00:00:19.920 --> 00:00:21.960 take some warm-up time to get used to 00:00:21.960 --> 00:00:24.000 and to figure out where everything is on 00:00:24.000 --> 00:00:26.000 that specific customer's network. And when 00:00:26.000 --> 00:00:27.279 we're doing troubleshooting--again, 00:00:27.279 --> 00:00:28.599 whether it's our own network that we 00:00:28.599 --> 00:00:30.240 know really well or it's a new network 00:00:30.240 --> 00:00:31.759 that we are just introduced to--if we 00:00:31.759 --> 00:00:33.520 have a certain process or methodology 00:00:33.520 --> 00:00:35.360 for troubleshooting, we can apply that 00:00:35.360 --> 00:00:37.760 methodology across the board. So let's 00:00:37.760 --> 00:00:38.840 have some fun with this. We'll put an 00:00:38.840 --> 00:00:40.160 overview of the high-level steps 00:00:40.160 --> 00:00:41.960 regarding a troubleshooting methodology, 00:00:41.960 --> 00:00:43.800 and then, as we proceed together, we'll 00:00:43.800 --> 00:00:45.680 actually apply those steps as we 00:00:45.680 --> 00:00:48.039 troubleshoot together a network. 
So the 00:00:48.039 --> 00:00:49.360 very beginning of this troubleshooting 00:00:49.360 --> 00:00:51.920 process would be to identify the 00:00:51.920 --> 00:00:53.719 problem. Case in point: let's imagine that 00:00:53.719 --> 00:00:55.320 the user who's sitting at this computer 00:00:55.320 --> 00:00:57.559 right here, PC 10, calls the service desk 00:00:57.559 --> 00:00:58.640 or the help desk, or whatever they're 00:00:58.640 --> 00:01:00.359 calling it in your organization, and they 00:01:00.359 --> 00:01:02.440 say, "Yeah, I've got a problem." And then the 00:01:02.440 --> 00:01:04.439 service desk says, "Okay, tell me more." 00:01:04.439 --> 00:01:05.799 And if the user says, "Well, I can't really 00:01:05.799 --> 00:01:08.360 tell you anything," well, we have to kind 00:01:08.360 --> 00:01:10.240 of, you know, narrow down what the problem 00:01:10.240 --> 00:01:12.080 is or at least get what the symptoms are. 00:01:12.080 --> 00:01:13.200 And that's why one of the very first 00:01:13.200 --> 00:01:14.920 steps is to identify the problem. So 00:01:14.920 --> 00:01:16.240 with the identification of the problem, 00:01:16.240 --> 00:01:18.280 the user may say, "I can't access the 00:01:18.280 --> 00:01:19.560 Internet," or they may just say, "The 00:01:19.560 --> 00:01:21.159 network is down." At which point, we would 00:01:21.159 --> 00:01:23.040 ask some additional questions. So let's 00:01:23.040 --> 00:01:24.840 imagine this user says, "I can't 00:01:24.840 --> 00:01:26.840 access anything on the Internet." That 00:01:26.840 --> 00:01:28.200 would fall into this category of 00:01:28.200 --> 00:01:29.479 identifying the problem: this 00:01:29.479 --> 00:01:31.280 user, who normally can access the 00:01:31.280 --> 00:01:32.960 Internet, can no longer access the 00:01:32.960 --> 00:01:34.880 Internet. The second step would be to 00:01:34.880 --> 00:01:37.280 establish a theory regarding why that 00:01:37.280 --> 00:01:39.000 might be happening. 
And so, by leveraging 00:01:39.000 --> 00:01:41.240 a topology like this, we could ask 00:01:41.240 --> 00:01:43.040 ourselves a few questions. For example, is 00:01:43.040 --> 00:01:45.719 this computer powered on? If the 00:01:45.719 --> 00:01:47.119 computer is powered on, does it have an 00:01:47.119 --> 00:01:49.479 IP address? And if it's a DHCP client, did it 00:01:49.479 --> 00:01:51.079 get the right information regarding a 00:01:51.079 --> 00:01:52.880 default gateway and the subnet and all 00:01:52.880 --> 00:01:54.880 that good stuff? And then regarding this 00:01:54.880 --> 00:01:56.840 port--is this port on the switch 00:01:56.840 --> 00:01:58.360 associated with the right VLAN, which is 00:01:58.360 --> 00:02:00.360 VLAN 10? And regarding the trunking 00:02:00.360 --> 00:02:01.640 that's going down from the access layer 00:02:01.640 --> 00:02:04.079 switch to the core--is trunking working, 00:02:04.079 --> 00:02:05.960 and is VLAN 10 being allowed? And then, 00:02:05.960 --> 00:02:07.880 from the default gateway's perspective 00:02:07.880 --> 00:02:09.879 regarding VLAN 10--who's acting as the 00:02:09.879 --> 00:02:11.840 default gateway? Is it core 1 or core 2? 00:02:11.840 --> 00:02:13.080 Or are they using a First Hop 00:02:13.080 --> 00:02:15.000 Redundancy Protocol? And if so, which one 00:02:15.000 --> 00:02:16.959 of these two devices is acting as the 00:02:16.959 --> 00:02:18.879 active device? And does that device 00:02:18.879 --> 00:02:20.560 acting as the default gateway have a 00:02:20.560 --> 00:02:22.599 route out towards the Internet? In simple 00:02:22.599 --> 00:02:24.239 terms, does it know how to forward? And 00:02:24.239 --> 00:02:25.480 the same thing would hold true for this 00:02:25.480 --> 00:02:27.920 router and then this connectivity to our 00:02:27.920 --> 00:02:29.599 service provider. 
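To make that list of questions a bit more concrete, here is a minimal sketch in Python that treats each question as a check and walks them in order from the PC outward, stopping at the first one that fails. Everything in it is an assumption for illustration: the `Check` class, the `first_failure` helper, and the `observed` answers are hypothetical, and in a real network each answer would come from an actual test rather than a hard-coded value.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Check:
    """One question from the theory step, paired with a way to answer it."""
    question: str
    passed: Callable[[], bool]


def first_failure(checks: List[Check]) -> Optional[Check]:
    """Walk the checks in path order and return the first one that fails."""
    for check in checks:
        if not check.passed():
            return check
    return None


# Hypothetical answers for PC 10; these are made up for the example.
observed = {
    "powered_on": True,
    "has_ip": True,
    "dhcp_info_ok": True,
    "access_port_in_vlan_10": True,
    "trunk_allows_vlan_10": True,
    "gateway_has_route": False,
    "edge_router_has_route": True,
}

# Questions ordered from the PC outward toward the service provider.
questions = [
    ("Is PC 10 powered on?", "powered_on"),
    ("Does PC 10 have an IP address?", "has_ip"),
    ("Did DHCP hand out the right gateway and subnet?", "dhcp_info_ok"),
    ("Is the access port in VLAN 10?", "access_port_in_vlan_10"),
    ("Is the trunk up and allowing VLAN 10?", "trunk_allows_vlan_10"),
    ("Does the default gateway have a route toward the Internet?", "gateway_has_route"),
    ("Does the edge router have a route to the provider?", "edge_router_has_route"),
]

# The default argument binds each key now, avoiding late-binding surprises.
checks = [Check(text, lambda k=key: observed[k]) for text, key in questions]

suspect = first_failure(checks)
if suspect is not None:
    print("First failing check:", suspect.question)
```

The ordering is the design choice here: by checking from the host outward, the first failure points at the closest place along the path where a theory is worth testing.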
And also, because we're 00:02:29.599 --> 00:02:31.240 using RFC 1918 00:02:31.240 --> 00:02:33.160 addresses, perhaps network address 00:02:33.160 --> 00:02:34.959 translation is failing or isn't 00:02:34.959 --> 00:02:37.000 implemented correctly. So if, for this user at 00:02:37.000 --> 00:02:39.680 PC 10, by doing a few tests, we verify that 00:02:39.680 --> 00:02:41.640 it can ping its default gateway--and if 00:02:41.640 --> 00:02:43.080 this device in VLAN 10 up here at 00:02:43.080 --> 00:02:45.040 headquarters can ping devices out here 00:02:45.040 --> 00:02:46.480 at Site 2 and Site 3 and has 00:02:46.480 --> 00:02:48.120 reachability there--that can help 00:02:48.120 --> 00:02:50.040 identify what is working, and then we 00:02:50.040 --> 00:02:51.480 can establish a theory about what may 00:02:51.480 --> 00:02:53.440 be specifically causing the problem. And 00:02:53.440 --> 00:02:54.879 then, once we've narrowed it down to what 00:02:54.879 --> 00:02:56.840 we think it might be, the third 00:02:56.840 --> 00:02:58.879 step is to test, which is to basically go 00:02:58.879 --> 00:03:00.680 in and prove your theory. If we think the 00:03:00.680 --> 00:03:03.120 problem is with router one, or if we 00:03:03.120 --> 00:03:04.120 think the problem is with a 00:03:04.120 --> 00:03:05.519 multilayer switch, or we think the 00:03:05.519 --> 00:03:07.120 problem is with the access layer, we want 00:03:07.120 --> 00:03:09.080 to do some testing to validate that what 00:03:09.080 --> 00:03:10.840 we think may be the problem really is 00:03:10.840 --> 00:03:12.599 causing the problem. And then, once we've 00:03:12.599 --> 00:03:14.319 narrowed it down and verified it, we then 00:03:14.319 --> 00:03:16.760 want to go ahead and solve the problem. 00:03:16.760 --> 00:03:19.319 Now, solving the problem in an 00:03:19.319 --> 00:03:21.920 organization also has many steps 00:03:21.920 --> 00:03:23.599 involved with it. 
Let's list a few of 00:03:23.599 --> 00:03:25.879 those as far as the solution to this 00:03:25.879 --> 00:03:27.239 network connectivity problem that the 00:03:27.239 --> 00:03:29.239 user is having out to the Internet. And 00:03:29.239 --> 00:03:31.560 let's also imagine, based on our testing, 00:03:31.560 --> 00:03:32.879 that we believe it's an issue with 00:03:32.879 --> 00:03:35.239 address translation, which could be NAT 00:03:35.239 --> 00:03:37.560 or PAT, but definitely needs to happen at 00:03:37.560 --> 00:03:39.480 some point before that traffic goes out 00:03:39.480 --> 00:03:41.120 to the Internet. So if we've done some 00:03:41.120 --> 00:03:42.400 testing and we've narrowed it down that 00:03:42.400 --> 00:03:44.120 it is an address translation issue, 00:03:44.120 --> 00:03:45.959 regarding solving that, we want to 00:03:45.959 --> 00:03:48.720 create a game plan on exactly how we are 00:03:48.720 --> 00:03:50.640 going to solve that problem. Perhaps with 00:03:50.640 --> 00:03:52.439 network address translation, the NAT device 00:03:52.439 --> 00:03:54.959 was set up to support VLAN 20 with the 00:03:54.959 --> 00:03:57.519 10.12 subnet and other networks like 00:03:57.519 --> 00:03:59.200 this over here at Site 2 and Site 3, 00:03:59.200 --> 00:04:01.200 but maybe perhaps not including the 00:04:01.200 --> 00:04:03.760 10.110 subnet. So we'd want to make a plan 00:04:03.760 --> 00:04:05.720 to correct that. And also, in corporations, 00:04:05.720 --> 00:04:06.879 that's going to involve going through 00:04:06.879 --> 00:04:08.519 change control if we're going to make a 00:04:08.519 --> 00:04:10.439 configuration change. And then, with the 00:04:10.439 --> 00:04:12.079 authorization from the change control 00:04:12.079 --> 00:04:13.519 board, we're going to go ahead and 00:04:13.519 --> 00:04:15.319 implement the change. 
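As a rough illustration of that address translation theory, the sketch below uses Python's standard `ipaddress` module to ask which private inside subnets are not covered by the list of networks the NAT device is set up to translate. The prefixes are placeholders invented for the example, not the actual subnets from this topology, and the `untranslated` and `is_rfc1918` helpers are hypothetical names.

```python
import ipaddress

# The three private blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(subnet):
    """True if the subnet sits entirely inside one of the RFC 1918 blocks."""
    return any(subnet.subnet_of(block) for block in RFC1918_BLOCKS)


def untranslated(inside_subnets, nat_permitted):
    """Return private inside subnets that no NAT source entry covers."""
    missing = []
    for subnet in inside_subnets:
        covered = any(subnet.subnet_of(entry) for entry in nat_permitted)
        if is_rfc1918(subnet) and not covered:
            missing.append(subnet)
    return missing


# Placeholder prefixes (assumptions, not taken from the video's topology).
vlan_10 = ipaddress.ip_network("10.1.10.0/24")
vlan_20 = ipaddress.ip_network("10.1.20.0/24")
site_2 = ipaddress.ip_network("10.2.0.0/16")
site_3 = ipaddress.ip_network("10.3.0.0/16")

inside_subnets = [vlan_10, vlan_20, site_2, site_3]
nat_permitted = [vlan_20, site_2, site_3]  # VLAN 10 was left out

for subnet in untranslated(inside_subnets, nat_permitted):
    print("Not covered by NAT:", subnet)
```

With those placeholder values, the only subnet reported is the VLAN 10 one, which matches the scenario where every other network can reach the Internet and PC 10 cannot.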
And then, when we've 00:04:15.319 --> 00:04:16.880 implemented it, we also want to verify 00:04:16.880 --> 00:04:18.560 that it's working. And that verification 00:04:18.560 --> 00:04:20.000 would involve a few things: number one, 00:04:20.000 --> 00:04:21.519 that we now have connectivity from this 00:04:21.519 --> 00:04:24.040 PC up to the Internet. Also, we'd want to 00:04:24.040 --> 00:04:26.240 verify that we didn't make any other 00:04:26.240 --> 00:04:28.360 changes that would negatively impact our 00:04:28.360 --> 00:04:29.759 environment. Like, we want to make sure that 00:04:29.759 --> 00:04:31.320 everything else still functions as well-- 00:04:31.320 --> 00:04:33.479 VLAN 20 and the other sites--everybody can 00:04:33.479 --> 00:04:35.000 still forward out to the Internet. And 00:04:35.000 --> 00:04:36.199 then we'd also want to make sure we 00:04:36.199 --> 00:04:38.960 document the solution--what we did, how we 00:04:38.960 --> 00:04:41.039 did it. And if we changed the topology in 00:04:41.039 --> 00:04:42.880 some fashion, we'd want to include that 00:04:42.880 --> 00:04:45.039 update in our documentation. So the 00:04:45.039 --> 00:04:46.840 documentation of what was done and also 00:04:46.840 --> 00:04:48.520 the topology if there's been updates-- 00:04:48.520 --> 00:04:50.600 that's super important because, let's say, 00:04:50.600 --> 00:04:52.720 3 or 4 days go by and we have yet 00:04:52.720 --> 00:04:54.720 another problem. And we think, "Oh, I wonder 00:04:54.720 --> 00:04:57.280 if what we changed here injected 00:04:57.280 --> 00:04:58.919 additional problems into the network." So 00:04:58.919 --> 00:05:00.360 we could go back through our paper trail 00:05:00.360 --> 00:05:02.240 and identify what happened, when it 00:05:02.240 --> 00:05:03.960 happened, what was changed. 
That can 00:05:03.960 --> 00:05:05.120 help speed up our troubleshooting 00:05:05.120 --> 00:05:07.520 because a lot of times, there are 00:05:07.520 --> 00:05:09.080 cabling issues and physical issues and 00:05:09.080 --> 00:05:10.880 so forth, but a lot of times when 00:05:10.880 --> 00:05:12.560 something breaks on the network--when 00:05:12.560 --> 00:05:14.919 something stops working--it's quite often 00:05:14.919 --> 00:05:17.520 due to the last change that was made. 00:05:17.520 --> 00:05:18.800 So if we go back and take a look at the 00:05:18.800 --> 00:05:20.520 last change or two, that can help us 00:05:20.520 --> 00:05:22.400 reduce our troubleshooting time by 00:05:22.400 --> 00:05:23.919 either confirming that what was done is 00:05:23.919 --> 00:05:26.360 not impacting our current problem or by 00:05:26.360 --> 00:05:28.759 verifying that what was done indeed is 00:05:28.759 --> 00:05:30.520 impacting our current network. And then 00:05:30.520 --> 00:05:32.880 the last step here is to go ahead and 00:05:32.880 --> 00:05:36.139 repeat this process for the next problem. 00:05:36.139 --> 00:05:37.880 So the next service call that 00:05:37.880 --> 00:05:40.280 comes in, the next issue, the next problem-- 00:05:40.280 --> 00:05:41.880 again, we're going to follow this logical 00:05:41.880 --> 00:05:43.680 plan. So what I think would be fun to do 00:05:43.680 --> 00:05:45.800 is let's take this network topology, 00:05:45.800 --> 00:05:47.160 which we've been playing on and off with 00:05:47.160 --> 00:05:48.840 throughout these videos, and what I'll do 00:05:48.840 --> 00:05:50.800 is I will inject a problem somewhere in 00:05:50.800 --> 00:05:52.440 this mix, and then we can go through 00:05:52.440 --> 00:05:53.880 these steps one at a time in this 00:05:53.880 --> 00:05:55.960 troubleshooting methodology. And as we do 00:05:55.960 --> 00:05:57.720 so, we'll go into more details on each 00:05:57.720 --> 00:05:59.800 one. 
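Coming back to the paper trail for a moment, here is one way it could look if the change records were kept as structured data: a small Python sketch that pulls out the changes made in the last few days, newest first, so the most recent change gets looked at first. The `ChangeRecord` fields, the `recent_changes` helper, and the sample entries are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class ChangeRecord:
    """One documented change: when it happened, where, and what was done."""
    when: datetime
    device: str
    summary: str


def recent_changes(log: List[ChangeRecord], now: datetime,
                   window: timedelta) -> List[ChangeRecord]:
    """Return changes made within `window` before `now`, newest first."""
    cutoff = now - window
    recent = [change for change in log if cutoff <= change.when <= now]
    return sorted(recent, key=lambda change: change.when, reverse=True)


# Hypothetical log entries; device names and dates are invented.
log = [
    ChangeRecord(datetime(2024, 3, 1, 9, 0), "access-switch", "Replaced uplink cable"),
    ChangeRecord(datetime(2024, 3, 10, 14, 30), "edge-router", "Added VLAN 10 subnet to NAT"),
    ChangeRecord(datetime(2024, 3, 12, 8, 15), "core-1", "Adjusted allowed VLANs on trunk"),
]

now = datetime(2024, 3, 13, 10, 0)
for change in recent_changes(log, now, timedelta(days=4)):
    print(change.when.isoformat(), change.device, change.summary)
```

Sorting newest first reflects the point made above: when something stops working, the last change or two is usually the quickest thing to confirm or rule out.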
So, in the very next video, join me as 00:05:59.800 --> 00:06:01.600 we take a look at this first stage in 00:06:01.600 --> 00:06:03.400 the troubleshooting methodology, and that 00:06:03.400 --> 00:06:05.240 is identifying the problem, which we'll 00:06:05.240 --> 00:06:07.400 do in this network topology. So I'll see 00:06:07.400 --> 00:06:10.000 you in that video in just a moment. 00:06:10.000 --> 00:06:11.599 Hey, thanks for watching, and subscribe right 00:06:11.599 --> 00:06:13.440 here to get the latest information from 00:06:13.440 --> 00:06:15.520 CBT Nuggets. And if you're new to or 00:06:15.520 --> 00:06:17.440 considering a career in the world of IT, 00:06:17.440 --> 00:06:20.640 head on over to CBT Nuggets and sign up for a free trial.