And welcome back. Not every journey to
the world of troubleshooting ends the
same way. Some things are easier to
troubleshoot than others. Also, if you
and I have been working on one large
network and we know it like the back of our
hand, it's a lot easier to troubleshoot
because we know what the subnets are and
the interfaces involved. On the
other hand, if we go to a brand new
network or we're doing consulting, it may
take some warm-up time to get used to it
and to figure out where everything is on
that specific customer's network. And when
we're doing troubleshooting--again,
whether it's our own network that we
know really well or it's a new network
that we are just introduced to--if we
have a certain process or methodology
for troubleshooting, we can apply that
methodology across the board. So let's
have some fun with this. We'll start with an
overview of the high-level steps
in a troubleshooting methodology,
and then, as we proceed together, we'll
actually apply those steps as we
troubleshoot a network together.

So the
very beginning of this troubleshooting
process would be to identify the
problem. Case in point: let's imagine that
the user who's sitting at this computer
right here, PC 10, calls the service desk
or the help desk, or whatever they're
calling it in your organization, and they
say, "Yeah, I've got a problem." And then the
service desk says, "Okay, tell me more."
And if the user says, "Well, I can't really
tell you anything," then we have to
narrow down what the problem
is or at least get what the symptoms are.
And that's why one of the very first
steps is to identify the problem. So
with the identification of the problem,
the user may say, "I can't access the
Internet," or they may just say, "The
network is down." At which point, we would
ask some additional questions. So let's
imagine this user says, "I can't
access anything on the Internet." That
would fall into this category of
identifying the problem: this
user, who normally can access the
Internet, can no longer access the
Internet.

The second step would be to
establish a theory regarding why that
might be happening. And so, by leveraging
a topology like this, we could ask
ourselves a few questions. For example, is
this computer powered on? If the
computer is powered on, does it have an
IP address? And did the DHCP client
get the right information regarding a
default gateway and the subnet and all
that good stuff? And then regarding this
port--is this port on the switch
associated with the right VLAN, which is
VLAN 10? And regarding the trunking
going from the access layer
switch to the core: is trunking working,
and is VLAN 10 being allowed? And then,
from the default gateway's perspective
regarding VLAN 10--who's acting as the
default gateway? Is it core 1 or core 2?
Or are they using a First Hop
Redundancy Protocol? And if so, which one
of these two devices is acting as the
active device? And does that device
acting as the default gateway have a
route out towards the Internet? In simple
terms, does it know how to forward? And
the same thing would hold true for this
router and then this connectivity to our
service provider. And also, because we're
using RFC 1918
addresses, perhaps network address
translation is failing or isn't
implemented correctly. So if, by doing a
few tests, we verify that PC 10
can ping its default gateway, and if
this device in VLAN 10 up here at
headquarters can ping devices out here
at Site 2 and Site 3 and has
reachability there, that can help
identify what is working, and then we
can establish a theory about what may
be specifically causing the problem.

And then, once we've narrowed it down to what
we think it might be, the third
step is to test that theory, which is basically to go
in and prove it. If we think the
problem is with router one, or if we
think the problem is with a
multilayer switch, or we think the
problem is with the access layer, we want
to do some testing to validate that what
we think may be the problem really is
causing the problem.

And then, once we've
narrowed it down and verified it, we then
want to go ahead and solve the problem.
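
As a rough illustration of testing one of the theories above, here is a minimal Python sketch that checks the address translation theory on paper. The three RFC 1918 blocks are standard, but everything else is an assumption made up for the example: the PC 10 address, the list of subnets the NAT device translates, and the function names are hypothetical and are not taken from the topology in the video.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical values, for illustration only.
NAT_SOURCES = ["10.0.20.0/24", "10.0.30.0/24"]  # subnets the NAT device translates
PC10 = "10.0.10.50"                             # address of the affected PC


def needs_translation(host):
    """True if the host uses a private address that must be translated."""
    addr = ipaddress.ip_address(host)
    return any(addr in block for block in RFC1918_BLOCKS)


def covered_by_nat(host, nat_sources):
    """True if the host falls inside one of the translated source subnets."""
    addr = ipaddress.ip_address(host)
    return any(addr in ipaddress.ip_network(src) for src in nat_sources)


def nat_theory_holds(host, nat_sources):
    """The theory: the host needs translation but no NAT source covers it."""
    return needs_translation(host) and not covered_by_nat(host, nat_sources)


if __name__ == "__main__":
    print("NAT theory holds for PC 10:", nat_theory_holds(PC10, NAT_SOURCES))
```

A check like this only supports or weakens the theory. The actual test is still done on the devices themselves.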
Now, solving the problem in an
organization also has many steps
involved with it. Let's list a few of
those as they apply to the solution to this
network connectivity problem that the
user is having out to the Internet. And
let's also imagine, based on our testing,
that we believe it's an issue with
address translation, which could be NAT
or PAT, but translation definitely needs to happen at
some point before that traffic goes out
to the Internet. So if we've done some
testing and we've narrowed it down to
an address translation issue,
then to solve it, we want to
create a game plan for exactly how we are
going to solve that problem. Perhaps with
network address translation, the NAT device
was set up to support VLAN 20 with the
10.12 subnet and other networks like
this over here at Site 2 and Site 3,
but perhaps not including the
10.110 subnet. So we'd want to make a plan
to correct that. And also, in corporations,
that's going to involve going through
change control if we're going to make a
configuration change. And then, with the
authorization from the change control
board, we're going to go ahead and
implement the change. And then, when we've
implemented it, we also want to verify
that it's working. And that verification
would involve a few things: number one,
that we now have connectivity from this
PC up to the Internet. Also, we'd want to
verify that we didn't make any other
changes that would negatively impact our
environment. We want to make sure that
everything else still functions as well--
VLAN 20 and the other sites--everybody can
still forward out to the Internet. And
then we'd also want to make sure we
document the solution--what we did, how we
did it. And if we changed the topology in
some fashion, we'd want to include that
update in our documentation. So the
documentation of what was done and also
the topology if there have been updates--
that's super important because, let's say,
3 or 4 days go by and we have yet
another problem. And we think, "Oh, I wonder
if what we changed here injected
additional problems into the network." So
we could go back through our paper trail
and identify what happened, when it
happened, what was changed. That can
help speed up our troubleshooting
because, sure, sometimes there are
cabling issues and physical issues and
so forth, but a lot of times when
something breaks on the network--when
something stops working--it's
due to the last change that was made.
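
As a rough illustration of that paper trail, here is a minimal Python sketch that pulls the changes made shortly before a new problem was reported, newest first. The change records, device names, dates, and the `recent_changes` helper are all hypothetical. A real organization would pull this from its change control system.

```python
from datetime import datetime, timedelta

# Hypothetical change records, for illustration only.
CHANGE_LOG = [
    {"when": datetime(2024, 3, 1, 9, 0), "device": "ACCESS1",
     "summary": "Moved a port into VLAN 20"},
    {"when": datetime(2024, 3, 5, 14, 30), "device": "R1",
     "summary": "Updated the NAT source list"},
    {"when": datetime(2024, 3, 7, 8, 15), "device": "CORE1",
     "summary": "Adjusted trunk allowed VLANs"},
]

# Hypothetical time the new problem was reported.
INCIDENT = datetime(2024, 3, 8, 10, 0)


def recent_changes(log, incident_time, window_days=7):
    """Return changes made within window_days before the incident, newest first."""
    cutoff = incident_time - timedelta(days=window_days)
    hits = [c for c in log if cutoff <= c["when"] <= incident_time]
    return sorted(hits, key=lambda c: c["when"], reverse=True)


if __name__ == "__main__":
    for change in recent_changes(CHANGE_LOG, INCIDENT, window_days=4):
        print(change["when"], change["device"], change["summary"])
```
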
So if we go back and take a look at the
last change or two, that can help us
reduce our troubleshooting time by
either confirming that what was done is
not impacting our current problem or by
verifying that what was done indeed is
impacting our current network.

And then
the last step here is to go ahead and
repeat this process for the next problem.
So the next service call that
comes in, the next issue, the next problem--
again, we're going to follow this logical
plan.

So what I think would be fun to do
is to take this network topology,
which we've been playing with on and off
throughout these videos, and what I'll do
is I will inject a problem somewhere in
this mix, and then we can go through
these steps one at a time in this
troubleshooting methodology. And as we do
so, we'll go into more detail on each
one. So, in the very next video, join me as
we take a look at this first stage in
the troubleshooting methodology, and that
is identifying the problem, which we'll
do in this network topology. So I'll see
you in that video in just a moment.