Measures of Variability (Range, Standard Deviation, Variance)

0:00 - 0:02

PROFESSOR: In this
video, we're going
0:02 - 0:04

to learn about measures
of variability,
0:04 - 0:06

another form of
descriptive statistics
0:06 - 0:08

that people often want to
know in addition to measures
0:08 - 0:09

of central tendency.
0:09 - 0:12

But before we get to any of
the nitty gritty details,
0:12 - 0:16

I want to motivate why we
need measures of variability
0:16 - 0:17

with two examples.
0:17 - 0:20

So here's two different
data sets, one on the top
0:20 - 0:21

and one on the bottom.
0:21 - 0:24

I'll just go ahead and tell you
that the mean for both data sets
0:24 - 0:25

is 87.
0:25 - 0:28

Now if I were to just tell
you the mean of these data,
0:28 - 0:30

I would be misleading
you a little bit,
0:30 - 0:33

because in reality, the
situation in each data set
0:33 - 0:34

is quite different.
0:34 - 0:36

If I were to plot
it out, for example,
0:36 - 0:37

you would see this
difference clearly.
0:37 - 0:39

In the top data
set, all the scores
0:39 - 0:41

are very clustered together.
0:41 - 0:42

Everything is close.
0:42 - 0:45

But in the bottom data set,
scores are very spread out.
0:45 - 0:48

So again, I need some way to
quantify these differences.
0:48 - 0:51

And a measure of central
tendency, like the mean,
0:51 - 0:53

simply can't capture that alone.
0:53 - 0:55

Here's another example.
0:55 - 0:57

Let's say you're working for
a pharmaceutical company,
0:57 - 0:58

something like that.
0:58 - 1:01

And you need to decide between
two different medications
1:01 - 1:02

for depression.
1:02 - 1:06

We'll call them medication
A and medication B. So
1:06 - 1:09

let's say you did a
study where you measured
1:09 - 1:11

how much improvement
happened when people took one
1:11 - 1:14

over the other, and
this is what you got.
1:14 - 1:16

So let's say over here
that higher scores mean
1:16 - 1:18

more improvement
and lower scores
1:18 - 1:20

mean little to no improvement.
1:20 - 1:21

Well, let's compare.
1:21 - 1:23

The means in this
case are the same.
1:23 - 1:27

In both cases, people improved
by about 10-ish points or so.
1:27 - 1:29

But the variability
is very different.
1:29 - 1:32

On the left, some people
benefited very greatly,
1:32 - 1:35

whereas others really
didn't benefit at all.
1:35 - 1:37

But on the right, everyone
benefits a good amount.
1:37 - 1:40

In this case, I would
personally pick medication B
1:40 - 1:42

because it's more consistent.
1:42 - 1:45

And so this is an example of
why knowing the variability
1:45 - 1:50

might help us to make
some real-life decisions.
1:50 - 1:54

So in general in statistics,
measures of variability
1:54 - 1:57

are ways to describe these
differences statistically.
1:57 - 2:00

They describe how scores
in a given data set
2:00 - 2:02

differ from one another.
2:02 - 2:04

And they capture things
like how spread out
2:04 - 2:05

or how clustered
together the points
2:05 - 2:07

are, things we've been
looking at already.
2:07 - 2:10

So there are three that
we're going to talk about.
2:10 - 2:13

We have the range, standard
deviation, and variance.
2:13 - 2:15

Let's start with the range.
2:15 - 2:18

The range is nice because
it's a really simple measure
2:18 - 2:22

of variability, of dispersion,
of how spread out points are.
2:22 - 2:25

It can often be calculated
in 5 or 10 seconds.
2:25 - 2:27

Here's the formula.
2:27 - 2:29

So we have the range, r.
2:29 - 2:30

Don't get confused
later on when we
2:30 - 2:35

learn about correlations, which
are often also described by r.
2:35 - 2:36

We'll use some
different subscripts
2:36 - 2:39

to make that difference
clear when the time comes.
2:39 - 2:40

But for now, range is r.
2:40 - 2:44

And then we have r
equals h minus l.
2:44 - 2:46

h means the highest
score in the data set,
2:46 - 2:49

l means the lowest
score in the data set.
2:49 - 2:51

So you can see that this is
a very simple calculation.
2:51 - 2:53

And if we go back
to the example we
2:53 - 2:55

were working with
a minute ago, we
2:55 - 2:58

can calculate the
range very quickly.
2:58 - 3:01

So for the first data set,
we have 95 minus 80.
3:01 - 3:02

So the range is 15.
3:02 - 3:06

And in the second data set,
we have 150 minus 25,
3:06 - 3:09

giving us a much
larger range of 125.
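
The range calculation above can be sketched in a few lines of Python. The individual scores here are hypothetical: the video only gives the highest score, the lowest score, and the mean of each data set, so these values were made up to match those three numbers.

```python
# Hypothetical scores: only the highs, lows, and the mean of 87 come from the video.
clustered = [80, 84, 86, 88, 89, 95]
spread_out = [25, 60, 85, 100, 102, 150]


def mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


def score_range(scores):
    """Range R = H - L: highest score minus lowest score."""
    return max(scores) - min(scores)


for name, scores in [("clustered", clustered), ("spread out", spread_out)]:
    print(f"{name}: mean = {mean(scores)}, range = {score_range(scores)}")
```

Both lists have a mean of 87, but the ranges come out as 15 and 125.
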
3:09 - 3:13

So in this case, I would do
well to report both to you.
3:13 - 3:16

I'll tell you the mean and
this measure of variability,
3:16 - 3:19

because that gives you a more
full picture of what's going on.
3:19 - 3:22

So a mean of 87
and a range of 15
3:22 - 3:25

describes a very
different situation
3:25 - 3:29

compared to a mean of
87 and a range of 125.
3:29 - 3:31

So again, it's a great
idea for me to report both.
3:31 - 3:33

And this is what's often done.
3:33 - 3:36

A big limitation of
the range, though,
3:36 - 3:38

is that by using it,
even though it's simple
3:38 - 3:40

and it's pretty
effective, you might
3:40 - 3:43

miss a little bit of the data,
a little bit of the information
3:43 - 3:44

in your data set.
3:44 - 3:46

Let me show you an
example to illustrate.
3:46 - 3:48

Here's a data set here.
3:48 - 3:50

Although these bars
are quite high,
3:50 - 3:54

there's really just one
sort of value in each bar.
3:54 - 3:56

So we have one person who
scored a 30, one person who
3:56 - 3:58

scored a 40, and so on.
3:58 - 4:00

Now the range here is 120.
4:00 - 4:03

It's 150 minus 30.
4:03 - 4:05

But let's look at
a second data set.
4:05 - 4:07

In this case, the
range is still 120
4:07 - 4:10

because our highest and
lowest values are the same.
4:10 - 4:13

But everybody is
kind of over here,
4:13 - 4:15

and there's just a couple
outliers beyond that.
4:15 - 4:17

So again, if I were to
just tell you the range,
4:17 - 4:19

I might be misleading you a
little bit because you're not
4:19 - 4:22

sure if it looks
like this on the left
4:22 - 4:24

or if the data looks
like this on the right.
4:24 - 4:26

And this is where standard
deviation and variance
4:26 - 4:28

come into play.
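
To make the limitation concrete, here is a minimal sketch with two made-up data sets that share a range of 120. The scores are assumptions for illustration, not the values plotted in the video; `statistics.pstdev` is the population standard deviation from Python's standard library.

```python
import statistics

# Hypothetical scores; both sets run from 30 to 150, as in the video.
evenly_spread = list(range(30, 151, 10))  # one person at 30, 40, ..., 150
mostly_clustered = [30, 85, 88, 90, 90, 90, 92, 92, 95, 95, 98, 100, 150]

for name, scores in [("evenly spread", evenly_spread),
                     ("mostly clustered", mostly_clustered)]:
    score_range = max(scores) - min(scores)
    sd = statistics.pstdev(scores)  # population standard deviation
    print(f"{name}: range = {score_range}, standard deviation = {sd:.1f}")
```

The range is identical for both sets, while the standard deviation is noticeably smaller for the clustered one.
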
4:28 - 4:31

Standard deviation, just
like the name suggests,
4:31 - 4:34

describes the standard
or typical amount
4:34 - 4:38

that scores deviate from the
mean, hence standard deviation.
4:38 - 4:40

Now we'll get into exactly
what this looks like
4:40 - 4:43

once we learn to calculate
standard deviation.
4:43 - 4:46

But I just want to show
you some symbols for now.
4:46 - 4:49

So like with means, we
have different symbols
4:49 - 4:53

to describe population standard
deviation versus sample standard
4:53 - 4:54

deviation.
4:54 - 4:57

Population standard deviation
is described by sigma.
4:57 - 5:01

It's this sort of O with Elvis
hair, I like to think of it as,
5:01 - 5:03

not to be confused
with this sigma, which
5:03 - 5:06

is a capital S.
Unfortunately, they're
5:06 - 5:09

named the same thing. The capital sigma
means take the sum of.
5:09 - 5:11

We learned about
that previously.
5:11 - 5:14

This is sigma with a little s.
5:14 - 5:17

So for a sample,
standard deviation
5:17 - 5:19

is simply described by s.
5:19 - 5:21

So I want to take a
step back and talk
5:21 - 5:24

about why standard
deviations are really useful.
5:24 - 5:28

Whenever you have a normal
curve, a normally distributed
5:28 - 5:31

set of data, which is very
common in the world, things
5:31 - 5:34

like height, weight, and so on
are all normally distributed,
5:34 - 5:37

standard deviations have this
really interesting property
5:37 - 5:40

of telling you a lot of
information about what's common
5:40 - 5:42

and what's uncommon.
5:42 - 5:45

So if we have 0, this is right
at the mean of whatever we're
5:45 - 5:46

talking about.
5:46 - 5:47

This is the mean.
5:47 - 5:50

0 standard deviations away
from the mean is right here.
5:50 - 5:51

You're right at the mean.
5:51 - 5:54

We can look at one standard
deviation above the mean and one
5:54 - 5:57

standard deviation below,
and we automatically know, just
5:57 - 6:00

because of how standard
deviations work,
6:00 - 6:03

that 68% of people will
fall within this range.
6:03 - 6:04

We can go beyond that.
6:04 - 6:06

We know that between
two standard deviations
6:06 - 6:10

in either direction of the mean,
95% of people will be contained.
6:10 - 6:14

And 3, you're getting really
extreme, really far out, really
6:14 - 6:18

rare, 99.7% of the
data will be contained
6:18 - 6:20

within three standard
deviations in either direction
6:20 - 6:22

from the mean.
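
The 68, 95, and 99.7 figures can be checked numerically. This sketch assumes an exactly normal distribution and uses the standard identity that the share within k standard deviations of the mean is erf(k / sqrt(2)); the function name `share_within` is just an illustrative choice.

```python
import math


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The results are roughly 68.3%, 95.4%, and 99.7%, so the 68 and 95 quoted above are rounded.
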
6:22 - 6:25

To illustrate this a little bit
more, let's talk some specifics.
6:25 - 6:28

So let's say I'm
looking at IQ scores.
6:28 - 6:30

We know a lot about IQ scores.
6:30 - 6:33

We know, for example, that the
population mean of IQ is 100.
6:33 - 6:37

And we know that the population
standard deviation, sigma,
6:37 - 6:38

is 15.
6:38 - 6:41

So let's go ahead and draw
that same sort of normal curve.
6:41 - 6:44

We know that intelligence
is normally distributed.
6:44 - 6:46

And let's take a look
at what information
6:46 - 6:48

we have just by knowing
standard deviation.
6:48 - 6:51

So average IQ is
right here at 100.
6:51 - 6:55

One standard deviation
above the mean would be 115.
6:55 - 6:59

Two standard deviations
above the mean would be 130.
6:59 - 7:02

And three standard
deviations would be 145.
7:02 - 7:04

And we could do the same
in the opposite direction.
7:04 - 7:08

One standard deviation below
the mean of intelligence is 85.
7:08 - 7:10

Two standard
deviations below is 70.
7:10 - 7:13

And three standard deviations
below the mean of intelligence
7:13 - 7:14

is 55.
7:14 - 7:18

So again, I automatically
know 68% of people
7:18 - 7:22

will fall between
an IQ of 85 and 115.
7:22 - 7:26

I also know that 95%
of people will fall
7:26 - 7:28

between an IQ of 70 and 130.
7:28 - 7:33

And finally, that
99.7% or so will fall
7:33 - 7:36

between an IQ of 55 and 145.
7:36 - 7:39

So this is great to know
because if you tell me
7:39 - 7:42

you have an IQ of 146,
I'm really impressed.
7:42 - 7:44

This is rare.
7:44 - 7:45

This is very extreme.
7:45 - 7:50

But if you tell me you have
an IQ of, say, 106, something
7:50 - 7:52

like that, that's fine.
7:52 - 7:53

Good for you.
7:53 - 7:54

Not very impressed.
7:54 - 7:57

So knowing standard
deviations helps
7:57 - 8:01

you to get this extra
information about a data set.
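
The IQ example can be reproduced with `statistics.NormalDist` from Python's standard library. This is a sketch that assumes IQ follows an exactly normal curve with the mean of 100 and standard deviation of 15 given above; real test scores only approximate that.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # mean and standard deviation from the video

# Score bands one, two, and three standard deviations around the mean.
for k in (1, 2, 3):
    low = iq.mean - k * iq.stdev
    high = iq.mean + k * iq.stdev
    share = iq.cdf(high) - iq.cdf(low)
    print(f"{low:.0f} to {high:.0f}: {share:.1%} of people")

# How unusual are the two scores mentioned above?
for score in (106, 146):
    z = (score - iq.mean) / iq.stdev  # distance from the mean in standard deviations
    above = 1 - iq.cdf(score)         # share of people scoring higher
    print(f"IQ {score}: z = {z:.2f}, share scoring higher = {above:.2%}")
```

A score of 106 sits less than half a standard deviation above the mean, while 146 is just over three standard deviations out, which is why only a tiny share of people score higher.
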
8:01 - 8:03

So finally, we have variance.
8:03 - 8:05

Variance is very simple.
8:05 - 8:08

It's just the square
of standard deviation.
8:08 - 8:13

So it's the average squared
deviation from the mean.
8:13 - 8:16

Unfortunately, for variance,
it doesn't get its own symbols.
8:16 - 8:18

We just take the
symbols we already
8:18 - 8:19

have for standard deviation.
8:19 - 8:21

And we put a squared
because it's just
8:21 - 8:23

squared standard deviation.
8:23 - 8:26

So here, for a population,
we would call the variance
8:26 - 8:28

in a population sigma squared.
8:28 - 8:34

And for a sample, we would call
the sample variance, s squared.
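
A quick way to see that variance is just squared standard deviation is to compute both with Python's `statistics` module. The scores are hypothetical, and the variable names simply mirror the symbols above.

```python
import math
import statistics

scores = [80, 84, 86, 88, 89, 95]  # hypothetical scores

# Treating the scores as a whole population: sigma and sigma squared.
sigma = statistics.pstdev(scores)
sigma_squared = statistics.pvariance(scores)

# Treating the same scores as a sample: s and s squared.
s = statistics.stdev(scores)
s_squared = statistics.variance(scores)

print(math.isclose(sigma ** 2, sigma_squared))  # True
print(math.isclose(s ** 2, s_squared))          # True
```
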
8:34 - 8:36

So in the next
video, we'll learn
8:36 - 8:38

how to calculate
some of these things.
8:38 - 8:40

But I want to at least highlight
some of the formulas you're
8:40 - 8:41

going to see.
8:41 - 8:43

So we have four
different formulas
8:43 - 8:46

because we have standard
deviation and variance.
8:46 - 8:48

And we have the population
versions and the sample
8:48 - 8:50

statistic versions.
8:50 - 8:53

So for standard deviation
in the population,
8:53 - 8:54

this is our formula.
8:54 - 8:56

Notice we have
sigma on the left.
8:56 - 8:59

And we have all this mess,
which I'll get into next time.
8:59 - 9:01

One thing I'll mention is that
for all of these formulas,
9:01 - 9:06

the numerator is called
the sums of squares, SS.
9:06 - 9:08

And we're going to learn
about what the sums of squares
9:08 - 9:10

really means in the next video.
9:10 - 9:13

But for now, just
keep that in mind.
9:13 - 9:15

So for our sample
statistic, we have this.
9:15 - 9:17

You're going to see
an s on the left here.
9:17 - 9:19

And it's going to have
some similarities,
9:19 - 9:21

but you're going to notice a
difference or two that we'll
9:21 - 9:22

talk about in the next video.
9:22 - 9:24

For variance, we
have sigma squared.
9:24 - 9:30

And for the sample statistic version
of variance, we have s squared.
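
The transcript does not reproduce the formulas shown on screen, so the sketch below uses the standard textbook definitions as an assumption: the sum of squares (SS) is divided by N for a population and by n - 1 for a sample, and the standard deviation is the square root of the variance. The scores are hypothetical, and the notation in the video may differ slightly.

```python
import math

scores = [80, 84, 86, 88, 89, 95]  # hypothetical scores
n = len(scores)
mean = sum(scores) / n

# Sum of squares (SS): squared deviations from the mean, added up.
ss = sum((x - mean) ** 2 for x in scores)

pop_variance = ss / n                   # sigma squared: SS divided by N
pop_sd = math.sqrt(pop_variance)        # sigma
sample_variance = ss / (n - 1)          # s squared: SS divided by n - 1
sample_sd = math.sqrt(sample_variance)  # s

print(f"SS = {ss}")
print(f"population: variance = {pop_variance:.2f}, SD = {pop_sd:.2f}")
print(f"sample:     variance = {sample_variance:.2f}, SD = {sample_sd:.2f}")
```

All four quantities share the same numerator, SS; only the denominator and the square root change.
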

Title:: Measures of Variability (Range, Standard Deviation, Variance)
Video Language:: English
Duration:: 09:30

TTU_OAL edited English subtitles for Measures of Variability (Range, Standard Deviation, Variance)
