
Conditions for inference on slope | More on regression | AP Statistics | Khan Academy

  • 0:00 - 0:02
    - [Instructor] In a previous
    video, we began to think about
  • 0:02 - 0:05
    how we can use a regression
    line and, in particular,
  • 0:05 - 0:08
    the slope of a regression
    line based on sample data,
  • 0:08 - 0:11
    how we can use that in
    order to make inference
  • 0:11 - 0:16
    about the slope of the true
    population regression line.
  • 0:16 - 0:18
    In this video, what we're
    going to think about is:
  • 0:18 - 0:20
    what are the conditions for inference
  • 0:20 - 0:23
    when we're dealing with regression lines?
  • 0:23 - 0:25
    And these are going to be, in some ways,
  • 0:25 - 0:27
    similar to the conditions for inference
  • 0:27 - 0:30
    that we thought about when we
    were doing hypothesis testing
  • 0:30 - 0:34
    and confidence intervals for
    means and for proportions,
  • 0:34 - 0:37
    but there's also going to
    be a few new conditions.
  • 0:37 - 0:40
    So to help us remember these conditions,
  • 0:40 - 0:45
    you might want to think about
    the LINER acronym, L-I-N-E-R.
  • 0:47 - 0:50
    And if it isn't obvious to
    you, this almost is linear.
  • 0:50 - 0:53
    Liner, if it had an A, it would be linear.
  • 0:53 - 0:55
    And this is valuable because, remember,
  • 0:55 - 0:57
    we're thinking about linear regression.
  • 0:57 - 1:01
    So the L right over here
    actually does stand for linear.
  • 1:01 - 1:05
    And here, the condition is
    that the actual relationship
  • 1:05 - 1:09
    in the population between
    your x and y variables
  • 1:09 - 1:11
    actually is a linear relationship,
  • 1:11 - 1:26
    so actual linear relationship
    between x and y.
  • 1:26 - 1:29
    Now, in a lot of cases, you
    might just have to assume
  • 1:29 - 1:31
    that this is going to be
    the case when you see it on
  • 1:31 - 1:34
    an exam, like an AP exam, for example.
  • 1:34 - 1:36
    They might say, hey, assume
    this condition is met.
  • 1:36 - 1:38
    Oftentimes, it'll say assume all
  • 1:38 - 1:39
    of these conditions are met.
  • 1:39 - 1:41
    They just want you to maybe
    know about these conditions.
  • 1:41 - 1:43
    But this is something to think about.
  • 1:43 - 1:46
    If the underlying
    relationship is nonlinear,
  • 1:46 - 1:47
    well, then maybe some of your
  • 1:47 - 1:50
    inferences might not be as robust.
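The linear condition described above can be sketched in code. Below is a hypothetical Python simulation (the population parameters, noise level, and function names are all invented for illustration, not from the video): we build a population that really does follow a linear model, draw a sample, and check that the least-squares slope lands near the true slope.

```python
# Hypothetical sketch of the linear condition: assume the population follows
# y = beta0 + beta1 * x + noise, then fit a least-squares slope to a sample.
# All parameter values here are invented for illustration.
import random
import statistics

random.seed(1)
BETA0, BETA1 = 2.0, 0.5  # assumed true population intercept and slope


def sample_xy(n):
    """Draw n (x, y) pairs from the assumed linear population model."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [BETA0 + BETA1 * x + random.gauss(0, 1) for x in xs]
    return xs, ys


def ls_slope(xs, ys):
    """Least-squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = sum((x - xbar) ** 2 for x in xs)
    return num / den


xs, ys = sample_xy(500)
print(round(ls_slope(xs, ys), 2))  # should land near the true slope of 0.5
```

If the true relationship were nonlinear, the sample slope would still compute, but it would no longer estimate a meaningful population slope, which is why inferences become less robust.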
  • 1:50 - 1:53
    Now, the next one is
    one we have seen before
  • 1:53 - 1:56
    when we're talking about general
    conditions for inference,
  • 1:56 - 2:00
    and this is the
    independence condition.
  • 2:00 - 2:02
    And there's a couple of
    ways to think about it.
  • 2:02 - 2:04
    Either individual observations
  • 2:04 - 2:06
    are independent of each other.
  • 2:06 - 2:09
    So you could be sampling with replacement.
  • 2:09 - 2:12
    Or you could be thinking
    about your 10% rule,
  • 2:12 - 2:13
    that we have done when we thought about
  • 2:13 - 2:18
    the independence condition
    for proportions and for means,
  • 2:18 - 2:20
    where we would need to feel confident
  • 2:20 - 2:24
    that the size of our
    sample is no more than 10%
  • 2:24 - 2:26
    of the size of the population.
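The 10% rule mentioned above is simple enough to express directly. This is a minimal sketch (the function name and example numbers are invented): the sample should be at most 10% of the population so that sampling without replacement still leaves observations approximately independent.

```python
# Sketch of the 10% rule for the independence condition: when sampling
# without replacement, the sample size should be at most 10% of the
# population size. Function name and figures are illustrative.
def meets_ten_percent_rule(sample_size, population_size):
    return sample_size <= 0.10 * population_size


print(meets_ten_percent_rule(50, 1000))   # True: 50 is at most 10% of 1000
print(meets_ten_percent_rule(200, 1000))  # False: 200 exceeds 10% of 1000
```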
  • 2:26 - 2:28
    Now, the next one is the normal condition,
  • 2:28 - 2:30
    which we have talked about
    when we were doing inference
  • 2:30 - 2:33
    for proportions and for means.
  • 2:33 - 2:35
    although it means something a
    little bit more sophisticated
  • 2:35 - 2:38
    when we're dealing with a regression.
  • 2:38 - 2:40
    The normal condition, and, once again,
  • 2:40 - 2:42
    many times people just
    say assume it's been met.
  • 2:42 - 2:44
    But let me actually
    draw a regression line,
  • 2:44 - 2:45
    but do it with a little perspective,
  • 2:45 - 2:47
    and I'm gonna add a third dimension.
  • 2:47 - 2:48
    Let's say that's the x-axis,
  • 2:48 - 2:50
    and let's say this is the y-axis.
  • 2:50 - 2:55
    And the true population
    regression line looks like this.
  • 2:55 - 2:57
    And so the normal condition tells us
  • 2:57 - 3:00
    that, for any given x
    in the true population,
  • 3:01 - 3:07
    the distribution of y's that
    you would expect is normal.
  • 3:07 - 3:09
    So let me see if I can
    draw a normal distribution
  • 3:09 - 3:11
    for the y's,
  • 3:11 - 3:12
    given that x.
  • 3:12 - 3:14
    So that would be that
    normal distribution there.
  • 3:14 - 3:17
    And then let's say, for
    this x right over here,
  • 3:17 - 3:21
    you would expect a normal
    distribution as well,
  • 3:21 - 3:25
    just like this.
  • 3:25 - 3:25
    So if we're given x,
  • 3:25 - 3:28
    the distribution of y's should be normal.
  • 3:28 - 3:30
    Once again, many times you'll just be
  • 3:30 - 3:32
    told to assume that that has
    been met because it might,
  • 3:32 - 3:34
    at least in an introductory
    statistics class,
  • 3:34 - 3:37
    be a little bit hard to
    figure this out on your own.
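The normal condition can be made concrete with a small simulation. This is a hypothetical sketch (the population parameters and the fixed x value are invented): fixing one x, we generate many y's from the assumed model and check that their center sits on the regression line with the assumed spread.

```python
# Sketch of the normal condition: for a fixed x, the model assumes the y
# values are normally distributed around the population regression line.
# All parameter values here are invented for illustration.
import random
import statistics

random.seed(2)
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0  # assumed population parameters
x_fixed = 4.0

# Simulate many y's at this one x; the normal condition says this
# distribution should be normal, centered on the line.
ys = [BETA0 + BETA1 * x_fixed + random.gauss(0, SIGMA) for _ in range(10_000)]

print(round(statistics.mean(ys), 1))   # near beta0 + beta1 * x = 4.0
print(round(statistics.stdev(ys), 1))  # near sigma = 1.0
```

In practice, an introductory class would eyeball a residual plot or histogram rather than simulate, which is why exams usually just tell you to assume this condition is met.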
  • 3:37 - 3:39
    Now, the next condition
    is related to that,
  • 3:39 - 3:45
    and this is the idea of
    having equal variance.
  • 3:45 - 3:46
    And that's just saying that each
  • 3:46 - 3:49
    of these normal distributions should have
  • 3:49 - 3:51
    the same spread for a given x.
  • 3:51 - 3:53
    And so you could say equal variance,
  • 3:53 - 3:55
    or you could even think about them having
  • 3:55 - 3:56
    the equal standard deviation.
  • 3:56 - 4:00
    So, for example, if, for a
    given x, let's say for this x,
  • 4:00 - 4:03
    all of a sudden, you had
    a much lower variance,
  • 4:03 - 4:04
    made it look like this,
  • 4:04 - 4:07
    then you would no longer meet
    your conditions for inference.
  • 4:07 - 4:10
    Last, but not least, and this
    is one we've seen many times,
  • 4:10 - 4:12
    this is the random condition.
  • 4:12 - 4:15
    And this is that the data comes from
  • 4:15 - 4:17
    a well-designed random sample or
  • 4:17 - 4:19
    some type of randomized experiment.
  • 4:19 - 4:23
    And this condition we have
    seen for every type of inference
  • 4:23 - 4:26
    that we have
    looked at so far.
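The random condition is about how the data is collected rather than computed, but drawing a simple random sample is easy to sketch. This is an illustrative example (the population here is invented) using Python's standard `random.sample`, which picks without replacement:

```python
# Sketch of the random condition: draw a simple random sample (SRS) from a
# population using random.sample, which selects without replacement.
# The population here is invented for illustration.
import random

random.seed(4)
population = list(range(1, 101))     # hypothetical population of 100 units
srs = random.sample(population, 10)  # simple random sample of size 10

print(len(srs), len(set(srs)))       # 10 distinct units, all from the population
```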
  • 4:26 - 4:27
    So I'll leave you there.
  • 4:27 - 4:28
    It's good to know.
  • 4:28 - 4:30
    It will show up on some exams.
  • 4:30 - 4:33
    But many times, when it
    comes to problem solving,
  • 4:33 - 4:36
    in an introductory statistics
    class, they will tell you,
  • 4:36 - 4:39
    hey, just assume the conditions
    for inference have been met.
  • 4:39 - 4:41
    Or they'll ask, what are
    the conditions for inference?
  • 4:41 - 4:43
    But they're not going to
    actually make you prove,
  • 4:43 - 4:46
    for example, the normal or
    the equal variance condition.
  • 4:46 - 4:47
    That might be a bit much
  • 4:47 - 4:50
    for an introductory statistics class.
Video Language:
English
Team:
Khan Academy
Duration:
04:51
