Multicollinearity (in Regression Analysis)

  • 0:00 - 0:02
    PROFESSOR: Hello, and welcome.
  • 0:02 - 0:06
    In this video, I'll explain to
    you what multicollinearity is
  • 0:06 - 0:09
    and how you can check it online.
  • 0:09 - 0:12
    And we get started right now.
  • 0:12 - 0:16
    So the first question is,
    what is multicollinearity?
  • 0:16 - 0:22
    Multicollinearity means that two
    or more independent variables
  • 0:22 - 0:25
    are strongly correlated
    with one another.
  • 0:25 - 0:28
    The problem with
    multicollinearity is that
  • 0:28 - 0:32
    the effect of individual
    variables cannot be clearly
  • 0:32 - 0:34
    separated.
  • 0:34 - 0:37
    Let's look at the
    regression equation again.
  • 0:37 - 0:40
    We have the dependent
    variable here
  • 0:40 - 0:43
    and the independent
    variables with
  • 0:43 - 0:45
    the respective coefficients.
  • 0:45 - 0:51
    For example, if there is a high
    correlation between x1 and x2,
  • 0:51 - 0:55
    or if these two variables
    are almost equal,
  • 0:55 - 1:00
    then it is quite difficult
    to determine b1 and b2.
  • 1:00 - 1:03
    If both variables
    are completely equal,
  • 1:03 - 1:10
    the regression model cannot
    uniquely determine b1 and b2.
  • 1:10 - 1:14
    This means that the regression
    model becomes unstable.
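
To make this concrete, here is a minimal Python sketch, assuming made-up toy data in which x2 is an exact copy of x1. The data and the coefficient values are hypothetical and chosen only for illustration. It shows that very different pairs of b1 and b2 give the same fitted values, and that the cross-product matrix of the two predictors is singular.

```python
# Hypothetical toy data: x2 is an exact copy of x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = list(x1)


def fitted(b0, b1, b2):
    """Fitted values of the regression equation y = b0 + b1*x1 + b2*x2."""
    return [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]


# Two very different coefficient pairs that share the same sum b1 + b2 = 2 ...
fit_a = fitted(1.0, 2.0, 0.0)
fit_b = fitted(1.0, -3.0, 5.0)

# ... give identical fitted values, so the data cannot tell them apart.
same_fit = fit_a == fit_b


def centered_cross_product(u, v):
    """Sum of products of deviations from the mean."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


# Determinant of the 2x2 centered cross-product matrix of x1 and x2.
# A determinant of zero means b1 and b2 have no unique solution.
s11 = centered_cross_product(x1, x1)
s12 = centered_cross_product(x1, x2)
s22 = centered_cross_product(x2, x2)
det = s11 * s22 - s12 * s12

print("same fitted values:", same_fit)
print("determinant:", det)
```

When the two variables are only almost equal, the determinant is small rather than exactly zero, which is why the estimates of b1 and b2 can swing widely from sample to sample.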
  • 1:14 - 1:16
    If you now want to
    use the regression
  • 1:16 - 1:19
    model for a
    prediction, it does not
  • 1:19 - 1:22
    matter if there is
    multicollinearity.
  • 1:22 - 1:26
    In a prediction, you are
    only interested in how good
  • 1:26 - 1:28
    the prediction is,
    but you are not
  • 1:28 - 1:31
    interested in how
    big the influence
  • 1:31 - 1:34
    of the respective variables is.
  • 1:34 - 1:38
    However, if the regression model
    is used to measure the influence
  • 1:38 - 1:42
    of the independent variables
    on the dependent variable,
  • 1:42 - 1:46
    there must not be
    multicollinearity, and if there is,
  • 1:46 - 1:50
    the coefficients cannot be
    interpreted meaningfully.
  • 1:50 - 1:53
    So the next question
    is, how can we now
  • 1:53 - 1:56
    diagnose multicollinearity?
  • 1:56 - 1:59
    If we look at the
    regression equation again,
  • 1:59 - 2:06
    we have the variables x1, x2,
    and so on up to the variable xk.
  • 2:06 - 2:09
    We now want to
    know if x1 is nearly
  • 2:09 - 2:13
    identical to any other
    variable or a combination
  • 2:13 - 2:15
    of the other variables.
  • 2:15 - 2:19
    In order to do this, we simply
    set up a regression model.
  • 2:19 - 2:22
    In this new regression
    model, we take x1
  • 2:22 - 2:25
    as the new dependent variable.
  • 2:25 - 2:28
    If we now can
    predict x1 very well
  • 2:28 - 2:30
    from the other
    independent variables,
  • 2:30 - 2:34
    we don't need x1 anymore,
    because we can use
  • 2:34 - 2:36
    the other variables instead.
  • 2:36 - 2:39
    If we now
    used all variables,
  • 2:39 - 2:44
    it could be that the regression
    model gets very unstable.
  • 2:44 - 2:46
    In mathematics, we would
    say that the equation
  • 2:46 - 2:48
    is underdetermined.
  • 2:48 - 2:52
    We could now do this
    for all other variables.
  • 2:52 - 2:57
    So we now estimate x2 by
    using the other variables.
  • 2:57 - 3:01
    And we estimate xk by
    the other variables.
  • 3:01 - 3:06
    In this case, we have k
    new regression models.
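
The k auxiliary regressions described here can be sketched in plain Python. This is only an illustration under stated assumptions: the data are hypothetical, the helper names `solve`, `r_squared`, and `auxiliary_r_squared` are made up for this sketch, and the fit uses ordinary least squares via the normal equations, which fails if one variable is an exact copy of another.

```python
def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def r_squared(target, predictors):
    """R squared of a least-squares regression of target on predictors plus an intercept."""
    rows = [[1.0] + list(vals) for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean = sum(target) / len(target)
    ss_res = sum((t - f) ** 2 for t, f in zip(target, fitted))
    ss_tot = sum((t - mean) ** 2 for t in target)
    return 1.0 - ss_res / ss_tot


def auxiliary_r_squared(columns):
    """Regress each variable on all the others and return its R squared."""
    return {
        name: r_squared(col, [c for other, c in columns.items() if other != name])
        for name, col in columns.items()
    }


# Hypothetical data: x2 is almost the same as x1, x3 is a different variable.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [1.1, 1.9, 3.2, 3.9, 5.1, 5.8],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}

aux = auxiliary_r_squared(data)
for name, value in aux.items():
    print(name, round(value, 3))
```

A high R squared for x1 or x2 in this sketch signals that the variable can be predicted very well from the others.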
  • 3:06 - 3:09
    For each of these
    regression models,
  • 3:09 - 3:13
    we calculate the tolerance and
    the variance inflation factor.
  • 3:13 - 3:18
    The tolerance is obtained
    by taking 1 minus R squared,
  • 3:18 - 3:22
    where R squared is the coefficient
    of determination,
  • 3:22 - 3:24
    or the explained variance.
  • 3:24 - 3:27
    The variance
    inflation factor is 1
  • 3:27 - 3:32
    divided by 1 minus the
    coefficient of determination.
  • 3:32 - 3:36
    Multicollinearity could
    exist if the tolerance
  • 3:36 - 3:39
    is smaller than 0.1.
  • 3:39 - 3:42
    If we look at the
    variance inflation factor,
  • 3:42 - 3:47
    there could be multicollinearity
    if the variance inflation
  • 3:47 - 3:50
    factor is larger than 10.
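
As a small worked example of these two rules of thumb, the following Python sketch turns the R squared of an auxiliary regression into a tolerance and a variance inflation factor. The function name `collinearity_diagnostics` and the R squared values are hypothetical; the cutoffs 0.1 and 10 are the ones mentioned above.

```python
def collinearity_diagnostics(r_squared, tol_cutoff=0.1, vif_cutoff=10.0):
    """Return (tolerance, VIF, flagged) for one auxiliary regression."""
    tolerance = 1.0 - r_squared
    # An R squared of exactly 1 means perfect collinearity; the VIF is then unbounded.
    vif = float("inf") if tolerance == 0.0 else 1.0 / tolerance
    flagged = tolerance < tol_cutoff or vif > vif_cutoff
    return tolerance, vif, flagged


# Hypothetical R squared values from three auxiliary regressions.
for name, r2 in {"x1": 0.25, "x2": 0.75, "x3": 0.95}.items():
    tolerance, vif, flagged = collinearity_diagnostics(r2)
    print(f"{name}: tolerance={tolerance:.2f}, VIF={vif:.1f}, flagged={flagged}")
```

Because the variance inflation factor is simply 1 divided by the tolerance, the two cutoffs describe the same condition: a tolerance below 0.1 corresponds to a variance inflation factor above 10.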
  • 3:50 - 3:53
    And now I will show
    you how you can easily
  • 3:53 - 3:56
    check the requirements online.
  • 3:56 - 4:01
    In order to do this,
    please visit datatab.net,
  • 4:01 - 4:04
    and click on the
    Statistics Calculator.
  • 4:04 - 4:09
    If you want to use your own
    data, just click on Clear Table.
  • 4:09 - 4:12
    I will use the example data now.
  • 4:12 - 4:15
    If you want to
    perform a regression,
  • 4:15 - 4:18
    just click on the
    tab Regression.
  • 4:18 - 4:22
    On the left side, you can
    choose your dependent variable.
  • 4:22 - 4:24
    On the right side,
    you can choose
  • 4:24 - 4:26
    your independent variables.
  • 4:26 - 4:30
    In our example, we
    want to choose salary
  • 4:30 - 4:32
    as the dependent variable.
  • 4:32 - 4:35
    And as the
    independent variables,
  • 4:35 - 4:38
    we choose gender,
    age, and weight.
  • 4:38 - 4:42
    Now we can click on
    Check conditions.
  • 4:42 - 4:46
    And we get the results
    of the condition checks.
  • 4:46 - 4:51
    First, we start
    with the linearity.
  • 4:51 - 4:55
    Then we see the
    normality of errors.
  • 4:55 - 4:58
    Further, we have the
    multicollinearity tests,
  • 4:58 - 5:05
    where we have the tolerance and
    the variance inflation factor.
  • 5:05 - 5:10
    And finally, we can see the
    test of homoscedasticity.
  • 5:10 - 5:13
    This is how easily you can
    check the requirements
  • 5:13 - 5:16
    for a linear regression model.
  • 5:16 - 5:20
    Another important topic when
    talking about regression models
  • 5:20 - 5:22
    is dummy variables.
  • 5:22 - 5:26
    If you want to learn more
    about dummy variables,
  • 5:26 - 5:29
    just continue to
    watch the next video.
  • 5:29 - 5:31
    See you soon.
Video Language:
English
Duration:
05:57
TTU_OAL edited English subtitles for Multicollinearity (in Regression Analysis) Jun 13, 2025, 7:50 PM
