What is a hypothesis test? A beginner's guide to hypothesis testing!

0:00 - 0:02

(speaker)
What is hypothesis testing?
0:02 - 0:06

Hypothesis testing is used to determine
0:06 - 0:09

whether there is enough evidence
in a sample of data
0:09 - 0:14

to infer that a certain condition
is true for the entire population.
0:14 - 0:18

Therefore, it is a method
to test an assumption or theory
0:18 - 0:22

about a parameter
of a population based on a sample.
0:22 - 0:25

What is the population
and what is the sample?
0:25 - 0:29

The population
is the whole group we are interested in.
0:29 - 0:31

If you want to study the average height
0:31 - 0:34

of all adults in the United States,
0:34 - 0:38

then the population
would be all adults in the United States.
0:38 - 0:42

The sample
is the smaller group we actually study
0:42 - 0:44

chosen from the population.
0:44 - 0:49

For example, 150 adults
were selected from the United States,
0:49 - 0:51

and now we want to use the sample
0:51 - 0:53

to make a statement about the population.
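
The relationship between population and sample can be sketched in a few lines of Python. This is only an illustration: the population is simulated, and the mean of 170 cm and standard deviation of 10 cm are made-up values, not figures from the video. The helper name `draw_sample` is hypothetical.

```python
import random
import statistics


def draw_sample(population, n, seed=0):
    """Draw a simple random sample of size n without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: 100,000 simulated adult heights in cm.
# Mean 170 and standard deviation 10 are illustration values only.
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(100_000)]

# The sample: 150 adults chosen at random from the population.
sample = draw_sample(population, 150)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # what we can actually measure

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two means will be close but not identical. That gap is the sampling variation a hypothesis test has to account for.
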
0:53 - 0:56

And here are the six steps for doing that.
0:56 - 0:58

Number one:
hypothesis.
0:58 - 1:01

First, we need a statement, a hypothesis,
1:01 - 1:03

that we want to test.
1:03 - 1:04

For example, you want to know
1:04 - 1:07

whether a drug will have a positive effect
1:07 - 1:11

on blood pressure
in people with high blood pressure.
1:11 - 1:12

But what's next?
1:12 - 1:14

In our hypothesis, we stated
1:14 - 1:18

that we would like to study people
with high blood pressure.
1:18 - 1:22

So our population is all people
with high blood pressure
1:22 - 1:24

in, for example, the US.
1:24 - 1:28

Obviously, we cannot collect data
from the whole population,
1:28 - 1:30

so we take a sample from the population.
1:30 - 1:33

Now we use this sample to make a statement
1:33 - 1:35

about the population.
1:35 - 1:37

But how do we do that?
1:37 - 1:39

For this, we need a hypothesis test.
1:39 - 1:41

Hypothesis testing is a method
1:41 - 1:45

for testing a claim
about a parameter in a population
1:45 - 1:48

using data measured in a sample.
1:48 - 1:50

Great, that's exactly what we need.
1:50 - 1:53

There are many different hypothesis tests,
1:53 - 1:54

and at the end of this video,
1:54 - 1:58

I will give you a guide
on how to find the right test.
1:58 - 2:00

And, of course, you can find videos
2:00 - 2:03

about many more hypothesis tests
on our channel.
2:03 - 2:06

But how does a hypothesis test work?
2:06 - 2:08

When we conduct a hypothesis test,
2:08 - 2:10

we start with a research hypothesis,
2:10 - 2:13

also called the alternative hypothesis.
2:13 - 2:17

This is the hypothesis
we are trying to find evidence for.
2:17 - 2:19

In our case, the research hypothesis
2:19 - 2:23

is that the drug has an effect
on blood pressure,
2:23 - 2:26

but we cannot test this hypothesis
directly
2:26 - 2:28

with a classical hypothesis test,
2:28 - 2:30

so we test the opposite hypothesis
2:30 - 2:33

that the drug has no effect
on blood pressure. This is the null hypothesis.
2:34 - 2:35

But what does that mean?
2:35 - 2:40

First, we assume that the drug
has no effect in the population.
2:40 - 2:42

We therefore assume that, in general,
2:42 - 2:46

people who take the drug
and people who don't take the drug
2:46 - 2:49

have the same blood pressure on average.
2:49 - 2:51

If we now take a random sample,
2:51 - 2:56

and it turns out that the drug
has a large effect in the sample,
2:56 - 3:01

then we can ask how likely
it is to draw such a sample
3:01 - 3:03

or one that deviates even more
3:03 - 3:06

if the drug actually has no effect.
3:06 - 3:11

So in reality, on average,
there is no difference in the population.
3:11 - 3:15

If this probability is very low,
we can ask ourselves,
3:15 - 3:18

maybe the drug has an effect
in the population,
3:18 - 3:22

and we may have enough evidence
to reject the null hypothesis
3:22 - 3:24

that the drug has no effect.
3:24 - 3:28

And it is this probability
that is called the "p-value".
3:28 - 3:32

Let's summarize this
in three simple steps:
3:32 - 3:34

number one,
the null hypothesis states
3:34 - 3:37

that there is no difference
in the population;
3:37 - 3:40

number two,
the hypothesis test calculates
3:40 - 3:44

how much the sample deviates
from the null hypothesis;
3:44 - 3:48

number three,
the p-value indicates the probability
3:48 - 3:53

of getting a sample
that deviates as much as our sample,
3:53 - 3:56

or one that deviates even more
than our sample,
3:56 - 4:00

assuming the null hypothesis is true.
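
The three steps above can be sketched with a permutation test, which is one simple way to estimate a p-value and not necessarily the test the video has in mind. Under the null hypothesis the group labels do not matter, so we shuffle them many times and count how often the shuffled difference in means is at least as large as the observed one. The blood pressure values are made up for illustration, and the function name `permutation_p_value` is hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(treated, control, n_permutations=5_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means."""
    rng = random.Random(seed)

    # Step 2: how much does our sample deviate from "no difference"?
    observed = abs(mean(treated) - mean(control))

    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_extreme = 0

    # Step 1: assume the null hypothesis, so labels are interchangeable.
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Step 3: share of shuffles that deviate as much as, or more than, our sample.
    # Adding 1 to both counts keeps the estimate from being exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Made-up systolic blood pressure readings (mmHg), for illustration only.
treated = [128, 131, 125, 135, 129, 127, 133, 130]
control = [142, 138, 145, 140, 136, 147, 139, 143]

p_value = permutation_p_value(treated, control)
print(f"estimated p-value: {p_value:.4f}")
```
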
4:00 - 4:03

But at what point
is the p-value small enough
4:03 - 4:05

for us to reject the null hypothesis?
4:05 - 4:07

This brings us to the next point,
4:07 - 4:09

statistical significance.
4:09 - 4:13

If the p-value is less than
a predetermined threshold,
4:13 - 4:16

the result
is considered statistically significant.
4:16 - 4:19

This means that such a result would be unlikely
4:19 - 4:21

if the null hypothesis were true,
4:21 - 4:23

and that we have enough evidence
4:23 - 4:25

to reject the null hypothesis.
4:25 - 4:29

This threshold is often 0.05.
4:29 - 4:31

Therefore, a small p-value suggests
4:31 - 4:34

that the observed data or sample
4:34 - 4:37

is inconsistent with the null hypothesis.
4:37 - 4:40

This leads us
to reject the null hypothesis
4:40 - 4:42

in favor of the alternative hypothesis.
4:42 - 4:46

A large p-value suggests
that the observed data
4:46 - 4:48

is consistent with the null hypothesis,
4:48 - 4:50

and we will not reject it.
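
The decision rule just described fits in a small function. This is a minimal sketch: the function name `decide` is hypothetical, and the default threshold of 0.05 is the commonly used value mentioned above, which should be chosen before looking at the data.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with a predetermined threshold alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        # Statistically significant: the data are inconsistent with the null.
        return "reject the null hypothesis"
    # Not significant: the data are consistent with the null.
    return "do not reject the null hypothesis"


print(decide(0.003))  # small p-value
print(decide(0.41))   # large p-value
```

Note that the second outcome is worded as "do not reject" rather than "accept": the function reports a decision, not a proof.
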
4:50 - 4:54

But note, there is always a risk
of making an error.
4:54 - 4:56

A small p-value does not prove
4:56 - 4:58

that the alternative hypothesis is true.
4:58 - 5:03

It is only saying
that it is unlikely to get such a result
5:03 - 5:07

or a more extreme one
when the null hypothesis is true.
5:07 - 5:10

And again, if the null hypothesis is true,
5:10 - 5:12

there is no difference in the population.
5:12 - 5:14

And the other way around,
5:14 - 5:16

a large p-value does not prove
5:16 - 5:18

that the null hypothesis is true.
5:18 - 5:22

It is only saying
that it is likely to get such a result
5:22 - 5:26

or a more extreme one
when the null hypothesis is true.
5:26 - 5:28

So there are two types of errors,
5:28 - 5:32

which are called Type I and Type II errors.
5:32 - 5:34

Let's start with the Type I error.
5:34 - 5:37

In hypothesis testing,
a Type I error occurs
5:37 - 5:41

when a true null hypothesis is rejected.
5:41 - 5:44

So in reality,
the null hypothesis is true,
5:44 - 5:47

but we make the decision
to reject the null hypothesis.
5:47 - 5:52

In our example, it means
that the drug actually had no effect.
5:52 - 5:56

So in reality, there is no difference
in blood pressure.
5:56 - 5:58

Whether the drug is taken or not,
5:58 - 6:02

the blood pressure
remains the same in both cases.
6:02 - 6:06

But our sample
happened to be so far off the true value
6:06 - 6:09

that we mistakenly thought
the drug was working.
6:09 - 6:12

And a Type II error occurs
6:12 - 6:15

when a false null hypothesis
is not rejected.
6:15 - 6:18

So in reality,
the null hypothesis is false,
6:18 - 6:23

but we make the decision
not to reject the null hypothesis.
6:23 - 6:27

In our example,
this means the drug actually did work;
6:27 - 6:28

there is a difference between
6:28 - 6:31

those who have taken the drug
and those who have not,
6:31 - 6:34

but it was just a coincidence
6:34 - 6:38

that the sample taken
did not show much difference,
6:38 - 6:42

and we mistakenly thought
the drug was not working.
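
Both error types can be made visible with a small simulation. This is a sketch under simplifying assumptions that are not from the video: blood pressure is normally distributed with a known standard deviation of 10 mmHg, the untreated mean is 140 mmHg, and a two-sided z-test is used. The function names and all numbers are hypothetical.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(treated, control, sigma):
    """Two-sided p-value for a difference in means, assuming known sigma."""
    standard_error = sigma * (1 / len(treated) + 1 / len(control)) ** 0.5
    z = (mean(treated) - mean(control)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_effect, n=30, sigma=10.0, alpha=0.05,
                   trials=2000, seed=0):
    """Share of simulated studies in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(140, sigma) for _ in range(n)]
        treated = [rng.gauss(140 - true_effect, sigma) for _ in range(n)]
        if z_test_p_value(treated, control, sigma) < alpha:
            rejections += 1
    return rejections / trials


# Null hypothesis true (no effect): every rejection is a Type I error.
type_1_rate = rejection_rate(true_effect=0.0)

# Null hypothesis false (drug lowers pressure by 5 mmHg):
# every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_effect=5.0)

print(f"Type I error rate:  {type_1_rate:.3f}")
print(f"Type II error rate: {type_2_rate:.3f}")
```

When the drug has no effect, the Type I error rate should land near the chosen threshold of 0.05. The Type II error rate depends on the true effect, the sample size, and the spread of the data.
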
6:42 - 6:44

And now I'll show you how Data Tab
6:44 - 6:48

helps you to find
a suitable hypothesis test,
6:48 - 6:52

and, of course, calculates it
and interprets the results for you.
6:52 - 6:54

Let's go to datatab.net,
6:54 - 6:57

and copy your own data in here.
6:57 - 6:59

We will just use this example dataset.
6:59 - 7:02

After copying your data into the table,
7:02 - 7:05

the variables appear down here.
7:05 - 7:08

Data Tab automatically tries to determine
7:08 - 7:10

the correct level of measurement,
7:10 - 7:13

but you can also change it up here.
7:13 - 7:16

Now we just click on "Hypothesis Testing"
7:16 - 7:19

and select the variables we want to use
7:19 - 7:22

for the calculation
of the hypothesis test.
7:22 - 7:26

Data Tab
will then suggest a suitable test.
7:26 - 7:29

For example,
in this case, a chi-squared test,
7:29 - 7:34

or in that case, an analysis of variance.
7:34 - 7:37

Then you will see the hypotheses
and the results.
7:37 - 7:40

If you are not sure
how to interpret the results,
7:40 - 7:43

click on "Summary in words".
7:43 - 7:46

Further, you can check the assumptions
7:46 - 7:49

and decide whether you want to calculate
a parametric
7:49 - 7:51

or a non-parametric test.
7:51 - 7:53

You can find out the difference
7:53 - 7:58

between parametric and non-parametric
tests in my next video.
7:58 - 8:02

Thanks for watching
and I hope you enjoyed the video.

Title:: What is a hypothesis test? A beginner's guide to hypothesis testing!
Video Language:: English
Duration:: 08:07

	CeeCee edited English subtitles for What is a hypothesis test? A beginner's guide to hypothesis testing!