Lec 6 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005
-
0:07 - 0:07Today we're going to not talk
about sorting. -
0:11 - 0:11This is an exciting new
development. -
0:14 - 0:14We're going to talk about
another problem, -
0:18 - 0:18a related problem,
but a different problem. -
0:35 - 0:35We're going to talk about
another problem that we would -
0:38 - 0:38like to solve in linear time.
Last class we talked about we -
0:42 - 0:42could do sorting in linear time.
To do that we needed some -
0:45 - 0:45additional assumptions.
Today we're going to look at a -
0:48 - 0:48problem that really only needs
linear time, even though at -
0:51 - 0:51first glance it might look like
it requires sorting. -
0:54 - 0:54So this is going to be an
easier problem. -
0:57 - 0:57The problem is I give you a
bunch of numbers. -
1:00 - 1:00Let's call them elements.
And they are in some array, -
1:07 - 1:07let's say.
And they're in no particular -
1:12 - 1:12order, so unsorted.
I want to find the kth smallest -
1:18 - 1:18element.
-
1:26 - 1:26This is called the element of
rank k. -
1:37 - 1:37In other words,
I have this list of numbers -
1:40 - 1:40which is unsorted.
And, if I were to sort it, -
1:43 - 1:43I would like to know what the
kth element is. -
1:46 - 1:46But I'm not allowed to sort it.
One solution to this problem, -
1:50 - 1:50this is the naïve algorithm,
is you just sort and then -
1:54 - 1:54return the kth element.
This is another possible -
1:57 - 1:57definition of the problem.
And we would like to do better -
2:03 - 2:03than that.
So you could sort, -
2:06 - 2:06what's called the array A,
and then return A[k]. -
2:11 - 2:11That is one thing we could do.
And if we use heap sort or -
2:16 - 2:16mergesort, this will take n lg n
time. -
2:20 - 2:20We would like to do better than
n lg n. -
2:24 - 2:24Ideally linear time.
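As a sketch of that baseline, sorting and indexing is a one-liner in Python (the function name is mine, not from the lecture; ranks are 1-indexed as in the lecture):

```python
def select_by_sorting(A, k):
    """Return the k-th smallest element (k is 1-indexed) by sorting first.
    Costs O(n lg n) with a comparison sort such as mergesort or heapsort."""
    return sorted(A)[k - 1]
```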
The problem is pretty natural, -
2:29 - 2:29straightforward.
It has various applications. -
2:34 - 2:34Depending on how you choose k,
k could be any number between 1 -
2:40 - 2:40and n.
For example, -
2:41 - 2:41if we choose k=1 that element
has a name. -
2:45 - 2:45Any suggestions of what the
name is? -
2:48 - 2:48The minimum.
That's easy. -
2:50 - 2:50Any suggestions on how we could
find the minimum element in an -
2:55 - 2:55array in linear time?
Right. -
2:59 - 2:59Just scan through the array.
Keep track of what the smallest -
3:04 - 3:04number is that you've seen.
The same thing with the -
3:09 - 3:09maximum, k=n.
These are rather trivial. -
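A minimal sketch of those two single-pass scans (function names are mine):

```python
def find_min(A):
    """k = 1 case: one linear scan, tracking the smallest value seen so far."""
    best = A[0]
    for x in A[1:]:
        if x < best:
            best = x
    return best

def find_max(A):
    """k = n case, symmetrically: track the largest value seen so far."""
    best = A[0]
    for x in A[1:]:
        if x > best:
            best = x
    return best
```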
3:12 - 3:12But a more interesting version
of the order statistic problem -
3:17 - 3:17is to find the median.
This is either k = floor((n+1)/2) -
3:22 - 3:22or k = ceiling((n+1)/2).
I will call both of those -
3:26 - 3:26elements medians.
-
3:34 - 3:34Finding the median of an
unsorted array in linear time is -
3:37 - 3:37quite tricky.
And that sort of is the main -
3:39 - 3:39goal of this lecture,
is to be able to find the -
3:42 - 3:42medians.
For free we're going to be able -
3:44 - 3:44to find the arbitrary kth
smallest element, -
3:46 - 3:46but typically we're most
interested in finding the -
3:49 - 3:49median.
And on Friday in recitation -
3:51 - 3:51you'll see why that is so
useful. -
3:52 - 3:52There are all sorts of
situations where you can use -
3:55 - 3:55median for really effective
divide-and-conquer without -
3:58 - 3:58having to sort.
You can solve a lot of problems -
4:03 - 4:03in linear time as a result.
And we're going to cover today -
4:07 - 4:07two algorithms for finding order
statistics. -
4:11 - 4:11Both of them are linear time.
The first one is randomized, -
4:15 - 4:15so it's only linear expected
time. -
4:18 - 4:18And the second one is
worst-case linear time, -
4:22 - 4:22and it will build on the
randomized version. -
4:25 - 4:25Let's start with a randomized
divide-and-conquer algorithm. -
4:46 - 4:46This algorithm is called
rand-select. -
5:02 - 5:02And the parameters are a little
bit more than what we're used -
5:06 - 5:06to.
The order statistics problem -
5:09 - 5:09you're given an array A.
And here I've changed notation -
5:13 - 5:13and I'm looking for the ith
smallest element, -
5:16 - 5:16so i is the index I'm looking
for. -
5:18 - 5:18And I'm also going to change
the problem a little bit. -
5:22 - 5:22And instead of trying to find
it in the whole array, -
5:26 - 5:26I'm going to look in a
particular interval of the -
5:29 - 5:29array, A from p up to q.
We're going to need that for a -
5:34 - 5:34recursion.
This better be a recursive -
5:36 - 5:36algorithm because we're using
divide-and-conquer. -
5:39 - 5:39Here is the algorithm.
-
5:51 - 5:51With a base case.
It's pretty simple. -
5:55 - 5:55Then we're going to use part of
the quicksort algorithm, -
6:00 - 6:00randomized quicksort.
-
6:09 - 6:09We didn't actually define this
subroutine two lectures ago, -
6:13 - 6:13but you should know what it
does, especially if you've read -
6:18 - 6:18the textbook.
This says in the array A[p...q] -
6:21 - 6:21pick a random element,
so pick a random index between -
6:25 - 6:25p and q, swap it with the first
element, then call partition. -
6:30 - 6:30And partition uses that first
element to split the rest of the -
6:35 - 6:35array into less than or equal to
that random partition and -
6:39 - 6:39greater than or equal to that
partition. -
6:43 - 6:43This is just picking a random
partition element between p and -
6:47 - 6:47q, cutting the array in half,
although the two sizes may not -
6:52 - 6:52be equal.
And it returns the index of -
6:55 - 6:55that partition element,
some number between p and q. -
7:00 - 7:00And we're going to define k to
be this particular value, -
7:08 - 7:08r minus p plus 1.
And the reason for that is that -
7:15 - 7:15k is then the rank of the
partition element. -
7:22 - 7:22This is in A[p...q].
Let me draw a picture here. -
7:30 - 7:30We have our array A.
It starts at p and ends at q. -
7:34 - 7:34There is other stuff,
but for this recursive call all we -
7:39 - 7:39care about is p up to q.
We pick a random partition -
7:43 - 7:43element, say this one,
and we partition things so that -
7:47 - 7:47everything in here,
let's call this r, -
7:51 - 7:51is less than or equal to A[r]
and everything up here is -
7:55 - 7:55greater than or equal to A[r].
And A[r] is our partition -
8:00 - 8:00element.
After this call, -
8:03 - 8:03that's what the array looks
like. -
8:06 - 8:06And we get r.
We get the index of where -
8:10 - 8:10partition element is stored.
The number of elements that are -
8:15 - 8:15less than or equal to A[r] and
including r is r minus p plus 1. -
8:20 - 8:20There will be r minus p
elements here, -
8:24 - 8:24and we're adding 1 to get this
element. -
8:28 - 8:28And, if you start counting at
1, if this is rank 1, -
8:33 - 8:33rank 2, this element will have
rank k. -
8:36 - 8:36That's just from the
construction in the partition. -
8:40 - 8:40And now we get to recurse.
And there are three cases -- -
8:53 - 8:53-- depending on how i relates
to k. -
8:55 - 8:55Remember i is the rank that
we're looking for, -
8:58 - 8:58k is the rank that we happen to
get out of this random -
9:01 - 9:01partition.
We don't have much control over -
9:04 - 9:04k, but if we're lucky i=k.
That's the element we want. -
9:13 - 9:13Then we just return the
partition element. -
9:15 - 9:15More likely is that the element
we're looking for is either to -
9:18 - 9:18the left or to the right.
And if it's to the left we're -
9:21 - 9:21going to recurse in the
left-hand portion of the array. -
9:24 - 9:24And if it's to the right we're
going to recurse in the -
9:26 - 9:26right-hand portion.
So, pretty straightforward at -
9:29 - 9:29this point.
-
9:45 - 9:45I just have to get all the
indices right. -
10:08 - 10:08Either we're going to recurse
on the part between p and r -
10:11 - 10:11minus 1, that's this case.
The rank we're looking for is -
10:15 - 10:15to the left of the rank of
element A[r]. -
10:17 - 10:17Or, we're going to recurse on
the right part between r plus 1 -
10:21 - 10:21and q.
Where we recurse on the left -
10:23 - 10:23part the rank we're looking for
remains the same, -
10:26 - 10:26but when we recurse on the
right part the rank we're -
10:29 - 10:29looking for gets offset.
Because we sort of got rid of -
10:34 - 10:34the k elements over here.
I should have written this -
10:39 - 10:39length is k.
We've sort of swept away k -
10:42 - 10:42ranks of elements.
And now within this array we're -
10:47 - 10:47looking for the i minus kth
smallest element. -
10:51 - 10:51That's the recursion.
We only recurse once. -
10:55 - 10:55And random partition is not a
recursion. -
11:00 - 11:00That just takes linear time.
And the total amount of work -
11:05 - 11:05we're doing here should be
linear time plus one recursion. -
11:09 - 11:09And we'd next like to see what
the total running time is in -
11:14 - 11:14expectation, but let's first do
a little example -- -
11:26 - 11:26-- to make this algorithm
perfectly clear. -
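Here is one way the algorithm might be rendered in Python. The lecture defers RANDOM-PARTITION's internals to the textbook, so this sketch assumes a Lomuto-style partition with the random pivot swapped into the first slot; the function names and the 1-indexed rank convention are mine:

```python
import random

def partition(A, p, q):
    """Partition A[p..q] (inclusive) around pivot A[p]: afterwards the
    elements <= pivot come first, then the pivot, then elements >= pivot.
    Returns the pivot's final index r."""
    pivot = A[p]
    i = p
    for j in range(p + 1, q + 1):
        if A[j] < pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[p], A[i] = A[i], A[p]
    return i

def rand_partition(A, p, q):
    """Swap a uniformly random element of A[p..q] into position p, then partition."""
    s = random.randint(p, q)
    A[p], A[s] = A[s], A[p]
    return partition(A, p, q)

def rand_select(A, p, q, i):
    """Return the i-th smallest element of A[p..q] (i is 1-indexed within the range)."""
    if p == q:                       # base case: one element
        return A[p]
    r = rand_partition(A, p, q)
    k = r - p + 1                    # rank of the pivot within A[p..q]
    if i == k:                       # lucky: the pivot is the answer
        return A[r]
    elif i < k:                      # answer is in the left part, same rank
        return rand_select(A, p, r - 1, i)
    else:                            # answer is in the right part, offset by k
        return rand_select(A, r + 1, q, i - k)
```

Note the single recursive call per level: unlike quicksort, only the side containing rank i is recursed on.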
11:29 - 11:29Let's suppose we're looking for
the seventh smallest element in -
11:34 - 11:34this array.
-
11:50 - 11:50And let's suppose,
just for example, -
11:53 - 11:53that the pivot we're using is
just the first element. -
11:58 - 11:58So, nothing fancy.
I would have to flip a few -
12:02 - 12:02coins in order to generate a
random one, so let's just pick -
12:07 - 12:07this one.
If I partition at the element -
12:10 - 12:106, this is actually an example
we did two weeks ago, -
12:14 - 12:14and I won't go through it
again, but we get the same -
12:18 - 12:18array, as we did two weeks ago,
namely 2, 5, -
12:21 - 12:213, 6, 8, 13,
10 and 11. -
12:23 - 12:23If you run through the
partitioning algorithm, -
12:27 - 12:27that happens to be the order
that it throws the elements -
12:31 - 12:31into.
And this is our position r. -
12:35 - 12:35This is p here.
It's just 1. -
12:37 - 12:37And q is just the end.
And I am looking for the -
12:41 - 12:41seventh smallest element.
And it happens when I run this -
12:45 - 12:45partition that 6 falls into the
fourth place. -
12:48 - 12:48And we know that means,
because all the elements here -
12:52 - 12:52are less than 6 and all the
elements here are greater than -
12:57 - 12:576, if this array were sorted,
6 would be right here in -
13:01 - 13:01position four.
So, r here is 4. -
13:05 - 13:05Yeah?
The 12 turned into an 11? -
13:09 - 13:09This was an 11,
believe it or not. -
13:14 - 13:14Let me be simple.
Sorry. -
13:17 - 13:17Sometimes my ones look like
twos. -
13:21 - 13:21Not a good feature.
That's an easy way to cover. -
13:27 - 13:27[LAUGHTER]
Don't try that on exams. -
13:32 - 13:32Oh, that one was just a two.
No. -
13:34 - 13:34Even though we're not sorting
the array, we're only spending -
13:38 - 13:38linear work here to partition by
6. -
13:40 - 13:40We know that if we had sorted
the array 6 would fall here. -
13:44 - 13:44We don't know about these other
elements. -
13:46 - 13:46They're not in sorted order,
but from the properties of -
13:50 - 13:50partition we know 6 went to the -
right spot. -
13:53 - 13:53We now know rank of 6 is 4.
We happened to be looking for 7 -
13:56 - 13:56and we happened to get this
number 4. -
14:00 - 14:00We want something over here.
It turns out we're looking for -
14:04 - 14:0410, I guess.
No, 11. -
14:05 - 14:05There should be eight elements
in this array, -
14:08 - 14:08so it's the next to max.
Max here is 13, -
14:11 - 14:11I'm cheating here.
The answer we're looking for is -
14:14 - 14:1411.
We know that what we're looking -
14:16 - 14:16for is in the right-hand part
because the rank we're looking -
14:20 - 14:20for is 7, which is bigger than
4. -
14:23 - 14:23Now, what rank are we looking
for in here? -
14:25 - 14:25Well, we've gotten rid of four
elements over here. -
14:30 - 14:30It happened here that k is also
4 because p is 1 in this -
14:35 - 14:35example.
The rank of 6 was 4. -
14:38 - 14:38We throw away those four
elements. -
14:41 - 14:41Now we're looking for rank 7
minus 4 which is 3. -
14:46 - 14:46And, indeed,
the rank 3 element here is -
14:50 - 14:50still 11.
So, you recursively find that. -
14:54 - 14:54That's your answer.
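The bookkeeping in this example can be checked mechanically; this sketch hard-codes the board's two arrays and uses `sorted` only to verify the claimed answer:

```python
# Board example: looking for the 7th smallest in this 8-element array.
A = [6, 10, 13, 5, 8, 3, 2, 11]

# After partitioning around the pivot 6, the board shows:
partitioned = [2, 5, 3, 6, 8, 13, 10, 11]
r = partitioned.index(6)       # pivot landed at 0-indexed position 3 (rank 4)
k = r + 1                      # k = r - p + 1 with p at the start of the array
assert k == 4

# Since 7 > 4, recurse on the right part looking for rank 7 - 4 = 3.
right = partitioned[r + 1:]    # [8, 13, 10, 11]
answer = sorted(right)[7 - k - 1]
assert answer == 11 == sorted(A)[6]
```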
Now that algorithm should be -
14:58 - 14:58pretty clear.
The tricky part is to analyze -
15:03 - 15:03it.
And the analysis here is quite -
15:06 - 15:06a bit like randomized quicksort,
although not quite as hairy, -
15:10 - 15:10so it will go faster.
But it will be also sort of a -
15:14 - 15:14nice review of the randomized
quicksort analysis which was a -
15:18 - 15:18bit tricky and always good to
see a couple of times. -
15:22 - 15:22We're going to follow the same
kind of outline as before to -
15:26 - 15:26look at the expected running
time of this algorithm. -
15:31 - 15:31And to start out we're going
to, as before, -
15:35 - 15:35look at some intuition just to
feel good about ourselves. -
15:40 - 15:40Also feel bad as you'll see.
Let's think about two sort of -
15:45 - 15:45extreme cases,
a good case and the worst case. -
15:49 - 15:49And I should mention that in
all of the analyses today we -
15:54 - 15:54assume the elements are
distinct. -
16:04 - 16:04It gets really messy if the
elements are not distinct. -
16:08 - 16:08And you may even have to change
the algorithms a little bit -
16:12 - 16:12because if all the elements are
equal, if you pick a random -
16:17 - 16:17element, the partition does not
do so well. -
16:20 - 16:20But let's assume they're all
distinct, which is the really -
16:24 - 16:24interesting case.
A pretty lucky case -- -
16:28 - 16:28I mean the best cases we
partition right in the middle. -
16:33 - 16:33The number of elements to the
left of our partition is equal -
16:38 - 16:38to the number of elements to the
right of our partition. -
16:42 - 16:42But almost as good would be
some kind of 1/10 to 9/10 split. -
16:47 - 16:47Any constant fraction,
we should feel that. -
16:51 - 16:51Any constant fraction is as
good as 1/2. -
16:54 - 16:54Then the recurrence we get is,
let's say at most, -
16:58 - 16:58this bad.
So, it depends. -
17:01 - 17:01If we have let's say 1/10 on
the left and 9/10 on the right -
17:05 - 17:05every time we do a partition.
It depends where our answer is. -
17:09 - 17:09It could be if i is really
small it's in the 1/10 part. -
17:13 - 17:13If i is really big it's going
to be in the 9/10 part, -
17:16 - 17:16or most of the time it's going
to be in the 9/10 part. -
17:20 - 17:20We're doing worst-case analysis
within the lucky case, -
17:23 - 17:23so we're happy to have upper
bounds. -
17:26 - 17:26I will say T(n) is at most -
T(9/10 n) + Theta(n). -
17:30 - 17:30Clearly it's worse if we're in
the bigger part. -
17:35 - 17:35What is the solution to this
recurrence? -
17:38 - 17:38Oh, solving recurrence was so
long ago. -
17:42 - 17:42What method should we use for
solving this recurrence? -
17:48 - 17:48The master method.
What case are we in? -
17:51 - 17:51Three.
Good. -
17:52 - 17:52You still remember.
This is Case 3. -
17:56 - 17:56We're looking at n^(log_b a).
b here is 10/9, -
18:01 - 18:01although it doesn't really
matter because a is 1. -
18:06 - 18:06log base anything of 1 is 0.
So, this is n^0 which is 1. -
18:11 - 18:11And n is polynomially larger
than 1. -
18:15 - 18:15This is going to be O(n),
which is good. -
18:19 - 18:19That is what we want,
linear time. -
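Written out, the lucky-case recurrence and the Master Method quantities read:

```latex
T(n) \;\le\; T\!\left(\tfrac{9}{10}\,n\right) + \Theta(n),
\qquad a = 1,\quad b = \tfrac{10}{9},\quad n^{\log_b a} = n^{0} = 1.
```

Since f(n) = Θ(n) is polynomially larger than n^(log_b a) = 1, Case 3 applies and T(n) = O(n).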
18:22 - 18:22If we're in the lucky case,
great. -
18:25 - 18:25Unfortunately this is only
intuition. -
18:30 - 18:30And we're not always going to
get the lucky case. -
18:33 - 18:33We could do the same kind of
analysis as we did with -
18:36 - 18:36randomized quicksort.
If you alternate between lucky -
18:38 - 18:38and unlucky, things will still
be good, but let's just talk -
18:42 - 18:42about the unlucky case to show
how bad things can get. -
18:45 - 18:45And this really would be a
worst-case analysis. -
18:53 - 18:53The unlucky case we get a split
of 0:n-1. -
19:00 - 19:00Because we're removing the
partition element either way. -
19:05 - 19:05And there could be nothing less
than the partition element. -
19:10 - 19:10We have 0 on the left-hand side
and we have n-1 on the -
19:15 - 19:15right-hand side.
Now we get a recurrence like -
19:18 - 19:18T(n)=T(n-1) plus linear cost.
And what's the solution to that -
19:24 - 19:24recurrence?
n^2. -
19:25 - 19:25Yes.
This one you should just know. -
19:28 - 19:28It's n^2 because it's an
arithmetic series. -
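Unrolling that recurrence shows the arithmetic series directly:

```latex
T(n) \;=\; T(n-1) + \Theta(n)
      \;=\; \Theta(n) + \Theta(n-1) + \cdots + \Theta(1)
      \;=\; \Theta\!\left(\sum_{j=1}^{n} j\right)
      \;=\; \Theta(n^2).
```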
19:38 - 19:38And that's pretty bad.
This is much, -
19:40 - 19:40much worse than sorting and
then picking the ith element. -
19:43 - 19:43In the worst-case this
algorithm really sucks, -
19:46 - 19:46but most of the time it's going
to do really well. -
19:49 - 19:49And, unless you're really,
really unlucky and every coin -
19:52 - 19:52you flip gives the wrong answer,
you won't get this case and you -
19:56 - 19:56will get something more like the
lucky case. -
19:59 - 19:59At least that's what we'd like
to prove. -
20:02 - 20:02And we will prove that the
expected running time here is -
20:05 - 20:05linear.
So, it's very rare to get -
20:07 - 20:07anything quadratic.
But later on we will see how to -
20:10 - 20:10make the worst-case linear as
well. -
20:12 - 20:12This would really,
really solve the problem. -
20:30 - 20:30Let's get into the analysis.
-
20:43 - 20:43Now, you've seen an analysis
much like this before. -
20:47 - 20:47What do you suggest we do in
order to analyze this expected -
20:52 - 20:52time?
It's a divide-and-conquer -
20:54 - 20:54algorithm, so we kind of like to
write down the recurrence on -
20:59 - 20:59something resembling the running
time. -
21:09 - 21:09I don't need the answer,
but what's the first step that -
21:13 - 21:13we might do to analyze the
expected running time of this -
21:17 - 21:17algorithm?
Sorry? -
21:18 - 21:18Look at different cases,
yeah. -
21:20 - 21:20Exactly.
We have all these possible ways -
21:23 - 21:23that random partition could
split. -
21:25 - 21:25It could split 0 to the n-1.
It could split in half. -
21:30 - 21:30There are n choices where it
could split. -
21:33 - 21:33How can we break into those
cases? -
21:36 - 21:36Indicator random variables.
Cool. -
21:39 - 21:39Exactly.
That's what we want to do. -
21:41 - 21:41Indicator random variable
suggests that what we're dealing -
21:46 - 21:46with is not exactly just a
function T(n) but it's a random -
21:51 - 21:51variable.
This is one subtlety. -
21:53 - 21:53T(n) depends on the random
choices, so it's really a random -
21:58 - 21:58variable.
-
22:05 - 22:05And then we're going to use
indicator random variables to -
22:08 - 22:08get a recurrence on T(n).
-
22:25 - 22:25So, T(n) is the running time of
rand-select on an input of size -
22:33 - 22:33n.
-
22:40 - 22:40And I am also going to write
down explicitly an assumption -
22:46 - 22:46about the random numbers.
-
22:55 - 22:55That they should be chosen
independently from each other. -
23:00 - 23:00Every time I call random
partition, it's generating a -
23:03 - 23:03completely independent random
number from all the other times -
23:07 - 23:07I call random partition.
That is important, -
23:10 - 23:10of course, for this analysis to
work. -
23:13 - 23:13We will see why some point down
the line. -
23:15 - 23:15And now, to sort of write down
an equation for T(n) we're going -
23:20 - 23:20to define indicator random
variables, as you suggested. -
23:36 - 23:36And we will call it X_k.
And this is for all k=0...n-1. -
23:50 - 23:50Indicator random variables
either 1 or 0. -
23:54 - 23:54And it's going to be 1 if the
partition comes out k on the -
24:01 - 24:01left-hand side.
So say the partition generates -
24:07 - 24:07a k:n-k-1 split and it is 0
otherwise. -
24:11 - 24:11We have n of these indicator
random variables between -
24:17 - 24:170...n-1.
And in each case, -
24:20 - 24:20no matter how the random choice
comes out, exactly one of them -
24:28 - 24:28will be 1.
All the others will be 0. -
24:32 - 24:32Now we can divide out the
running time of this algorithm -
24:37 - 24:37based on which case we're in.
-
24:49 - 24:49That will sort of unify this
intuition that we did and get -
24:57 - 24:57all the cases.
And then we can look at the -
25:03 - 25:03expectation.
T(n), if we just split out by -
25:09 - 25:09cases, we have an upper bound
like this. -
25:28 - 25:28If we have 0 to n-1 split,
the worst is we have n-1. -
25:33 - 25:33Then we have to recurse in a
problem of size n-1. -
25:38 - 25:38In fact, it would be pretty
hard to recurse in a problem of -
25:44 - 25:44size 0.
If we have a 1 to n-2 split -
25:47 - 25:47then we take the max of the two
sides. -
25:51 - 25:51That's certainly going to give
us an upper bound and so on. -
26:03 - 26:03And at the bottom you get an
n-1 to 0 split. -
26:14 - 26:14This is now sort of
conditioning on various events, -
26:17 - 26:17but we have indicator random
variables to tell us when these -
26:20 - 26:20events happen.
We can just multiply each of -
26:22 - 26:22these values by the indicator
random variable and it will come -
26:25 - 26:25out 0 if that's not the case and
will come out 1 and give us this -
26:28 - 26:28value if that happens to be the
split. -
26:31 - 26:31So, if we add up all of those
we'll get the same thing. -
26:38 - 26:38This is equal to the sum over
all k of the indicator random -
26:46 - 26:46variable times the cost in that
case, which is t of max k, -
26:53 - 26:53and the other side,
which is n-k-1, -
26:57 - 26:57plus theta n.
This is our recurrence, -
27:02 - 27:02in some sense,
for the random variable -
27:05 - 27:05representing running time.
Now, the value will depend on -
27:09 - 27:09which case we come into.
We know the probability of each -
27:14 - 27:14of these events happening is the
same because we're choosing the -
27:19 - 27:19partition element uniformly at
random, but we cannot really -
27:24 - 27:24simplify much beyond this until
we take expectations. -
27:29 - 27:29We know this random variable
could be as big as n^2. -
27:33 - 27:33Hopefully it's usually linear.
We will take expectations of -
27:37 - 27:37both sides and get what we want.
-
27:54 - 27:54Let's look at the expectation
of this random variable, -
27:58 - 27:58which is just the expectation,
I will copy over, -
28:02 - 28:02summation we have here so I can
work on this board. -
28:30 - 28:30I want to compute the
expectation of this summation. -
28:34 - 28:34What property of expectation
should I use? -
28:37 - 28:37Linearity, good.
We can bring the summation -
28:40 - 28:40outside.
-
29:08 - 29:08Now I have a sum of
expectation. -
29:10 - 29:10Let's look at each expectation
individually. -
29:13 - 29:13It's a product of two random
variables, if you will. -
29:16 - 29:16This is an indicator random
variable and this is some more -
29:19 - 29:19complicated function,
some more complicated random -
29:22 - 29:22variable representing some
running time, -
29:25 - 29:25which depends on what random
choices are made in that -
29:28 - 29:28recursive call.
Now what should I do? -
29:32 - 29:32I have the expectation of the
product of two random variables. -
29:38 - 29:38Independence,
exactly. -
29:40 - 29:40If I know that these two random
variables are independent then I -
29:46 - 29:46know that the expectation of the
product is the product of the -
29:51 - 29:51expectations.
Now we have to check are they -
29:55 - 29:55independent?
I hope so because otherwise -
29:59 - 29:59there isn't much else I can do.
Why are they independent? -
30:05 - 30:05Sorry?
Because we stated that they -
30:08 - 30:08are, right.
Because of this assumption. -
30:11 - 30:11We assume that all the random
numbers are chosen -
30:14 - 30:14independently.
We need to sort of interpolate -
30:18 - 30:18that here.
These X_k's, -
30:19 - 30:19all the X_k's,
X_0 up to X_n-1, -
30:22 - 30:22so all the ones appearing in
this summation are dependent -
30:26 - 30:26upon a single random choice of
this particular call to random -
30:31 - 30:31partition.
All of these are correlated, -
30:37 - 30:37because if one of them is 1,
all the others are forced to be -
30:44 - 30:440.
So, there is a lot of -
30:47 - 30:47correlation among the X_k's.
But with respect to everything -
30:55 - 30:55that is in here,
and the only random part is -
31:00 - 31:00this T(max(k, n-k-1)).
That is the reason that this -
31:07 - 31:07random variable is independent
from these. -
31:13 - 31:13The same thing as quicksort,
but I know some people got -
31:19 - 31:19confused about it a couple
lectures ago so I am -
31:25 - 31:25reiterating.
We get the product of -
31:29 - 31:29expectations,
E[X_k] E[T(max(k, n-k-1))]. -
31:35 - 31:35I mean the order n comes
outside, but let's leave it -
31:40 - 31:40inside for now.
There is no expectation to -
31:45 - 31:45compute there for order n.
Order n is order n. -
31:49 - 31:49What is the expectation of X_k?
1/n, because they're all chosen -
31:56 - 31:56with equal probability.
There is n of them, -
32:00 - 32:00so the expectation is 1/n.
The value is either 1 or 0. -
32:05 - 32:05We start to be able to split
this up. -
32:07 - 32:07We have 1/n times this expected
value of some recursive T call, -
32:12 - 32:12and then we have plus 1 over n
times order n, -
32:16 - 32:16also known as a constant,
but everything is summed up n -
32:20 - 32:20times so let's expand this.
-
32:35 - 32:35I have the sum k=0 to n-1.
I guess the 1/n can come -
32:42 - 32:42outside.
And we have expectation of -
32:47 - 32:47[T(max(k, n-k-1))].
Lots of nifty braces there. -
32:54 - 32:54And then plus we have,
on the other hand, -
33:00 - 33:00the sum k=0 to n-1.
Let me just write that out -
33:06 - 33:06again.
We have a 1/n in front and we -
33:09 - 33:09have a Theta(n) inside.
This summation is n^2. -
33:12 - 33:12And then we're dividing by n,
so this whole thing is, -
33:17 - 33:17again, order n.
Nothing fancy happened there. -
33:20 - 33:20This is really just saying the
expectation of order n is order -
33:25 - 33:25n.
Average value of order n is -
33:27 - 33:27order n.
What is interesting is this -
33:32 - 33:32part.
Now, what could we do with this -
33:35 - 33:35summation?
Here we start to differ from -
33:39 - 33:39randomized quicksort because we
have this max. -
33:43 - 33:43Randomized quicksort we had the
sum of T(k) plus T(n-k-1) -
33:48 - 33:48because we were making both
recursive calls. -
33:53 - 33:53Here we're only making the
biggest one. -
33:56 - 33:56That max is really a pain for
evaluating this recurrence. -
34:03 - 34:03How could I get rid of the max?
That's one way to think of it. -
34:12 - 34:12Yeah?
-
34:18 - 34:18Exactly.
I could only sum up to halfway -
34:21 - 34:21and then double.
In other words, -
34:23 - 34:23terms are getting repeated
twice here. -
34:26 - 34:26When k=0 or when k=n-1,
I get the same T(n-1). -
34:30 - 34:30When k=1 or n-2,
I get the same thing, -
34:34 - 34:342 and n-3.
What I will actually do is sum -
34:38 - 34:38from halfway up.
That's a little bit cleaner. -
34:43 - 34:43And let me get the indices
right. -
34:46 - 34:46Floor of n/2 up to n-1 will be
safe. -
34:49 - 34:49And then I just have E[T(k)],
except I forgot to multiply by -
34:56 - 34:562, so I'm going to change this 1
to a 2. -
35:01 - 35:01And order n is preserved.
This is just because each term -
35:05 - 35:05is appearing twice.
I can factor it out. -
35:08 - 35:08And if n is odd,
I'm actually double-counting -
35:11 - 35:11somewhat, but it's certain at
most that. -
35:13 - 35:13So, that's a safe upper bound.
And upper bounds are all we -
35:17 - 35:17care about because we're hoping
to get linear. -
35:20 - 35:20And the running time of this
algorithm is definitely at least -
35:25 - 35:25linear, so we just need an upper
bounded linear. -
35:29 - 35:29So, this is a recurrence.
E[T(n)] is at most 2/n times -
35:33 - 35:33the sum of half the numbers
between 0 and n of -
35:36 - 35:36E[T(k)]+Theta(n).
It's a bit of hairy recurrence. -
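Collected into one line, the recurrence on the board is:

```latex
\mathrm{E}[T(n)] \;\le\; \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} \mathrm{E}[T(k)] \;+\; \Theta(n).
```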
35:40 - 35:40We want to solve it,
though. -
35:42 - 35:42And it's actually a little bit
easier than the randomized -
35:46 - 35:46quicksort recurrence.
We're going to solve it. -
35:49 - 35:49What method should we use?
Sorry? -
35:51 - 35:51Master method?
Master would be nice, -
35:54 - 35:54except that each of the
recursive calls is with a -
35:57 - 35:57different value of k.
The master method only works -
36:02 - 36:02when all the calls are with the
same value, same size. -
36:06 - 36:06Alas, it would be nice if we
could use the master method. -
36:09 - 36:09What else do we have?
Substitution. -
36:12 - 36:12When it's hard,
when in doubt, -
36:14 - 36:14use substitution.
I mean the good thing here is -
36:17 - 36:17we know what we want.
From the intuition at least, -
36:20 - 36:20which is now erased,
we really feel that this should -
36:24 - 36:24be linear time.
So, we know what we want to -
36:26 - 36:26prove.
And indeed we can prove it just -
36:32 - 36:32directly with substitution.
-
36:42 - 36:42I want to claim there is some
constant c greater than zero -
36:46 - 36:46such that E[T(n)],
according to this recurrence, -
36:50 - 36:50is at most c times n.
Let's prove that over here. -
37:00 - 37:00As we guessed,
the proof is by substitution. -
37:13 - 37:13What that means is we're going
to assume, by induction, -
37:18 - 37:18that this inequality is true
for all smaller m. -
37:23 - 37:23I will just say for m less than n.
And we need to prove it for n. -
37:29 - 37:29We get E[T(n)].
Now we are just going to expand -
37:34 - 37:34using the recurrence that we
have. -
37:36 - 37:36It's at most this.
I will copy that over. -
37:54 - 37:54And then each of these
recursive calls is with some -
37:58 - 37:58value k that is strictly smaller
than n. -
38:01 - 38:01Sorry, I copied it wrong,
floor of n over 2, -
38:04 - 38:04not zero.
And so I can apply the -
38:07 - 38:07induction hypothesis to each of
these. -
38:11 - 38:11This is at most c times k by
the induction hypothesis. -
38:17 - 38:17And so I get this inequality.
-
38:37 - 38:37This c can come outside the
summation because it's just a -
38:41 - 38:41constant.
And I will be slightly tedious -
38:43 - 38:43in writing this down again,
because what I care about is -
38:47 - 38:47the summation here that is left
over. -
38:56 - 38:56This is a good old-fashioned
summation. -
39:01 - 39:01And if you remember back to
your summation tricks or -
39:05 - 39:05whatever, you should be able to
evaluate this. -
39:08 - 39:08If we started at zero and went
up to n minus 1, -
39:11 - 39:11that's just an arithmetic
series, but here we have the -
39:15 - 39:15tail end of an arithmetic
series. -
39:17 - 39:17And you should know,
at least up to theta, -
39:20 - 39:20what this is,
right? -
39:21 - 39:21n^2, yeah.
It's definitely Theta(n^2). -
39:24 - 39:24But we need here a slightly
better upper bound, -
39:27 - 39:27as we will see the constants
really matter. -
39:31 - 39:31What we're going to use is that
this summation is at most 3/8 -
39:35 - 39:35times n^2.
And that will be critical, -
39:38 - 39:38the fact that 3/8 is smaller
than 1/2, I believe. -
39:42 - 39:42So it's going to get rid of
this 2. -
39:44 - 39:44I am not going to prove this.
This is an exercise. -
39:48 - 39:48When you know that it is true,
it's easy because you can just -
39:52 - 39:52prove it by induction.
Figuring out that number is a -
39:56 - 39:56little bit more work,
but not too much more. -
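The claimed bound is easy to spot-check numerically before doing that induction (a sketch; `tail_sum` is my name for the summation):

```python
def tail_sum(n):
    """Sum of k over k = floor(n/2), ..., n-1: the tail of an arithmetic series."""
    return sum(range(n // 2, n))

# Claimed in lecture: tail_sum(n) <= (3/8) * n^2.  The constant 3/8 < 1/2 is
# what lets the 2/n factor in the recurrence leave room for the Theta(n) term.
for n in range(1, 1000):
    assert 8 * tail_sum(n) <= 3 * n * n
```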
40:00 - 40:00So you should prove that by
induction. -
40:04 - 40:04Now let me simplify.
This is a bit messy, -
40:09 - 40:09but what I want is c times n.
Let's write it as our desired -
40:16 - 40:16value minus the residual.
And here we have some crazy -
40:22 - 40:22fractions.
This is 2 times 3 which is 6 -
40:27 - 40:27over 8 which is 3/4,
right? -
40:31 - 40:31Here we have 1,
so we have to subtract off 1/4 -
40:35 - 40:35to get 3/4.
And this should be, -
40:37 - 40:37I guess, 1/4 times c times n.
And then we have this theta n -
40:42 - 40:42with double negation becomes a
plus theta n. -
40:46 - 40:46That should be clear.
I am just rewriting that. -
40:50 - 40:50So we have what we want over
here. -
40:53 - 40:53And then we hope that this is
nonnegative because what we want -
40:58 - 40:58is that this less than or equal
to c times n. -
41:03 - 41:03That will be true,
provided this thing is -
41:06 - 41:06nonnegative.
And it looks pretty good -
41:09 - 41:09because we're free to choose c
however large we want. -
41:13 - 41:13Whatever constant is embedded
in this theta notation is one -
41:18 - 41:18fixed constant,
whatever makes this recurrence -
41:21 - 41:21true.
We just set c to be bigger than -
41:24 - 41:244 times that constant and then
this will be nonnegative. -
41:28 - 41:28So this is true for c
sufficiently large to dwarf that -
41:33 - 41:33theta constant.
There is also the base case. -
41:37 - 41:37I just have to make the cursory
mention that we choose c large -
41:41 - 41:41enough so that this claim is
true, even in the base case -
41:46 - 41:46where n is at most some
constant. -
41:48 - 41:48Here it's like 1 or so because
then we're not making a -
41:52 - 41:52recursive call.
What we get -- -
41:55 - 41:55This algorithm,
randomized select, -
41:59 - 41:59has expected running time order
n, Theta(n). -
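As a recap, here is a minimal sketch of randomized select (the list-comprehension structure and names are my own; the lecture's version partitions in place):

```python
import random

def rand_select(a, i):
    """Return the i-th smallest element (1-indexed) of a list of
    distinct numbers.  Expected Theta(n); worst case Theta(n^2)."""
    if len(a) == 1:
        return a[0]
    pivot = random.choice(a)            # random pivot, good in expectation
    lo = [x for x in a if x < pivot]    # everything below the pivot
    hi = [x for x in a if x > pivot]    # everything above the pivot
    k = len(lo) + 1                     # rank of the pivot
    if i == k:
        return pivot
    elif i < k:
        return rand_select(lo, i)       # answer lies in the low side
    else:
        return rand_select(hi, i - k)   # answer lies in the high side
```

Note only one recursive call is made per level, which is why this beats sorting.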
42:12 - 42:12The annoying thing is that in
the worst-case, -
42:16 - 42:16if you're really,
really unlucky it's n^2. -
42:19 - 42:19Any questions before we move on
from this point? -
42:24 - 42:24This finished off the proof of
this fact that we have Theta(n) -
42:29 - 42:29expected time.
We already saw the n^2 -
42:33 - 42:33worst-case.
All perfectly clear? -
42:35 - 42:35Good.
You should go over these -
42:37 - 42:37proofs.
They're intrinsically related: -
42:40 - 42:40the proofs for randomized quicksort
and randomized select mirror each other. -
42:44 - 42:44Know them in your heart.
This is a great algorithm that -
42:48 - 42:48works really well in practice
because most of the time you're -
42:52 - 42:52going to split,
say, in the middle, -
42:55 - 42:55somewhere between 1/4 and 3/4
and everything is good. -
43:00 - 43:00It's extremely unlikely that
you get the n^2 worst-case. -
43:03 - 43:03It would have to happen with
like 1 over n^n probability or -
43:07 - 43:07something really,
really small. -
43:09 - 43:09But I am a theoretician at
least. -
43:10 - 43:10And it would be really nice if
you could get Theta(n) in the -
43:14 - 43:14worst-case.
That would be the cleanest -
43:16 - 43:16result that you could hope for
because that's optimal. -
43:19 - 43:19You cannot do better than
Theta(n). -
43:21 - 43:21You've got to look at the
elements. -
43:23 - 43:23So, you might ask,
can we get rid of this -
43:26 - 43:26worst-case behavior and somehow
avoid randomization and -
43:29 - 43:29guarantee Theta(n) worst-case
running time? -
43:33 - 43:33And you can but it's a rather
nontrivial algorithm. -
43:39 - 43:39And this is going to be one of
the most sophisticated that -
43:46 - 43:46we've seen so far.
It won't continue to be the -
43:51 - 43:51most sophisticated algorithm we
will see, but here it is. -
43:58 - 43:58Worst-case linear time order
statistics. -
44:09 - 44:09And this is an algorithm by
several, all very famous people, -
44:23 - 44:23Blum, Floyd,
Pratt, Rivest and Tarjan. -
44:32 - 44:32I think I've only met the B and
the R and the T. -
44:35 - 44:35Oh, no, I've met Pratt as well.
I'm getting close to all the -
44:40 - 44:40authors.
This is a somewhat old result, -
44:42 - 44:42but at the time it was a major
breakthrough and still is an -
44:47 - 44:47amazing algorithm.
Ron Rivest is a professor here. -
44:50 - 44:50You should know him from the R
in RSA. -
44:53 - 44:53When I took my PhD
comprehensives some time ago, -
44:56 - 44:56on the cover sheet was a joke
question. -
45:00 - 45:00It asked of the authors of the
worst-case linear time order -
45:05 - 45:05statistics algorithm,
which of them is the most rich? -
45:09 - 45:09Sadly it was not a graded part
of the comprehensive exam, -
45:13 - 45:13but it was an amusing question.
I won't answer it here because -
45:18 - 45:18we're on tape,
[LAUGHTER] but think about it. -
45:22 - 45:22It may not be obvious.
Several of them are rich. -
45:25 - 45:25It's just the question of who
is the most rich. -
45:30 - 45:30Anyway, before they were rich
they came up with this -
45:34 - 45:34algorithm.
They've come up with many -
45:36 - 45:36algorithms since,
even after getting rich, -
45:39 - 45:39believe it or not.
What we want is a good pivot, -
45:42 - 45:42guaranteed good pivot.
Random pivot is going to be -
45:45 - 45:45really good.
And so the simplest algorithm -
45:48 - 45:48is just pick a random pivot.
It's going to be good with high -
45:52 - 45:52probability.
We want to force a good pivot -
45:55 - 45:55deterministically.
And the new idea here is we're -
45:58 - 45:58going to generate it
recursively. -
46:02 - 46:02What else could we do but
recurse? -
46:04 - 46:04Well, you should know from your
recurrences that if we did two -
46:09 - 46:09recursive calls on problems of
half the size and we have a -
46:13 - 46:13linear extra work that's the
mergesort recurrence, -
46:16 - 46:16T(n) = 2T(n/2) + Theta(n).
You should be able to recite it in your -
46:20 - 46:20sleep.
That's n lg n. -
46:21 - 46:21So we cannot recurse on two
problems of half the size. -
46:25 - 46:25We've got to do better.
Somehow these recursions have -
46:30 - 46:30to add up to strictly less than
n. -
46:33 - 46:33That's the magic of this
algorithm. -
46:35 - 46:35So this will just be called
select instead of rand-select. -
46:40 - 46:40And it really depends on an
array, but I will focus on the -
46:44 - 46:44i-th element that we want to
select and the size of the array -
46:49 - 46:49that we want to select in.
And I am going to write this -
46:53 - 46:53algorithm slightly less formally
than randomized-select because -
46:58 - 46:58it's a bit higher level of an
algorithm. -
47:22 - 47:22And let me draw over here the
picture of the algorithm. -
47:31 - 47:31The first step is sort of the
weirdest and it's one of the key -
47:36 - 47:36ideas.
You take your elements, -
47:39 - 47:39and they are in no particular
order, so instead of drawing -
47:44 - 47:44them on a line,
I am going to draw them in a 5 -
47:48 - 47:48by n over 5 grid.
Why not? -
47:50 - 47:50This, unfortunately,
takes a little while to draw, -
47:54 - 47:54but it will take you equally
long so I will take my time. -
48:00 - 48:00It doesn't really matter what
the width is, -
48:03 - 48:03but it should be width n over 5
so make sure you draw your -
48:06 - 48:06figure accordingly.
Width n over 5, -
48:08 - 48:08but the height should be
exactly 5. -
48:10 - 48:10I think I got it right.
I can count that high. -
48:13 - 48:13Here is 5.
And this should be, -
48:15 - 48:15well, you know,
our number may not be divisible -
48:18 - 48:18by 5, so maybe it ends off in
sort of an odd way. -
48:21 - 48:21But what I would like is that
these chunks should be floor of -
48:25 - 48:25n over 5.
And then we will have, -
48:27 - 48:27at most, four elements left
over. -
48:30 - 48:30So I am going to ignore those.
They don't really matter. -
48:34 - 48:34It's just an additive constant.
Here is my array. -
48:37 - 48:37I just happened to write it in
this funny way. -
48:40 - 48:40And I will call these vertical
things groups. -
48:43 - 48:43I would circle them,
and I did that in my notes, -
48:46 - 48:46but things get really messy if
you start circling. -
48:49 - 48:49This diagram is going to get
really full, just to warn you. -
48:53 - 48:53By the end it will be almost
unintelligible, -
48:56 - 48:56but there it is.
If you are really feeling -
49:00 - 49:00bored, you can draw this a few
times. -
49:03 - 49:03And you should draw how it
grows. -
49:06 - 49:06So there are the groups,
vertical groups of five. -
49:10 - 49:10Next step.
-
49:18 - 49:18The second step is to recurse.
This is where things are a bit -
49:25 - 49:25unusual, well,
even more unusual. -
49:28 - 49:28Oops, sorry.
I really should have had a line -
49:33 - 49:33between one and two so I am
going to have to move this down -
49:37 - 49:37and insert it here.
I also, in step one, -
49:40 - 49:40want to find the median of each
group. -
49:53 - 49:53What I would like to do is just
imagine this figure, -
49:56 - 49:56each of the five elements in
each group gets reorganized so -
50:00 - 50:00that the middle one is the
median. -
50:02 - 50:02So I am going to call these the
medians of each group. -
50:06 - 50:06I have five elements so the
median is right in the middle. -
50:10 - 50:10There are two elements less
than the median, -
50:13 - 50:13two elements greater than the
median. -
50:16 - 50:16Again, we're assuming all
elements are distinct. -
50:19 - 50:19So there they are.
I compute them. -
50:22 - 50:22How long does that take me?
N over five groups, -
50:25 - 50:25each with five elements,
compute the median of each one? -
50:30 - 50:30Sorry?
Yeah, 2 times n over 5. -
50:32 - 50:32It's theta n,
that's all I need to know. -
50:35 - 50:35I mean, you're counting
comparisons, which is good. -
50:38 - 50:38It's definitely Theta(n).
The point is within each group, -
50:42 - 50:42I only have to do a constant
number of comparisons because -
50:46 - 50:46it's a constant number of
elements. -
50:48 - 50:48It doesn't matter.
You could use randomize select -
50:52 - 50:52for all I care.
No matter what you do, -
50:54 - 50:54it can only take a constant
number of comparisons. -
50:59 - 50:59As long as you don't make a
comparison more than once. -
51:03 - 51:03So this is easy.
You could sort the five numbers -
51:07 - 51:07and then look at the third one,
it doesn't matter because there -
51:12 - 51:12are only five of them.
That's one nifty idea. -
51:16 - 51:16Already we have some elements
that are sort of vaguely in the -
51:21 - 51:21middle but just of the group.
And we've only done linear -
51:26 - 51:26work.
So doing well so far. -
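Step 1 is simple enough to write out as a sketch (leftover elements beyond a multiple of 5 are dropped, exactly as in the picture):

```python
def group_medians(a):
    """Medians of the floor(n/5) groups of 5; Theta(n) total work since
    each group costs a constant number of comparisons (here, sorting 5)."""
    full = len(a) - len(a) % 5          # ignore the <= 4 leftover elements
    return [sorted(a[j:j + 5])[2] for j in range(0, full, 5)]
```

Sorting each group is overkill but harmless: five elements is a constant amount of work no matter how you find the median.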
51:29 - 51:29Now we get to the second step,
which I started to write -
51:34 - 51:34before, where we recurse.
-
51:58 - 51:58So the next idea is,
well, we have these floor of -
52:02 - 52:02n over 5 medians.
I am going to compute the -
52:04 - 52:04median of those medians.
I am imagining that I -
52:07 - 52:07rearranged these.
And, unfortunately, -
52:09 - 52:09it's an even number,
there are six of them, -
52:12 - 52:12but I will rearrange so that
this guy, I have drawn in a -
52:15 - 52:15second box, is the median of
these elements so that these two -
52:19 - 52:19elements are strictly less than
this guy, these three elements -
52:22 - 52:22are strictly greater than this
guy. -
52:24 - 52:24Now, that doesn't directly tell
me anything, it would seem, -
52:28 - 52:28about any of the elements out
here. -
52:31 - 52:31We will come back to that.
In fact, it does tell us about -
52:35 - 52:35some of the elements.
But right now this element is -
52:39 - 52:39just the median of these guys.
Each of these guys is a median -
52:43 - 52:43of five elements.
That's all we know. -
52:45 - 52:45If we do that recursively,
this is going to take T of n -
52:49 - 52:49over 5 time.
So far so good. -
52:51 - 52:51We can afford a recursion on a
problem of size n over 5 and -
52:55 - 52:55linear work.
We know that works out to -
52:58 - 52:58linear time.
But there is more. -
53:01 - 53:01We're obviously not done yet.
-
53:10 - 53:10The next step is x is our
partition element. -
53:13 - 53:13We partition there.
The rest of the algorithm is -
53:16 - 53:16just like randomized partition,
so we're going to define k to -
53:19 - 53:19be the rank of x.
And this can be done, -
53:22 - 53:22I mean it's n minus r plus 1 or
whatever, but I'm not going to -
53:26 - 53:26write out how to do that because
we're at a higher level here. -
53:30 - 53:30But it can be done.
And then we have the three-way -
53:35 - 53:35branching.
So if i happens to equal k -
53:38 - 53:38we're happy.
The pivot element is the -
53:41 - 53:41element we're looking for,
but more likely i is either -
53:46 - 53:46less than k or it is bigger than
k. -
53:49 - 53:49And then we make the
appropriate recursive call, -
53:54 - 53:54so here we recursively select
the i-th smallest element -- -
54:08 - 54:08-- in the lower part of the
array. -
54:11 - 54:11Left of the partition element.
Otherwise, we recursively -
54:16 - 54:16select the i minus k-th smallest
element in the upper part of the -
54:23 - 54:23array.
I am writing this at a high -
54:26 - 54:26level because we've already seen
it. -
54:30 - 54:30All of this is the same as the
last couple steps of randomized -
54:36 - 54:36select.
-
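Putting the four steps together, a compact sketch of the whole algorithm (in this sketch the leftovers go into one short final group rather than being ignored, and the base-case cutoff of 50 matches the constant used later in the analysis):

```python
def select(a, i):
    """Worst-case Theta(n) i-th smallest (1-indexed); distinct elements
    assumed, leftovers folded into a short last group."""
    if len(a) <= 50:                        # base case: constant size
        return sorted(a)[i - 1]
    # Step 1: median of each group of (at most) 5.
    medians = [sorted(g)[len(g) // 2]
               for g in (a[j:j + 5] for j in range(0, len(a), 5))]
    # Step 2: recursively find x, the median of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 3: partition around x and compute its rank k.
    lo = [y for y in a if y < x]
    hi = [y for y in a if y > x]
    k = len(lo) + 1
    # Step 4: three-way branch, recursing on at most one side.
    if i == k:
        return x
    elif i < k:
        return select(lo, i)
    else:
        return select(hi, i - k)
```

Structurally it is randomized select with the random pivot replaced by the recursively computed median of medians.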
54:45 - 54:45That is the algorithm.
The real question is why does -
54:48 - 54:48it work?
Why is this linear time? -
54:50 - 54:50The first question is what's
the recurrence? -
54:53 - 54:53We cannot quite write it down
yet because we don't know how -
54:57 - 54:57big these recursive subproblems
could be. -
55:00 - 55:00We're going to either recurse
in the lower part or the upper -
55:04 - 55:04part, that's just like before.
If we're unlucky and we have a -
55:08 - 55:08split of like zero to n minus
one, this is going to be a -
55:12 - 55:12quadratic time algorithm.
The claim is that this -
55:15 - 55:15partition element is guaranteed
to be pretty good and good -
55:19 - 55:19enough.
The running time of this thing -
55:21 - 55:21will be T of something times n,
and we don't know what the -
55:25 - 55:25something is yet.
How big could it be? -
55:27 - 55:27Well, I could ask you.
But we're sort of indirect here -
55:32 - 55:32so I will tell you.
We have already a recursive -
55:35 - 55:35call of T of n over 5.
It better be that whatever -
55:38 - 55:38constant, so it's going to be
something times n, -
55:41 - 55:41it better be that that constant
is strictly less than 4/5. -
55:45 - 55:45If it's equal to 4/5 then
you're not splitting up the -
55:48 - 55:48problem enough to get an n lg n
running time. -
55:51 - 55:51If it's strictly less than 4/5
then you're reducing the problem -
55:55 - 55:55by at least a constant factor.
In the sense if you add up all -
56:00 - 56:00the recursive subproblems,
n over 5 and something times n, -
56:04 - 56:04you get something that is a
constant strictly less than one -
56:07 - 56:07times n.
That forces the work to be -
56:10 - 56:10geometric.
If it's geometric you're going -
56:12 - 56:12to get linear time.
So this is intuition but it's -
56:15 - 56:15the right intuition.
Whenever you're aiming for -
56:18 - 56:18linear time keep that in mind.
If you're doing a -
56:21 - 56:21divide-and-conquer,
you've got to get the total -
56:24 - 56:24subproblem size to be some
constant less than one times n. -
56:28 - 56:28That will work.
OK, so we've got to work out -
56:33 - 56:33this constant here.
And we're going to use this -
56:37 - 56:37figure, which so far looks
surprisingly uncluttered. -
56:42 - 56:42Now we will make it cluttered.
What I would like to do is draw -
56:48 - 56:48an arrow between two vertices,
two points, elements, -
56:54 - 56:54whatever you want to call them.
Let's call them a and b. -
57:00 - 57:00And I want to orient the arrow
so it points to a larger value, -
57:04 - 57:04so this means that a is less
than b. -
57:07 - 57:07This is notation just for the
diagram. -
57:10 - 57:10And so this element,
I am going to write down what I -
57:13 - 57:13know.
This element is the median of -
57:16 - 57:16these five elements.
I will suppose that it is drawn -
57:19 - 57:19so that these elements are
larger than the median, -
57:23 - 57:23these elements are smaller than
the median. -
57:26 - 57:26Therefore, I have arrows like
this. -
57:28 - 57:28Here is where I wish I had some
colored chalk. -
57:33 - 57:33This is just stating this guy
is in the middle of those five -
57:37 - 57:37elements.
I know that in every single -
57:39 - 57:39column.
-
57:55 - 57:55Here is where the diagram
starts to get messy. -
57:58 - 57:58I am not done yet.
Now, we also know that this -
58:02 - 58:02element is the median of the
medians. -
58:04 - 58:04Of all the squared elements,
this guy is the middle. -
58:07 - 58:07And I will draw it so that
these are the ones smaller than -
58:10 - 58:10the median, these are the ones
larger than the median. -
58:13 - 58:13I mean the algorithm cannot do
this. -
58:15 - 58:15It doesn't necessarily know how
all this works. -
58:18 - 58:18I guess it could,
but this is just for analysis -
58:21 - 58:21purposes.
We know this guy is bigger than -
58:23 - 58:23that one and bigger than that
one. -
58:25 - 58:25We don't directly know about
the other elements. -
58:29 - 58:29We just know that that one is
bigger than both of those and -
58:33 - 58:33this guy is smaller than these.
Now, that is as messy as the -
58:37 - 58:37figure will get.
Now, the nice thing about less -
58:41 - 58:41than is that it's a transitive
relation. -
58:43 - 58:43If I have a directed path in
this graph, I know that this -
58:47 - 58:47element is strictly less than
that element because this is -
58:51 - 58:51less than that one and this is
less than that one. -
58:55 - 58:55Even though directly I only
know within a column and within -
58:59 - 58:59this middle row,
I actually know that this -
59:02 - 59:02element --
This is x, by the way. -
59:06 - 59:06This element is larger than all
of these elements because it's -
59:11 - 59:11larger than this one and this
one and each of these is larger -
59:15 - 59:15than all of those by these
arrows. -
59:18 - 59:18I also know that all of these
elements in this rectangle here, -
59:23 - 59:23and you don't have to do this
but I will make the background -
59:28 - 59:28even more cluttered.
All of these elements in this -
59:33 - 59:33rectangle are greater than or
equal to this one and all of the -
59:38 - 59:38elements in this rectangle are
less than or equal to x. -
59:43 - 59:43Now, how many are there?
Well, this is roughly halfway -
59:47 - 59:47along the set of groups and this
is 3/5 of these columns. -
59:52 - 59:52So what we get is that there
are at least -- -
59:57 - 59:57We have n over 5 groups and we
have half of the groups that -
60:04 - 60:04we're looking at here roughly,
so let's call that floor of n -
60:10 - 60:10over 5 over 2, and then within each
group we have three elements. -
60:17 - 60:17So we have at least 3 times
floor of floor of n over 5 over -
60:23 - 60:232 elements that are less
than or equal to x. -
60:30 - 60:30And we have the same that are
greater than or equal to x. -
60:36 - 60:36Let me simplify this a little
bit more. -
60:40 - 60:40I can also give you some more
justification, -
60:45 - 60:45and we drew the picture,
but just for why this is true. -
60:51 - 60:51We have at least n over 5 over
2 group medians that are less -
60:58 - 60:58than or equal to x.
This is the argument we use. -
61:03 - 61:03We have half of the group
medians are less than or equal -
61:06 - 61:06to x because x is the median of
the group median, -
61:09 - 61:09so that is no big surprise.
This is almost an equality but -
61:12 - 61:12we're making floors so it's
greater than or equal to. -
61:15 - 61:15And then, for each group
median, we know that there are -
61:18 - 61:18three elements there that are
less than or equal to that group -
61:22 - 61:22median.
So, by transitivity, -
61:23 - 61:23they're also less than or equal
to x. -
61:25 - 61:25We get this number times three.
This is actually just floor of -
61:31 - 61:31n over 10.
I was being unnecessarily -
61:34 - 61:34complicated there,
but that is where it came from. -
61:38 - 61:38What we know is that this thing
is now at least 3 times n over -
61:44 - 61:4410, which is roughly 3/10 of
elements are in one side. -
61:48 - 61:48In fact, at least 3/10 of the
elements are in each side. -
61:53 - 61:53Therefore, each side has at
most 7/10 elements roughly. -
61:59 - 61:59So the number here will be
7/10. -
62:01 - 62:01And, if I'm lucky,
7/10 plus 1/5 is strictly less -
62:05 - 62:05than one.
I believe it is, -
62:06 - 62:06but I have trouble working with
tenths. -
62:09 - 62:09I can only handle powers of
two. -
62:11 - 62:11What we're going to use is a
minor simplification, -
62:15 - 62:15which just barely still works
and is a little bit easier to think -
62:19 - 62:19about.
It's mainly to get rid of this -
62:22 - 62:22floor because the floor is
annoying. -
62:24 - 62:24And we don't really have a
sloppiness lemma that applies -
62:28 - 62:28here.
It turns out if n is -
62:31 - 62:31sufficiently large,
3 times floor of n over 10 is -
62:35 - 62:35greater than or equal to n over 4.
Quarters I can handle. -
62:39 - 62:39The claim is that each side
has size at least n over 4, -
62:42 - 62:42therefore each side has size
at most 3n over 4 because there's a -
62:47 - 62:47quarter on the other side.
This will be 3/4. -
62:49 - 62:49And I can definitely tell that
1/5 is less than 1/4. -
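That "sufficiently large" can be pinned down numerically; the claim 3 * floor(n/10) >= n/4 holds from n = 50 onward (which is where the base-case cutoff of 50 comes from):

```python
def bound_holds(n):
    # The simplified pivot guarantee: 3*floor(n/10) elements on each side.
    return 3 * (n // 10) >= n / 4

assert all(bound_holds(n) for n in range(50, 100000))
assert not bound_holds(49)   # 12 < 12.25: fails just below the cutoff
```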
62:53 - 62:53This is going to add up to
something strictly less than one -
62:57 - 62:57and then it will work.
How is my time? -
63:01 - 63:01Good.
At this point, -
63:03 - 63:03the rest of the analysis is
easy. -
63:06 - 63:06How the heck you would come up
with this algorithm, -
63:10 - 63:10you realize that this is
clearly a really good choice for -
63:15 - 63:15finding a partition element,
just barely good enough that -
63:20 - 63:20both recursions add up to linear
time. -
63:23 - 63:23Well, that's why it took so
many famous people. -
63:28 - 63:28Especially in quizzes,
but I think in general this -
63:31 - 63:31class, you won't have to come up
with an algorithm this clever -
63:34 - 63:34because you can just use this
algorithm to find the median. -
63:38 - 63:38And the median is a really good
partition element. -
63:40 - 63:40Now that you know this
algorithm, now that we're beyond -
63:43 - 63:431973, you don't need to know how
to do this. -
63:46 - 63:46I mean you should know how this
algorithm works, -
63:48 - 63:48but you don't need to do this
in another algorithm because you -
63:52 - 63:52can just say run this algorithm,
you will get the median in -
63:55 - 63:55linear time, and then you can
partition to the left and the -
63:59 - 63:59right.
And then the left and the right -
64:02 - 64:02will have exactly equal size.
Great. -
64:05 - 64:05This is a really powerful
subroutine. -
64:07 - 64:07You could use this all over the
place, and you will on Friday. -
64:12 - 64:12Have I analyzed the running
time pretty much? -
64:15 - 64:15The first step is linear.
The second step is T of n over -
64:19 - 64:195.
The third step, -
64:20 - 64:20I didn't write it,
is linear. -
64:22 - 64:22And then the last step is just
a recursive call. -
64:25 - 64:25And now we know that this is
3/4. -
64:34 - 64:34I get this recurrence.
T of n is, I'll say at most, -
64:40 - 64:40T of n over 5 plus T of 3/4n.
You could have also used 7/10. -
64:47 - 64:47It would give the same answer,
but you would also need a floor -
64:54 - 64:54so we won't do that.
I claim that this is linear. -
65:01 - 65:01How should I prove it?
Substitution. -
65:12 - 65:12Claim that T of n is at most
again c times n, -
65:16 - 65:16that will be enough.
Proof is by substitution. -
65:20 - 65:20Again, we assume this is true
for smaller n. -
65:24 - 65:24And want to prove it for n.
We have T of n is at most this -
65:29 - 65:29thing.
T of n over 5. -
65:31 - 65:31And by induction,
because n over 5 is smaller than
65:36 - 65:36n, we know that this is at most
c. -
65:40 - 65:40Let me write it as c over 5
times n. -
65:44 - 65:44Sure, why not.
Then we have here 3/4cn. -
65:48 - 65:48And then we have a linear term.
Now, unfortunately, -
65:53 - 65:53I have to deal with things that
are not powers of two. -
66:00 - 66:00I will cheat and look at my
notes. -
66:02 - 66:02This is also known as 19/20
times c times n plus theta n. -
66:07 - 66:07And the point is just that this
is strictly less than one. -
66:11 - 66:11Because it's strictly less than
one, I can write this as one -
66:15 - 66:15times c times n minus some
constant, here it happens to be -
66:19 - 66:191/20, as long as I have
something left over here, -
66:23 - 66:231/20 times c times n.
Then I have this annoying theta -
66:27 - 66:27n term which I want to get rid
of because I want this to be -
66:31 - 66:31nonnegative.
But it is nonnegative, -
66:35 - 66:35as long as I set c to be
really, really large, -
66:38 - 66:38at least 20 times whatever
constant is here. -
66:42 - 66:42So this is at most c times n
for c sufficiently large. -
66:46 - 66:46And, oh, by the way,
if n is less than or equal to -
66:50 - 66:5050, which we used up here,
then T of n is a constant, -
66:54 - 66:54it doesn't really matter what
you do, and T of n is at most c -
66:59 - 66:59times n for c sufficiently
large. -
67:03 - 67:03That proves this claim.
Of course, the constant here is -
67:06 - 67:06pretty damn big.
It depends exactly what the -
67:08 - 67:08constants and the running times
are, which depends on your -
67:12 - 67:12machine, but practically this
algorithm is not so hot because -
67:15 - 67:15the constants are pretty big.
Even though this element is -
67:18 - 67:18guaranteed to be somewhere
vaguely in the middle, -
67:21 - 67:21and even though these
recursions add up to strictly -
67:24 - 67:24less than n and it's geometric,
it's geometric because the -
67:27 - 67:27problem is reducing by at least
a factor of 19/20 each time. -
67:31 - 67:31So it actually takes a while
for the problem to get really -
67:35 - 67:35small.
Practically you probably don't -
67:37 - 67:37want to use this algorithm
unless you cannot somehow flip -
67:41 - 67:41coins.
The randomized algorithm works -
67:43 - 67:43really, really fast.
Theoretically this is your -
67:46 - 67:46dream, the best you could hope
for because it's linear time and -
67:50 - 67:50you need linear time as
guaranteed linear time. -
67:53 - 67:53I will mention,
before we end, -
67:55 - 67:55an exercise.
-
68:03 - 68:03Why did we use groups of five?
Why not groups of three? -
68:06 - 68:06As you might guess,
the answer is because it -
68:09 - 68:09doesn't work with groups of
three. -
68:11 - 68:11But it's quite instructive to
find out why. -
68:14 - 68:14If you work through this math
with groups of three instead of -
68:18 - 68:18groups of five,
you will find that you don't -
68:20 - 68:20quite get the problem reduction
that you need. -
68:23 - 68:23Five is the smallest number for
which this works. -
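The arithmetic behind that exercise can be checked directly: with group size g, the two recursive calls have total size n/g + (3g-1)/(4g) * n (this closed form is my simplification of the counting argument), and linear time needs that fraction strictly below 1.

```python
from fractions import Fraction

def total_fraction(g):
    """Total size of the two recursive subproblems as a fraction of n:
    n/g for the median-of-medians call plus at most (3g-1)/(4g)*n."""
    return Fraction(1, g) + Fraction(3 * g - 1, 4 * g)

assert total_fraction(3) == 1                  # groups of 3: no shrinkage
assert total_fraction(5) == Fraction(9, 10)    # 1/5 + 7/10, as in lecture
assert total_fraction(7) == Fraction(6, 7)     # 7 works too
```

At g = 3 the fractions sum to exactly 1, so the work per level stays at n and the recursion costs n lg n, just as the exercise suggests.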
68:27 - 68:27It would work with seven,
but theoretically not any -
68:30 - 68:30better than a constant factor.
Any questions? -
68:33 - 68:33All right.
Then recitation Friday. -
68:35 - 68:35Homework lab Sunday.
Problem set due Monday. -
68:38 - 68:38Quiz one in two weeks.
Description:
Lecture 06: Order Statistics, Median
View the complete course at: http://ocw.mit.edu/6-046JF05
License: Creative Commons BY-NC-SA
More information at http://ocw.mit.edu/terms
More courses at http://ocw.mit.edu
Duration: 01:08:49