Lec 4 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005
OK. Today we are going to talk about a very interesting algorithm called quicksort, which was invented by Tony Hoare in 1962. It has ended up being a really interesting algorithm from many points of view, and because of that, it turns out today's lecture is going to be both hard and fast. If you see the person next to you sleeping, you will want to say let's get going.

It's a divide-and-conquer algorithm. And it sorts, as they say, in place, meaning that it just rearranges the elements where they are, the way insertion sort rearranges elements where they are. Mergesort does not. Mergesort requires extra storage in order to do the merge operation; to merge in linear time, it doesn't merge in place, it doesn't do it just by rearranging. Quicksort is nice because it is in place, so it is fairly efficient in its use of storage. And it also happens to be very practical if you tune it a bit. The basic algorithm, if you just implement it directly, is not necessarily that efficient. But if you then do the standard kinds of things you do to goose up the runtime of something, and we'll talk a little about what those things are, then it can be very, very practical.

So, it uses the divide-and-conquer paradigm.
The first step is divide. And it does this basically by partitioning. It partitions the input array into two subarrays around an element we call the pivot, x, such that the elements in the lower subarray are less than or equal to x, and x is less than or equal to the elements in the upper subarray. If we draw a picture of the input array, this partition step basically takes some element x, and everything over here is less than or equal to x after the partition step, and everything over here is greater than or equal to x.

And so now the conquer step is pretty easy. You just recursively sort the two subarrays. So, I recursively sort the elements less than or equal to x, and I recursively sort the elements greater than or equal to x. And then combine is just trivial. Because once I have sorted the things less than or equal to x, and then sorted the things greater than or equal to x, the whole thing is sorted. So, there is really nothing to do for the combine.

The key step in quicksort is this partition step. That is the thing that does all of the work. And so you can view quicksort as just recursive partitioning. That's all it is. Just as mergesort was recursive merging, quicksort sort of goes the other way around and does recursive partitioning.
The key is the linear-time, by which I mean Theta(n), partitioning subroutine. And here is some pseudocode for it. This is actually slightly different from the book's version. In fact, there is a nice problem in the book that has yet another one, but they are all basically the same idea.

Partition(A, p, q). What we are looking at, at this step of the recursion, is the subarray A from p to q. And basically we pick a pivot, which we are going to take to be the first element of the array, A[p]. The book, just for your information, uses A[q]; I use A[p]. It doesn't really matter. And then we set an index i to p, and then we have a loop.
This is the code. Basically the structure of it is a for loop with an "if" statement in the middle. And so the structure of this partitioning step looks as follows. We set the pivot to be the first element. Here is p and here is q. This is going to be our invariant for the loop: at any time during the execution of the loop, I have some values up to index i which are already less than or equal to x, then some values ending at j minus 1 that are greater than or equal to x, and I don't know about the rest.

We start out with i equal to p and j equal to p plus 1, so that everything is unknown except for x here. And then the idea is that the loop preserves this invariant. As we go through the loop, it looks at A[j] and asks: is it less than or equal to x? If it is greater than x, it does nothing, because essentially it just goes to the next iteration of the loop, which moves this boundary, and the invariant is satisfied. Does everybody see that? Yeah, OK. But if the next element is less than or equal to x, I have got a problem if I want to maintain the invariant. And so what it does then is it says: oh, let me just move this boundary and swap this element here, which is greater than or equal to x, with this one here that is less than or equal to x, thereby increasing the size of the lower subarray, and then the invariant is satisfied again.

It is a fairly simple algorithm, and it is actually a very tight and easy one. That is one reason that this is such a great piece of code: it is very efficient.
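The board code itself isn't captured in the transcript, so here is a minimal sketch of the loop just described, assuming 0-indexed Python lists rather than the lecture's 1-indexed A[p..q]:

```python
def partition(A, p, q):
    """Partition A[p..q] in place around the pivot x = A[p].

    Loop invariant: A[p+1..i] <= x and A[i+1..j-1] >= x.
    Returns the final index of the pivot.
    """
    x = A[p]                  # pivot is the first element
    i = p
    for j in range(p + 1, q + 1):
        if A[j] <= x:         # A[j] belongs in the lower subarray
            i += 1
            A[i], A[j] = A[j], A[i]
    A[p], A[i] = A[i], A[p]   # put the pivot between the two subarrays
    return i
```

On the array traced later in the lecture, this returns index 3 with the array rearranged to 2, 5, 3, 6, 8, 13, 10, 11, matching the board.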
Now, in principle, the running time for this on n elements is order n, because I am basically just going through the n elements doing a constant amount of work for each, plus a constant amount of work outside the loop. This is a clever piece of code. In fact, in principle partition is easy, right? If I weren't worrying about doing it in place, it is really a pretty easy thing to do: I take an element and just compare every other element with it, throwing one into one bin and one into the other. That is clearly linear time. But often what you find is that just because you can do it that way theoretically doesn't mean that that is going to end up giving you good code. And this is a nice piece of code that allows you to do it in place. That is one reason why this is a particularly good algorithm: the constants are good. So, yes, when we do asymptotic analysis we tend to ignore the constants, but when you're actually building code you care about the constants. First, though, you care about much more than just the constants: whether overall it is going to be a fast algorithm.

Let's go through an example of this, I guess I will do it over here, just so we get the gist. Here is a sample array that I have created out of whole cloth. And here we are going to set x, the pivot, to be 6.
Let's look to see how this algorithm works. So, i starts out here and j starts out here when we initialize. And what we do is start scanning right; essentially that code is scanning right, j keeps incrementing, until it finds something that is less than or equal to the pivot. In this case, it is the number 5. Then it says we will swap these two things. It does that, and we get 6, 5, 13, 10, 8, 3, 2, 11. And meanwhile i gets incremented and j continues where it left off.

And so now we keep scanning right until we get to something that is less than or equal to the pivot. In this case it is 3. We swap 3 and 5 and get 6, 3, etc. And now, at this step we increment i, we start j out here. And in this case, right off the bat, we have something which is less than or equal to x, so we swap these two. I blew it, didn't I? Oops. What did I do? I swapped the wrong thing, didn't I, here? Ah-ha. That is why I am not a computer. Good. We should have swapped this guy, right? Swapped i plus 1, right? This was i. We swap i plus 1, good. So, that's all wrong. Let's swap the right things. Now we have 6, 5, 3, 10, 8, 13, 2, 11. That even corresponds to my notes for some strange reason. This is i and now this is j.

And now when I look, I immediately have something that is less than or equal to the pivot. We swap this and i plus 1, so now we have 6, 5, 3, 2, 8, 13, 10, 11. And we, at that point, increment i to here. And we have j now going here, and j runs to the end. And the loop terminates. When the loop terminates, there is one last swap that we do, which is to put our pivot element in the middle between the two subarrays. Here we swap this one and this one, and that gives us 2, 5, 3, 6, 8, 13, 10, 11. And this is the pivot. And everything over here is less than or equal to the pivot, and everything over here is greater than or equal to the pivot.
OK, so the quicksort routine. Once we have this partition routine, quicksort is a pretty easy piece of code to write. I should have said return i there, right? You have got to return the pivot index: here I have got to return i, because we want to know where the pivot element is. Sorry. I will plug in my code: r gets Partition(A, p, q), and then we quicksort (A, p, r-1) and quicksort (A, r+1, q). And that is it. That's the code. The initial call is quicksort(A, 1, n). Because once we have partitioned, we just have to quicksort the two portions, the left and right portions.

The boundary case is probably worth mentioning for a second. If there are zero or one elements, which is basically what can possibly happen here, then there is nothing to do, because the array is sorted, either because it is an empty array or because it only has one element.
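Putting it together, a minimal sketch of the full routine, again 0-indexed Python, reusing the partition sketch above:

```python
def quicksort(A, p, q):
    """Sort A[p..q] in place by recursive partitioning."""
    if p < q:                     # zero or one elements: nothing to do
        r = partition(A, p, q)    # pivot ends up at index r
        quicksort(A, p, r - 1)    # recursively sort elements <= pivot
        quicksort(A, r + 1, q)    # recursively sort elements >= pivot

# Initial call for an n-element list A:
# quicksort(A, 0, len(A) - 1)
```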
One of the tricks to making quicksort go fast, as one tunes this, is in fact to have a special-purpose sorting routine for small numbers of elements. For example, if you get down to five elements, you use some straight-line piece of code that knows how to sort five elements efficiently, as opposed to continuing to go through the recursion in order to accomplish that. And there are a variety of other things. This is tail-recursive code, and so you can use certain tail-recursion optimizations. And there are a variety of other kinds of optimizations that you can use to make this code go fast. So, yeah, you can tune it up a bit beyond what is there, but the core of it is this efficient partitioning routine.
That is the algorithm. It turns out that looking at how fast it runs is actually a little bit challenging. In the analysis, we are going to assume that all elements are distinct. It turns out that this particular code does not work very well when you have repeated elements, but Hoare's original partitioning routine is actually more efficient in that case, if there are duplicates in what you are sorting. And I encourage you to look at that. It has a much more complicated invariant for the partitioning routine, but it does a similar kind of thing; it's just a bit more complicated. If the elements weren't all distinct, there are things you can do to make them distinct, or you can just use this code. The easiest thing to do is just use Hoare's original code, because that works pretty well when they are nondistinct. But this one is a little bit easier to understand.
Let's let T(n) be the worst-case running time on n elements. And so what is the worst case? What is the worst case going to be for quicksort? That's right: if you always pick the pivot and everything is greater than it, or everything is less than it, you are not going to partition the array very well. And when does that happen? What does the original input look like that makes that happen? If it is already sorted or reverse sorted. So, the worst case is when the input is sorted or reverse sorted. That is actually kind of important to understand, because it turns out the most common thing to sort is something that is already sorted, surprisingly, or things that are nearly sorted. But often it is just sorted, and somebody wants to make sure it is sorted. Well, let's just sort it again rather than checking to see if it is sorted.

In those cases, one side of each partition has no elements. Then we can write out the recurrence for that. We have T(n). If one side has no elements, we are going to have T(0) on that side, and on the other side we are going to have T(n-1). We are just writing out the recursion for this: one side has no elements, the other side has n-1 elements. And then partitioning and all the bookkeeping and so forth is order n. What is T(0)? What is that asymptotically? It's a constant, order 1. So that is just order 1 + T(n-1) + order n. Well, the order 1 can be absorbed into the order n, so this is really just T(n-1) + order n. And what is that equal to? That is order n^2. Why is that order n^2? It is an arithmetic series, actually just like we got for insertion sort.
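Written out, the unrolled recurrence is just that arithmetic series:

$$T(n) = T(n-1) + \Theta(n) = \Theta\!\left(\sum_{k=1}^{n} k\right) = \Theta(n^2).$$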
Going through all that work, we have an algorithm called quicksort, and it is no faster than insertion sort. Nevertheless, I said it was a good algorithm. The reason it is a good algorithm is that its average-case time, as we are going to see, is very good. But let's try to understand this a little bit more, just so we understand the difference between what is going to happen in the average case and what is going to happen in the worst case.

Let's draw a recursion tree for this, for T(n) = T(0) + T(n-1) + cn, where I make the constant explicit as cn, some constant times n, so we get an intuition of what is going on. We have T(n) is equal to, and we write it with the constant part here, cn, and then T(0) here, and then T(n-1) here. Now, I know that all you folks are really fast and want to jump immediately to the full-blown tree. But, let me tell you, my advice is that you spend just a couple of minutes writing it out. Since the tree grows exponentially, it only costs you a constant overhead to write out the small cases and make sure that you have got the pattern that you are developing.
So, I am going to go one more step. Here we have T(0), and now this becomes c(n-1), and now we have another T(0) over here and T(n-2). And we continue that, dot, dot, dot. That is all equal to cn with a T(0) here, c(n-1) with a T(0), c(n-2) with a T(0) here, and that goes all the way down until we end up with T(1) down here.

What is the height of this tree? Yeah, n. Good. Because at every step we are just decrementing the argument by 1, so the height is n. To analyze this, let's first add up everything that is here. Just so we understand where these things are coming from, this is just theta of the summation from k equals 1 to n of ck. And that is equal to order n^2. That is where our arithmetic series is coming from. So, that is Theta(n^2). And then all of these T(0)'s here are Theta(1). And how many of them are there? There are n Theta(1)'s. So, the total amount is T(n) = Theta(n) + Theta(n^2) = Theta(n^2).
Just to see what the structure is in terms of the recursion tree: it is a highly unbalanced recursion tree. Now I am going to do something that I told you you should never do, which is a best-case analysis. This is for intuition only. In general, we don't do best-case analyses. It doesn't mean anything, unless maybe we get some intuition from it. But basically it means nothing mathematically, because it's providing no guarantee. So this is intuition only.

If we are really lucky, what happens for partition? What is going to be the lucky case? Yeah, it splits right in the middle, which is essentially n/2 : n/2. It is really (n-1)/2 : (n-1)/2, but we're not going to worry about the details, because we're only doing intuition here; best case is not what we want. If that happened, what is the recurrence I get? Imagine it split exactly in the middle every time. Then you get T(n) = 2T(n/2) + order n for partitioning and bookkeeping. And what is the solution of that recurrence? That is n log n. That is the same as the mergesort recurrence. It is which case of the master theorem? Case 2, right? Because n to the log base 2 of 2 is n to the 1, they are the same, so we tack on the extra log n. Case 2 of the master theorem.
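As a quick check of that master-theorem step: here $a = 2$ and $b = 2$, so

$$n^{\log_b a} = n^{\log_2 2} = n = \Theta(f(n)) \implies T(n) = \Theta(n \lg n) \quad \text{(case 2)}.$$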
That is pretty good. That says that in the best case quicksort is going to do well. How about, let's suppose the split is always, say, 1/10 : 9/10, that is, n/10 : 9n/10. In that case, are we lucky or are we unlucky? I mean, if the split is really skewed, we clearly are going to be unlucky, right, because then it's, say, 1 to n. If it is really in the middle it is n log n. What do you suppose it is if it is 1/10 : 9/10? Is that lucky or unlucky? We will have a little democracy here. Who thinks that that is a lucky case, that it is going to be a fast running time? And who thinks it is an unlucky case? OK, so we have some brave souls. And who didn't vote? Oh, come on. Come on. It is always better, by the way, to say yes or no and be right or wrong, because then you have some emotional commitment to it and will remember better, rather than just sitting and being quiet. You don't manipulate your own emotions well enough to remember things well. Those people who voted win over the people who don't vote, whether they are right or wrong.
Well, let's take a look. Here is the recurrence: T(n) = T(n/10) + T(9n/10) + Theta(n). And we will assume that the Theta(n) part is less than or equal to some cn in order to analyze it. We will just do a recursion tree for this and see.

Here is the recursion tree. We have T(n) = cn, with T(n/10) and T(9n/10) below. So again we have cn at the top. This gets complicated, right? Over here on the 1/10 side we have (1/10)cn. And then we are plugging into the recursion again, so we now get T(n/100), and over here we get T(9n/100). And over here we have (9/10)cn, which gives us T(9n/100) again, and here we get T(81n/100). And we keep going on. That is equal to cn, with (1/10)cn here. Down this way we have (1/100)cn, and that keeps going down until we get to order 1 down here. And over here we have (9/10)cn. And here, let's see, this is (9/100)cn, and this is (9/100)cn, and this is (81/100)cn. And these things keep going down until they get down to order 1.

But the leaves are not all at uniform depth here, right? This side is way further up than this side, right? Because here we are only going down by a factor of 9/10 each time. So, in fact, what is the length of this path here, down the leftmost spine? Somebody raise their hand. Yeah? Log base 10 of n. Because I am basically cutting down by a factor of 10 each time, and how long does it take me to reduce n to 1? That is the definition, if you will, of log base 10. What is this one, the path going the other way? Log base 10/9 of n. Because we're going down by a factor of 9/10 each time. Once again, essentially the definition of the log. And every other path is somewhere in between log base 10 of n and log base 10/9 of n.
Now what I can do is the trick that we did for mergesort: evaluate this by adding up the total cost of each level. The top level is just cn. What is the cost of the next level? cn. And what is the cost of the next level? cn. At every level we are still doing the same amount of work. And we take that all the way down. Eventually we hit some point where it is no longer equal to cn, where we start getting levels that are less than or equal to cn, because some of the leaves start dropping out. Basically that starts at depth log base 10 of n, and from then on we get levels that are less than or equal to cn, and so forth, until finally we add it all up.

T(n) is going to be less than or equal to cn times, well, what is the longest that this could possibly go? Log base 10/9 of n. Plus we have all of the leaves that we have to add in, but all the leaves together add up to just order n, so we have + Theta(n). And so this is how much? If I add all of this together, what is it asymptotically? That is n log n. So, T(n) is actually bounded by O(n lg n). We are lucky. Those people who guessed lucky were right. A 1/10 : 9/10 split is asymptotically as good as a 50 : 50 split. And, in fact, we can lower bound this by just looking at the first log base 10 of n full levels and discover that T(n) is lower bounded by cn log base 10 of n + order n, and so T(n) is also lower bounded asymptotically by n log n. So, T(n) is actually Theta(n lg n).
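Summarizing the two bounds read off the tree:

$$c\,n \log_{10} n + \Theta(n) \;\le\; T(n) \;\le\; c\,n \log_{10/9} n + \Theta(n),$$

and since $\log_{10} n$ and $\log_{10/9} n$ differ from $\lg n$ only by constant factors, $T(n) = \Theta(n \lg n)$ either way.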
Now, this is not really a proof. I generally recommend that you don't do this kind of thing as a proof. This is good intuition, the recursion tree. The way you prove this is what? Substitution method. Good. What you do is use this to get your guess, and then use the substitution method to prove that your guess is right. It is too easy to make mistakes with this method. It is very easy to make mistakes. With the substitution method it is harder to make mistakes, because there is just algebra there that you are cranking through. It is easier to verify, rather than dot, dot, dots and trees that you drew improperly and wrote in wrong amounts and so forth. OK? So, this is n log n. That's pretty good. It is order n log n, and we are lucky.

Now let's try another one. This is all for intuition because, I will tell you, by the time we get to the end of this class you folks are going to be bolting for the door, because we are going to do some good math today, actually. It is actually fun math, I think, but it is challenging. If you are not awake, you can still sleep now, but I will tell you when to wake up. One more bit of intuition.
Suppose that we alternate steps. Suppose we do the partitioning thing, and it happens that we start out lucky, then we have a partitioning step that is unlucky, then a step that is lucky and a step that is unlucky, and we do that all the way down the tree. Suppose we alternate. Are we lucky or unlucky if we do that? This time I want everybody voting. It doesn't matter what your answer is. Everybody has to have a stake in the game. It is sort of like horseracing. If ever you have watched horseracing, it is really boring, but if you put a little bit of money down, a little skin in the game, suddenly it is interesting. The same thing here. I want everybody to put some skin in the game. Who thinks that this is going to be lucky? Who thinks it is going to be unlucky? OK. Who didn't vote? [LAUGHTER] You guys. No skin in the game, ha?

Let's analyze this so we can once again write a recurrence. For the lucky step, we will let L(n) be the running time on a lucky step of size n. The next step is going to be unlucky, so that is two unluckies on half the size, plus order n: L(n) = 2U(n/2) + Theta(n). That is our lucky step. And then for the unlucky step, it is essentially going to be lucky on the next step, plus order n: U(n) = L(n-1) + Theta(n). That is unlucky. See how I have described this behavior with a system of recurrences that are mutually dependent, where the boundary cases, once again unstated, are that the recurrences have a constant solution with constant input.

Now we just do a little bit of algebra using substitution. L(n) is then equal to, well, I can just plug in the value of U(n/2). And that gives me 2[L(n/2 - 1) + Theta(n/2)] + Theta(n). See what I did here? I simply plugged this recurrence in for U(n/2). In fact, technically I should say Theta(n/2) inside the brackets, just to make the substitution more straightforward. It is the same thing, but just to not skip a step. That we can now crank through. And that is 2L(n/2 - 1) plus, well, now I have two Theta(n/2)'s plus another Theta(n), so all of that is just order n. And what is the solution to that recurrence? n log n. Theta(n lg n). Does everybody see that? OK? Theta(n lg n). This is basically just, once again, the master theorem with a little bit of jiggering here. That minus one is only going to help us, actually, in the solution by the master theorem. So, it is order n lg n. We are lucky. If we alternate lucky and unlucky, we are lucky.
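The system of recurrences and the substitution step, collected in one place:

$$L(n) = 2\,U(n/2) + \Theta(n), \qquad U(n) = L(n-1) + \Theta(n),$$

$$L(n) = 2\bigl[L(n/2 - 1) + \Theta(n/2)\bigr] + \Theta(n) = 2\,L(n/2 - 1) + \Theta(n) = \Theta(n \lg n).$$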
How can we ensure that we are usually lucky? If I have the input already sorted, I am going to be unlucky. Excuse me? You could randomly arrange the elements; that is one way. What is another way? That is a perfectly good way, actually. In fact, it is a common thing to do. Randomly choose the pivot, OK. It turns out those are effectively equivalent, but we are going to randomly choose the pivot, because it is a little bit easier to analyze. But they are effectively equivalent.

That gives us the algorithm called randomized quicksort. And the nice thing about randomized quicksort is that the running time is independent of the input ordering. Very much for the same reason that if I just scrambled the input, the running time would be independent of the input ordering. If I randomly scramble the input, then it doesn't matter what the order of the input was. Whereas the original quicksort has some slow cases, input sorted or reverse sorted, and some fast cases; in particular, it turns out that if the input is random it is going to be pretty fast. If I actually randomly scramble the input, or pivot on a random element, it doesn't matter what the input was.
One way of thinking about this is with an adversary. Imagine your adversary: you are saying I have a good sorting algorithm, and he says I have a good sorting algorithm, and you're both trying to sell to a single customer. And the customer says, OK, you guys come up with benchmarks for each of your algorithms. And you get to look at his algorithm. Well, you look and you say: oh, he is using quicksort. I will just give him something that is already sorted. That is what you could do to him. And if you had quicksort, he would do the same thing to you. So, how can you defeat him? Well, one way is with randomization. Big idea in computer science: use random numbers. The idea here is that if I permute the ordering at random, as one suggestion, or I pivot at random places, then the input ordering doesn't matter. And so there is no bad ordering that he can provide that is going to make my code run slowly. Now, I might get unlucky. But that is just unlucky in my use of my random-number generator. It is not unlucky with respect to what the input was. What the input was doesn't matter. Everybody follow that? OK.
The nice thing about randomized quicksort is that it makes no assumptions about the input distribution. You don't have to assume that all inputs are equally likely, because either you can make them that way by scrambling, or you can pivot in a way that has the same effect. And, in particular, there is no specific input that can elicit the worst-case behavior. The worst case is determined only by a random-number generator. And, therefore, since it is only determined by a random-number generator, we can essentially bound the unluckiness mathematically. We can say: what are the odds?

So, we are going to analyze this. And this is where you know if you belong in this course or not. If you have skipped 6.042 or whatever, this is a good place to do the comparison. Since it is going to be a little bit involved, why don't people just stand up for a moment and take a stretch break. Since this is going to be a nice piece of mathematics we are going to do, you are going to want to feel fresh for it.

Stretch break is over.
Analysis. Good. I think we are going to make it. I am sort of racing; there is a lot of stuff to cover today. Good. Let's let T(n) now be the random variable for the running time, assuming... Wow, I didn't even write here what we are doing. We are going to pivot on a random element. That is the basic scheme. And the way I do that, by the way, is just in the code for partition: rather than partitioning on the first element, before I do the partition, I just swap the first element with some other element in the array chosen at random, perhaps itself. So, all elements are equally likely to be pivoted on. And then just run the ordinary partition.
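In code, the only change to the earlier partition sketch is one swap up front. A minimal sketch; the helper name randomized_partition is mine, not the lecture's:

```python
import random

def randomized_partition(A, p, q):
    """Swap A[p] with a uniformly random element of A[p..q]
    (possibly itself), then run the ordinary partition."""
    k = random.randint(p, q)    # each element equally likely to become the pivot
    A[p], A[k] = A[k], A[p]
    return partition(A, p, q)
```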
So T(n) is the random variable for the running time, assuming, as we must in order to do probability, that the random numbers are independent. So that when I pivot in one place, it is independent of how I pivoted in some other place as I am running this algorithm.

Then, to analyze this, what I am going to do is look at where we pivoted. For k = 0, 1, ..., n-1, let's let, for a particular partition, the random variable X_k = 1 if partition generates a k : n-k-1 split, and 0 otherwise. In the partition routine, I am picking a random element to pivot on. And X_k is going to be my random variable that is 1 if partition generates a split with k elements on the left side and n-k-1 elements on the right side of the pivot. The two sides sum to n-1, of course, because I also have the pivot itself. And X_k is 0 otherwise. So, I now have n random variables associated with a single partition, where all of them are going to be zero except one of them; whichever split happens to occur gets the value 1. What is the name of this type of random variable, by the way?
Bernoulli. Well, Bernoulli has other assumptions. It is an indicator random variable. It turns out it is Bernoulli, but that's OK. It is an indicator random variable: it just takes on the values 0 and 1. Bernoulli random variables are a particular type of indicator random variable, which it turns out these are. Indicator random variables are a good tool when you are trying to understand what the sum of a bunch of things is. They are a good way to break apart your big random variables into smaller ones that can be analyzed. Let's just take a look at this indicator random variable. What is the expectation of X_k equal to?
In other words, what is the probability that I generate a k : n-k-1 split? Let's just write out what that expectation means, to refresh people's memory. It is 0 times the probability that X_k equals 0, plus 1 times the probability that X_k equals 1. The first term is zero, so this is just equal to the probability that X_k equals 1. And that is a general property of indicator random variables: their expectation is the probability that they are 1. The nice thing about indicator random variables is that this directly connects the probability to the expectation, without any other terms going on. So what is the probability that X_k equals 1? 1/n. All splits are equally likely: I have n elements, so each one has a 1/n chance of being picked as the pivot. And, once you pick the pivot, that determines what is on the left and the right and so forth. So, it is 1/n.
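In symbols:

$$E[X_k] = 0 \cdot \Pr\{X_k = 0\} + 1 \cdot \Pr\{X_k = 1\} = \Pr\{X_k = 1\} = \frac{1}{n}.$$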
Everybody with me so far? More or less? OK. As I say, this is going to test whether you're in the class. If you go home and you study this and you cannot get it, and you have a deficiency in your math background in trying to take the course, this is a good indication that probably you have taken something a little over your head.
Let's write out what T(n) is equal to here. T(n) is equal to T(0) + T(n-1) + Theta(n) if we get a 0 : n-1 split; it is equal to T(1) + T(n-2) + Theta(n) if we have a 1 : n-2 split; and so on down to T(n-1) + T(0) + Theta(n) if we end up with an n-1 : 0 split. So, this is our recurrence for T(n). And, unfortunately, the recurrence is kind of hairy, because it has got n cases. And this is, once again, where the brilliance of being able to use indicator random variables comes in. Because we will be able to take this case analysis and reduce it to mathematics, so we don't have cases, using indicator random variables. And the way we do that is the following trick of converting the cases into a summation.
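The board expression isn't captured in the transcript; reconstructed from the expectation taken below, the conversion is:

$$T(n) = \sum_{k=0}^{n-1} X_k \bigl( T(k) + T(n-k-1) + \Theta(n) \bigr).$$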
Let's just take a look at why these two things are the same. The indicator random variable is zero, except if you get the particular split. Therefore, every term of this summation is zero except for the k that actually occurred, in which case the term is the value we said it is. See the trick, using multiplication by a 0-1 variable to handle all the cases? I think that is damn clever. I think that is damn clever. And this is like the classic thing that you do with indicator random variables. It's one of the reasons they are a very powerful method. Because now we actually have a mathematical expression, hairy though it may be, for our recurrence.

Now, what we are going to analyze is the expected value of T(n). That is what we want to do. What is the expected value of T(n)? To do that, I just write that the expected value of T(n) is equal to the expected value of this big summation. And now we can go ahead and start to evaluate the expected value of that summation. Everybody with me?
Yes? Any questions at this point? I see a thumbs up. That's nice to see. But I generally believe that what I want to see is no thumbs down. It is good to see the thumbs up, but that means one person understands, or thinks he understands. [LAUGHTER]

So, this is, I claim, equal to the following. Actually, I am going to need a little space here, so I am going to move the equal sign over a little bit. I claim that this expectation is equal to that summation of expectations. Why is that? What are the magic words that justify this step? Linearity of expectation. The expectation of a sum is the sum of the expectations. So, that is linearity of expectation. I don't need independence for that; that is just always true for the expectation of any random variables. The sum of the expectations is the expectation of the sum, and vice versa; here we did the vice versa. That is equal now to the sum from k=0 to n-1 of E[X_k] times E[T(k) + T(n-k-1) + Theta(n)].
Why is that true? What I have done is say that the expectation of the product is the product of the expectations. That is because of independence. What is independent of what? The X_k here, the random variables for this partition, are independent of the other partitionings: of, if you will, the X_k's that would exist for any of the other recursive calls. So, whatever happens in here is independent of what happened there. We are actually hiding something. Since we have a recurrence, we are not doing the same partition each time; we have a different one each time. There is something going on underneath the mathematics that you have to pay attention to, which the mathematics alone isn't really showing: in T(k) there is actually a set of random choices being made, if you will. And you have to understand that those are independent of these, in which case we can multiply the expectations. Is everybody with me? That is the big one: independence of X_k from the other random choices.
That is equal to, now, well, first of all, this is nice. What is the expectation of X_k? 1/n. That doesn't even belong inside the summation; we will just pop it outside. I get 1/n times the sum from k=0 to n-1 of E[T(k)], plus 1/n times the sum from k=0 to n-1 of E[T(n-k-1)], plus 1/n times the sum from k=0 to n-1 of Theta(n). That is, again, using linearity of expectation, this time to split up these pieces, and factoring out the expectation of X_k as 1/n. Everybody with me still? All of this is elementary. It is just one of these things that is hard because there are so many steps, and it takes having seen some of this before.

Now, the next observation is that these two summations are, in fact, identical. They are the same summation, just in a different order. This one is going T(0), T(1), T(2), T(3), up to T(n-1). This one is going T(n-1), T(n-2), T(n-3), down to T(0). These are, in fact, equal. So, therefore, I have two of them. And then what is the last term equal to? What is that one equal to? Theta(n). Let's just see why. The summation from k=0 to n-1 of Theta(n) is Theta(n^2), and dividing by n gives Theta(n). Or, if I want to bring the Theta(n) out front, I have Theta(n) times 1/n times a summation of n ones, which again gives Theta(n). Either way of doing it; because the summations have identical terms, this is just algebra.

Now what we are going to do is something for technical convenience: we are going to absorb the k=0 and k=1 terms into the Theta(n).
We have a recurrence here where I have an order n term. And, if I look at the cases where k=0 or k=1, I know what the expectation is. For 0 and 1, the expected cost is the worst-case cost, which is constant, because I am only solving the problem for a constant size. And we know that for any of the boundary cases, our assumption in solving recurrences is that the cost is constant. So, I basically can just take those two terms out, and all that does is accumulate some more constant here in the Theta(n). It is going to make the solution of the recurrence a little bit easier. And, if I do that, I get E[T(n)] = 2/n times the summation from k=2 to n-1 of E[T(k)], plus Theta(n).
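For reference, the derivation just walked through, compressed into display math:

$$E[T(n)] = \sum_{k=0}^{n-1} E[X_k]\,E\bigl[T(k) + T(n-k-1) + \Theta(n)\bigr] = \frac{2}{n}\sum_{k=0}^{n-1} E[T(k)] + \Theta(n) = \frac{2}{n}\sum_{k=2}^{n-1} E[T(k)] + \Theta(n),$$

where the last step absorbs the k = 0 and k = 1 terms into the Θ(n).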
So, all of that work was to derive the recurrence, and now we have to solve it. Just to review what we did: we started out with a recurrence for the random variable which involved a case statement. We converted that into mathematics without the case statement, just with a product, and then we derived a recurrence for the expectation. And now we are in the process of trying to solve that recurrence. We have done some simplification of the recurrence so that we understand what it is we are going to solve. By the way, I don't give things like this on quizzes. I do expect you to understand it, and the elements of this you will find on a quiz. This is a lot of work to figure out. This took smart people to do. Even though it is all elementary, working out something like this at the elementary level is still a bit of work, even for somebody who is knowledgeable in this area.

Now we are going to solve that last recurrence over there, and we are going to prove that the expectation of T(n) is less than or equal to an lg n for some constant a > 0. That is what we are going to do. And so what technique do you think we should use to prove this? Does this look like the master method?
It is nothing like the master method. So, when in doubt, do substitution. It is the granddaddy of all methods. What we will do is handle the base case by simply choosing a big enough that an lg n is bigger than the expectation of T(n) for small n. So, I just pick a to be big enough. And this is, by the way, why I wanted to exclude 0 and 1 from the recurrence. Because, for example, when n=0, lg 0 is like dividing by 0: you cannot do it. And lg 1 is 0. So even if I restricted it to n=1, there I would have an lg n equal to 0, and I can't ever pick a big enough to dominate those cases. What I do is say: look, I just absorb whatever the cost is into the Theta(n), for technical convenience. And that lets me pick an lg n big enough to handle the base case. So, that is why we made that technical assumption.

We are going to use a fact, which is that the summation from k=2 to n-1 of k lg k is less than or equal to (1/2)n^2 lg n - (1/8)n^2. I am going to leave that as an exercise for you to work out; I think it is an exercise in the book, too. I want you to go and evaluate this. There are two ways to evaluate it. One is by using purely summations and facts about summations, splitting the summation into two pieces and reconstituting it to prove this bound. The other way is to use the integral method for solving summations. Either way you can prove it; the integral method actually gets you a tighter bound than this. This is a basic fact, and you should go off and know how to do it.
Now let's do substitution. The expectation of T(n) is less than or equal to 2/n times the summation from k=2 to n-1 of ak lg k, substituting for the smaller values, plus Theta(n). I might mention, by the way, the hard part of doing this: it is easy to get the bound without the second-order term, it is easy to get the (1/2)n^2 lg n part; it is harder to get the second-order term. And it turns out you need the second-order term in order to do what we are going to do. You have to be able to subtract a quadratic amount off the n^2 lg n in order to make this proof work. That is the trickier part in evaluating that summation.

So, we get this. And that is less than or equal to? Well, I happen to know how much this is by using that fact: I get 2a/n times (1/2 n^2 lg n - 1/8 n^2), plus Theta(n). Did I do something wrong? There we go. Very good. That is equal to: if I multiply the first part through, that is an lg n. And now, so I don't make a mistake, I want to express this as my desired bound, this is what I want it to be, minus a residual. And the way to write that is: minus the term (an/4 - Theta(n)). And that is going to be less than or equal to an lg n if this residual part is nonnegative. And I can make that part nonnegative by picking a big enough that an/4 dominates the constant in the Theta(n) here. Whatever the constant is here, I can find an a that is big enough so that an/4 makes this part positive. So: a big enough that an/4 dominates Theta(n), and we are done.
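The same substitution argument, written as one chain:

$$E[T(n)] \le \frac{2}{n}\sum_{k=2}^{n-1} a\,k \lg k + \Theta(n) \le \frac{2a}{n}\left(\frac{1}{2}n^2 \lg n - \frac{1}{8}n^2\right) + \Theta(n) = a\,n \lg n - \left(\frac{a\,n}{4} - \Theta(n)\right) \le a\,n \lg n,$$

provided a is chosen large enough that an/4 dominates the constant hidden in the Θ(n).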
And so the running time of randomized quicksort is order n lg n. That is what we just proved: the expected running time is order n lg n.

Now, in practice, quicksort is a great algorithm. It is typically three or more times faster than mergesort. It doesn't necessarily give you the strong worst-case n lg n guarantee of mergesort, but in practice, if you use randomized quicksort, it is generally as much as three times faster. It does require code tuning in order to get it up to be that fast. You do have to go and coarsen the base cases and do some other tricks there, but most good sorting algorithms that you will find are based on quicksort. Also, one of the other reasons it works well is that it tends to work well with caches and virtual memory. We are not really talking much about caching models and so forth, a big topic these days in algorithms, but it does work very well with caches and virtual memory. It is another reason that this is a good algorithm to use.

Good recitation, by the way, on Friday. We are going to see another n log n time algorithm, a very important one, in recitation on Friday.
Lecture 04: Quicksort, Randomized Algorithms
View the complete course at: http://ocw.mit.edu/6-046JF05
License: Creative Commons BY-NC-SA
More information at http://ocw.mit.edu/terms
More courses at http://ocw.mit.edu