
Lec 4 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005

  • 0:08 - 0:08
    OK.
    Today we are going to talk
  • 0:11 - 0:11
    about a very interesting
    algorithm called Quicksort --
  • 0:23 - 0:23
    -- which was invented by Tony
    Hoare in 1962.
  • 0:30 - 0:30
    And it has ended up being a
    really interesting algorithm
  • 0:36 - 0:36
    from many points of view.
    And because of that,
  • 0:41 - 0:41
    it turns out today's lecture is
    going to be both hard and fast.
  • 0:48 - 0:48
    If you see the person next to
    you sleeping,
  • 0:52 - 0:52
    you will want to say let's get
    going.
  • 0:56 - 0:56
    It's a divide-and-conquer
    algorithm.
  • 1:06 - 1:06
    And it sorts,
    as they say,
  • 1:08 - 1:08
    in place, meaning that it just
    rearranges the elements where
  • 1:14 - 1:14
    they are.
    That is like insertion sort
  • 1:18 - 1:18
    rearranges elements where they
    are.
  • 1:21 - 1:21
    Mergesort does not.
    Mergesort requires extra
  • 1:25 - 1:25
    storage in order to do the merge
    operation.
  • 1:30 - 1:30
    To merge in linear time it needs extra
    storage; it doesn't merge in place
  • 1:35 - 1:35
    in linear time.
    It doesn't do it just by
  • 1:38 - 1:38
    rearranging.
    It is nice because it is in
  • 1:41 - 1:41
    place, so that means that it is
    fairly efficient in its use of
  • 1:46 - 1:46
    storage.
    And it also happens to be very
  • 1:49 - 1:49
    practical if you tune it a bit.
    The basic algorithm turns out,
  • 1:54 - 1:54
    if you just implement that,
    it's not necessarily that
  • 1:58 - 1:58
    efficient.
    But if what you do was then do
  • 2:03 - 2:03
    the standard kinds of things you
    do to goose up the runtime of
  • 2:08 - 2:08
    something, and we'll talk a
    little about what those things
  • 2:13 - 2:13
    are, then it can be very,
    very practical.
  • 2:16 - 2:16
    So, it uses divide-and-conquer
    paradigm.
  • 2:33 - 2:33
    First step is divide.
    And to do this basically it
  • 2:41 - 2:41
    does it by partitioning.
    So, it partitions the input
  • 2:49 - 2:49
    array into two subarrays around
    an element we call the pivot --
  • 3:04 - 3:04
    -- such that elements in the
    lower subarray are less than or
  • 3:15 - 3:15
    equal to x, which is less than or
    equal to elements in the upper
  • 3:26 - 3:26
    subarray.
    If we draw a picture of the
  • 3:31 - 3:31
    input array, this partition step
    basically takes some element x
  • 3:39 - 3:39
    and everything over here is less
    than or equal to x after the
  • 3:46 - 3:46
    partition step and everything
    over here is greater than or
  • 3:53 - 3:53
    equal to x.
    And so now the conquer step is
  • 3:57 - 3:57
    pretty easy.
    You just recursively sort the
  • 4:05 - 4:05
    two subarrays.
    So, I recursively sort the
  • 4:11 - 4:11
    elements less than or equal to
    x, I recursively sort the
  • 4:19 - 4:19
    elements greater than or equal
    to x.
  • 4:25 - 4:25
    And then combine is then just
    trivial.
  • 4:32 - 4:32
    Because once I have sorted the
    things less than or equal to x,
  • 4:37 - 4:37
    then sorted the things greater
    than or equal to x,
  • 4:40 - 4:40
    the whole thing is sorted.
    So, there is nothing to do
  • 4:44 - 4:44
    really for the combine.
    The key step in quicksort is
  • 4:48 - 4:48
    this partition step.
    That is the thing that does all
  • 4:52 - 4:52
    of the work.
    And so you can view quicksort
  • 4:55 - 4:55
    as just recursive
    partitioning.
  • 4:58 - 4:58
    That's all it is.
    Just as mergesort was recursive
  • 5:06 - 5:06
    merging, quicksort sort of goes
    the other way around and does
  • 5:19 - 5:19
    recursive partitioning.
    The key is the linear time,
  • 5:29 - 5:29
    by which I mean theta n,
    partitioning subroutine.
  • 5:40 - 5:40
    And here is some pseudocode
    for it.
  • 5:43 - 5:43
    This is actually slightly
    different from the book.
  • 5:48 - 5:48
    The book has one.
    In fact, there is a nice
  • 5:51 - 5:51
    problem in the book that has
    even a different one,
  • 5:56 - 5:56
    but they are all basically the
    same idea.
  • 6:00 - 6:00
    Partition (A,
    p, q).
  • 6:02 - 6:02
    And what we are looking at,
    at this step of the recursion,
  • 6:07 - 6:07
    is the subarray A from p to q.
    And basically we pick a pivot,
  • 6:12 - 6:12
    which we are going to just
    pick as the first element of the
  • 6:17 - 6:17
    array A of p.
  • 6:26 - 6:26
    And the book,
    just for your information,
  • 6:30 - 6:30
    uses A of q.
    I use A of p.
  • 6:32 - 6:32
    It doesn't really matter.
    And then we set an index to p
  • 6:37 - 6:37
    and then we have a loop.
  • 7:35 - 7:35
    This is the code.
    Basically the structure of it
  • 7:41 - 7:41
    is a for loop with an "if"
    statement in the middle.
  • 7:47 - 7:47
    And so the structure of the
    algorithm of this partitioning
  • 7:54 - 7:54
    step looks as follows.
    We set the pivot to be the
  • 8:01 - 8:01
    first element.
    Here is p and here is q.
  • 8:05 - 8:05
    This is going to be our
    invariant for the loop.
  • 8:10 - 8:10
    And, at any time during the
    execution of a loop,
  • 8:15 - 8:15
    I essentially have some values
    up to i which are already less
  • 8:22 - 8:22
    than or equal to x and then some
    values that end at j minus 1
  • 8:29 - 8:29
    that are greater than or equal
    to x.
  • 8:34 - 8:34
    And then I don't know about the
    rest.
  • 8:38 - 8:38
    And so we start out with i
    equal to p and j equal to p plus
  • 8:45 - 8:45
    1.
    It starts out at p plus 1 so
  • 8:48 - 8:48
    that everything is unknown
    except for x here.
  • 8:53 - 8:53
    And then the idea is that it is
    going to preserve this
  • 8:59 - 8:59
    invariant.
    And the way it does it is,
  • 9:03 - 9:03
    as we go through the loop,
    it looks at a of j and says is
  • 9:07 - 9:07
    it greater than or equal to x?
    Sorry, is it less than or equal
  • 9:11 - 9:11
    to x?
    If it is greater than or equal
  • 9:13 - 9:13
    to x it does nothing,
    because what can happen?
  • 9:16 - 9:16
    If this is greater than or
    equal to x, essentially it just
  • 9:20 - 9:20
    goes to the next iterational
    loop which moves this boundary
  • 9:24 - 9:24
    and the invariant is satisfied.
    Does everybody see that?
  • 9:27 - 9:27
    Yeah, OK.
    But if it is less than or equal
  • 9:31 - 9:31
    to x, I have got a problem if I
    want to maintain the invariant
  • 9:36 - 9:36
    if this next element is less
    than or equal to x.
  • 9:39 - 9:39
    And so what it does then is it
    says oh, let me just move this
  • 9:43 - 9:43
    boundary and swap this element
    here, which is greater than or
  • 9:47 - 9:47
    equal to x, with this one here
    that is less than or equal to x,
  • 9:52 - 9:52
    thereby increasing the size of
    this subarray and then the
  • 9:56 - 9:56
    invariant is satisfied again.
    It is a fairly simple
  • 9:59 - 9:59
    algorithm.
    And it is actually a very tight
  • 10:04 - 10:04
    and easy algorithm.
    That is one reason that this is
  • 10:09 - 10:09
    such a great piece of code
    because it is very efficient.
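    Here is a sketch of that partitioning routine in 0-indexed Python (the
    lecture's board code is 1-indexed pseudocode; the names mirror the board,
    and partitioning A[p..q] pivots on the first element A[p]):

        def partition(A, p, q):
            x = A[p]                      # the pivot
            i = p
            for j in range(p + 1, q + 1):
                # Invariant: A[p+1..i] <= x and A[i+1..j-1] >= x.
                if A[j] <= x:
                    i += 1
                    A[i], A[j] = A[j], A[i]
            # Put the pivot between the two subarrays and report its index.
            A[p], A[i] = A[i], A[p]
            return i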
  • 10:14 - 10:14
    Now, in principle,
    the running time for this on n
  • 10:18 - 10:18
    elements is order n.
  • 10:28 - 10:28
    Because I am basically just
    going through the n elements and
  • 10:32 - 10:32
    just doing a constant amount of
    work and then just a constant
  • 10:36 - 10:36
    amount of work outside.
    This is a clever piece of code.
  • 10:39 - 10:39
    In fact, in principle partition
    is easy, right?
  • 10:42 - 10:42
    If I weren't worrying about
    doing it in place,
  • 10:45 - 10:45
    it is really a pretty easy
    thing to do.
  • 10:47 - 10:47
    I take an element and just
    compare every other element with
  • 10:51 - 10:51
    it.
    I throw one into one bin and
  • 10:53 - 10:53
    one into the other.
    That is clearly linear time.
  • 10:57 - 10:57
    But often what you find is that
    just because you can do it that
  • 11:01 - 11:01
    way theoretically doesn't mean
    that that is going to end up
  • 11:05 - 11:05
    giving you good code.
    And this is a nice piece of
  • 11:08 - 11:08
    code that allows you to do it in
    place.
  • 11:11 - 11:11
    And that is one reason why this
    is a particularly good
  • 11:14 - 11:14
    algorithm, because the constants
    are good.
  • 11:17 - 11:17
    So, yes, when we do asymptotic
    analysis we tend to ignore the
  • 11:21 - 11:21
    constants, but when you're
    actually building code you care
  • 11:25 - 11:25
    about the constants.
    But first, you care about much more
  • 11:30 - 11:30
    than just the constants:
    whether overall it is going
  • 11:37 - 11:37
    to be a fast algorithm.
    Let's go through an example of
  • 11:44 - 11:44
    this, I guess I will do it over
    here, just so we get the gist.
  • 11:51 - 11:51
    Here is a sample array that I
    have created out of whole cloth.
  • 11:58 - 11:58
    And here we are going to set x,
    the pivot, to be 6.
  • 12:05 - 12:05
    Let's look to see how this
    algorithm works.
  • 12:08 - 12:08
    So, i starts out here and j
    starts out here if we
  • 12:12 - 12:12
    initialize.
    And what we do is start
  • 12:15 - 12:15
    scanning right,
    essentially that code is
  • 12:19 - 12:19
    scanning right until it gets
    something which is less than or
  • 12:23 - 12:23
    equal to the pivot.
    It keeps going here until it
  • 12:27 - 12:27
    finds, j keeps incrementing
    until it finds something that is
  • 12:32 - 12:32
    less than or equal to the pivot.
    And, in that case,
  • 12:38 - 12:38
    it is the number 5.
    Then it says we will swap these
  • 12:44 - 12:44
    two things.
    And it does that and we get 6,
  • 12:50 - 12:50
    5, 13, 10, 8,
    3, 2, 11.
  • 12:52 - 12:52
    And meanwhile now i gets
    incremented and j continues
  • 12:59 - 12:59
    where it left off.
    And so now we keep scanning
  • 13:05 - 13:05
    right until we get to something
    that is less than or equal to
  • 13:13 - 13:13
    the pivot.
    In this case it is 3.
  • 13:17 - 13:17
    We swap 3 and 5 and get 6,
    3, etc.
  • 13:21 - 13:21
    And now, at this step we
    increment i, we start j out
  • 13:27 - 13:27
    here.
    And in this case,
  • 13:31 - 13:31
    right off the bat,
    we have something which is less
  • 13:36 - 13:36
    than or equal to x,
    so we swap these two.
  • 13:40 - 13:40
    I blew it, didn't I?
    Oops.
  • 13:42 - 13:42
    What did I do?
    I swapped the wrong thing,
  • 13:46 - 13:46
    didn't I, here?
    Ah-ha.
  • 13:48 - 13:48
    That is why I am not a
    computer.
  • 13:52 - 13:52
    Good.
    We should have swapped this
  • 13:55 - 13:55
    guy, right?
    Swapped i plus 1,
  • 13:58 - 13:58
    right?
    This was i.
  • 14:01 - 14:01
    We swap i plus 1,
    good.
  • 14:04 - 14:04
    So, that's all wrong.
    Let's swap the right things.
  • 14:09 - 14:09
    Now we have 6,
    5, 3, 10, 8,
  • 14:12 - 14:12
    13, 2, 11.
    That even corresponds to my
  • 14:16 - 14:16
    notes for some strange reason.
    This is i and now this is j.
  • 14:23 - 14:23
    And now when I look,
    I immediately have something
  • 14:28 - 14:28
    that is less than or equal to
    the pivot.
  • 14:34 - 14:34
    We swap this and i plus 1,
    so now we have 6,
  • 14:42 - 14:42
    5, 3, 2, 8, 13,
    10, 11.
  • 14:45 - 14:45
    And we, at that point,
    increment i to here.
  • 14:53 - 14:53
    And we have j now going here
    and j runs to the end.
  • 15:03 - 15:03
    And the loop terminates.
    When the loop terminates there
  • 15:09 - 15:09
    is one last swap that we do,
    which is to put our pivot
  • 15:14 - 15:14
    element in the middle between
    the two subarrays.
  • 15:19 - 15:19
    Here we swap this one and this
    one, and so that gives us then
  • 15:25 - 15:25
    2, 5, 3, 6, 8,
    13, 10, 11.
  • 15:28 - 15:28
    And this is the pivot.
    And everything over here is
  • 15:34 - 15:34
    less than or equal to the pivot.
    And everything over here is
  • 15:40 - 15:40
    greater than or equal to the
    pivot.
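    As a check, running the partition sketch above on this sample array
    reproduces the trace (0-indexed, so the pivot lands at index 3 rather
    than position 4):

        A = [6, 10, 13, 5, 8, 3, 2, 11]
        r = partition(A, 0, len(A) - 1)
        print(A, r)   # prints [2, 5, 3, 6, 8, 13, 10, 11] 3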
  • 15:56 - 15:56
    OK, so the quicksort routine.
    Once we have this partition
  • 16:00 - 16:00
    routine, quicksort is a pretty
    easy piece of code to write.
  • 16:18 - 16:18
    I should have said return here
    i, right?
  • 16:21 - 16:21
    You have got to return with the
    pivot.
  • 16:25 - 16:25
    Here I have got to return i
    because we want to know where
  • 16:30 - 16:30
    the pivot element is.
    Sorry.
  • 16:34 - 16:34
    I will plug in my code.
    r gets partition of (A,
  • 16:47 - 16:47
    p, q) and then we quicksort (A,
    p, r-1) and quicksort of (A,
  • 17:03 - 17:03
    r+1, q).
    And that is it.
  • 17:10 - 17:10
    That's the code.
    The initial call is quicksort
  • 17:17 - 17:17
    of (A, 1, n).
    Because once we partitioned,
  • 17:25 - 17:25
    we just have to quicksort the
    two portions,
  • 17:32 - 17:32
    the left and right portions.
    Just the boundary case is
  • 17:39 - 17:39
    probably worth mentioning for a
    second.
  • 17:41 - 17:41
    If there are zero or one
    elements, that is basically what
  • 17:44 - 17:44
    can possibly happen here,
    is that I get zero or one
  • 17:47 - 17:47
    elements here.
    Then the point is there is
  • 17:49 - 17:49
    nothing to do because the array
    is sorted, either because it is
  • 17:53 - 17:53
    an empty array or because it
    only has one element.
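    A minimal sketch of the quicksort routine on the board, again 0-indexed
    (so the initial call becomes quicksort(A, 0, n - 1) rather than
    quicksort(A, 1, n)):

        def quicksort(A, p, q):
            if p >= q:                   # zero or one element: already sorted
                return
            r = partition(A, p, q)       # pivot ends up at index r
            quicksort(A, p, r - 1)       # sort the lower subarray
            quicksort(A, r + 1, q)       # sort the upper subarray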
  • 17:56 - 17:56
    One of the tricks to making
    quicksort go fast,
  • 17:59 - 17:59
    as one tunes this,
    is to, in fact,
  • 18:01 - 18:01
    look at having a special
    purpose sorting routine for
  • 18:05 - 18:05
    small numbers of elements.
    For example,
  • 18:07 - 18:07
    if you get down to five
    elements having some straight
  • 18:11 - 18:11
    line piece of code that knows
    how to sort five elements
  • 18:14 - 18:14
    efficiently, as opposed to
    continuing to go through
  • 18:18 - 18:18
    recursion in order to accomplish
    that.
  • 18:20 - 18:20
    And there are a variety of
    other things.
  • 18:23 - 18:23
    This is tail-recursive code,
    and so you can use certain tail
  • 18:27 - 18:27
    recursion optimizations.
    And there are a variety of
  • 18:32 - 18:32
    other kinds of optimizations
    that you can use to make this
  • 18:35 - 18:35
    code go fast.
    So, yeah, you can tune it up a
  • 18:38 - 18:38
    bit beyond what is there,
    but the core of it is this
  • 18:41 - 18:41
    efficient partitioning routine.
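    Purely as an illustration of those two tuning ideas (this is not the
    lecture's code, and the cutoff of 5 is an arbitrary choice): recurse on
    one side and loop on the other to exploit the tail-recursive position,
    and finish small subarrays with a simple insertion sort standing in for
    straight-line code.

        CUTOFF = 5   # arbitrary small-subarray threshold

        def tuned_quicksort(A, p, q):
            while q - p + 1 > CUTOFF:
                r = partition(A, p, q)
                tuned_quicksort(A, p, r - 1)   # recurse on the lower side
                p = r + 1                      # loop on the upper side (tail-call elimination)
            # Insertion sort for the remaining small subarray A[p..q].
            for j in range(p + 1, q + 1):
                key = A[j]
                i = j - 1
                while i >= p and A[i] > key:
                    A[i + 1] = A[i]
                    i -= 1
                A[i + 1] = key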
  • 18:53 - 18:53
    That is the algorithm.
    It turns out that looking at
  • 18:59 - 18:59
    how fast it runs is actually a
    little bit challenging.
  • 19:05 - 19:05
    In the analysis,
    we are going to assume that all
  • 19:10 - 19:10
    elements are distinct.
    It turns out that this
  • 19:14 - 19:14
    particular code does not work
    very well when you have repeated
  • 19:20 - 19:20
    elements, but Hoare's original
    partitioning routine is actually
  • 19:26 - 19:26
    more efficient in that case if
    there are duplicates in what you
  • 19:32 - 19:32
    are sorting.
    And I encourage you to look at
  • 19:36 - 19:36
    that.
    It has a much more complicated
  • 19:39 - 19:39
    invariant for partitioning
    routine, but it does a similar
  • 19:43 - 19:43
    kind of thing.
    It's just a bit more
  • 19:45 - 19:45
    complicated.
    If they weren't all distinct,
  • 19:48 - 19:48
    there are things you can do to
    make them distinct or you can
  • 19:52 - 19:52
    just use this code.
    The easiest thing to do is just
  • 19:56 - 19:56
    use Hoare's original code
    because that works pretty well
  • 20:00 - 20:00
    when they are nondistinct.
    But this is a little bit easier
  • 20:08 - 20:08
    to understand.
    Let's let T(n) be the
  • 20:13 - 20:13
    worst-case running time on n
    elements.
  • 20:18 - 20:18
    And so what is the worst case?
    What is the worst case going to
  • 20:27 - 20:27
    be for quicksort?
  • 20:40 - 20:40
    That's right.
    If you always pick the pivot
  • 20:43 - 20:43
    and everything is greater than
    or everything is less than,
  • 20:47 - 20:47
    you are not going to partition
    the array very well.
  • 20:51 - 20:51
    And when does that happen?
    What does the original input
  • 20:55 - 20:55
    look like that makes that
    happen?
  • 20:57 - 20:57
    If it is already sorted or
    reverse sorted.
  • 21:01 - 21:01
    So, if the input is sorted or
    reverse sorted.
  • 21:05 - 21:05
    That is actually kind of
    important to understand,
  • 21:10 - 21:10
    because it turns out the most
    common thing to sort is
  • 21:15 - 21:15
    something that is already
    sorted, surprisingly,
  • 21:19 - 21:19
    or things that are nearly
    sorted.
  • 21:22 - 21:22
    But often it is just sorted and
    somebody wants to make sure it
  • 21:28 - 21:28
    is sorted.
    Well, let's just sort it again
  • 21:33 - 21:33
    rather than checking to see if
    it is sorted.
  • 21:38 - 21:38
    And, in those cases,
    one side of
  • 21:44 - 21:44
    each partition has no elements.
    Then we can write out what the
  • 21:50 - 21:50
    recursion is for that.
    We have T(n).
  • 21:54 - 21:54
    If one side has no elements,
    we are going to have T(0) on
  • 22:01 - 22:01
    that side.
    And on the other side we are
  • 22:06 - 22:06
    going to have T(n-1).
    We are just writing out the
  • 22:11 - 22:11
    recursion for this.
    One side has no elements.
  • 22:16 - 22:16
    The other side has n-1
    elements.
  • 22:20 - 22:20
    And then partitioning and all
    the bookkeeping and so forth is
  • 22:27 - 22:27
    order n.
    What is T(0)?
  • 22:30 - 22:30
    What is T(0)?
    What is that asymptotically?
  • 22:36 - 22:36
    It's a constant,
    order 1.
  • 22:39 - 22:39
    That is just order 1 + T(n-1) +
    order n.
  • 22:44 - 22:44
    Well, the order 1 can be
    absorbed into the order n,
  • 22:51 - 22:51
    so this is really just saying
    it is T(n-1) + order n.
  • 22:59 - 22:59
    And what is that equal to?
    That is order n^2.
  • 23:09 - 23:09
    Why is that order n^2?
    It is an arithmetic series.
  • 23:19 - 23:19
    Actually, just like we got for
    insertion sort.
  • 23:30 - 23:30
    Just like for insertion sort it
    is an arithmetic series.
  • 23:36 - 23:36
    Going through all that work and
    we have an algorithm called
  • 23:42 - 23:42
    quicksort, and it is no faster
    than insertion sort.
  • 23:47 - 23:47
    Nevertheless,
    I said it was a good algorithm.
  • 23:52 - 23:52
    The reason it is a good
    algorithm is because its average
  • 23:58 - 23:58
    case time, as we are going to
    see, is very good.
  • 24:04 - 24:04
    But let's try to understand
    this a little bit more just so
  • 24:10 - 24:10
    that we understand the
    difference between what is going
  • 24:15 - 24:15
    to happen in the average case
    and what is going to happen in
  • 24:21 - 24:21
    the worst case.
    Let's draw a recursion tree for
  • 24:26 - 24:26
    this for T(n) = T(0) + T(n-1) +
    and I will make the constant
  • 24:31 - 24:31
    explicit for cn.
    So, we get an intuition of what
  • 24:37 - 24:37
    is going on.
    Some constant times n.
  • 24:39 - 24:39
    And then we have T(n) is equal
    to, and we write it with the
  • 24:44 - 24:44
    constant part here,
    cn, and then T(0) here,
  • 24:48 - 24:48
    and then T(n-1) here.
    Now, I know that all you folks
  • 24:52 - 24:52
    are really fast and want to jump
    immediately to the full-blown
  • 24:57 - 24:57
    tree.
    But, let me tell you,
  • 25:00 - 25:00
    my advice is that you spend
    just a couple of minutes writing
  • 25:04 - 25:04
    it out.
    Since the tree grows
  • 25:07 - 25:07
    exponentially,
    it only costs you a constant
  • 25:10 - 25:10
    overhead to write out the small
    cases and make sure that you
  • 25:15 - 25:15
    have got the pattern that you
    are developing.
  • 25:18 - 25:18
    So, I am going to go one more
    step.
  • 25:21 - 25:21
    Here we have T(0) and now this
    becomes c(n-1) and now we have
  • 25:26 - 25:26
    another T(0) over here and
    T(n-2).
  • 25:30 - 25:30
    And we continue that,
    dot, dot, dot.
  • 25:36 - 25:36
    That is all equal to cn with a
    T(0) here, c(n-1) with a T(0),
  • 25:46 - 25:46
    c(n-2), T(0) here,
    and that goes all the way down
  • 25:54 - 25:54
    until we end up with T(1) down
    here.
  • 26:01 - 26:01
    What is the height of this tree?
  • 26:17 - 26:17
    What is the height of the tree
    here?
  • 26:21 - 26:21
    Yeah, n.
    Good.
  • 26:22 - 26:22
    Because every step we are just
    decrementing the argument by 1.
  • 26:30 - 26:30
    So, the height is n.
    To analyze this,
  • 26:35 - 26:35
    let's first add up everything
    that is here.
  • 26:41 - 26:41
    Just so we understand where
    these things are coming from,
  • 26:49 - 26:49
    this is just theta of the
    summation of k equals 1 to n of
  • 26:56 - 26:56
    k, actually of ck.
    That is what is in there.
  • 27:02 - 27:02
    And that is equal to order n^2.
    That is where our arithmetic
  • 27:08 - 27:08
    series is coming from.
    So, that is Theta(n^2).
  • 27:13 - 27:13
    And then all of these things
    here are all Theta(1).
  • 27:23 - 27:23
    And how many of them are there?
    There are n Theta(1)'s.
  • 27:33 - 27:33
    So, the total amount is T(n) =
    Theta(n) + Theta(n^2) =
  • 27:55 - 27:55
    Theta(n^2).
    Just to see what the structure
  • 28:03 - 28:03
    is in terms of the recursion
    tree, it is a highly unbalanced
  • 28:09 - 28:09
    recursion tree.
    Now I am going to do something
  • 28:14 - 28:14
    that I told you should never do,
    which is we are going to do
  • 28:21 - 28:21
    a best-case analysis.
    This is for intuition only.
  • 28:26 - 28:26
    And, in general,
    we don't do best-case analyses.
  • 28:32 - 28:32
    It doesn't mean anything,
    unless we get some intuition
  • 28:36 - 28:36
    for it maybe.
    But basically it means nothing
  • 28:39 - 28:39
    mathematically because it's
    providing no guarantee.
  • 28:55 - 28:55
    And so this is intuition only.
  • 29:05 - 29:05
    If we are really lucky what
    happens for partition?
  • 29:14 - 29:14
    What is going to be the lucky
    case?
  • 29:26 - 29:26
    Yeah, it splits right in the
    middle.
  • 29:29 - 29:29
    Which is essentially --
  • 29:43 - 29:43
    -- n/2 : n/2.
    It is really (n-1)/2 :
  • 29:46 - 29:46
    (n-1)/2, but we're not going to
    worry about the details because
  • 29:51 - 29:51
    we're only doing intuition for
    the best-case because best-case
  • 29:57 - 29:57
    is not what we want.
    If that happened,
  • 30:02 - 30:02
    what is the recurrence I get?
    Imagine it split it exactly in
  • 30:12 - 30:12
    the middle every time,
    then what happens?
  • 30:18 - 30:18
    You get T(n) = 2T(n/2) + order
    n for partitioning and
  • 30:27 - 30:27
    bookkeeping.
    And what is the solution of
  • 30:33 - 30:33
    that recurrence?
    That is n log n.
  • 30:37 - 30:37
    That is the same as the merge
    sort recurrence.
  • 30:42 - 30:42
    It is which case of the master
    theorem?
  • 30:47 - 30:47
    Case 2, right?
    Because n to the log base 2 of
  • 30:52 - 30:52
    2 is n to the 1,
    it is the same,
  • 30:55 - 30:55
    so we tack on the extra log n.
    Case 2 of the master theorem.
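    Spelling out that master-theorem step: in T(n) = 2T(n/2) + Theta(n) we
    have a = 2 and b = 2, so n^(log_b a) = n^(log_2 2) = n, and f(n) =
    Theta(n) matches it exactly, which is case 2; therefore
    T(n) = Theta(n lg n).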
  • 31:05 - 31:05
    That is pretty good.
    That says that in the best-case
  • 31:14 - 31:14
    quicksort is going to do well.
    How about let's suppose the
  • 31:24 - 31:24
    split is always let's say 1/10 :
    9/10, 1/10n :
  • 31:32 - 31:32
    9/10n.
    In that case,
  • 31:36 - 31:36
    are we lucky or are we unlucky?
  • 31:52 - 31:52
    I mean, if the split is really
    skewed, we clearly are going to
  • 31:57 - 31:57
    be unlucky, right,
    because then it's,
  • 32:01 - 32:01
    say, 1 to n.
    If it is really in the middle
  • 32:04 - 32:04
    it is n log n.
    What do you suppose it is if it
  • 32:08 - 32:08
    is 1/10 : 9/10?
    Is that lucky or unlucky?
  • 32:12 - 32:12
    We will have a little democracy
    here.
  • 32:14 - 32:14
    Who thinks that that is a lucky
    case?
  • 32:18 - 32:18
    It is going to be fast running
    time.
  • 32:20 - 32:20
    And who thinks it is an unlucky
    case?
  • 32:23 - 32:23
    OK, so we have some brave
    souls.
  • 32:26 - 32:26
    And who didn't vote?
    Oh, come on.
  • 32:30 - 32:30
    Come on.
    It is always better,
  • 32:32 - 32:32
    by the way, to say yes or no
    and be right or wrong,
  • 32:36 - 32:36
    because then you have some
    emotional commitment to it and
  • 32:40 - 32:40
    we will remember better,
    rather than just sitting and
  • 32:44 - 32:44
    being quiet.
    You don't manipulate your own
  • 32:47 - 32:47
    emotions well enough to remember
    things well.
  • 32:50 - 32:50
    Those people who voted win over
    the people who don't vote,
  • 32:55 - 32:55
    whether they are right or
    wrong.
  • 32:57 - 32:57
    Well, let's take a look.
    Here is the recurrence.
  • 33:03 - 33:03
    T(n) = T(1/10n) + T(9/10n) +
    Theta(n).
  • 33:08 - 33:08
    And we will assume that this
    part here is less than or equal
  • 33:15 - 33:15
    to some cn in order to analyze
    it.
  • 33:19 - 33:19
    We will just do a recursion
    tree for this and see.
  • 33:26 - 33:26
    Here is a recursion tree.
  • 33:35 - 33:35
    We have T(n) = cn,
    T(1/10n), T(9/10n).
  • 33:41 - 33:41
    Now we have again cn at the
    top.
  • 33:47 - 33:47
    This gets complicated,
    right?
  • 33:51 - 33:51
    This is 1/10cn.
    Now, over here we have 1/10.
  • 34:00 - 34:00
    And then we are plugging it
    into the recursion again,
  • 34:10 - 34:10
    so we now get T(1/100n) and
    over here we get T(9/100n).
  • 34:20 - 34:20
    And over here we have now
    9/10cn.
  • 34:26 - 34:26
    And that gives us T(9/100n)
    again.
  • 34:34 - 34:34
    And here we get T(81/100n).
    And we keep going on.
  • 34:48 - 34:48
    That is equal to cn,
    1/10cn here.
  • 34:59 - 34:59
    Down this way we have 1/100cn.
    And that keeps going down until
  • 35:08 - 35:08
    we get to order 1 down here.
    And over here we have 9/10cn.
  • 35:17 - 35:17
    And here, let's see,
    this is 9/100cn and this is now
  • 35:25 - 35:25
    9/100cn and this is 81/100cn.
    And these things keep going
  • 35:33 - 35:33
    down until they get down to
    order 1.
  • 35:37 - 35:37
    But the leaves are not all at
    uniform depth here,
  • 35:42 - 35:42
    right?
    This side is way further up
  • 35:45 - 35:45
    than this side,
    right?
  • 35:48 - 35:48
    Because here we are only going
    down by 9/10 each time.
  • 35:53 - 35:53
    So, in fact,
    what is the length of this path
  • 35:58 - 35:58
    here?
    What is the length of this path
  • 36:02 - 36:02
    down to this,
    if I take the left most spine?
  • 36:15 - 36:15
    Somebody raise their hand.
    Yeah?
  • 36:24 - 36:24
    Log base 10 of n.
    Because I am basically cutting
  • 36:31 - 36:31
    down by a factor of 10 each
    time.
  • 36:33 - 36:33
    And how long does it take me to
    reduce it to 1?
  • 36:36 - 36:36
    That is the definition,
    if you will,
  • 36:38 - 36:38
    of what a log is,
    log base 10.
  • 36:40 - 36:40
    What is this one?
    What is this path going that
  • 36:43 - 36:43
    way?
  • 36:53 - 36:53
    Log of n. Log base 10/9 of n.
    Because we're going down by
  • 37:02 - 37:02
    9/10 each time.
    Once again, essentially the
  • 37:06 - 37:06
    definition of a log.
    And everything in between there
  • 37:10 - 37:10
    is somewhere between log base 10
    of n and log base 10/9 of n.
  • 37:15 - 37:15
    So, everything is in between
    there.
  • 37:17 - 37:17
    Now what I can do is do the
    trick that we did for mergesort
  • 37:22 - 37:22
    in looking at what the
    evaluation of this is by adding
  • 37:27 - 37:27
    up what is the cost of the total
    level.
  • 37:31 - 37:31
    That is just cn.
    What is the cost of the next
  • 37:37 - 37:37
    level?
    cn.
  • 37:38 - 37:38
    And what is the cost of the
    next level?
  • 37:43 - 37:43
    cn.
    Every level we are still doing
  • 37:48 - 37:48
    the same amount of work.
    And we take that all the way
  • 37:55 - 37:55
    down.
    And the last levels --
  • 38:00 - 38:00
    Eventually we hit some point
    where it is not equal to cn
  • 38:05 - 38:05
    where we start getting things
    that are less than or equal to
  • 38:09 - 38:09
    cn because some of the leaves
    start dropping out starting at
  • 38:14 - 38:14
    this level.
    Basically this part is going to
  • 38:18 - 38:18
    be log base 10 of n,
    and then we start getting
  • 38:21 - 38:21
    things that are less than or
    equal to cn, and so forth,
  • 38:26 - 38:26
    until finally we get to add it
    all up.
  • 38:30 - 38:30
    T(n) is going to be less than
    or equal to cn times,
  • 38:36 - 38:36
    well, what is the longest that
    this could possibly be?
  • 38:43 - 38:43
    Log base 10/9 of n.
    Plus we have all of the leaves
  • 38:50 - 38:50
    that we have to add in,
    but all the leaves together add
  • 38:57 - 38:57
    up to just order n.
    All the leaves add up to order
  • 39:03 - 39:03
    n, so we have + Theta(n).
    And so this is how much?
  • 39:08 - 39:08
    If I add all of this together,
    what is this asymptotically?
  • 39:14 - 39:14
    That is n log n.
    So, T(n) is actually bounded by
  • 39:19 - 39:19
    n log n.
    We are lucky.
  • 39:21 - 39:21
    Those people who guessed lucky
    were right.
  • 39:25 - 39:25
    A 1/10 : 9/10 split is
    asymptotically as good as a 50 :
  • 39:31 - 39:31
    50 split.
    And, in fact,
  • 39:34 - 39:34
    we can lower bound this by just
    looking at these things here and
  • 39:41 - 39:41
    discover that,
    in fact, T(n) is lower bounded
  • 39:46 - 39:46
    by cn log base 10 of n + order
    n.
  • 39:49 - 39:49
    And so T(n) is lower bounded by
    also asymptotically n log n.
  • 39:55 - 39:55
    So, T(n) is actually Theta(n lg
    n).
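    Collecting the level-by-level argument: every level of the recursion
    tree costs at most cn, and the tree has height at most log_{10/9} n, so

        T(n) <= cn log_{10/9} n + Theta(n) = O(n lg n),

    while the complete levels near the top already give

        T(n) >= cn log_10 n + Theta(n),

    which is Omega(n lg n); together, T(n) = Theta(n lg n).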
  • 40:00 - 40:00
    Now, this is not really proof.
    I generally recommend that you
  • 40:04 - 40:04
    don't do this kind of thing to
    do a proof.
  • 40:08 - 40:08
    This is a good intuition of a
    recursion tree.
  • 40:11 - 40:11
    The way you prove this is what?
    Substitution method.
  • 40:15 - 40:15
    Good.
    What you do is use this to get
  • 40:17 - 40:17
    your guess and then use
    substitution method to prove
  • 40:21 - 40:21
    that your guess is right.
    It is too easy to make mistakes
  • 40:25 - 40:25
    with this method.
    It is very easy to make
  • 40:28 - 40:28
    mistakes.
    With the substitution method it
  • 40:33 - 40:33
    is harder to make mistakes
    because there is just algebra
  • 40:37 - 40:37
    there that you are cranking
    through.
  • 40:40 - 40:40
    It is easier to verify rather
    than dot, dot,
  • 40:44 - 40:44
    dots and trees that you drew
    improperly and wrote in wrong
  • 40:48 - 40:48
    amounts and so forth.
    OK?
  • 40:50 - 40:50
    So, this is n log n.
    That's pretty good.
  • 40:54 - 40:54
    It is order n log n.
    And we are lucky.
  • 40:57 - 40:57
    Now let's try another one.
    This is all for intuition
  • 41:02 - 41:02
    because, I will tell you,
    by the time we get to the end
  • 41:06 - 41:06
    of this class you folks are
    going to be bolting for the door
  • 41:10 - 41:10
    because we are going to do some
    good math today,
  • 41:14 - 41:14
    actually.
    It is actually fun math,
  • 41:16 - 41:16
    I think, but it is challenging.
    If you are not awake,
  • 41:20 - 41:20
    you can still sleep now,
    but I will tell you when to
  • 41:24 - 41:24
    wake up.
    One more bit of intuition.
  • 41:26 - 41:26
    Suppose that we alternate
    steps.
  • 41:30 - 41:30
    Suppose we do the partitioning
    thing.
  • 41:33 - 41:33
    And it happens that we start
    out lucky and then we have a
  • 41:39 - 41:39
    partitioning step that is
    unlucky and then we have a step
  • 41:44 - 41:44
    that is lucky and a step that is
    unlucky and we do that all the
  • 41:50 - 41:50
    way down the tree.
    Suppose we alternate.
  • 42:07 - 42:07
    Are we lucky or unlucky if we
    do that?
  • 42:09 - 42:09
    This time I want everybody
    voting.
  • 42:11 - 42:11
    It doesn't matter what your
    answer is.
  • 42:14 - 42:14
    Everybody has to have a stake
    in the game.
  • 42:16 - 42:16
    It is sort of like horseracing.
    If ever you have watched
  • 42:20 - 42:20
    horseracing, it is really
    boring, but if you put a little
  • 42:23 - 42:23
    bit of money down,
    a little skin in the game
  • 42:26 - 42:26
    suddenly it is interesting.
    The same thing here.
  • 42:30 - 42:30
    I want everybody to put some
    skin in the game.
  • 42:34 - 42:34
    Who thinks that this is going
    to be lucky?
  • 42:38 - 42:38
    Who thinks it is going to be
    unlucky?
  • 42:41 - 42:41
    OK.
    Who didn't vote?
  • 42:43 - 42:43
    [LAUGHTER] You guys.
    No skin in the game,
  • 42:46 - 42:46
    ha?
    Let's analyze this so we can
  • 42:49 - 42:49
    once again write a recurrence.
    On the lucky step,
  • 42:53 - 42:53
    we will have L(n) be the
    running time on a lucky step of
  • 42:58 - 42:58
    size n.
    And that is going to be twice.
  • 43:03 - 43:03
    While the next step is going to
    be unlucky.
  • 43:07 - 43:07
    It is two unluckies of n over 2 plus
    order n.
  • 43:11 - 43:11
    That is our lucky step.
    And then for the unlucky step
  • 43:16 - 43:16
    it is essentially going to be L
    of n minus 1,
  • 43:20 - 43:20
    it is going to be lucky on the
    next step, plus order n.
  • 43:25 - 43:25
    That is unlucky.
    See how I have described this
  • 43:31 - 43:31
    behavior with a system now of
    recurrences that are dependent
  • 43:36 - 43:36
    where the boundary cases,
    once again unstated,
  • 43:41 - 43:41
    are that the recurrences have a
    constant solution with constant
  • 43:47 - 43:47
    input.
    Now we just do a little bit of
  • 43:50 - 43:50
    algebra using substitution.
    L(n) is then equal to,
  • 43:55 - 43:55
    well, I can just plug in,
    for U(n/2) plug in the value of
  • 44:00 - 44:00
    U(n/2).
    And that gives me 2[L(n/2-1) +
  • 44:06 - 44:06
    Theta(n)] + Theta(n).
    See what I did here?
  • 44:12 - 44:12
    I simply plugged in,
    for U(n/2), this recurrence.
  • 44:19 - 44:19
    In fact, technically I guess I
    should have said Theta(n/2) just
  • 44:27 - 44:27
    to make this substitution more
    straightforward.
  • 44:35 - 44:35
    It is the same thing,
    but just to not skip a step.
  • 44:42 - 44:42
    That we can now crank through.
    And that is 2L(n/2 - 1) +,
  • 44:49 - 44:49
    and now I have two T(n/2) plus
    another one, so all of that is
  • 44:57 - 44:57
    just order n.
    And what is the solution to
  • 45:04 - 45:04
    that recurrence?
    n log n.
  • 45:08 - 45:08
    Theta(n lg n).
    Does everybody see that?
  • 45:13 - 45:13
    OK?
    Theta(n lg n).
  • 45:16 - 45:16
    This is basically just,
    once again, master theorem with
  • 45:24 - 45:24
    a little bit of jiggering here.
    That minus one is only going to
  • 45:33 - 45:33
    help us, actually,
    in the solution of the master
  • 45:38 - 45:38
    theorem.
    So, it is order n lg n.
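    The substitution just performed, collected in one place:

        L(n) = 2U(n/2) + Theta(n)                    (lucky step)
        U(n) = L(n-1) + Theta(n)                     (unlucky step)
        L(n) = 2[L(n/2 - 1) + Theta(n/2)] + Theta(n)
             = 2L(n/2 - 1) + Theta(n)
             = Theta(n lg n)                         (master theorem; the -1 only helps)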
  • 45:42 - 45:42
    We are lucky.
    If we alternate lucky and
  • 45:46 - 45:46
    unlucky, we are lucky.
    How can we ensure that we are
  • 45:51 - 45:51
    usually lucky?
    If I have the input already
  • 45:55 - 45:55
    sorted, I am going to be
    unlucky.
  • 46:00 - 46:00
    Excuse me?
    You could randomly arrange the
  • 46:03 - 46:03
    elements, that is one way.
    What is another way?
  • 46:08 - 46:08
    That is a perfectly good way,
    actually.
  • 46:11 - 46:11
    In fact, it is a common thing
    to do.
  • 46:14 - 46:14
    Randomly choose the pivot,
    OK.
  • 46:16 - 46:16
    It turns out those are
    effectively equivalent,
  • 46:20 - 46:20
    but we are going to do the
    randomly choose the pivot
  • 46:25 - 46:25
    because it is a little bit
    easier to analyze.
  • 46:30 - 46:30
    But they are effectively
    equivalent.
  • 46:33 - 46:33
    That gives us the algorithm
    called randomized quicksort.
  • 46:46 - 46:46
    And the nice thing about
    randomized quicksort is that the
  • 46:53 - 46:53
    running time is independent of
    the input ordering.
  • 47:00 - 47:00
    Very much for the same reason
    that if I just scramble the
  • 47:04 - 47:04
    input, it would be independent
    of the input ordering.
  • 47:08 - 47:08
    If I randomly scramble the
    input then it doesn't matter
  • 47:13 - 47:13
    what the order of the input was.
    Whereas, original quicksort has
  • 47:18 - 47:18
    some slow cases,
    input sorted or reverse sorted,
  • 47:21 - 47:21
    and some fast cases.
    In particular,
  • 47:24 - 47:24
    it turns out that if it is
    random it is going to be pretty
  • 47:29 - 47:29
    fast.
    If I actually randomly scramble
  • 47:32 - 47:32
    the input or pivot on a random
    element, it doesn't matter what
  • 47:36 - 47:36
    the input was.
    One way of thinking about this
  • 47:38 - 47:38
    is with an adversary.
    Imagine your adversary,
  • 47:41 - 47:41
    you are saying I have a good
    sorting algorithm and he says I
  • 47:44 - 47:44
    have a good sorting algorithm
    and you're trying to sell to a
  • 47:48 - 47:48
    single customer.
    And the customer says OK,
  • 47:50 - 47:50
    you guys come up with
    benchmarks for each of your
  • 47:53 - 47:53
    algorithms.
    And you get to look at his
  • 47:55 - 47:55
    algorithm.
    Well, you look and you say oh,
  • 47:59 - 47:59
    he is using quicksort.
    I will just give him something
  • 48:03 - 48:03
    that is already sorted.
    That is what you could do to
  • 48:07 - 48:07
    him.
    If you had quicksort,
  • 48:08 - 48:08
    he would do the same thing to
    you.
  • 48:11 - 48:11
    So, how can you defeat him?
    Well, one way is with
  • 48:14 - 48:14
    randomization.
    Big idea in computer science,
  • 48:17 - 48:17
    use random numbers.
    The idea here is if I permute
  • 48:21 - 48:21
    the ordering at random,
    as one suggestion,
  • 48:24 - 48:24
    or I pivot at random places,
    then the input ordering didn't
  • 48:28 - 48:28
    matter.
    And so there is no bad ordering
  • 48:32 - 48:32
    that he can provide that is
    going to make my code run
  • 48:37 - 48:37
    slowly.
    Now, I might get unlucky.
  • 48:39 - 48:39
    But that is just unlucky in my
    use of my random-number
  • 48:43 - 48:43
    generator.
    It is not unlucky with respect
  • 48:46 - 48:46
    to what the input was.
    What the input was doesn't
  • 48:50 - 48:50
    matter.
    Everybody follow that?
  • 48:53 - 48:53
    OK.
    The nice thing about randomized
  • 48:55 - 48:55
    quicksort is that it makes no
    assumptions about the input
  • 49:00 - 49:00
    distribution.
    You don't have to assume that
  • 49:06 - 49:06
    all inputs are equally likely
    because either you can make it
  • 49:13 - 49:13
    that way or you pivot in a way
    that makes that effectively
  • 49:20 - 49:20
    whole.
    And, in particular,
  • 49:24 - 49:24
    there is no specific input that
    can elicit the worst-case
  • 49:31 - 49:31
    behavior.
    The worst-case is determined
  • 49:45 - 49:45
    only by a random-number
    generator.
  • 50:00 - 50:00
    And, therefore,
    since it is only determined by
  • 50:04 - 50:04
    a random-number generator,
    we can essentially bound the
  • 50:09 - 50:09
    unluckiness mathematically.
    We can say what are the odds?
  • 50:14 - 50:14
    So, we are going to analyze
    this.
  • 50:16 - 50:16
    And this is where you know if
    you belong in this course or
  • 50:22 - 50:22
    not.
    If you have skipped 6.042 or
  • 50:24 - 50:24
    whatever, this is a good place
    to do the comparison.
  • 50:30 - 50:30
    Since it is going to be a
    little bit, why don't people
  • 50:34 - 50:34
    just stand up for a moment and
    take a stretch break.
  • 50:38 - 50:38
    Since this is going to be a
    nice piece of mathematics we are
  • 50:42 - 50:42
    going to do, you are going to
    want to feel fresh for it.
  • 51:02 - 51:02
    Stretch break is over.
  • 51:10 - 51:10
    Analysis.
    Good.
  • 51:12 - 51:12
    I think we are going to make
    this.
  • 51:16 - 51:16
    I am sort of racing.
    There is a lot of stuff to
  • 51:23 - 51:23
    cover today.
    Good.
  • 51:25 - 51:25
    Let's let T(n) now be the
    random variable for the running
  • 51:32 - 51:32
    time assuming --
  • 51:46 - 51:46
    Wow.
    I didn't even write here what
  • 51:48 - 51:48
    we did here.
    So, we are going to pivot on a
  • 51:50 - 51:50
    random element.
  • 51:59 - 51:59
    That is the basic scheme we are
    going to do.
  • 52:02 - 52:02
    And the way I do that,
    by the way, is just in the code
  • 52:05 - 52:05
    for partition,
    rather than partitioning on the
  • 52:08 - 52:08
    first element,
    before I do the partition,
  • 52:10 - 52:10
    I just swap the first element
    with some other element in the
  • 52:14 - 52:14
    array chosen at random,
    perhaps itself.
  • 52:17 - 52:17
    So, they are all equally likely
    to be pivoted on.
  • 52:20 - 52:20
    And then just run the ordinary
    partition.
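    A minimal sketch of that randomization in Python, layered on the earlier
    partition sketch (random.randint is inclusive on both ends, so every
    element of A[p..q], including A[p] itself, is equally likely to become
    the pivot):

        import random

        def randomized_partition(A, p, q):
            k = random.randint(p, q)        # pick a random index in [p, q]
            A[p], A[k] = A[k], A[p]         # move that element to the front
            return partition(A, p, q)       # then run the ordinary partition

        def randomized_quicksort(A, p, q):
            if p >= q:
                return
            r = randomized_partition(A, p, q)
            randomized_quicksort(A, p, r - 1)
            randomized_quicksort(A, r + 1, q)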
  • 52:23 - 52:23
    This is a random variable for
    the running time assuming,
  • 52:28 - 52:28
    we have to make an assumption
    for doing probability,
  • 52:32 - 52:32
    the random numbers are
    independent.
  • 52:41 - 52:41
    So that when I pivot in one
    place, it is independent of how
  • 52:45 - 52:45
    I pivoted in some other place as
    I am running this algorithm.
  • 52:49 - 52:49
    Then, to analyze this,
    what I am going to do is I want
  • 52:52 - 52:52
    to know where we pivoted.
    For k = 0, 1,
  • 52:57 - 52:57
    ..., n-1, let's let,
    for a particular partition,
  • 53:08 - 53:08
    the random variable X_k = 1 if
    partition generates a k :
  • 53:22 - 53:22
    n-k-1 split,
    and 0 otherwise.
  • 53:35 - 53:35
    In the partition routine,
    I am picking a random element
  • 53:39 - 53:39
    to pivot on.
    And X_k is going to be my
  • 53:43 - 53:43
    random variable that is 1 if it
    generates a split that has k
  • 53:47 - 53:47
    elements on the left side and
    n-k-1 elements on the right side
  • 53:53 - 53:53
    of the pivot.
    The sum of those
  • 53:55 - 53:55
    two, of course, is n-1 because
    I also have the pivot.
  • 54:00 - 54:00
    And 0 otherwise.
    So, I now have n random
  • 54:04 - 54:04
    variables that I have defined
    associated with a single
  • 54:09 - 54:09
    partition where all of them are
    going to be zero except one of
  • 54:15 - 54:15
    them, whichever one happens to
    occur is going to have the value
  • 54:21 - 54:21
    1.
    This is called,
  • 54:23 - 54:23
    by the way.
    What is the name of this type
  • 54:27 - 54:27
    of random variable?
  • 54:37 - 54:37
    Bernoulli.
    Well, Bernoulli has other
  • 54:40 - 54:40
    assumptions.
    It is an indicator random
  • 54:43 - 54:43
    variable.
    It turns out it is Bernoulli,
  • 54:47 - 54:47
    but that's OK.
    It is an indicator random
  • 54:50 - 54:50
    variable.
    It just takes on the value of
  • 54:53 - 54:53
    0, 1.
    And Bernoulli random variables
  • 54:56 - 54:56
    are a particular type of
    indicator random variable.
  • 55:02 - 55:02
    Which it turns out these are.
    That is an indicator random
  • 55:06 - 55:06
    variable.
    Indicator random variables are
  • 55:10 - 55:10
    a good way when you are trying
    to understand what the sum of a
  • 55:14 - 55:14
    bunch of things is.
    It is a good way to break apart
  • 55:18 - 55:18
    your big random variables into
    smaller ones that can be
  • 55:23 - 55:23
    analyzed.
    Let's just take a look at this
  • 55:26 - 55:26
    indicator random variable.
    What is the expectation of X_k
  • 55:30 - 55:30
    equal to?
  • 55:42 - 55:42
    In other words,
    what is the probability that I
  • 55:46 - 55:46
    generate a k :
    n-k-1 split?
  • 56:00 - 56:00
    The expectation of X_k is, let's just
    write out what that means,
  • 56:05 - 56:05
    just to refresh people's
    memory.
  • 56:09 - 56:09
    That is 0 times the probability
    that X_k equals 0 plus 1 times
  • 56:16 - 56:16
    the probability that X_k equals
    1, which is equal,
  • 56:22 - 56:22
    well, that is all zero.
    That is just equal to the
  • 56:27 - 56:27
    probability that X_k equals 1.
    And that is a general property
  • 56:35 - 56:35
    of indicator random variables,
    is that their expectation is
  • 56:40 - 56:40
    the probability that they are 1.
    The nice thing about indicator
  • 56:46 - 56:46
    random variables is it directly
    connects the probability to the
  • 56:51 - 56:51
    expectation without any other
    terms going on.
  • 56:55 - 56:55
    What is the probability of X_k
    equals 1?
  • 56:59 - 56:59
    1/n.
    So, all splits are equally
  • 57:02 - 57:02
    likely.
    And I have n elements,
  • 57:05 - 57:05
    so each ones has a 1/n chance
    of being picked as the pivot.
  • 57:10 - 57:10
    And, once you pick the pivot,
    that determines what is on the
  • 57:16 - 57:16
    left and the right and so forth.
    So, it is 1/n.
  • 57:20 - 57:20
    Everybody with me so far?
    More or less?
  • 57:23 - 57:23
    OK.
    As I say, this is going to test
  • 57:26 - 57:26
    whether you're in the class.
    If you go home and you study
  • 57:32 - 57:32
    this and you cannot get it,
    and you have a deficiency in
  • 57:37 - 57:37
    your math background in trying
    to take the course,
  • 57:42 - 57:42
    this is a good indication that
    probably you have taken
  • 57:46 - 57:46
    something a little over your
    head.
  • 57:49 - 57:49
    Let's write out what T(n) is
    equal to here.
  • 58:00 - 58:00
    T(n) is going to be equal to
    T(0) + T(n-1) + Theta(n) if we
  • 58:12 - 58:12
    get a 0 : n-1 split and is equal
    to T(1) + T(n-2) + order n if we
  • 58:25 - 58:25
    have a 1 : n-2 split.
  • 58:35 - 58:35
    And now down here it is going
    to be T(n-1) + T(0) + Theta(n)
  • 58:46 - 58:46
    if we end up with an n-1 :
    0 split.
  • 58:52 - 58:52
    So, this is our recurrence for
    T(n).
  • 59:00 - 59:00
    And, unfortunately,
    the recurrence is kind of hairy
  • 59:04 - 59:04
    because it has got n cases.
    And this is,
  • 59:07 - 59:07
    once again, where the
    brilliance of being able to use
  • 59:11 - 59:11
    indicator random variables comes
    in.
  • 59:14 - 59:14
    Because we will be able to take
    this case analysis and reduce it
  • 59:19 - 59:19
    to mathematics so we don't have
    cases using indicator random
  • 59:23 - 59:23
    variables.
    And the way we do that is using
  • 59:30 - 59:30
    the following trick of
    converting the cases into a
  • 59:37 - 59:37
    summation.
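    The summation on the board, in the same notation:

        T(n) = sum_{k=0}^{n-1} X_k [ T(k) + T(n-k-1) + Theta(n) ]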
  • 59:55 - 59:55
    Let's just take a look at why
    these two things are the same.
  • 60:01 - 60:01
    The indicator random variable
    is zero, except if you get the
  • 60:06 - 60:06
    particular split.
    Therefore, this summation is
  • 60:11 - 60:11
    going to be zero,
    except for that k which
  • 60:14 - 60:14
    actually appeared in which case
    it is the value that we say it
  • 60:20 - 60:20
    is.
    See the trick using
  • 60:22 - 60:22
    multiplication by 0,
    1 variable to handle all the
  • 60:27 - 60:27
    cases?
    I think that is damn clever.
  • 60:31 - 60:31
    I think that is damn clever.
    And this is like the classic
  • 60:36 - 60:36
    thing that you do with indicator
    random variables.
  • 60:40 - 60:40
    It's one of the reasons they
    are a very powerful method.
  • 60:45 - 60:45
    Because now we actually have a
    mathematical expression,
  • 60:50 - 60:50
    hairy although it may be,
    for our recurrence.
  • 60:54 - 60:54
    Now, what we are going to
    analyze is the expected value of
  • 60:58 - 60:58
    T(n).
    That is what we want to do.
  • 61:03 - 61:03
    What is the expected value of
    T(n)?
  • 61:07 - 61:07
    To do that, I just write the
    expected value of T(n) is equal
  • 61:13 - 61:13
    to the expected value of this
    big summation.
  • 61:18 - 61:18
    And now we can go ahead and
    start to evaluate the expected
  • 61:24 - 61:24
    value of that summation.
    Everybody with me?
  • 61:30 - 61:30
    Yes?
    Any questions at this point?
  • 61:32 - 61:32
    I see a thumbs up.
    That's nice to see.
  • 61:35 - 61:35
    But I generally believe that
    what I want to see is no thumbs
  • 61:40 - 61:40
    down.
    It is good to see the thumbs
  • 61:43 - 61:43
    up, but that means one person
    understands, or thinks he
  • 61:47 - 61:47
    understands.
    [LAUGHTER] So,
  • 61:49 - 61:49
    this is, I claim,
    equal to the following.
  • 61:52 - 61:52
    Actually, I am going to need a
    little space here so I am going
  • 61:57 - 61:57
    to move the equal sign over a
    little bit.
  • 62:25 - 62:25
    I claim that summation is equal
    to that.
  • 62:28 - 62:28
    This expectation is equal to
    that summation of expectations.
  • 62:32 - 62:32
    Why is that?
    What are the magic words that
  • 62:36 - 62:36
    justify this step?
    Linearity of expectation.
  • 62:40 - 62:40
    The expectation of a sum is the
    sum of the expectations.
  • 62:46 - 62:46
    So, that is linearity of
    expectation.
  • 62:49 - 62:49
    I don't need independence for
    that.
  • 62:53 - 62:53
    That is just always true for
    expectation of any random
  • 62:58 - 62:58
    variables.
    The sum of the expectations is
  • 63:05 - 63:05
    the expectation of the sum and
    vice versa.
  • 63:11 - 63:11
    Here we did the vice versa.
    That is equal to now the sum of
  • 63:20 - 63:20
    k=0 to n-1 of expectation of X_k
    [T(k) + T(n-k-1) + Theta(n)].
  • 63:36 - 63:36
    Why is that true?
    What I have done is I've said
  • 63:42 - 63:42
    the expectation of the product
    is the product of the
  • 63:49 - 63:49
    expectations.
    That is because of
  • 63:53 - 63:53
    independence.
    What is independent of what?
  • 64:00 - 64:00
    The X_k here,
    random variable,
  • 64:02 - 64:02
    are independent of any of the
    other partitionings in,
  • 64:07 - 64:07
    if you will,
    the X_k that would exist for
  • 64:10 - 64:10
    any of the other recursive
    calls.
  • 64:13 - 64:13
    So, whatever happens in here is
    independent of what happened
  • 64:18 - 64:18
    there.
    We are actually hiding.
  • 64:21 - 64:21
    Since we have a recurrence,
    we are not partitioning the
  • 64:25 - 64:25
    same way each time.
    We have a different one.
  • 64:30 - 64:30
    We actually have something
    going on underneath the
  • 64:33 - 64:33
    mathematics you have to pay
    attention to that the
  • 64:36 - 64:36
    mathematics alone isn't really
    showing, which is that in T(k)
  • 64:40 - 64:40
    there is actually a set of
    random choices that are being
  • 64:44 - 64:44
    made, if you will.
    And so you have to understand
  • 64:47 - 64:47
    that those are independent of
    those, in which case we can
  • 64:50 - 64:50
    multiply their expectations.
  • 64:53 - 64:53
    Is everybody with me?
    That is a big one,
  • 64:56 - 64:56
    independence of X_k from other
    random choices.
  • 65:00 - 65:00
    That is equal to now,
    well, first of all,
  • 65:05 - 65:05
    this is nice.
    What is the expectation of X_k?
  • 65:10 - 65:10
    1/n.
    That actually doesn't even
  • 65:13 - 65:13
    belong in the summation.
    We will just pop it outside.
  • 65:20 - 65:20
    I get 1/n times the sum of k=0
    to n-1 of expectation of T(k) +
  • 65:36 - 65:36
    1/n summation k=0 to n-1 of
    expectation of T(n-k-1) + 1/n
  • 65:50 - 65:50
    summation k=0 to n-1 of
    Theta(n).
  • 66:00 - 66:00
    That is, again,
    using linearity of expectation
  • 66:05 - 66:05
    there this time to split up
    these pieces and just factoring
  • 66:11 - 66:11
    out the expectation of X_k as
    being 1/n.
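    The chain of steps just described, collected in one place:

        E[T(n)] = E[ sum_{k=0}^{n-1} X_k (T(k) + T(n-k-1) + Theta(n)) ]
                = sum_{k=0}^{n-1} E[ X_k (T(k) + T(n-k-1) + Theta(n)) ]    (linearity of expectation)
                = sum_{k=0}^{n-1} E[X_k] E[ T(k) + T(n-k-1) + Theta(n) ]   (independence)
                = (1/n) sum_{k=0}^{n-1} E[T(k)]
                  + (1/n) sum_{k=0}^{n-1} E[T(n-k-1)]
                  + (1/n) sum_{k=0}^{n-1} Theta(n)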
  • 66:15 - 66:15
    Everybody with me still?
    All of this is elementary.
  • 66:20 - 66:20
    It is just one of these things
    that is hard just because there
  • 66:27 - 66:27
    are so many steps.
    And it takes that you have seen
  • 66:33 - 66:33
    some of this before.
    Now the next observation is
  • 66:38 - 66:38
    that these two summations are,
    in fact, identical.
  • 66:43 - 66:43
    They are the same summation,
    just in a different order.
  • 66:49 - 66:49
    This is going T(0),
    T(1), T(2), T(3) up to T(n-1).
  • 66:54 - 66:54
    This one is going T(n-1),
    T(n-2), T(n-3) down to T(0).
  • 67:00 - 67:00
    These are, in fact,
    equal.
  • 67:02 - 67:02
    So, therefore,
    I have two of them.
  • 67:19 - 67:19
    And then what is this term
    equal to?
  • 67:37 - 67:37
    What is that one equal to?
    Theta(n).
  • 67:41 - 67:41
    Let's just see why.
    The summation of 0 :
  • 67:45 - 67:45
    n of Theta(n) is Theta(n^2)
    divided by n.
  • 67:50 - 67:50
    Or, if I want to bring the
    Theta(n) out,
  • 67:54 - 67:54
    I have Theta(n) times 1/n times the summation of
    k equals 1 to n of Theta(1), or of
  • 68:02 - 68:02
    1.
    So, once again,
  • 68:05 - 68:05
    you get n, either way of doing
    it.
  • 68:08 - 68:08
    This is, in some sense,
    because the summations have
  • 68:13 - 68:13
    identical terms,
    and this is just algebra.
  • 68:17 - 68:17
    Now what we are going to do is
    do something for technical
  • 68:23 - 68:23
    convenience.
    Because we are going to absorb
  • 68:30 - 68:30
    the k=0, 1 terms into the
    Theta(n) for technical
  • 68:38 - 68:38
    convenience.
  • 68:46 - 68:46
    We have a recurrence here where
    I have an order n.
  • 68:50 - 68:50
    And, if I look at the cases
    where k=0 or k=1,
  • 68:54 - 68:54
    I know what the expectation is.
    For 0, 1, the expected cost is
  • 69:00 - 69:00
    the worst case cost,
    which is constant.
  • 69:04 - 69:04
    Because I am only solving the
    problem for a constant size.
  • 69:08 - 69:08
    And we know that for any of the
    boundary cases that our solution
  • 69:12 - 69:12
    of recurrence,
    our assumption is that it is
  • 69:15 - 69:15
    constant time.
    So, I basically can just take
  • 69:18 - 69:18
    those two terms out.
    And all that does is accumulate
  • 69:22 - 69:22
    some more constant here in the
    Theta(n).
  • 69:24 - 69:24
    It is going to make the
    solution of the recurrence a
  • 69:28 - 69:28
    little bit easier.
    And, if I do that,
  • 69:33 - 69:33
    I get expectation of T(n) = 2/n
    summation k=2 to n-1 of
  • 69:44 - 69:44
    expectation of T(k) + Theta(n).
  • 70:00 - 70:00
    So, all of that work was to
    derive the recurrence.
  • 70:07 - 70:07
    And now we have to solve it.
    Just to review what we did,
  • 70:15 - 70:15
    we started out with a
    recurrence which was for the
  • 70:21 - 70:21
    random variable which involved a
    case statement.
  • 70:29 - 70:29
    We converted that into some
    mathematics without the case
  • 70:33 - 70:33
    statement, just with a product,
    and then we derived a
  • 70:37 - 70:37
    recurrence for the expectation.
    And now we are in the process
  • 70:42 - 70:42
    of trying to solve that
    recurrence.
  • 70:44 - 70:44
    We have done some
    simplification of the recurrence
  • 70:48 - 70:48
    so that we understand what it is
    that we are going to solve here.
  • 70:53 - 70:53
    By the way, I don't give things
    like this on quizzes.
  • 70:56 - 70:56
    I do expect you to understand
    it.
  • 71:00 - 71:00
    The elements of this you will
    find on a quiz.
  • 71:03 - 71:03
    This is a lot of work to figure
    out.
  • 71:06 - 71:06
    This took smart people to do.
    Even though it is all
  • 71:09 - 71:09
    elementary, working out
    something like this at the
  • 71:13 - 71:13
    elementary level is still a bit
    of work even for somebody who is
  • 71:17 - 71:17
    knowledgeable in this area.
    Now we are going to solve that
  • 71:23 - 71:23
    last recurrence over there and
    we are going to prove that the
  • 71:30 - 71:30
    expectation of T(n) is less than
    or equal to (an lg n) for some
  • 71:38 - 71:38
    constant a greater than 0.
    That is going to be what we are
  • 71:44 - 71:44
    going to do.
    And so what technique do you
  • 71:49 - 71:49
    think we should use to prove
    this?
  • 71:53 - 71:53
    Does this look like a master
    method?
  • 72:03 - 72:03
    It is nothing like the master
    method.
  • 72:09 - 72:09
    So, when in doubt do
    substitution.
  • 72:14 - 72:14
    It is the grand-daddy of all
    methods.
  • 72:19 - 72:19
    What we will do is solve the
    base case by simply choosing a
  • 72:28 - 72:28
    big enough so that (an lg n) is
    bigger than the expectation of
  • 72:38 - 72:38
    T(n) for sufficiently
    small n.
  • 72:45 - 72:45
    So, I just pick a to be big
    enough.
  • 72:49 - 72:49
    And this is,
    by the way, why I wanted to
  • 72:54 - 72:54
    exclude 0 and 1 from the
    recurrence.
  • 72:59 - 72:59
    Because, for example,
    when n=0, log of 0 is,
  • 73:03 - 73:03
    it's like dividing by 0,
    right, you cannot do it.
  • 73:08 - 73:08
    Log of 1 is 0.
    So here, even if I restricted
  • 73:13 - 73:13
    it to 1, here I would have a 0,
    and I can't ever pick a big
  • 73:18 - 73:18
    enough to dominate those cases.
    What I do is I just say look,
  • 73:24 - 73:24
    I just absorb whatever the cost
    is into the Theta(n) for technical
  • 73:31 - 73:31
    convenience.
    And that lets me just pick a in
  • 73:37 - 73:37
    (an lg n) to be big enough to
    handle the base case.
  • 73:44 - 73:44
    So, that is why we made that
    technical assumption.
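Concretely, handling the base cases of the induction only requires that, for the
finitely many small values n = 2, 3, ..., n_0 (whatever cutoff n_0 the inductive
step ends up needing; n_0 is not fixed here), we choose the constant a large
enough that

    a\,n\lg n \;\ge\; \mathbb{E}[T(n)],

which is possible because each such \mathbb{E}[T(n)] is a constant and n\lg n > 0
once n \ge 2.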
  • 73:51 - 73:51
    We are going to use a fact
    which is that the summation of
  • 73:59 - 73:59
    k=2 to n-1 of k lg k is less
    than or equal to 1/2n^2 lg n -
  • 74:07 - 74:07
    1/8n^2.
    I am going to leave that as an
  • 74:11 - 74:11
    exercise for you to work out.
    I think it is an exercise in
  • 74:15 - 74:15
    the book, too.
    I want you to go and evaluate
  • 74:19 - 74:19
    this.
    There are two ways to evaluate
  • 74:21 - 74:21
    it.
    One is by using purely
  • 74:23 - 74:23
    summations and facts about
    summations by splitting the
  • 74:27 - 74:27
    summation into two pieces and
    reconstituting it to prove this
  • 74:32 - 74:32
    bound.
    The other way is to use the
  • 74:35 - 74:35
    integral method for solving
    summations.
  • 74:39 - 74:39
    Either way you can prove it.
    The integral method actually
  • 74:43 - 74:43
    gets you a tighter bound than
    this.
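For reference, the summation-splitting route goes roughly as follows (a sketch;
checking the constants carefully is the exercise). Split the sum at n/2: terms
with k < n/2 have lg k <= lg n - 1, and the rest have lg k <= lg n, so

    \sum_{k=2}^{n-1} k\lg k
      \;\le\; (\lg n - 1)\sum_{k=2}^{\lceil n/2\rceil - 1} k
            \;+\; \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
      \;=\; \lg n \sum_{k=2}^{n-1} k \;-\; \sum_{k=2}^{\lceil n/2\rceil - 1} k
      \;\le\; \tfrac{1}{2}n^2\lg n \;-\; \tfrac{1}{8}n^2,

where the last step uses \sum_{k=2}^{n-1} k \le n^2/2 and
\sum_{k=2}^{\lceil n/2\rceil - 1} k \ge n^2/8 minus lower-order terms; the slack
in the first bound absorbs those once n is at least a small constant, and the
smallest cases can be checked directly.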
  • 74:46 - 74:46
    This is a basic fact,
    and you should go off and know
  • 74:51 - 74:51
    how to do that.
    Now let's do substitution.
  • 75:06 - 75:06
    The expectation of T(n) is less
    than or equal to 2/n times the
  • 75:19 - 75:19
    summation k=2 to n-1,
    now we do the substitution of
  • 75:29 - 75:29
    ak lg k, the smaller values plus
    Theta(n).
  • 75:39 - 75:39
    I might mention,
    by the way, that the hard part
  • 75:42 - 75:42
    of doing this,
    it is easy to get the bound
  • 75:45 - 75:45
    without this term,
    it is easy to get this bound,
  • 75:49 - 75:49
    1/2n^2 lg n,
    it is harder to get the second
  • 75:52 - 75:52
    order term.
    It turns out you need the
  • 75:54 - 75:54
    second order term in order to do
    what we are going to do.
  • 75:59 - 75:59
    You have to be able to subtract
    a quadratic amount off the n^2 lg
  • 76:06 - 76:06
    n in order to make this proof
    work.
  • 76:09 - 76:09
    And that is the trickier part
    in evaluating that summation.
  • 76:15 - 76:15
    So, we get this.
    That is less than or equal to?
  • 76:20 - 76:20
    Well, I happen to know how much
    this is by using that formula.
  • 76:27 - 76:27
    I use my fact and get 2a/n
    (1/2n^2 lg n - 1/8n^2) +
  • 76:32 - 76:32
    Theta(n).
    Did I do something wrong?
  • 76:44 - 76:44
    There we go.
    Very good.
  • 76:52 - 76:52
    That is equal to -
    If I multiply this first part
  • 77:05 - 77:05
    through, that is an lg n.
    And now, so I don't make a
  • 77:14 - 77:14
    mistake, I want to express this
    as my desired,
  • 77:23 - 77:23
    this is what I want it to be,
    minus a residual.
  • 77:32 - 77:32
    I am going to write the
    residual as this part.
  • 77:39 - 77:39
    And so, the way to write that
    is, that is going to be minus.
  • 77:47 - 77:47
    And then it is going to be this
    term here, which is going to be
  • 77:57 - 77:57
    an/4 - Theta(n).
  • 78:05 - 78:05
    And that is going to be less
    than or equal to an lg n if this
  • 78:12 - 78:12
    part is positive.
    And I can make that part
  • 78:17 - 78:17
    positive by picking a big enough
    such that an/4 dominates the
  • 78:25 - 78:25
    constant in the Theta(n) here.
    Whatever the constant is here,
  • 78:35 - 78:35
    I can find an a that is big
    enough so that this term makes
  • 78:45 - 78:45
    this part positive.
    If a is big enough so that an/4
  • 78:55 - 78:55
    dominates Theta(n).
    And so the running time of
  • 79:01 - 79:01
    randomized quicksort is order n
    lg n.
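The whole substitution, collected into one chain, is

    \mathbb{E}[T(n)]
      \;\le\; \frac{2}{n}\sum_{k=2}^{n-1} a\,k\lg k \;+\; \Theta(n)
      \;\le\; \frac{2a}{n}\Bigl(\tfrac{1}{2}n^2\lg n - \tfrac{1}{8}n^2\Bigr) \;+\; \Theta(n)
      \;=\; a\,n\lg n \;-\; \Bigl(\tfrac{an}{4} - \Theta(n)\Bigr)
      \;\le\; a\,n\lg n,

with the last step holding once a is large enough that an/4 dominates the
constant hidden in the Theta(n).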
  • 79:04 - 79:04
    That is what we just proved,
    the expected running time is
  • 79:09 - 79:09
    order n lg n.
    Now, in practice,
  • 79:12 - 79:12
    quicksort is a great algorithm.
    It is typically three or more
  • 79:17 - 79:17
    times faster than mergesort.
    It doesn't necessarily give you
  • 79:22 - 79:22
    the strong guarantee of
    mergesort of being worst-case n
  • 79:26 - 79:26
    lg n.
    But in practice,
  • 79:29 - 79:29
    if you use randomized
    quicksort, it is generally as
  • 79:33 - 79:33
    much as three times faster.
    It does require code tuning in
  • 79:38 - 79:38
    order to get it up to be that
    fast.
  • 79:40 - 79:40
    You do have to go and coarsen
    the base cases and do some other
  • 79:45 - 79:45
    tricks there,
    but most good sorting
  • 79:48 - 79:48
    algorithms that you will find
    are based on quicksort.
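As a rough illustration of the kind of tuning being described, here is a sketch
of a randomized quicksort with a coarsened base case (the cutoff value 16 and
the helper names are just placeholders; real implementations tune these
experimentally):

    import random

    CUTOFF = 16  # size below which we stop recursing; best value is machine-dependent


    def quicksort(a, lo=0, hi=None):
        """Sort a[lo..hi] in place with randomized quicksort."""
        if hi is None:
            hi = len(a) - 1
        while hi - lo + 1 > CUTOFF:
            p = partition(a, lo, hi)
            # Recurse on the smaller side, iterate on the larger side,
            # which keeps the stack depth O(lg n).
            if p - lo < hi - p:
                quicksort(a, lo, p - 1)
                lo = p + 1
            else:
                quicksort(a, p + 1, hi)
                hi = p - 1
        insertion_sort(a, lo, hi)  # small subarrays go to insertion sort


    def partition(a, lo, hi):
        """Partition a[lo..hi] around a uniformly random pivot; return its index."""
        r = random.randint(lo, hi)
        a[r], a[hi] = a[hi], a[r]      # move the random pivot to the end
        x = a[hi]
        i = lo - 1
        for j in range(lo, hi):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[hi] = a[hi], a[i + 1]
        return i + 1


    def insertion_sort(a, lo, hi):
        """Insertion sort on the slice a[lo..hi], used for the coarsened base case."""
        for i in range(lo + 1, hi + 1):
            key = a[i]
            j = i - 1
            while j >= lo and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

Called as quicksort(data), it sorts the Python list data in place; the expected
number of comparisons is the Theta(n lg n) just derived, whichever pivots the
random number generator happens to choose.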
  • 79:52 - 79:52
    Also one of the other reasons
    it works well is because it
  • 79:56 - 79:56
    tends to work well with caches
    and virtual memory.
  • 80:01 - 80:01
    We are not really talking much
    about caching models and so
  • 80:05 - 80:05
    forth, big topic these days in
    algorithms, but it does work
  • 80:10 - 80:10
    very well with caches and virtual
    memory.
  • 80:13 - 80:13
    It is another reason that this
    is a good algorithm to use.
  • 80:17 - 80:17
    Good recitation,
    by the way, on Friday.
  • 80:20 - 80:20
    We are going to see another n
    log n time algorithm,
  • 80:24 - 80:24
    a very important one in
    recitation on Friday.
Title:
Lec 4 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005
Description:

Lecture 04: Quicksort, Randomized Algorithms

View the complete course at: http://ocw.mit.edu/6-046JF05

License: Creative Commons BY-NC-SA

More information at http://ocw.mit.edu/terms

More courses at http://ocw.mit.edu

Duration:
01:20:33
