
Lec 4 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005

  • 0:08 - 0:08
    OK.
    Today we are going to talk
  • 0:11 - 0:11
    about a very interesting
    algorithm called Quicksort --
  • 0:23 - 0:23
    -- which was invented by Tony
    Hoare in 1962.
  • 0:30 - 0:30
    And it has ended up being a
    really interesting algorithm
  • 0:36 - 0:36
    from many points of view.
    And because of that,
  • 0:41 - 0:41
    it turns out today's lecture is
    going to be both hard and fast.
  • 0:48 - 0:48
    If you see the person next to
    you sleeping,
  • 0:52 - 0:52
    you will want to say let's get
    going.
  • 0:56 - 0:56
    It's a divide-and-conquer
    algorithm.
  • 1:06 - 1:06
    And it sorts,
    as they say,
  • 1:08 - 1:08
    in place, meaning that it just
    rearranges the elements where
  • 1:14 - 1:14
    they are.
    That is like insertion sort
  • 1:18 - 1:18
    rearranges elements where they
    are.
  • 1:21 - 1:21
    Mergesort does not.
    Mergesort requires extra
  • 1:25 - 1:25
    storage in order to do the merge
    operation.
  • 1:30 - 1:30
    To merge in linear time it needs extra
    storage; it doesn't merge in place
  • 1:35 - 1:35
    in linear time.
    It doesn't do it just by
  • 1:38 - 1:38
    rearranging.
    It is nice because it is in
  • 1:41 - 1:41
    place, so that means that it is
    fairly efficient in its use of
  • 1:46 - 1:46
    storage.
    And it also happens to be very
  • 1:49 - 1:49
    practical if you tune it a bit.
    The basic algorithm turns out,
  • 1:54 - 1:54
    if you just implement that,
    it's not necessarily that
  • 1:58 - 1:58
    efficient.
    But if what you do was then do
  • 2:03 - 2:03
    the standard kinds of things you
    do to goose up the runtime of
  • 2:08 - 2:08
    something, and we'll talk a
    little about what those things
  • 2:13 - 2:13
    are, then it can be very,
    very practical.
  • 2:16 - 2:16
    So, it uses divide-and-conquer
    paradigm.
  • 2:33 - 2:33
    First step is divide.
    And to do this basically it
  • 2:41 - 2:41
    does it by partitioning.
    So, it partitions the input
  • 2:49 - 2:49
    array into two subarrays around
    an element we call the pivot --
  • 3:04 - 3:04
    -- such that elements in the
    lower subarray are less than or
  • 3:15 - 3:15
    equal to x, which is less than or
    equal to elements in the upper
  • 3:26 - 3:26
    subarray.
    If we draw a picture of the
  • 3:31 - 3:31
    input array, this partition step
    basically takes some element x
  • 3:39 - 3:39
    and everything over here is less
    than or equal to x after the
  • 3:46 - 3:46
    partition step and everything
    over here is greater than or
  • 3:53 - 3:53
    equal to x.
    And so now the conquer step is
  • 3:57 - 3:57
    pretty easy.
    You just recursively sort the
  • 4:05 - 4:05
    two subarrays.
    So, I recursively sort the
  • 4:11 - 4:11
    elements less than or equal to
    x, I recursively sort the
  • 4:19 - 4:19
    elements greater than or equal
    to x.
  • 4:25 - 4:25
    And then combine is then just
    trivial.
  • 4:32 - 4:32
    Because once I have sorted the
    things less than or equal to x,
  • 4:37 - 4:37
    then sorted the things greater
    than or equal to x,
  • 4:40 - 4:40
    the whole thing is sorted.
    So, there is nothing to do
  • 4:44 - 4:44
    really for the combine.
    The key step in quicksort is
  • 4:48 - 4:48
    this partition step.
    That is the thing that does all
  • 4:52 - 4:52
    of the work.
    And so you can view quicksort
  • 4:55 - 4:55
    as just recursive
    partitioning.
  • 4:58 - 4:58
    That's all it is.
    Just as mergesort was recursive
  • 5:06 - 5:06
    merging, quicksort sort of goes
    the other way around and does
  • 5:19 - 5:19
    recursive partitioning.
    The key is the linear time,
  • 5:29 - 5:29
    by which I mean theta n,
    partitioning subroutine.
  • 5:40 - 5:40
    And here is some pseudocode
    for it.
  • 5:43 - 5:43
    This is actually slightly
    different from the book.
  • 5:48 - 5:48
    The book has one.
    In fact, there is a nice
  • 5:51 - 5:51
    problem in the book that has
    even a different one,
  • 5:56 - 5:56
    but they are all basically the
    same idea.
  • 6:00 - 6:00
    Partition (A,
    p, q).
  • 6:02 - 6:02
    And what we are looking at,
    at this step of the recursion,
  • 6:07 - 6:07
    is the subarray A from p to q.
    And basically we pick a pivot,
  • 6:12 - 6:12
    which we are going to just
    pick as the first element of the
  • 6:17 - 6:17
    array A of p.
  • 6:26 - 6:26
    And the book,
    just for your information,
  • 6:30 - 6:30
    uses A of q.
    I use A of p.
  • 6:32 - 6:32
    It doesn't really matter.
    And then we set an index to p
  • 6:37 - 6:37
    and then we have a loop.
  • 7:35 - 7:35
    This is the code.
    Basically the structure of it
  • 7:41 - 7:41
    is a for loop with an "if"
    statement in the middle.
  • 7:47 - 7:47
    And so the structure of the
    algorithm of this partitioning
  • 7:54 - 7:54
    step looks as follows.
    We set the pivot to be the
  • 8:01 - 8:01
    first element.
    Here is p and here is q.
  • 8:05 - 8:05
    This is going to be our
    invariant for the loop.
  • 8:10 - 8:10
    And, at any time during the
    execution of a loop,
  • 8:15 - 8:15
    I essentially have some values
    up to i which are already less
  • 8:22 - 8:22
    than or equal to x and then some
    values that end at j minus 1
  • 8:29 - 8:29
    that are greater than or equal
    to x.
  • 8:34 - 8:34
    And then I don't know about the
    rest.
  • 8:38 - 8:38
    And so we start out with i
    equal to p and j equal to p plus
  • 8:45 - 8:45
    1.
    It starts out at p plus 1 so
  • 8:48 - 8:48
    that everything is unknown
    except for x here.
  • 8:53 - 8:53
    And then the idea is that it is
    going to preserve this
  • 8:59 - 8:59
    invariant.
    And the way it does it is,
  • 9:03 - 9:03
    as we go through the loop,
    it looks at a of j and says is
  • 9:07 - 9:07
    it greater than or equal to x?
    Sorry, is it less than or equal
  • 9:11 - 9:11
    to x?
    If it is greater than or equal
  • 9:13 - 9:13
    to x it does nothing,
    because what can happen?
  • 9:16 - 9:16
    If this is greater than or
    equal to x, essentially it just
  • 9:20 - 9:20
    goes to the next iterational
    loop which moves this boundary
  • 9:24 - 9:24
    and the invariant is satisfied.
    Does everybody see that?
  • 9:27 - 9:27
    Yeah, OK.
    But if it is less than or equal
  • 9:31 - 9:31
    to x, I have got a problem if I
    want to maintain the invariant
  • 9:36 - 9:36
    if this next element is less
    than or equal to x.
  • 9:39 - 9:39
    And so what it does then is it
    says oh, let me just move this
  • 9:43 - 9:43
    boundary and swap this element
    here, which is greater than or
  • 9:47 - 9:47
    equal to x, with this one here
    that is less than or equal to x,
  • 9:52 - 9:52
    thereby increasing the size of
    this subarray and then the
  • 9:56 - 9:56
    invariant is satisfied again.
    It is a fairly simple
  • 9:59 - 9:59
    algorithm.
    And it is actually a very tight
  • 10:04 - 10:04
    and easy algorithm.
    That is one reason that this is
  • 10:09 - 10:09
    such a great piece of code
    because it is very efficient.
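    Here is a sketch of that partitioning routine in 0-indexed Python (the
    lecture's board code is 1-indexed pseudocode; the names mirror the board,
    and partitioning A[p..q] pivots on the first element A[p]):

        def partition(A, p, q):
            x = A[p]                      # the pivot
            i = p
            for j in range(p + 1, q + 1):
                # Invariant: A[p+1..i] <= x and A[i+1..j-1] >= x.
                if A[j] <= x:
                    i += 1
                    A[i], A[j] = A[j], A[i]
            # Put the pivot between the two subarrays and report its index.
            A[p], A[i] = A[i], A[p]
            return i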
  • 10:14 - 10:14
    Now, in principle,
    the running time for this on n
  • 10:18 - 10:18
    elements is order n.
  • 10:28 - 10:28
    Because I am basically just
    going through the n elements and
  • 10:32 - 10:32
    just doing a constant amount of
    work and then just a constant
  • 10:36 - 10:36
    amount of work outside.
    This is a clever piece of code.
  • 10:39 - 10:39
    In fact, in principle partition
    is easy, right?
  • 10:42 - 10:42
    If I weren't worrying about
    doing it in place,
  • 10:45 - 10:45
    it is really a pretty easy
    thing to do.
  • 10:47 - 10:47
    I take an element and just
    compare every other element with
  • 10:51 - 10:51
    it.
    I throw one into one bin and
  • 10:53 - 10:53
    one into the other.
    That is clearly linear time.
  • 10:57 - 10:57
    But often what you find is that
    just because you can do it that
  • 11:01 - 11:01
    way theoretically doesn't mean
    that that is going to end up
  • 11:05 - 11:05
    giving you good code.
    And this is a nice piece of
  • 11:08 - 11:08
    code that allows you to do it in
    place.
  • 11:11 - 11:11
    And that is one reason why this
    is a particularly good
  • 11:14 - 11:14
    algorithm, because the constants
    are good.
  • 11:17 - 11:17
    So, yes, when we do asymptotic
    analysis we tend to ignore the
  • 11:21 - 11:21
    constants, but when you're
    actually building code you care
  • 11:25 - 11:25
    about the constants.
    But first, you care about much more
  • 11:30 - 11:30
    than just the constants:
    whether overall it is going
  • 11:37 - 11:37
    to be a fast algorithm.
    Let's go through an example of
  • 11:44 - 11:44
    this, I guess I will do it over
    here, just so we get the gist.
  • 11:51 - 11:51
    Here is a sample array that I
    have created out of whole cloth.
  • 11:58 - 11:58
    And here we are going to set x,
    the pivot, to be 6.
  • 12:05 - 12:05
    Let's look to see how this
    algorithm works.
  • 12:08 - 12:08
    So, i starts out here and j
    starts out here if we
  • 12:12 - 12:12
    initialize.
    And what we do is start
  • 12:15 - 12:15
    scanning right,
    essentially that code is
  • 12:19 - 12:19
    scanning right until it gets
    something which is less than or
  • 12:23 - 12:23
    equal to the pivot.
    It keeps going here until it
  • 12:27 - 12:27
    finds, j keeps incrementing
    until it finds something that is
  • 12:32 - 12:32
    less than or equal to the pivot.
    And, in that case,
  • 12:38 - 12:38
    it is the number 5.
    Then it says we will swap these
  • 12:44 - 12:44
    two things.
    And it does that and we get 6,
  • 12:50 - 12:50
    5, 13, 10, 8,
    3, 2, 11.
  • 12:52 - 12:52
    And meanwhile now i gets
    incremented and j continues
  • 12:59 - 12:59
    where it left off.
    And so now we keep scanning
  • 13:05 - 13:05
    right until we get to something
    that is less than or equal to
  • 13:13 - 13:13
    the pivot.
    In this case it is 3.
  • 13:17 - 13:17
    We swap 3 and 5 and get 6,
    3, etc.
  • 13:21 - 13:21
    And now, at this step we
    increment i, we start j out
  • 13:27 - 13:27
    here.
    And in this case,
  • 13:31 - 13:31
    right off the bat,
    we have something which is less
  • 13:36 - 13:36
    than or equal to x,
    so we swap these two.
  • 13:40 - 13:40
    I blew it, didn't I?
    Oops.
  • 13:42 - 13:42
    What did I do?
    I swapped the wrong thing,
  • 13:46 - 13:46
    didn't I, here?
    Ah-ha.
  • 13:48 - 13:48
    That is why I am not a
    computer.
  • 13:52 - 13:52
    Good.
    We should have swapped this
  • 13:55 - 13:55
    guy, right?
    Swapped i plus 1,
  • 13:58 - 13:58
    right?
    This was i.
  • 14:01 - 14:01
    We swap i plus 1,
    good.
  • 14:04 - 14:04
    So, that's all wrong.
    Let's swap the right things.
  • 14:09 - 14:09
    Now we have 6,
    5, 3, 10, 8,
  • 14:12 - 14:12
    13, 2, 11.
    That even corresponds to my
  • 14:16 - 14:16
    notes for some strange reason.
    This is i and now this is j.
  • 14:23 - 14:23
    And now when I look,
    I immediately have something
  • 14:28 - 14:28
    that is less than or equal to
    the pivot.
  • 14:34 - 14:34
    We swap this and i plus 1,
    so now we have 6,
  • 14:42 - 14:42
    5, 3, 2, 8, 13,
    10, 11.
  • 14:45 - 14:45
    And we, at that point,
    increment i to here.
  • 14:53 - 14:53
    And we have j now going here
    and j runs to the end.
  • 15:03 - 15:03
    And the loop terminates.
    When the loop terminates there
  • 15:09 - 15:09
    is one last swap that we do,
    which is to put our pivot
  • 15:14 - 15:14
    element in the middle between
    the two subarrays.
  • 15:19 - 15:19
    Here we swap this one and this
    one, and so that gives us then
  • 15:25 - 15:25
    2, 5, 3, 6, 8,
    13, 10, 11.
  • 15:28 - 15:28
    And this is the pivot.
    And everything over here is
  • 15:34 - 15:34
    less than or equal to the pivot.
    And everything over here is
  • 15:40 - 15:40
    greater than or equal to the
    pivot.
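    As a check, running the partition sketch above on this sample array
    reproduces the trace (0-indexed, so the pivot lands at index 3 rather
    than position 4):

        A = [6, 10, 13, 5, 8, 3, 2, 11]
        r = partition(A, 0, len(A) - 1)
        print(A, r)   # prints [2, 5, 3, 6, 8, 13, 10, 11] 3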
  • 15:56 - 15:56
    OK, so the quicksort routine.
    Once we have this partition
  • 16:00 - 16:00
    routine, quicksort is a pretty
    easy piece of code to write.
  • 16:18 - 16:18
    I should have said return here
    i, right?
  • 16:21 - 16:21
    You have got to return with the
    pivot.
  • 16:25 - 16:25
    Here I have got to return i
    because we want to know where
  • 16:30 - 16:30
    the pivot element is.
    Sorry.
  • 16:34 - 16:34
    I will plug in my code.
    r gets partition of (A,
  • 16:47 - 16:47
    p, q) and then we quicksort (A,
    p, r-1) and quicksort of (A,
  • 17:03 - 17:03
    r+1, q).
    And that is it.
  • 17:10 - 17:10
    That's the code.
    The initial call is quicksort
  • 17:17 - 17:17
    of (A, 1, n).
    Because once we partitioned,
  • 17:25 - 17:25
    we just have to quicksort the
    two portions,
  • 17:32 - 17:32
    the left and right portions.
    Just the boundary case is
  • 17:39 - 17:39
    probably worth mentioning for a
    second.
  • 17:41 - 17:41
    If there are zero or one
    elements, that is basically what
  • 17:44 - 17:44
    can possibly happen here,
    is that I get zero or one
  • 17:47 - 17:47
    elements here.
    Then the point is there is
  • 17:49 - 17:49
    nothing to do because the array
    is sorted, either because it is
  • 17:53 - 17:53
    an empty array or because it
    only has one element.
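    A minimal sketch of the quicksort routine on the board, again 0-indexed
    (so the initial call becomes quicksort(A, 0, n - 1) rather than
    quicksort(A, 1, n)):

        def quicksort(A, p, q):
            if p >= q:                   # zero or one element: already sorted
                return
            r = partition(A, p, q)       # pivot ends up at index r
            quicksort(A, p, r - 1)       # sort the lower subarray
            quicksort(A, r + 1, q)       # sort the upper subarray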
  • 17:56 - 17:56
    One of the tricks to making
    quicksort go fast,
  • 17:59 - 17:59
    as one tunes this,
    is to, in fact,
  • 18:01 - 18:01
    look at having a special
    purpose sorting routine for
  • 18:05 - 18:05
    small numbers of elements.
    For example,
  • 18:07 - 18:07
    if you get down to five
    elements having some straight
  • 18:11 - 18:11
    line piece of code that knows
    how to sort five elements
  • 18:14 - 18:14
    efficiently, as opposed to
    continuing to go through
  • 18:18 - 18:18
    recursion in order to accomplish
    that.
  • 18:20 - 18:20
    And there are a variety of
    other things.
  • 18:23 - 18:23
    This is tail-recursive code,
    and so you can use certain tail
  • 18:27 - 18:27
    recursion optimizations.
    And there are a variety of
  • 18:32 - 18:32
    other kinds of optimizations
    that you can use to make this
  • 18:35 - 18:35
    code go fast.
    So, yeah, you can tune it up a
  • 18:38 - 18:38
    bit beyond what is there,
    but the core of it is this
  • 18:41 - 18:41
    efficient partitioning routine.
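    Purely as an illustration of those two tuning ideas (this is not the
    lecture's code, and the cutoff of 5 is an arbitrary choice): recurse on
    one side and loop on the other to exploit the tail-recursive position,
    and finish small subarrays with a simple insertion sort standing in for
    straight-line code.

        CUTOFF = 5   # arbitrary small-subarray threshold

        def tuned_quicksort(A, p, q):
            while q - p + 1 > CUTOFF:
                r = partition(A, p, q)
                tuned_quicksort(A, p, r - 1)   # recurse on the lower side
                p = r + 1                      # loop on the upper side (tail-call elimination)
            # Insertion sort for the remaining small subarray A[p..q].
            for j in range(p + 1, q + 1):
                key = A[j]
                i = j - 1
                while i >= p and A[i] > key:
                    A[i + 1] = A[i]
                    i -= 1
                A[i + 1] = key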
  • 18:53 - 18:53
    That is the algorithm.
    It turns out that looking at
  • 18:59 - 18:59
    how fast it runs is actually a
    little bit challenging.
  • 19:05 - 19:05
    In the analysis,
    we are going to assume that all
  • 19:10 - 19:10
    elements are distinct.
    It turns out that this
  • 19:14 - 19:14
    particular code does not work
    very well when you have repeated
  • 19:20 - 19:20
    elements, but Hoare's original
    partitioning routine is actually
  • 19:26 - 19:26
    more efficient in that case if
    there are duplicates in what you
  • 19:32 - 19:32
    are sorting.
    And I encourage you to look at
  • 19:36 - 19:36
    that.
    It has a much more complicated
  • 19:39 - 19:39
    invariant for partitioning
    routine, but it does a similar
  • 19:43 - 19:43
    kind of thing.
    It's just a bit more
  • 19:45 - 19:45
    complicated.
    If they weren't all distinct,
  • 19:48 - 19:48
    there are things you can do to
    make them distinct or you can
  • 19:52 - 19:52
    just use this code.
    The easiest thing to do is just
  • 19:56 - 19:56
    use Hoare's original code
    because that works pretty well
  • 20:00 - 20:00
    when they are nondistinct.
    But this is a little bit easier
  • 20:08 - 20:08
    to understand.
    Let's let T(n) be the
  • 20:13 - 20:13
    worst-case running time on n
    elements.
  • 20:18 - 20:18
    And so what is the worst case?
    What is the worst case going to
  • 20:27 - 20:27
    be for quicksort?
  • 20:40 - 20:40
    That's right.
    If you always pick the pivot
  • 20:43 - 20:43
    and everything is greater than
    or everything is less than,
  • 20:47 - 20:47
    you are not going to partition
    the array very well.
  • 20:51 - 20:51
    And when does that happen?
    What does the original input
  • 20:55 - 20:55
    look like that makes that
    happen?
  • 20:57 - 20:57
    If it is already sorted or
    reverse sorted.
  • 21:01 - 21:01
    So, if the input is sorted or
    reverse sorted.
  • 21:05 - 21:05
    That is actually kind of
    important to understand,
  • 21:10 - 21:10
    because it turns out the most
    common thing to sort is
  • 21:15 - 21:15
    something that is already
    sorted, surprisingly,
  • 21:19 - 21:19
    or things that are nearly
    sorted.
  • 21:22 - 21:22
    But often it is just sorted and
    somebody wants to make sure it
  • 21:28 - 21:28
    is sorted.
    Well, let's just sort it again
  • 21:33 - 21:33
    rather than checking to see if
    it is sorted.
  • 21:38 - 21:38
    And, in those cases,
    one side of
  • 21:44 - 21:44
    each partition has no elements.
    Then we can write out what the
  • 21:50 - 21:50
    recursion is for that.
    We have T(n).
  • 21:54 - 21:54
    If one side has no elements,
    we are going to have T(0) on
  • 22:01 - 22:01
    that side.
    And on the other side we are
  • 22:06 - 22:06
    going to have T(n-1).
    We are just writing out the
  • 22:11 - 22:11
    recursion for this.
    One side has no elements.
  • 22:16 - 22:16
    The other side has n-1
    elements.
  • 22:20 - 22:20
    And then partitioning and all
    the bookkeeping and so forth is
  • 22:27 - 22:27
    order n.
    What is T(0)?
  • 22:30 - 22:30
    What is T(0)?
    What is that asymptotically?
  • 22:36 - 22:36
    It's a constant,
    order 1.
  • 22:39 - 22:39
    That is just order 1 + T(n-1) +
    order n.
  • 22:44 - 22:44
    Well, the order 1 can be
    absorbed into the order n,
  • 22:51 - 22:51
    so this is really just saying
    it is T(n-1) + order n.
  • 22:59 - 22:59
    And what is that equal to?
    That is order n^2.
  • 23:09 - 23:09
    Why is that order n^2?
    It is an arithmetic series.
  • 23:19 - 23:19
    Actually, just like we got for
    insertion sort.
  • 23:30 - 23:30
    Just like for insertion sort it
    is an arithmetic series.
  • 23:36 - 23:36
    Going through all that work and
    we have an algorithm called
  • 23:42 - 23:42
    quicksort, and it is no faster
    than insertion sort.
  • 23:47 - 23:47
    Nevertheless,
    I said it was a good algorithm.
  • 23:52 - 23:52
    The reason it is a good
    algorithm is because its average
  • 23:58 - 23:58
    case time, as we are going to
    see, is very good.
  • 24:04 - 24:04
    But let's try to understand
    this a little bit more just so
  • 24:10 - 24:10
    that we understand the
    difference between what is going
  • 24:15 - 24:15
    to happen in the average case
    and what is going to happen in
  • 24:21 - 24:21
    the worst case.
    Let's draw a recursion tree for
  • 24:26 - 24:26
    this for T(n) = T(0) + T(n-1) +
    and I will make the constant
  • 24:31 - 24:31
    explicit for cn.
    So, we get an intuition of what
  • 24:37 - 24:37
    is going on.
    Some constant times n.
  • 24:39 - 24:39
    And then we have T(n) is equal
    to, and we write it with the
  • 24:44 - 24:44
    constant part here,
    cn, and then T(0) here,
  • 24:48 - 24:48
    and then T(n-1) here.
    Now, I know that all you folks
  • 24:52 - 24:52
    are really fast and want to jump
    immediately to the full-blown
  • 24:57 - 24:57
    tree.
    But, let me tell you,
  • 25:00 - 25:00
    my advice is that you spend
    just a couple of minutes writing
  • 25:04 - 25:04
    it out.
    Since the tree grows
  • 25:07 - 25:07
    exponentially,
    it only costs you a constant
  • 25:10 - 25:10
    overhead to write out the small
    cases and make sure that you
  • 25:15 - 25:15
    have got the pattern that you
    are developing.
  • 25:18 - 25:18
    So, I am going to go one more
    step.
  • 25:21 - 25:21
    Here we have T(0) and now this
    becomes c(n-1) and now we have
  • 25:26 - 25:26
    another T(0) over here and
    T(n-2).
  • 25:30 - 25:30
    And we continue that,
    dot, dot, dot.
  • 25:36 - 25:36
    That is all equal to cn with a
    T(0) here, c(n-1) with a T(0),
  • 25:46 - 25:46
    c(n-2), T(0) here,
    and that goes all the way down
  • 25:54 - 25:54
    until we end up with T(1) down
    here.
  • 26:01 - 26:01
    What is the height of this tree?
  • 26:17 - 26:17
    What is the height of the tree
    here?
  • 26:21 - 26:21
    Yeah, n.
    Good.
  • 26:22 - 26:22
    Because every step we are just
    decrementing the argument by 1.
  • 26:30 - 26:30
    So, the height is n.
    To analyze this,
  • 26:35 - 26:35
    let's first add up everything
    that is here.
  • 26:41 - 26:41
    Just so we understand where
    these things are coming from,
  • 26:49 - 26:49
    this is just theta of the
    summation of k equals 1 to n of
  • 26:56 - 26:56
    k, actually of ck.
    That is what is in there.
  • 27:02 - 27:02
    And that is equal to order n^2.
    That is where our arithmetic
  • 27:08 - 27:08
    series is coming from.
    So, that is Theta(n^2).
  • 27:13 - 27:13
    And then all of these things
    here are all Theta(1).
  • 27:23 - 27:23
    And how many of them are there?
    There are n Theta(1)'s.
  • 27:33 - 27:33
    So, the total amount is T(n) =
    Theta(n) + Theta(n^2) =
  • 27:55 - 27:55
    Theta(n^2).
    Just to see what the structure
  • 28:03 - 28:03
    is in terms of the recursion
    tree, it is a highly unbalanced
  • 28:09 - 28:09
    recursion tree.
    Now I am going to do something
  • 28:14 - 28:14
    that I told you should never do,
    which is we are going to do
  • 28:21 - 28:21
    a best-case analysis.
    This is for intuition only.
  • 28:26 - 28:26
    And, in general,
    we don't do best-case analyses.
  • 28:32 - 28:32
    It doesn't mean anything,
    unless we get some intuition
  • 28:36 - 28:36
    for it maybe.
    But basically it means nothing
  • 28:39 - 28:39
    mathematically because it's
    providing no guarantee.
  • 28:55 - 28:55
    And so this is intuition only.
  • 29:05 - 29:05
    If we are really lucky what
    happens for partition?
  • 29:14 - 29:14
    What is going to be the lucky
    case?
  • 29:26 - 29:26
    Yeah, it splits right in the
    middle.
  • 29:29 - 29:29
    Which is essentially --
  • 29:43 - 29:43
    -- n/2 : n/2.
    It is really (n-1)/2 :
  • 29:46 - 29:46
    (n-1)/2, but we're not going to
    worry about the details because
  • 29:51 - 29:51
    we're only doing intuition for
    the best-case because best-case
  • 29:57 - 29:57
    is not what we want.
    If that happened,
  • 30:02 - 30:02
    what is the recurrence I get?
    Imagine it split it exactly in
  • 30:12 - 30:12
    the middle every time,
    then what happens?
  • 30:18 - 30:18
    You get T(n) = 2T(n/2) + order
    n for partitioning and
  • 30:27 - 30:27
    bookkeeping.
    And what is the solution of
  • 30:33 - 30:33
    that recurrence?
    That is n log n.
  • 30:37 - 30:37
    That is the same as the merge
    sort recurrence.
  • 30:42 - 30:42
    It is which case of the master
    theorem?
  • 30:47 - 30:47
    Case 2, right?
    Because n to the log base 2 of
  • 30:52 - 30:52
    2 is n to the 1,
    it is the same,
  • 30:55 - 30:55
    so we tack on the extra log n.
    Case 2 of the master theorem.
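    Spelling out that master-theorem step: in T(n) = 2T(n/2) + Theta(n) we
    have a = 2 and b = 2, so n^(log_b a) = n^(log_2 2) = n, and f(n) =
    Theta(n) matches it exactly, which is case 2; therefore
    T(n) = Theta(n lg n).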
  • 31:05 - 31:05
    That is pretty good.
    That says that in the best-case
  • 31:14 - 31:14
    quicksort is going to do well.
    How about let's suppose the
  • 31:24 - 31:24
    split is always let's say 1/10 :
    9/10, 1/10n :
  • 31:32 - 31:32
    9/10n.
    In that case,
  • 31:36 - 31:36
    are we lucky or are we unlucky?
  • 31:52 - 31:52
    I mean, if the split is really
    skewed, we clearly are going to
  • 31:57 - 31:57
    be unlucky, right,
    because then it's,
  • 32:01 - 32:01
    say, 1 to n.
    If it is really in the middle
  • 32:04 - 32:04
    it is n log n.
    What do you suppose it is if it
  • 32:08 - 32:08
    is 1/10 : 9/10?
    Is that lucky or unlucky?
  • 32:12 - 32:12
    We will have a little democracy
    here.
  • 32:14 - 32:14
    Who thinks that that is a lucky
    case?
  • 32:18 - 32:18
    It is going to be fast running
    time.
  • 32:20 - 32:20
    And who thinks it is an unlucky
    case?
  • 32:23 - 32:23
    OK, so we have some brave
    souls.
  • 32:26 - 32:26
    And who didn't vote?
    Oh, come on.
  • 32:30 - 32:30
    Come on.
    It is always better,
  • 32:32 - 32:32
    by the way, to say yes or no
    and be right or wrong,
  • 32:36 - 32:36
    because then you have some
    emotional commitment to it and
  • 32:40 - 32:40
    we will remember better,
    rather than just sitting and
  • 32:44 - 32:44
    being quiet.
    You don't manipulate your own
  • 32:47 - 32:47
    emotions well enough to remember
    things well.
  • 32:50 - 32:50
    Those people who voted win over
    the people who don't vote,
  • 32:55 - 32:55
    whether they are right or
    wrong.
  • 32:57 - 32:57
    Well, let's take a look.
    Here is the recurrence.
  • 33:03 - 33:03
    T(n) = T(1/10n) + T(9/10n) +
    Theta(n).
  • 33:08 - 33:08
    And we will assume that this
    part here is less than or equal
  • 33:15 - 33:15
    to some cn in order to analyze
    it.
  • 33:19 - 33:19
    We will just do a recursion
    tree for this and see.
  • 33:26 - 33:26
    Here is a recursion tree.
  • 33:35 - 33:35
    We have T(n) = cn,
    T(1/10n), T(9/10n).
  • 33:41 - 33:41
    Now we have again cn at the
    top.
  • 33:47 - 33:47
    This gets complicated,
    right?
  • 33:51 - 33:51
    This is 1/10cn.
    Now, over here we have 1/10.
  • 34:00 - 34:00
    And then we are plugging it
    into the recursion again,
  • 34:10 - 34:10
    so we now get T(1/100n) and
    over here we get T(9/100n).
  • 34:20 - 34:20
    And over here we have now
    9/10cn.
  • 34:26 - 34:26
    And that gives us T(9/100n)
    again.
  • 34:34 - 34:34
    And here we get T(81/100n).
    And we keep going on.
  • 34:48 - 34:48
    That is equal to cn,
    1/10cn here.
  • 34:59 - 34:59
    Down this way we have 1/100cn.
    And that keeps going down until
  • 35:08 - 35:08
    we get to order 1 down here.
    And over here we have 9/10cn.
  • 35:17 - 35:17
    And here, let's see,
    this is 9/100cn and this is now
  • 35:25 - 35:25
    9/100cn and this is 81/100cn.
    And these things keep going
  • 35:33 - 35:33
    down until they get down to
    order 1.
  • 35:37 - 35:37
    But the leaves are not all at
    uniform depth here,
  • 35:42 - 35:42
    right?
    This side is way further up
  • 35:45 - 35:45
    than this side,
    right?
  • 35:48 - 35:48
    Because here we are only going
    down by 9/10 each time.
  • 35:53 - 35:53
    So, in fact,
    what is the length of this path
  • 35:58 - 35:58
    here?
    What is the length of this path
  • 36:02 - 36:02
    down to this,
    if I take the left most spine?
  • 36:15 - 36:15
    Somebody raise their hand.
    Yeah?
  • 36:24 - 36:24
    Log base 10 of n.
    Because I am basically cutting
  • 36:31 - 36:31
    down by a factor of 10 each
    time.
  • 36:33 - 36:33
    And how long does it take me to
    reduce it to 1?
  • 36:36 - 36:36
    That is the definition,
    if you will,
  • 36:38 - 36:38
    of what a log is,
    log base 10.
  • 36:40 - 36:40
    What is this one?
    What is this path going that
  • 36:43 - 36:43
    way?
  • 36:53 - 36:53
    Log of n. Log base 10/9 of n.
    Because we're going down by
  • 37:02 - 37:02
    9/10 each time.
    Once again, essentially the
  • 37:06 - 37:06
    definition of a log.
    And everything in between there
  • 37:10 - 37:10
    is somewhere between log base 10
    of n and log base 10/9 of n.
  • 37:15 - 37:15
    So, everything is in between
    there.
  • 37:17 - 37:17
    Now what I can do is do the
    trick that we did for mergesort
  • 37:22 - 37:22
    in looking at what the
    evaluation of this is by adding
  • 37:27 - 37:27
    up what is the cost of the total
    level.
  • 37:31 - 37:31
    That is just cn.
    What is the cost of the next
  • 37:37 - 37:37
    level?
    cn.
  • 37:38 - 37:38
    And what is the cost of the
    next level?
  • 37:43 - 37:43
    cn.
    Every level we are still doing
  • 37:48 - 37:48
    the same amount of work.
    And we take that all the way
  • 37:55 - 37:55
    down.
    And the last levels --
  • 38:00 - 38:00
    Eventually we hit some point
    where it is not equal to cn
  • 38:05 - 38:05
    where we start getting things
    that are less than or equal to
  • 38:09 - 38:09
    cn because some of the leaves
    start dropping out starting at
  • 38:14 - 38:14
    this level.
    Basically this part is going to
  • 38:18 - 38:18
    be log base 10 of n,
    and then we start getting
  • 38:21 - 38:21
    things that are less than or
    equal to cn, and so forth,
  • 38:26 - 38:26
    until finally we get to add it
    all up.
  • 38:30 - 38:30
    T(n) is going to be less than
    or equal to cn times,
  • 38:36 - 38:36
    well, what is the longest that
    this could possibly be?
  • 38:43 - 38:43
    Log base 10/9 of n.
    Plus we have all of the leaves
  • 38:50 - 38:50
    that we have to add in,
    but all the leaves together add
  • 38:57 - 38:57
    up to just order n.
    All the leaves add up to order
  • 39:03 - 39:03
    n, so we have + Theta(n).
    And so this is how much?
  • 39:08 - 39:08
    If I add all of this together,
    what is this asymptotically?
  • 39:14 - 39:14
    That is n log n.
    So, T(n) is actually bounded by
  • 39:19 - 39:19
    n log n.
    We are lucky.
  • 39:21 - 39:21
    Those people who guessed lucky
    were right.
  • 39:25 - 39:25
    A 1/10 : 9/10 split is
    asymptotically as good as a 50 :
  • 39:31 - 39:31
    50 split.
    And, in fact,
  • 39:34 - 39:34
    we can lower bound this by just
    looking at these things here and
  • 39:41 - 39:41
    discover that,
    in fact, T(n) is lower bounded
  • 39:46 - 39:46
    by cn log base 10 of n + order
    n.
  • 39:49 - 39:49
    And so T(n) is lower bounded by
    also asymptotically n log n.
  • 39:55 - 39:55
    So, T(n) is actually Theta(n lg
    n).
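    Collecting the level-by-level argument: every level of the recursion
    tree costs at most cn, and the tree has height at most log_{10/9} n, so

        T(n) <= cn log_{10/9} n + Theta(n) = O(n lg n),

    while the complete levels near the top already give

        T(n) >= cn log_10 n + Theta(n),

    which is Omega(n lg n); together, T(n) = Theta(n lg n).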
  • 40:00 - 40:00
    Now, this is not really proof.
    I generally recommend that you
  • 40:04 - 40:04
    don't do this kind of thing to
    do a proof.
  • 40:08 - 40:08
    This is a good intuition of a
    recursion tree.
  • 40:11 - 40:11
    The way you prove this is what?
    Substitution method.
  • 40:15 - 40:15
    Good.
    What you do is use this to get
  • 40:17 - 40:17
    your guess and then use
    substitution method to prove
  • 40:21 - 40:21
    that your guess is right.
    It is too easy to make mistakes
  • 40:25 - 40:25
    with this method.
    It is very easy to make
  • 40:28 - 40:28
    mistakes.
    With the substitution method it
  • 40:33 - 40:33
    is harder to make mistakes
    because there is just algebra
  • 40:37 - 40:37
    there that you are cranking
    through.
  • 40:40 - 40:40
    It is easier to verify rather
    than dot, dot,
  • 40:44 - 40:44
    dots and trees that you drew
    improperly and wrote in wrong
  • 40:48 - 40:48
    amounts and so forth.
    OK?
  • 40:50 - 40:50
    So, this is n log n.
    That's pretty good.
  • 40:54 - 40:54
    It is order n log n.
    And we are lucky.
  • 40:57 - 40:57
    Now let's try another one.
    This is all for intuition
  • 41:02 - 41:02
    because, I will tell you,
    by the time we get to the end
  • 41:06 - 41:06
    of this class you folks are
    going to be bolting for the door
  • 41:10 - 41:10
    because we are going to do some
    good math today,
  • 41:14 - 41:14
    actually.
    It is actually fun math,
  • 41:16 - 41:16
    I think, but it is challenging.
    If you are not awake,
  • 41:20 - 41:20
    you can still sleep now,
    but I will tell you when to
  • 41:24 - 41:24
    wake up.
    One more bit of intuition.
  • 41:26 - 41:26
    Suppose that we alternate
    steps.
  • 41:30 - 41:30
    Suppose we do the partitioning
    thing.
  • 41:33 - 41:33
    And it happens that we start
    out lucky and then we have a
  • 41:39 - 41:39
    partitioning step that is
    unlucky and then we have a step
  • 41:44 - 41:44
    that is lucky and a step that is
    unlucky and we do that all the
  • 41:50 - 41:50
    way down the tree.
    Suppose we alternate.
  • 42:07 - 42:07
    Are we lucky or unlucky if we
    do that?
  • 42:09 - 42:09
    This time I want everybody
    voting.
  • 42:11 - 42:11
    It doesn't matter what your
    answer is.
  • 42:14 - 42:14
    Everybody has to have a stake
    in the game.
  • 42:16 - 42:16
    It is sort of like horseracing.
    If ever you have watched
  • 42:20 - 42:20
    horseracing, it is really
    boring, but if you put a little
  • 42:23 - 42:23
    bit of money down,
    a little skin in the game
  • 42:26 - 42:26
    suddenly it is interesting.
    The same thing here.
  • 42:30 - 42:30
    I want everybody to put some
    skin in the game.
  • 42:34 - 42:34
    Who thinks that this is going
    to be lucky?
  • 42:38 - 42:38
    Who thinks it is going to be
    unlucky?
  • 42:41 - 42:41
    OK.
    Who didn't vote?
  • 42:43 - 42:43
    [LAUGHTER] You guys.
    No skin in the game,
  • 42:46 - 42:46
    ha?
    Let's analyze this so we can
  • 42:49 - 42:49
    once again write a recurrence.
    On the lucky step,
  • 42:53 - 42:53
    we will have L(n) be the
    running time on a lucky step of
  • 42:58 - 42:58
    size n.
    And that is going to be twice.
  • 43:03 - 43:03
    While the next step is going to
    be unlucky.
  • 43:07 - 43:07
    It is two unluckies of n over 2 plus
    order n.
  • 43:11 - 43:11
    That is our lucky step.
    And then for the unlucky step
  • 43:16 - 43:16
    it is essentially going to be L
    of n minus 1,
  • 43:20 - 43:20
    it is going to be lucky on the
    next step, plus order n.
  • 43:25 - 43:25
    That is unlucky.
    See how I have described this
  • 43:31 - 43:31
    behavior with a system now of
    recurrences that are dependent
  • 43:36 - 43:36
    where the boundary cases,
    once again unstated,
  • 43:41 - 43:41
    are that the recurrences have a
    constant solution with constant
  • 43:47 - 43:47
    input.
    Now we just do a little bit of
  • 43:50 - 43:50
    algebra using substitution.
    L(n) is then equal to,
  • 43:55 - 43:55
    well, I can just plug in,
    for U(n/2) plug in the value of
  • 44:00 - 44:00
    U(n/2).
    And that gives me 2[L(n/2-1) +
  • 44:06 - 44:06
    Theta(n)] + Theta(n).
    See what I did here?
  • 44:12 - 44:12
    I simply plugged in,
    for U(n/2), this recurrence.
  • 44:19 - 44:19
    In fact, technically I guess I
    should have said Theta(n/2) just
  • 44:27 - 44:27
    to make this substitution more
    straightforward.
  • 44:35 - 44:35
    It is the same thing,
    but just to not skip a step.
  • 44:42 - 44:42
    That we can now crank through.
    And that is 2L(n/2 - 1) +,
  • 44:49 - 44:49
    and now I have two T(n/2) plus
    another one, so all of that is
  • 44:57 - 44:57
    just order n.
    And what is the solution to
  • 45:04 - 45:04
    that recurrence?
    n log n.
  • 45:08 - 45:08
    Theta(n lg n).
    Does everybody see that?
  • 45:13 - 45:13
    OK?
    Theta(n lg n).
  • 45:16 - 45:16
    This is basically just,
    once again, master theorem with
  • 45:24 - 45:24
    a little bit of jiggering here.
    That minus one is only going to
  • 45:33 - 45:33
    help us, actually,
    in the solution of the master
  • 45:38 - 45:38
    theorem.
    So, it is order n lg n.
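    The substitution just performed, collected in one place:

        L(n) = 2U(n/2) + Theta(n)                    (lucky step)
        U(n) = L(n-1) + Theta(n)                     (unlucky step)
        L(n) = 2[L(n/2 - 1) + Theta(n/2)] + Theta(n)
             = 2L(n/2 - 1) + Theta(n)
             = Theta(n lg n)                         (master theorem; the -1 only helps)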
  • 45:42 - 45:42
    We are lucky.
    If we alternate lucky and
  • 45:46 - 45:46
    unlucky, we are lucky.
    How can we ensure that we are
  • 45:51 - 45:51
    usually lucky?
    If I have the input already
  • 45:55 - 45:55
    sorted, I am going to be
    unlucky.
  • 46:00 - 46:00
    Excuse me?
    You could randomly arrange the
  • 46:03 - 46:03
    elements, that is one way.
    What is another way?
  • 46:08 - 46:08
    That is a perfectly good way,
    actually.
  • 46:11 - 46:11
    In fact, it is a common thing
    to do.
  • 46:14 - 46:14
    Randomly choose the pivot,
    OK.
  • 46:16 - 46:16
    It turns out those are
    effectively equivalent,
  • 46:20 - 46:20
    but we are going to do the
    randomly choose the pivot
  • 46:25 - 46:25
    because it is a little bit
    easier to analyze.
  • 46:30 - 46:30
    But they are effectively
    equivalent.
  • 46:33 - 46:33
    That gives us the algorithm
    called randomized quicksort.
  • 46:46 - 46:46
    And the nice thing about
    randomized quicksort is that the
  • 46:53 - 46:53
    running time is independent of
    the input ordering.
  • 47:00 - 47:00
    Very much for the same reason
    that if I just scramble the
  • 47:04 - 47:04
    input, it would be independent
    of the input ordering.
  • 47:08 - 47:08
    If I randomly scramble the
    input then it doesn't matter
  • 47:13 - 47:13
    what the order of the input was.
    Whereas, original quicksort has
  • 47:18 - 47:18
    some slow cases,
    input sorted or reverse sorted,
  • 47:21 - 47:21
    and some fast cases.
    In particular,
  • 47:24 - 47:24
    it turns out that if it is
    random it is going to be pretty
  • 47:29 - 47:29
    fast.
    If I actually randomly scramble
  • 47:32 - 47:32
    the input or pivot on a random
    element, it doesn't matter what
  • 47:36 - 47:36
    the input was.
    One way of thinking about this
  • 47:38 - 47:38
    is with an adversary.
    Imagine your adversary,
  • 47:41 - 47:41
    you are saying I have a good
    sorting algorithm and he says I
  • 47:44 - 47:44
    have a good sorting algorithm
    and you're trying to sell to a
  • 47:48 - 47:48
    single customer.
    And the customer says OK,
  • 47:50 - 47:50
    you guys come up with
    benchmarks for each of your
  • 47:53 - 47:53
    algorithms.
    And you get to look at his
  • 47:55 - 47:55
    algorithm.
    Well, you look and you say oh,
  • 47:59 - 47:59
    he is using quicksort.
    I will just give him something
  • 48:03 - 48:03
    that is already sorted.
    That is what you could do to
  • 48:07 - 48:07
    him.
    If you had quicksort,
  • 48:08 - 48:08
    he would do the same thing to
    you.
  • 48:11 - 48:11
    So, how can you defeat him?
    Well, one way is with
  • 48:14 - 48:14
    randomization.
    Big idea in computer science,
  • 48:17 - 48:17
    use random numbers.
    The idea here is if I permute
  • 48:21 - 48:21
    the ordering at random,
    as one suggestion,
  • 48:24 - 48:24
    or I pivot at random places,
    then the input ordering didn't
  • 48:28 - 48:28
    matter.
    And so there is no bad ordering
  • 48:32 - 48:32
    that he can provide that is
    going to make my code run
  • 48:37 - 48:37
    slowly.
    Now, I might get unlucky.
  • 48:39 - 48:39
    But that is just unlucky in my
    use of my random-number
  • 48:43 - 48:43
    generator.
    It is not unlucky with respect
  • 48:46 - 48:46
    to what the input was.
    What the input was doesn't
  • 48:50 - 48:50
    matter.
    Everybody follow that?
  • 48:53 - 48:53
    OK.
    The nice thing about randomized
  • 48:55 - 48:55
    quicksort is that it makes no
    assumptions about the input
  • 49:00 - 49:00
    distribution.
    You don't have to assume that
  • 49:06 - 49:06
    all inputs are equally likely
    because either you can make it
  • 49:13 - 49:13
    that way or you pivot in a way
    that makes that effectively
  • 49:20 - 49:20
    whole.
    And, in particular,
  • 49:24 - 49:24
    there is no specific input that
    can elicit the worst-case
  • 49:31 - 49:31
    behavior.
    The worst-case is determined
  • 49:45 - 49:45
    only by a random-number
    generator.
  • 50:00 - 50:00
    And, therefore,
    since it is only determined by
  • 50:04 - 50:04
    a random-number generator,
    we can essentially bound the
  • 50:09 - 50:09
    unluckiness mathematically.
    We can say what are the odds?
  • 50:14 - 50:14
    So, we are going to analyze
    this.
  • 50:16 - 50:16
    And this is where you know if
    you belong in this course or
  • 50:22 - 50:22
    not.
    If you have skipped 6.042 or
  • 50:24 - 50:24
    whatever, this is a good place
    to do the comparison.
  • 50:30 - 50:30
    Since it is going to be a
    little bit, why don't people
  • 50:34 - 50:34
    just stand up for a moment and
    take a stretch break.
  • 50:38 - 50:38
    Since this is going to be a
    nice piece of mathematics we are
  • 50:42 - 50:42
    going to do, you are going to
    want to feel fresh for it.
  • 51:02 - 51:02
    Stretch break is over.
  • 51:10 - 51:10
    Analysis.
    Good.
  • 51:12 - 51:12
    I think we are going to make
    this.
  • 51:16 - 51:16
    I am sort of racing.
    There is a lot of stuff to
  • 51:23 - 51:23
    cover today.
    Good.
  • 51:25 - 51:25
    Let's let T(n) now be the
    random variable for the running
  • 51:32 - 51:32
    time assuming --
  • 51:46 - 51:46
    Wow.
    I didn't even write here what
  • 51:48 - 51:48
    we did here.
    So, we are going to pivot on a
  • 51:50 - 51:50
    random element.
  • 51:59 - 51:59
    That is the basic scheme we are
    going to do.
  • 52:02 - 52:02
    And the way I do that,
    by the way, is just in the code
  • 52:05 - 52:05
    for partition,
    rather than partitioning on the
  • 52:08 - 52:08
    first element,
    before I do the partition,
  • 52:10 - 52:10
    I just swap the first element
    with some other element in the
  • 52:14 - 52:14
    array chosen at random,
    perhaps itself.
  • 52:17 - 52:17
    So, they are all equally likely
    to be pivoted on.
  • 52:20 - 52:20
    And then just run the ordinary
    partition.
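    A minimal sketch of that randomization in Python, layered on the earlier
    partition sketch (random.randint is inclusive on both ends, so every
    element of A[p..q], including A[p] itself, is equally likely to become
    the pivot):

        import random

        def randomized_partition(A, p, q):
            k = random.randint(p, q)        # pick a random index in [p, q]
            A[p], A[k] = A[k], A[p]         # move that element to the front
            return partition(A, p, q)       # then run the ordinary partition

        def randomized_quicksort(A, p, q):
            if p >= q:
                return
            r = randomized_partition(A, p, q)
            randomized_quicksort(A, p, r - 1)
            randomized_quicksort(A, r + 1, q)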
  • 52:23 - 52:23
    This is a random variable for
    the running time assuming,
  • 52:28 - 52:28
    we have to make an assumption
    for doing probability,
  • 52:32 - 52:32
    the random numbers are
    independent.
  • 52:41 - 52:41
    So that when I pivot in one
    place, it is independent of how
  • 52:45 - 52:45
    I pivoted in some other place as
    I am running this algorithm.
  • 52:49 - 52:49
    Then, to analyze this,
    what I am going to do is I want
  • 52:52 - 52:52
    to know where we pivoted.
    For k = 0, 1,
  • 52:57 - 52:57
    ..., n-1, let's let,
    for a particular partition,
  • 53:08 - 53:08
    the random variable X_k = 1 if
    partition generates a k :
  • 53:22 - 53:22
    n-k-1 split,
    and 0 otherwise.
  • 53:35 - 53:35
    In the partition routine,
    I am picking a random element
  • 53:39 - 53:39
    to pivot on.
    And X_k is going to be my
  • 53:43 - 53:43
    random variable that is 1 if it
    generates a split that has k
  • 53:47 - 53:47
    elements on the left side and
    n-k-1 elements on the right side
  • 53:53 - 53:53
    of the pivot.
    The sum of those
  • 53:55 - 53:55
    two, of course, is n-1 because
    I also have the pivot.
  • 54:00 - 54:00
    And 0 otherwise.
    So, I now have n random
  • 54:04 - 54:04
    variables that I have defined
    associated with a single
  • 54:09 - 54:09
    partition where all of them are
    going to be zero except one of
  • 54:15 - 54:15
    them, whichever one happens to
    occur is going to have the value
  • 54:21 - 54:21
    1.
    This is called,
  • 54:23 - 54:23
    by the way.
    What is the name of this type
  • 54:27 - 54:27
    of random variable?
  • 54:37 - 54:37
    Bernoulli.
    Well, Bernoulli has other
  • 54:40 - 54:40
    assumptions.
    It is an indicator random
  • 54:43 - 54:43
    variable.
    It turns out it is Bernoulli,
  • 54:47 - 54:47
    but that's OK.
    It is an indicator random
  • 54:50 - 54:50
    variable.
    It just takes on the value of
  • 54:53 - 54:53
    0, 1.
    And Bernoulli random variables
  • 54:56 - 54:56
    are a particular type of
    indicator random variable.
  • 55:02 - 55:02
    Which it turns out these are.
    That is an indicator random
  • 55:06 - 55:06
    variable.
    Indicator random variables are
  • 55:10 - 55:10
    a good way when you are trying
    to understand what the sum of a
  • 55:14 - 55:14
    bunch of things is.
    It is a good way to break apart
  • 55:18 - 55:18
    your big random variables into
    smaller ones that can be
  • 55:23 - 55:23
    analyzed.
    Let's just take a look at this
  • 55:26 - 55:26
    indicator random variable.
    What is the expectation of X_k
  • 55:30 - 55:30
    equal to?
  • 55:42 - 55:42
    In other words,
    what is the probability that I
  • 55:46 - 55:46
    generate a k :
    n-k-1 split?
  • 56:00 - 56:00
    The expectation of X_k is, let's just
    write out what that means,
  • 56:05 - 56:05
    just to refresh people's
    memory.
  • 56:09 - 56:09
    That is 0 times the probability
    that X_k equals 0 plus 1 times
  • 56:16 - 56:16
    the probability that X_k equals
    1, which is equal,
  • 56:22 - 56:22
    well, that is all zero.
    That is just equal to the
  • 56:27 - 56:27
    probability that X_k equals 1.
    And that is a general property
  • 56:35 - 56:35
    of indicator random variables,
    is that their expectation is
  • 56:40 - 56:40
    the probability that they are 1.
    The nice thing about indicator
  • 56:46 - 56:46
    random variables is it directly
    connects the probability to the
  • 56:51 - 56:51
    expectation without any other
    terms going on.
  • 56:55 - 56:55
    What is the probability of X_k
    equals 1?
  • 56:59 - 56:59
    1/n.
    So, all splits are equally
  • 57:02 - 57:02
    likely.
    And I have n elements,
  • 57:05 - 57:05
    so each ones has a 1/n chance
    of being picked as the pivot.
  • 57:10 - 57:10
    And, once you pick the pivot,
    that determines what is on the
  • 57:16 - 57:16
    left and the right and so forth.
    So, it is 1/n.
  • 57:20 - 57:20
    Everybody with me so far?
    More or less?
  • 57:23 - 57:23
    OK.
    As I say, this is going to test
  • 57:26 - 57:26
    whether you're in the class.
    If you go home and you study
  • 57:32 - 57:32
    this and you cannot get it,
    and you have a deficiency in
  • 57:37 - 57:37
    your math background in trying
    to take the course,
  • 57:42 - 57:42
    this is a good indication that
    probably you have taken
  • 57:46 - 57:46
    something a little over your
    head.
  • 57:49 - 57:49
    Let's write out what T(n) is
    equal to here.
  • 58:00 - 58:00
    T(n) is going to be equal to
    T(0) + T(n-1) + Theta(n) if we
  • 58:12 - 58:12
    get a 0 : n-1 split and is equal
    to T(1) + T(n-2) + order n if we
  • 58:25 - 58:25
    have a 1 : n-2 split.
  • 58:35 - 58:35
    And now down here it is going
    to be T(n-1) + T(0) + Theta(n)
  • 58:46 - 58:46
    if we end up with an n-1 :
    0 split.
  • 58:52 - 58:52
    So, this is our recurrence for
    T(n).
  • 59:00 - 59:00
    And, unfortunately,
    the recurrence is kind of hairy
  • 59:04 - 59:04
    because it has got n cases.
    And this is,
  • 59:07 - 59:07
    once again, where the
    brilliance of being able to use
  • 59:11 - 59:11
    indicator random variables comes
    in.
  • 59:14 - 59:14
    Because we will be able to take
    this case analysis and reduce it
  • 59:19 - 59:19
    to mathematics so we don't have
    cases using indicator random
  • 59:23 - 59:23
    variables.
    And the way we do that is using
  • 59:30 - 59:30
    the following trick of
    converting the cases into a
  • 59:37 - 59:37
    summation.
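    The summation on the board, in the same notation:

        T(n) = sum_{k=0}^{n-1} X_k [ T(k) + T(n-k-1) + Theta(n) ]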
  • 59:55 - 59:55
    Let's just take a look at why
    these two things are the same.
  • 60:01 - 60:01
    The indicator random variable
    is zero, except if you get the
  • 60:06 - 60:06
    particular split.
    Therefore, this summation is
  • 60:11 - 60:11
    going to be zero,
    except for that k which
  • 60:14 - 60:14
    actually appeared in which case
    it is the value that we say it
  • 60:20 - 60:20
    is.
    See the trick using
  • 60:22 - 60:22
    multiplication by 0,
    1 variable to handle all the
  • 60:27 - 60:27
    cases?
    I think that is damn clever.
  • 60:31 - 60:31
    I think that is damn clever.
    And this is like the classic
  • 60:36 - 60:36
    thing that you do with indicator
    random variables.
  • 60:40 - 60:40
    It's one of the reasons they
    are a very powerful method.
  • 60:45 - 60:45
    Because now we actually have a
    mathematical expression,
  • 60:50 - 60:50
    hairy although it may be,
    for our recurrence.
  • 60:54 - 60:54
    Now, what we are going to
    analyze is the expected value of
  • 60:58 - 60:58
    T(n).
    That is what we want to do.
  • 61:03 - 61:03
    What is the expected value of
    T(n)?
  • 61:07 - 61:07
    To do that, I just write the
    expected value of T(n) is equal
  • 61:13 - 61:13
    to the expected value of this
    big summation.
  • 61:18 - 61:18
    And now we can go ahead and
    start to evaluate the expected
  • 61:24 - 61:24
    value of that summation.
    Everybody with me?
  • 61:30 - 61:30
    Yes?
    Any questions at this point?
  • 61:32 - 61:32
    I see a thumbs up.
    That's nice to see.
  • 61:35 - 61:35
    But I generally believe that
    what I want to see is no thumbs
  • 61:40 - 61:40
    down.
    It is good to see the thumbs
  • 61:43 - 61:43
    up, but that means one person
    understands, or thinks he
  • 61:47 - 61:47
    understands.
    [LAUGHTER] So,
  • 61:49 - 61:49
    this is, I claim,
    equal to the following.
  • 61:52 - 61:52
    Actually, I am going to need a
    little space here so I am going
  • 61:57 - 61:57
    to move the equal sign over a
    little bit.
  • 62:25 - 62:25
    I claim that summation is equal
    to that.
  • 62:28 - 62:28
    This expectation is equal to
    that summation of expectations.
  • 62:32 - 62:32
    Why is that?
    What are the magic words that
  • 62:36 - 62:36
    justify this step?
    Linearity of expectation.
  • 62:40 - 62:40
    The expectation of a sum is the
    sum of the expectations.
  • 62:46 - 62:46
    So, that is linearity of
    expectation.
  • 62:49 - 62:49
    I don't need independence for
    that.
  • 62:53 - 62:53
    That is just always true for
    expectation of any random
  • 62:58 - 62:58
    variables.
    The sum of the expectations is
  • 63:05 - 63:05
    the expectation of the sum and
    vice versa.
  • 63:11 - 63:11
    Here we did the vice versa.
    That is equal to now the sum of
  • 63:20 - 63:20
    k=0 to n-1 of expectation of X_k
    [T(k) + T(n-k-1) + Theta(n)].
  • 63:36 - 63:36
    Why is that true?
    What I have done is I've said
  • 63:42 - 63:42
    the expectation of the product
    is the product of the
  • 63:49 - 63:49
    expectations.
    That is because of
  • 63:53 - 63:53
    independence.
    What is independent of what?
  • 64:00 - 64:00
    The X_k here,
    random variable,
  • 64:02 - 64:02
    are independent of any of the
    other partitionings in,
  • 64:07 - 64:07
    if you will,
    the X_k that would exist for
  • 64:10 - 64:10
    any of the other recursive
    calls.
  • 64:13 - 64:13
    So, whatever happens in here is
    independent of what happened
  • 64:18 - 64:18
    there.
    We are actually hiding.
  • 64:21 - 64:21
    Since we have a recurrence,
    we are not partitioning the
  • 64:25 - 64:25
    same way each time.
    We have a different one.
  • 64:30 - 64:30
    We actually have something
    going on underneath the
  • 64:33 - 64:33
    mathematics you have to pay
    attention to that the
  • 64:36 - 64:36
    mathematics alone isn't really
    showing, which is that in T(k)
  • 64:40 - 64:40
    there is actually a set of
    random choices that are being
  • 64:44 - 64:44
    made, if you will.
    And so you have to understand
  • 64:47 - 64:47
    that those are independent of
    those, in which case we can
  • 64:50 - 64:50
    multiply their expectations.
  • 64:53 - 64:53
    Is everybody with me?
    That is a big one,
  • 64:56 - 64:56
    independence of X_k from other
    random choices.
  • 65:00 - 65:00
    That is equal to now,
    well, first of all,
  • 65:05 - 65:05
    this is nice.
    What is the expectation of X_k?
  • 65:10 - 65:10
    1/n.
    That actually doesn't even
  • 65:13 - 65:13
    belong in the summation.
    We will just pop it outside.
  • 65:20 - 65:20
    I get 1/n times the sum of k=0
    to n-1 of expectation of T(k) +
  • 65:36 - 65:36
    1/n summation k=0 to n-1 of
    expectation of T(n-k-1) + 1/n
  • 65:50 - 65:50
    summation k=0 to n-1 of
    Theta(n).
  • 66:00 - 66:00
    That is, again,
    using linearity of expectation
  • 66:05 - 66:05
    there this time to split up
    these pieces and just factoring
  • 66:11 - 66:11
    out the expectation of X_k as
    being 1/n.
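    The chain of steps just described, collected in one place:

        E[T(n)] = E[ sum_{k=0}^{n-1} X_k (T(k) + T(n-k-1) + Theta(n)) ]
                = sum_{k=0}^{n-1} E[ X_k (T(k) + T(n-k-1) + Theta(n)) ]    (linearity of expectation)
                = sum_{k=0}^{n-1} E[X_k] E[ T(k) + T(n-k-1) + Theta(n) ]   (independence)
                = (1/n) sum_{k=0}^{n-1} E[T(k)]
                  + (1/n) sum_{k=0}^{n-1} E[T(n-k-1)]
                  + (1/n) sum_{k=0}^{n-1} Theta(n)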
  • 66:15 - 66:15
    Everybody with me still?
    All of this is elementary.
  • 66:20 - 66:20
    It is just one of these things
    that is hard just because there
  • 66:27 - 66:27
    are so many steps.
    And it takes that you have seen
  • 66:33 - 66:33
    some of this before.
    Now the next observation is
  • 66:38 - 66:38
    that these two summations are,
    in fact, identical.
  • 66:43 - 66:43
    They are the same summation,
    just in a different order.
  • 66:49 - 66:49
    This is going T(0),
    T(1), T(2), T(3) up to T(n-1).
  • 66:54 - 66:54
    This one is going T(n-1),
    T(n-2), T(n-3) down to T(0).
  • 67:00 - 67:00
    These are, in fact,
    equal.
  • 67:02 - 67:02
    So, therefore,
    I have two of them.
  • 67:19 - 67:19
    And then what is this term
    equal to?
  • 67:37 - 67:37
    What is that one equal to?
    Theta(n).
  • 67:41 - 67:41
    Let's just see why.
    The summation of 0 :
  • 67:45 - 67:45
    n of Theta(n) is Theta(n^2)
    divided by n.
  • 67:50 - 67:50
    Or, if I want to bring the
    Theta(n) out,
  • 67:54 - 67:54
    I have Theta(n) times 1/n times the summation of
    k equals 1 to n of Theta(1), or of
  • 68:02 - 68:02
    1.
    So, once again,
  • 68:05 - 68:05
    you get n, either way of doing
    it.
  • 68:08 - 68:08
    This is, in some sense,
    because the summations have
  • 68:13 - 68:13
    identical terms,
    and this is just algebra.
  • 68:17 - 68:17
    Now what we are going to do is
    do something for technical
  • 68:23 - 68:23
    convenience.
    Because we are going to absorb
  • 68:30 - 68:30
    the k=0, 1 terms into the
    Theta(n) for technical
  • 68:38 - 68:38
    convenience.
  • 68:46 - 68:46
    We have a recurrence here where
    I have an order n.
  • 68:50 - 68:50
    And, if I look at the cases
    where k=0 or k=1,
  • 68:54 - 68:54
    I know what the expectation is.
    For 0, 1, the expected cost is
  • 69:00 - 69:00
    the worst case cost,
    which is constant.
  • 69:04 - 69:04
    Because I am only solving the
    problem for a constant size.
  • 69:08 - 69:08
    And we know that for any of the
    boundary cases that our solution
  • 69:12 - 69:12
    of recurrence,
    our assumption is that it is
  • 69:15 - 69:15
    constant time.
    So, I basically can just take
  • 69:18 - 69:18
    those two terms out.
    And all that does is accumulate
  • 69:22 - 69:22
    some more constant here in the
    Theta(n).
  • 69:24 - 69:24
    It is going to make the
    solution of the recurrence a
  • 69:28 - 69:28
    little bit easier.
    And, if I do that,
  • 69:33 - 69:33
    I get expectation of T(n) = 2/n
    summation k=2 to n-1 of
  • 69:44 - 69:44
    expectation of T(k) + Theta(n).
  • 70:00 - 70:00
    So, all of that work was to
    derive the recurrence.
  • 70:07 - 70:07
    And now we have to solve it.
    Just to review what we did,
  • 70:15 - 70:15
    we started out with a
    recurrence which was for the
  • 70:21 - 70:21
    random variable which involved a
    case statement.
  • 70:29 - 70:29
    We converted that into some
    mathematics without the case
  • 70:33 - 70:33
    statement, just with a product,
    and then we derived a
  • 70:37 - 70:37
    recurrence for the expectation.
    And now we are in the process
  • 70:42 - 70:42
    of trying to solve that
    recurrence.
  • 70:44 - 70:44
    We have done some
    simplification of the recurrence
  • 70:48 - 70:48
    so that we understand what it is
    that we are going to solve here.
  • 70:53 - 70:53
    By the way, I don't give things
    like this on quizzes.
  • 70:56 - 70:56
    I do expect you to understand
    it.
  • 71:00 - 71:00
    The elements of this you will
    find on a quiz.
  • 71:03 - 71:03
    This is a lot of work to figure
    out.
  • 71:06 - 71:06
    This took smart people to do.
    Even though it is all
  • 71:09 - 71:09
    elementary, working out
    something like this at the
  • 71:13 - 71:13
    elementary level is still a bit
    of work even for somebody who is
  • 71:17 - 71:17
    knowledgeable in this area.
    Now we are going to solve that
  • 71:23 - 71:23
    last recurrence over there and
    we are going to prove that the
  • 71:30 - 71:30
    expectation of T(n) is less than
    or equal to (an lg n) for some
  • 71:38 - 71:38
    constant a greater than 0.
    That is going to be what we are
  • 71:44 - 71:44
    going to do.
    And so what technique do you
  • 71:49 - 71:49
    think we should use to prove
    this?
  • 71:53 - 71:53
    Does this look like a master
    method?
  • 72:03 - 72:03
    It is nothing like the master
    method.
  • 72:09 - 72:09
    So, when in doubt do
    substitution.
  • 72:14 - 72:14
    It is the grand-daddy of all
    methods.
  • 72:19 - 72:19
    What we will do is solve the
    base case by simply choosing a
  • 72:28 - 72:28
    big enough so that (an lg n) is
    bigger than the expectation of
  • 72:38 - 72:38
    T(n) for sufficiently
    small n.
  • 72:45 - 72:45
    So, I just pick a to be big
    enough.
  • 72:49 - 72:49
    And this is,
    by the way, why I wanted to
  • 72:54 - 72:54
    exclude 0 and 1 from the
    recurrence.
  • 72:59 - 72:59
    Because, for example,
    when n=0, log of 0 is,
  • 73:03 - 73:03
    it's like dividing by 0,
    right, you cannot do it.
  • 73:08 - 73:08
    Log of 1 is 0.
    So here, even if I restricted
  • 73:13 - 73:13
    it to 1, here I would have a 0,
    and I can't ever pick a big
  • 73:18 - 73:18
    enough to dominate those cases.
    What I do is I just say look,
  • 73:24 - 73:24
    I just absorb whatever the cost
    is into the Theta(n) for technical
  • 73:31 - 73:31
    convenience.
    And that lets me just pick a in
  • 73:37 - 73:37
    (an lg n) to be big enough to
    handle the base case.
  • 73:44 - 73:44
    So, that is why we made that
    technical assumption.
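Concretely, handling the base cases of the induction only requires that, for the
finitely many small values n = 2, 3, ..., n_0 (whatever cutoff n_0 the inductive
step ends up needing; n_0 is not fixed here), we choose the constant a large
enough that

    a\,n\lg n \;\ge\; \mathbb{E}[T(n)],

which is possible because each such \mathbb{E}[T(n)] is a constant and n\lg n > 0
once n \ge 2.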
  • 73:51 - 73:51
    We are going to use a fact
    which is that the summation of
  • 73:59 - 73:59
    k=2 to n-1 of k lg k is less
    than or equal to 1/2n^2 lg n -
  • 74:07 - 74:07
    1/8n^2.
    I am going to leave that as an
  • 74:11 - 74:11
    exercise for you to work out.
    I think it is an exercise in
  • 74:15 - 74:15
    the book, too.
    I want you to go and evaluate
  • 74:19 - 74:19
    this.
    There are two ways to evaluate
  • 74:21 - 74:21
    it.
    One is by using purely
  • 74:23 - 74:23
    summations and facts about
    summations by splitting the
  • 74:27 - 74:27
    summation into two pieces and
    reconstituting it to prove this
  • 74:32 - 74:32
    bound.
    The other way is to use the
  • 74:35 - 74:35
    integral method for solving
    summations.
  • 74:39 - 74:39
    Either way you can prove it.
    The integral method actually
  • 74:43 - 74:43
    gets you a tighter bound than
    this.
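For reference, the summation-splitting route goes roughly as follows (a sketch;
checking the constants carefully is the exercise). Split the sum at n/2: terms
with k < n/2 have lg k <= lg n - 1, and the rest have lg k <= lg n, so

    \sum_{k=2}^{n-1} k\lg k
      \;\le\; (\lg n - 1)\sum_{k=2}^{\lceil n/2\rceil - 1} k
            \;+\; \lg n \sum_{k=\lceil n/2\rceil}^{n-1} k
      \;=\; \lg n \sum_{k=2}^{n-1} k \;-\; \sum_{k=2}^{\lceil n/2\rceil - 1} k
      \;\le\; \tfrac{1}{2}n^2\lg n \;-\; \tfrac{1}{8}n^2,

where the last step uses \sum_{k=2}^{n-1} k \le n^2/2 and
\sum_{k=2}^{\lceil n/2\rceil - 1} k \ge n^2/8 minus lower-order terms; the slack
in the first bound absorbs those once n is at least a small constant, and the
smallest cases can be checked directly.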
  • 74:46 - 74:46
    This is a basic fact,
    and you should go off and know
  • 74:51 - 74:51
    how to do that.
    Now let's do substitution.
  • 75:06 - 75:06
    The expectation of T(n) is less
    than or equal to 2/n times the
  • 75:19 - 75:19
    summation k=2 to n-1,
    now we do the substitution of
  • 75:29 - 75:29
    ak lg k, the smaller values plus
    Theta(n).
  • 75:39 - 75:39
    I might mention,
    by the way, that the hard part
  • 75:42 - 75:42
    of doing this,
    it is easy to get the bound
  • 75:45 - 75:45
    without this term,
    it is easy to get this bound,
  • 75:49 - 75:49
    1/2n^2 lg n,
    it is harder to get the second
  • 75:52 - 75:52
    order term.
    It turns out you need the
  • 75:54 - 75:54
    second order term in order to do
    what we are going to do.
  • 75:59 - 75:59
    You have to be able to subtract
    a quadratic amount off the n^2 lg
  • 76:06 - 76:06
    n in order to make this proof
    work.
  • 76:09 - 76:09
    And that is the trickier part
    in evaluating that summation.
  • 76:15 - 76:15
    So, we get this.
    That is less than or equal to?
  • 76:20 - 76:20
    Well, I happen to know how much
    this is by using that formula.
  • 76:27 - 76:27
    I use my fact and get 2a/n
    (1/2n^2 lg n - 1/8n^2) +
  • 76:32 - 76:32
    Theta(n).
    Did I do something wrong?
  • 76:44 - 76:44
    There we go.
    Very good.
  • 76:52 - 76:52
    That is equal to -
    If I multiply this first part
  • 77:05 - 77:05
    through, that is an lg n.
    And now, so I don't make a
  • 77:14 - 77:14
    mistake, I want to express this
    as my desired,
  • 77:23 - 77:23
    this is what I want it to be,
    minus a residual.
  • 77:32 - 77:32
    I am going to write the
    residual as this part.
  • 77:39 - 77:39
    And so, the way to write that
    is, that is going to be minus.
  • 77:47 - 77:47
    And then it is going to be this
    term here, which is going to be
  • 77:57 - 77:57
    an/4 - Theta(n).
  • 78:05 - 78:05
    And that is going to be less
    than or equal to an lg n if this
  • 78:12 - 78:12
    part is positive.
    And I can make that part
  • 78:17 - 78:17
    positive by picking a big enough
    such that an/4 dominates the
  • 78:25 - 78:25
    constant in the Theta(n) here.
    Whatever the constant is here,
  • 78:35 - 78:35
    I can find an a that is big
    enough so that this term makes
  • 78:45 - 78:45
    this part positive.
    If a is big enough so that an/4
  • 78:55 - 78:55
    dominates Theta(n).
    And so the running time of
  • 79:01 - 79:01
    randomized quicksort is order n
    lg n.
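The whole substitution, collected into one chain, is

    \mathbb{E}[T(n)]
      \;\le\; \frac{2}{n}\sum_{k=2}^{n-1} a\,k\lg k \;+\; \Theta(n)
      \;\le\; \frac{2a}{n}\Bigl(\tfrac{1}{2}n^2\lg n - \tfrac{1}{8}n^2\Bigr) \;+\; \Theta(n)
      \;=\; a\,n\lg n \;-\; \Bigl(\tfrac{an}{4} - \Theta(n)\Bigr)
      \;\le\; a\,n\lg n,

with the last step holding once a is large enough that an/4 dominates the
constant hidden in the Theta(n).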
  • 79:04 - 79:04
    That is what we just proved,
    the expected running time is
  • 79:09 - 79:09
    order n lg n.
    Now, in practice,
  • 79:12 - 79:12
    quicksort is a great algorithm.
    It is typically three or more
  • 79:17 - 79:17
    times faster than mergesort.
    It doesn't necessarily give you
  • 79:22 - 79:22
    the strong guarantee of
    mergesort of being worst-case n
  • 79:26 - 79:26
    lg n.
    But in practice,
  • 79:29 - 79:29
    if you use randomized
    quicksort, it is generally as
  • 79:33 - 79:33
    much as three times faster.
    It does require code tuning in
  • 79:38 - 79:38
    order to get it up to be that
    fast.
  • 79:40 - 79:40
    You do have to go and coarsen
    the base cases and do some other
  • 79:45 - 79:45
    tricks there,
    but most good sorting
  • 79:48 - 79:48
    algorithms that you will find
    are based on quicksort.
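As a rough illustration of the kind of tuning being described, here is a sketch
of a randomized quicksort with a coarsened base case (the cutoff value 16 and
the helper names are just placeholders; real implementations tune these
experimentally):

    import random

    CUTOFF = 16  # size below which we stop recursing; best value is machine-dependent


    def quicksort(a, lo=0, hi=None):
        """Sort a[lo..hi] in place with randomized quicksort."""
        if hi is None:
            hi = len(a) - 1
        while hi - lo + 1 > CUTOFF:
            p = partition(a, lo, hi)
            # Recurse on the smaller side, iterate on the larger side,
            # which keeps the stack depth O(lg n).
            if p - lo < hi - p:
                quicksort(a, lo, p - 1)
                lo = p + 1
            else:
                quicksort(a, p + 1, hi)
                hi = p - 1
        insertion_sort(a, lo, hi)  # small subarrays go to insertion sort


    def partition(a, lo, hi):
        """Partition a[lo..hi] around a uniformly random pivot; return its index."""
        r = random.randint(lo, hi)
        a[r], a[hi] = a[hi], a[r]      # move the random pivot to the end
        x = a[hi]
        i = lo - 1
        for j in range(lo, hi):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[hi] = a[hi], a[i + 1]
        return i + 1


    def insertion_sort(a, lo, hi):
        """Insertion sort on the slice a[lo..hi], used for the coarsened base case."""
        for i in range(lo + 1, hi + 1):
            key = a[i]
            j = i - 1
            while j >= lo and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

Called as quicksort(data), it sorts the Python list data in place; the expected
number of comparisons is the Theta(n lg n) just derived, whichever pivots the
random number generator happens to choose.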
  • 79:52 - 79:52
    Also one of the other reasons
    it works well is because it
  • 79:56 - 79:56
    tends to work well with caches
    and virtual memory.
  • 80:01 - 80:01
    We are not really talking much
    about caching models and so
  • 80:05 - 80:05
    forth, big topic these days in
    algorithms, but it does work
  • 80:10 - 80:10
    very well with caches and virtual
    memory.
  • 80:13 - 80:13
    It is another reason that this
    is a good algorithm to use.
  • 80:17 - 80:17
    Good recitation,
    by the way, on Friday.
  • 80:20 - 80:20
    We are going to see another n
    log n time algorithm,
  • 80:24 - 80:24
    a very important one in
    recitation on Friday.
Title:
Lec 4 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005
Description:

Lecture 04: Quicksort, Randomized Algorithms

View the complete course at: http://ocw.mit.edu/6-046JF05

License: Creative Commons BY-NC-SA

More information at http://ocw.mit.edu/terms

More courses at http://ocw.mit.edu

Duration:
01:20:33
