
Lec 6 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005

  • 0:07 - 0:07
    Today we're going to not talk
    about sorting.
  • 0:11 - 0:11
    This is an exciting new
    development.
  • 0:14 - 0:14
    We're going to talk about
    another problem,
  • 0:18 - 0:18
    a related problem,
    but a different problem.
  • 0:35 - 0:35
    We're going to talk about
    another problem that we would
  • 0:38 - 0:38
    like to solve in linear time.
    Last class we talked about how we
  • 0:42 - 0:42
    could do sorting in linear time.
    To do that we needed some
  • 0:45 - 0:45
    additional assumptions.
    Today we're going to look at a
  • 0:48 - 0:48
    problem that really only needs
    linear time, even though at
  • 0:51 - 0:51
    first glance it might look like
    it requires sorting.
  • 0:54 - 0:54
    So this is going to be an
    easier problem.
  • 0:57 - 0:57
    The problem is I give you a
    bunch of numbers.
  • 1:00 - 1:00
    Let's call them elements.
    And they are in some array,
  • 1:07 - 1:07
    let's say.
    And they're in no particular
  • 1:12 - 1:12
    order, so unsorted.
    I want to find the kth smallest
  • 1:18 - 1:18
    element.
  • 1:26 - 1:26
    This is called the element of
    rank k.
  • 1:37 - 1:37
    In other words,
    I have this list of numbers
  • 1:40 - 1:40
    which is unsorted.
    And, if I were to sort it,
  • 1:43 - 1:43
    I would like to know what the
    kth element is.
  • 1:46 - 1:46
    But I'm not allowed to sort it.
    One solution to this problem,
  • 1:50 - 1:50
    this is the naïve algorithm,
    is you just sort and then
  • 1:54 - 1:54
    return the kth element.
    This is another possible
  • 1:57 - 1:57
    definition of the problem.
    And we would like to do better
  • 2:03 - 2:03
    than that.
    So you could sort,
  • 2:06 - 2:06
    let's call the array A,
    and then return A[k].
  • 2:11 - 2:11
    That is one thing we could do.
    And if we use heap sort or
  • 2:16 - 2:16
    mergesort, this will take n lg n
    time.
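    For reference, here is the naive approach as a minimal Python
    sketch (the function name is ours, not from the lecture):

        # Naive selection: sort, then index -- Theta(n lg n) time.
        def naive_select(A, k):
            """Return the k-th smallest element of A (k is 1-based)."""
            return sorted(A)[k - 1]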
  • 2:20 - 2:20
    We would like to do better than
    n lg n.
  • 2:24 - 2:24
    Ideally linear time.
    The problem is pretty natural,
  • 2:29 - 2:29
    straightforward.
    It has various applications.
  • 2:34 - 2:34
    Depending on how you choose k,
    k could be any number between 1
  • 2:40 - 2:40
    and n.
    For example,
  • 2:41 - 2:41
    if we choose k=1 that element
    has a name.
  • 2:45 - 2:45
    Any suggestions of what the
    name is?
  • 2:48 - 2:48
    The minimum.
    That's easy.
  • 2:50 - 2:50
    Any suggestions on how we could
    find the minimum element in an
  • 2:55 - 2:55
    array in linear time?
    Right.
  • 2:59 - 2:59
    Just scan through the array.
    Keep track of what the smallest
  • 3:04 - 3:04
    number is that you've seen.
    The same thing with the
  • 3:09 - 3:09
    maximum, k=n.
    These are rather trivial.
  • 3:12 - 3:12
    But a more interesting version
    of the order statistic problem
  • 3:17 - 3:17
    is to find the median.
    This is either k equals n plus
  • 3:22 - 3:22
    1 over 2 floor or ceiling.
    I will call both of those
  • 3:26 - 3:26
    elements medians.
  • 3:34 - 3:34
    Finding the median of an
    unsorted array in linear time is
  • 3:37 - 3:37
    quite tricky.
    And that sort of is the main
  • 3:39 - 3:39
    goal of this lecture,
    is to be able to find the
  • 3:42 - 3:42
    medians.
    For free we're going to be able
  • 3:44 - 3:44
    to find the arbitrary kth
    smallest element,
  • 3:46 - 3:46
    but typically we're most
    interested in finding the
  • 3:49 - 3:49
    median.
    And on Friday in recitation
  • 3:51 - 3:51
    you'll see why that is so
    useful.
  • 3:52 - 3:52
    There are all sorts of
    situations where you can use
  • 3:55 - 3:55
    median for really effective
    divide-and-conquer without
  • 3:58 - 3:58
    having to sort.
    You can solve a lot of problems
  • 4:03 - 4:03
    in linear time as a result.
    And we're going to cover today
  • 4:07 - 4:07
    two algorithms for finding order
    statistics.
  • 4:11 - 4:11
    Both of them are linear time.
    The first one is randomized,
  • 4:15 - 4:15
    so it's only linear expected
    time.
  • 4:18 - 4:18
    And the second one is
    worst-case linear time,
  • 4:22 - 4:22
    and it will build on the
    randomized version.
  • 4:25 - 4:25
    Let's start with a randomized
    divide-and-conquer algorithm.
  • 4:46 - 4:46
    This algorithm is called
    rand-select.
  • 5:02 - 5:02
    And the parameters are a little
    bit more than what we're used
  • 5:06 - 5:06
    to.
    The order statistics problem
  • 5:09 - 5:09
    you're given an array A.
    And here I've changed notation
  • 5:13 - 5:13
    and I'm looking for the ith
    smallest element,
  • 5:16 - 5:16
    so i is the index I'm looking
    for.
  • 5:18 - 5:18
    And I'm also going to change
    the problem a little bit.
  • 5:22 - 5:22
    And instead of trying to find
    it in the whole array,
  • 5:26 - 5:26
    I'm going to look in a
    particular interval of the
  • 5:29 - 5:29
    array, A from p up to q.
    We're going to need that for a
  • 5:34 - 5:34
    recursion.
    This better be a recursive
  • 5:36 - 5:36
    algorithm because we're using
    divide-and-conquer.
  • 5:39 - 5:39
    Here is the algorithm.
  • 5:51 - 5:51
    With a base case.
    It's pretty simple.
  • 5:55 - 5:55
    Then we're going to use part of
    the quicksort algorithm,
  • 6:00 - 6:00
    randomized quicksort.
  • 6:09 - 6:09
    We didn't actually define this
    subroutine two lectures ago,
  • 6:13 - 6:13
    but you should know what it
    does, especially if you've read
  • 6:18 - 6:18
    the textbook.
    This says in the array A[p...q]
  • 6:21 - 6:21
    pick a random element,
    so pick a random index between
  • 6:25 - 6:25
    p and q, swap it with the first
    element, then call partition.
  • 6:30 - 6:30
    And partition uses that first
    element to split the rest of the
  • 6:35 - 6:35
    array into less than or equal to
    that random partition and
  • 6:39 - 6:39
    greater than or equal to that
    partition.
  • 6:43 - 6:43
    This is just picking a random
    partition element between p and
  • 6:47 - 6:47
    q, cutting the array in half,
    although the two sizes may not
  • 6:52 - 6:52
    be equal.
    And it returns the index of
  • 6:55 - 6:55
    that partition element,
    some number between p and q.
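    As a sketch, one standard way this subroutine could look in
    Python; the details (a Lomuto-style partition) are our guess,
    not necessarily the textbook's exact pseudocode:

        import random

        def rand_partition(A, p, q):
            """Partition A[p..q] (inclusive) around a uniformly random
            pivot and return the pivot's final index."""
            s = random.randint(p, q)     # random index between p and q
            A[p], A[s] = A[s], A[p]      # swap the pivot to the front
            pivot = A[p]
            i = p                        # invariant: A[p+1..i] <= pivot
            for j in range(p + 1, q + 1):
                if A[j] <= pivot:
                    i += 1
                    A[i], A[j] = A[j], A[i]
            A[p], A[i] = A[i], A[p]      # place the pivot between the sides
            return i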
  • 7:00 - 7:00
    And we're going to define k to
    be this particular value,
  • 7:08 - 7:08
    r minus p plus 1.
    And the reason for that is that
  • 7:15 - 7:15
    k is then the rank of the
    partition element.
  • 7:22 - 7:22
    This is in A[p...q].
    Let me draw a picture here.
  • 7:30 - 7:30
    We have our array A.
    It starts at p and ends at q.
  • 7:34 - 7:34
    There is other stuff,
    but for this recursive call all we
  • 7:39 - 7:39
    care about is p up to q.
    We pick a random partition
  • 7:43 - 7:43
    element, say this one,
    and we partition things so that
  • 7:47 - 7:47
    everything in here,
    let's call this r,
  • 7:51 - 7:51
    is less than or equal to A[r]
    and everything up here is
  • 7:55 - 7:55
    greater than or equal to A[r].
    And A[r] is our partition
  • 8:00 - 8:00
    element.
    After this call,
  • 8:03 - 8:03
    that's what the array looks
    like.
  • 8:06 - 8:06
    And we get r.
    We get the index of where
  • 8:10 - 8:10
    partition element is stored.
    The number of elements that are
  • 8:15 - 8:15
    less than or equal to A[r],
    including A[r] itself, is r minus p plus 1.
  • 8:20 - 8:20
    There will be r minus p
    elements here,
  • 8:24 - 8:24
    and we're adding 1 to get this
    element.
  • 8:28 - 8:28
    And, if you start counting at
    1, if this is rank 1,
  • 8:33 - 8:33
    rank 2, this element will have
    rank k.
  • 8:36 - 8:36
    That's just from the
    construction in the partition.
  • 8:40 - 8:40
    And now we get to recurse.
    And there are three cases --
  • 8:53 - 8:53
    -- depending on how i relates
    to k.
  • 8:55 - 8:55
    Remember i is the rank that
    we're looking for,
  • 8:58 - 8:58
    k is the rank that we happen to
    get out of this random
  • 9:01 - 9:01
    partition.
    We don't have much control over
  • 9:04 - 9:04
    k, but if we're lucky i=k.
    That's the element we want.
  • 9:13 - 9:13
    Then we just return the
    partition element.
  • 9:15 - 9:15
    More likely is that the element
    we're looking for is either to
  • 9:18 - 9:18
    the left or to the right.
    And if it's to the left we're
  • 9:21 - 9:21
    going to recurse in the
    left-hand portion of the array.
  • 9:24 - 9:24
    And if it's to the right we're
    going to recurse in the
  • 9:26 - 9:26
    right-hand portion.
    So, pretty straightforward at
  • 9:29 - 9:29
    this point.
  • 9:45 - 9:45
    I just have to get all the
    indices right.
  • 10:08 - 10:08
    Either we're going to recurse
    on the part between p and r
  • 10:11 - 10:11
    minus 1, that's this case.
    The rank we're looking for is
  • 10:15 - 10:15
    to the left of the rank of
    element A[r].
  • 10:17 - 10:17
    Or, we're going to recurse on
    the right part between r plus 1
  • 10:21 - 10:21
    and q.
    Where we recurse on the left
  • 10:23 - 10:23
    part the rank we're looking for
    remains the same,
  • 10:26 - 10:26
    but when we recurse on the
    right part the rank we're
  • 10:29 - 10:29
    looking for gets offset.
    Because we sort of got rid of
  • 10:34 - 10:34
    the k elements over here.
    I should have written this
  • 10:39 - 10:39
    length is k.
    We've sort of swept away k
  • 10:42 - 10:42
    ranks of elements.
    And now within this array we're
  • 10:47 - 10:47
    looking for the i minus kth
    smallest element.
  • 10:51 - 10:51
    That's the recursion.
    We only recurse once.
  • 10:55 - 10:55
    And random partition is not a
    recursion.
  • 11:00 - 11:00
    That just takes linear time.
    And the total amount of work
  • 11:05 - 11:05
    we're doing here should be
    linear time plus one recursion.
  • 11:09 - 11:09
    And we'd next like to see what
    the total running time is in
  • 11:14 - 11:14
    expectation, but let's first do
    a little example --
  • 11:26 - 11:26
    -- to make this algorithm
    perfectly clear.
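    Putting the pieces together, a hedged Python sketch of
    rand-select, reusing the rand_partition sketch from above
    (p and q are array indices; i is a 1-based rank within A[p..q]):

        def rand_select(A, p, q, i):
            """Return the i-th smallest element of A[p..q]."""
            if p == q:                    # base case: one element left
                return A[p]
            r = rand_partition(A, p, q)   # pivot lands at index r
            k = r - p + 1                 # rank of the pivot in A[p..q]
            if i == k:
                return A[r]               # lucky: the pivot is the answer
            elif i < k:
                return rand_select(A, p, r - 1, i)
            else:
                return rand_select(A, r + 1, q, i - k)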
  • 11:29 - 11:29
    Let's suppose we're looking for
    the seventh smallest element in
  • 11:34 - 11:34
    this array.
  • 11:50 - 11:50
    And let's suppose,
    just for example,
  • 11:53 - 11:53
    that the pivot we're using is
    just the first element.
  • 11:58 - 11:58
    So, nothing fancy.
    I would have to flip a few
  • 12:02 - 12:02
    coins in order to generate a
    random one, so let's just pick
  • 12:07 - 12:07
    this one.
    If I partition at the element
  • 12:10 - 12:10
    6, this is actually an example
    we did two weeks ago,
  • 12:14 - 12:14
    and I won't go through it
    again, but we get the same
  • 12:18 - 12:18
    array, as we did two weeks ago,
    namely 2, 5,
  • 12:21 - 12:21
    3, 6, 8, 13,
    10 and 11.
  • 12:23 - 12:23
    If you run through the
    partitioning algorithm,
  • 12:27 - 12:27
    that happens to be the order
    that it throws the elements
  • 12:31 - 12:31
    into.
    And this is our position r.
  • 12:35 - 12:35
    This is p here.
    It's just 1.
  • 12:37 - 12:37
    And q is just the end.
    And I am looking for the
  • 12:41 - 12:41
    seventh smallest element.
    And it happens when I run this
  • 12:45 - 12:45
    partition that 6 falls into the
    fourth place.
  • 12:48 - 12:48
    And we know that means,
    because all the elements here
  • 12:52 - 12:52
    are less than 6 and all the
    elements here are greater than
  • 12:57 - 12:57
    6, if this array were sorted,
    6 would be right here in
  • 13:01 - 13:01
    position four.
    So, r here is 4.
  • 13:05 - 13:05
    Yeah?
    The 12 turned into an 11?
  • 13:09 - 13:09
    This was an 11,
    believe it or not.
  • 13:14 - 13:14
    Let me be simple.
    Sorry.
  • 13:17 - 13:17
    Sometimes my ones look like
    twos.
  • 13:21 - 13:21
    Not a good feature.
    That's an easy way to cover.
  • 13:27 - 13:27
    [LAUGHTER]
    Don't try that on exams.
  • 13:32 - 13:32
    Oh, that one was just a two.
    No.
  • 13:34 - 13:34
    Even though we're not sorting
    the array, we're only spending
  • 13:38 - 13:38
    linear work here to partition by
    6.
  • 13:40 - 13:40
    We know that if we had sorted
    the array 6 would fall here.
  • 13:44 - 13:44
    We don't know about these other
    elements.
  • 13:46 - 13:46
    They're not in sorted order,
    but from the properties of
  • 13:50 - 13:50
    partition we know 6 went the
    right spot.
  • 13:53 - 13:53
    We now know rank of 6 is 4.
    We happened to be looking for 7
  • 13:56 - 13:56
    and we happened to get this
    number 4.
  • 14:00 - 14:00
    We want something over here.
    It turns out we're looking for
  • 14:04 - 14:04
    10, I guess.
    No, 11.
  • 14:05 - 14:05
    There should be eight elements
    in this array,
  • 14:08 - 14:08
    so it's the next to max.
    Max here is 13,
  • 14:11 - 14:11
    I'm cheating here.
    The answer we're looking for is
  • 14:14 - 14:14
    11.
    We know that what we're looking
  • 14:16 - 14:16
    for is in the right-hand part
    because the rank we're looking
  • 14:20 - 14:20
    for is 7, which is bigger than
    4.
  • 14:23 - 14:23
    Now, what rank are we looking
    for in here?
  • 14:25 - 14:25
    Well, we've gotten rid of four
    elements over here.
  • 14:30 - 14:30
    It happened here that k is also
    4 because p is 1 in this
  • 14:35 - 14:35
    example.
    The rank of 6 was 4.
  • 14:38 - 14:38
    We throw away those four
    elements.
  • 14:41 - 14:41
    Now we're looking for rank 7
    minus 4 which is 3.
  • 14:46 - 14:46
    And, indeed,
    the rank 3 element here is
  • 14:50 - 14:50
    still 11.
    So, you recursively find that.
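    As a quick check of the sketch above on the eight elements from
    the board (their order does not affect the answer):

        A = [2, 5, 3, 6, 8, 13, 10, 11]
        print(rand_select(A, 0, len(A) - 1, 7))   # prints 11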
  • 14:54 - 14:54
    That's your answer.
    Now that algorithm should be
  • 14:58 - 14:58
    pretty clear.
    The tricky part is to analyze
  • 15:03 - 15:03
    it.
    And the analysis here is quite
  • 15:06 - 15:06
    a bit like randomized quicksort,
    although not quite as hairy,
  • 15:10 - 15:10
    so it will go faster.
    But it will be also sort of a
  • 15:14 - 15:14
    nice review of the randomized
    quicksort analysis which was a
  • 15:18 - 15:18
    bit tricky and always good to
    see a couple of times.
  • 15:22 - 15:22
    We're going to follow the same
    kind of outline as before to
  • 15:26 - 15:26
    look at the expected running
    time of this algorithm.
  • 15:31 - 15:31
    And to start out we're going
    to, as before,
  • 15:35 - 15:35
    look at some intuition just to
    feel good about ourselves.
  • 15:40 - 15:40
    Also feel bad as you'll see.
    Let's think about two sort of
  • 15:45 - 15:45
    extreme cases,
    a good case and the worst case.
  • 15:49 - 15:49
    And I should mention that in
    all of the analyses today we
  • 15:54 - 15:54
    assume the elements are
    distinct.
  • 16:04 - 16:04
    It gets really messy if the
    elements are not distinct.
  • 16:08 - 16:08
    And you may even have to change
    the algorithms a little bit
  • 16:12 - 16:12
    because if all the elements are
    equal, if you pick a random
  • 16:17 - 16:17
    element, the partition does not
    do so well.
  • 16:20 - 16:20
    But let's assume they're all
    distinct, which is the really
  • 16:24 - 16:24
    interesting case.
    A pretty lucky case --
  • 16:28 - 16:28
    I mean the best case is we
    partition right in the middle.
  • 16:33 - 16:33
    The number of elements to the
    left of our partition is equal
  • 16:38 - 16:38
    to the number of elements to the
    right of our partition.
  • 16:42 - 16:42
    But almost as good would be
    some kind of 1/10 to 9/10 split.
  • 16:47 - 16:47
    Any constant fraction,
    we should feel that.
  • 16:51 - 16:51
    Any constant fraction is as
    good as 1/2.
  • 16:54 - 16:54
    Then the recurrence we get is,
    let's say at most,
  • 16:58 - 16:58
    this bad.
    So, it depends.
  • 17:01 - 17:01
    If we have let's say 1/10 on
    the left and 9/10 on the right
  • 17:05 - 17:05
    every time we do a partition.
    It depends where our answer is.
  • 17:09 - 17:09
    It could be if i is really
    small it's in the 1/10 part.
  • 17:13 - 17:13
    If i is really big it's going
    to be in the 9/10 part,
  • 17:16 - 17:16
    or most of the time it's going
    to be in the 9/10 part.
  • 17:20 - 17:20
    We're doing worst-case analysis
    within the lucky case,
  • 17:23 - 17:23
    so we're happy to have upper
    bounds.
  • 17:26 - 17:26
    I will say T(n) is at most
    T((9/10)n) + Theta(n).
  • 17:30 - 17:30
    Clearly it's worse if we're in
    the bigger part.
  • 17:35 - 17:35
    What is the solution to this
    recurrence?
  • 17:38 - 17:38
    Oh, solving recurrences was so
    long ago.
  • 17:42 - 17:42
    What method should we use for
    solving this recurrence?
  • 17:48 - 17:48
    The master method.
    What case are we in?
  • 17:51 - 17:51
    Three.
    Good.
  • 17:52 - 17:52
    You still remember.
    This is Case 3.
  • 17:56 - 17:56
    We're looking at n^(log_b a).
    b here is 10/9,
  • 18:01 - 18:01
    although it doesn't really
    matter because a is 1.
  • 18:06 - 18:06
    log base anything of 1 is 0.
    So, this is n^0 which is 1.
  • 18:11 - 18:11
    And n is polynomially larger
    than 1.
  • 18:15 - 18:15
    This is going to be O(n),
    which is good.
  • 18:19 - 18:19
    That is what we want,
    linear time.
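    In symbols, the Case 3 application just described:

        T(n) \le T\bigl(\tfrac{9}{10}n\bigr) + \Theta(n),
        \qquad a = 1,\ b = \tfrac{10}{9},\ n^{\log_b a} = n^0 = 1,

    and f(n) = \Theta(n) is polynomially larger than 1, so
    T(n) = O(n).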
  • 18:22 - 18:22
    If we're in the lucky case,
    great.
  • 18:25 - 18:25
    Unfortunately this is only
    intuition.
  • 18:30 - 18:30
    And we're not always going to
    get the lucky case.
  • 18:33 - 18:33
    We could do the same kind of
    analysis as we did with
  • 18:36 - 18:36
    randomized quicksort.
    If you alternate between lucky
  • 18:38 - 18:38
    and unlucky, things will still
    be good, but let's just talk
  • 18:42 - 18:42
    about the unlucky case to show
    how bad things can get.
  • 18:45 - 18:45
    And this really would be a
    worst-case analysis.
  • 18:53 - 18:53
    The unlucky case we get a split
    of 0:n-1.
  • 19:00 - 19:00
    Because we're removing the
    partition element either way.
  • 19:05 - 19:05
    And there could be nothing less
    than the partition element.
  • 19:10 - 19:10
    We have 0 on the left-hand side
    and we have n-1 on the
  • 19:15 - 19:15
    right-hand side.
    Now we get a recurrence like
  • 19:18 - 19:18
    T(n)=T(n-1) plus linear cost.
    And what's the solution to that
  • 19:24 - 19:24
    recurrence?
    n^2.
  • 19:25 - 19:25
    Yes.
    This one you should just know.
  • 19:28 - 19:28
    It's n^2 because it's an
    arithmetic series.
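    Unrolling the recurrence:

        T(n) = T(n-1) + \Theta(n)
             = \Theta(n) + \Theta(n-1) + \cdots + \Theta(1)
             = \Theta\Bigl(\sum_{j=1}^{n} j\Bigr) = \Theta(n^2).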
  • 19:38 - 19:38
    And that's pretty bad.
    This is much,
  • 19:40 - 19:40
    much worse than sorting and
    then picking the ith element.
  • 19:43 - 19:43
    In the worst-case this
    algorithm really sucks,
  • 19:46 - 19:46
    but most of the time it's going
    to do really well.
  • 19:49 - 19:49
    And, unless you're really,
    really unlucky and every coin
  • 19:52 - 19:52
    you flip gives the wrong answer,
    you won't get this case and you
  • 19:56 - 19:56
    will get something more like the
    lucky case.
  • 19:59 - 19:59
    At least that's what we'd like
    to prove.
  • 20:02 - 20:02
    And we will prove that the
    expected running time here is
  • 20:05 - 20:05
    linear.
    So, it's very rare to get
  • 20:07 - 20:07
    anything quadratic.
    But later on we will see how to
  • 20:10 - 20:10
    make the worst-case linear as
    well.
  • 20:12 - 20:12
    This would really,
    really solve the problem.
  • 20:30 - 20:30
    Let's get into the analysis.
  • 20:43 - 20:43
    Now, you've seen an analysis
    much like this before.
  • 20:47 - 20:47
    What do you suggest we do in
    order to analyze this expected
  • 20:52 - 20:52
    time?
    It's a divide-and-conquer
  • 20:54 - 20:54
    algorithm, so we kind of like to
    write down the recurrence on
  • 20:59 - 20:59
    something resembling the running
    time.
  • 21:09 - 21:09
    I don't need the answer,
    but what's the first step that
  • 21:13 - 21:13
    we might do to analyze the
    expected running time of this
  • 21:17 - 21:17
    algorithm?
    Sorry?
  • 21:18 - 21:18
    Look at different cases,
    yeah.
  • 21:20 - 21:20
    Exactly.
    We have all these possible ways
  • 21:23 - 21:23
    that random partition could
    split.
  • 21:25 - 21:25
    It could split 0 to n-1.
    It could split in half.
  • 21:30 - 21:30
    There are n choices where it
    could split.
  • 21:33 - 21:33
    How can we break into those
    cases?
  • 21:36 - 21:36
    Indicator random variables.
    Cool.
  • 21:39 - 21:39
    Exactly.
    That's what we want to do.
  • 21:41 - 21:41
    Indicator random variable
    suggests that what we're dealing
  • 21:46 - 21:46
    with is not exactly just a
    function T(n) but it's a random
  • 21:51 - 21:51
    variable.
    This is one subtlety.
  • 21:53 - 21:53
    T(n) depends on the random
    choices, so it's really a random
  • 21:58 - 21:58
    variable.
  • 22:05 - 22:05
    And then we're going to use
    indicator random variables to
  • 22:08 - 22:08
    get a recurrence on T(n).
  • 22:25 - 22:25
    So, T(n) is the running time of
    rand-select on an input of size
  • 22:33 - 22:33
    n.
  • 22:40 - 22:40
    And I am also going to write
    down explicitly an assumption
  • 22:46 - 22:46
    about the random numbers.
  • 22:55 - 22:55
    That they should be chosen
    independently from each other.
  • 23:00 - 23:00
    Every time I call random
    partition, it's generating a
  • 23:03 - 23:03
    completely independent random
    number from all the other times
  • 23:07 - 23:07
    I call random partition.
    That is important,
  • 23:10 - 23:10
    of course, for this analysis to
    work.
  • 23:13 - 23:13
    We will see why some point down
    the line.
  • 23:15 - 23:15
    And now, to sort of write down
    an equation for T(n) we're going
  • 23:20 - 23:20
    to define indicator random
    variables, as you suggested.
  • 23:36 - 23:36
    And we will call it X_k.
    And this is for all k=0...n-1.
  • 23:50 - 23:50
    Indicator random variables
    either 1 or 0.
  • 23:54 - 23:54
    And it's going to be 1 if the
    partition comes out k on the
  • 24:01 - 24:01
    left-hand side.
    So say the partition generates
  • 24:07 - 24:07
    a k:n-k-1 split and it is 0
    otherwise.
  • 24:11 - 24:11
    We have n of these indicator
    random variables between
  • 24:17 - 24:17
    0...n-1.
    And in each case,
  • 24:20 - 24:20
    no matter how the random choice
    comes out, exactly one of them
  • 24:28 - 24:28
    will be 1.
    All the others will be 0.
  • 24:32 - 24:32
    Now we can divide out the
    running time of this algorithm
  • 24:37 - 24:37
    based on which case we're in.
  • 24:49 - 24:49
    That will sort of unify this
    intuition that we did and get
  • 24:57 - 24:57
    all the cases.
    And then we can look at the
  • 25:03 - 25:03
    expectation.
    T(n), if we just split out by
  • 25:09 - 25:09
    cases, we have an upper bound
    like this.
  • 25:28 - 25:28
    If we have 0 to n-1 split,
    the worst is we have n-1.
  • 25:33 - 25:33
    Then we have to recurse in a
    problem of size n-1.
  • 25:38 - 25:38
    In fact, it would be pretty
    hard to recurse in a problem of
  • 25:44 - 25:44
    size 0.
    If we have a 1 to n-2 split
  • 25:47 - 25:47
    then we take the max of the two
    sides.
  • 25:51 - 25:51
    That's certainly going to give
    us an upper bound and so on.
  • 26:03 - 26:03
    And at the bottom you get an
    n-1 to 0 split.
  • 26:14 - 26:14
    This is now sort of
    conditioning on various events,
  • 26:17 - 26:17
    but we have indicator random
    variables to tell us when these
  • 26:20 - 26:20
    events happen.
    We can just multiply each of
  • 26:22 - 26:22
    these values by the indicator
    random variable and it will come
  • 26:25 - 26:25
    out 0 if that's not the case and
    will come out 1 and give us this
  • 26:28 - 26:28
    value if that happens to be the
    split.
  • 26:31 - 26:31
    So, if we add up all of those
    we'll get the same thing.
  • 26:38 - 26:38
    This is equal to the sum over
    all k of the indicator random
  • 26:46 - 26:46
    variable times the cost in that
    case, which is T of the max of k
  • 26:53 - 26:53
    and the other side,
    which is n-k-1,
  • 26:57 - 26:57
    plus Theta(n).
    This is our recurrence,
  • 27:02 - 27:02
    in some sense,
    for the random variable
  • 27:05 - 27:05
    representing running time.
    Now, the value will depend on
  • 27:09 - 27:09
    which case we come into.
    We know the probability of each
  • 27:14 - 27:14
    of these events happening is the
    same because we're choosing the
  • 27:19 - 27:19
    partition element uniformly at
    random, but we cannot really
  • 27:24 - 27:24
    simplify much beyond this until
    we take expectations.
  • 27:29 - 27:29
    We know this random variable
    could be as big as n^2.
  • 27:33 - 27:33
    Hopefully it's usually linear.
    We will take expectations of
  • 27:37 - 27:37
    both sides and get what we want.
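    Written out, the recurrence on the board is

        T(n) \le \sum_{k=0}^{n-1}
            X_k \Bigl( T\bigl(\max(k,\, n-k-1)\bigr) + \Theta(n) \Bigr).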
  • 27:54 - 27:54
    Let's look at the expectation
    of this random variable,
  • 27:58 - 27:58
    which is just the expectation,
    I will copy over,
  • 28:02 - 28:02
    summation we have here so I can
    work on this board.
  • 28:30 - 28:30
    I want to compute the
    expectation of this summation.
  • 28:34 - 28:34
    What property of expectation
    should I use?
  • 28:37 - 28:37
    Linearity, good.
    We can bring the summation
  • 28:40 - 28:40
    outside.
  • 29:08 - 29:08
    Now I have a sum of
    expectation.
  • 29:10 - 29:10
    Let's look at each expectation
    individually.
  • 29:13 - 29:13
    It's a product of two random
    variables, if you will.
  • 29:16 - 29:16
    This is an indicator random
    variable and this is some more
  • 29:19 - 29:19
    complicated function,
    some more complicated random
  • 29:22 - 29:22
    variable representing some
    running time,
  • 29:25 - 29:25
    which depends on what random
    choices are made in that
  • 29:28 - 29:28
    recursive call.
    Now what should I do?
  • 29:32 - 29:32
    I have the expectation of the
    product of two random variables.
  • 29:38 - 29:38
    Independence,
    exactly.
  • 29:40 - 29:40
    If I know that these two random
    variables are independent then I
  • 29:46 - 29:46
    know that the expectation of the
    product is the product of the
  • 29:51 - 29:51
    expectations.
    Now we have to check are they
  • 29:55 - 29:55
    independent?
    I hope so because otherwise
  • 29:59 - 29:59
    there isn't much else I can do.
    Why are they independent?
  • 30:05 - 30:05
    Sorry?
    Because we stated that they
  • 30:08 - 30:08
    are, right.
    Because of this assumption.
  • 30:11 - 30:11
    We assume that all the random
    numbers are chosen
  • 30:14 - 30:14
    independently.
    We need to sort of interpolate
  • 30:18 - 30:18
    that here.
    These X_k's,
  • 30:19 - 30:19
    all the X_k's,
    X_0 up to X_n-1,
  • 30:22 - 30:22
    so all the ones appearing in
    this summation are dependent
  • 30:26 - 30:26
    upon a single random choice of
    this particular call to random
  • 30:31 - 30:31
    partition.
    All of these are correlated,
  • 30:37 - 30:37
    because if one of them is 1,
    all the others are forced to be
  • 30:44 - 30:44
    0.
    So, there is a lot of
  • 30:47 - 30:47
    correlation among the X_k's.
    But with respect to everything
  • 30:55 - 30:55
    that is in here,
    and the only random part is
  • 31:00 - 31:00
    this T(max(k, n-k-1)).
    That is the reason that this
  • 31:07 - 31:07
    random variable is independent
    from these.
  • 31:13 - 31:13
    The same thing as quicksort,
    but I know some people got
  • 31:19 - 31:19
    confused about it a couple
    lectures ago so I am
  • 31:25 - 31:25
    reiterating.
    We get the product of
  • 31:29 - 31:29
    expectations,
    E[X_k] E[T(max(k, n-k-1))].
  • 31:35 - 31:35
    I mean the order n comes
    outside, but let's leave it
  • 31:40 - 31:40
    inside for now.
    There is no expectation to
  • 31:45 - 31:45
    compute there for order n.
    Order n is order n.
  • 31:49 - 31:49
    What is the expectation of X_k?
    1/n, because they're all chosen
  • 31:56 - 31:56
    with equal probability.
    There is n of them,
  • 32:00 - 32:00
    so the expectation is 1/n.
    The value is either 1 or 0.
  • 32:05 - 32:05
    We start to be able to split
    this up.
  • 32:07 - 32:07
    We have 1/n times this expected
    value of some recursive T call,
  • 32:12 - 32:12
    and then we have plus 1 over n
    times order n,
  • 32:16 - 32:16
    also known as a constant,
    but everything is summed up n
  • 32:20 - 32:20
    times so let's expand this.
  • 32:35 - 32:35
    I have the sum k=0 to n-1.
    I guess the 1/n can come
  • 32:42 - 32:42
    outside.
    And we have expectation of
  • 32:47 - 32:47
    T(max(k, n-k-1)).
    Lots of nifty braces there.
  • 32:54 - 32:54
    And then plus we have,
    on the other hand,
  • 33:00 - 33:00
    the sum k=0 to n-1.
    Let me just write that out
  • 33:06 - 33:06
    again.
    We have a 1/n in front and we
  • 33:09 - 33:09
    have a Theta(n) inside.
    This summation is n^2.
  • 33:12 - 33:12
    And then we're dividing by n,
    so this whole thing is,
  • 33:17 - 33:17
    again, order n.
    Nothing fancy happened there.
  • 33:20 - 33:20
    This is really just saying the
    expectation of order n is order
  • 33:25 - 33:25
    n.
    Average value of order n is
  • 33:27 - 33:27
    order n.
    What is interesting is this
  • 33:32 - 33:32
    part.
    Now, what could we do with this
  • 33:35 - 33:35
    summation?
    Here we start to differ from
  • 33:39 - 33:39
    randomized quicksort because we
    have this max.
  • 33:43 - 33:43
    Randomized quicksort we had the
    sum of T(k) plus T(n-k-1)
  • 33:48 - 33:48
    because we were making both
    recursive calls.
  • 33:53 - 33:53
    Here we're only making the
    biggest one.
  • 33:56 - 33:56
    That max is really a pain for
    evaluating this recurrence.
  • 34:03 - 34:03
    How could I get rid of the max?
    That's one way to think of it.
  • 34:12 - 34:12
    Yeah?
  • 34:18 - 34:18
    Exactly.
    I could only sum up to halfway
  • 34:21 - 34:21
    and then double.
    In other words,
  • 34:23 - 34:23
    terms are getting repeated
    twice here.
  • 34:26 - 34:26
    When k=0 or when k=n-1,
    I get the same T(n-1).
  • 34:30 - 34:30
    When k=1 or n-2,
    I get the same thing,
  • 34:34 - 34:34
    2 and n-3.
    What I will actually do is sum
  • 34:38 - 34:38
    from halfway up.
    That's a little bit cleaner.
  • 34:43 - 34:43
    And let me get the indices
    right.
  • 34:46 - 34:46
    Floor of n/2 up to n-1 will be
    safe.
  • 34:49 - 34:49
    And then I just have E[T(k)],
    except I forgot to multiply by
  • 34:56 - 34:56
    2, so I'm going to change this 1
    to a 2.
  • 35:01 - 35:01
    And order n is preserved.
    This is just because each term
  • 35:05 - 35:05
    is appearing twice.
    I can factor it out.
  • 35:08 - 35:08
    And if n is odd,
    I'm actually double-counting
  • 35:11 - 35:11
    somewhat, but it's certainly at
    most that.
  • 35:13 - 35:13
    So, that's a safe upper bound.
    And upper bounds are all we
  • 35:17 - 35:17
    care about because we're hoping
    to get linear.
  • 35:20 - 35:20
    And the running time of this
    algorithm is definitely at least
  • 35:25 - 35:25
    linear, so we just need an upper
    bounded linear.
  • 35:29 - 35:29
    So, this is a recurrence.
    E[T(n)] is at most 2/n times
  • 35:33 - 35:33
    the sum of half the numbers
    between 0 and n of
  • 35:36 - 35:36
    E[T(k)]+Theta(n).
    It's a bit of a hairy recurrence.
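    As a display:

        E[T(n)] \le \frac{2}{n}
            \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n).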
  • 35:40 - 35:40
    We want to solve it,
    though.
  • 35:42 - 35:42
    And it's actually a little bit
    easier than the randomized
  • 35:46 - 35:46
    quicksort recurrence.
    We're going to solve it.
  • 35:49 - 35:49
    What method should we use?
    Sorry?
  • 35:51 - 35:51
    Master method?
    Master would be nice,
  • 35:54 - 35:54
    except that each of the
    recursive calls is with a
  • 35:57 - 35:57
    different value of k.
    The master method only works
  • 36:02 - 36:02
    when all the calls are with the
    same value, same size.
  • 36:06 - 36:06
    Alas, it would be nice if we
    could use the master method.
  • 36:09 - 36:09
    What else do we have?
    Substitution.
  • 36:12 - 36:12
    When it's hard,
    when in doubt,
  • 36:14 - 36:14
    use substitution.
    I mean the good thing here is
  • 36:17 - 36:17
    we know what we want.
    From the intuition at least,
  • 36:20 - 36:20
    which is now erased,
    we really feel that this should
  • 36:24 - 36:24
    be linear time.
    So, we know what we want to
  • 36:26 - 36:26
    prove.
    And indeed we can prove it just
  • 36:32 - 36:32
    directly with substitution.
  • 36:42 - 36:42
    I want to claim there is some
    constant c greater than zero
  • 36:46 - 36:46
    such that E[T(n)],
    according to this recurrence,
  • 36:50 - 36:50
    is at most c times n.
    Let's prove that over here.
  • 37:00 - 37:00
    As we guessed,
    the proof is by substitution.
  • 37:13 - 37:13
    What that means is we're going
    to assume, by induction,
  • 37:18 - 37:18
    that this inequality is true
    for all smaller m.
  • 37:23 - 37:23
    I will just say for m less than n.
    And we need to prove it for n.
  • 37:29 - 37:29
    We get E[T(n)].
    Now we are just going to expand
  • 37:34 - 37:34
    using the recurrence that we
    have.
  • 37:36 - 37:36
    It's at most this.
    I will copy that over.
  • 37:54 - 37:54
    And then each of these
    recursive calls is with some
  • 37:58 - 37:58
    value k that is strictly smaller
    than n.
  • 38:01 - 38:01
    Sorry, I copied it wrong,
    floor of n over 2,
  • 38:04 - 38:04
    not zero.
    And so I can apply the
  • 38:07 - 38:07
    induction hypothesis to each of
    these.
  • 38:11 - 38:11
    This is at most c times k by
    the induction hypothesis.
  • 38:17 - 38:17
    And so I get this inequality.
  • 38:37 - 38:37
    This c can come outside the
    summation because it's just a
  • 38:41 - 38:41
    constant.
    And I will be slightly tedious
  • 38:43 - 38:43
    in writing this down again,
    because what I care about is
  • 38:47 - 38:47
    the summation here that is left
    over.
  • 38:56 - 38:56
    This is a good old-fashioned
    summation.
  • 39:01 - 39:01
    And if you remember back to
    your summation tricks or
  • 39:05 - 39:05
    whatever, you should be able to
    evaluate this.
  • 39:08 - 39:08
    If we started at zero and went
    up to n minus 1,
  • 39:11 - 39:11
    that's just an arithmetic
    series, but here we have the
  • 39:15 - 39:15
    tail end of an arithmetic
    series.
  • 39:17 - 39:17
    And you should know,
    at least up to theta,
  • 39:20 - 39:20
    what this is,
    right?
  • 39:21 - 39:21
    n^2, yeah.
    It's definitely Theta(n^2).
  • 39:24 - 39:24
    But we need here a slightly
    better upper bound,
  • 39:27 - 39:27
    as we will see the constants
    really matter.
  • 39:31 - 39:31
    What we're going to use is that
    this summation is at most 3/8
  • 39:35 - 39:35
    times n^2.
    And that will be critical,
  • 39:38 - 39:38
    the fact that 3/8 is smaller
    than 1/2, I believe.
  • 39:42 - 39:42
    So it's going to get rid of
    this 2.
  • 39:44 - 39:44
    I am not going to prove this.
    This is an exercise.
  • 39:48 - 39:48
    When you know that it is true,
    it's easy because you can just
  • 39:52 - 39:52
    prove it by induction.
    Figuring out that number is a
  • 39:56 - 39:56
    little bit more work,
    but not too much more.
  • 40:00 - 40:00
    So you should prove that by
    induction.
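    A sketch of where the 3/8 comes from, ignoring floors:

        \sum_{k=\lfloor n/2 \rfloor}^{n-1} k
            = \sum_{k=0}^{n-1} k - \sum_{k=0}^{\lfloor n/2 \rfloor - 1} k
            \approx \frac{n^2}{2} - \frac{(n/2)^2}{2} = \frac{3}{8} n^2.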
  • 40:04 - 40:04
    Now let me simplify.
    This is a bit messy,
  • 40:09 - 40:09
    but what I want is c times n.
    Let's write it as our desired
  • 40:16 - 40:16
    value minus the residual.
    And here we have some crazy
  • 40:22 - 40:22
    fractions.
    This is 2 times 3 which is 6
  • 40:27 - 40:27
    over 8 which is 3/4,
    right?
  • 40:31 - 40:31
    Here we have 1,
    so we have to subtract off 1/4
  • 40:35 - 40:35
    to get 3/4.
    And this should be,
  • 40:37 - 40:37
    I guess, 1/4 times c times n.
    And then we have this theta n
  • 40:42 - 40:42
    with double negation becomes a
    plus theta n.
  • 40:46 - 40:46
    That should be clear.
    I am just rewriting that.
  • 40:50 - 40:50
    So we have what we want over
    here.
  • 40:53 - 40:53
    And then we hope that this is
    nonnegative because what we want
  • 40:58 - 40:58
    is that this less than or equal
    to c times n.
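    The chain of inequalities being described:

        E[T(n)] \le \frac{2c}{n} \cdot \frac{3}{8} n^2 + \Theta(n)
            = \frac{3}{4} c n + \Theta(n)
            = c n - \Bigl( \frac{1}{4} c n - \Theta(n) \Bigr).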
  • 41:03 - 41:03
    That will be true,
    provided this thing is
  • 41:06 - 41:06
    nonnegative.
    And it looks pretty good
  • 41:09 - 41:09
    because we're free to choose c
    however large we want.
  • 41:13 - 41:13
    Whatever constant is embedded
    in this Theta notation is one
  • 41:18 - 41:18
    fixed constant,
    whatever makes this recurrence
  • 41:21 - 41:21
    true.
    We just set c to be bigger than
  • 41:24 - 41:24
    4 times that constant and then
    this will be nonnegative.
  • 41:28 - 41:28
    So this is true for c
    sufficiently large to dwarf that
  • 41:33 - 41:33
    theta constant.
    It's also the base case.
  • 41:37 - 41:37
    I just have to make the cursory
    mention that we choose c large
  • 41:41 - 41:41
    enough so that this claim is
    true, even in the base case
  • 41:46 - 41:46
    where n is at most some
    constant.
  • 41:48 - 41:48
    Here it's like 1 or so because
    then we're not making a
  • 41:52 - 41:52
    recursive call.
    What we get --
  • 41:55 - 41:55
    This algorithm,
    randomized select,
  • 41:59 - 41:59
    has expected running time order
    n, Theta(n).
  • 42:12 - 42:12
    The annoying thing is that in
    the worst-case,
  • 42:16 - 42:16
    if you're really,
    really unlucky it's n^2.
  • 42:19 - 42:19
    Any questions before we move on
    from this point?
  • 42:24 - 42:24
    This finished off the proof of
    this fact that we have Theta(n)
  • 42:29 - 42:29
    expected time.
    We already saw the n^2
  • 42:33 - 42:33
    worst-case.
    All perfectly clear?
  • 42:35 - 42:35
    Good.
    You should go over these
  • 42:37 - 42:37
    proofs.
    They're intrinsically related
  • 42:40 - 42:40
    between randomized quicksort and
    randomized select.
  • 42:44 - 42:44
    Know them in your heart.
    This is a great algorithm that
  • 42:48 - 42:48
    works really well in practice
    because most of the time you're
  • 42:52 - 42:52
    going to split,
    say, in the middle,
  • 42:55 - 42:55
    somewhere between a 1/4 and 3/4
    and everything is good.
  • 43:00 - 43:00
    It's extremely unlikely that
    you get the n^2 worst-case.
  • 43:03 - 43:03
    It would have to happen with
    like 1 over n^n probability or
  • 43:07 - 43:07
    something really,
    really small.
  • 43:09 - 43:09
    But I am a theoretician at
    least.
  • 43:10 - 43:10
    And it would be really nice if
    you could get Theta(n) in the
  • 43:14 - 43:14
    worst-case.
    That would be the cleanest
  • 43:16 - 43:16
    result that you could hope for
    because that's optimal.
  • 43:19 - 43:19
    You cannot do better than
    Theta(n).
  • 43:21 - 43:21
    You've got to look at the
    elements.
  • 43:23 - 43:23
    So, you might ask,
    can we get rid of this
  • 43:26 - 43:26
    worst-case behavior and somehow
    avoid randomization and
  • 43:29 - 43:29
    guarantee Theta(n) worst-case
    running time?
  • 43:33 - 43:33
    And you can but it's a rather
    nontrivial algorithm.
  • 43:39 - 43:39
    And this is going to be one of
    the most sophisticated that
  • 43:46 - 43:46
    we've seen so far.
    It won't continue to be the
  • 43:51 - 43:51
    most sophisticated algorithm we
    will see, but here it is.
  • 43:58 - 43:58
    Worst-case linear time order
    statistics.
  • 44:09 - 44:09
    And this is an algorithm by
    several, all very famous people,
  • 44:23 - 44:23
    Blum, Floyd,
    Pratt, Rivest and Tarjan.
  • 44:32 - 44:32
    I think I've only met the B and
    the R and the T.
  • 44:35 - 44:35
    Oh, no, I've met Pratt as well.
    I'm getting close to all the
  • 44:40 - 44:40
    authors.
    This is a somewhat old result,
  • 44:42 - 44:42
    but at the time it was a major
    breakthrough and still is an
  • 44:47 - 44:47
    amazing algorithm.
    Ron Rivest is a professor here.
  • 44:50 - 44:50
    You should know him from the R
    in RSA.
  • 44:53 - 44:53
    When I took my PhD
    comprehensives some time ago,
  • 44:56 - 44:56
    on the cover sheet was a joke
    question.
  • 45:00 - 45:00
    It asked of the authors of the
    worst-case linear time order
  • 45:05 - 45:05
    statistics algorithm,
    which of them is the most rich?
  • 45:09 - 45:09
    Sadly it was not a graded part
    of the comprehensive exam,
  • 45:13 - 45:13
    but it was an amusing question.
    I won't answer it here because
  • 45:18 - 45:18
    we're on tape,
    [LAUGHTER] but think about it.
  • 45:22 - 45:22
    It may not be obvious.
    Several of them are rich.
  • 45:25 - 45:25
    It's just the question of who
    is the most rich.
  • 45:30 - 45:30
    Anyway, before they were rich
    they came up with this
  • 45:34 - 45:34
    algorithm.
    They've come up with many
  • 45:36 - 45:36
    algorithms since,
    even after getting rich,
  • 45:39 - 45:39
    believe it or not.
    What we want is a good pivot,
  • 45:42 - 45:42
    guaranteed good pivot.
    Random pivot is going to be
  • 45:45 - 45:45
    really good.
    And so the simplest algorithm
  • 45:48 - 45:48
    is just pick a random pivot.
    It's going to be good with high
  • 45:52 - 45:52
    probability.
    We want to force a good pivot
  • 45:55 - 45:55
    deterministically.
    And the new idea here is we're
  • 45:58 - 45:58
    going to generate it
    recursively.
  • 46:02 - 46:02
    What else could we do but
    recurse?
  • 46:04 - 46:04
    Well, you should know from your
    recurrences that if we did two
  • 46:09 - 46:09
    recursive calls on problems of
    half the size and we have a
  • 46:13 - 46:13
    linear extra work that's the
    mergesort recurrence,
  • 46:16 - 46:16
    T(n) = 2T(n/2) + Theta(n).
    You should recite that in your
  • 46:20 - 46:20
    sleep.
    That's n lg n.
  • 46:21 - 46:21
    So we cannot recurse on two
    problems of half the size.
  • 46:25 - 46:25
    We've got to do better.
    Somehow these recursions have
  • 46:30 - 46:30
    to add up to strictly less than
    n.
  • 46:33 - 46:33
    That's the magic of this
    algorithm.
  • 46:35 - 46:35
    So this will just be called
    select instead of rand-select.
  • 46:40 - 46:40
    And it really depends on an
    array, but I will focus on the
  • 46:44 - 46:44
    i-th element that we want to
    select and the size of the array
  • 46:49 - 46:49
    that we want to select in.
    And I am going to write this
  • 46:53 - 46:53
    algorithm slightly less formally
    than randomized select because
  • 46:58 - 46:58
    it's a bit higher level of an
    algorithm.
  • 47:22 - 47:22
    And let me draw over here the
    picture of the algorithm.
  • 47:31 - 47:31
    The first step is sort of the
    weirdest and it's one of the key
  • 47:36 - 47:36
    ideas.
    You take your elements,
  • 47:39 - 47:39
    and they are in no particular
    order, so instead of drawing
  • 47:44 - 47:44
    them on a line,
    I am going to draw them in a 5
  • 47:48 - 47:48
    by n over 5 grid.
    Why not?
  • 47:50 - 47:50
    This, unfortunately,
    takes a little while to draw,
  • 47:54 - 47:54
    but it will take you equally
    long so I will take my time.
  • 48:00 - 48:00
    It doesn't really matter what
    the width is,
  • 48:03 - 48:03
    but it should be width n over 5
    so make sure you draw your
  • 48:06 - 48:06
    figure accordingly.
    Width n over 5,
  • 48:08 - 48:08
    but the height should be
    exactly 5.
  • 48:10 - 48:10
    I think I got it right.
    I can count that high.
  • 48:13 - 48:13
    Here is 5.
    And this should be,
  • 48:15 - 48:15
    well, you know,
    our number may not be divisible
  • 48:18 - 48:18
    by 5, so maybe it ends off in
    sort of an odd way.
  • 48:21 - 48:21
    But what I would like is that
    these chunks should be floor of
  • 48:25 - 48:25
    n over 5.
    And then we will have,
  • 48:27 - 48:27
    at most, four elements left
    over.
  • 48:30 - 48:30
    So I am going to ignore those.
    They don't really matter.
  • 48:34 - 48:34
    It's just an additive constant.
    Here is my array.
  • 48:37 - 48:37
    I just happened to write it in
    this funny way.
  • 48:40 - 48:40
    And I will call these vertical
    things groups.
  • 48:43 - 48:43
    I would circle them,
    and I did that in my notes,
  • 48:46 - 48:46
    but things get really messy if
    you start circling.
  • 48:49 - 48:49
    This diagram is going to get
    really full, just to warn you.
  • 48:53 - 48:53
    By the end it will be almost
    unintelligible,
  • 48:56 - 48:56
    but there it is.
    If you are really feeling
  • 49:00 - 49:00
    bored, you can draw this a few
    times.
  • 49:03 - 49:03
    And you should draw how it
    grows.
  • 49:06 - 49:06
    So there are the groups,
    vertical groups of five.
  • 49:10 - 49:10
    Next step.
  • 49:18 - 49:18
    The second step is to recurse.
    This is where things are a bit
  • 49:25 - 49:25
    unusual, well,
    even more unusual.
  • 49:28 - 49:28
    Oops, sorry.
    I really should have had a line
  • 49:33 - 49:33
    between one and two so I am
    going to have to move this down
  • 49:37 - 49:37
    and insert it here.
    I also, in step one,
  • 49:40 - 49:40
    want to find the median of each
    group.
  • 49:53 - 49:53
    What I would like to do is just
    imagine this figure,
  • 49:56 - 49:56
    each of the five elements in
    each group gets reorganized so
  • 50:00 - 50:00
    that the middle one is the
    median.
  • 50:02 - 50:02
    So I am going to call these the
    medians of each group.
  • 50:06 - 50:06
    I have five elements so the
    median is right in the middle.
  • 50:10 - 50:10
    There are two elements less
    than the median,
  • 50:13 - 50:13
    two elements greater than the
    median.
  • 50:16 - 50:16
    Again, we're assuming all
    elements are distinct.
  • 50:19 - 50:19
    So there they are.
    I compute them.
  • 50:22 - 50:22
    How long does that take me?
    N over five groups,
  • 50:25 - 50:25
    each with five elements,
    compute the median of each one?
  • 50:30 - 50:30
    Sorry?
    Yeah, 2 times n over 5.
  • 50:32 - 50:32
    It's theta n,
    that's all I need to know.
  • 50:35 - 50:35
    I mean, you're counting
    comparisons, which is good.
  • 50:38 - 50:38
    It's definitely Theta(n).
    The point is within each group,
  • 50:42 - 50:42
    I only have to do a constant
    number of comparisons because
  • 50:46 - 50:46
    it's a constant number of
    elements.
  • 50:48 - 50:48
    It doesn't matter.
    You could use randomized select
  • 50:52 - 50:52
    for all I care.
    No matter what you do,
  • 50:54 - 50:54
    it can only take a constant
    number of comparisons.
  • 50:59 - 50:59
    As long as you don't make a
    comparison more than once.
  • 51:03 - 51:03
    So this is easy.
    You could sort the five numbers
  • 51:07 - 51:07
    and then look at the third one,
    it doesn't matter because there
  • 51:12 - 51:12
    are only five of them.
    That's one nifty idea.
  • 51:16 - 51:16
    Already we have some elements
    that are sort of vaguely in the
  • 51:21 - 51:21
    middle but just of the group.
    And we've only done linear
  • 51:26 - 51:26
    work.
    So doing well so far.
  • 51:29 - 51:29
    Now we get to the second step,
    which I started to write
  • 51:34 - 51:34
    before, where we recurse.
  • 51:58 - 51:58
    So the next idea is,
    well, we have these floor of
  • 52:02 - 52:02
    n over 5 medians.
    I am going to compute the
  • 52:04 - 52:04
    median of those medians.
    I am imagining that I
  • 52:07 - 52:07
    rearranged these.
    And, unfortunately,
  • 52:09 - 52:09
    it's an even number,
    there are six of them,
  • 52:12 - 52:12
    but I will rearrange so that
    this guy, I have drawn in a
  • 52:15 - 52:15
    second box, is the median of
    these elements so that these two
  • 52:19 - 52:19
    elements are strictly less than
    this guy, these three elements
  • 52:22 - 52:22
    are strictly greater than this
    guy.
  • 52:24 - 52:24
    Now, that doesn't directly tell
    me anything, it would seem,
  • 52:28 - 52:28
    about any of the elements out
    here.
  • 52:31 - 52:31
    We will come back to that.
    In fact, it does tell us about
  • 52:35 - 52:35
    some of the elements.
    But right now this element is
  • 52:39 - 52:39
    just the median of these guys.
    Each of these guys is a median
  • 52:43 - 52:43
    of five elements.
    That's all we know.
  • 52:45 - 52:45
    If we do that recursively,
    this is going to take T of n
  • 52:49 - 52:49
    over 5 time.
    So far so good.
  • 52:51 - 52:51
    We can afford a recursion on a
    problem of size n over 5 and
  • 52:55 - 52:55
    linear work.
    We know that works out to
  • 52:58 - 52:58
    linear time.
    But there is more.
  • 53:01 - 53:01
    We're obviously not done yet.
  • 53:10 - 53:10
    The next step is x is our
    partition element.
  • 53:13 - 53:13
    We partition there.
    The rest of the algorithm is
  • 53:16 - 53:16
    just like randomized partition,
    so we're going to define k to
  • 53:19 - 53:19
    be the rank of x.
    And this can be done,
  • 53:22 - 53:22
    I mean it's r minus p plus 1 or
    whatever, but I'm not going to
  • 53:26 - 53:26
    write out how to do that because
    we're at a higher level here.
  • 53:30 - 53:30
    But it can be done.
    And then we have the three-way
  • 53:35 - 53:35
    branching.
    So if i happens to equal k
  • 53:38 - 53:38
    we're happy.
    The pivot element is the
  • 53:41 - 53:41
    element we're looking for,
    but more likely i is either
  • 53:46 - 53:46
    less than k or it is bigger than
    k.
  • 53:49 - 53:49
    And then we make the
    appropriate recursive call,
  • 53:54 - 53:54
    so here we recursively select
    the i-th smallest element --
  • 54:08 - 54:08
    -- in the lower part of the
    array.
  • 54:11 - 54:11
    Left of the partition element.
    Otherwise, we recursively
  • 54:16 - 54:16
    select the i minus k-th smallest
    element in the upper part of the
  • 54:23 - 54:23
    array.
    I am writing this at a high
  • 54:26 - 54:26
    level because we've already seen
    it.
  • 54:30 - 54:30
    All of this is the same as the
    last couple steps of randomized
  • 54:36 - 54:36
    select.
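    For reference, a hedged Python sketch of the whole algorithm.
    It follows the steps on the board but makes arbitrary
    implementation choices: it partitions by value into fresh lists
    rather than in place, and it assumes distinct elements:

        def select(A, i):
            """Return the i-th smallest element of A (i is 1-based),
            in worst-case linear time."""
            n = len(A)
            if n <= 5:                        # small base case: just sort
                return sorted(A)[i - 1]
            # Step 1: medians of the floor(n/5) full groups of five.
            medians = [sorted(A[j:j + 5])[2]
                       for j in range(0, n - n % 5, 5)]
            # Step 2: recursively find x, the median of the medians.
            x = select(medians, (len(medians) + 1) // 2)
            # Step 3: partition around x.
            lo = [a for a in A if a < x]
            hi = [a for a in A if a > x]
            k = len(lo) + 1                   # rank of x in A
            # Steps 4-5: recurse on the side that contains rank i.
            if i == k:
                return x
            elif i < k:
                return select(lo, i)
            else:
                return select(hi, i - k)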
  • 54:45 - 54:45
    That is the algorithm.
    The real question is why does
  • 54:48 - 54:48
    it work?
    Why is this linear time?
  • 54:50 - 54:50
    The first question is what's
    the recurrence?
  • 54:53 - 54:53
    We cannot quite write it down
    yet because we don't know how
  • 54:57 - 54:57
    big these recursive subproblems
    could be.
  • 55:00 - 55:00
    We're going to either recurse
    in the lower part or the upper
  • 55:04 - 55:04
    part, that's just like before.
    If we're unlucky and we have a
  • 55:08 - 55:08
    split of like zero to n minus
    one, this is going to be a
  • 55:12 - 55:12
    quadratic time algorithm.
    The claim is that this
  • 55:15 - 55:15
    partition element is guaranteed
    to be pretty good and good
  • 55:19 - 55:19
    enough.
    The running time of this thing
  • 55:21 - 55:21
    will be T of something times n,
    and we don't know what the
  • 55:25 - 55:25
    something is yet.
    How big could it be?
  • 55:27 - 55:27
    Well, I could ask you.
    But we're sort of indirect here
  • 55:32 - 55:32
    so I will tell you.
    We have already a recursive
  • 55:35 - 55:35
    call of T of n over 5.
    It better be that whatever
  • 55:38 - 55:38
    constant, so it's going to be
    something times n,
  • 55:41 - 55:41
    it better be that that constant
    is strictly less than 4/5.
  • 55:45 - 55:45
    If it's equal to 4/5 then
    you're not splitting up the
  • 55:48 - 55:48
    problem enough to get an n lg n
    running time.
  • 55:51 - 55:51
    If it's strictly less than 4/5
    then you're reducing the problem
  • 55:55 - 55:55
    by at least a constant factor.
    In the sense if you add up all
  • 56:00 - 56:00
    the recursive subproblems,
    n over 5 and something times n,
  • 56:04 - 56:04
    you get something that is a
    constant strictly less than one
  • 56:07 - 56:07
    times n.
    That forces the work to be
  • 56:10 - 56:10
    geometric.
    If it's geometric you're going
  • 56:12 - 56:12
    to get linear time.
    So this is intuition but it's
  • 56:15 - 56:15
    the right intuition.
    Whenever you're aiming for
  • 56:18 - 56:18
    linear time keep that in mind.
    If you're doing a
  • 56:21 - 56:21
    divide-and-conquer,
    you've got to get the total
  • 56:24 - 56:24
    subproblem size to be some
    constant less than one times n.
  • 56:28 - 56:28
    That will work.
    OK, so we've got to work out
  • 56:33 - 56:33
    this constant here.
    And we're going to use this
  • 56:37 - 56:37
    figure, which so far looks
    surprisingly uncluttered.
  • 56:42 - 56:42
    Now we will make it cluttered.
    What I would like to do is draw
  • 56:48 - 56:48
    an arrow between two vertices,
    two points, elements,
  • 56:54 - 56:54
    whatever you want to call them.
    Let's call them a and b.
  • 57:00 - 57:00
    And I want to orient the arrow
    so it points to a larger value,
  • 57:04 - 57:04
    so this means that a is less
    than b.
  • 57:07 - 57:07
    This is notation just for the
    diagram.
  • 57:10 - 57:10
    And so this element,
    I am going to write down what I
  • 57:13 - 57:13
    know.
    This element is the median of
  • 57:16 - 57:16
    these five elements.
    I will suppose that it is drawn
  • 57:19 - 57:19
    so that these elements are
    larger than the median,
  • 57:23 - 57:23
    these elements are smaller than
    the median.
  • 57:26 - 57:26
    Therefore, I have arrows like
    this.
  • 57:28 - 57:28
    Here is where I wish I had some
    colored chalk.
  • 57:33 - 57:33
    This is just stating this guy
    is in the middle of those five
  • 57:37 - 57:37
    elements.
    I know that in every single
  • 57:39 - 57:39
    column.
  • 57:55 - 57:55
    Here is where the diagram
    starts to get messy.
  • 57:58 - 57:58
    I am not done yet.
    Now, we also know that this
  • 58:02 - 58:02
    element is the median of the
    medians.
  • 58:04 - 58:04
    Of all the squared elements,
    this guy is the middle.
  • 58:07 - 58:07
    And I will draw it so that
    these are the ones smaller than
  • 58:10 - 58:10
    the median, these are the ones
    larger than the median.
  • 58:13 - 58:13
    I mean the algorithm cannot do
    this.
  • 58:15 - 58:15
    It doesn't necessarily know how
    all this works.
  • 58:18 - 58:18
    I guess it could,
    but this is just for analysis
  • 58:21 - 58:21
    purposes.
    We know this guy is bigger than
  • 58:23 - 58:23
    that one and bigger than that
    one.
  • 58:25 - 58:25
    We don't directly know about
    the other elements.
  • 58:29 - 58:29
    We just know that that one is
    bigger than both of those and
  • 58:33 - 58:33
    this guy is smaller than these.
    Now, that is as messy as the
  • 58:37 - 58:37
    figure will get.
    Now, the nice thing about less
  • 58:41 - 58:41
    than is that it's a transitive
    relation.
  • 58:43 - 58:43
    If I have a directed path in
    this graph, I know that this
  • 58:47 - 58:47
    element is strictly less than
    that element because this is
  • 58:51 - 58:51
    less than that one and this is
    less than that one.
  • 58:55 - 58:55
    Even though directly I only
    know within a column and within
  • 58:59 - 58:59
    this middle row,
    I actually know that this
  • 59:02 - 59:02
    element --
    This is x, by the way.
  • 59:06 - 59:06
    This element is larger than all
    of these elements because it's
  • 59:11 - 59:11
    larger than this one and this
    one and each of these is larger
  • 59:15 - 59:15
    than all of those by these
    arrows.
  • 59:18 - 59:18
    I also know that all of these
    elements in this rectangle here,
  • 59:23 - 59:23
    and you don't have to do this
    but I will make the background
  • 59:28 - 59:28
    even more cluttered.
    All of these elements in this
  • 59:33 - 59:33
    rectangle are greater than or
    equal to this one and all of the
  • 59:38 - 59:38
    elements in this rectangle are
    less than or equal to x.
  • 59:43 - 59:43
    Now, how many are there?
    Well, this is roughly halfway
  • 59:47 - 59:47
    along the set of groups and this
    is 3/5 of these columns.
  • 59:52 - 59:52
    So what we get is that there
    are at least --
  • 59:57 - 59:57
    We have n over 5 groups and we
    have half of the groups that
  • 60:04 - 60:04
    we're looking at here roughly,
    so let's call that floor of n
  • 60:10 - 60:10
    over 5, over 2, and then within each
    group we have three elements.
  • 60:17 - 60:17
    So we have at least 3 times
    floor of floor of n over 5,
  • 60:23 - 60:23
    over 2, elements that are less
    than or equal to x.
  • 60:30 - 60:30
    And we have the same that are
    greater than or equal to x.
  • 60:36 - 60:36
    Let me simplify this a little
    bit more.
  • 60:40 - 60:40
    I can also give you some more
    justification,
  • 60:45 - 60:45
    beyond the picture we drew,
    for why this is true.
  • 60:51 - 60:51
    We have at least n over 5 over
    2 group medians that are less
  • 60:58 - 60:58
    than or equal to x.
    This is the argument we use.
  • 61:03 - 61:03
    We have half of the group
    medians are less than or equal
  • 61:06 - 61:06
    to x because x is the median of
    the group median,
  • 61:09 - 61:09
    so that is no big surprise.
    This is almost an equality but
  • 61:12 - 61:12
    we're making floors so it's
    greater than or equal to.
  • 61:15 - 61:15
    And then, for each group
    median, we know that there are
  • 61:18 - 61:18
    three elements there that are
    less than or equal to that group
  • 61:22 - 61:22
    median.
    So, by transitivity,
  • 61:23 - 61:23
    they're also less than or equal
    to x.
  • 61:25 - 61:25
    We get this number times three.
    This is actually just floor of
  • 61:31 - 61:31
    n over 10.
    I was being unnecessarily
  • 61:34 - 61:34
    complicated there,
    but that is where it came from.
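
The simplification uses the standard nested-floor identity for positive integers:

```latex
% floor(floor(n/a)/b) = floor(n/(ab)); here a = 5 and b = 2.
\left\lfloor \frac{\lfloor n/5 \rfloor}{2} \right\rfloor = \left\lfloor \frac{n}{10} \right\rfloor
```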
  • 61:38 - 61:38
    What we know is that this thing
    is now at least 3 times n over
  • 61:44 - 61:44
    10, which means roughly 3/10 of
    the elements are on one side.
  • 61:48 - 61:48
    In fact, at least 3/10 of the
    elements are on each side.
  • 61:53 - 61:53
    Therefore, each side has at
    most 7/10 of the elements, roughly.
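
Written out as a sketch: at least 3 times floor of n over 10 elements land on each side, so the side we recurse into has size at most

```latex
n - 3\left\lfloor \frac{n}{10} \right\rfloor
\;\le\; n - 3\left(\frac{n}{10} - 1\right)
\;=\; \frac{7n}{10} + 3
```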
  • 61:59 - 61:59
    So the number here will be
    7/10.
  • 62:01 - 62:01
    And, if I'm lucky,
    7/10 plus 1/5 is strictly less
  • 62:05 - 62:05
    than one.
    I believe it is,
  • 62:06 - 62:06
    but I have trouble working with
    tenths.
  • 62:09 - 62:09
    I can only handle powers of
    two.
  • 62:11 - 62:11
    What we're going to use is a
    minor simplification,
  • 62:15 - 62:15
    which just barely still works,
    is a little bit easier to think
  • 62:19 - 62:19
    about.
    It's mainly to get rid of this
  • 62:22 - 62:22
    floor because the floor is
    annoying.
  • 62:24 - 62:24
    And we don't really have a
    sloppiness lemma that applies
  • 62:28 - 62:28
    here.
    It turns out if n is
  • 62:31 - 62:31
    sufficiently large,
    3 times floor of n over 10 is
  • 62:35 - 62:35
    greater than or equal to n over 4.
    Quarters I can handle.
  • 62:39 - 62:39
    The claim is that each side
    has size at least n/4,
  • 62:42 - 62:42
    therefore each side has size
    at most 3n/4 because there's a
  • 62:47 - 62:47
    quarter on the other side.
    This will be 3/4.
  • 62:49 - 62:49
    And I can definitely tell that
    1/5 is less than 1/4.
  • 62:53 - 62:53
    This is going to add up to
    something strictly less than one
  • 62:57 - 62:57
    and then it will work.
    How is my time?
  • 63:01 - 63:01
    Good.
    At this point,
  • 63:03 - 63:03
    the rest of the analysis is
    easy.
  • 63:06 - 63:06
    How the heck you would come up
    with this algorithm is another matter;
  • 63:10 - 63:10
    you'd have to realize that this is
    clearly a really good choice for
  • 63:15 - 63:15
    finding a partition element,
    just barely good enough that
  • 63:20 - 63:20
    both recursions add up to linear
    time.
  • 63:23 - 63:23
    Well, that's why it took so
    many famous people.
  • 63:28 - 63:28
    Especially in quizzes,
    but I think in general in this
  • 63:31 - 63:31
    class, you won't have to come up
    with an algorithm this clever
  • 63:34 - 63:34
    because you can just use this
    algorithm to find the median.
  • 63:38 - 63:38
    And the median is a really good
    partition element.
  • 63:40 - 63:40
    Now that you know this
    algorithm, now that we're beyond
  • 63:43 - 63:43
    1973, you don't need to know how
    to do this.
  • 63:46 - 63:46
    I mean you should know how this
    algorithm works,
  • 63:48 - 63:48
    but you don't need to do this
    in another algorithm because you
  • 63:52 - 63:52
    can just say run this algorithm,
    you will get the median in
  • 63:55 - 63:55
    linear time, and then you can
    partition to the left and the
  • 63:59 - 63:59
    right.
    And then the left and the right
  • 64:02 - 64:02
    will have exactly equal size.
    Great.
  • 64:05 - 64:05
    This is a really powerful
    subroutine.
  • 64:07 - 64:07
    You could use this all over the
    place, and you will on Friday.
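
Putting the whole lecture together, here is a minimal Python sketch of the deterministic select algorithm described here. The function name `select` is an illustrative choice, and the base-case cutoff of 50 follows the bound used above.

```python
import random

# Minimal sketch of deterministic select (median of medians):
# select(items, k) returns the k-th smallest element (k is 1-based).

def select(items, k):
    if len(items) <= 50:                       # base case from the analysis
        return sorted(items)[k - 1]

    # Step 1: split into groups of 5 and take each group's median (linear).
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]

    # Step 2: recursively find the median of the medians: the T(n/5) call.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x (linear time).
    less = [e for e in items if e < x]
    equal = [e for e in items if e == x]
    greater = [e for e in items if e > x]

    # Step 4: one recursive call, on a side of size at most roughly 3n/4.
    if k <= len(less):
        return select(less, k)
    if k <= len(less) + len(equal):
        return x
    return select(greater, k - len(less) - len(equal))

# Example: the median (rank 51) of a shuffled 0..100 is 50.
a = list(range(101))
random.shuffle(a)
assert select(a, 51) == 50
```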
  • 64:12 - 64:12
    Have I analyzed the running
    time pretty much?
  • 64:15 - 64:15
    The first step is linear.
    The second step is T of n over
  • 64:19 - 64:19
    5.
    The third step,
  • 64:20 - 64:20
    I didn't write it,
    is linear.
  • 64:22 - 64:22
    And then the last step is just
    a recursive call.
  • 64:25 - 64:25
    And now we know that this is
    3/4.
  • 64:34 - 64:34
    I get this recurrence.
    T of n is, I'll say at most,
  • 64:40 - 64:40
    T of n over 5 plus T of 3/4n.
    You could have also used 7/10.
  • 64:47 - 64:47
    It would give the same answer,
    but you would also need a floor
  • 64:54 - 64:54
    so we won't do that.
    I claim that this is linear.
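
In symbols, with the linear work from the other steps included, the recurrence is:

```latex
T(n) \;\le\; T\!\left(\frac{n}{5}\right) + T\!\left(\frac{3n}{4}\right) + \Theta(n)
```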
  • 65:01 - 65:01
    How should I prove it?
    Substitution.
  • 65:12 - 65:12
    Claim that T of n is at most
    again c times n,
  • 65:16 - 65:16
    that will be enough.
    Proof is by substitution.
  • 65:20 - 65:20
    Again, we assume this is true
    for smaller n.
  • 65:24 - 65:24
    And we want to prove it for n.
    We have T of n is at most this
  • 65:29 - 65:29
    thing.
    T of n over 5.
  • 65:31 - 65:31
    And by induction,
    because n over 5 is smaller than
  • 65:36 - 65:36
    n, we know that this is at most
    c times n over 5.
  • 65:40 - 65:40
    Let me write it as c over 5
    times n.
  • 65:44 - 65:44
    Sure, why not.
    Then we have here 3/4cn.
  • 65:48 - 65:48
    And then we have a linear term.
    Now, unfortunately,
  • 65:53 - 65:53
    I have to deal with things that
    are not powers of two.
  • 66:00 - 66:00
    I will cheat and look at my
    notes.
  • 66:02 - 66:02
    This is also known as 19/20
    times c times n plus theta n.
  • 66:07 - 66:07
    And the point is just that this
    is strictly less than one.
  • 66:11 - 66:11
    Because it's strictly less than
    one, I can write this as one
  • 66:15 - 66:15
    times c times n minus some
    constant, here it happens to be
  • 66:19 - 66:19
    1/20, as long as I have
    something left over here,
  • 66:23 - 66:23
    1/20 times c times n.
    Then I have this annoying theta
  • 66:27 - 66:27
    n term which I want to get rid
    of because I want this to be
  • 66:31 - 66:31
    nonnegative.
    But it is nonnegative,
  • 66:35 - 66:35
    as long as I set c to be
    really, really large,
  • 66:38 - 66:38
    at least 20 times whatever
    constant is here.
  • 66:42 - 66:42
    So this is at most c times n
    for c sufficiently large.
  • 66:46 - 66:46
    And, oh, by the way,
    if n is less than or equal to
  • 66:50 - 66:50
    50, which we used up here,
    then T of n is a constant,
  • 66:54 - 66:54
    it doesn't really matter what
    you do, and T of n is at most c
  • 66:59 - 66:59
    times n for c sufficiently
    large.
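
The arithmetic of that substitution step, written out:

```latex
T(n) \;\le\; \frac{c}{5}\,n + \frac{3}{4}\,c\,n + \Theta(n)
     \;=\; \frac{19}{20}\,c\,n + \Theta(n)
     \;=\; c\,n - \left(\frac{1}{20}\,c\,n - \Theta(n)\right)
     \;\le\; c\,n
\quad \text{once } c \ge 20 \cdot (\text{the constant hidden in } \Theta(n))
```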
  • 67:03 - 67:03
    That proves this claim.
    Of course, the constant here is
  • 67:06 - 67:06
    pretty damn big.
    It depends on exactly what the
  • 67:06 - 67:06
    constants in the running times
    are, which depends on your
  • 67:12 - 67:12
    machine, but practically this
    algorithm is not so hot because
  • 67:15 - 67:15
    the constants are pretty big.
    Even though this element is
  • 67:18 - 67:18
    guaranteed to be somewhere
    vaguely in the middle,
  • 67:21 - 67:21
    and even though these
    recursions add up to strictly
  • 67:24 - 67:24
    less than n and it's geometric,
    it's geometric because the
  • 67:27 - 67:27
    problem is only shrinking to
    19/20 of its size each time.
  • 67:31 - 67:31
    So it actually takes a while
    for the problem to get really
  • 67:35 - 67:35
    small.
    Practically you probably don't
  • 67:37 - 67:37
    want to use this algorithm
    unless you cannot somehow flip
  • 67:41 - 67:41
    coins.
    The randomized algorithm works
  • 67:43 - 67:43
    really, really fast.
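
For contrast, a minimal sketch of that randomized version (the name `randomized_select` is mine): pick the pivot uniformly at random instead of computing a median of medians. Expected linear time, but no worst-case guarantee.

```python
import random

def randomized_select(items, k):
    """Return the k-th smallest element (k is 1-based), expected linear time."""
    x = random.choice(items)                  # random pivot instead of median of medians
    less = [e for e in items if e < x]
    equal = [e for e in items if e == x]
    greater = [e for e in items if e > x]
    if k <= len(less):
        return randomized_select(less, k)
    if k <= len(less) + len(equal):
        return x
    return randomized_select(greater, k - len(less) - len(equal))
```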
    Theoretically this is your
  • 67:46 - 67:46
    dream, the best you could hope
    for because it's linear time,
  • 67:50 - 67:50
    and it's guaranteed,
    worst-case linear time.
  • 67:53 - 67:53
    I will mention,
    before we end,
  • 67:55 - 67:55
    an exercise.
  • 68:03 - 68:03
    Why did we use groups of five?
    Why not groups of three?
  • 68:06 - 68:06
    As you might guess,
    the answer is because it
  • 68:09 - 68:09
    doesn't work with groups of
    three.
  • 68:11 - 68:11
    But it's quite instructive to
    find out why.
  • 68:14 - 68:14
    If you work through this math
    with groups of three instead of
  • 68:18 - 68:18
    groups of five,
    you will find that you don't
  • 68:20 - 68:20
    quite get the problem reduction
    that you need.
  • 68:23 - 68:23
    Five is the smallest number for
    which this works.
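
Sketching where the exercise leads: with groups of three, only about n over 3 elements are guaranteed on each side, so the side you recurse into can have size 2n over 3, and the subproblem fractions sum to exactly one:

```latex
T(n) \;\le\; T\!\left(\frac{n}{3}\right) + T\!\left(\frac{2n}{3}\right) + \Theta(n)
\;\Longrightarrow\; T(n) = \Theta(n \log n)
```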
  • 68:27 - 68:27
    It would work with seven,
    but theoretically not any
  • 68:30 - 68:30
    better than a constant factor.
    Any questions?
  • 68:33 - 68:33
    All right.
    Then recitation Friday.
  • 68:35 - 68:35
    Homework lab Sunday.
    Problem set due Monday.
  • 68:38 - 68:38
    Quiz one in two weeks.