Lec 16 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005

  • 0:07 - 0:07
    -- valuable experience.
    OK, today we're going to start
  • 0:12 - 0:12
    talking about a particular class
    of algorithms called greedy
  • 0:18 - 0:18
    algorithms.
    But we're going to do it in the
  • 0:23 - 0:23
    context of graphs.
    So, I want to review a little
  • 0:28 - 0:28
    bit about graphs,
    which mostly you can find in
  • 0:32 - 0:32
    the textbook in appendix B.
    And so, if you haven't reviewed
  • 0:40 - 0:40
    in appendix B recently,
    please sit down and review
  • 0:46 - 0:46
    appendix B.
    It will pay off especially
  • 0:50 - 0:50
    during our take-home quiz.
    So, just reminder,
  • 0:56 - 0:56
    a digraph, what's a digraph?
    What's that short for?
  • 1:02 - 1:02
    Directed graph,
    OK?
  • 1:04 - 1:04
    Directed graph,
    G equals (V,E),
  • 1:08 - 1:08
    OK, has a set,
    V, of vertices.
  • 1:13 - 1:13
    And, I always get people
    telling me that I have one
  • 1:17 - 1:17
    vertice.
    The singular is not vertice;
  • 1:21 - 1:21
    it is vertex,
    OK?
  • 1:22 - 1:22
    The plural is vertices.
    The singular is vertex.
  • 1:26 - 1:26
    It's one of those weird English
    words.
  • 1:29 - 1:29
    It's probably originally like
    French or something,
  • 1:33 - 1:33
    right?
    I don't know.
  • 1:37 - 1:37
    OK, anyway, and we have a set,
    E, which is a subset of V cross
  • 1:48 - 1:48
    V of edges.
    So that's a digraph.
  • 1:53 - 1:53
    In an undirected graph,
    E contains unordered pairs.
  • 2:12 - 2:12
    OK, and, sorry?
    It's Latin, OK,
  • 2:16 - 2:16
    so it's probably pretty old,
    then, in English.
  • 2:21 - 2:21
    I guess the vertex would be a
    little bit of a giveaway that
  • 2:28 - 2:28
    maybe it wasn't French.
    It started to be used in 1570,
  • 2:35 - 2:35
    OK.
    OK, good, OK,
  • 2:39 - 2:39
    so the number of edges,
    whether it's directed or
  • 2:52 - 2:52
    undirected, is O of what?
    V^2, good.
  • 3:02 - 3:02
    OK, and one of the conventions
    that we'll have when we're
  • 3:06 - 3:06
    dealing, once we get into
    graphs, we deal a lot with sets.
  • 3:09 - 3:09
    We generally drop the vertical
    bar notation within O's just
  • 3:13 - 3:13
    because it's applied.
    It just makes it messier.
  • 3:16 - 3:16
    So, once again,
    another abuse of notation.
  • 3:19 - 3:19
    It really should be order the
    size of V^2, but it just messes
  • 3:23 - 3:23
    up, I mean, it's just more stuff
    to write down.
  • 3:26 - 3:26
    And, you're multiplying these
    things, and all those vertical
  • 3:30 - 3:30
    bars.
    Since the vertical bars don't
  • 3:33 - 3:33
    even add any sense there,
    it just gets messy.
  • 3:37 - 3:37
    So, we just drop the vertical
    bars there when it's in
  • 3:41 - 3:41
    asymptotic notation.
    So, E is order V^2 when it's a
  • 3:45 - 3:45
    set of pairs,
    because if it's a set of pairs,
  • 3:48 - 3:48
    it's at most n choose two,
    which is at most n^2
  • 3:53 - 3:53
    over 2, sorry,
    V^2 over 2; for ordered pairs,
  • 3:56 - 3:56
    it's at most
    V^2.
  • 4:00 - 4:00
    And then, another property that
    sometimes comes up is if the G
  • 4:08 - 4:08
    is connected,
    we have another bound,
  • 4:12 - 4:12
    implies that the size of E is
    at least the size of V minus
  • 4:19 - 4:19
    one.
    OK, so if it's connected,
  • 4:23 - 4:23
    meaning, what does it mean to
    have a graph that's connected?
  • 4:31 - 4:31
    Yeah, there's a path from any
    vertex to any other vertex in
  • 4:37 - 4:37
    the graph.
    That's what it means to be
  • 4:41 - 4:41
    connected.
    So if that's the case,
  • 4:44 - 4:44
    that the number of edges is at
    least the number of vertices
  • 4:50 - 4:50
    minus one, OK?
    And so, what that says,
  • 4:54 - 4:54
    so one of the things we'll get
    into, a fact that I just wanted
  • 5:00 - 5:00
    to remind you,
    is that in that case,
  • 5:04 - 5:04
    if I look at log E,
    OK, log of the number of edges,
  • 5:09 - 5:09
    that is O of log V.
    And by this bound,
  • 5:14 - 5:14
    it is omega of log V.
    So, it's equal to theta of log
  • 5:18 - 5:18
    V.
    OK, so basically the number of,
  • 5:21 - 5:21
    in the case of a connected
    graph, the number of edges,
  • 5:26 - 5:26
    and the number of vertices are
    polynomially related.
  • 5:31 - 5:31
    So, their logs are comparable.
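
Spelled out, the two bounds just used give, as a quick sketch:

```latex
% Connected: |E| \ge |V| - 1. Always: |E| \le |V|^2.
\log(|V| - 1) \;\le\; \log|E| \;\le\; 2\log|V|
\quad\Longrightarrow\quad \log E = \Theta(\log V).
```
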
    OK, so that's helpful just to
  • 5:37 - 5:37
    know because sometimes I just
    get questions later on where
  • 5:42 - 5:42
    people will say,
    oh, you showed it was log E but
  • 5:46 - 5:46
    you didn't show it was log V.
    And I could point out that it's
  • 5:51 - 5:51
    the same thing.
    OK, so there's various ways of
  • 5:55 - 5:55
    representing graphs in
    computers, and I'm just going to
  • 6:00 - 6:00
    cover a couple of the important
    ones.
  • 6:04 - 6:04
    There's actually more.
    We'll see some more.
  • 6:11 - 6:11
    So, the simplest one is what's
    called an adjacency matrix.
  • 6:21 - 6:21
    An adjacency matrix of the
    graph, G, equals (V,E),
  • 6:30 - 6:30
    where, for simplicity,
    I'll let V be the set of
  • 6:40 - 6:40
    integers from one up to n,
    OK, is the n by n matrix A
  • 6:52 - 6:52
    given by: the ij-th entry
    is simply one if the edge,
  • 7:05 - 7:05
    ij, is in the edge set and zero
    if ij is not in the edge set.
  • 7:20 - 7:20
    OK, so it's simply the matrix
    where you say,
  • 7:23 - 7:23
    the ij entry is one if the edge
    is in the graph.
  • 7:26 - 7:26
    So, this is,
    in some sense,
  • 7:28 - 7:28
    giving you the predicate for,
    is there an edge from i to j?
  • 7:32 - 7:32
    OK, remember,
    a predicate is a Boolean formula
  • 7:36 - 7:36
    that is either zero or one,
    and in this case,
  • 7:40 - 7:40
    you're saying it's one if there
    is an edge from i to j and zero
  • 7:45 - 7:45
    otherwise.
    OK, sometimes you have edge
  • 7:49 - 7:49
    weighted graphs,
    and then sometimes what people
  • 7:53 - 7:53
    will do is replace this by edge
    weights.
  • 7:56 - 7:56
    OK, it will be the weight of
    the edge from i to j.
  • 8:02 - 8:02
    So, let's just do an example of
    that just to make sure that our
  • 8:13 - 8:13
    intuition corresponds to our
    mathematical definitions.
  • 8:22 - 8:22
    So, here's an example graph.
    Let's say that's our graph.
  • 8:33 - 8:33
    So let's just draw the
    adjacency matrix.
  • 8:37 - 8:37
    OK, so what this says:
    is there an edge from one to
  • 8:42 - 8:42
    one?
    And the answer is no.
  • 8:44 - 8:44
    Is there an edge from one to
    two?
  • 8:47 - 8:47
    Yes.
    Is there an edge from one to
  • 8:50 - 8:50
    three here?
    Yep.
  • 8:52 - 8:52
    Is there an edge for one to
    four?
  • 8:55 - 8:55
    No.
    Is there an edge from two to
  • 8:58 - 8:58
    one?
    No.
  • 9:00 - 9:00
    Two to two?
    No.
  • 9:02 - 9:02
    Two to three?
    Yes.
  • 9:05 - 9:05
    Two to four?
    No.
  • 9:07 - 9:07
    No edges going out of three.
    Edge from four to three,
  • 9:14 - 9:14
    and that's it.
    That's the adjacency matrix for
  • 9:20 - 9:20
    this particular graph,
    OK?
  • 9:23 - 9:23
    And so, I can represent a graph
    as this adjacency matrix.
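
Here is a minimal Python sketch of that worked example: vertices 1 through 4 with the directed edges read off above (1 to 2, 1 to 3, 2 to 3, 4 to 3).

```python
# Build the 4-by-4 adjacency matrix for the example digraph:
# A[i][j] = 1 iff (i, j) is an edge (vertices numbered 1..n).
n = 4
edges = [(1, 2), (1, 3), (2, 3), (4, 3)]

A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i - 1][j - 1] = 1

for row in A:
    print(row)
# [0, 1, 1, 0]
# [0, 0, 1, 0]
# [0, 0, 0, 0]
# [0, 0, 1, 0]
```
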
  • 9:33 - 9:33
    OK, when I represent it in this
    way, how much storage do I need?
  • 9:43 - 9:43
    OK, n^2 or V^2, because the size
    of V is the same thing as n; so it's V^2
  • 9:53 - 9:53
    storage, OK, and that's what we
    call a dense representation.
  • 10:03 - 10:03
    OK, it works well when the
    graph is dense.
  • 10:06 - 10:06
    So, the graph is dense if the
    number of edges is close to all
  • 10:11 - 10:11
    of the edges possible.
    OK, then this is a good
  • 10:15 - 10:15
    representation.
    But for many types of graphs,
  • 10:18 - 10:18
    the number of edges is much
    less than the possible number of
  • 10:23 - 10:23
    edges, in which case we say the
    graph is sparse.
  • 10:27 - 10:27
    Can somebody give me an example
    of a sparse graph?
  • 10:32 - 10:32
    A class of graphs:
    so, I want a class of graphs
  • 10:36 - 10:36
    that as n grows,
    the number of edges in the
  • 10:39 - 10:39
    graph doesn't grow as the
    square, but grows rather as
  • 10:43 - 10:43
    something much smaller.
    A linked list,
  • 10:46 - 10:46
    so, a chain,
    OK, if you look at it from a
  • 10:49 - 10:49
    graph theoretically,
    is a perfectly good example:
  • 10:52 - 10:52
    only n edges in the chain for a
    chain of length n.
  • 10:56 - 10:56
    So therefore,
    the number of edges would be
  • 10:59 - 10:59
    order V.
    And in particular,
  • 11:03 - 11:03
    you'd only have one edge per
    row here.
  • 11:07 - 11:07
    What other graphs are sparse?
    Yeah?
  • 11:11 - 11:11
    Good, a planar graph,
    a graph that can be drawn in a
  • 11:16 - 11:16
    plane. It turns out that if it has V
    vertices,
  • 11:21 - 11:21
    and V is at least three,
    then it has,
  • 11:25 - 11:25
    at most, three V minus six
    edges.
  • 11:30 - 11:30
    So, it turns out that's order V
    edges again.
  • 11:34 - 11:34
    What's another example of a
    common graph?
  • 11:38 - 11:38
    Yeah, binary tree,
    or even actually any tree,
  • 11:43 - 11:43
    you know, what's called a free
    tree if you read the appendix,
  • 11:49 - 11:49
    OK, a tree that just is a
    connected graph that has no
  • 11:54 - 11:54
    cycles, OK, is another example.
    What's an example of a graph
  • 12:00 - 12:00
    that's dense?
    A complete graph,
  • 12:04 - 12:04
    OK: it's all ones,
    OK, or if you have edge
  • 12:08 - 12:08
    weights, it would be a
    completely filled in matrix.
  • 12:12 - 12:12
    OK, good.
    So, this is good for dense
  • 12:14 - 12:14
    representation.
    But sometimes you want to have
  • 12:18 - 12:18
    a sparse representation so we
    don't have to spend V^2 space to
  • 12:23 - 12:23
    deal with all of the,
    where most of it's going to be
  • 12:27 - 12:27
    zeroes.
    OK, it's sort of like,
  • 12:29 - 12:29
    if we know it's zero,
    why bother representing it as
  • 12:33 - 12:33
    zero?
    So, one such representation is
  • 12:41 - 12:41
    an adjacency list
    representation.
  • 12:47 - 12:47
    Actually, the adjacency list of a
    given vertex is the list,
  • 12:57 - 12:57
    which we denote by Adj of V,
    of vertices adjacent to V.
  • 13:08 - 13:08
    OK, just in terms of their
    terminology, vertices are
  • 13:13 - 13:13
    adjacent, but edges are incident
    on vertices.
  • 13:17 - 13:17
    OK, so the incidence is a
    relation between a vertex and an
  • 13:22 - 13:22
    edge.
    An adjacency is a relation
  • 13:25 - 13:25
    between two vertices.
    OK, that's just the language.
  • 13:31 - 13:31
    Why they use two different
    terms, I don't know,
  • 13:36 - 13:36
    but that's what they do.
    So, in the graph,
  • 13:41 - 13:41
    for example,
    the adjacency list for vertex
  • 13:46 - 13:46
    one is just the list, or the set,
    {two, three}, because the edges
  • 13:53 - 13:53
    going out of one are edges to
    two and three.
  • 13:58 - 13:58
    The adjacency list for two is
    just three. For
  • 14:03 - 14:03
    three,
    it's the empty set,
  • 14:07 - 14:07
    and for four,
    it is three.
  • 14:10 - 14:10
    OK, so that's the
    representation.
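
And the same example graph in the adjacency-list representation, as a small Python sketch:

```python
# Adj[v] is the list of vertices adjacent to v (the out-neighbors,
# since this example is a digraph). Storage is Theta(V + E).
Adj = {
    1: [2, 3],
    2: [3],
    3: [],
    4: [3],
}
```
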
  • 14:14 - 14:14
    Now, if we want to figure out
    how much storage is required for
  • 14:21 - 14:21
    this representation,
    OK, we need to understand how
  • 14:27 - 14:27
    long the adjacency list is.
    So, what is the length of an
  • 14:36 - 14:36
    adjacency list of a vertex,
    V?
  • 14:41 - 14:41
    What name do we give to that?
    It's the degree.
  • 14:48 - 14:48
    So, in an undirected graph,
    we call it the degree of the
  • 14:58 - 14:58
    vertex.
    This is undirected.
  • 15:02 - 15:02
    OK, about here,
    OK.
  • 15:07 - 15:07
    So that's an undirected case.
    In the directed case,
  • 15:12 - 15:12
    OK, actually I guess the way we
    should do this is say this.
  • 15:18 - 15:18
    If it's directed,
    we call it the out degree for a
  • 15:23 - 15:23
    digraph.
    OK, so in a digraph,
  • 15:26 - 15:26
    we have an out degree and an in
    degree for each vertex.
  • 15:32 - 15:32
    So here, the in degree is
    three.
  • 15:37 - 15:37
    Here, the out degree is two,
    OK?
  • 15:41 - 15:41
    So, one of the important lemmas
    that comes up is what's called
  • 15:50 - 15:50
    the handshaking lemma.
    OK, it's one of these
  • 15:57 - 15:57
    mathematical lemmas.
  • 16:11 - 16:11
    And so, it comes from a story.
    Go to a dinner party,
  • 16:16 - 16:16
    and everybody at the dinner
    party shakes other people's
  • 16:22 - 16:22
    hands.
    Some people may not shake
  • 16:25 - 16:25
    anybody's hand.
    Some people may shake several
  • 16:29 - 16:29
    people's hands.
    Nobody shakes hands with
  • 16:33 - 16:33
    themselves.
    And at some point during the
  • 16:37 - 16:37
    dinner party,
    the host goes around and counts
  • 16:41 - 16:41
    up the sum
    of the number of hands that
  • 16:45 - 16:45
    each person has shaken.
    OK, so he says,
  • 16:48 - 16:48
    how many did you shake?
    How many did you shake?
  • 16:52 - 16:52
    How many did you shake?
    He adds them up,
  • 16:55 - 16:55
    OK, and that number is
    guaranteed to be even.
  • 17:00 - 17:00
    OK, that's the handshaking
    lemma.
  • 17:03 - 17:03
    Or, stated a little bit more
    precisely, if I take for any
  • 17:09 - 17:09
    graph the degrees of the vertices,
    and sum them all up,
  • 17:14 - 17:14
    that's how many hands everybody
    shook, OK, that's actually equal
  • 17:20 - 17:20
    to always twice the number of
    edges.
  • 17:24 - 17:24
    So, why is that going to be
    true?
  • 17:27 - 17:27
    Why is that going to be twice
    the number of edges?
  • 17:33 - 17:33
    Yeah?
    Yeah.
  • 17:34 - 17:34
    Every time you put in an edge,
    you add one to the degree of
  • 17:39 - 17:39
    each person on each end.
    So, it's just two different
  • 17:44 - 17:44
    ways of counting up the same
    number of edges.
  • 17:48 - 17:48
    OK, I can go around,
    and if you imagine that,
  • 17:52 - 17:52
    that every time I count the
    degree of the node,
  • 17:57 - 17:57
    I put a mark on every edge.
    Then, when I'm done,
  • 18:01 - 18:01
    every edge has two marks on it,
    one for each end.
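
A quick sanity check of the handshaking lemma in Python, on a small hypothetical undirected graph (this edge list is invented for illustration):

```python
# Sum of degrees equals twice the number of edges: each edge
# contributes one to the degree at each of its two endpoints.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]

degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

assert sum(degree.values()) == 2 * len(edges)
```
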
  • 18:07 - 18:07
    OK: a pretty simple theorem.
    So, what that says is that for
  • 18:18 - 18:18
    undirected graphs,
    that implies that the adjacency
  • 18:26 - 18:26
    list representation,
    uses how much storage?
  • 18:35 - 18:35
    OK, at most,
    2E, so order E. But that's
  • 18:39 - 18:39
    not all.
    Yeah, so you have to have the
  • 18:43 - 18:43
    number of vertices plus order
    the number of edges,
  • 18:48 - 18:48
    OK, whether it's directed or
    undirected because I may have a
  • 18:54 - 18:54
    graph, say it has a whole bunch
    of vertices and no edges,
  • 19:00 - 19:00
    that's still going to cost me
    order V, OK?
  • 19:05 - 19:05
    So, it uses theta of V plus E
    storage.
  • 19:09 - 19:09
    And, it's basically the same
    thing asymptotically.
  • 19:15 - 19:15
    In fact, it's easier to see in
    some sense for digraphs because
  • 19:22 - 19:22
    for digraphs,
    what I do is I just add up the
  • 19:27 - 19:27
    out degrees, and that is equal to
  • 19:27 - 19:27
    E, OK; adding up the in
    degrees works equally.
    In fact, this is kind of like
  • 19:39 - 19:39
    an amortized analysis,
    if you will,
  • 19:42 - 19:42
    a bookkeeping analysis,
    that if I'm adding up the total
  • 19:46 - 19:46
    number of edges,
    one way of doing it is
  • 19:48 - 19:48
    accounting for it vertex by
    vertex.
  • 19:51 - 19:51
    OK, so for each vertex,
    I basically take its
  • 19:54 - 19:54
    degree; basically, for each
    vertex, look at the degree,
  • 19:58 - 19:58
    allocating a count
    per edge, and then ending up
  • 20:02 - 20:02
    with twice the number of edges,
    that's exactly the accounting type
  • 20:07 - 20:07
    of analysis that we might do for
    amortized analysis.
  • 20:12 - 20:12
    OK, so we'll see that.
    So, this is a sparse
  • 20:17 - 20:17
    representation,
    and it's often better than an
  • 20:22 - 20:22
    adjacency matrix.
    For example,
  • 20:26 - 20:26
    you can imagine if the World
    Wide Web were done with an
  • 20:32 - 20:32
    adjacency matrix as opposed to,
    essentially,
  • 20:37 - 20:37
    with an adjacency list type of
    representation.
  • 20:44 - 20:44
    For every page on the World Wide
    Web, I'd have to say,
  • 20:47 - 20:47
    here are the ones that I'm
    connected to,
  • 20:50 - 20:50
    and here are all the ones I'm
    not connected to.
  • 20:53 - 20:53
    OK, that list of things you're
    not connected to for a given
  • 20:58 - 20:58
    page would pretty
    dramatically
  • 21:00 - 21:00
    show you that there is an
    advantage to sparse
  • 21:03 - 21:03
    representation.
    On the other hand,
  • 21:07 - 21:07
    one of the nice things about an
    adjacency matrix representation
  • 21:13 - 21:13
    is that each edge can be
    represented with a single bit,
  • 21:19 - 21:19
    whereas typically, when I'm
    representing things with an
  • 21:25 - 21:25
    adjacency list representation,
    how many bits am I going to
  • 21:31 - 21:31
    need to represent each
    adjacency?
  • 21:35 - 21:35
    You'll need order log of V to
    be able to name each different
  • 21:39 - 21:39
    vertex.
    OK, the log of the number is
  • 21:42 - 21:42
    the number of bits that I need.
    So, there are places where this
  • 21:47 - 21:47
    is actually a far more efficient
    representation.
  • 21:50 - 21:50
    In particular,
    if you have a very dense graph,
  • 21:53 - 21:53
    OK, this may be a better way of
    representing it.
  • 21:57 - 21:57
    OK, the other thing I want you
    to get, and we're going to see
  • 22:01 - 22:01
    more of this in particular next
    week, is that a matrix and a
  • 22:06 - 22:06
    graph are two ways of
    looking at the same thing.
  • 22:11 - 22:11
    OK, and in fact,
    there's a lot of graph theory
  • 22:14 - 22:14
    that comes up when you do things like
    multiply the adjacency matrix,
  • 22:18 - 22:18
    OK, and so forth.
    So, there's a lot of
  • 22:21 - 22:21
    commonality between graphs and
    matrices, a lot of mathematics
  • 22:25 - 22:25
    that if it applies for one,
    it applies to the other.
  • 22:28 - 22:28
    Do you have a question,
    or just holding your finger in
  • 22:32 - 22:32
    the air?
    OK, good.
  • 22:34 - 22:34
    OK, so that's all just review.
    Now I want to get onto today's
  • 22:38 - 22:38
    lecture.
    OK, so any questions about
  • 22:40 - 22:40
    graphs?
    So, this is a good time to
  • 22:43 - 22:43
    review appendix B.
    There are a lot of great
  • 22:46 - 22:46
    properties in there,
    and in particular,
  • 22:48 - 22:48
    there is a theorem that we're
    going to talk about today,
  • 22:52 - 22:52
    which is properties of trees.
  • 22:56 - 22:56
    Trees are very special kinds of
    graphs, so I really want you to
  • 23:01 - 23:01
    go and look to see what the
    properties are.
  • 23:05 - 23:05
    There is, I think,
    something like six different
  • 23:08 - 23:08
    definitions of trees that are
    all equivalent,
  • 23:12 - 23:12
    OK, and so, I think it's a very good
    idea to go through and read
  • 23:16 - 23:16
    through that theorem.
    We're not going to prove it in
  • 23:20 - 23:20
    class, but it really
    provides a very good basis for
  • 23:24 - 23:24
    the thinking that we're going to
    be doing today.
  • 23:27 - 23:27
    And we'll see more of that in
    the future.
  • 23:30 - 23:30
    OK, so today,
    we're going to talk about
  • 23:33 - 23:33
    minimum spanning trees.
    OK, this is one of the world's
  • 23:38 - 23:38
    most important algorithms.
    OK, it is important in
  • 23:42 - 23:42
    distributed systems.
    It's one of the first things
  • 23:46 - 23:46
    that almost any distributed
    system tries to find is a
  • 23:50 - 23:50
    minimum spanning tree of the
    nodes that happen to be alive
  • 23:55 - 23:55
    at any point,
    OK?
  • 23:57 - 23:57
    And one of the people who
    developed an algorithm for this,
  • 24:01 - 24:01
    we'll talk about this a little
    bit later, OK,
  • 24:05 - 24:05
    it was the basis of the billing
    system for AT&T for many years
  • 24:10 - 24:10
    while it was a monopoly.
    OK, so very important kind of
  • 24:17 - 24:17
    thing.
    It's got a huge number of
  • 24:21 - 24:21
    applications.
    So the problem is the
  • 24:26 - 24:26
    following.
    You have a connected undirected
  • 24:31 - 24:31
    graph,
    G equals (V,E),
  • 24:35 - 24:35
    with an edge weight function,
    w, which maps the edges into
  • 24:43 - 24:43
    weights that are real numbers.
    And for today's lecture,
  • 24:50 - 24:50
    we're going to make an
    important assumption,
  • 24:56 - 24:56
    OK, for simplicity.
    The book does not make this
  • 25:02 - 25:02
    assumption.
    And so, I encourage you to look
  • 25:09 - 25:09
    at the alternative presentation
    or, because what they do in the
  • 25:17 - 25:17
    book is much more general,
    but for simplicity and
  • 25:22 - 25:22
    intuition, I'm going to make
    this a little bit easier.
  • 25:29 - 25:29
    We're going to assume that all
    edge weights are distinct.
  • 25:37 - 25:37
    OK, all edge weights are
    distinct.
  • 25:39 - 25:39
    So what does that mean?
    What does that mean that this
  • 25:43 - 25:43
    function, w, what property does
    the function,
  • 25:46 - 25:46
    w, have if all edge weights are
    distinct?
  • 25:49 - 25:49
    Who remembers their discrete
    math?
  • 25:51 - 25:51
    It's injective.
    OK, it's one to one.
  • 25:54 - 25:54
    OK, it's not one to one and
    onto necessarily.
  • 25:57 - 25:57
    In fact, it would be kind of
    hard to do that because that's a
  • 26:01 - 26:01
    pretty big set.
    OK, but it's one to one.
  • 26:05 - 26:05
    It's injective.
    OK, so that's what we're going
  • 26:10 - 26:10
    to assume for simplicity.
    OK, and the book,
  • 26:14 - 26:14
    they don't assume that.
    It just means that the way you
  • 26:19 - 26:19
    have to state things is just a
    little more precise.
  • 26:24 - 26:24
    It has to be more technically
    precise.
  • 26:28 - 26:28
    So, that's the input.
    The output is--
  • 26:33 - 26:33
    The output is a spanning tree,
    T, and by spanning tree,
  • 26:45 - 26:45
    we mean it connects all the
    vertices.
  • 26:52 - 26:52
    OK, and it's got to have
    minimum weight.
  • 27:02 - 27:02
    OK, so we can write the weight
    of the tree is going to be,
  • 27:08 - 27:08
    by that, we mean the sum over
    all edges that are in the tree
  • 27:15 - 27:15
    of the weight of the individual
    edges.
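
Written out, the quantity being minimized is:

```latex
w(T) \;=\; \sum_{(u,v)\,\in\,T} w(u,v)
```
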
  • 27:19 - 27:19
    OK, so here I've done a
    little bit of abuse of notation,
  • 27:25 - 27:25
    which is that what I should be
    writing is w of the edge (u,v)
  • 27:31 - 27:31
    because this is a mapping from
    edges, which would give me a
  • 27:38 - 27:38
    double parentheses.
    And, you know,
  • 27:42 - 27:42
    as you know,
    I love to abuse notation.
  • 27:45 - 27:45
    So, I'm going to drop that
    extra parentheses,
  • 27:49 - 27:49
    because we understand that it's
    really the weight of the edge,
  • 27:54 - 27:54
    OK, not the weight of the
    ordered pair.
  • 27:57 - 27:57
    So, that's just a little
    notational convenience.
  • 28:02 - 28:02
    OK, so one of the things,
    when we do the take-home exam,
  • 28:05 - 28:05
    notational convenience can make
    the difference between having a
  • 28:09 - 28:09
    horrible time writing up a
    problem, and an easy time.
  • 28:13 - 28:13
    So, it's worth thinking about
    what kinds of notation you'll
  • 28:16 - 28:16
    use in writing up solutions to
    problems, and so forth.
  • 28:20 - 28:20
    OK, and just in general,
    in technical communication,
  • 28:23 - 28:23
    if you adopt good notation, people
    understand you.
  • 28:26 - 28:26
    If you adopt poor notation,
    nobody pays attention to what
  • 28:29 - 28:29
    you're doing because they don't
    understand what you're saying.
  • 28:34 - 28:34
    OK, so let's do an example.
  • 28:45 - 28:45
    OK, so here's a graph.
    I think for this,
  • 28:52 - 28:52
    somebody asked once if I was
    inspired by biochemistry or
  • 29:03 - 29:03
    something, OK,
    but I wasn't.
  • 29:09 - 29:09
    I was just writing these things
    down, OK?
  • 29:12 - 29:12
    So, here's a graph.
    And let's give us some edge
  • 29:15 - 29:15
    weights.
  • 29:31 - 29:31
    OK, so there are some edge
    weights.
  • 29:35 - 29:35
    And now, what we want is we
    want to find a tree.
  • 29:40 - 29:40
    So a connected acyclic graph
    such that every vertex is part
  • 29:46 - 29:46
    of the tree.
    But it's got to have the
  • 29:50 - 29:50
    minimum weight possible.
    OK, so can somebody suggest to
  • 29:56 - 29:56
    me some edges that have to be in
    this minimum spanning tree?
  • 30:03 - 30:03
    Yeah, so nine,
    good.
  • 30:05 - 30:05
    Nine has to be in there
    because, why?
  • 30:09 - 30:09
    It's the only one connecting it
    to this vertex,
  • 30:15 - 30:15
    OK?
    And likewise,
  • 30:16 - 30:16
    15 has to be in there.
    So those both have to be in.
  • 30:22 - 30:22
    What other edges have to be in?
    Which one?
  • 30:27 - 30:27
    14 has to be in.
    Why does 14 have to be in?
  • 30:33 - 30:33
    Well, one of 14 and three has
    to be in there.
  • 30:41 - 30:41
    I want the minimum weight.
    The one that has the overall
  • 30:50 - 30:50
    smallest weight.
    So, can somebody argue to me
  • 30:58 - 30:58
    that three has to be in there?
    Yeah?
  • 31:04 - 31:04
    That's the minimum of the two,
    which means that if I had,
  • 31:09 - 31:09
    if you had something you said
    was a minimum spanning tree that
  • 31:15 - 31:15
    didn't include three,
    right, and so therefore it had
  • 31:19 - 31:19
    to include 14,
    then I could just delete this
  • 31:23 - 31:23
    edge, 14, and put in edge three.
    And, I have something of lower
  • 31:29 - 31:29
    weight, right?
    So, three has to be in there.
  • 31:34 - 31:34
    What other edges have to be in
    there?
  • 31:38 - 31:38
    Do a little puzzle logic.
    Six and five have to be in
  • 31:43 - 31:43
    there.
    Why do they have to be in
  • 31:47 - 31:47
    there?
  • 32:02 - 32:02
    Yeah, well, I mean,
    it could be connected through
  • 32:05 - 32:05
    this or something.
    It doesn't necessarily have to
  • 32:08 - 32:08
    go this way.
    Six definitely has to be in
  • 32:11 - 32:11
    there for the same reason that
    three had to be,
  • 32:14 - 32:14
    right?
    Because we've got two choices to
  • 32:16 - 32:16
    connect up this guy.
    And so, if everything were
  • 32:19 - 32:19
    connected but it weren't,
    12, I mean, and 12 was in
  • 32:23 - 32:23
    there.
    I could always,
  • 32:24 - 32:24
    then, say, well,
    let's connect them up this way
  • 32:27 - 32:27
    instead.
    OK, so definitely that's in
  • 32:32 - 32:32
    there.
    I still don't have everything
  • 32:35 - 32:35
    connected up.
  • 32:50 - 32:50
    What else has to be in there
    for minimum spanning tree?
  • 33:03 - 33:03
    Seven, five,
    and eight, why seven,
  • 33:12 - 33:12
    five, and eight?
    OK, so can we argue those one
  • 33:23 - 33:23
    at a time?
    Why does five have to be in
  • 33:32 - 33:32
    there?
    Yeah?
  • 33:37 - 33:37
    OK, so we have four connected
    components because we have this
  • 33:42 - 33:42
    one, this one,
    we actually have,
  • 33:44 - 33:44
    yeah, this one here,
    and this one,
  • 33:46 - 33:46
    good.
    We need at least three edges to
  • 33:49 - 33:49
    connect them because each edge
    is going to reduce the connected
  • 33:54 - 33:54
    components by at most one.
    OK, so we need three edges,
  • 33:57 - 33:57
    and those are the three
    cheapest ones.
  • 34:00 - 34:00
    And they work.
    That works, right?
  • 34:04 - 34:04
    Any other edges are going to be
    bigger, so that works.
  • 34:11 - 34:11
    Good.
    OK, and so, now do we have a
  • 34:15 - 34:15
    spanning tree?
    Everything is,
  • 34:19 - 34:19
    we have one big connected graph
    here, right?
  • 34:25 - 34:25
    Is that what I got?
    Hey, that's the same as what I
  • 34:31 - 34:31
    got.
    Life is predictable.
  • 34:35 - 34:35
    OK, so everybody has the
    idea of what a minimum spanning
  • 34:41 - 34:41
    tree is, then,
    out of this,
  • 34:44 - 34:44
    OK, what's going on there?
    So, let's first of all make
  • 34:50 - 34:50
    some observations about this
    puzzle.
  • 34:54 - 34:54
    And what I want to do is remind
    you about the optimal
  • 34:59 - 34:59
    substructure property because it
    turns out minimum spanning tree
  • 35:06 - 35:06
    has a great optimal substructure
    property.
  • 35:12 - 35:12
    OK, so the setup is going to
    be, we're going to have some
  • 35:18 - 35:18
    minimum spanning tree.
    Let's call it T.
  • 35:22 - 35:22
    And, I'm going to draw it so
    that the other edges in the
  • 35:28 - 35:28
    graph are not going to be
    shown.
  • 35:32 - 35:32
    OK, so here's a graph.
  • 35:54 - 35:54
    OK, so here's a graph.
    It looks like the one I have on my
  • 35:58 - 35:58
    piece of paper here.
    OK, so the idea is,
  • 36:02 - 36:02
    this is some minimum spanning
    tree.
  • 36:05 - 36:05
    Now, we want to look at a
    property of optimal
  • 36:09 - 36:09
    substructure.
    And the way I'm going to get
  • 36:13 - 36:13
    that, is, I'm going to remove
    some edge:
  • 36:18 - 36:18
    remove an arbitrary edge,
    (u,v), in the minimum spanning
  • 36:23 - 36:23
    tree.
    So, let's call this u and this
  • 36:26 - 36:26
    V.
    And so, we're removing this
  • 36:29 - 36:29
    edge.
    OK, so when I remove an edge in
  • 36:34 - 36:34
    a tree, what happens to the
    tree?
  • 36:37 - 36:37
    What's left?
    I have two trees left,
  • 36:40 - 36:40
    OK?
    I have two trees left.
  • 36:42 - 36:42
    Now, proving that,
    that's basically one of the
  • 36:46 - 36:46
    properties in that appendix,
    and the properties of trees
  • 36:51 - 36:51
    that I want you to read,
    OK, because you can actually
  • 36:55 - 36:55
    prove that kind of thing rather
    than it just being obvious,
  • 37:01 - 37:01
    which it is, OK?
    OK, so we remove that.
  • 37:06 - 37:06
    Then, T is partitioned into two
    subtrees.
  • 37:11 - 37:11
    And, we'll call them T_1 and
    T_2.
  • 37:16 - 37:16
    So, here's one subtree,
    and here's another subtree.
  • 37:22 - 37:22
    We've partitioned it.
    No matter what edge I picked,
  • 37:29 - 37:29
    there would be two subtrees
    that it's partitioned into.
  • 37:38 - 37:38
    Even if the subtree is a
    trivial subtree,
  • 37:41 - 37:41
    for example,
    it just has a single node in it
  • 37:44 - 37:44
    and no edges.
  • 37:58 - 37:58
    So, the theorem that we'll
    prove demonstrates a property of
  • 38:12 - 38:12
    optimal substructure.
    T_1 is a minimum spanning tree
  • 38:24 - 38:24
    for the graph,
    G_1 = (V_1, E_1),
  • 38:31 - 38:31
    a subgraph of G induced by the
    vertices in T_1.
  • 38:44 - 38:44
    OK, that is,
    V_1 is just the vertices in T_1;
  • 38:55 - 38:55
    that's what it means to be induced.
    OK, so V_1 is the vertices in
  • 39:09 - 39:09
    T_1.
    So, in this picture,
  • 39:13 - 39:13
    I didn't label it.
    This is T_1.
  • 39:17 - 39:17
    This is T_2.
    In this picture,
  • 39:21 - 39:21
    these are the vertices of T_1.
    So, that's V_1,
  • 39:27 - 39:27
    OK?
    And, E_1 is the set of pairs of
  • 39:32 - 39:32
    vertices, x and y,
    that are the edges that are in
  • 39:39 - 39:39
    E such that both x and y
    belong to V_1.
  • 39:47 - 39:47
    OK, so I haven't shown the
    edges of G here.
  • 39:50 - 39:50
    But basically,
    if an edge went from here to
  • 39:53 - 39:53
    here, that would be in E_1.
    If it went from here to here,
  • 39:57 - 39:57
    it would not.
    And if it went from here to
  • 40:00 - 40:00
    here, it would not.
    OK, so the vertices,
  • 40:04 - 40:04
    the subgraph induced by the
    vertices of T_1 are just those
  • 40:11 - 40:11
    that connect up things in T_1,
    and similarly for T_2.
  • 40:27 - 40:27
    So, the theorem says that if I
    look at just the edges within
  • 40:33 - 40:33
    the graph here,
    G_1, those that are induced by
  • 40:38 - 40:38
    these vertices,
    T_1 is, in fact,
  • 40:42 - 40:42
    a minimum spanning tree for
    that subgraph.
  • 40:46 - 40:46
    That's what the theorem says.
    OK, if I look over here
  • 40:52 - 40:52
    conversely, or correspondingly,
    if I look at the set of edges
  • 40:59 - 40:59
    that are induced by this set of
    vertices, the vertices in T_2,
  • 41:05 - 41:05
    in fact, T_2 is a minimum
    spanning tree on that subgraph.
  • 41:13 - 41:13
    OK, OK, we can even do it over
    here.
  • 41:18 - 41:18
    If I took a look,
    for example,
  • 41:22 - 41:22
    at these, let's see,
    let's say we cut out five,
  • 41:28 - 41:28
    and if I cut out edge five,
    that T_1 would be these four
  • 41:35 - 41:35
    vertices here.
    And, the point is that if I
  • 41:40 - 41:40
    look at the subgraph induced on
    that, that these edges here.
  • 41:45 - 41:45
    In fact, the six,
    eight, and three are all edges
  • 41:49 - 41:49
    in a minimum spanning tree for
    that subgraph.
  • 41:52 - 41:52
    OK, so that's what the theorem
    says.
  • 41:55 - 41:55
    So let's prove it.
  • 42:09 - 42:09
    OK, and so what technique are
    we going to use to prove it?
  • 42:20 - 42:20
    OK, we learned this technique
    last time: hint,
  • 42:28 - 42:28
    hint.
    It's something you do in
  • 42:34 - 42:34
    your text editor all the time:
    cut and paste,
  • 42:39 - 42:39
    good, cut and paste.
    OK, so the weight of T I can
  • 42:46 - 42:46
    express as the weight of the
    edge I removed,
  • 42:52 - 42:52
    plus the weight of T_1,
    plus the weight of T_2.
  • 42:58 - 42:58
    OK, so that's the total weight.
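
In symbols, the cut-and-paste step about to be made is just:

```latex
w(T) = w(u,v) + w(T_1) + w(T_2), \qquad
w(T') = w(u,v) + w(T_1') + w(T_2) \;<\; w(T)
\quad \text{if } w(T_1') < w(T_1).
```
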
    So, the argument is pretty
  • 43:07 - 43:07
    simple.
    Suppose that there were some
  • 43:13 - 43:13
    T_1 prime that was better than
    T_1 for G_1.
  • 43:20 - 43:20
    Suppose I had some better way
    of forming a spanning tree.
  • 43:31 - 43:31
    OK, then I would make up a T
    prime, which just contained the
  • 43:43 - 43:43
    edge, (u,v),
    and T_1 prime,
  • 43:48 - 43:48
    union T_2.
    So, I would take,
  • 43:54 - 43:54
    if I had a better spanning
    tree, a spanning tree of lower
  • 44:05 - 44:05
    weight than T_1.
    And I call that T_1 prime.
  • 44:12 - 44:12
    I just substitute that and make
    up a spanning tree that
  • 44:18 - 44:18
    consisted of my edge,
    (u,v), whatever works well for
  • 44:22 - 44:22
    T_1 prime and whatever works
    well for T_2.
  • 44:26 - 44:26
    And, that would be a spanning
    tree.
  • 44:30 - 44:30
    And it would be better than T
    itself was for G,
  • 44:36 - 44:36
    OK, because in the weight
    expression, in place of the weight of T_1,
  • 44:44 - 44:44
    I now just get to use the
    weight of T_1 prime,
  • 44:50 - 44:50
    and that's less.
    And so, therefore,
  • 44:55 - 44:55
    the assumption that T was a
    minimum spanning tree would be
  • 45:02 - 45:02
    violated if I could find a
    better one for the subpiece.
  • 45:11 - 45:11
    So, we have this nice property
    of optimal substructure.
  • 45:16 - 45:16
    OK, I have subproblems that
    exhibit optimality:
  • 45:20 - 45:20
    if I have a globally optimal
    solution to the whole problem
  • 45:25 - 45:25
    then within it, I can find optimal
    solutions to subproblems.
  • 45:31 - 45:31
    So, now the question is,
    that's one hallmark.
  • 45:36 - 45:36
    That's one hallmark of dynamic
    programming.
  • 45:42 - 45:42
    What about overlapping
    subproblems?
  • 45:46 - 45:46
    Do I have that property?
    Do I have overlapping
  • 45:51 - 45:51
    subproblems over here for this
    type of problem?
  • 46:19 - 46:19
    So, imagine,
    for example,
  • 46:21 - 46:21
    that I'm removing different
    edges.
  • 46:23 - 46:23
    I look at the space of taking a
    given edge, and removing it.
  • 46:27 - 46:27
    It partitions it into two
    pieces, and now I have another
  • 46:31 - 46:31
    piece.
    And I remove it,
  • 46:32 - 46:32
    etc.
    Am I going to end up getting a
  • 46:35 - 46:35
    bunch of subproblems that are
    similar in there?
  • 46:38 - 46:38
    Yeah, I am.
    OK, if I take out this one,
  • 46:41 - 46:41
    then I take out,
    say, this one here,
  • 46:44 - 46:44
    and then I'll have another tree
    here and here.
  • 46:47 - 46:47
    OK, that would be the same as
    if I had originally taken this
  • 46:51 - 46:51
    out, and then taken that one
    out.
  • 46:53 - 46:53
    If I look at simple ordering of
    taking out the edges,
  • 46:57 - 46:57
    I'm going to end up with a
    whole bunch of overlapping
  • 47:01 - 47:01
    subproblems.
    Yeah, OK.
  • 47:05 - 47:05
    So then, what does that suggest
    we use as an approach?
  • 47:14 - 47:14
    Dynamic programming,
    good.
  • 47:18 - 47:18
    What a surprise!
    Yes, OK, you could use dynamic
  • 47:27 - 47:27
    programming.
    But it turns out that minimum
  • 47:34 - 47:34
    spanning tree exhibits an even
    more powerful property.
  • 47:41 - 47:41
    OK, so we've got all the
    clues for dynamic programming,
  • 47:49 - 47:49
    but it turns out that there's
    an even bigger clue that's going
  • 47:57 - 47:57
    to help us to use an even more
    powerful technique.
  • 48:05 - 48:05
    And that, we call,
    the hallmark for greedy
  • 48:11 - 48:11
    algorithms.
  • 48:32 - 48:32
    And that is,
    we have a thing called the
  • 48:41 - 48:41
    greedy choice property,
    which says that a locally
  • 48:53 - 48:53
    optimal choice is globally
    optimal.
  • 49:03 - 49:03
    And, of course,
    as with all these hallmarks, this is the
  • 49:06 - 49:06
    kind of thing you want to box,
    OK, because these are the clues
  • 49:10 - 49:10
    that you're going to be able to
    do that.
  • 49:12 - 49:12
    So, we have this property that
    we call the greedy choice
  • 49:16 - 49:16
    property.
    I'm going to show you how it
  • 49:18 - 49:18
    works in this case.
    And when you have a greedy
  • 49:21 - 49:21
    choice property,
    it turns out you can do even
  • 49:24 - 49:24
    better than dynamic programming.
    OK, so when you see the two
  • 49:29 - 49:29
    dynamic programming properties,
    there is a clue that says
  • 49:34 - 49:34
    dynamic programming,
    yes, but also it says,
  • 49:37 - 49:37
    let me see whether it also has
    this greedy property because if
  • 49:42 - 49:42
    it does, you're going to come up
    with something that's even
  • 49:46 - 49:46
    better than dynamic programming,
    OK?
  • 49:49 - 49:49
    So, if you just have the two,
    you can usually do dynamic
  • 49:53 - 49:53
    programming, but if you have
    this third one,
  • 49:57 - 49:57
    it's like, whoa!
    Jackpot!
  • 50:00 - 50:00
    OK, so here's the theorem we'll
    prove to illustrate this idea.
  • 50:04 - 50:04
    Once again,
    all these hallmarks are not
  • 50:08 - 50:08
    things.
    They are heuristics.
  • 50:10 - 50:10
    I can't give you an algorithm
    to say, here's where dynamic
  • 50:14 - 50:14
    programming works,
    or here's where greedy
  • 50:17 - 50:17
    algorithms work.
    But I can sort of indicate when
  • 50:20 - 50:20
    they work, the kind of structure
    they have.
  • 50:23 - 50:23
    OK, so here's the theorem.
    So let's let T be the MST of
  • 50:32 - 50:32
    our graph.
    And, let's let A be any subset
  • 50:41 - 50:41
    of V, so, some subset of
    vertices.
  • 50:49 - 50:49
    And now, let's suppose that
    edge, (u,v), is the least weight
  • 51:05 - 51:05
    edge connecting our set A to A
    complement, that is,
  • 51:18 - 51:18
    V minus A.
    Then the theorem says that
  • 51:27 - 51:27
    (u,v) is in the minimum spanning
    tree.
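
Here is a brute-force sanity check of that cut property in Python, on a small hypothetical graph (the weights are invented, and distinct, per the lecture's assumption); it enumerates all spanning trees to find the MST, then checks every cut:

```python
from itertools import combinations

V = {1, 2, 3, 4}
w = {(1, 2): 3, (1, 3): 5, (2, 3): 1, (2, 4): 6, (3, 4): 2}

def connects(edges):
    # Simple reachability check from vertex 1 over the given edges.
    seen, stack = {1}, [1]
    while stack:
        u = stack.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == V

# The MST: minimum-weight choice of |V| - 1 edges that connects V.
mst = min((t for t in combinations(w, len(V) - 1) if connects(t)),
          key=lambda t: sum(w[e] for e in t))

# Cut property: for every nonempty proper subset A, the least weight
# edge crossing the cut (A, V - A) is in the MST.
for r in range(1, len(V)):
    for A in map(set, combinations(V, r)):
        crossing = [e for e in w if (e[0] in A) != (e[1] in A)]
        assert min(crossing, key=w.get) in mst
```
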
  • 51:39 - 51:39
    So let's just take a look at
    our graph over here and see if
  • 51:43 - 51:43
    that's, in fact,
    the case.
  • 51:45 - 51:45
    OK, so let's take,
    so one thing I could do for A
  • 51:49 - 51:49
    is just take a singleton node.
    So, I take a singleton node,
  • 51:54 - 51:54
    let's say this guy here,
    that can be my A,
  • 51:57 - 51:57
    and everything else is V minus
    A.
  • 52:00 - 52:00
    And I look at the least weight
    edge connecting this to
  • 52:04 - 52:04
    everything else.
    Well, there are only two edges
  • 52:08 - 52:08
    that connect it to everything
    else.
  • 52:11 - 52:11
    And the theorem says that the
    lighter one is in the minimum
  • 52:15 - 52:15
    spanning tree.
    Hey, I win.
  • 52:17 - 52:17
    OK, if you take a look,
    every vertex that I pick,
  • 52:21 - 52:21
    the lightest edge coming out of
    that vertex is in the minimum
  • 52:26 - 52:26
    spanning tree.
    OK, the lightest weight edge
  • 52:29 - 52:29
    coming out, but that's not all
    the edges that are in here.
  • 52:35 - 52:35
    OK, or let's just imagine,
    let's take a look at these
  • 52:39 - 52:39
    three vertices connected to this
    set of vertices.
  • 52:43 - 52:43
    I have three edges is going
    across.
  • 52:46 - 52:46
    The least weight one is five.
    That's the minimum spanning
  • 52:51 - 52:51
    tree.
    Or, I can cut it this way.
  • 52:53 - 52:53
    OK, for the ones above,
    the edges going down are seven,
  • 52:58 - 52:58
    eight, and 14.
    Seven is the least weight.
  • 53:02 - 53:02
    It's in the minimum spanning
    tree.
  • 53:05 - 53:05
    So, no matter how I choose,
    I could make this one in,
  • 53:09 - 53:09
    this one out,
    this one in,
  • 53:11 - 53:11
    this one out,
    this one in,
  • 53:13 - 53:13
    this one out,
    this one in,
  • 53:15 - 53:15
    this one out,
    take a look at what all the
  • 53:18 - 53:18
    edges are.
    Whichever one has the least
  • 53:21 - 53:21
    weight: it's in the minimum
    spanning tree.
  • 53:24 - 53:24
    So, in some sense,
    that's a local property because
  • 53:28 - 53:28
    I don't have to look at what the
    rest of the tree is.
  • 53:34 - 53:34
    I'm just looking at some small
    set of vertices if I wish,
  • 53:38 - 53:38
    and I say, well,
    if I wanted to connect that set
  • 53:42 - 53:42
    of vertices to the rest of the
    world, what would I pick?
  • 53:47 - 53:47
    I'd pick the cheapest one.
    That's the greedy approach.
  • 53:51 - 53:51
    It turns out,
    that wins, OK,
  • 53:53 - 53:53
    that picking that thing that's
    locally good for that subset,
  • 53:58 - 53:58
    A, OK, is also globally good.
    OK, it optimizes the overall
  • 54:05 - 54:05
    function.
    That's what the theorem says,
  • 54:09 - 54:09
    OK?
    So, let's prove this theorem.
  • 54:13 - 54:13
    Any questions about this?
    OK, let's prove this theorem.
  • 54:20 - 54:20
    So, we have (u,v) is the least
    weight edge connecting A to V
  • 54:28 - 54:28
    minus A.
    So, let's suppose that this
  • 54:32 - 54:32
    edge, (u,v), is not in the
    minimum spanning tree.
  • 54:40 - 54:40
    OK, let's suppose that somehow
    there is a minimum spanning tree
  • 54:46 - 54:46
    that doesn't include this least
    weight edge.
  • 54:50 - 54:50
    OK, so what technique do you think
    we'll use to get a
  • 54:56 - 54:56
    contradiction here?
    Cut and paste,
  • 54:59 - 54:59
    good.
    Yeah, we're going to cut paste.
  • 55:04 - 55:04
    OK, we're going to cut and
    paste.
  • 55:09 - 55:09
    So here, I did an example.
    OK, so --
  • 55:40 - 55:40
    OK, and so I'm going to use the
    notation.
  • 55:42 - 55:42
    I'm going to color some of
    these in.
  • 56:05 - 56:05
    OK, and so my notation here is:
    this is an element of A,
  • 56:10 - 56:10
    and if I color it in,
    it's an element of V minus A.
  • 56:14 - 56:14
    OK, so if it's not colored in,
    that's in A.
  • 56:18 - 56:18
    This is my minimum spanning
    tree.
  • 56:21 - 56:21
    Once again, I'm not showing the
    overall edges of all the graphs,
  • 56:27 - 56:27
    but they're there,
    OK?
  • 56:30 - 56:30
    So, my edge,
    (u,v), which is not in my minimum
  • 56:34 - 56:34
    spanning tree, I say,
    let's say is this edge here.
  • 56:38 - 56:38
    It's an edge from u to v;
    u is in A, v is in V minus A.
  • 56:43 - 56:43
    OK, so everybody see the setup?
    So, I want to prove that this
  • 56:48 - 56:48
    edge should have been in the
    minimum spanning tree,
  • 56:53 - 56:53
    OK, that the contention that
    this is a minimum spanning tree,
  • 56:58 - 56:58
    and does not include (u,v), is
    wrong.
  • 57:02 - 57:02
    So, the way I want to do
    that is, I have a tree here,
  • 57:05 - 57:05
    T, and I have two vertices,
    u and v, and in a tree,
  • 57:09 - 57:09
    between any two vertices there
    is a unique, simple path:
  • 57:12 - 57:12
    simple path meaning it doesn't
    go back and forth and repeat
  • 57:16 - 57:16
    edges or vertices.
    OK, there's a unique,
  • 57:19 - 57:19
    simple path from u to v.
    So, let's consider that path.
  • 57:42 - 57:42
    OK, and the way that I know
    that that path exists is because
  • 57:47 - 57:47
    I've read appendix B of the
    textbook, section B.5.1,
  • 57:51 - 57:51
    OK, which has this nice theorem
    about properties of trees.
  • 57:56 - 57:56
    OK, so that's how I know that
    there exists a unique,
  • 58:00 - 58:00
    simple path.
    OK, so now we're going to do is
  • 58:05 - 58:05
    take a look at that path.
    So in this case,
  • 58:10 - 58:10
    it goes from here,
    to here, to here,
  • 58:14 - 58:14
    to here.
    And along that path,
  • 58:17 - 58:17
    there must be a point where I
    connect from a vertex in A to a
  • 58:23 - 58:23
    vertex in V minus A.
    Why?
  • 58:26 - 58:26
    Well, because this is in A.
    This is in V minus A.
  • 58:32 - 58:32
    So, along the path somewhere,
    there must be a transition.
  • 58:42 - 58:42
    OK, they are not all in A,
    OK, because in particular,
  • 58:52 - 58:52
    v isn't.
    OK, so we're going to do is
  • 58:59 - 58:59
    swap (u,v) with the first edge
    on this path that connects a
  • 59:10 - 59:10
    vertex in A to a vertex in V
    minus A.
  • 59:18 - 59:18
    So in this case,
    it's this edge here.
  • 59:21 - 59:21
    I go from A to V minus A.
    In general, I might be
  • 59:25 - 59:25
    alternating many times,
    OK, and I just picked the first
  • 59:29 - 59:29
    one that I encounter.
    OK, that this guy here.
  • 59:32 - 59:32
    And what I do is I put this
    edge in.
  • 59:36 - 59:36
    OK, so then,
    what happens?
  • 59:38 - 59:38
    Well, the edge,
    (u,v), is the lightest thing
  • 59:42 - 59:42
    connecting something in A to
    something in V minus A.
  • 59:47 - 59:47
    So that means,
    in particular,
  • 59:49 - 59:49
    it's lighter than this edge,
    it has lower weight.
  • 59:53 - 59:53
    So, by swapping this,
    I've created a tree with
  • 59:58 - 59:58
    lower overall weight,
    contradicting the assumption
  • 60:02 - 60:02
    that this other thing was a
    minimum spanning tree.
  • 60:08 - 60:08
    OK: so, a lower weight spanning
    tree than T results,
  • 60:14 - 60:14
    and that's a contradiction,
  • 60:33 - 60:33
    OK?
    How are we doing?
  • 60:37 - 60:37
    Everybody with me?
    OK, now we get to do some
  • 60:44 - 60:44
    algorithms.
    Yea!
  • 60:47 - 60:47
    So, we are going to do an
    algorithm called Prim's
  • 60:55 - 60:55
    algorithm.
    Prim eventually became a very
  • 61:02 - 61:02
    high-up at AT&T because he
    invented this algorithm for
  • 61:07 - 61:07
    minimum spanning trees,
    and it was used in all of the
  • 61:12 - 61:12
    billing code for AT&T for many
    years.
  • 61:16 - 61:16
    He was very high up at Bell
    Labs back in the heyday of Bell
  • 61:21 - 61:21
    Laboratories.
    OK, so it just shows,
  • 61:25 - 61:25
    all you have to do is invent an
    algorithm.
  • 61:30 - 61:30
    You too can be a president of a
    corporate monopoly.
  • 61:37 - 61:37
    Of course, the government can
    do things to monopolies,
  • 61:44 - 61:44
    but anyway, if that's your
    mission in life,
  • 61:49 - 61:49
    invent an algorithm.
    OK, so here's the idea.
  • 61:55 - 61:55
    What we're going to do is we're
    going to maintain V minus A as a
  • 62:04 - 62:04
    priority queue.
    We'll call it Q.
  • 62:12 - 62:12
    And each vertex,
    we're going to key each vertex
  • 62:26 - 62:26
    in Q with the weight of the
    least weight edge,
  • 62:40 - 62:40
    connecting it to a vertex in A.
    So here's the code.
  • 62:53 - 62:53
    So, we're going to start out
    with Q being all vertices.
  • 63:00 - 63:00
    So, we start out with A being,
    if you will,
  • 63:04 - 63:04
    the empty set.
    OK, and so the key,
  • 63:08 - 63:08
    the least weight edge,
    therefore, for everything in
  • 63:13 - 63:13
    the priority queue is basically
    going to be infinity because
  • 63:19 - 63:19
    none of them have any edges to A.
    The least weight edge to the
  • 63:24 - 63:24
    empty set doesn't exist, so it's infinity.
    And then, we're going to start
  • 63:29 - 63:29
    out with one guy.
    We'll call him S,
  • 63:34 - 63:34
    which will set to zero for some
    arbitrary S in V.
  • 63:39 - 63:39
    And then, the main part of the
    algorithm kicks in.
  • 63:45 - 63:45
    So that's our initialization.
    OK, when we do the analysis,
  • 63:52 - 63:52
    I'm going to write some stuff
    on the left hand side of the
  • 63:58 - 63:58
    board.
    So if you're taking notes,
  • 64:04 - 64:04
    you may want to also leave a
    little bit of space on the left
  • 64:14 - 64:14
    hand side of your notes.
    So, while Q is not empty,
  • 64:23 - 64:23
    we get the smallest element out
    of it.
  • 64:41 - 64:41
    And then we do some stuff.
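
The board code isn't captured in the transcript, so here is a minimal Python sketch of the algorithm just described, assuming the graph comes as an adjacency-list dict of (neighbor, weight) pairs; a lazy min-heap stands in for the priority queue, skipping stale entries instead of doing an explicit decrease-key (discussed below):

```python
import heapq

def prim_mst(adj, s):
    """Prim's algorithm: adj maps each vertex to a list of
    (neighbor, weight) pairs; s is an arbitrary start vertex.
    Returns the MST edges as the set of (pi[v], v) pairs."""
    key = {v: float('inf') for v in adj}  # cheapest edge to A, per vertex
    pi = {v: None for v in adj}           # which vertex brought v in
    key[s] = 0
    in_tree = set()                       # this is the set A
    q = [(0, s)]
    while q:
        k, u = heapq.heappop(q)           # extract-min
        if u in in_tree or k > key[u]:
            continue                      # stale heap entry: skip it
        in_tree.add(u)                    # u joins A
        for v, w in adj[u]:
            if v not in in_tree and w < key[v]:
                key[v] = w                # the implicit decrease-key
                pi[v] = u
                heapq.heappush(q, (w, v))
    return {(pi[v], v) for v in adj if pi[v] is not None}
```
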
  • 65:19 - 65:19
    That's it.
    And the only thing I should
  • 65:22 - 65:22
    mention here is,
    OK, so let's just see what's
  • 65:26 - 65:26
    going on here.
    And then we'll run it on the
  • 65:29 - 65:29
    example.
    OK, so what we do is we take
  • 65:32 - 65:32
    out the smallest element out of
    the queue at each step.
  • 65:37 - 65:37
    And then for each vertex in the
    adjacency list,
  • 65:40 - 65:40
    in other words,
    everything for which I have an
  • 65:44 - 65:44
    edge going from u to v,
    we take a look,
  • 65:47 - 65:47
    and if v is still in our set V
    minus A. The things we've
  • 65:51 - 65:51
    taken out are going to be part
    of A.
  • 65:54 - 65:54
    OK, every time we take
    something out,
  • 65:57 - 65:57
    that's going to be a new A that
    we construct.
  • 66:02 - 66:02
    At every step,
    we want to find,
  • 66:04 - 66:04
    what's the cheapest edge
    connecting that A to everything
  • 66:08 - 66:08
    else?
    We basically are going to take
  • 66:11 - 66:11
    whatever that cheapest thing is,
    OK, add that edge in,
  • 66:15 - 66:15
    and now bring that into A and
    find the next cheapest one.
  • 66:19 - 66:19
    And we just keep repeating the
    process.
  • 66:22 - 66:22
    OK, we'll do it on the example.
    And what we do,
  • 66:26 - 66:26
    is every time we bring it in,
    I keep track of,
  • 66:29 - 66:29
    what was the vertex responsible
    for bringing me in.
  • 66:34 - 66:34
    And what I claim is that at the
    end, if I look at the set of
  • 66:44 - 66:44
    these pairs that I've made
    here, V and pi of V,
  • 66:52 - 66:52
    that forms the minimum spanning
    tree.
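
A hypothetical usage of the prim_mst sketch above, on a tiny invented graph:

```python
adj = {
    'a': [('b', 4), ('c', 8)],
    'b': [('a', 4), ('c', 2)],
    'c': [('a', 8), ('b', 2), ('d', 7)],
    'd': [('c', 7)],
}
print(prim_mst(adj, 'a'))
# {('a', 'b'), ('b', 'c'), ('c', 'd')} -- the (pi[v], v) pairs
```
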
  • 66:58 - 66:58
    So let's just do this.
    And, what's that?
  • 67:05 - 67:05
    We're all set up.
    So let's get rid of these guys
  • 67:12 - 67:12
    here because we are going to
    recompute them from scratch.
  • 67:20 - 67:20
    OK, so you may want to copy the
    graph over again in your notes.
  • 67:30 - 67:30
    I was going to do it,
    but it turned out,
  • 67:35 - 67:35
    this is exactly the board I'm
    going to have to erase.
  • 67:41 - 67:41
    OK, well let me just modify it.
    OK, so we start out.
  • 67:47 - 67:47
    We make everything be infinity.
    OK, so that's where I'm going
  • 67:55 - 67:55
    to keep the key value.
    OK, and then what I'm going to
  • 68:01 - 68:01
    do is find one vertex.
    And I'm going to call him S.
  • 68:08 - 68:08
    And I'm going to do this vertex
    here.
  • 68:12 - 68:12
    We'll call that S.
    So basically,
  • 68:15 - 68:15
    I now make him be zero.
    And now, what I do,
  • 68:19 - 68:19
    is I execute extract min.
    So basically,
  • 68:23 - 68:23
    what I'll do is I'll just shade
    him like this,
  • 68:28 - 68:28
    indicating that he has now
    joined the set A.
  • 68:34 - 68:34
    So, this is going to be A.
    And this is an element of V minus
  • 68:41 - 68:41
    A.
    OK, so then what we do is we
  • 68:45 - 68:45
    take a look.
    We extract him,
  • 68:48 - 68:48
    and then for each edge in the
    adjacency list,
  • 68:53 - 68:53
    OK, so for each vertex in the
    adjacency list,
  • 68:59 - 68:59
    that these guys here,
    OK, we're going to look to see
  • 69:05 - 69:05
    if it's still in Q,
    that is, in V minus A.
  • 69:12 - 69:12
    And if so, and the edge weight
    there is less than
  • 69:17 - 69:17
    its current key value,
    we're going to replace it by
  • 69:20 - 69:20
    the edge value.
    So, in this case,
  • 69:23 - 69:23
    we're going to replace this by
    seven.
  • 69:26 - 69:26
    We're going to replace this by
    15, and we're going to replace
  • 69:30 - 69:30
    this by ten, OK,
    because what we're interested
  • 69:34 - 69:34
    in is, what is the cheapest?
    Now, notice that everything in
  • 69:40 - 69:40
    V minus A, that is,
    what's in the priority queue,
  • 69:44 - 69:44
    everything in there,
    OK, now has its cheapest way of
  • 69:48 - 69:48
    connecting it to the things that
    I've already removed,
  • 69:53 - 69:53
    the things that are in A.
    OK, and so now I just,
  • 69:57 - 69:57
    OK, when I actually do that
    update, there's actually
  • 70:02 - 70:02
    something implicit going on in
    this priority queue.
  • 70:07 - 70:07
    And that is that I have to do a
    decreased key.
  • 70:11 - 70:11
    So, there's an implicit
    decrease of the key.
  • 70:14 - 70:14
    So, decreased key is a priority
    queue operation that lowers the
  • 70:19 - 70:19
    value of the key in the priority
    queue.
  • 70:22 - 70:22
    And so, that's implicitly going
    on when I look at what data
  • 70:27 - 70:27
    structure I'm going to use to
    implement that priority queue.
  • 70:32 - 70:32
    OK, so common data structures
    for implementing a priority
  • 70:36 - 70:36
    queue include a min heap.
    OK, so I have to make sure that
  • 70:41 - 70:41
    I'm actually doing this
    operation.
  • 70:44 - 70:44
    I can't just change it and not
    affect my heap.
  • 70:47 - 70:47
    So, there is an implicit
    operation going on there.
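For concreteness, here is one way that implicit operation can be implemented in a hand-rolled binary min-heap; this is a sketch under my own assumptions (the heap is a Python list of (key, vertex) pairs, and a `pos` map from vertex to array index is maintained so a vertex's slot can be found in O(1)). Python's built-in heapq, notably, does not expose this operation.

```python
def decrease_key(heap, pos, v, new_key):
    # Lower v's key and sift it up toward the root to restore the
    # heap property: O(log n) swaps along one leaf-to-root path.
    i = pos[v]
    assert new_key <= heap[i][0], "decrease key must not raise the key"
    heap[i] = (new_key, v)
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent][0] <= heap[i][0]:
            break                          # heap property restored
        heap[i], heap[parent] = heap[parent], heap[i]
        pos[heap[i][1]] = i                # keep the position map in sync
        pos[heap[parent][1]] = parent
        i = parent
```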
  • 70:51 - 70:51
    OK, now I repeat.
    I find the cheapest thing,
  • 70:54 - 70:54
    oh, and I also have to set,
    now, a pointer from each of
  • 70:59 - 70:59
    these guys back to u.
    So here, this guy sets a
  • 71:03 - 71:03
    pointer going this way.
    This guy sets a pointer going
  • 71:07 - 71:07
    this way, and this guy sets a
    pointer going this way.
  • 71:11 - 71:11
    That's my pi thing that's going
    to keep track of who caused me
  • 71:16 - 71:16
    to set my value to what it is.
    So now, we go in and we find
  • 71:21 - 71:21
    the cheapest thing,
    again.
  • 71:23 - 71:23
    And we're going to do it fast,
    too.
  • 71:26 - 71:26
    OK, this is a fast algorithm.
    OK, so now we're going to go do
  • 71:32 - 71:32
    this again.
    So now, what's the cheapest
  • 71:36 - 71:36
    thing to extract?
    This guy here,
  • 71:40 - 71:40
    right?
    So, we'll take him out,
  • 71:43 - 71:43
    OK, and now we update all of
    his neighbors.
  • 71:48 - 71:48
    So this guy gets five.
    This guy gets 12.
  • 71:52 - 71:52
    This guy gets nine.
    This guy we don't update.
  • 71:57 - 71:57
    We don't update him because
    he's no longer in the priority
  • 72:03 - 72:03
    queue.
    And all of these guys now,
  • 72:07 - 72:07
    we make point to where they're
    supposed to point to.
  • 72:12 - 72:12
    And, we're done with that step.
    Now we find the cheapest one.
  • 72:18 - 72:18
    What's the cheapest one now?
    The five over here.
  • 72:22 - 72:22
    Good.
    So, we take him out.
  • 72:25 - 72:25
    OK, we update the neighbors.
    Here, yep, that goes to six
  • 72:30 - 72:30
    now.
    And, we have that pointer.
  • 72:34 - 72:34
    And, this guy we don't do,
    because he's not in there.
  • 72:40 - 72:40
    This guy becomes 14,
    and this guy here becomes
  • 72:45 - 72:45
    eight.
    So, we update that guy,
  • 72:48 - 72:48
    make him be eight.
    Did I do this the right way?
  • 72:53 - 72:53
    Yeah, because pi is a function
    of this guy.
  • 72:57 - 72:57
    So basically,
    this thing, then,
  • 73:01 - 73:01
    disappears.
    Yeah, did I have another one
  • 73:05 - 73:05
    that I missed?
    12, yes, good,
  • 73:09 - 73:09
    it's removed,
    OK, because pi is just a
  • 73:13 - 73:13
    function.
    And now I'm OK.
  • 73:15 - 73:15
    OK, so now what do I do?
    OK, so now my set,
  • 73:19 - 73:19
    A, consists of these three
    things, and now I want the
  • 73:23 - 73:23
    cheapest edge.
    I know it's in the minimum
  • 73:27 - 73:27
    spanning tree.
    So let me just greedily pick
  • 73:31 - 73:31
    it.
    OK, so what's the cheapest
  • 73:35 - 73:35
    thing now?
    This guy up here?
  • 73:37 - 73:37
    Yeah, six.
    So we take it.
  • 73:39 - 73:39
    We go to update these things,
    and nothing matters here.
  • 73:45 - 73:45
    OK, nothing changes because
    these guys are already in A.
  • 73:50 - 73:50
    OK, so now the cheapest one is
    eight here.
  • 73:54 - 73:54
    Good.
    So, we take eight out.
  • 73:57 - 73:57
    OK, we update this.
    Nothing to be done.
  • 74:02 - 74:02
    This: nothing to be done.
    This: oh, no,
  • 74:05 - 74:05
    this one, instead of 14 we can
    make this be three.
  • 74:09 - 74:09
    So, we get rid of that pointer
    and make it point that way.
  • 74:14 - 74:14
    Now three is the cheapest
    thing.
  • 74:17 - 74:17
    So, we take it out,
    and of course there's nothing
  • 74:21 - 74:21
    to be done over there.
    And now, last,
  • 74:24 - 74:24
    I take nine.
    And it's done.
  • 74:27 - 74:27
    And 15: it's done.
    And the algorithm terminates.
  • 74:32 - 74:32
    OK, and as I look at,
    now, all the edges that I
  • 74:37 - 74:37
    picked, those are exactly all
    the edges that we had at the
  • 74:43 - 74:43
    beginning.
    OK, let's do an analysis here.
  • 74:58 - 74:58
    OK, so let's see,
    this part here costs me order
  • 75:06 - 75:06
    V, right?
    OK, and this part,
  • 75:11 - 75:11
    let's see what we are doing
    here.
  • 75:17 - 75:17
    Well, we're going to go through
    this loop how many times?
  • 75:27 - 75:27
    V times.
    It's V elements we put into the
  • 75:33 - 75:33
    queue.
    We are not inserting anything.
  • 75:36 - 75:36
    We're just taking them out.
    This goes V times,
  • 75:40 - 75:40
    OK, and we do a certain number
    of extract Mins.
  • 75:44 - 75:44
    So, we're going to do order V
    extract Mins.
  • 75:47 - 75:47
    And then we go to the adjacency
    list, and we have some constant
  • 75:53 - 75:53
    things.
    But we have these implicit
  • 75:56 - 75:56
    decrease keys for this stuff
    here.
  • 76:00 - 76:00
    That's this thing here.
    OK, and so how many implicit
  • 76:07 - 76:07
    decrease keys do we have?
    That's going to be the
  • 76:14 - 76:14
    expensive thing.
    OK, we have,
  • 76:18 - 76:18
    in this case,
    the degree of u of those.
  • 76:25 - 76:25
    OK, so overall,
    how many implicit decrease
  • 76:31 - 76:31
    keys do we have?
    Well, we have V times through.
  • 76:38 - 76:38
    How big could the degree of u
    be?
  • 76:43 - 76:43
    OK, it could be as big as V,
    order V.
  • 76:48 - 76:48
    So, that's V^2 decrease keys.
    But we can do a better bound
  • 76:57 - 76:57
    than that.
    How many do we really have?
  • 77:04 - 77:04
    Yeah, at most order E,
    OK, because what am I doing?
  • 77:12 - 77:12
    I'm summing up the degrees of
    all the vertices.
  • 77:19 - 77:19
    That's how many times I
    actually execute that.
  • 77:27 - 77:27
    So, I have order E,
    implicit decrease keys.
  • 77:34 - 77:34
    So the time overall is order V
    times time for whatever the
  • 77:44 - 77:44
    extract Min is plus E times the
    time for decrease key.
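Written out as a formula, the bound just derived is

$$T_{\text{Prim}} \;=\; O\!\left(V \cdot T_{\text{extract-min}} \;+\; E \cdot T_{\text{decrease-key}}\right),$$

and what follows is just plugging different data structures into the two T terms.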
  • 77:53 - 77:53
    So now, let's look at data
    structures, and we can evaluate
  • 78:03 - 78:03
    for different data structures
    what this formula gives us.
  • 78:14 - 78:14
    So, we have different ways of
    implementing this priority queue.
  • 78:21 - 78:21
    We have the cost of extract
    Min, and of decrease key,
  • 78:28 - 78:28
    and total.
    So, the simplest way of
  • 78:33 - 78:33
    implementing the priority queue is
    an unsorted array.
  • 78:38 - 78:38
    If I have an unsorted array,
    how much time does it take me
  • 78:45 - 78:45
    to extract the minimum element?
    If I have an unsorted array?
  • 78:52 - 78:52
    Right, order V in this case
    because it's an array of size V.
  • 78:58 - 78:58
    And, to do a decreased key,
    OK, I can do it in order one.
  • 79:06 - 79:06
    So, the total is V^2,
    good, order V^2 algorithm.
  • 79:14 - 79:14
    Or, as people suggested,
    how about a binary heap?
  • 79:23 - 79:23
    OK, to do an extract Min in a
    binary heap will cost me what?
  • 79:33 - 79:33
    O of log V.
    Decrease key will cost me,
  • 79:39 - 79:39
    yeah, it turns out you can do
    that in order log V because
  • 79:45 - 79:45
    basically you just have to
    shuffle the value,
  • 79:50 - 79:50
    actually shuffle it up towards
    the root, OK?
  • 79:54 - 79:54
    Order log V.
    And, the total cost therefore
  • 79:59 - 79:59
    is?
    E log V, good.
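As an aside: if you code this up in Python, the standard library's heapq gives you extract Min but no decrease key. A common workaround, sketched below under that assumption, is to push a fresh (key, vertex) entry whenever a key would decrease and to discard stale entries as they are popped; the heap then holds O(E) entries, and since log E = O(log V), the O(E log V) bound is preserved.

```python
import heapq
import math

def prim_heapq(adj, s):
    # Binary-heap Prim without decrease key: duplicate entries are
    # pushed, and entries for vertices already in A are skipped.
    best = {v: math.inf for v in adj}
    pi = {v: None for v in adj}
    best[s] = 0
    in_A = set()
    heap = [(0, s)]

    while heap:
        k, u = heapq.heappop(heap)         # extract Min: O(log E)
        if u in in_A:
            continue                       # stale duplicate entry
        in_A.add(u)
        for v, w in adj[u]:
            if v not in in_A and w < best[v]:
                best[v] = w
                pi[v] = u
                heapq.heappush(heap, (w, v))   # stands in for decrease key
    return {(pi[v], v) for v in adj if pi[v] is not None}
```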
  • 80:02 - 80:02
    Which of these is better?
    It depends, good.
  • 80:07 - 80:07
    When is one better,
    and when is the other better?
  • 80:13 - 80:13
    Yeah, if it's a dense graph,
    E is close to V^2,
  • 80:18 - 80:18
    the array is better.
    But if it's a sparse graph,
  • 80:24 - 80:24
    and E is much smaller than V^2,
    then the binary heap is better.
  • 80:33 - 80:33
    So that motivated the invention
    of a data structure,
  • 80:38 - 80:38
    OK, called a Fibonacci Heap.
    So, Fibonacci Heap is covered
  • 80:43 - 80:43
    in Chapter 20 of CLRS.
    We're not going to hold you
  • 80:48 - 80:48
    responsible for the content,
    but it's an interesting data
  • 80:53 - 80:53
    structure because it's an
    amortized data structure.
  • 80:58 - 80:58
    And it turns out that with this
    data structure,
  • 81:02 - 81:02
    you can do extract Min in order
    log V amortized time.
  • 81:08 - 81:08
    And remarkably,
    you can do decrease key in
  • 81:13 - 81:13
    order one amortized.
    So, when I plug those in,
  • 81:18 - 81:18
    what do I get over here?
  • 81:34 - 81:34
    What's that going to be?
    Plug that in here.
  • 81:42 - 81:42
    It's going to be V times log V
    plus E: E plus V log V.
  • 81:52 - 81:52
    These are amortized,
    so what's this?
  • 82:00 - 82:00
    Trick question.
    It's worst-case.
  • 82:02 - 82:02
    It's not amortized over here.
    These are amortized,
  • 82:06 - 82:06
    but that's the beauty of
    amortization.
  • 82:09 - 82:09
    I can say it's going to be
    worst case: E plus V log V over
  • 82:13 - 82:13
    here, because when I add up the
    amortized cost of my operations,
  • 82:18 - 82:18
    it's an upper bound on the true
    costs.
  • 82:20 - 82:20
    OK, so that's why I say,
    one of the beauties of this
  • 82:24 - 82:24
    amortized analysis,
    and in particular,
  • 82:27 - 82:27
    being able to assign different
    costs to different operations is
  • 82:32 - 82:32
    I can just add them up and I get
    my worst-case costs.
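Collected into the table being built on the board (the Fibonacci heap's per-operation bounds are amortized, but as just argued the total is a worst-case bound):

    data structure      extract Min         decrease key       total
    unsorted array      O(V)                O(1)               O(V^2)
    binary heap         O(log V)            O(log V)           O(E log V)
    Fibonacci heap      O(log V) amortized  O(1) amortized     O(E + V log V)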
  • 82:37 - 82:37
    So this is already V log V.
    There are a couple other
  • 82:41 - 82:41
    algorithms just before I let you
    go.
  • 82:43 - 82:43
    Kruskal's Algorithm in the book
    uses another amortized data
  • 82:47 - 82:47
    structure called a disjoint set
    data structure,
  • 82:50 - 82:50
    which also runs in E log V,
    that is, the same time
  • 82:53 - 82:53
    as using a binary
  • 82:57 - 82:57
    heap.
    So, I'll refer you to the book.
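For reference, here is a minimal sketch of Kruskal's algorithm with a simple disjoint-set structure (union by rank plus path compression); the outline is the textbook's, but the Python names are mine.

```python
def kruskal_mst(vertices, edges):
    # edges: list of (weight, u, v) triples; sorting them dominates
    # the running time, giving O(E log E) = O(E log V) overall.
    parent = {v: v for v in vertices}
    rank = {v: 0 for v in vertices}

    def find(x):                       # find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):                   # union by rank
        rx, ry = find(x), find(y)
        if rx == ry:
            return False               # endpoints already connected
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    mst = []
    for w, u, v in sorted(edges):      # consider edges cheapest-first
        if union(u, v):                # safe edge: joins two components
            mst.append((u, v, w))
    return mst
```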
  • 83:00 - 83:00
    The best algorithm to date with
    this problem is done by our own
  • 83:05 - 83:05
    David Karger on the faculty here
    with one of our former
  • 83:09 - 83:09
    graduates, Philip Klein,
    who is now a professor at
  • 83:13 - 83:13
    Brown, and Robert Tarjan,
    who is sort of like the master
  • 83:17 - 83:17
    of all data structures and is a
    professor at Princeton. That was in 1993.
  • 83:22 - 83:22
    OK, it's a randomized
    algorithm, and it gives you
  • 83:26 - 83:26
    order V plus E expected time.
    OK, so that's the best to date.
  • 83:32 - 83:32
    It's still open as to whether
    there is a deterministic,
  • 83:36 - 83:36
    worst-case
  • 83:41 - 83:41
    bound that is linear time.
    OK, but there is a randomized
  • 83:45 - 83:45
    linear-time algorithm,
    and otherwise,
  • 83:47 - 83:47
    this is essentially the best
    bound without additional
  • 83:52 - 83:52
    assumptions.
    OK, very cool stuff.
  • 83:54 - 83:54
    Next, we're going to see a lot
    of these ideas of greedy and
  • 83:59 - 83:59
    dynamic programming in practice.