
Lec 19 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005

  • 0:07 - 0:07
    -- shortest paths.
    This is the finale.
  • 0:11 - 0:11
    Hopefully it was worth waiting
    for.
  • 0:14 - 0:14
    Remind you there's a quiz
    coming up soon,
  • 0:18 - 0:18
    you should be studying for it.
    There's no problem set due at
  • 0:23 - 0:23
    the same time as the quiz
    because you should be studying
  • 0:29 - 0:29
    now.
    It's a take-home exam.
  • 0:32 - 0:32
    It's required that you come to
    class on Monday.
  • 0:37 - 0:37
    Of course, you'll all come,
    but everyone watching at home
  • 0:43 - 0:43
    should also come next Monday to
    get the quiz.
  • 0:48 - 0:48
    It's the required lecture.
    So, we need a bit of a recap of
  • 0:54 - 0:54
    the trilogy so far.
    So, the last two lectures,
  • 0:58 - 0:58
    the last two episodes,
    are about single source shortest
  • 1:04 - 1:04
    paths.
    So, we wanted to find the
  • 1:08 - 1:08
    shortest path from a source
    vertex to every other vertex.
  • 1:14 - 1:14
    And, we saw a few algorithms
    for this.
  • 1:17 - 1:17
    Here's some recap.
    We saw in the unweighted case,
  • 1:22 - 1:22
    that was sort of the easiest
    where all the edge weights were
  • 1:27 - 1:27
    one.
    Then we could use breadth first
  • 1:30 - 1:30
    search.
    And this costs what we call
  • 1:35 - 1:35
    linear time in the graph world,
    the number of vertices plus the
  • 1:42 - 1:42
    number of edges.
    The next simplest case,
  • 1:46 - 1:46
    perhaps, is nonnegative edge
    weights.
  • 1:50 - 1:50
    And in that case,
    what algorithm do we use?
  • 1:55 - 1:55
    Dijkstra, all right,
    everyone's awake.
  • 2:00 - 2:00
    Several answers at once,
    great.
  • 2:04 - 2:04
    So this takes almost linear
    time if you use a good heap
  • 2:12 - 2:12
    structure, so,
    V log V plus E.
  • 2:16 - 2:16
    And, in the general case,
    general weights,
  • 2:21 - 2:21
    we would use Bellman-Ford which
    you saw.
  • 2:27 - 2:27
    And that costs VE,
    good, OK, which is quite a bit
  • 2:33 - 2:33
    worse.
    This is ignoring log factors.
  • 2:38 - 2:38
    Dijkstra is basically linear
    time, Bellman-Ford is
  • 2:43 - 2:43
    quadratic if you have a
    connected graph.
  • 2:46 - 2:46
    So, in the sparse case,
    when E is order V,
  • 2:49 - 2:49
    this is about linear.
    This is about quadratic.
  • 2:53 - 2:53
    In the dense case,
    when E is about V^2,
  • 2:56 - 2:56
    this is quadratic,
    and this is cubic.
  • 3:00 - 3:00
    So, Dijkstra and Bellman-Ford
    are separated by about an order
  • 3:06 - 3:06
    of V factor, which is pretty
    bad.
  • 3:09 - 3:09
    OK, but that's the best we know
    how to do for single source
  • 3:15 - 3:15
    shortest paths,
    negative edge weights,
  • 3:19 - 3:19
    Bellman-Ford is the best.
    We also saw in recitation the
  • 3:25 - 3:25
    case of a DAG.
    And there, what do you do?
  • 3:30 - 3:30
    Topological sort,
    yeah.
  • 3:32 - 3:32
    So, you can do a topological
    sort to get an ordering on the
  • 3:39 - 3:39
    vertices.
    Then you run Bellman-Ford,
  • 3:43 - 3:43
    one round.
    This is one way to think of
  • 3:47 - 3:47
    what's going on.
    You run Bellman-Ford in the
  • 3:52 - 3:52
    order given by the topological
    sort, which is once,
  • 3:58 - 3:58
    and you get a linear time
    algorithm.
  • 4:03 - 4:03
    So, DAG is another case where
    we know how to do well even with
  • 4:07 - 4:07
    weights.
    Unweighted, we can also do
  • 4:09 - 4:09
    linear time.
    But most of the time,
  • 4:10 - 4:10
    though, will be,
    so you should keep these in
  • 4:13 - 4:13
    mind in the quiz.
    When you get a shortest path
  • 4:16 - 4:16
    problem, or what you end up
    determining is the shortest path
  • 4:19 - 4:19
    problem, think about what's the
    best algorithm you can use in
  • 4:23 - 4:23
    that case?
    OK, so that's single source
  • 4:25 - 4:25
    shortest paths.
    And so, in our evolution of the
  • 4:27 - 4:27
    Death Star, initially it was
    just nonnegative edge weights.
  • 4:31 - 4:31
    Then we got negative edge
    weights.
  • 4:34 - 4:34
    Today, the Death Star
    challenges us with all pair
  • 4:37 - 4:37
    shortest paths,
    where we want to know the
  • 4:40 - 4:40
    shortest path weight between
    every pair of vertices.
  • 4:59 - 4:59
    OK, so let's get some quick
    results.
  • 5:04 - 5:04
    What could we do with this
    case?
  • 5:08 - 5:08
    So, for example,
    suppose I have an unweighted
  • 5:14 - 5:14
    graph.
    Any suggestions of how I should
  • 5:19 - 5:19
    compute all pair shortest paths?
    Between every pair of vertices,
  • 5:27 - 5:27
    I want to know the shortest
    path weight.
  • 5:32 - 5:32
    BFS, a couple more words?
    Yeah?
  • 5:38 - 5:38
    Right, BFS V times.
    OK, I'll say V times BFS,
  • 5:45 - 5:45
    OK?
    So, the running time would be
  • 5:50 - 5:50
    V^2 plus V times E,
    yeah, which is assuming your
  • 5:57 - 5:57
    graph is connected,
    V times E.
  • 6:03 - 6:03
    OK, good.
    That's probably about the best
  • 6:05 - 6:05
    algorithm we know for unweighted
    graphs.
  • 6:08 - 6:08
    So, a lot of these are going to
    sort of be the obvious answer.
  • 6:12 - 6:12
    You take your single source
    algorithm, you run it V times.
  • 6:15 - 6:15
    That's the best you can do,
    OK, or the best we know how to
  • 6:19 - 6:19
    do.
    This is not so bad.
  • 6:20 - 6:20
    This is like one iteration of
    Bellman-Ford,
  • 6:23 - 6:23
    for comparison.
    We definitely need at least,
  • 6:25 - 6:25
    like, V^2 time,
    because the size of the output
  • 6:28 - 6:28
    is V^2, shortest path weight we
    have to compute.
  • 6:32 - 6:32
    So, this is not perfect,
    but pretty good.
  • 6:37 - 6:37
    And we are not going to improve
    on that.
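    Here is a minimal Python sketch of that idea (not from the
    lecture; the adjacency-list format and the function name are
    illustrative): run BFS once from each source and record the
    distances.

        from collections import deque

        def all_pairs_unweighted(adj):
            # adj: dict mapping each vertex to a list of out-neighbors.
            # Returns dist[s][v] = number of edges on a shortest s-to-v
            # path (float('inf') if unreachable).  One BFS per source,
            # so O(V(V + E)) total.
            dist = {}
            for s in adj:
                d = {v: float('inf') for v in adj}
                d[s] = 0
                q = deque([s])
                while q:
                    u = q.popleft()
                    for v in adj[u]:
                        if d[v] == float('inf'):
                            d[v] = d[u] + 1
                            q.append(v)
                dist[s] = d
            return dist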
  • 6:42 - 6:42
    So, nonnegative edge weights:
    the natural thing to do is to
  • 6:49 - 6:49
    run Dijkstra V times,
    OK, no big surprise.
  • 6:54 - 6:54
    And the running time of that
    is, well, V times E again,
  • 7:01 - 7:01
    plus V^2 log V,
    which is also not too bad.
  • 7:08 - 7:08
    I mean, it's basically the same
    as running BFS.
  • 7:11 - 7:11
    And then, there's the log
    factor.
  • 7:13 - 7:13
    If you ignore the log factor,
    this is the dominant term.
  • 7:16 - 7:16
    And, I mean,
    this had an additive V^2 as
  • 7:18 - 7:18
    well.
    So, these are both pretty good.
  • 7:21 - 7:21
    I mean, this is kind of neat.
    Essentially,
  • 7:23 - 7:23
    the time it takes to run one
    Bellman-Ford plus a log factor,
  • 7:27 - 7:27
    you can compute all pair
    shortest paths if you have
  • 7:30 - 7:30
    nonnegative edge weights.
    So, I mean, comparing all pairs
  • 7:35 - 7:35
    to single source,
    this seems a lot better,
  • 7:40 - 7:40
    except we can only handle
    nonnegative edge weights.
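    A hedged sketch of "V times Dijkstra" in Python (illustrative,
    not the lecture's pseudocode; the VE + V^2 log V bound quoted
    above assumes a Fibonacci heap, while this binary-heap version
    is O(VE log V)).

        import heapq
        import math

        def all_pairs_dijkstra(adj):
            # adj: dict u -> list of (v, weight), all weights nonnegative.
            dist = {}
            for s in adj:
                d = {v: math.inf for v in adj}
                d[s] = 0
                pq = [(0, s)]
                while pq:
                    du, u = heapq.heappop(pq)
                    if du > d[u]:
                        continue        # stale heap entry
                    for v, w in adj[u]:
                        if du + w < d[v]:
                            d[v] = du + w
                            heapq.heappush(pq, (d[v], v))
                dist[s] = d
            return dist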
  • 7:45 - 7:45
    OK, so now let's think about
    the general case.
  • 7:50 - 7:50
    Well, this is the focus of
    today, and here's where we can
  • 7:56 - 7:56
    actually make an improvement.
    So the obvious thing is V times
  • 8:02 - 8:02
    Bellman-Ford,
    which would cost V^2 times E.
  • 8:08 - 8:08
    And that's pretty pitiful,
    and we're going to try to
  • 8:12 - 8:12
    improve that to something closer
    to that nonnegative edge weight
  • 8:16 - 8:16
    bound.
    So it turns out,
  • 8:17 - 8:17
    here, we can actually make an
    improvement whereas in these
  • 8:21 - 8:21
    special cases,
    we really can't do much better.
  • 8:24 - 8:24
    OK, I don't have a good
    intuition why,
  • 8:27 - 8:27
    but it's the case.
    So, we'll cover something like
  • 8:30 - 8:30
    three algorithms today for this
    problem.
  • 8:34 - 8:34
    The last one will be the best,
    but along the way we'll see
  • 8:37 - 8:37
    some nice connections between
    shortest paths and dynamic
  • 8:40 - 8:40
    programming, which we haven't
    really seen yet.
  • 8:43 - 8:43
    We've seen shortest path,
    and applying greedy algorithms
  • 8:46 - 8:46
    to it, but today will actually
    do dynamic programming.
  • 8:49 - 8:49
    The intuition is that with all
    pair shortest paths,
  • 8:52 - 8:52
    there's more potential
    subproblem reuse.
  • 8:54 - 8:54
    We've got to compute the
    shortest path from x to y for
  • 8:57 - 8:57
    all x and y.
    Maybe we can reuse those
  • 8:59 - 8:59
    shortest paths in computing
    other shortest paths.
  • 9:03 - 9:03
    OK, there's a bit more
    reusability, let's say.
  • 9:07 - 9:07
    OK, let me quickly define all
    pair shortest paths formally,
  • 9:12 - 9:12
    because we're going to change
    our notation slightly.
  • 9:17 - 9:17
    It's because we care about all
    pairs.
  • 9:20 - 9:20
    So, as usual,
    the input is directed graph,
  • 9:24 - 9:24
    so, vertices and edges.
    We're going to say that the
  • 9:30 - 9:30
    vertices are labeled one to n
    for convenience because with all
  • 9:36 - 9:36
    pairs, we're going to think of
    things more as an n by n matrix
  • 9:42 - 9:42
    instead of edges in some sense
    because it doesn't help to think
  • 9:48 - 9:48
    any more in terms of adjacency
    lists.
  • 9:52 - 9:52
    And, you have edge weights as
    usual.
  • 9:55 - 9:55
    This is what makes it
    interesting.
  • 10:00 - 10:00
    Some of them are going to be
    negative.
  • 10:05 - 10:05
    So, w maps to every real
    number, and the target output is
  • 10:14 - 10:14
    a shortest path matrix.
    So, this is now an n by n
  • 10:21 - 10:21
    matrix.
    So, n is just the number of
  • 10:26 - 10:26
    vertices of shortest path
    weights.
  • 10:32 - 10:32
    So, delta of i,
    j is the shortest path weight
  • 10:38 - 10:38
    from i to j for all pairs of
    vertices.
  • 10:43 - 10:43
    So this, you could represent as
    an n by n matrix in particular.
  • 10:51 - 10:51
    OK, so now let's start doing
    algorithms.
  • 10:57 - 10:57
    So, we have this very simple
    algorithm, V times Bellman-Ford,
  • 11:02 - 11:02
    V^2 times E,
    and just for comparison's sake,
  • 11:06 - 11:06
    I'm going to say,
    let me rewrite that,
  • 11:10 - 11:10
    V times Bellman-Ford gives us
    this running time of V^2 E,
  • 11:15 - 11:15
    and I'm going to think about
    the case where,
  • 11:18 - 11:18
    let's just say the graph is
    dense, meaning that the number
  • 11:23 - 11:23
    of edges is quadratic
    in the number of vertices.
  • 11:29 - 11:29
    So in that case,
    this will take V^4 time,
  • 11:33 - 11:33
    which is pretty slow.
    We'd like to do better.
  • 11:38 - 11:38
    So, first goal would just be to
    beat V^4, V hypercubed,
  • 11:43 - 11:43
    I guess.
    OK, and we are going to use
  • 11:47 - 11:47
    dynamic programming to do that.
    Or at least that's what the
  • 11:53 - 11:53
    motivation will come from.
    It will take us a while before
  • 11:59 - 11:59
    we can even beat V^4,
    which is maybe a bit pathetic,
  • 12:04 - 12:04
    but it takes some clever
    insights, let's say.
  • 12:10 - 12:10
    OK, so I'm going to introduce a
    bit more notation for this
  • 12:20 - 12:20
    graph.
    So, I'm going to think about
  • 12:25 - 12:25
    the weighted adjacency matrix.
    So, I don't think we've really
  • 12:33 - 12:33
    seen this in lecture before,
    although I think it's in the
  • 12:38 - 12:38
    appendix.
    What that means,
  • 12:40 - 12:40
    so normally adjacency matrix is
    like one if there's an edge,
  • 12:44 - 12:44
    and zero if there isn't.
    And this is in a digraph,
  • 12:48 - 12:48
    so you have to be a little bit
    careful.
  • 12:51 - 12:51
    Here, these values,
    the entries in the matrix,
  • 12:54 - 12:54
    are going to be the weights of
    the edges.
  • 12:57 - 12:57
    OK, this is w(i, j) if ij is an
    edge.
  • 13:01 - 13:01
    So, if ij is an edge in the
    graph, and it's going to be
  • 13:05 - 13:05
    infinity if there is no edge.
    OK, in terms of shortest paths,
  • 13:09 - 13:09
    this is a more useful way to
    represent the graph.
  • 13:12 - 13:12
    All right, and so this includes
    everything that we need from
  • 13:16 - 13:16
    here.
    And now we just have to think
  • 13:18 - 13:18
    about it as a matrix.
    Matrices will be a useful tool
  • 13:22 - 13:22
    in a little while.
    OK, so now I'm going to define
  • 13:25 - 13:25
    some sub problems.
    And, there's different ways
  • 13:28 - 13:28
    that you could define what's
    going on in the shortest paths
  • 13:32 - 13:32
    problem.
    OK, the natural thing is I want
  • 13:36 - 13:36
    to go from vertex i to vertex j.
    What's the shortest path?
  • 13:39 - 13:39
    OK, we need to refine the sub
    problems a little bit more than
  • 13:43 - 13:43
    that.
    Not surprising.
  • 13:44 - 13:44
    And if you think about my
    analogy to Bellman-Ford,
  • 13:47 - 13:47
    what Bellman-Ford does is it
    tries to build longer and longer
  • 13:50 - 13:50
    shortest paths.
    But here, length is in terms of
  • 13:53 - 13:53
    the number of edges.
    So, first, it builds shortest
  • 13:55 - 13:55
    paths of length one.
    We've proven in the first round it
  • 13:58 - 13:58
    does that.
    The second round,
  • 14:02 - 14:02
    it provides all shortest paths
    of length two,
  • 14:06 - 14:06
    edge count two,
    and so on.
  • 14:09 - 14:09
    We'd like to do that sort of
    analogously, and try to reuse
  • 14:15 - 14:15
    things a little bit more.
    So, I'm going to say d_ij^(m)
  • 14:21 - 14:21
    is the weight of the shortest
    path from i to j with some
  • 14:26 - 14:26
    restriction involving m.
    So: shortest path from i to j
  • 14:33 - 14:33
    using at most m edges.
    OK, for example,
  • 14:37 - 14:37
    if m is zero,
    then we don't have to really
  • 14:41 - 14:41
    think very hard to find all
    shortest paths of length zero.
  • 14:47 - 14:47
    OK, they use zero edges,
    I should say.
  • 14:51 - 14:51
    So, Bellman-Ford sort of tells
    us how to go from m to m plus
  • 14:57 - 14:57
    one.
    So, let's just figure that out.
  • 15:02 - 15:02
    So one thing we know from the
    Bellman-Ford analysis is if we
  • 15:06 - 15:06
    look at d_ij^(n-1),
    we know that in some sense the
  • 15:09 - 15:09
    longest shortest path of
    relevance, unless you have
  • 15:12 - 15:12
    negative weight cycle,
    the longest shortest path of
  • 15:15 - 15:15
    relevance is when m equals n
    minus one because that's the
  • 15:19 - 15:19
    longest simple path you can
    have.
  • 15:21 - 15:21
    So, this should be a shortest
    path weight from i to j,
  • 15:25 - 15:25
    and it would be no matter what
    larger value you put in the
  • 15:28 - 15:28
    superscript.
    This should be delta of i comma
  • 15:32 - 15:32
    j if there's no negative weight
    cycles.
  • 15:35 - 15:35
    OK, so this feels good for
    dynamic programming.
  • 15:39 - 15:39
    This will give us the answer if
    we can compute this for all m.
  • 15:43 - 15:43
    Then we'll have the shortest
    path weights in particular.
  • 15:47 - 15:47
    We need a way to detect
    negative weight cycles,
  • 15:51 - 15:51
    but let's not worry about that
    too much for now.
  • 15:54 - 15:54
    There are negative weights,
    but let's just assume for now
  • 15:58 - 15:58
    there's no negative weight
    cycles.
  • 16:02 - 16:02
    OK, and we get a recursion
    recurrence.
  • 16:06 - 16:06
    And the base case is when m
    equals zero.
  • 16:11 - 16:11
    This is pretty easy.
    If i and j are the same vertex,
  • 16:17 - 16:17
    the weight is zero,
    and otherwise it's infinity.
  • 16:22 - 16:22
    OK, and then the actual
    recursion is for m.
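    (What goes on the board here, reconstructed from the discussion
    that follows: d_ij^(m) = min over k of ( d_ik^(m-1) + a_kj ).)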
  • 16:57 - 16:57
    OK, if I got this right,
    this is a pretty easy,
  • 17:01 - 17:01
    intuitive recursion for
    d_ij^(m) is a min of smaller
  • 17:05 - 17:05
    things in terms of m minus one.
    I'll just show the picture,
  • 17:10 - 17:10
    and then the proof of that
    claim should be obvious.
  • 17:15 - 17:15
    So, this is proof by picture.
    So, we have on the one hand,
  • 17:20 - 17:20
    i over here,
    and j over here.
  • 17:22 - 17:22
    We want to know the shortest
    path from i to j.
  • 17:26 - 17:26
    And, we want to use,
    at most, m edges.
  • 17:30 - 17:30
    So, the idea is,
    well, you could use m minus one
  • 17:35 - 17:35
    edges to get somewhere.
    So this is, at most,
  • 17:39 - 17:39
    m minus one edges,
    some other place,
  • 17:43 - 17:43
    and we'll call it k.
    So this is a candidate for k.
  • 17:48 - 17:48
    And then you could take the
    edge directly from k to j.
  • 17:53 - 17:53
    So, this costs a_kj,
    and this costs d_ik^(m-1).
  • 18:00 - 18:00
    OK, and that's a candidate path
    that uses,
  • 18:03 - 18:03
    at most, m edges from i to j.
    And this is essentially just
  • 18:06 - 18:06
    considering all of them.
    OK, so there's sort of many
  • 18:09 - 18:09
    paths we are considering.
    All of these are candidate
  • 18:12 - 18:12
    values of k.
    We are taking the min over all
  • 18:14 - 18:14
    k as intermediate nodes,
    whatever.
  • 18:16 - 18:16
    So there they are.
    We take the best such path.
  • 18:19 - 18:19
    That should encompass all
    shortest paths.
  • 18:21 - 18:21
    And this is essentially sort of
    what Bellman-Ford is doing,
  • 18:24 - 18:24
    although not exactly.
    We also sort of want to think
  • 18:27 - 18:27
    about, well, what if I just go
    directly with,
  • 18:29 - 18:29
    say, m minus one edges?
    What if there is no edge here
  • 18:34 - 18:34
    that I want to use,
    in some sense?
  • 18:37 - 18:37
    Well, we always think about
    there being, and the way the A's
  • 18:41 - 18:41
    are defined, there's always this
    zero weight edge to yourself.
  • 18:45 - 18:45
    So, you could just take a path
    that's shorter,
  • 18:49 - 18:49
    go from i to j,
    and j is a particular value of
  • 18:52 - 18:52
    k that we might consider,
    and then take a zero weight
  • 18:56 - 18:56
    edge, a_jj, at the end.
    OK, so this really encompasses
  • 19:00 - 19:00
    everything.
    So that's a pretty trivial
  • 19:04 - 19:04
    claim.
    OK, now once we have such a
  • 19:06 - 19:06
    recursion, we get a dynamic
    program.
  • 19:09 - 19:09
    I mean, there,
    this is it in some sense.
  • 19:12 - 19:12
    It's written recursively.
    You can write a bottom up.
  • 19:15 - 19:15
    And I would like to write it
    bottom up it little bit because
  • 19:20 - 19:20
    while it doesn't look like it,
    this is a relaxation.
  • 19:23 - 19:23
    This is yet another relaxation
    algorithm.
  • 19:26 - 19:26
    So, I'll give you,
    so, this is sort of the
  • 19:29 - 19:29
    algorithm.
    This is not a very interesting
  • 19:32 - 19:32
    algorithm.
    So, you don't have to write it
  • 19:36 - 19:36
    all down if you don't feel like
    it.
  • 19:38 - 19:38
    It's probably not even in the
    book.
  • 19:40 - 19:40
    This is just an intermediate
    step.
  • 19:43 - 19:43
    So, we loop over all m.
    That's sort of the outermost
  • 19:46 - 19:46
    thing to do.
    I want to build longer and
  • 19:48 - 19:48
    longer paths,
    and this vaguely corresponds to
  • 19:51 - 19:51
    Bellman-Ford,
    although it's actually worse
  • 19:54 - 19:54
    than Bellman-Ford.
    But hey, what the heck?
  • 19:56 - 19:56
    It's a stepping stone.
    OK, then for all i and j,
  • 20:04 - 20:04
    and then we want to compute
    this min.
  • 20:10 - 20:10
    So, we'll just loop over all k,
    and relax.
  • 20:18 - 20:18
    And, here's where we're
    actually computing the min.
  • 20:27 - 20:27
    And, it's a relaxation,
    is the point.
  • 20:35 - 20:35
    This is our good friend,
    the relaxation step,
  • 20:38 - 20:38
    relaxing edge.
    Well, it's not,
  • 20:40 - 20:40
    yeah.
    I guess we're relaxing edge kj,
  • 20:43 - 20:43
    or something,
    except we don't have the same
  • 20:46 - 20:46
    clear notion.
    I mean, it's a particular thing
  • 20:49 - 20:49
    that we're relaxing.
    It's not just a single edge
  • 20:52 - 20:52
    because we don't have a single
    source anymore.
  • 20:55 - 20:55
    It's now relative to source i,
    we are relaxing the edge kj,
  • 20:59 - 20:59
    something like that.
    But this is clearly a
  • 21:03 - 21:03
    relaxation.
    We are just making the triangle
  • 21:06 - 21:06
    inequality true if it wasn't
    before.
  • 21:08 - 21:08
    The triangle inequality has got
    to hold between all pairs.
  • 21:12 - 21:12
    And that's just implementing
    this min, right?
  • 21:14 - 21:14
    You're taking d_ij.
    You take the min of what it was
  • 21:17 - 21:17
    before in some sense.
    That was one of the
  • 21:20 - 21:20
    possibilities we considered when
    we looked at the zero weight
  • 21:24 - 21:24
    edge.
    We say, well,
  • 21:25 - 21:25
    or you could go from i to some
    k in some way that we knew how
  • 21:28 - 21:28
    to before, and then add on the
    edge, and check whether that's
  • 21:32 - 21:32
    better; if it's better,
    set our current estimate to
  • 21:35 - 21:35
    that.
    And, you do this for all k.
  • 21:39 - 21:39
    In particular,
    you might actually compute
  • 21:41 - 21:41
    something smaller than this min
    because I didn't put
  • 21:44 - 21:44
    superscripts up here.
    But that's just making paths
  • 21:47 - 21:47
    even better.
    OK, so you have to argue that
  • 21:49 - 21:49
    relaxation is always a good
    thing to do.
  • 21:51 - 21:51
    So, by not putting
    superscripts,
  • 21:53 - 21:53
    maybe I do some more
    relaxation, but more relaxation
  • 21:56 - 21:56
    never hurts us.
    You can still argue correctness
  • 21:59 - 21:59
    using this claim.
    So, it's not quite the direct
  • 22:03 - 22:03
    implementation,
    but there you go,
  • 22:06 - 22:06
    dynamic programming algorithm.
    The main reason I'll write it
  • 22:10 - 22:10
    down: so you see that it's a
    relaxation, and you see the
  • 22:15 - 22:15
    running time is n^4,
    OK, which is certainly no
  • 22:18 - 22:18
    better than Bellman-Ford.
    Bellman-Ford was n^4 even in
  • 22:23 - 22:23
    the dense case,
    and it's a little better in the
  • 22:26 - 22:26
    sparse case.
    So: not doing so great.
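    A Python sketch of those four nested loops (assuming the weighted
    adjacency matrix A described above, with zeros on the diagonal and
    infinity for missing edges; names are illustrative):

        import math

        def apsp_quartic_dp(A):
            n = len(A)
            # paths of zero edges: 0 on the diagonal, infinity elsewhere
            d = [[0 if i == j else math.inf for j in range(n)]
                 for i in range(n)]
            for m in range(1, n):          # paths of at most m edges
                for i in range(n):
                    for j in range(n):
                        for k in range(n):
                            # the relaxation step
                            if d[i][k] + A[k][j] < d[i][j]:
                                d[i][j] = d[i][k] + A[k][j]
            return d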
  • 22:30 - 22:30
    But it's a start.
    OK, it gets our dynamic
  • 22:35 - 22:35
    programming minds thinking.
    And, we'll get a better dynamic
  • 22:42 - 22:42
    program in a moment.
    But first, there's actually
  • 22:47 - 22:47
    something useful we can do with
    this formulation,
  • 22:53 - 22:53
    and I guess I'll ask,
    but I'll be really impressed if
  • 22:59 - 22:59
    anyone can see.
    Does this formula look like
  • 23:05 - 23:05
    anything else that you've seen
    in any context,
  • 23:10 - 23:10
    mathematical or algorithmic?
    Have you seen that recurrence
  • 23:16 - 23:16
    anywhere else?
    OK, not exactly as stated,
  • 23:20 - 23:20
    but similar.
    I'm sure if you thought about
  • 23:25 - 23:25
    it for awhile,
    you could come up with it.
  • 23:30 - 23:30
    Any answers?
    I didn't think it would be
  • 23:33 - 23:33
    very intuitive,
    but the answer is matrix
  • 23:36 - 23:36
    multiplication.
    And it may now be obvious to
  • 23:40 - 23:40
    you, or it may not.
    You have to think with the
  • 23:43 - 23:43
    right quirky mind.
    Then it's obvious that it's
  • 23:47 - 23:47
    matrix multiplication.
    Remember, matrix
  • 23:50 - 23:50
    multiplication,
    we have A, B,
  • 23:52 - 23:52
    and C.
    They're all n by n matrices.
  • 23:55 - 23:55
    And, we want to compute C
    equals A times B.
  • 24:00 - 24:00
    And what that meant was,
    well, c_ij was a sum over all k
  • 24:05 - 24:05
    of a_ik times b_kj.
    All right, that was our
  • 24:09 - 24:09
    definition of matrix
    multiplication.
  • 24:12 - 24:12
    And that formula looks kind of
    like this one.
  • 24:16 - 24:16
    I mean, notice the subscripts:
    ik and kj.
  • 24:19 - 24:19
    Now, the operators are a little
    different.
  • 24:23 - 24:23
    Here, we're multiplying the
    inside things and adding them
  • 24:28 - 24:28
    all together.
    There, we're adding the inside
  • 24:34 - 24:34
    things and taking the min.
    But other than that,
  • 24:41 - 24:41
    it's the same.
    OK, weird, but here we go.
  • 24:47 - 24:47
    So, the connection to shortest
    paths is you replace these
  • 24:55 - 24:55
    operators.
    So, let's take matrix
  • 25:00 - 25:00
    multiplication and replace,
    what should I do first,
  • 25:05 - 25:05
    plus this thing with min.
    So, why not just change the
  • 25:10 - 25:10
    operators, replace dot with
    plus?
  • 25:14 - 25:14
    This is just a different
    algebra to work in,
  • 25:18 - 25:18
    where plus actually means min,
    and dot actually means plus.
  • 25:24 - 25:24
    So, you have to check that
    things sort of work out in that
  • 25:30 - 25:30
    context, but if we do that,
    then we get that c_ij is the
  • 25:35 - 25:35
    min overall k of a_ik plus,
    a bit messy here,
  • 25:40 - 25:40
    b_kj.
    And that looks like what we
  • 25:44 - 25:44
    actually want to compute,
    here, for one value of m,
  • 25:49 - 25:49
    you have to sort of do this m
    times.
  • 25:52 - 25:52
    But this conceptually is
    d_ij^(m), and this is
  • 25:56 - 25:56
    d_ik^(m-1).
    So, this is looking like a
  • 26:00 - 26:00
    matrix product,
    which is kind of cool.
  • 26:04 - 26:04
    So, if we sort of plug in this
    claim, then, and think about
  • 26:12 - 26:12
    things as matrices,
    the recurrence gives us,
  • 26:17 - 26:17
    and I'll just write this now in
    matrix form: that d^(m) is d^(m-1),
  • 26:26 - 26:26
    funny product,
    A.
  • 26:30 - 26:30
    All right, so these are the
    weights.
  • 26:32 - 26:32
    These were the weighted
    adjacency matrix.
  • 26:35 - 26:35
    This was the previous d value.
    This is the new d value.
  • 26:38 - 26:38
    So, I'll just rewrite that in
    matrix form with capital
  • 26:41 - 26:41
    letters.
    OK, I circle the things
  • 26:44 - 26:44
    that are using this funny
    algebra, so, in particular,
  • 26:47 - 26:47
    circled product.
    OK, so that's kind of nifty.
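    As a sketch, the circled product in Python, with min playing the
    role of plus and plus playing the role of times:

        import math

        def min_plus_product(A, B):
            n = len(A)
            C = [[math.inf] * n for _ in range(n)]
            for i in range(n):
                for j in range(n):
                    for k in range(n):
                        # c_ij = min over k of (a_ik + b_kj)
                        C[i][j] = min(C[i][j], A[i][k] + B[k][j])
            return C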
  • 26:50 - 26:50
    We know something about
    computing matrix
  • 26:52 - 26:52
    multiplications.
    We can do it in n^3 time.
  • 26:55 - 26:55
    If we were a bit fancier,
    maybe we could do it in
  • 26:58 - 26:58
    sub-cubic time.
    So, we could try to sort of use
  • 27:03 - 27:03
    this connection.
    And, well, think about what we
  • 27:07 - 27:07
    are computing here.
    We are saying,
  • 27:10 - 27:10
    well, d to the m is the
    previous one times A.
  • 27:15 - 27:15
    So, what is d^(m)?
    Is that some other algebraic
  • 27:19 - 27:19
    notion that we know?
    Yeah, it's the exponent.
  • 27:23 - 27:23
    We're taking A,
    and we want to raise it to the
  • 27:28 - 27:28
    power, m, with this funny notion
    of product.
  • 27:33 - 27:33
    So, in other words,
    d to the m is really just A to
  • 27:37 - 27:37
    the m in a funny way.
    So, I'll circle it,
  • 27:40 - 27:40
    OK?
    So, that sounds good.
  • 27:42 - 27:42
    We also know how to compute
    powers of things relatively
  • 27:46 - 27:46
    quickly, if you remember how.
    OK, for this notion,
  • 27:50 - 27:50
    this power notion,
    to make sense,
  • 27:53 - 27:53
    I should say what A to the zero
    means.
  • 27:55 - 27:55
    And so, I need some kind of
    identity matrix.
  • 28:00 - 28:00
    And for here,
    the identity matrix is this
  • 28:03 - 28:03
    one, if I get it right.
    So, it has zeros along the
  • 28:07 - 28:07
    diagonal, and infinities
    everywhere else.
  • 28:09 - 28:09
    OK, that sort of just to match
    this definition.
  • 28:13 - 28:13
    d_ij^(0) should be zeros on
    the diagonal and infinity
  • 28:17 - 28:17
    everywhere else.
    But you can check this is
  • 28:20 - 28:20
    actually an identity.
    If you multiply it with this
  • 28:23 - 28:23
    funny multiplication against any
    other matrix,
  • 28:27 - 28:27
    you get the matrix back.
    Nothing changes.
  • 28:31 - 28:31
    This really is a valid identity
    matrix.
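    In code, that identity (which is also D^(0)) might look like this
    sketch:

        import math

        def min_plus_identity(n):
            # zeros on the diagonal, infinity everywhere else
            return [[0 if i == j else math.inf for j in range(n)]
                    for i in range(n)]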
  • 28:35 - 28:35
    And, I should mention that for
    A to the m to make sense,
  • 28:40 - 28:40
    you really need that your
    product operation is
  • 28:45 - 28:45
    associative.
    So, actually A to the m circled
  • 28:49 - 28:49
    makes sense because circled
    multiplication is associative,
  • 28:54 - 28:54
    and you can check that;
    not hard because,
  • 28:58 - 28:58
    I mean, min is associative,
    and addition is associative,
  • 29:04 - 29:04
    and all sorts of good stuff.
    And, you have some kind of
  • 29:11 - 29:11
    distributivity property.
    And, this is,
  • 29:14 - 29:14
    in turn, because the real
    numbers with,
  • 29:18 - 29:18
    let me get the right order here,
    with min as your addition
  • 29:24 - 29:24
    operation, and plus as your
    multiplication operation is a
  • 29:29 - 29:29
    closed semi-ring.
    So, if ever you want to know
  • 29:34 - 29:34
    when powers make sense,
    this is a good rule.
  • 29:38 - 29:38
    If you have a closed semi-ring,
    then matrix products on that
  • 29:43 - 29:43
    semi-ring will give you an
    associative operator,
  • 29:47 - 29:47
    and then, good,
    you can take products.
  • 29:50 - 29:50
    OK, that's just some formalism.
    So now, we have some intuition.
  • 29:55 - 29:55
    The question is,
    what's the right
  • 29:57 - 29:57
    algorithm?
    There are many possible
  • 30:00 - 30:00
    answers, some of which are
    right, some of which are not.
  • 30:06 - 30:06
    So, we have this connection to
    matrix products,
  • 30:10 - 30:10
    and we have a connection to
    matrix powers.
  • 30:13 - 30:13
    And, we have algorithms for
    both.
  • 30:16 - 30:16
    The question is,
    what should we do?
  • 30:18 - 30:18
    So, all we need to do now is to
    compute A to the funny power,
  • 30:23 - 30:23
    n minus one.
    n minus one is when we get
  • 30:26 - 30:26
    shortest paths,
    assuming we have no negative
  • 30:30 - 30:30
    weight cycles.
    In fact, we could compute a
  • 30:34 - 30:34
    larger power than n minus one.
    Once you get beyond n minus
  • 30:39 - 30:39
    one, multiplying by A doesn't
    change you anymore.
  • 30:43 - 30:43
    So, how should we do it?
    OK, you're not giving any smart
  • 30:48 - 30:48
    answers.
    I'll give the stupid answer.
  • 30:51 - 30:51
    You could say,
    well, I take A.
  • 30:53 - 30:53
    I multiply it by A.
    Then I multiply it by A,
  • 30:57 - 30:57
    and I multiply it by A,
    and I use normal,
  • 31:00 - 31:00
    boring matrix to
    multiplication.
  • 31:04 - 31:04
    So, I do, like,
    n minus two,
  • 31:07 - 31:07
    standard matrix multiplies.
    So, standard multiply costs,
  • 31:13 - 31:13
    like, n^3.
    And I'm doing n of them.
  • 31:17 - 31:17
    So, this gives me an n^4
    algorithm, and compute all the
  • 31:23 - 31:23
    shortest path weights in n^4.
    Woohoo!
  • 31:27 - 31:27
    OK, no improvement.
    So, how can I do better?
  • 31:32 - 31:32
    Right, natural thing to try
    which sadly does not work,
  • 31:36 - 31:36
    is to use the sub cubic matrix
    multiply algorithm.
  • 31:40 - 31:40
    We will, in some sense,
    get there in a moment with a
  • 31:44 - 31:44
    somewhat simpler problem.
    But, it's actually not known
  • 31:48 - 31:48
    how to compute shortest paths
    using fast matrix multiplication
  • 31:53 - 31:53
    like Strassen's
    algorithm.
  • 31:56 - 31:56
    But, good suggestion.
    OK, you have to think about why
  • 32:00 - 32:00
    it doesn't work,
    and I'll tell you.
  • 32:04 - 32:04
    It's not obvious,
    so it's a perfectly reasonable
  • 32:07 - 32:07
    suggestion.
    But in this context it doesn't
  • 32:10 - 32:10
    quite work.
    It will come up in a few
  • 32:13 - 32:13
    moments.
    The problem is,
  • 32:14 - 32:14
    Strassen requires the notion of
    subtraction.
  • 32:17 - 32:17
    And here, addition is min.
    And, there's no inverse to min.
  • 32:21 - 32:21
    Once you take the arguments,
    you can't sort of undo a min.
  • 32:25 - 32:25
    OK, so there's no notion of
    subtraction, so it's not known
  • 32:29 - 32:29
    how to pull that off,
    sadly.
  • 32:32 - 32:32
    So, what other tricks do we
    have up our sleeve?
  • 32:36 - 32:36
    Yeah?
    Divide and conquer,
  • 32:38 - 32:38
    log n powering,
    yeah, repeated squaring.
  • 32:41 - 32:41
    That works.
    Good, we had a fancy way.
  • 32:44 - 32:44
    If you had a number n,
    you sort of looked at the
  • 32:48 - 32:48
    binary number representation of
    n, and you either squared the
  • 32:53 - 32:53
    number or squared it and multiplied
    in another factor of A.
  • 32:57 - 32:57
    Here, we don't even have to be
    smart about it.
  • 33:02 - 33:02
    OK, we can just compute,
    we really only have to think
  • 33:07 - 33:07
    about powers of two.
    What we want to know,
  • 33:11 - 33:11
    and I'm going to need a bigger
    font here because there's
  • 33:17 - 33:17
    multiple levels of subscripts,
    A to the circled power,
  • 33:23 - 33:23
    two to the ceiling of log (n minus one).
    Actually, n minus one would be
  • 33:28 - 33:28
    enough.
    But there you go.
  • 33:32 - 33:32
    You can write n if you didn't
    leave yourself enough space like
  • 33:36 - 33:36
    me: n in the ceiling,
    in the circle.
  • 33:38 - 33:38
    This just means the next power
    of two after n minus one,
  • 33:41 - 33:41
    two to the ceiling of the log.
    So, we don't have to go
  • 33:44 - 33:44
    directly to n minus one.
    We can go further because
  • 33:47 - 33:47
    anything farther than n minus
    one is still just the shortest
  • 33:51 - 33:51
    path weights.
    If you look at the definition,
  • 33:53 - 33:53
    and you know that your paths
    are simple, which is true if you
  • 33:57 - 33:57
    have no negative weight cycles,
    then fine, just go farther.
  • 34:02 - 34:02
    Why not?
    And so, to compute this,
  • 34:05 - 34:05
    we just do ceiling of log n
    minus one products,
  • 34:09 - 34:09
    just take A squared,
    and then take the result and
  • 34:14 - 34:14
    square it; take the result and
    square it.
  • 34:17 - 34:17
    So, this is order log n
    squares.
  • 34:20 - 34:20
    And, we don't know how to use
    Strassen, but we can use the
  • 34:25 - 34:25
    boring, standard multiply of
    n^3, and that gives us n^3 log n
  • 34:31 - 34:31
    running time,
    OK, which finally is something
  • 34:35 - 34:35
    that beats Bellman-Ford in the
    dense case.
  • 34:40 - 34:40
    OK, in the dense case,
    Bellman-Ford was n^4.
  • 34:43 - 34:43
    Here we get n^3 log n,
    finally something better.
  • 34:47 - 34:47
    In the sparse case,
    it's about the same,
  • 34:50 - 34:50
    maybe a little worse.
    E is order V.
  • 34:52 - 34:52
    Then we're going to get,
    like, V^3 for Bellman-Ford.
  • 34:56 - 34:56
    Here, we get n^3 log n.
    OK, after log factors,
  • 34:59 - 34:59
    this is an improvement some of
    the time.
  • 35:03 - 35:03
    OK, it's about the same the
    other times.
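    A self-contained sketch of the repeated-squaring idea (names are
    illustrative; assumes there is no negative-weight cycle, which is
    exactly what gets checked next):

        import math

        def apsp_repeated_squaring(A):
            n = len(A)

            def square(M):
                # one min-plus square, O(n^3)
                C = [[math.inf] * n for _ in range(n)]
                for i in range(n):
                    for j in range(n):
                        for k in range(n):
                            C[i][j] = min(C[i][j], M[i][k] + M[k][j])
                return C

            D, m = A, 1
            while m < n - 1:        # ceil(lg(n-1)) squarings
                D = square(D)
                m *= 2
            return D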
  • 35:06 - 35:06
    Another nifty thing that you
    get for free out of this,
  • 35:10 - 35:10
    is you can detect negative
    weight cycles.
  • 35:13 - 35:13
    So, here's a bit of a puzzle.
    How would I detect,
  • 35:17 - 35:17
    after I compute this product,
    A to the power two to the ceiling of log n
  • 35:21 - 35:21
    minus one, how would I know if I
    found a negative weight cycle?
  • 35:26 - 35:26
    What would that mean in this
    matrix of all pairs shortest
  • 35:30 - 35:30
    paths of, at most,
    a certain length?
  • 35:34 - 35:34
    If I found a cycle,
    what would have to be in that
  • 35:37 - 35:37
    matrix?
    Yeah?
  • 35:38 - 35:38
    Right, so I could,
    for example,
  • 35:39 - 35:39
    take this thing,
    multiply it by A,
  • 35:41 - 35:41
    see if the matrix changed at
    all.
  • 35:43 - 35:43
    Right, that works fine.
    That's what we do in
  • 35:46 - 35:46
    Bellman-Ford.
    It's an even simpler thing.
  • 35:48 - 35:48
    It's already there.
    You don't have to multiply.
  • 35:51 - 35:51
    But that's the same running
    time.
  • 35:53 - 35:53
    That's a good answer.
    The diagonal would have a
  • 35:56 - 35:56
    negative value,
    yeah.
  • 35:57 - 35:57
    So, this is just a cute thing.
    Both approaches would work,
  • 36:05 - 36:05
    can detect a negative weight
    cycle just by looking at the
  • 36:15 - 36:15
    diagonal of the matrix.
    You just look for a negative
  • 36:24 - 36:24
    value in the diagonal.
    OK.
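    As a one-line sketch on the matrix D computed above:

        def has_negative_cycle(D):
            # a negative diagonal entry means some vertex reaches itself
            # with negative total weight, i.e. a negative-weight cycle
            return any(D[i][i] < 0 for i in range(len(D)))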
  • 36:30 - 36:30
    So, that's algorithm one,
    let's say.
  • 36:33 - 36:33
    I mean, we've seen several that
    are all bad, but I'll call this
  • 36:37 - 36:37
    number one.
    OK, we'll see two more.
  • 36:40 - 36:40
    This is the only one that will,
    well, I shouldn't say that.
  • 36:44 - 36:44
    Fine, there we go.
    So, this is one dynamic program
  • 36:48 - 36:48
    that wasn't so helpful,
    except it showed us a
  • 36:51 - 36:51
    connection to matrix
    multiplication,
  • 36:54 - 36:54
    which is interesting.
    We'll see why it's useful a
  • 36:58 - 36:58
    little bit more.
    But, it led to these nasty four
  • 37:02 - 37:02
    nested loops.
    And, using this trick,
  • 37:05 - 37:05
    we got down to n^3 log n.
    Let's try for just n^3.
  • 37:08 - 37:08
    OK, just get rid of that log.
    It's annoying.
  • 37:12 - 37:12
    It makes you a little bit worse
    than Bellman-Ford,
  • 37:15 - 37:15
    in the sparse case.
    So, let's just erase one of
  • 37:19 - 37:19
    these nested loops.
    OK, I want to do that.
  • 37:22 - 37:22
    OK, obviously that algorithm
    doesn't work because it refers to
  • 37:26 - 37:26
    k, and it's not
    defined, but,
  • 37:28 - 37:28
    you know, I've got enough
    variables.
  • 37:31 - 37:31
    Why don't I just define k to
    be m?
  • 37:35 - 37:35
    OK, it turns out that works.
    I'll do it from scratch,
  • 37:40 - 37:40
    but why not?
    I don't know if that's how
  • 37:43 - 37:43
    Floyd and Warshall came up with
    their algorithm,
  • 37:47 - 37:47
    but here you go.
    Here's Floyd-Warshall.
  • 37:50 - 37:50
    The idea is to define the
    subproblems a little bit more
  • 37:55 - 37:55
    cleverly so that to compute one
    of these values,
  • 37:59 - 37:59
    you don't have to take the min
    of n things.
  • 38:04 - 38:04
    I just want to take the min of
    two things.
  • 38:07 - 38:07
    If I could do that,
    and I still only have n^3
  • 38:10 - 38:10
    subproblems, then I would have
    n^3 time.
  • 38:12 - 38:12
    So, all right,
    the running time of dynamic
  • 38:15 - 38:15
    program is number of subproblems
    times the time to compute the
  • 38:19 - 38:19
    recurrence for one subproblem.
    So, here's linear times n^3,
  • 38:23 - 38:23
    and we want n^3 times constant.
    That would be good.
  • 38:26 - 38:26
    So that's Floyd-Warshall.
    So, here's the way we're going
  • 38:30 - 38:30
    to redefine c_ij.
    Or I guess, there it was called
  • 38:35 - 38:35
    d_ij.
    Good, so we're going to define
  • 38:39 - 38:39
    something new.
    So, c_ij superscript k is now
  • 38:44 - 38:44
    going to be the weight of the
    shortest path from i to j as
  • 38:50 - 38:50
    before.
    Notice I used the superscript k
  • 38:54 - 38:54
    instead of m because I want k
    and m to be the same thing.
  • 39:00 - 39:00
    Deep.
    OK, now, here's the new
  • 39:04 - 39:04
    constraint.
    I want all intermediate
  • 39:06 - 39:06
    vertices along the path,
    meaning all vertices except for
  • 39:10 - 39:10
    i and j at the beginning and the
    end to have a small label.
  • 39:14 - 39:14
    So, they should be in the set
    from one up to k.
  • 39:17 - 39:17
    And this is where we are really
    using that our vertices are
  • 39:21 - 39:21
    labeled one up to n.
    So, I'm going to say,
  • 39:24 - 39:24
    well, first think about the
    shortest paths that don't use
  • 39:28 - 39:28
    any other vertices.
    That's when k is zero.
  • 39:32 - 39:32
    Then think about all the
    shortest paths that maybe they
  • 39:36 - 39:36
    use vertex one.
    And then think about the
  • 39:38 - 39:38
    shortest paths that maybe use
    vertex one or vertex two.
  • 39:42 - 39:42
    Why not?
    You could define it in this
  • 39:44 - 39:44
    way.
    It turns out,
  • 39:45 - 39:45
    then when you increase k,
    you only have to think about
  • 39:48 - 39:48
    one new vertex.
    Here, we had to take min over
  • 39:51 - 39:51
    all k.
    Now we know which k to look at.
  • 39:54 - 39:54
    OK, maybe that made sense.
    Maybe it's not quite obvious
  • 39:57 - 39:57
    yet.
    But I'm going to redo this
  • 39:59 - 39:59
    claim, redo a recurrence.
    So, maybe first I should say
  • 40:04 - 40:04
    some obvious things.
    So, if I want delta of ij of
  • 40:07 - 40:07
    the shortest path weight,
    well, just take all the
  • 40:11 - 40:11
    vertices.
    So, take c_ij superscript n.
  • 40:13 - 40:13
    That's everything.
    And this even works,
  • 40:16 - 40:16
    this is true even if you have a
    negative weight cycle.
  • 40:20 - 40:20
    Although, again,
    we're going to sort of ignore
  • 40:23 - 40:23
    negative weight cycles as long
    as we can detect them.
  • 40:27 - 40:27
    And, another simple case is if
    you have, well,
  • 40:30 - 40:30
    c_ij superscript zero.
    Let me put that in the claim to
  • 40:36 - 40:36
    be a little bit more consistent
    here.
  • 40:41 - 40:41
    So, here's the new claim.
    If we want to compute c_ij
  • 40:47 - 40:47
    superscript zero,
    what is it?
  • 40:51 - 40:51
    Superscript zero means I really
    shouldn't use any intermediate
  • 40:59 - 40:59
    vertices.
    So, this has a very simple
  • 41:04 - 41:04
    answer, a three letter answer.
    So, it's not zero.
  • 41:09 - 41:09
    It's four letters.
    What's that?
  • 41:13 - 41:13
    Nil.
    No, not working yet.
  • 41:15 - 41:15
    It has some subscripts,
    too.
  • 41:18 - 41:18
    So, the definition would be,
    what's the shortest path weight
  • 41:25 - 41:25
    from i to j when you're not
    allowed to use any intermediate
  • 41:32 - 41:32
    vertices?
    Sorry?
  • 41:35 - 41:35
    So, yeah, it has a very simple
    name.
  • 41:38 - 41:38
    That's the tricky part.
    All right, so if i equals j,
  • 41:43 - 41:43
    [LAUGHTER] you're clever,
    right, open bracket i equals j
  • 41:49 - 41:49
    means one, well,
    OK.
  • 41:50 - 41:50
    It sort of works,
    but it's not quite right.
  • 41:55 - 41:55
    In fact, I want infinity if i
    does not equal j.
  • 41:59 - 41:59
    And I want zero if i equals
    j: a_ij, good.
  • 42:05 - 42:05
    I think it's a_ij.
    It should be,
  • 42:07 - 42:07
    right?
    Maybe I'm wrong.
  • 42:09 - 42:09
    Right, a_ij.
    So it's essentially not what I
  • 42:12 - 42:12
    said.
    That's the point.
  • 42:14 - 42:14
    If i does not equal j,
    you still have to think about a
  • 42:18 - 42:18
    single edge connecting i to j,
    right?
  • 42:21 - 42:21
    OK, so that's a bit of a
    subtlety.
  • 42:23 - 42:23
    This is only intermediate
    vertices, so you could still go
  • 42:27 - 42:27
    from i to j via a single edge.
    That will cost w(i, j)
  • 42:33 - 42:33
    if there is an edge,
    infinity
  • 42:35 - 42:35
    if there isn't one:
    that is a_ij.
  • 42:37 - 42:37
    So, OK, that gets us started.
    And then, we want a recurrence.
  • 42:42 - 42:42
    And, the recurrence is,
    well, maybe you get away with
  • 42:46 - 42:46
    all the vertices that you had
    before.
  • 42:49 - 42:49
    So, if you want to know paths
    that you had before,
  • 42:53 - 42:53
    so if you want to know paths
    that use one up to k,
  • 42:57 - 42:57
    maybe I just use one up to k
    minus one.
  • 43:01 - 43:01
    You could try that.
    Or, you could try using k.
  • 43:05 - 43:05
    So, either you use k or you
    don't.
  • 43:07 - 43:07
    If you don't,
    it's got to be this.
  • 43:10 - 43:10
    If you do, then you've got to
    go to k.
  • 43:13 - 43:13
    So why not go to k at the end?
    So, you go from I to k using
  • 43:17 - 43:17
    the previous vertices.
    Obviously, you don't want to
  • 43:21 - 43:21
    repeat k in there.
    And then, you go from k to j
  • 43:25 - 43:25
    somehow using vertices that are
    not k.
  • 43:29 - 43:29
    This should be pretty
    intuitive.
  • 43:31 - 43:31
    Again, I can draw a picture.
    So, either you never go to k,
  • 43:36 - 43:36
    and that's this wiggly line.
    You go from i to j using things
  • 43:40 - 43:40
    only one up to k minus one.
    In other words,
  • 43:44 - 43:44
    here we have to use one up to
    k.
  • 43:46 - 43:46
    So, this just means don't use
    k.
  • 43:48 - 43:48
    So, that's this thing.
    Or, you use k somewhere in the
  • 43:52 - 43:52
    middle there.
    OK, it's got to be one of the
  • 43:56 - 43:56
    two.
    And in this case,
  • 43:57 - 43:57
    you go from i to k using only
    smaller vertices,
  • 44:01 - 44:01
    because you don't want to
    repeat k.
  • 44:05 - 44:05
    And here, you go from k to j
    using only smaller labeled
  • 44:11 - 44:11
    vertices.
    So, every path is one of the
  • 44:15 - 44:15
    two.
    So, we take the shortest of
  • 44:18 - 44:18
    these two subproblems.
    That's the answer.
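    (For reference, the recurrence just described:
    c_ij^(k) = min( c_ij^(k-1), c_ik^(k-1) + c_kj^(k-1) ),
    with the base case c_ij^(0) = a_ij.)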
  • 44:22 - 44:22
    So, now we have a min of two
    things.
  • 44:26 - 44:26
    It takes constant time to
    compute.
  • 44:30 - 44:30
    So, we get a cubic algorithm.
    So, let me write it down.
  • 44:37 - 44:37
    So, this is the Floyd-Warshall
    algorithm.
  • 44:41 - 44:41
    I'll write the name again.
    You give it a matrix A.
  • 44:47 - 44:47
    That's all it really needs to
    know.
  • 44:50 - 44:50
    It encodes everything.
    You copy A into C.
  • 44:54 - 44:54
    That's the warm up.
    Right at time zero,
  • 44:59 - 44:59
    C equals A.
    And then you just have these
  • 45:03 - 45:03
    three loops for every value of
    k, for every value of i,
  • 45:07 - 45:07
    and for every value of j.
    You compute that min.
  • 45:11 - 45:11
    And if you think about it a
    little bit, that min is a
  • 45:15 - 45:15
    relaxation.
    Surprise, surprise.
  • 45:47 - 45:47
    So, that is the Floyd-Warshall
    algorithm.
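    A Python sketch of exactly those three loops (A is the weighted
    adjacency matrix from before; names are illustrative):

        def floyd_warshall(A):
            n = len(A)
            C = [row[:] for row in A]        # C = A, the warm-up
            for k in range(n):
                for i in range(n):
                    for j in range(n):
                        # the relaxation step
                        if C[i][k] + C[k][j] < C[i][j]:
                            C[i][j] = C[i][k] + C[k][j]
            # C[i][j] = delta(i, j) if there is no negative-weight cycle
            return C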
  • 45:52 - 45:52
    And, the running time is
    clearly n^3, three nested loops,
  • 45:58 - 45:58
    constant time inside.
    So, we're finally getting
  • 46:02 - 46:02
    something that is never worse
    than Bellman-Ford.
  • 46:05 - 46:05
    In the sparse case,
    it's the same.
  • 46:07 - 46:07
    And anything denser,
    the number of edges is super
  • 46:10 - 46:10
    linear.
    This is strictly better than
  • 46:12 - 46:12
    Bellman-Ford.
    And, it's better than
  • 46:14 - 46:14
    everything we've seen so far for
    all pair, shortest paths.
  • 46:17 - 46:17
    And, this handles negative
    weights; very simple algorithm,
  • 46:20 - 46:20
    even simpler than the one
    before.
  • 46:22 - 46:22
    It's just relaxation within
    three loops.
  • 46:24 - 46:24
    What more could you ask for?
    And we need to check that this
  • 46:27 - 46:27
    is indeed the min we're
    computing here,
  • 46:29 - 46:29
    except that the superscripts
    are omitted.
  • 46:33 - 46:33
    That's, again,
    a bit of hand waving.
  • 46:36 - 46:36
    It's OK to omit superscripts
    because that can only mean that
  • 46:39 - 46:39
    you're doing more relaxations
    than you strictly need to.
  • 46:43 - 46:43
    Doing more relaxations can
    never hurt you.
  • 46:45 - 46:45
    In particular,
    we do all the ones that we have
  • 46:48 - 46:48
    to.
    Therefore, we find the shortest
  • 46:50 - 46:50
    path weights.
    And, again, here,
  • 46:52 - 46:52
    we're assuming that there is no
    negative weight cycles.
  • 46:56 - 46:56
    It shouldn't be hard to find
    them, but you have to think
  • 46:59 - 46:59
    about that a little bit.
    OK, you could run another round
  • 47:04 - 47:04
    of Bellman-Ford,
    see if it relaxes any
  • 47:08 - 47:08
    edges again, for example.
  • 47:10 - 47:10
    I think there's no nifty trick
    for that version.
  • 47:13 - 47:13
    And, we're going to cover,
    that's our second algorithm for
  • 47:18 - 47:18
    all pairs shortest paths.
    Before we go up to the third
  • 47:22 - 47:22
    algorithm, which is going to be
    the cleverest of them all,
  • 47:26 - 47:26
    the one Ring to rule them all,
    to switch trilogies,
  • 47:30 - 47:30
    we're going to take a little
    bit of a diversion,
  • 47:34 - 47:34
    side story, whatever,
    and talk about transitive
  • 47:37 - 47:37
    closure briefly.
    This is just a good thing to
  • 47:43 - 47:43
    know about.
    And, it relates to the
  • 47:46 - 47:46
    algorithms we've seen so far.
    So, here's a transitive closure
  • 47:51 - 47:51
    problem.
    I give you a directed graph,
  • 47:55 - 47:55
    and for all pair vertices,
    i and j, I want to compute this
  • 48:00 - 48:00
    number.
    It's one if there's a path from
  • 48:03 - 48:03
    i to j.
    From i to j,
  • 48:07 - 48:07
    OK, and then zero otherwise.
    OK, this is sort of like a
  • 48:14 - 48:14
    boring adjacency matrix with no
    weights, except it's about paths
  • 48:23 - 48:23
    instead of being about edges.
    OK, so how can I compute this?
  • 48:32 - 48:32
    That's very simple.
    How should I compute this?
  • 48:39 - 48:39
    This gives me a graph in some
    sense.
  • 48:45 - 48:45
    This is adjacency matrix of a
    new graph called the transitive
  • 48:55 - 48:55
    closure of my input graph.
    So, breadth first search,
  • 49:02 - 49:02
    yeah, good.
    So, all I need to do is find
  • 49:05 - 49:05
    shortest paths,
    and if the weights come out
  • 49:08 - 49:08
    infinity, then there's no path.
    If it's less than infinity,
  • 49:13 - 49:13
    that there's a path.
    And so here,
  • 49:15 - 49:15
    so you are saying maybe I don't
    care about the weights,
  • 49:20 - 49:20
    so I can run breadth first
    search n times,
  • 49:23 - 49:23
    and that will work indeed.
    So, if we do V times BFS,
  • 49:27 - 49:27
    so it's maybe weird that I'm
    covering this here in the middle,
  • 49:32 - 49:32
    but it's just an interlude.
    So, we have,
  • 49:36 - 49:36
    then, something like V times E.
    OK, you can run any of these
  • 49:42 - 49:42
    algorithms.
    You could take Floyd-Warshall
  • 49:47 - 49:47
    for example.
    Why not?
  • 49:49 - 49:49
    OK, then it would just be V^3.
    I mean, you could run any of
  • 49:55 - 49:55
    these algorithms with weights of
    one or zero, and just check
  • 50:01 - 50:01
    whether the values are infinity
    or not.
  • 50:06 - 50:06
    So, I mean, t_ij equals zero,
    if and only if the shortest
  • 50:10 - 50:10
    path weight from i to j is
    infinity.
  • 50:13 - 50:13
    So, just solve this.
    This is an easier problem than
  • 50:16 - 50:16
    shortest paths.
    It is, in fact,
  • 50:18 - 50:18
    strictly easier in a certain
    sense, because what's going on
  • 50:23 - 50:23
    with transitive closure,
    and I just want to mention this
  • 50:27 - 50:27
    out of interest because
    transitive closure is a useful
  • 50:31 - 50:31
    thing to know about.
    Essentially,
  • 50:34 - 50:34
    what we are doing,
    let me get this right,
  • 50:37 - 50:37
    is using a different set of
    operators.
  • 50:39 - 50:39
    We're using or and and,
    a logical or and and instead of
  • 50:43 - 50:43
    min and plus,
    OK, because we want to know,
  • 50:46 - 50:46
    if you think about a
    relaxation, in some sense,
  • 50:50 - 50:50
    maybe I should think about it
    in terms of this min.
  • 50:53 - 50:53
    So, if I want to know,
    is there a pathway from I to j
  • 50:57 - 50:57
    that uses vertices labeled one
    through k in the middle?
  • 51:02 - 51:02
    Well, either there is a path
    that doesn't use the vertex k,
  • 51:06 - 51:06
    or there is a path that uses k,
    and then it would have to look
  • 51:10 - 51:10
    like that.
    OK, so there would have to be a
  • 51:12 - 51:12
    path here, and there would have
    to be a path there.
  • 51:16 - 51:16
    So, the min and plus get
    replaced with or and and.
  • 51:19 - 51:19
    And if you remember,
    this used to be plus,
  • 51:21 - 51:21
    and this used to be product in
    the matrix world.
  • 51:25 - 51:25
    So, plus is now like or.
    And, multiply is now like and,
  • 51:28 - 51:28
    which sounds very good,
    right?
  • 51:31 - 51:31
    Plus does feel like or,
    and multiply does feel like and
  • 51:36 - 51:36
    if you live in a zero-one world.
    So, in fact,
  • 51:40 - 51:40
    this is not quite the field Z
    mod two, but this is a good,
  • 51:45 - 51:45
    nice, field to work in.
    This is the Boolean world.
  • 51:50 - 51:50
    So, I'll just write Boole.
    Good old Boole knows all about
  • 51:55 - 51:55
    this.
    It's like his master's thesis,
  • 51:58 - 51:58
    I think, talking about Boolean
    algebra.
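    A sketch of transitive closure done with or/and in the same triple
    loop as Floyd-Warshall (illustrative; here the trivial path from a
    vertex to itself counts, so the diagonal is 1):

        def transitive_closure(adj):
            # adj: n x n 0/1 matrix, adj[i][j] = 1 iff edge (i, j) exists
            n = len(adj)
            t = [[1 if i == j or adj[i][j] else 0 for j in range(n)]
                 for i in range(n)]
            for k in range(n):
                for i in range(n):
                    for j in range(n):
                        # t_ij = t_ij OR (t_ik AND t_kj)
                        t[i][j] = t[i][j] or (t[i][k] and t[k][j])
            return t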
  • 52:03 - 52:03
    And, this actually means that
    you can use fast matrix
  • 52:07 - 52:07
    multiply.
    You can use Strassen's
  • 52:09 - 52:09
    algorithm, and the fancier
    algorithms, and you can compute
  • 52:14 - 52:14
    the transitive closure in
    subcubic time.
  • 52:16 - 52:16
    So, this is sub cubic if the
    edges are sparse.
  • 52:20 - 52:20
    But, it's cubic in the worst
    case if there are lots of edges.
  • 52:24 - 52:24
    This is cubic.
    You can actually do better
  • 52:27 - 52:27
    using Strassen.
    So, I'll just say you can do
  • 52:31 - 52:31
    it.
    No details here.
  • 52:33 - 52:33
    I think it should be,
    so in fact, there is a theorem.
  • 52:37 - 52:37
    This is probably not in the
    textbook, but there's a theorem
  • 52:41 - 52:41
    that says transitive closure is
    just as hard as matrix multiply.
  • 52:46 - 52:46
    OK, they are equivalent.
    Their running times are the
  • 52:50 - 52:50
    same.
    We don't know how long it takes
  • 52:52 - 52:52
    to do a matrix multiply over a
    field.
  • 52:55 - 52:55
    It's somewhere between n^2 and
    n^2.3.
  • 52:58 - 52:58
    But, whatever the answer is:
    same for transitive closure.
  • 53:03 - 53:03
    OK, there's the interlude.
    And that's where we actually
  • 53:09 - 53:09
    get to use Strassen and friends.
    Remember, Strassen was n to the
  • 53:17 - 53:17
    log base two of seven algorithm.
    Remember that,
  • 53:22 - 53:22
    especially on the final.
    Those are things you should
  • 53:28 - 53:28
    have at the tip of your tongue.
    OK, the last algorithm we're
  • 53:35 - 53:35
    going to cover is really going
    to build on what we saw last
  • 53:39 - 53:39
    time: Johnson's algorithm.
    And, I've lost some of the
  • 53:43 - 53:43
    running times here.
    But, when we had unweighted
  • 53:46 - 53:46
    graphs, we could do all pairs
    really fast, just as fast as a
  • 53:51 - 53:51
    single source Bellman-Ford.
    That's kind of nifty.
  • 53:54 - 53:54
    We don't know how to improve
    Bellman-Ford in the single
  • 53:58 - 53:58
    source case.
    So, we can't really hope to get
  • 54:03 - 54:03
    anything better than V times E.
    And, if you remember running V
  • 54:07 - 54:07
    times Dijkstra,
    V times Dijkstra was about the
  • 54:11 - 54:11
    same.
    So, just put this in the recall
  • 54:14 - 54:14
    bubble here: V times Dijkstra
    would give us V times E plus V^2
  • 54:19 - 54:19
    log V.
    And, if you ignore that log
  • 54:22 - 54:22
    factor, this is just VE.
    OK, so this was really good.
  • 54:26 - 54:26
    Dijkstra was great.
    And this was for nonnegative
  • 54:30 - 54:30
    edge weights.
    So, with negative edge weights,
  • 54:34 - 54:34
    somehow we'd like to get the
    same running time.
  • 54:38 - 54:38
    Now, how might I get the same
    running time?
  • 54:41 - 54:41
    Well, it would be really nice
    if I could use Dijkstra.
  • 54:46 - 54:46
    Of course, Dijkstra doesn't
    work with negative weights.
  • 54:50 - 54:50
    So what could I do?
    What would I hope to do?
  • 54:53 - 54:53
    What could I hope to?
    Suppose I want,
  • 54:56 - 54:56
    in the middle of the algorithm,
    it says run Dijkstra n times.
  • 55:02 - 55:02
    Then, what should I do to
    prepare for that?
  • 55:06 - 55:06
    Make all the weights positive,
    or nonnegative.
  • 55:10 - 55:10
    Why not, right?
    We're being wishful thinking.
  • 55:13 - 55:13
    That's what we'll do.
    So, this is called graph
  • 55:17 - 55:17
    re-weighting.
    And, what's cool is we actually
  • 55:21 - 55:21
    already know how to do it.
    We just don't know that we know
  • 55:26 - 55:26
    how to do it.
    But I know that we know that we
  • 55:30 - 55:30
    know how to do it.
    You don't yet know that we know
  • 55:34 - 55:34
    that I know that we know how to
    do it.
  • 55:39 - 55:39
    So, it turns out you can
    re-weight the vertices.
  • 55:42 - 55:42
    So, at the end of the last
    class someone asked me,
  • 55:44 - 55:44
    can you just,
    like, add the same weight to
  • 55:47 - 55:47
    all the edges?
    That doesn't work.
  • 55:49 - 55:49
    Not quite, because different
    paths have different numbers of
  • 55:52 - 55:52
    edges.
    What we are going to do is add
  • 55:54 - 55:54
    a particular weight to each
    vertex.
  • 55:56 - 55:56
    What does that mean?
    Well, because we really only
  • 55:58 - 55:58
    have weights on the edges,
    here's what we'll do.
  • 56:02 - 56:02
    We'll re-weight each edge,
    so, (u,v), let's say,
  • 56:07 - 56:07
    going to go back into graph
    speak instead of matrix speak,
  • 56:12 - 56:12
    (u,v) instead of i and j,
    and we'll call this modified
  • 56:18 - 56:18
    weight w_h.
    h is our function.
  • 56:21 - 56:21
    It gives us a number for every
    vertex.
  • 56:24 - 56:24
    And, it's just going to be the
    old weight of that edge plus the
  • 56:31 - 56:31
    weight of the start vertex minus
    the weight of the terminating
  • 56:37 - 56:37
    vertex.
    I'm sure these have good names.
  • 56:41 - 56:41
    One of these is the head,
    and the other is the tail,
  • 56:44 - 56:44
    but I can never remember which.
    OK, so we've directed edge
  • 56:47 - 56:47
    (u,v).
    Just add one of them;
  • 56:49 - 56:49
    subtract the other.
    And, it's a directed edge,
  • 56:51 - 56:51
    so that's a consistent
    definition.
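    In symbols, the definition just spelled out is

        \[ w_h(u, v) \;=\; w(u, v) + h(u) - h(v)
           \qquad \text{for each directed edge } (u, v) \in E. \]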
  • 56:53 - 56:53
    OK, so that's called
    re-weighting.
  • 56:55 - 56:55
    Now, this is actually a
    theorem.
  • 56:58 - 56:58
    If you do this,
    then, let's say,
  • 57:03 - 57:03
    for any vertices,
    u and v in the graph,
  • 57:10 - 57:10
    for any two vertices,
    all paths from u to v have the
  • 57:19 - 57:19
    same weight as they did before,
    well, not quite.
  • 57:27 - 57:27
    They have the same
    re-weighting.
  • 57:34 - 57:34
    So, if you look at all the
    different paths and you say,
  • 57:37 - 57:37
    well, what's the difference
    between delta sub h, well,
  • 57:40 - 57:40
    sorry, let's say delta,
    which is the old shortest
  • 57:43 - 57:43
    paths, and delta sub h,
    which is the shortest path
  • 57:46 - 57:46
    weights according to this new
    weight function,
  • 57:49 - 57:49
    then that difference is the
    same.
  • 57:50 - 57:50
    So, we'll say that all these
    paths are re-weighted by the
  • 57:54 - 57:54
    same amounts.
    OK, this is actually a
  • 57:56 - 57:56
    statement about all paths,
    not just shortest paths.
  • 58:00 - 58:00
    There we go.
    OK, to how many people is this
  • 58:05 - 58:05
    obvious already?
    A few, yeah,
  • 58:09 - 58:09
    it is.
    And what's the one word?
  • 58:13 - 58:13
    OK, it's maybe not that
    obvious.
  • 58:16 - 58:16
    All right, shout out the word
    when you figure it out.
  • 58:23 - 58:23
    Meanwhile, I'll write out this
    rather verbose proof.
  • 58:29 - 58:29
    There's a one word proof,
    still waiting.
  • 58:36 - 58:36
    So, let's just take one of
    these paths that starts at u and
  • 58:42 - 58:42
    ends at v.
    Take any path.
  • 58:44 - 58:44
    We're just going to see what
    its new weight is relative to
  • 58:49 - 58:49
    its old weight.
    And so, let's just write out
  • 58:54 - 58:54
    w_h of the path,
    which we define in the usual
  • 58:58 - 58:58
    way as the sum over all edges of
    the new weight of the edge from
  • 59:04 - 59:04
    v_i to v_i plus one.
    Do you have the word?
  • 59:09 - 59:09
    No?
    Tough puzzle then,
  • 59:11 - 59:11
    OK.
    So that's the definition of the
  • 59:15 - 59:15
    weight of a path.
    And, then we know this thing is
  • 59:20 - 59:20
    just w of v_i,
    v_i plus one.
  • 59:23 - 59:23
    I'll get it right:
    plus the re-weighting of the
  • 59:28 - 59:28
    start vertex, h of v_i,
  • 59:32 - 59:32
    minus the re-weighting of v_i
    plus one.
  • 59:38 - 59:38
    This is all in parentheses
    that's summed over i.
  • 59:43 - 59:43
    Now I need the magic word.
    Telescopes, good.
  • 59:47 - 59:47
    Now this is obvious:
    each of these telescopes with
  • 59:51 - 59:51
    the next or previous term,
    except the very beginning and
  • 59:56 - 59:56
    the very end.
    So, this is the sum of these
  • 60:00 - 60:00
    weights of edges,
    but then outside the sum,
  • 60:04 - 60:04
    we have plus h of v_1,
    and minus h of v_k.
  • 60:09 - 60:09
    OK, those guys don't quite
    cancel.
  • 60:12 - 60:12
    We're not looking at a cycle,
    just a path.
  • 60:16 - 60:16
    And, this thing is just w of
    the path, as this is the normal
  • 60:21 - 60:21
    weight of the path.
    And so the change,
  • 60:24 - 60:24
    the difference between w_h of P
    and w of P is this thing,
  • 60:29 - 60:29
    which is just h of u minus h of
    v.
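    Written out for a path p = <v_1, ..., v_k> from u = v_1 to
    v = v_k, the telescoping computation is

        \[
            w_h(p) \;=\; \sum_{i=1}^{k-1}
                \bigl( w(v_i, v_{i+1}) + h(v_i) - h(v_{i+1}) \bigr)
            \;=\; w(p) + h(v_1) - h(v_k)
            \;=\; w(p) + h(u) - h(v).
        \]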
  • 60:33 - 60:33
    And, the point is that's the
    same as long as you fix the
  • 60:37 - 60:37
    endpoints, u and v,
    of the shortest path,
  • 60:39 - 60:39
    you're changing this path
    weight by the same thing for all
  • 60:43 - 60:43
    paths.
    This is for any path from u to
  • 60:46 - 60:46
    v, and that proves the theorem.
    So, the one word here was
  • 60:50 - 60:50
    telescopes.
    These change in weights
  • 60:52 - 60:52
    telescope over any path.
    Therefore, if we want to find
  • 60:56 - 60:56
    shortest paths,
    you just find the shortest
  • 60:58 - 60:58
    paths in this re-weighted
    version, and then you just
  • 61:02 - 61:02
    change it by this one amount.
    You subtract off this amount
  • 61:07 - 61:07
    instead of adding it.
    That will give you the shortest
  • 61:10 - 61:10
    path weight in the original
    weights.
  • 61:13 - 61:13
    OK, so this is a tool.
    We now know how to change
  • 61:16 - 61:16
    weights in the graph.
    But what we really want is to
  • 61:19 - 61:19
    change weights in the graph so
    that the weights all come out
  • 61:23 - 61:23
    nonnegative.
    OK, how do we do that?
  • 61:25 - 61:25
    Why in the world would there be
    a function, h,
  • 61:28 - 61:28
    that makes all the edge weights
    nonnegative?
  • 61:32 - 61:32
    It doesn't make sense.
    It turns out we already know.
  • 61:43 - 61:43
    So, I should write down this
    consequence.
  • 62:12 - 62:12
    Let me get this in the right
    order.
  • 62:14 - 62:14
    So in particular,
    the shortest path changes by
  • 62:17 - 62:17
    this amount.
    And if you want to know this
  • 62:20 - 62:20
    value, you just move the stuff
    to the other side.
  • 62:23 - 62:23
    So, we compute delta sub h,
    then we can compute delta.
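    In symbols, the corollary is

        \[ \delta_h(u, v) \;=\; \delta(u, v) + h(u) - h(v),
           \qquad\text{so}\qquad
           \delta(u, v) \;=\; \delta_h(u, v) - h(u) + h(v). \]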
  • 62:26 - 62:26
    That's the consequence here.
    How many people here pronounce
  • 62:30 - 62:30
    this word corollary?
    OK, and how many people
  • 62:34 - 62:34
    pronounce it corollary?
    Yeah, we are alone.
  • 62:38 - 62:38
    Usually get at least one other
    student, and they're usually
  • 62:43 - 62:43
    Canadian or British or
    something.
  • 62:45 - 62:45
    I think it's the accent.
    So, I always avoid pronouncing
  • 62:50 - 62:50
    this word unless I really think,
    it's corollary,
  • 62:54 - 62:54
    and get it right.
    I at least say Z not Zed.
  • 62:58 - 62:58
    OK, here we go.
    So, what we want to do is find
  • 63:03 - 63:03
    one of these functions.
    I mean, let's just write down
  • 63:09 - 63:09
    what we could hope to have.
    We want to find a re-weighting
  • 63:16 - 63:16
    function, h, that assigns a weight
    to each vertex such that w_h of
  • 63:23 - 63:23
    (u,v) is nonnegative.
    That would be great for all
  • 63:28 - 63:28
    edges, all (u,v) in E.
    OK, then we could run Dijkstra.
  • 63:35 - 63:35
    We could run Dijkstra,
    get the delta h's,
  • 63:38 - 63:38
    and then just undo the
    re-weighting,
  • 63:41 - 63:41
    and get what we want.
    And, that is Johnson's
  • 63:45 - 63:45
    algorithm.
    The claim is that this is
  • 63:48 - 63:48
    always possible.
    OK, why should it always be
  • 63:52 - 63:52
    possible?
    Well, let's look at this
  • 63:55 - 63:55
    constraint.
    w_h of (u,v) is that.
  • 63:58 - 63:58
    So, it's w of (u,v) plus h of u
    minus h of v should be
  • 64:02 - 64:02
    nonnegative.
    Let me rewrite this a little
  • 64:10 - 64:10
    bit.
    I'm going to put these guys
  • 64:15 - 64:15
    over here.
    That would be the right thing,
  • 64:22 - 64:22
    h of v minus h of u is less
    than or equal to w of (u,v).
  • 64:31 - 64:31
    Does that look familiar?
    Did I get it right?
  • 64:39 - 64:39
    It should be right.
    Anyone seen that inequality
  • 64:46 - 64:46
    before?
    Yeah, yes, correct answer.
  • 64:52 - 64:52
    OK, where?
    In a previous lecture?
  • 64:57 - 64:57
    In the previous lecture.
    What is this called if I
  • 65:06 - 65:06
    replace h with x?
    Charles knows.
  • 65:11 - 65:11
    Good, anyone else remember all
    the way back to episode two?
  • 65:21 - 65:21
    I know there was a weekend.
    What's this operator called?
  • 65:31 - 65:31
    Not subtraction but,
    I think I heard it,
  • 65:34 - 65:34
    oh man.
    All right, I'll tell you.
  • 65:37 - 65:37
    It's a difference constraint,
    all right?
  • 65:40 - 65:40
    This is the difference
    operator.
  • 65:42 - 65:42
    OK, it's our good friend
    difference constraints.
  • 65:46 - 65:46
    So, this is what we want to
    satisfy.
  • 65:48 - 65:48
    We have a system of difference
    constraints.
  • 65:52 - 65:52
    h of v minus h of u should be,
    we want to find these.
  • 65:56 - 65:56
    These are our unknowns.
    Subject to these constraints,
  • 66:00 - 66:00
    we are given the w's.
    Now, we know when these
  • 66:06 - 66:06
    difference constraints are
    satisfiable.
  • 66:11 - 66:11
    Can someone tell me when these
    constraints are satisfiable?
  • 66:19 - 66:19
    We know exactly when for any
    set of difference constraints.
  • 66:27 - 66:27
    You've got to remember the
    math.
  • 66:32 - 66:32
    Terminology,
    I can understand.
  • 66:38 - 66:38
    It's hard to remember words
    unless you're a linguist,
  • 66:48 - 66:48
    perhaps.
    So, when is the system of
  • 66:54 - 66:54
    difference constraints
    satisfiable?
  • 67:02 - 67:02
    All right, you should
    definitely, very good.
  • 67:08 - 67:08
    [LAUGHTER] Yes,
    very good.
  • 67:12 - 67:12
    Someone brought their lecture
    notes: when the constraint graph
  • 67:21 - 67:21
    has no negative weight cycles.
    Good, thank you.
  • 67:28 - 67:28
    Now, what is the constraint
    graph?
  • 67:34 - 67:34
    OK, this has a one letter
    answer more or less.
  • 67:38 - 67:38
    I'll accept the one letter
    answer.
  • 67:40 - 67:40
    What?
    A?
  • 67:41 - 67:41
    A: close.
    G.
  • 67:42 - 67:42
    Yeah, I mean,
    same thing.
  • 67:44 - 67:44
    Yeah, so the constraint graph
    is essentially G.
  • 67:48 - 67:48
    Actually, it is G.
    The constraint graph is G,
  • 67:51 - 67:51
    good.
    And, we prove this by adding a
  • 67:54 - 67:54
    new source for text,
    and connecting that to
  • 67:58 - 67:58
    everyone.
    But that's sort of beside the
  • 68:02 - 68:02
    point.
    That was in order to actually
  • 68:04 - 68:04
    satisfy them.
    But this is our
  • 68:06 - 68:06
    characterization.
    So, if we assume that there are
  • 68:09 - 68:09
    no negative weight cycles in our
    graph, which we've been doing
  • 68:12 - 68:12
    all the time,
    then we know that this thing is
  • 68:15 - 68:15
    satisfiable.
    Therefore, there is an
  • 68:17 - 68:17
    assignment of these h's.
    There is a re-weighting that
  • 68:20 - 68:20
    makes all the weights
    nonnegative.
  • 68:22 - 68:22
    Then we can run Dijkstra.
    OK, we're done.
  • 68:25 - 68:25
    Isn't that cool?
    And how do we satisfy these
  • 68:27 - 68:27
    constraints?
    We know how to do that with one
  • 68:30 - 68:30
    run of Bellman-Ford,
    which costs order VE,
  • 68:32 - 68:32
    which is less than V times
    Dijkstra.
  • 68:36 - 68:36
    So, that's it,
    write down the details
  • 68:40 - 68:40
    somewhere.
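    The detail, recalling last lecture: add a new source s with a
    zero-weight edge to every vertex, run Bellman-Ford from s, and
    set h(v) = delta(s, v). The triangle inequality for shortest
    paths then gives

        \[ h(v) \;=\; \delta(s, v) \;\le\; \delta(s, u) + w(u, v)
           \;=\; h(u) + w(u, v), \]

    which is exactly the constraint h(v) - h(u) <= w(u, v) for every
    edge (u, v).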
  • 69:00 - 69:00
    So, this is Johnson's
    algorithm.
  • 69:04 - 69:04
    This is the fanciest of them
    all.
  • 69:08 - 69:08
    It will be our fastest,
    all pairs shortest path
  • 69:14 - 69:14
    algorithm.
    So, the claim is,
  • 69:17 - 69:17
    we can find a function,
    h, from V to R such that the
  • 69:24 - 69:24
    modified weight of every edge is
    nonnegative for every edge,
  • 69:31 - 69:31
    (u,v), in our graph.
    And, we do that using
  • 69:37 - 69:37
    Bellman-Ford to solve the
    difference constraints.
  • 69:57 - 69:57
    These are exactly the difference
    constraints that we were born to
  • 70:01 - 70:01
    solve that we learned to solve
    last time.
  • 70:04 - 70:04
    The graphs here are
    corresponding exactly if you
  • 70:07 - 70:07
    look back at the definition.
    Or, Bellman-Ford will tell us
  • 70:10 - 70:10
    that there is a negative weight
    cycle.
  • 70:13 - 70:13
    OK, great, so it's not that we
    really have to assume that there
  • 70:17 - 70:17
    is no negative weight cycle.
    We'll get to know either way.
  • 70:20 - 70:20
    And if you're fancy,
    you can actually figure out the
  • 70:23 - 70:23
    minus infinities from this.
    But, at this point,
  • 70:26 - 70:26
    I just want to think about the
    case where there is no negative
  • 70:30 - 70:30
    weight cycle.
    But if there is,
  • 70:34 - 70:34
    we can find out that it exists,
    and then just tell the user.
  • 70:40 - 70:40
    OK, then we'd stop.
    Otherwise, there is no negative
  • 70:45 - 70:45
    weight cycle.
    Therefore, there is an
  • 70:49 - 70:49
    assignment that gives us
    nonnegative edge weights.
  • 70:54 - 70:54
    So, we just use it.
    We use it to run Dijkstra.
  • 71:00 - 71:00
    So, step two is,
    oh, I should say the running
  • 71:03 - 71:03
    time of all this is V times E.
    So, we're just running
  • 71:06 - 71:06
    Bellman-Ford on exactly the
    input graph.
  • 71:08 - 71:08
    Plus, we add a source,
    if you recall,
  • 71:11 - 71:11
    to solve a set of difference
    constraints.
  • 71:13 - 71:13
    You add a source vertex,
    S, connected to everyone at
  • 71:16 - 71:16
    weight zero, run Bellman-Ford
    from there because we don't have
  • 71:20 - 71:20
    a source here.
    We just have a graph.
  • 71:22 - 71:22
    We want to know all pairs.
    So, this, you can use to find
  • 71:26 - 71:26
    whether there is a negative
    weight cycle anywhere.
  • 71:30 - 71:30
    Or, we get this magic
    assignment.
  • 71:33 - 71:33
    So now, w_h is nonnegative,
    so we can run Dijkstra on w_h.
  • 71:40 - 71:40
    We'll say, using w_h,
    so you compute w_h.
  • 71:44 - 71:44
    That takes linear time.
    And, we run Dijkstra for each
  • 71:49 - 71:49
    possible source.
    I'll write this out explicitly.
  • 71:54 - 71:54
    We've had this in our minds
    several times.
  • 72:00 - 72:00
    But, when we said n times
    Dijkstra over n times BFS,
  • 72:05 - 72:05
    here it is.
    We want to compute delta sub h
  • 72:10 - 72:10
    now, of (u,v) for all v,
    and we do this separately for
  • 72:15 - 72:15
    all u.
    And so, the running time here
  • 72:19 - 72:19
    is VE plus V^2 log V.
    This is just V times the
  • 72:24 - 72:24
    running time of Dijkstra,
    which is E plus V log V.
  • 72:30 - 72:30
    OK, it happens that this term
    is the same as this one,
  • 72:35 - 72:35
    which is nice,
    because that means step one
  • 72:39 - 72:39
    costs us nothing asymptotically.
    OK, and then,
  • 72:43 - 72:43
    last step is,
    well, now we know delta h.
  • 72:47 - 72:47
    We just need to compute delta.
    So, for each pair of vertices,
  • 72:53 - 72:53
    we'll call it (u,v),
    we just compute what the
  • 72:57 - 72:57
    original weights would be,
    so what delta (u,v) is.
  • 73:03 - 73:03
    And we can do that using this
    corollary.
  • 73:07 - 73:07
    It's just delta sub h of (u,v)
    minus h of u plus h of v.
  • 73:14 - 73:14
    I got the signs right.
    Yeah, so this takes V^2 time,
  • 73:20 - 73:20
    also dwarfed by the running
    time of Dijkstra.
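    Putting steps one through three together, here is a compact
    Python sketch of the whole algorithm. It is only illustrative:
    the dict-of-adjacency-lists representation and names like
    johnson and dijkstra are my own assumptions, and every vertex is
    assumed to appear as a key of the graph dict, possibly with an
    empty edge list.

        import heapq

        def johnson(graph):
            # graph: dict mapping each vertex to a list of (neighbor, weight) pairs.
            # Returns dist[u][v] = delta(u, v), or None if a negative-weight
            # cycle is detected.
            vertices = list(graph)

            # Step 1: Bellman-Ford from a virtual source s that has a
            # zero-weight edge to every vertex; starting every h-value at 0
            # has the same effect.  Afterwards h(v) = delta(s, v).
            h = {v: 0 for v in vertices}
            for _ in range(len(vertices)):
                updated = False
                for u in vertices:
                    for v, w in graph[u]:
                        if h[u] + w < h[v]:
                            h[v] = h[u] + w
                            updated = True
                if not updated:
                    break
            else:
                return None     # still relaxing after |V| passes: negative cycle

            # Step 2: Dijkstra from each source on the re-weighted graph,
            # where w_h(u, v) = w(u, v) + h(u) - h(v) >= 0 by step 1.
            def dijkstra(src):
                dist_h = {v: float('inf') for v in vertices}
                dist_h[src] = 0
                pq = [(0, src)]
                while pq:
                    d, u = heapq.heappop(pq)
                    if d > dist_h[u]:
                        continue        # stale heap entry
                    for v, w in graph[u]:
                        w_h = w + h[u] - h[v]
                        if d + w_h < dist_h[v]:
                            dist_h[v] = d + w_h
                            heapq.heappush(pq, (dist_h[v], v))
                return dist_h

            # Step 3: undo the re-weighting:
            # delta(u, v) = delta_h(u, v) - h(u) + h(v).
            dist = {}
            for u in vertices:
                dist_h = dijkstra(u)
                dist[u] = {v: (dist_h[v] - h[u] + h[v])
                           if dist_h[v] != float('inf') else float('inf')
                           for v in vertices}
            return dist

    For example, johnson({'a': [('b', -2)], 'b': [('c', 3)], 'c': []})
    returns distances with dist['a']['b'] = -2 and dist['a']['c'] = 1.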
  • 73:25 - 73:25
    So, the overall running time of
    Johnson's algorithm is just the
  • 73:32 - 73:32
    running time of step two,
    running Dijkstra n times --
  • 73:51 - 73:51
    -- which is pretty cool.
    When it comes to single source
  • 73:55 - 73:55
    shortest paths,
    Bellman-Ford is the best thing
  • 73:58 - 73:58
    for general weights.
    Dijkstra is the best thing for
  • 74:02 - 74:02
    nonnegative weights.
    But for all pair shortest
  • 74:05 - 74:05
    paths, we can skirt the whole
    negative weight issue by using
  • 74:09 - 74:09
    this magic we saw from
    Bellman-Ford.
  • 74:11 - 74:11
    But now, running Dijkstra n
    times, which is still the best
  • 74:15 - 74:15
    thing we know how to do,
    pretty much,
  • 74:17 - 74:17
    for the all pairs nonnegative
    weights, now we can do it for
  • 74:21 - 74:21
    general weights too,
    which is a pretty nice
  • 74:24 - 74:24
    combination of all the
    techniques we've seen.
  • 74:28 - 74:28
    In the trilogy,
    and along the way,
  • 74:30 - 74:30
    we saw lots of dynamic
    programming, which is always
  • 74:34 - 74:34
    good practice.
    Any questions?
  • 74:35 - 74:35
    This is the last new content
    lecture before the quiz.
  • 74:39 - 74:39
    On Wednesday it will be quiz
    review, if I recall correctly.
  • 74:43 - 74:43
    And then it's Thanksgiving,
    so there's no recitation.
  • 74:46 - 74:46
    And then the quiz starts on
    Monday.
  • 74:49 - 74:49
    So, study up.
    See you then.
Description:

Lecture 19: Shortest Paths III: All-pairs Shortest Paths, Matrix Multiplication, Floyd-Warshall, Johnson

View the complete course at: http://ocw.mit.edu/6-046JF05

License: Creative Commons BY-NC-SA

More information at http://ocw.mit.edu/terms

More courses at http://ocw.mit.edu
