
Lec 18 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005

  • 0:07 - 0:07
    Good morning,
    everyone.
  • 0:10 - 0:10
    Glad you are all here bright
    and early.
  • 0:14 - 0:14
    I'm counting the days till the
    TA's outnumber the students.
  • 0:21 - 0:21
    They'll show up.
    We return to a familiar story.
  • 0:26 - 0:26
    This is part two,
    the Empire Strikes Back.
  • 0:32 - 0:32
    So last time,
    our adversary,
  • 0:34 - 0:34
    the graph, came to us with a
    problem.
  • 0:37 - 0:37
    We have a source,
    and we had a directed graph,
  • 0:40 - 0:40
    and we had weights on the
    edges, and they were all
  • 0:44 - 0:44
    nonnegative.
    And there was happiness.
  • 0:46 - 0:46
    And we triumphed over the
    Empire by designing Dijkstra's
  • 0:50 - 0:50
    algorithm, and very efficiently
    finding single source shortest
  • 0:55 - 0:55
    paths, shortest path weight from
    s to every other vertex.
  • 1:00 - 1:00
    Today, however,
    the Death Star has a new trick
  • 1:03 - 1:03
    up its sleeve,
    and we have negative weights,
  • 1:06 - 1:06
    potentially.
    And we're going to have to
  • 1:08 - 1:08
    somehow deal with,
    in particular,
  • 1:10 - 1:10
    negative weight cycles.
    And we saw that when we have a
  • 1:13 - 1:13
    negative weight cycle,
    we can just keep going around,
  • 1:16 - 1:16
    and around, and around,
    and go back in time farther,
  • 1:20 - 1:20
    and farther,
    and farther.
  • 1:21 - 1:21
    And we can get to be
    arbitrarily far back in the
  • 1:24 - 1:24
    past.
    And so there's no shortest
  • 1:26 - 1:26
    path, because whatever path you
    take you can get a shorter one.
  • 1:30 - 1:30
    So we want to address that
    issue today, and we're going to
  • 1:33 - 1:33
    come up with a new algorithm
    actually simpler than Dijkstra,
  • 1:37 - 1:37
    but not as fast,
    called the Bellman-Ford
  • 1:40 - 1:40
    algorithm.
    And, it's going to allow
  • 1:45 - 1:45
    negative weights,
    and in some sense allow
  • 1:49 - 1:49
    negative weight cycles,
    although maybe not as much as
  • 1:55 - 1:55
    you might hope.
    We have to leave room for a
  • 1:59 - 1:59
    sequel, of course.
    OK, so the Bellman-Ford
  • 2:04 - 2:04
    algorithm, invented by two guys,
    as you might expect,
  • 2:10 - 2:10
    it computes the shortest path
    weights.
  • 2:13 - 2:13
    So, it makes no assumption
    about the weights.
  • 2:18 - 2:18
    Weights are arbitrary,
    and it's going to compute the
  • 2:23 - 2:23
    shortest path weights.
    So, remember this notation:
  • 2:28 - 2:28
    delta of s, v is the weight of
    the shortest path from s to v.
  • 2:34 - 2:34
    s was called a source vertex.
    And, we want to compute these
  • 2:40 - 2:40
    weights for all vertices,
    little v.
  • 2:43 - 2:43
    The claim is that computing
    from s to everywhere is no
  • 2:47 - 2:47
    harder than computing s to a
    particular location.
  • 2:51 - 2:51
    So, we're going to do for all
    them.
  • 2:54 - 2:54
    It's still going to be the case
    here.
  • 2:57 - 2:57
    And, it allows negative
    weights.
  • 2:59 - 2:59
    And this is the good case,
    but there's an alternative,
  • 3:03 - 3:03
    which is that Bellman-Ford may
    just say, oops,
  • 3:07 - 3:07
    there's a negative weight
    cycle.
  • 3:11 - 3:11
    And in that case it will just
    say so.
  • 3:14 - 3:14
    So, it'll say a negative weight
    cycle exists.
  • 3:19 - 3:19
    Therefore, some of these deltas
    are minus infinity.
  • 3:23 - 3:23
    And that seems weird.
    So, Bellman-Ford as we'll
  • 3:28 - 3:28
    present it today is intended for
    the case where there are no
  • 3:33 - 3:33
    negative weight cycles,
    which is more intuitive.
  • 3:39 - 3:39
    It sort of allows them,
    but it will just report them.
  • 3:43 - 3:43
    In that case,
    it will not give you delta
  • 3:46 - 3:46
    values.
    You can change the algorithm to
  • 3:48 - 3:48
    give you delta values in that
    case, but we are not going to
  • 3:53 - 3:53
    see it here.
    So, as an exercise,
  • 3:55 - 3:55
    after you see the algorithm,
    the exercise is:
  • 3:58 - 3:58
    compute these deltas in all
    cases.
  • 4:12 - 4:12
    So, it's not hard to do.
    But we don't have time for it
  • 4:20 - 4:20
    here.
    So, here's the algorithm.
  • 4:24 - 4:24
    It's pretty straightforward.
    As I said, it's easier than
  • 4:33 - 4:33
    Dijkstra.
    It's a relaxation algorithm.
  • 4:37 - 4:37
    So the main thing that it does
    is relax edges just like
  • 4:41 - 4:41
    Dijkstra.
    So, we'll be able to use a lot
  • 4:44 - 4:44
    of lemmas from Dijkstra.
    And proof of correctness will
  • 4:47 - 4:47
    be three times shorter because
    the first two thirds we already
  • 4:52 - 4:52
    have from Dijkstra.
    But I'm jumping ahead a bit.
  • 4:55 - 4:55
    So, the first part is
    initialization.
  • 4:58 - 4:58
    Again, d of v will represent
    the estimated distance from s to
  • 5:02 - 5:02
    v.
    And we're going to be updating
  • 5:05 - 5:05
    those estimates as the algorithm
    goes along.
  • 5:08 - 5:08
    And initially,
    d of s is zero,
  • 5:10 - 5:10
    which now may not be the right
    answer conceivably.
  • 5:14 - 5:14
    Everyone else is infinity,
    which is certainly an upper
  • 5:18 - 5:18
    bound.
    OK, these are both upper bounds
  • 5:21 - 5:21
    on the true distance.
    So that's fine.
  • 5:23 - 5:23
    That's initialization just like
    before.
  • 5:36 - 5:36
    And now we have a main loop
    which happens v minus one times.
  • 5:39 - 5:39
    We're not actually going to use
    the index i.
  • 5:42 - 5:42
    It's just a counter.
  • 6:02 - 6:02
    And we're just going to look at
    every edge and relax it.
  • 6:08 - 6:08
    It's a very simple idea.
    If you learn about relaxation,
  • 6:13 - 6:13
    this is the first thing you
    might try.
  • 6:17 - 6:17
    The question is when do you
    stop.
  • 6:20 - 6:20
    It's sort of like this
    friend of mine: when he was six
  • 6:26 - 6:26
    years old he would claim,
    oh, I know how to spell banana.
  • 6:32 - 6:32
    I just don't know when to stop.
    OK, same thing with relaxation.
  • 6:38 - 6:38
    This is our relaxation step
    just as before.
  • 6:41 - 6:41
    We look at the edge;
    we see whether it violates the
  • 6:44 - 6:44
    triangle inequality according to
    our current estimates: we know
  • 6:48 - 6:48
    the distance from s to v should
    be at most the distance from s to u
  • 6:52 - 6:52
    plus the weight of that edge
    from u to v.
  • 6:54 - 6:54
    If it isn't,
    we set it equal.
  • 6:56 - 6:56
    We've proved that this is
    always an OK thing to do.
  • 7:00 - 7:00
    We never violate,
    I mean, these d of v's never
  • 7:04 - 7:04
    get too small if we do a bunch
    of relaxations.
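(A minimal sketch of that relaxation step in Python; the names relax, d, and w are choices made here, not the lecture's pseudocode. Conveniently, math.inf + 2 is still math.inf, matching the "infinity plus two is infinity" remark in the example later.)

```python
import math  # math.inf plays the role of the initial "infinity"

def relax(u, v, w, d):
    # If the edge (u, v) violates the triangle inequality according
    # to the current estimates, lower d[v] to fix it. This is always
    # safe: d[v] never drops below the true shortest path weight.
    if d[v] > d[u] + w[(u, v)]:
        d[v] = d[u] + w[(u, v)]
```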
  • 7:07 - 7:07
    So, the idea is you take every
    edge.
  • 7:10 - 7:10
    You relax it.
    I don't care which order.
  • 7:13 - 7:13
    Just relax every edge,
    one each.
  • 7:15 - 7:15
    And we do that V minus one
    times.
  • 7:18 - 7:18
    The claim is that that should
    be enough if you have no
  • 7:22 - 7:22
    negative weight cycles.
    So, if there's a negative
  • 7:26 - 7:26
    weight cycle,
    we need to figure it out.
  • 7:30 - 7:30
    And, we'll do that in a fairly
    straightforward way,
  • 7:35 - 7:35
    which is we're going to do
    exactly the same thing.
  • 7:40 - 7:40
    So this is outside the for loop
    here.
  • 7:44 - 7:44
    We'll have the same for loop
    for each edge in our graph.
  • 7:50 - 7:50
    We'll try to relax it.
    And if you can relax it,
  • 7:55 - 7:55
    the claim is that there has to
    be a negative weight cycle.
  • 8:02 - 8:02
    So this is the main thing that
    needs proof.
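(Putting the pieces together, here is a hedged Python sketch of the whole algorithm as just described; the function name, the graph representation as a vertex list plus an edge-weight dict, and the exception are choices made here.)

```python
import math

def bellman_ford(vertices, edges, w, s):
    # Initialization: d[s] = 0, everyone else infinity.
    d = {v: math.inf for v in vertices}
    d[s] = 0
    # Main loop: |V| - 1 rounds; each round relaxes every edge once.
    for _ in range(len(vertices) - 1):
        for (u, v) in edges:
            if d[v] > d[u] + w[(u, v)]:
                d[v] = d[u] + w[(u, v)]
    # One extra pass: if any edge can still be relaxed,
    # report a negative weight cycle.
    for (u, v) in edges:
        if d[v] > d[u] + w[(u, v)]:
            raise ValueError("negative weight cycle")
    return d  # d[v] = delta(s, v) for every vertex v
```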
  • 8:28 - 8:28
    OK, and that's the algorithm.
    So the claim is that at the
  • 8:32 - 8:32
    end we should have these d of v
    values:
  • 8:35 - 8:35
    d of v equals delta of s comma
    v for every vertex,
  • 8:39 - 8:39
    v.
    If we don't find a negative
  • 8:40 - 8:40
    weight cycle according to this
    rule, that we should have all
  • 8:44 - 8:44
    the shortest path weights.
    That's the claim.
  • 8:47 - 8:47
    Now, the first question is,
    in here, the running time is
  • 8:51 - 8:51
    very easy to analyze.
    So let's start with the running
  • 8:54 - 8:54
    time.
    We can compare it to Dijkstra,
  • 8:57 - 8:57
    which is over here.
    What is the running time of
  • 9:02 - 9:02
    this algorithm?
    V times E, exactly.
  • 9:06 - 9:06
    OK, I'm going to assume,
    because it's pretty reasonable,
  • 9:13 - 9:13
    that V and E are both positive.
    Then it's V times E.
  • 9:19 - 9:19
    So, this is a little bit
    slower, or a fair amount slower,
  • 9:26 - 9:26
    than Dijkstra's algorithm.
    There it is:
  • 9:31 - 9:31
    E plus V log V is essentially,
    ignoring the logs, pretty
  • 9:36 - 9:36
    much linear time.
    Here we have something that's
  • 9:39 - 9:39
    at least quadratic in V,
    assuming your graph is
  • 9:43 - 9:43
    connected.
    So, it's slower,
  • 9:45 - 9:45
    but it's going to handle these
    negative weights.
  • 9:49 - 9:49
    Dijkstra can't handle negative
    weights at all.
  • 9:53 - 9:53
    So, let's do an example,
    make it clear why you might
  • 9:57 - 9:57
    hope this algorithm works.
    And then we'll prove that it
  • 10:03 - 10:03
    works, of course.
    But the proof will be pretty
  • 10:09 - 10:09
    easy.
    So, I'm going to draw a graph
  • 10:13 - 10:13
    that has negative weights,
    but no negative weight cycles
  • 10:19 - 10:19
    so that I get an interesting
    answer.
  • 10:55 - 10:55
    Good.
    The other thing I need in order
  • 10:57 - 10:57
    to make the output of this
    algorithm well defined,
  • 11:01 - 11:01
    it depends in which order you
    visit the edges.
  • 11:04 - 11:04
    So I'm going to assign an
    arbitrary order to these edges.
  • 11:07 - 11:07
    I could just ask you for an
    order, but to be consistent with
  • 11:11 - 11:11
    the notes, I'll put an ordering
    on it.
  • 11:14 - 11:14
    Let's say I put number four,
    say that's the fourth edge I'll
  • 11:17 - 11:17
    visit.
    It doesn't matter.
  • 11:19 - 11:19
    But it will affect what happens
    during the algorithm for a
  • 11:23 - 11:23
    particular graph.
  • 11:43 - 11:43
    Did I get them all?
    One, two, three,
  • 11:46 - 11:46
    four, five, six,
    seven, eight,
  • 11:49 - 11:49
    OK.
    And my source is going to be A.
  • 11:51 - 11:51
    And, that's it.
    So, I want to run this
  • 11:55 - 11:55
    algorithm.
    I'm just going to initialize
  • 11:58 - 11:58
    everything.
    So, I set the estimates for s
  • 12:01 - 12:01
    to be zero, and everyone else to
    be infinity.
  • 12:06 - 12:06
    And to give me some notion of
    time, over here I'm going to
  • 12:11 - 12:11
    draw or write down what all of
    these d values are as the
  • 12:16 - 12:16
    algorithm proceeds because I'm
    going to start crossing them out
  • 12:21 - 12:21
    and rewriting them that the
    figure will get a little bit
  • 12:26 - 12:26
    messier.
    But we can keep track of it
  • 12:29 - 12:29
    over here.
    It's initially zero and
  • 12:32 - 12:32
    infinities.
    Yeah?
  • 12:34 - 12:34
    It doesn't matter.
    So, for the algorithm you can
  • 12:37 - 12:37
    go to the edges in a different
    order every time if you want.
  • 12:40 - 12:40
    We'll prove that,
    but here I'm going to go
  • 12:43 - 12:43
    through the same order every
    time.
  • 12:45 - 12:45
    Good question.
    It turns out it doesn't matter
  • 12:47 - 12:47
    here.
    OK, so here's the starting
  • 12:49 - 12:49
    point.
    Now I'm going to relax every
  • 12:51 - 12:51
    edge.
    So, there's going to be a lot
  • 12:53 - 12:53
    of edges here that don't do
    anything.
  • 12:55 - 12:55
    I try to relax edge number one.
    I'd say, well,
  • 12:58 - 12:58
    I know how to get from s to B
    with weight infinity.
  • 13:02 - 13:02
    With infinity plus two I can get
    from s to E.
  • 13:05 - 13:05
    Well, infinity plus two is not
    much better than infinity.
  • 13:08 - 13:08
    OK, so I don't do anything,
    don't update this to infinity.
  • 13:12 - 13:12
    I mean, infinity plus two
    sounds even worse.
  • 13:15 - 13:15
    But infinity plus two is
    infinity.
  • 13:17 - 13:17
    OK, that's the edge number one.
    So, no relaxation. Edge number
  • 13:21 - 13:21
    two, same deal; number three,
    same deal. Edge number four, we
  • 13:24 - 13:24
    start to get something
    interesting because I have a
  • 13:28 - 13:28
    finite value here that says I
    can get from A to B using a
  • 13:31 - 13:31
    total weight of minus one.
    So that seems good.
  • 13:36 - 13:36
    I'll write down minus one here,
    and update B to minus one.
  • 13:41 - 13:41
    The rest stay the same.
    So, I'm just going to keep
  • 13:46 - 13:46
    doing this over and over.
    That was edge number four.
  • 13:50 - 13:50
    Number five,
    we also get a relaxation.
  • 13:54 - 13:54
    Four is better than infinity.
    So, C gets a value of four.
  • 14:00 - 14:00
    Then we get to edge number six.
    That's infinity plus five is
  • 14:04 - 14:04
    worse than four.
    OK, so no relaxation there.
  • 14:08 - 14:08
    Edge number seven is
    interesting because I have a
  • 14:11 - 14:11
    finite value here minus one plus
    the weight of this edge,
  • 14:15 - 14:15
    which is three.
    That's a total of two,
  • 14:18 - 14:18
    which is actually better than
    four.
  • 14:21 - 14:21
    So, this route,
    A, B, C is actually better than
  • 14:24 - 14:24
    the route I just found a second
    ago.
  • 14:27 - 14:27
    So, this is now a two.
    This is all happening in one
  • 14:31 - 14:31
    iteration of the main loop.
    We actually found two good
  • 14:36 - 14:36
    paths to c.
    We found one better than the
  • 14:39 - 14:39
    other.
    OK, and that was edge number
  • 14:41 - 14:41
    seven, and edge number eight is
    over here.
  • 14:44 - 14:44
    It doesn't matter.
    OK, so that was round one of
  • 14:48 - 14:48
    this outer loop,
    so, the first value of i.
  • 14:51 - 14:51
    i equals one.
    OK, now we continue.
  • 14:53 - 14:53
    Just keep going.
    So, we start with edge number
  • 14:56 - 14:56
    one.
    Now, minus one plus two is one.
  • 15:00 - 15:00
    That's better than infinity.
    It'll start speeding up.
  • 15:05 - 15:05
    It's repetitive.
    It's actually not too much
  • 15:09 - 15:09
    longer until we're done.
    Number two, this is an infinity
  • 15:14 - 15:14
    so we don't do anything.
    Number three:
  • 15:18 - 15:18
    minus one plus two is one;
    better than infinity.
  • 15:22 - 15:22
    This is vertex D,
    and it's number three.
  • 15:26 - 15:26
    Number four we've already done.
    Nothing changed.
  • 15:31 - 15:31
    Number five:
    this is where we see the path
  • 15:35 - 15:35
    four again, but that's worse
    than two.
  • 15:38 - 15:38
    So, we don't update anything.
    Number six: one plus five is
  • 15:43 - 15:43
    six, which is bigger than two,
    so no good.
  • 15:47 - 15:47
    Go around this way.
    Number seven:
  • 15:50 - 15:50
    same deal.
    Number eight is interesting.
  • 15:53 - 15:53
    So, we have a weight of one
    here, a weight of minus three
  • 15:58 - 15:58
    here.
    So, the total is minus two,
  • 16:03 - 16:03
    which is better than one.
    So, that was D.
  • 16:07 - 16:07
    And, I believe that's it.
    So that was definitely the end
  • 16:13 - 16:13
    of that round.
    So, that's i equals two because we
  • 16:18 - 16:18
    just looked at the eighth edge.
    And, I'll cheat and check.
  • 16:24 - 16:24
    Indeed, that is the last thing
    that happens.
  • 16:30 - 16:30
    We can check the couple of
    outgoing edges from d because
  • 16:34 - 16:34
    that's the only one whose value
    just changed.
  • 16:37 - 16:37
    And, there are no more
    relaxations possible.
  • 16:40 - 16:40
    So, that was in two rounds.
    The claim is we got all the
  • 16:44 - 16:44
    shortest path weights.
    The algorithm would actually
  • 16:47 - 16:47
    loop four times to guarantee
    correctness because we have five
  • 16:51 - 16:51
    vertices here and one less than
    that.
  • 16:54 - 16:54
    So, in fact,
    in the execution here there are
  • 16:57 - 16:57
    two more blank rounds at the
    bottom.
  • 16:59 - 16:59
    Nothing happens.
    But, what the hell?
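(The captions don't spell out the board, so the following edge list is a reconstruction of the example from the walkthrough above; treat the exact edges, weights, and ordering as assumptions. Running the bellman_ford sketch on it does reproduce the traced values: A=0, B=-1, C=2, D=-2, E=1.)

```python
vertices = ["A", "B", "C", "D", "E"]
# Edges in the (assumed) relaxation order 1 through 8.
edges = [("B", "E"), ("D", "B"), ("B", "D"), ("A", "B"),
         ("A", "C"), ("D", "C"), ("B", "C"), ("E", "D")]
w = {("B", "E"): 2, ("D", "B"): 1, ("B", "D"): 2, ("A", "B"): -1,
    ("A", "C"): 4, ("D", "C"): 5, ("B", "C"): 3, ("E", "D"): -3}

print(bellman_ford(vertices, edges, w, "A"))
# -> {'A': 0, 'B': -1, 'C': 2, 'D': -2, 'E': 1}
```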
  • 17:03 - 17:03
    OK, so that is Bellman-Ford.
    I mean, it's certainly not
  • 17:06 - 17:06
    doing anything wrong.
    The question is,
  • 17:08 - 17:08
    why is it guaranteed to
    converge in V minus one steps
  • 17:12 - 17:12
    unless there is a negative
    weight cycle?
  • 17:14 - 17:14
    Question?
  • 17:24 - 17:24
    Right, so that's an
    optimization.
  • 17:26 - 17:26
    If you discover a whole round
    in which nothing happens,
  • 17:29 - 17:29
    and you keep track of that
    in the algorithm,
  • 17:32 - 17:32
    you can stop.
    In the worst case,
  • 17:33 - 17:33
    it won't make a difference.
    But in practice,
  • 17:36 - 17:36
    you probably want to do that.
    Yeah?
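(A sketch of that optimization, assuming the representation from the earlier sketch; the changed flag is a name chosen here, and the negative-cycle check is omitted for brevity.)

```python
import math

def bellman_ford_early_stop(vertices, edges, w, s):
    d = {v: math.inf for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        changed = False  # did any relaxation fire this round?
        for (u, v) in edges:
            if d[v] > d[u] + w[(u, v)]:
                d[v] = d[u] + w[(u, v)]
                changed = True
        if not changed:
            break  # a blank round; all later rounds are blank too
    return d
```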
  • 17:38 - 17:38
    Good question.
    All right, so some simple
  • 17:40 - 17:40
    observations,
    I mean, we're only doing
  • 17:42 - 17:42
    relaxation.
    So, we can use a lot of our
  • 17:44 - 17:44
    analysis from before.
    In particular,
  • 17:46 - 17:46
    the d values are only
    decreasing monotonically.
  • 17:49 - 17:49
    As we cross out values here,
    we are always making it
  • 17:52 - 17:52
    smaller, which is good.
    Another nifty thing about this
  • 17:55 - 17:55
    algorithm is that you can run it
    even in a distributed system.
  • 18:00 - 18:00
    If this is some actual network,
    some computer network,
  • 18:03 - 18:03
    and these are machines,
    and they're communicating by
  • 18:05 - 18:05
    these links, I mean,
    it's a purely local thing.
  • 18:07 - 18:07
    Relaxation is a local thing.
    You don't need any global
  • 18:10 - 18:10
    strategy, and you're asking
    about, can we do a different
  • 18:13 - 18:13
    order in each step?
    Well, yeah, you could just keep
  • 18:15 - 18:15
    relaxing edges,
    and keep relaxing edges,
  • 18:17 - 18:17
    and just keep going for the
    entire lifetime of the network.
  • 18:20 - 18:20
    And eventually,
    you will find shortest paths.
  • 18:22 - 18:22
    So, this algorithm is
    guaranteed to finish in V rounds
  • 18:24 - 18:24
    in a distributed system.
    It might be more asynchronous.
  • 18:27 - 18:27
    And, it's a little harder to
    analyze.
  • 18:30 - 18:30
    But it will still work
    eventually.
  • 18:34 - 18:34
    It's guaranteed to converge.
    And so, Bellman-Ford is used in
  • 18:42 - 18:42
    the Internet for finding
    shortest paths.
  • 18:46 - 18:46
    OK, so let's finally prove that
    it works.
  • 18:51 - 18:51
    This should only take a couple
    of boards.
  • 18:56 - 18:56
    So let's suppose we have a
    graph and some edge weights that
  • 19:04 - 19:04
    have no negative weight cycles.
    Then the claim is that we
  • 19:13 - 19:13
    terminate with the correct
    answer.
  • 19:19 - 19:19
    So, Bellman-Ford terminates
    with all of these d of v values
  • 19:30 - 19:30
    set to the delta values for
    every vertex.
  • 19:38 - 19:38
    OK, the proof is going to be
    pretty immediate using the
  • 19:42 - 19:42
    lemmas that we had from before
    if you remember them.
  • 19:46 - 19:46
    So, we're just going to look at
    every vertex separately.
  • 19:50 - 19:50
    So, I'll call the vertex v.
    The claim is that this holds by
  • 19:54 - 19:54
    the end of the algorithm.
    So, remember what we need to
  • 19:59 - 19:59
    prove is that at some point,
    d of v equals delta of s comma
  • 20:03 - 20:03
    v because we know it decreases
    monotonically,
  • 20:06 - 20:06
    and we know that it never gets
    any smaller than the correct
  • 20:11 - 20:11
    value because relaxations are
    always safe.
  • 20:15 - 20:15
    So, we just need to show at
    some point this holds,
  • 20:25 - 20:25
    and that it will hold at the
    end.
  • 20:32 - 20:32
    So, by monotonicity of the d
    values, and by correctness part
  • 20:41 - 20:41
    one, which was that the d of v's
    are always greater than or equal
  • 20:52 - 20:52
    to the deltas,
    we only need to show that at
  • 20:59 - 20:59
    some point we have equality.
  • 21:18 - 21:18
    So that's our goal.
    So what we're going to do is
  • 21:22 - 21:22
    just look at v,
    and the shortest path to v,
  • 21:25 - 21:25
    and see what happens to the
    algorithm relative to that path.
  • 21:30 - 21:30
    So, I'm going to name the path.
    Let's call it p.
  • 21:35 - 21:35
    It starts at vertex v_0 and
    goes to v_1, v_2,
  • 21:40 - 21:40
    whatever, and ends at v_k.
    And, this is not just any
  • 21:46 - 21:46
    shortest path,
    but it's one that starts at s.
  • 21:51 - 21:51
    So, v_0 is s,
    and it ends at v.
  • 21:54 - 21:54
    So, I'm going to give a couple
    of names to s and v so I can
  • 22:01 - 22:01
    talk about the path more
    uniformly.
  • 22:05 - 22:05
    So, this is a shortest path
    from s to v.
  • 22:11 - 22:11
    Now, I also want it to be not
    just any shortest path from s to
  • 22:16 - 22:16
    v, but among all shortest paths
    from s to v I want it to be one
  • 22:20 - 22:20
    with the fewest possible edges.
  • 22:32 - 22:32
    OK, so shortest here means in
    terms of the total weight of the
  • 22:36 - 22:36
    path.
    Subject to being shortest in
  • 22:39 - 22:39
    weight, I wanted to also be
    shortest in the number of edges.
  • 22:43 - 22:43
    And, the reason I want that is
    to be able to conclude that p is
  • 22:47 - 22:47
    a simple path,
    meaning that it doesn't repeat
  • 22:50 - 22:50
    any vertices.
    Now, can anyone tell me why I
  • 22:53 - 22:53
    need to assume that the number
    of edges is the smallest
  • 22:57 - 22:57
    possible in order to guarantee
    that p is simple?
  • 23:01 - 23:01
    The claim is that not all
    shortest paths are necessarily
  • 23:04 - 23:04
    simple.
    Yeah?
  • 23:05 - 23:05
    Right, I can have a zero weight
    cycle, exactly.
  • 23:08 - 23:08
    So, we are hoping,
    I mean, in fact in the theorem
  • 23:11 - 23:11
    here, we're assuming that there
    are no negative weight cycles.
  • 23:15 - 23:15
    But there might be zero weight
    cycles still.
  • 23:17 - 23:17
    As a zero weight cycle,
    you can put that in the middle
  • 23:20 - 23:20
    of any shortest path to make it
    arbitrarily long,
  • 23:23 - 23:23
    repeat vertices over and over.
    That's going to be annoying.
  • 23:27 - 23:27
    What I want is that p is
    simple.
  • 23:30 - 23:30
    And, I can guarantee that
    essentially by shortcutting.
  • 23:34 - 23:34
    If ever I take a zero weight
    cycle, I throw it away.
  • 23:37 - 23:37
    And this is one mathematical
    way of doing that.
  • 23:40 - 23:40
    OK, now what else do we know
    about this shortest path?
  • 23:43 - 23:43
    Well, we know that subpaths of
    shortest paths are shortest
  • 23:47 - 23:47
    paths.
    That's optimal substructure.
  • 23:49 - 23:49
    So, we know what the shortest
    path from s to v_i is sort of
  • 23:53 - 23:53
    inductively.
    It's the shortest path,
  • 23:56 - 23:56
    I mean, it's the weight of that
    path, which is,
  • 23:59 - 23:59
    in particular,
    the shortest path weight from s to
  • 24:02 - 24:02
    v_(i-1) plus the weight of the
    last edge, v_(i-1) to v_i.
  • 24:07 - 24:07
    So, this is by optimal
    substructure as we proved last
  • 24:17 - 24:17
    time.
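(In symbols, the optimal-substructure fact being used is:

    delta(s, v_i) = delta(s, v_(i-1)) + w(v_(i-1), v_i).)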
    OK, and I think that's pretty
  • 24:24 - 24:24
    much the warm-up.
    So, I want to sort of do this
  • 24:31 - 24:31
    inductively in i,
    starting out with v_0,
  • 24:34 - 24:34
    and go up to v_k.
    So, the first question is,
  • 24:38 - 24:38
    what is d of v_0,
    which is s?
  • 24:40 - 24:40
    What is d of the source?
    Well, certainly at the
  • 24:44 - 24:44
    beginning of the algorithm,
    it's zero.
  • 24:47 - 24:47
    So, let's say it equals zero
    initially because that's what we
  • 24:52 - 24:52
    set it to.
    And it only goes down from
  • 24:55 - 24:55
    there.
    So, it's certainly,
  • 24:57 - 24:57
    at most, zero.
    The real question is,
  • 25:02 - 25:02
    what is delta of s comma v_0.
    What is the shortest path
  • 25:07 - 25:07
    weight from s to s?
    It has to be zero,
  • 25:10 - 25:10
    otherwise you have a negative
    weight cycle,
  • 25:14 - 25:14
    exactly.
    My favorite answer,
  • 25:16 - 25:16
    zero.
    So, if we had another path from
  • 25:19 - 25:19
    s to s, I mean,
    that is a cycle.
  • 25:22 - 25:22
    So, it's got to be zero.
    So, these are actually equal at
  • 25:27 - 25:27
    the beginning of the algorithm,
    which is great.
  • 25:32 - 25:32
    That means they will be for all
    time because we just argued up
  • 25:37 - 25:37
    here, only goes down,
    never can get too small.
  • 25:41 - 25:41
    So, we have d of v_0 set to the
    right thing.
  • 25:45 - 25:45
    Great: good for the base case
    of the induction.
  • 25:49 - 25:49
    Of course, what we really care
    about is v_k,
  • 25:53 - 25:53
    which is v.
    So, let's talk about the v_i
  • 25:57 - 25:57
    inductively, and then we will
    get v_k as a result.
  • 26:11 - 26:11
    So, yeah, let's do it by
    induction.
  • 26:14 - 26:14
    That's more fun.
  • 26:27 - 26:27
    Let's say that d of v_i is
    equal to delta of s, v_i after i
  • 26:33 - 26:33
    rounds of the algorithm.
    So, this is actually referring
  • 26:38 - 26:38
    to the i that is in the
    algorithm here.
  • 26:42 - 26:42
    These are rounds.
    So, one round is an entire
  • 26:47 - 26:47
    execution of all the edges,
    relaxation of all the edges.
  • 26:52 - 26:52
    So, this is certainly true for
    i equals zero.
  • 26:57 - 26:57
    We just proved that.
    After zero rounds,
  • 27:01 - 27:01
    at the beginning of the
    algorithm, d of v_0 equals delta
  • 27:06 - 27:06
    of s, v_0.
    OK, so now, that's not really
  • 27:11 - 27:11
    what I wanted,
    but OK, fine.
  • 27:14 - 27:14
    Now we'll prove it for d of v_i
    plus one.
  • 27:17 - 27:17
    Generally, I recommend you
    assume something.
  • 27:20 - 27:20
    In fact, why don't I follow my
    own advice and change it?
  • 27:25 - 27:25
    It's usually nicer to think of
    induction as recursion.
  • 27:29 - 27:29
    So, you assume that this is
    true, let's say,
  • 27:33 - 27:33
    for j less than the i that you
    care about, and then you prove
  • 27:37 - 27:37
    it for d of v_i.
    It's usually a lot easier to
  • 27:42 - 27:42
    think about it that way.
    In particular,
  • 27:45 - 27:45
    you can use strong induction
    for all less than i.
  • 27:48 - 27:48
    Here, we're only going to need
    it for one less.
  • 27:52 - 27:52
    We have some relation between i
    and i minus one here in terms of
  • 27:56 - 27:56
    the deltas.
    And so, we want to argue
  • 27:59 - 27:59
    something about the d values.
    OK, well, let's think about
  • 28:05 - 28:05
    what's going on here.
    We know that,
  • 28:09 - 28:09
    let's say, after i minus one
    rounds, we have this inductive
  • 28:16 - 28:16
    hypothesis, d of v_i minus one
    equals delta of s v_i minus one.
  • 28:23 - 28:23
    And, we want to conclude that
    after i rounds,
  • 28:28 - 28:28
    so we have one more round to do
    this.
  • 28:32 - 28:32
    We want to conclude that d of
    v_i has the right answer,
  • 28:38 - 28:38
    delta of s comma v_i.
    Does that look familiar at all?
  • 28:44 - 28:44
    So we want to relax every edge
    in this round.
  • 28:48 - 28:48
    In particular,
    at some point,
  • 28:50 - 28:50
    we have to relax the edge from
    v_i minus one to v_i.
  • 28:54 - 28:54
    We know that this path consists
    of edges.
  • 28:57 - 28:57
    That's the definition of a
    path.
  • 29:00 - 29:00
    So, during the i'th round,
    we relax every edge.
  • 29:10 - 29:10
    So, we better relax v_i minus
    one v_i.
  • 29:19 - 29:19
    And, what happens then?
    It's a test of memory.
  • 29:43 - 29:43
    Quick, the Death Star is
    approaching.
  • 29:47 - 29:47
    So, if we have the correct
    value for v_i minus one,
  • 29:52 - 29:52
    that we relax an outgoing edge
    from there, and that edge is an
  • 29:58 - 29:58
    edge of the shortest path from s
    to v_i.
  • 30:02 - 30:02
    What do we know?
    d of v_i becomes the correct
  • 30:07 - 30:07
    value, delta of s comma v_i.
    This was called correctness
  • 30:13 - 30:13
    lemma last time.
    One of the things we proved
  • 30:18 - 30:18
    about Dijkstra's algorithm,
    but it was really just a fact
  • 30:24 - 30:24
    about relaxation.
    And it was a pretty simple
  • 30:29 - 30:29
    proof.
    And it comes from this fact.
  • 30:33 - 30:33
    We know the shortest path
    weight is this.
  • 30:36 - 30:36
    So, certainly d of v_i was at
    least this big,
  • 30:38 - 30:38
    and let's suppose it's greater,
    or otherwise we were done.
  • 30:42 - 30:42
    We know d of v_i minus one is
    set to this.
  • 30:45 - 30:45
    And so, this is exactly the
    condition that's being checked
  • 30:49 - 30:49
    in the relaxation step.
    And, the d of v_i value will be
  • 30:52 - 30:52
    greater than this,
    let's suppose.
  • 30:54 - 30:54
    And then, we'll set it equal to
    this.
  • 30:57 - 30:57
    And that's exactly delta of s, v_i.
    So, when we relax that edge,
  • 31:02 - 31:02
    we've got to set it to the
    right value.
  • 31:04 - 31:04
    So, this is the end of the
    proof, right?
  • 31:07 - 31:07
    It's very simple.
    The point is,
  • 31:09 - 31:09
    you look at your shortest path.
    Here it is.
  • 31:12 - 31:12
    And if we assume there's no
    negative weight cycles,
  • 31:15 - 31:15
    this has the correct value
    initially.
  • 31:17 - 31:17
    d of s is going to be zero.
    After the first round,
  • 31:20 - 31:20
    you've got to relax this edge.
    And then you get the right
  • 31:24 - 31:24
    value for that vertex.
    After the second round,
  • 31:27 - 31:27
    you've got to relax this edge,
    which gets you the right d
  • 31:31 - 31:31
    value for this vertex and so on.
    And so, no matter which
  • 31:36 - 31:36
    shortest path you take,
    you can apply this analysis.
  • 31:41 - 31:41
    And you know that,
    if the length of this path,
  • 31:45 - 31:45
    here we assumed it was k edges,
    then after k rounds you've got
  • 31:50 - 31:50
    to be done.
    OK, so this was not actually
  • 31:54 - 31:54
    the end of the proof.
    Sorry.
  • 31:57 - 31:57
    So this means after k rounds,
    we have the right answer for
  • 32:03 - 32:03
    v_k, which is v.
    So, the only question is how
  • 32:08 - 32:08
    big could k be?
    And, it better be,
  • 32:13 - 32:13
    at most,
    V minus one; that's the claim by the
  • 32:18 - 32:18
    algorithm, that you only need to
    do V minus one steps.
  • 32:24 - 32:24
    And indeed, the number of edges
    in a simple path in a graph is,
  • 32:31 - 32:31
    at most, the number of vertices
    minus one.
  • 32:37 - 32:37
    k is, at most,
    v minus one because p is
  • 32:40 - 32:40
    simple.
    So, that's why we had to assume
  • 32:44 - 32:44
    that it wasn't just any shortest
    path.
  • 32:47 - 32:47
    It had to be a simple one so it
    didn't repeat any vertices.
  • 32:52 - 32:52
    So there are,
    at most, V vertices in the
  • 32:56 - 32:56
    path, so at most,
    V minus one edges in the path.
  • 33:01 - 33:01
    OK, and that's all there is to
    Bellman-Ford.
  • 33:06 - 33:06
    So: a pretty simple proof
    of correctness.
  • 33:09 - 33:09
    Of course, we're using a lot of
    the lemmas that we proved last
  • 33:15 - 33:15
    time, which makes it easier.
    OK, a consequence of this
  • 33:21 - 33:21
    theorem, or of this proof is
    that if Bellman-Ford fails to
  • 33:27 - 33:27
    converge, and that's what the
    algorithm is checking: whether
  • 33:34 - 33:34
    this relaxation still requires
    work after these V minus one
  • 33:40 - 33:40
    steps.
    Right, the end of this
  • 33:44 - 33:44
    algorithm is run another round,
    a V'th round,
  • 33:48 - 33:48
    see whether anything changes.
    So, we'll say that the
  • 33:53 - 33:53
    algorithm fails to converge
    after V minus one steps or
  • 33:59 - 33:59
    rounds.
    Then, there has to be a
  • 34:02 - 34:02
    negative weight cycle.
    OK, this is just a
  • 34:04 - 34:04
    contrapositive of what we
    proved.
  • 34:07 - 34:07
    We proved that if you assume
    there's no negative weight
  • 34:10 - 34:10
    cycle, then we know that d of s
    is zero, and then all this
  • 34:14 - 34:14
    argument says is you've got to
    converge after v minus one
  • 34:18 - 34:18
    rounds.
    There can't be anything left to
  • 34:21 - 34:21
    do once you've reached the
    shortest path weights because
  • 34:25 - 34:25
    you're going monotonically;
    you can never go past the bottom.
  • 34:30 - 34:30
    You can never go through the floor.
    So, if you fail to converge
  • 34:34 - 34:34
    somehow after V minus one
    rounds, you've got to have
  • 34:37 - 34:37
    violated the assumption.
    The only assumption we made was
  • 34:41 - 34:41
    there's no negative weight
    cycle.
  • 34:43 - 34:43
    So, this tells us that
    Bellman-Ford is actually
  • 34:46 - 34:46
    correct.
    When it says that there is a
  • 34:48 - 34:48
    negative weight cycle,
    it indeed means it.
  • 34:51 - 34:51
    It's true.
    OK, and you can modify
  • 34:53 - 34:53
    Bellman-Ford in that case to
    sort of run a little longer,
  • 34:57 - 34:57
    and find where all the minus
    infinities are.
  • 35:01 - 35:01
    And that is,
    in some sense,
  • 35:03 - 35:03
    one of the things you have to
    do in your problem set,
  • 35:06 - 35:06
    I believe.
    So, I won't cover it here.
  • 35:08 - 35:08
    But, it's a good exercise in
    any case to figure out how you
  • 35:12 - 35:12
    would find where the minus
    infinities are.
  • 35:15 - 35:15
    What are all the vertices
    reachable from a negative weight
  • 35:18 - 35:18
    cycle?
    Those are the ones that have
  • 35:20 - 35:20
    minus infinities.
    OK, so you might say,
  • 35:23 - 35:23
    well, that was awfully fast.
    Actually, it's not over yet.
  • 35:26 - 35:26
    The episode is not yet ended.
    We're going to use Bellman-Ford
  • 35:30 - 35:30
    to solve the even bigger and
    greater shortest path problems.
  • 35:35 - 35:35
    And in the remainder of today's
    lecture, we will see it applied
  • 35:39 - 35:39
    to a more general problem,
    in some sense,
  • 35:42 - 35:42
    called linear programming.
    And the next lecture,
  • 35:46 - 35:46
    we'll really use it to do some
    amazing stuff with all pairs
  • 35:50 - 35:50
    shortest paths.
    Let's go over here.
  • 35:52 - 35:52
    So, our goal,
    although it won't be obvious
  • 35:55 - 35:55
    today, is to be able to compute
    the shortest paths between every
  • 36:00 - 36:00
    pair of vertices,
    which we could certainly do at
  • 36:03 - 36:03
    this point just by running
    Bellman-Ford v times.
  • 36:08 - 36:08
    OK, but we want to do better
    than that, of course.
  • 36:15 - 36:15
    And, that will be the climax of
    the trilogy.
  • 36:22 - 36:22
    OK, today we just discovered
    who Luke's father is.
  • 36:30 - 36:30
    So, it turns out the father of
    shortest paths is linear
  • 36:37 - 36:37
    programming.
    Actually, simultaneously the
  • 36:43 - 36:43
    father and the mother because
    programs do not have gender.
  • 36:51 - 36:51
    OK, my father likes to say,
    we both took improv comedy
  • 36:58 - 36:58
    lessons so we have degrees in
    improvisation.
  • 37:05 - 37:05
    And he said,
    you know, we went to improv
  • 37:07 - 37:07
    classes in order to learn how to
    make our humor better.
  • 37:11 - 37:11
    And, the problem is,
    it didn't actually make our
  • 37:14 - 37:14
    humor better.
    It just made us less afraid to
  • 37:16 - 37:16
    use it.
    [LAUGHTER] So,
  • 37:17 - 37:17
    you are subjected to all this
    improv humor.
  • 37:20 - 37:20
    I didn't see the connection of
    Luke's father,
  • 37:23 - 37:23
    but there you go.
    OK, so, linear programming is a
  • 37:26 - 37:26
    very general problem,
    a very big tool.
  • 37:29 - 37:29
    Has anyone seen linear
    programming before?
  • 37:33 - 37:33
    OK, one person.
    And, I'm sure you will
  • 37:36 - 37:36
    at some time in your life:
    if you do anything vaguely computing or
  • 37:41 - 37:41
    optimization related,
    linear programming comes up at
  • 37:45 - 37:45
    some point.
    It's a very useful tool.
  • 37:49 - 37:49
    You're given a matrix and two
    vectors: not too exciting yet.
  • 37:54 - 37:54
    What you want to do is find a
    vector.
  • 37:57 - 37:57
    This is a very dry description.
    We'll see what makes it so
  • 38:03 - 38:03
    interesting in a moment.
  • 38:17 - 38:17
    So, you want to maximize some
    objective, and you have some
  • 38:21 - 38:21
    constraints.
    And they're all linear.
  • 38:24 - 38:24
    So, the objective is a linear
    function in the variables x,
  • 38:29 - 38:29
    and your constraints are a
    bunch of linear constraints,
  • 38:33 - 38:33
    inequality constraints,
    that's what makes it
  • 38:36 - 38:36
    interesting.
    It's not just solving a linear
  • 38:39 - 38:39
    system as you've seen in linear
    algebra, or whatever.
  • 38:43 - 38:43
    Or, of course,
    it could be that there is no
  • 38:47 - 38:47
    such x.
    OK: vaguely familiar, you might
  • 38:49 - 38:49
    think, to the theorem about
    Bellman-Ford.
  • 38:53 - 38:53
    And, we'll show that there's
    some kind of connection here
  • 38:57 - 38:57
    that either you want to find
    something, or show that it
  • 39:01 - 39:01
    doesn't exist.
    Well, that's still a pretty
  • 39:06 - 39:06
    vague connection,
    but I also want to maximize
  • 39:09 - 39:09
    something, or sort of
    minimize, as with shortest paths,
  • 39:13 - 39:13
    OK, somewhat similar.
    We have these constraints.
  • 39:17 - 39:17
    So, yeah.
    This may be intuitive to you,
  • 39:20 - 39:20
    I don't know.
    I prefer a more geometric
  • 39:23 - 39:23
    picture, and I will try to draw
    such a geometric picture,
  • 39:27 - 39:27
    and I've never tried to do this
    on a blackboard,
  • 39:31 - 39:31
    so it should be interesting.
    I think I'm going to fail
  • 39:36 - 39:36
    miserably.
    It sort of looks like a
  • 39:40 - 39:40
    dodecahedron,
    right?
  • 39:42 - 39:42
    Sort of, kind of,
    not really.
  • 39:44 - 39:44
    A bit rough on the bottom,
    OK.
  • 39:47 - 39:47
    So, if you have a bunch of
    linear constraints,
  • 39:52 - 39:52
    this is supposed to be in 3-D.
    Now I labeled it.
  • 39:57 - 39:57
    It's now in 3-D.
    Good.
  • 40:00 - 40:00
    So, you have these linear
    constraints.
  • 40:03 - 40:03
    That turns out to define
    hyperplanes in n dimensions.
  • 40:07 - 40:07
    OK, so you have this base here
    that's three-dimensional space.
  • 40:11 - 40:11
    So, n equals three.
    And, these hyperplanes,
  • 40:14 - 40:14
    if you're looking at one side
    of the hyperplane,
  • 40:18 - 40:18
    that's the less than or equal
    to, if you take the
  • 40:21 - 40:21
    intersection,
    you get some convex polytope or
  • 40:24 - 40:24
    polyhedron.
    In 3-D, you might get a
  • 40:27 - 40:27
    dodecahedron or whatever.
    And, your goal,
  • 40:30 - 40:30
    you have some objective vector
    c, let's say,
  • 40:33 - 40:33
    up.
    Suppose that's the c vector.
  • 40:37 - 40:37
    Your goal is to find the
    highest point in this polytope.
  • 40:42 - 40:42
    So here, it's maybe this one.
    OK, this is the target.
  • 40:47 - 40:47
    This is the optimal,
    x.
  • 40:49 - 40:49
    That is the geometric view.
    If you prefer the algebraic
  • 40:54 - 40:54
    view, you want to maximize the c
    transpose times x.
  • 41:00 - 41:00
    So, this is m.
    This is n.
  • 41:02 - 41:02
    Check that the dimensions work
    out.
  • 41:05 - 41:05
    So that's saying you want to
    maximize the dot product.
  • 41:09 - 41:09
    You want to maximize the extent
    to which x is in the direction
  • 41:14 - 41:14
    c.
    And, you want to maximize that
  • 41:16 - 41:16
    subject to some constraints,
    which looks something like
  • 41:20 - 41:20
    this, maybe.
    So, this is A,
  • 41:23 - 41:23
    and it's m by n.
    You want to multiply it by,
  • 41:26 - 41:26
    it should be something of
    height n.
  • 41:30 - 41:30
    That's x.
    Let me put x down here,
  • 41:33 - 41:33
    n by one.
    And, it should be less than or
  • 41:36 - 41:36
    equal to something of this
    height, which is B,
  • 41:40 - 41:40
    the right hand side.
    OK, that's the algebraic view,
  • 41:44 - 41:44
    where you can check that all the
    dimensions work out.
  • 41:49 - 41:49
    But, you can read these off:
    each row here,
  • 41:53 - 41:53
    multiplied by this column,
    gives you one value here.
  • 41:57 - 41:57
    And as just a linear
    constraints on all the x sides.
  • 42:03 - 42:03
    So, you want to maximize this
    linear function of x_1 up to x_n
  • 42:09 - 42:09
    subject to these constraints,
    OK?
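(Collecting the algebraic view in one place, with the dimensions just described:

    maximize    c^T x        where c and x are n by 1,
    subject to  A x <= b     A is m by n, and b is m by 1.)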
  • 42:12 - 42:12
    Pretty simple,
    but pretty powerful in general.
  • 42:16 - 42:16
    So, it turns out that
    you can formulate a huge number
  • 42:22 - 42:22
    of problems such as shortest
    paths as a linear program.
  • 42:27 - 42:27
    So, it's a general tool.
    And in this class,
  • 42:32 - 42:32
    we will not cover any
    algorithms for solving linear
  • 42:37 - 42:37
    programming.
    It's a bit tricky.
  • 42:40 - 42:40
    I'll just mention that they are
    out there.
  • 42:45 - 42:45
    So, there's many efficient
    algorithms, and lots of code
  • 42:51 - 42:51
    that does this.
    It's a very practical setup.
  • 42:55 - 42:55
    So, lots of algorithms to solve
    LP's, linear programs.
  • 43:02 - 43:02
    Linear programming is usually
    called LP.
  • 43:06 - 43:06
    And, I'll mention a few of
    them.
  • 43:09 - 43:09
    There's the simplex algorithm.
    This is one of the first.
  • 43:14 - 43:14
    I think it is the first.
    Then there's the ellipsoid algorithm.
  • 43:19 - 43:19
    There's interior point methods,
    and there's random sampling.
  • 43:25 - 43:25
    I'll just say a little bit
    about each of these because
  • 43:30 - 43:30
    we're not going to talk about
    any of them in depth.
  • 43:36 - 43:36
    The simplex algorithm,
    this is, I mean,
  • 43:38 - 43:38
    one of the first algorithms in
    the world in some sense,
  • 43:42 - 43:42
    certainly one of the most
    popular.
  • 43:44 - 43:44
    It's still used today.
    Almost all linear programming
  • 43:47 - 43:47
    code uses the simplex algorithm.
    It happens to run in
  • 43:50 - 43:50
    exponential time in the
    worst-case, so it's actually
  • 43:54 - 43:54
    pretty bad theoretically.
    But in practice,
  • 43:56 - 43:56
    it works really well.
    And there is some recent work
  • 43:59 - 43:59
    that tries to understand this.
    It's still exponential in the
  • 44:03 - 44:03
    worst case.
    But, it's practical.
  • 44:06 - 44:06
    There's actually an open
    problem whether there exists a
  • 44:10 - 44:10
    variation of simplex that runs
    in polynomial time.
  • 44:14 - 44:14
    But, I won't go into that.
    That's a major open problem in
  • 44:18 - 44:18
    this area of linear programming.
    The ellipsoid algorithm was the
  • 44:22 - 44:22
    first algorithm to solve linear
    programming in polynomial time.
  • 44:26 - 44:26
    So, for a long time,
    people didn't know.
  • 44:30 - 44:30
    Around this time,
    people started realizing
  • 44:32 - 44:32
    polynomial time is a good thing.
    That happened around the late
  • 44:36 - 44:36
    60s.
    Polynomial time is good.
  • 44:38 - 44:38
    And, the ellipsoid algorithm is
    the first one to do it.
  • 44:41 - 44:41
    It's a very general algorithm,
    and very powerful,
  • 44:44 - 44:44
    theoretically:
    completely impractical.
  • 44:46 - 44:46
    But, it's cool.
    It lets you do things like you
  • 44:49 - 44:49
    can solve a linear program that
    has exponentially many
  • 44:52 - 44:52
    constraints in polynomial time.
    You've got all sorts of crazy
  • 44:56 - 44:56
    things.
    So, I'll just say it's
  • 44:58 - 44:58
    polynomial time.
    If I can't say something nice
  • 45:02 - 45:02
    about it, I won't say it at all.
    It's impractical.
  • 45:05 - 45:05
    Interior point methods are sort
    of the mixture.
  • 45:08 - 45:08
    They run in polynomial time.
    You can guarantee that.
  • 45:11 - 45:11
    And, they are also pretty
    practical, and there's sort of
  • 45:15 - 45:15
    this competition these days
    about whether simplex or
  • 45:18 - 45:18
    interior point is better.
    And, I don't know what it is
  • 45:22 - 45:22
    today but a few years ago they
    were neck and neck.
  • 45:25 - 45:25
    And, random sampling is a brand
    new approach.
  • 45:28 - 45:28
    This is just from a couple
    years ago by two MIT professors,
  • 45:32 - 45:32
    Dimitris Bertsimas and Santosh
    Vempala, I guess the other is in
  • 45:36 - 45:36
    applied math.
    So, just to show you,
  • 45:39 - 45:39
    there's active work in this
    area.
  • 45:41 - 45:41
    People are still finding new
    ways to solve linear programs.
  • 45:45 - 45:45
    This is completely randomized,
    and very simple,
  • 45:47 - 45:47
    and very general.
    It hasn't been implemented,
  • 45:50 - 45:50
    so we don't know how practical
    it is yet.
  • 45:52 - 45:52
    But, it has potential.
    OK: pretty neat.
  • 45:55 - 45:55
    OK, we're going to look at a
    somewhat simpler version of
  • 45:58 - 45:58
    linear programming.
    The first restriction we are
  • 46:02 - 46:02
    going to make is actually not
    much of a restriction.
  • 46:06 - 46:06
    But, nonetheless we will
    consider it, it's a little bit
  • 46:10 - 46:10
    easier to think about.
    So here, we had some polytope
  • 46:14 - 46:14
    we wanted to maximize some
    objective.
  • 46:16 - 46:16
    In a feasibility problem,
    I just want to know,
  • 46:20 - 46:20
    is the polytope empty?
    Can you find any point in that
  • 46:23 - 46:23
    polytope?
    Can you find any set of values,
  • 46:26 - 46:26
    x, that satisfy these
    constraints?
  • 46:30 - 46:30
    OK, so there's no objective
    c; just find x such that Ax is
  • 46:35 - 46:35
    less than or equal to B.
    OK, it turns out you can prove
  • 46:39 - 46:39
    a very general theorem that if
    you can solve linear
  • 46:43 - 46:43
    feasibility, you can also solve
    linear programming.
  • 46:48 - 46:48
    We won't prove that here,
    but this is actually no easier
  • 46:52 - 46:52
    than the original problem even
    though it feels easier,
  • 46:56 - 46:56
    and it's easier to think about.
    I was just saying actually no
  • 47:03 - 47:03
    easier than LP.
    OK, the next restriction we're
  • 47:08 - 47:08
    going to make is a real
    restriction.
  • 47:12 - 47:12
    And it simplifies the problem
    quite a bit.
  • 47:30 - 47:30
    And that's to look at difference
    constraints.
  • 47:35 - 47:35
    And, if all this seemed a bit
    abstract so far,
  • 47:41 - 47:41
    we will now ground ourselves
    a little bit.
  • 47:46 - 47:46
    A system of difference
    constraints is a linear
  • 47:51 - 47:51
    feasibility problem.
    So, it's an LP where there's no
  • 47:58 - 47:58
    objective.
    And, it's with a restriction,
  • 48:06 - 48:06
    so, where each row of the
    matrix
  • 48:17 - 48:17
    A has a single one,
    and a single minus one,
  • 48:26 - 48:26
    and everything else in the row
    is zero.
  • 48:36 - 48:36
    OK, in other words,
    each constraint has its very
  • 48:41 - 48:41
    simple form.
    It involves two variables and
  • 48:45 - 48:45
    some number.
    So, we have something like x_j
  • 48:50 - 48:50
    minus x_i is less than or equal
    to w_ij.
  • 48:54 - 48:54
    So, this is just a number.
    These are two variables.
  • 49:00 - 49:00
    There's a minus sign,
    no values up here,
  • 49:03 - 49:03
    no coefficients,
    none of the other x_k's appear,
  • 49:06 - 49:06
    just two of them.
    And, you have a bunch of
  • 49:09 - 49:09
    constraints of this form,
    one per row of the matrix.
  • 49:13 - 49:13
    Geometrically,
    I haven't thought about what
  • 49:16 - 49:16
    this means.
    I think it means the
  • 49:18 - 49:18
    hyperplanes are pretty simple.
    Sorry I can't do better than
  • 49:23 - 49:23
    that.
    It's a little hard to see this
  • 49:25 - 49:25
    in high dimensions.
    But, it will start to
  • 49:31 - 49:31
    correspond to something we've
    seen, namely the board that it's
  • 49:39 - 49:39
    next to, very shortly.
    OK, so let's do a very quick
  • 49:45 - 49:45
    example mainly to have something
    to point at.
  • 49:51 - 49:51
    Here's a very simple system of
    difference constraints --
  • 50:11 - 50:11
    -- OK, and a solution.
    Why not?
  • 50:14 - 50:14
    It's not totally trivial to
    solve this, but here's a
  • 50:18 - 50:18
    solution.
    And the only thing to check is
  • 50:21 - 50:21
    that each of these constraints
    is satisfied.
  • 50:25 - 50:25
    x_1 minus x_2 is three,
    which is less than or equal to
  • 50:30 - 50:30
    three, and so on.
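(The board contents are elided in the captions; reconstructing from the constraint graph built below, the system is presumably

    x_1 - x_2 <= 3,    x_2 - x_3 <= -2,    x_1 - x_3 <= 2,

and one satisfying assignment, given here as an assumed example consistent with x_1 minus x_2 being exactly three, is x_1 = 3, x_2 = 0, x_3 = 2: then 3 <= 3, -2 <= -2, and 1 <= 2.)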
    There could be negative values.
  • 50:36 - 50:36
    There could be positive values.
    It doesn't matter.
  • 50:42 - 50:42
    I'd like to transform this
    system of difference constraints
  • 50:50 - 50:50
    into a graph because we know a
    lot about graphs.
  • 50:56 - 50:56
    So, we're going to call this
    the constraint graph.
  • 51:03 - 51:03
    And, it's going to represent
    these constraints.
  • 51:08 - 51:08
    How'd I do it?
    Well, I take every constraint,
  • 51:14 - 51:14
    which in general looks like
    this, and I convert it into an
  • 51:20 - 51:20
    edge.
    OK, so if I write it as x_j
  • 51:24 - 51:24
    minus x_i is less than or equal
    to some w_ij,
  • 51:29 - 51:29
    w seems suggestive of weights.
    That's exactly why I called it
  • 51:36 - 51:36
    w.
    I'm going to make that an edge
  • 51:39 - 51:39
    from v_i to v_j.
    So, the order flips a little
  • 51:42 - 51:42
    bit.
    And, the weight of that edge is
  • 51:45 - 51:45
    w_ij.
    So, just do that.
  • 51:46 - 51:46
    Make n vertices.
    So, you have the number of
  • 51:49 - 51:49
    vertices equals n.
    The number of edges equals the
  • 51:53 - 51:53
    number of constraints,
    which is m, the height of the
  • 51:57 - 51:57
    matrix, and just transform.
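(A hedged Python sketch of this transformation; the output matches the Bellman-Ford sketch from earlier, and the function name and triple format are choices made here.)

```python
def constraint_graph(n, constraints):
    # Turn each difference constraint x_j - x_i <= w_ij into an
    # edge v_i -> v_j of weight w_ij, one vertex per variable.
    vertices = ["v%d" % k for k in range(1, n + 1)]
    edges, w = [], {}
    for (i, j, w_ij) in constraints:
        e = ("v%d" % i, "v%d" % j)
        edges.append(e)
        w[e] = w_ij
    return vertices, edges, w

# The example that follows: x_1 - x_2 <= 3, x_2 - x_3 <= -2,
# and x_1 - x_3 <= 2, as (i, j, w_ij) triples.
vertices, edges, w = constraint_graph(3, [(2, 1, 3), (3, 2, -2), (3, 1, 2)])
```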
    So, for example,
  • 52:01 - 52:01
    here we have three variables.
    So, we have three vertices,
  • 52:06 - 52:06
    v_1, v_2, v_3.
    We have x_1 minus x_2.
  • 52:10 - 52:10
    So, we have an edge from v_2 to
    v_1 of weight three.
  • 52:14 - 52:14
    We have x_2 minus x_3.
    So, we have an edge from v_3 to
  • 52:19 - 52:19
    v_2 of weight minus two.
    And, we have x_1 minus x_3.
  • 52:23 - 52:23
    So, we have an edge from v_3 to
    v_1 of weight two.
  • 52:28 - 52:28
    I hope I got the directions
    right.
  • 52:32 - 52:32
    Yep.
    So, there it is,
  • 52:34 - 52:34
    a graph: currently no obvious
    connection to shortest paths,
  • 52:40 - 52:40
    right?
    But in fact,
  • 52:42 - 52:42
    this constraint is closely
    related to shortest paths.
  • 52:48 - 52:48
    So let me just rewrite it.
    You could say,
  • 52:52 - 52:52
    well, an x_j is less than or
    equal to x_i plus w_ij.
  • 52:59 - 52:59
    Or, you could think of it as
    d[j] less than or equal to d[i]
  • 53:04 - 53:04
    plus w_ij.
    This is a conceptual balloon.
  • 53:07 - 53:07
    Look awfully familiar?
    A lot like the triangle
  • 53:11 - 53:11
    inequality, a lot like
    relaxation.
  • 53:13 - 53:13
    So, there's a very close
    connection between these two
  • 53:18 - 53:18
    problems as we will now prove.
  • 53:43 - 53:43
    So, we're going to have two
    theorems.
  • 53:45 - 53:45
    And, they're going to look
    similar to the correctness of
  • 53:49 - 53:49
    Bellman-Ford in that they talk
    about negative weight cycles.
  • 53:53 - 53:53
    Here we go.
    It turns out,
  • 53:55 - 53:55
    I mean, we have this constraint
    graph.
  • 53:57 - 53:57
    It can have negative weights.
    It can have positive weights.
  • 54:02 - 54:02
    It turns out what matters is if
    you have a negative weight
  • 54:06 - 54:06
    cycle.
    So, the first thing to prove is
  • 54:08 - 54:08
    that if you have a negative
    weight cycle that something bad
  • 54:12 - 54:12
    happens.
    OK, what could happen bad?
  • 54:14 - 54:14
    Well, we're just trying to
    satisfy this system of
  • 54:17 - 54:17
    constraints.
    So, the bad thing is that there
  • 54:19 - 54:19
    might not be any solution.
    These constraints may be
  • 54:23 - 54:23
    infeasible.
    And that's the claim.
  • 54:25 - 54:25
    The claim is that this is
    actually an if and only if.
  • 54:29 - 54:29
    But first we'll proved the if.
    If you have a negative weight
  • 54:34 - 54:34
    cycle, you're doomed.
    The difference constraints are
  • 54:38 - 54:38
    unsatisfiable.
    That's a more intuitive way to
  • 54:42 - 54:42
    say it.
    In the LP world,
  • 54:44 - 54:44
    they call it infeasible.
    But unsatisfiable makes a lot
  • 54:48 - 54:48
    more sense.
    There's no way to assign the
  • 54:52 - 54:52
    x_i's in order to satisfy all
    the constraints simultaneously.
  • 54:57 - 54:57
    So, let's just take a look.
    Consider a negative weight
  • 55:02 - 55:02
    cycle.
    It starts at some vertex,
  • 55:04 - 55:04
    goes through some vertices,
    and at some point comes back.
  • 55:08 - 55:08
    I don't care whether it repeats
    vertices, just as long as this
  • 55:12 - 55:12
    cycle, from v_1 to v_1 is a
    negative weight cycle strictly
  • 55:15 - 55:15
    negative weight.
  • 55:26 - 55:26
    OK, and what I'm going to do is
    just write down all the
  • 55:31 - 55:31
    constraints.
    Each of these edges corresponds
  • 55:34 - 55:34
    to a constraint,
    which must be in the set of
  • 55:38 - 55:38
    constraints because we had that
    graph.
  • 55:41 - 55:41
    So, these are all edges.
    Let's look at what they give
  • 55:45 - 55:45
    us.
    So, we have an edge from v_1 to
  • 55:48 - 55:48
    v_2.
    That corresponds to x_2 minus
  • 55:51 - 55:51
    x_1 is, at most,
    something, w_12.
  • 55:53 - 55:53
    Then we have x_3 minus x_2.
    That's the weight w_23,
  • 55:58 - 55:58
    and so on.
    And eventually we get up to
  • 56:04 - 56:04
    something like x_k minus
    x_(k-1).
  • 56:09 - 56:09
    That's this edge:
    w_(k-1),k , and lastly we have
  • 56:16 - 56:16
    this edge, which wraps around.
    So, it's x_1 minus x_k,
  • 56:24 - 56:24
    w_k1 if I've got the signs
    right.
  • 56:30 - 56:30
    Good, so here's a bunch of
    constraints.
  • 56:36 - 56:36
    What do you suggest I do with
    them?
  • 56:41 - 56:41
    Anything interesting about
    these constraints,
  • 56:47 - 56:47
    say, the left hand sides?
    Sorry?
  • 56:52 - 56:52
    It sounded like the right word.
    What was it?
  • 57:00 - 57:00
    Telescopes, yes,
    good.
  • 57:01 - 57:01
    Everything cancels.
    If I added these up,
  • 57:04 - 57:04
    there's an x_2 and a minus x_2.
    There's a minus x_1 and an x_1.
  • 57:08 - 57:08
    There's a minus x_k and an x_k.
    Everything here cancels if I
  • 57:12 - 57:12
    add up the left hand sides.
    So, what happens if I add up
  • 57:16 - 57:16
    the right hand sides?
    Over here I get zero,
  • 57:19 - 57:19
    my favorite answer.
    And over here,
  • 57:21 - 57:21
    we get all the weights of all
    the edges in the negative weight
  • 57:25 - 57:25
    cycle, which is the weight of
    the cycle, which is negative.
  • 57:30 - 57:30
    So, zero is strictly less than
    zero: contradiction.
  • 57:33 - 57:33
    Contradiction:
    wait a minute,
  • 57:35 - 57:35
    we didn't assume anything that
    was false.
  • 57:38 - 57:38
    So, it's not really a
    contradiction in the
  • 57:40 - 57:40
    mathematical sense.
    We didn't contradict the world.
  • 57:44 - 57:44
    We just said that these
    constraints are contradictory.
  • 57:47 - 57:47
    In other words,
    if you pick any values of the
  • 57:50 - 57:50
    x_i's, there is no way that
    these can all be true because
  • 57:54 - 57:54
    then you would get a
    contradiction.
  • 57:56 - 57:56
    So, it's impossible for these
    things to be satisfied by some
  • 58:00 - 58:00
    real x_i's.
    So, these must be
  • 58:02 - 58:02
    unsatisfiable.
    Let's say there's no satisfying
  • 58:07 - 58:07
    assignment, a little more
    precisely: x_1 up to x_n.
  • 58:12 - 58:12
    No way can we satisfy those
  • 58:15 - 58:15
    constraints,
    because they add up to zero on
  • 58:19 - 58:19
    the left-hand side,
    and negative on the right-hand
  • 58:24 - 58:24
    side.
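
    Written out, the telescoping argument is a single line (the last
    constraint wraps around from x_k back to x_1):

        0 = (x_2 - x_1) + (x_3 - x_2) + ... + (x_1 - x_k)
          <= w_12 + w_23 + ... + w_k1
          = w(cycle)
          < 0

    The left-hand sides cancel to zero while the right-hand sides sum
    to the weight of the cycle, which is strictly negative, so no
    assignment of the x_i's can satisfy all k constraints at once.
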
    OK, so that's an easy proof.
  • 58:27 - 58:27
    The reverse direction will be
    only slightly harder.
  • 58:33 - 58:33
    OK, so, cool.
    We have this connection.
  • 58:35 - 58:35
    So the motivation is,
    suppose you want to solve
  • 58:37 - 58:37
    these difference constraints.
    And we'll see one such
  • 58:40 - 58:40
    application.
    I Googled around for difference
  • 58:42 - 58:42
    constraints.
    There is a fair number of
  • 58:44 - 58:44
    papers that care about
    difference constraints.
  • 58:47 - 58:47
    And, they all use shortest
    paths to solve them.
  • 58:49 - 58:49
    So, if we can prove a
    connection between shortest
  • 58:52 - 58:52
    paths, which we know how to
    compute, and difference
  • 58:55 - 58:55
    constraints, then we'll have
    something cool.
  • 58:57 - 58:57
    And, next class we'll see even
    more applications of difference
  • 59:00 - 59:00
    constraints.
    It turns out they're really
  • 59:05 - 59:05
    useful for all pairs shortest
    paths.
  • 59:10 - 59:10
    OK, but for now let's just
    prove this equivalence and
  • 59:16 - 59:16
    finish it off.
    So, the reverse direction is if
  • 59:22 - 59:22
    there's no negative weight cycle
    in this constraint graph,
  • 59:29 - 59:29
    then the system better be
    satisfiable.
  • 59:35 - 59:35
    The claim is that these
    negative weight cycles are the
  • 59:42 - 59:42
    only barriers for finding a
    solution to these difference
  • 59:49 - 59:49
    constraints.
    I have this feeling somewhere
  • 59:55 - 59:55
    here.
    I had to talk about the
  • 59:59 - 59:59
    constraint graph.
    Good.
  • 60:13 - 60:13
    Satisfied, good.
    So, here we're going to see a
  • 60:20 - 60:20
    technique that is very useful
    when thinking about shortest
  • 60:28 - 60:28
    paths.
    And, it's a bit hard to guess,
  • 60:33 - 60:33
    especially if you haven't seen
    it before.
  • 60:37 - 60:37
    This is useful in problem sets,
    and in quizzes,
  • 60:41 - 60:41
    and finals, and everything.
    So, keep this in mind.
  • 60:45 - 60:45
    I mean, I'm using it to prove
    this rather simple theorem,
  • 60:51 - 60:51
    but the idea of changing the
    graph, so I'm going to call this
  • 60:56 - 60:56
    constraint graph G.
    Changing the graph is a very
  • 61:00 - 61:00
    powerful idea.
    So, we're going to add a new
  • 61:04 - 61:04
    vertex, s, or source,
    use the source,
  • 61:08 - 61:08
    Luke, and we're going to add a
    bunch of edges from s because
  • 61:13 - 61:13
    being a source,
    it better be connected to some
  • 61:17 - 61:17
    things.
    So, we are going to add a zero
  • 61:24 - 61:24
    weight edge, or weight zero edge
    from s to everywhere,
  • 61:30 - 61:30
    so, to every other vertex in
    the constraint graph.
  • 61:36 - 61:36
    Those vertices are called v_i,
    v_1 up to v_n.
  • 61:40 - 61:40
    So, I have my constraint graph.
    But I'll copy this one so I can
  • 61:46 - 61:46
    change it.
    It's always good to backup your
  • 61:50 - 61:50
    work before you make changes,
    right?
  • 61:53 - 61:53
    So now, I want to add a new
    vertex, s, over here,
  • 61:58 - 61:58
    my new source.
    I just take my constraint
  • 62:01 - 62:01
    graph, whatever it looks like,
    add in weight zero edges to all
  • 62:07 - 62:07
    the other vertices.
    Simple enough.
  • 62:11 - 62:11
    Now, what did I do?
    What did you do?
  • 62:14 - 62:14
    Well, I have a candidate source
    now which can reach all the
  • 62:19 - 62:19
    vertices.
    So, shortest path from s,
  • 62:22 - 62:22
    hopefully, well,
    paths from s exist.
  • 62:25 - 62:25
    I can get from s to everywhere
    in weight at most zero.
  • 62:30 - 62:30
    OK, maybe less.
    Could it be less?
  • 62:32 - 62:32
    Well, you know,
    like v_2, I can get to it by
  • 62:34 - 62:34
    zero minus two.
    So, that's less than zero.
  • 62:37 - 62:37
    So I've got to be a little
    careful.
  • 62:39 - 62:39
    What if there's a negative
    weight cycle?
  • 62:41 - 62:41
    Oh no?
    Then there wouldn't be any
  • 62:43 - 62:43
    shortest paths.
    Fortunately,
  • 62:44 - 62:44
    we assume that there's no
    negative weight cycle in the
  • 62:47 - 62:47
    original graph.
    And if you think about it,
  • 62:50 - 62:50
    if there's no negative weight
    cycle in the original graph,
  • 62:53 - 62:53
    we add an edge from s to
    everywhere else.
  • 62:55 - 62:55
    We're not making any new
    negative weight cycles because
  • 62:59 - 62:59
    you can start at s and go
    somewhere at a cost of zero,
  • 63:02 - 63:02
    which doesn't affect any
    weights.
  • 63:05 - 63:05
    And then, you are forced to
    stay in the old graph.
  • 63:09 - 63:09
    So, there can't be any new
    negative weight cycles.
  • 63:13 - 63:13
    So, the modified graph has no
    negative weight cycles.
  • 63:17 - 63:17
    That's good because it also has
    paths from s,
  • 63:21 - 63:21
    and therefore it also has
    shortest paths from s.
  • 63:25 - 63:25
    The modified graph has no
    negative weight cycles because it
  • 63:30 - 63:30
    didn't before.
    And, it has paths from s.
  • 63:34 - 63:34
    There's a path from s to every
    vertex.
  • 63:38 - 63:38
    There may not have been before.
    Before, I couldn't get from v_2
  • 63:45 - 63:45
    to v_3, for example.
    Well, that's still true.
  • 63:50 - 63:50
    But from s I can get to
    everywhere.
  • 63:53 - 63:53
    So, that means that this graph,
    this modified graph,
  • 63:59 - 63:59
    has shortest paths.
    Shortest paths exist from s.
  • 64:05 - 64:05
    In other words,
    if I took all the shortest path
  • 64:10 - 64:10
    weights, like I ran Bellman-Ford
    from s, then,
  • 64:15 - 64:15
    I would get a bunch of finite
    numbers, d of v,
  • 64:19 - 64:19
    for every value,
    for every vertex.
  • 64:23 - 64:23
    That seems like a good idea.
    Let's do it.
  • 64:27 - 64:27
    So, shortest paths exist.
    Let's just assign x_i to be the
  • 64:34 - 64:34
    shortest path weight from s to
    v_i.
  • 64:37 - 64:37
    Why not?
    That's a good choice for a
  • 64:40 - 64:40
    number, the shortest path weight
    from s to v_i.
  • 64:44 - 64:44
    This is finite because it's
    less than infinity,
  • 64:48 - 64:48
    and it's greater than minus
    infinity, so,
  • 64:52 - 64:52
    some finite number.
    That's what we need to do in
  • 64:56 - 64:56
    order to satisfy these
    constraints.
  • 65:00 - 65:00
    The claim is that this is a
    satisfying assignment.
  • 65:04 - 65:04
    Why?
    Triangle inequality.
  • 65:06 - 65:06
    Somewhere here we wrote
    triangle inequality.
  • 65:09 - 65:09
    This looks a lot like the
    triangle inequality.
  • 65:13 - 65:13
    In fact, I think that's the end
    of the proof.
  • 65:16 - 65:16
    Let's see here.
    What we want to be true with
  • 65:20 - 65:20
    this assignment is that x_j
    minus x_i is less than or equal
  • 65:25 - 65:25
    to w_ij whenever ij is an edge.
    Or, let's say v_i,
  • 65:28 - 65:28
    v_j, for every such constraint,
    so, for v_i,
  • 65:32 - 65:32
    v_j in the edge set.
    OK, so why is this true?
  • 65:37 - 65:37
    Well, let's just expand it out.
    So, x_i is this delta,
  • 65:42 - 65:42
    and x_j is some other delta.
    So, we have delta of s,
  • 65:47 - 65:47
    v_j minus delta of s, v_i.
    And, on the right-hand side,
  • 65:52 - 65:52
    well, w_ij, that was the weight
    of the edge from i to j.
  • 65:57 - 65:57
    So, this is the weight of v_i
    to v_j.
  • 66:01 - 66:01
    OK, I will rewrite this
    slightly.
  • 66:04 - 66:04
    Delta of s, v_j is less than or
    equal to delta of s,
  • 66:07 - 66:07
    v_i plus w of v_i,
    v_j.
  • 66:09 - 66:09
    And that's the triangle
    inequality more or less.
  • 66:13 - 66:13
    The shortest path from s to v_j
    is, at most, shortest path from
  • 66:18 - 66:18
    s to v_i plus a particular path
    from v_i to v_j,
  • 66:22 - 66:22
    namely the single edge v_i to
    v_j.
  • 66:25 - 66:25
    This could only be longer than
    the shortest path.
  • 66:30 - 66:30
    And so, that makes the
    right-hand side bigger,
  • 66:33 - 66:33
    which makes this inequality
    more true, meaning it was true
  • 66:38 - 66:38
    before.
    And now it's still true.
  • 66:40 - 66:40
    And, that proves it.
    This is true.
  • 66:42 - 66:42
    And, these were all equivalent
    statements.
  • 66:46 - 66:46
    This we know to be true by
    triangle inequality.
  • 66:49 - 66:49
    Therefore, these constraints
    are all satisfied.
  • 66:52 - 66:52
    Magic.
    I'm so excited here.
  • 66:54 - 66:54
    So, we've proved that having a
    negative weight cycle is exactly
  • 66:59 - 66:59
    when this system of difference
    constraints is unsatisfiable.
  • 67:05 - 67:05
    So, if we want to satisfy them,
    if we want to find the right
  • 67:08 - 67:08
    answer to x, we run
    Bellman-Ford.
  • 67:10 - 67:10
    Either it says,
    oh, negative weight cycle.
  • 67:12 - 67:12
    Then you are hosed.
    Then, there is no solution.
  • 67:15 - 67:15
    But that's the best you could
    hope to know.
  • 67:17 - 67:17
    Otherwise, it says,
    oh, there was no negative
  • 67:20 - 67:20
    weight cycle,
    and here are your shortest path
  • 67:22 - 67:22
    weights.
    You just plug them in,
  • 67:24 - 67:24
    and bam, you have your x_i's
    that satisfy the constraints.
  • 67:27 - 67:27
    Awesome.
    Now, it wasn't just any graph.
  • 67:30 - 67:30
    I mean, we started with
    constraints, algebra,
  • 67:33 - 67:33
    we converted it into a graph by
    this transform.
  • 67:36 - 67:36
    Then we added a source vertex,
    s.
  • 67:38 - 67:38
    So, I mean, we had to build a
    graph to solve our problem,
  • 67:42 - 67:42
    very powerful idea.
    Cool.
  • 67:43 - 67:43
    This is the idea of reduction.
    You can reduce the problem you
  • 67:47 - 67:47
    want to solve into some problem
    you know how to solve.
  • 67:51 - 67:51
    You know how to solve shortest
    paths when there are no negative
  • 67:55 - 67:55
    weight cycles,
    or find out that there is a
  • 67:57 - 67:57
    negative weight cycle by
    Bellman-Ford.
  • 68:01 - 68:01
    So, now we know how to solve
    difference constraints.
  • 68:06 - 68:06
    It turns out you can do even
    more.
  • 68:09 - 68:09
    Bellman-Ford does a little bit
    more than just solve these
  • 68:15 - 68:15
    constraints.
    But first let me write down
  • 68:19 - 68:19
    what I've been jumping up and
    down about.
  • 68:23 - 68:23
    The corollary is you can use
    Bellman-Ford.
  • 68:27 - 68:27
    I mean, you make this graph.
    Then you apply Bellman-Ford,
  • 68:34 - 68:34
    and it will solve your system
    of difference constraints.
  • 68:41 - 68:41
    So, let me put in some numbers
    here.
  • 68:46 - 68:46
    You have m difference
    constraints.
  • 68:50 - 68:50
    And, you have n variables.
    And, it will solve them in
  • 68:56 - 68:56
    order m times n time.
    Actually, these numbers go up
  • 69:02 - 69:02
    slightly because we are adding n
    edges, and we're adding one
  • 69:07 - 69:07
    vertex, but assuming all of
    these numbers are nontrivial,
  • 69:12 - 69:12
    m is at least n.
    It's order mn time.
  • 69:15 - 69:15
    OK, trying to avoid cases where
    some of them are close to zero.
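
    To make the corollary concrete, here is a minimal Python sketch of
    the whole pipeline, with illustrative names, assuming the m
    constraints x_j - x_i <= w are given as triples (i, j, w) over
    variables x_1 up to x_n:

        INF = float("inf")

        def solve_difference_constraints(n, constraints):
            # One edge v_i -> v_j of weight w per constraint, plus the
            # added source s (vertex 0) with weight-0 edges to each v_i.
            edges = list(constraints) + [(0, i, 0) for i in range(1, n + 1)]
            d = [INF] * (n + 1)
            d[0] = 0  # the source s
            # Bellman-Ford: the graph has n + 1 vertices, so n passes.
            for _ in range(n):
                for u, v, w in edges:
                    if d[u] + w < d[v]:
                        d[v] = d[u] + w
            # If any edge can still relax, there is a negative weight
            # cycle, and the system is unsatisfiable.
            for u, v, w in edges:
                if d[u] + w < d[v]:
                    return None
            # Otherwise x_i = delta(s, v_i) satisfies every constraint.
            return d[1:]

    For example, solve_difference_constraints(3, [(1, 2, -2), (3, 1, 2)])
    returns [0, -2, 0], which satisfies x_2 - x_1 <= -2 and
    x_1 - x_3 <= 2; with m at least n, the two nested loops give the
    order mn running time.
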
  • 69:20 - 69:20
    Good.
    So, some other facts,
  • 69:22 - 69:22
    that's what I just said.
    And we'll leave these as
  • 69:26 - 69:26
    exercises because they're not
    too essential.
  • 69:31 - 69:31
    The main thing we need is this.
    But, some other cool facts is
  • 69:36 - 69:36
    that Bellman-Ford actually
    optimizes some objective
  • 69:39 - 69:39
    functions.
    So, we are saying it's just a
  • 69:42 - 69:42
    feasibility problem.
    We just want to know whether
  • 69:46 - 69:46
    these constraints are
    satisfiable.
  • 69:49 - 69:49
    In fact, you can add a
    particular objective function.
  • 69:53 - 69:53
    So, you can't give it an
    arbitrary objective function,
  • 69:57 - 69:57
    but here's one of interest.
    x_1 plus x_2 plus ... plus x_n,
  • 70:05 - 70:05
    OK, but not just that.
    We have some constraints.
  • 70:24 - 70:24
    OK, this is a linear program.
    I want to maximize the sum of
  • 70:27 - 70:27
    the x_i's subject to all the
    x_i's being nonpositive and the
  • 70:31 - 70:31
    difference constraints.
    So, this we had before.
  • 70:34 - 70:34
    This is fine.
    We noticed at some point you
  • 70:36 - 70:36
    could get from s to everywhere
    with cost, at most,
  • 70:39 - 70:39
    zero.
    So, we know that in this
  • 70:41 - 70:41
    assignment all of the x_i's are
    nonpositive.
  • 70:43 - 70:43
    That's not necessary,
    but it's true when you run
  • 70:46 - 70:46
    Bellman-Ford.
    So if you solve your system
  • 70:48 - 70:48
    using Bellman-Ford,
    which is no less general than
  • 70:51 - 70:51
    anything else,
    you happen to get nonpositive
  • 70:53 - 70:53
    x_i's.
    And so, subject to that
  • 70:55 - 70:55
    constraint, it actually makes
    them as close to zero as
  • 70:58 - 70:58
    possible in the L1 norm.
    In the sum of these values,
  • 71:04 - 71:04
    it tries to make the sum as
    close to zero as it can;
  • 71:09 - 71:09
    it tries to make the values as
    small as possible in absolute
  • 71:15 - 71:15
    value in this sense.
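
    A sketch of why this maximization claim holds: take any feasible
    assignment y with all y_i <= 0, and look at a shortest path from s
    to v_i in the augmented graph. Its first edge out of s has weight
    zero, so the remaining edges, all of them constraint edges, have
    total weight delta(s, v_i), and chaining their constraints along
    the path gives

        y_i <= y_(first vertex after s) + delta(s, v_i)
            <= 0 + delta(s, v_i) = x_i.

    So the Bellman-Ford assignment is pointwise largest among feasible
    nonpositive assignments, and in particular has the largest sum.
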
    OK, it does more than that.
  • 71:20 - 71:20
    It cooks, it cleans,
    it finds shortest paths.
  • 71:25 - 71:25
    It also minimizes the spread,
    the maximum over all i of x_i
  • 71:32 - 71:32
    minus the minimum over all i of
    x_i.
  • 71:37 - 71:37
    So, I mean, if you have your
    real line, and here are the
  • 71:41 - 71:41
    x_i's wherever they are.
    It minimizes this distance.
  • 71:44 - 71:44
    And zero is somewhere over
    here.
  • 71:47 - 71:47
    So, it tries to make the x_i's
    as compact as possible.
  • 71:50 - 71:50
    This is actually the L infinity
    norm, if you know stuff about
  • 71:54 - 71:54
    norms from your linear algebra
    class.
  • 71:57 - 71:57
    OK, this is the L1 norm.
    I think it minimizes every L_p
  • 72:01 - 72:01
    norm.
    Good, so let's use this for
  • 72:05 - 72:05
    something.
    Yeah, let's solve a real
  • 72:09 - 72:09
    problem, and then we'll be done
    for today.
  • 72:14 - 72:14
    Next class we'll see the really
    cool stuff, the really cool
  • 72:21 - 72:21
    application of all of this.
    For now, and we'll see a cool
  • 72:27 - 72:27
    but relatively simple
    application, which is VLSI
  • 72:33 - 72:33
    layout.
    We talked a little bit about
  • 72:38 - 72:38
    VLSI way back and divide and
    conquer.
  • 72:41 - 72:41
    You have a bunch of chips,
    and you want to arrange them,
  • 72:46 - 72:46
    and minimize some objectives.
    So, here's a particular one;
  • 72:50 - 72:50
    tons of problems come out
    of VLSI layout.
  • 72:55 - 72:55
    Here's one of them.
    You have a bunch of features of
  • 72:59 - 72:59
    an integrated circuit.
    You want to somehow arrange
  • 73:05 - 73:05
    them on your circuit without
    putting any two of them too
  • 73:10 - 73:10
    close to each other.
    You have some minimum
  • 73:14 - 73:14
    separation like at least they
    should not get top of each
  • 73:19 - 73:19
    other.
    Probably, you also need some
  • 73:22 - 73:22
    separation to put wires in
    between, and so on,
  • 73:27 - 73:27
    so, without putting any two
    features too close together.
  • 73:33 - 73:33
    OK, so just to give you an
    idea, so I have some objects and
  • 73:37 - 73:37
    I'm going to be a little bit
    vague about how this works.
  • 73:41 - 73:41
    You have some features.
    This is stuff,
  • 73:44 - 73:44
    some chips, whatever.
    We don't really care what their
  • 73:47 - 73:47
    shapes look like.
    I just want to be able to move
  • 73:51 - 73:51
    them around so that the gap at
    any point, so let me just think
  • 73:55 - 73:55
    about this gap.
    This gap should be at least
  • 73:58 - 73:58
    some delta.
    Or, I don't want to use delta.
  • 74:01 - 74:01
    Let's say epsilon,
    good, small number.
  • 74:05 - 74:05
    So, I just need some separation
    between all of my parts.
  • 74:09 - 74:09
    And for this problem,
    I'm going to be pretty simple,
  • 74:12 - 74:12
    just say that the parts are
    only allowed to slide
  • 74:16 - 74:16
    horizontally.
    So, it's a one-dimensional
  • 74:18 - 74:18
    problem.
    These objects are in 2-d,
  • 74:21 - 74:21
    or whatever,
    but I can only slide them in the x
  • 74:24 - 74:24
    coordinate.
    So, to model that,
  • 74:26 - 74:26
    I'm going to look at the left
    edge of every part and say,
  • 74:30 - 74:30
    well, these two left edges
    should be at least some
  • 74:33 - 74:33
    separation.
    So, I think of it as whatever
  • 74:37 - 74:37
    the distance is plus some
    epsilon.
  • 74:39 - 74:39
    But, you know,
    if you have some funky 2-d
  • 74:42 - 74:42
    shapes you have to compute,
    well, this is a little bit too
  • 74:45 - 74:45
    close because these come into
    alignment.
  • 74:48 - 74:48
    But, there's some constraint,
    well, for any two pieces,
  • 74:51 - 74:51
    I could figure out how close
    they can get.
  • 74:54 - 74:54
    They should get no closer.
    So, I'm going to call this x_1.
  • 74:57 - 74:57
    I'll call this x_2.
    So, we have some constraint
  • 75:00 - 75:00
    like x_2 minus x_1 is at least d
    plus epsilon,
  • 75:03 - 75:03
    or whatever you compute that
    weight to be.
  • 75:07 - 75:07
    OK, so for every pair of
    pieces, I can do this,
  • 75:10 - 75:10
    compute some constraint on how
    far apart they have to be.
  • 75:13 - 75:13
    And, now I'd like to assign
    these x coordinates.
  • 75:16 - 75:16
    Right now, I'm assuming they're
    just variables.
  • 75:19 - 75:19
    I want to slide these pieces
    around horizontally in order to
  • 75:22 - 75:22
    compactify them as much as
    possible so they fit in the
  • 75:25 - 75:25
    smallest chip that I can make
    because it costs money,
  • 75:28 - 75:28
    and time, and everything,
    and power, everything.
  • 75:31 - 75:31
    You always want your chip
    small.
  • 75:34 - 75:34
    So, Bellman-Ford does that.
    All right, so Bellman-Ford
  • 75:40 - 75:40
    solves these constraints because
    it's just a bunch of difference
  • 75:48 - 75:48
    constraints.
    And we know that they are
  • 75:52 - 75:52
    solvable because you could
    spread all the pieces out
  • 75:58 - 75:58
    arbitrarily far.
    And, it minimizes the spread,
  • 76:03 - 76:03
    minimizes the size of the chip
    I need, the max of x_i minus the
  • 76:10 - 76:10
    min of x_i.
    So, this is it: it maximizes
  • 76:15 - 76:15
    compactness, or minimizes size
    of the chip.
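
    As a hypothetical instance, with made-up numbers, run through the
    solver sketched earlier: say part 2 must sit at least
    d plus epsilon = 3.5 to the right of part 1.

        # x_2 - x_1 >= 3.5, rewritten as a difference constraint,
        # is x_1 - x_2 <= -3.5, i.e., the triple (2, 1, -3.5).
        print(solve_difference_constraints(2, [(2, 1, -3.5)]))
        # -> [-3.5, 0]: a satisfying layout whose spread, the max
        #    of x_i minus the min of x_i, is the minimum possible, 3.5.
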
  • 76:18 - 76:18
    OK, this is a one-dimensional
    problem, so it may seem a little
  • 76:23 - 76:23
    artificial, but the two
    dimensional problem is really
  • 76:27 - 76:27
    hard to solve.
    And this is,
  • 76:29 - 76:29
    in fact, the best you can do
    with a nice polynomial time
  • 76:33 - 76:33
    algorithm.
    There are other applications if
  • 76:37 - 76:37
    you're scheduling events in,
    like, a multimedia environment,
  • 76:42 - 76:42
    and you want to guarantee that
    this audio plays at least two
  • 76:47 - 76:47
    seconds after this video,
    but then there are things that
  • 76:51 - 76:51
    are playing at the same time,
    and they have to be within some
  • 76:56 - 76:56
    gap of each other,
    so, lots of papers about using
  • 76:59 - 76:59
    Bellman-Ford,
    solve difference constraints to
  • 77:03 - 77:03
    enable multimedia environments.
    OK, so there you go.
  • 77:07 - 77:07
    And next class we'll see more
    applications of Bellman-Ford to
  • 77:11 - 77:11
    all pairs shortest paths.
    Questions?
  • 77:14 - 77:14
    Great.