
Lec 1 | MIT 6.046J / 18.410J Introduction to Algorithms (SMA 5503), Fall 2005

  • 0:10 - 0:10
    We're going to get started.
    Handouts are by the door if
  • 0:15 - 0:15
    anybody didn't pick one up.
    My name is Charles Leiserson.
  • 0:19 - 0:19
    I will be lecturing this course
    this term, Introduction to
  • 0:23 - 0:23
    Algorithms, with Erik Demaine.
    In addition,
  • 0:27 - 0:27
    this is an SMA course,
    a Singapore MIT Alliance course
  • 0:31 - 0:31
    which will be run in Singapore
    by David Hsu.
  • 0:35 - 0:35
    And so all the lectures will be
    videotaped and made available on
  • 0:41 - 0:41
    the Web for the Singapore
    students, as well as for MIT
  • 0:46 - 0:46
    students who choose to watch
    them on the Web.
  • 0:50 - 0:50
    If you have an issue of not
    wanting to be on the videotape,
  • 0:55 - 0:55
    you should sit in the back row.
    OK?
  • 0:58 - 0:58
    Otherwise, you will be on it.
    There is a video recording
  • 1:04 - 1:04
    policy, but it seems like they
    ran out.
  • 1:06 - 1:06
    If anybody wants to see it,
    people, if they could just sort
  • 1:10 - 1:10
    of pass them around maybe a
    little bit, once you're done
  • 1:14 - 1:14
    reading it, or you can come up.
    I did secure one copy.
  • 1:18 - 1:18
    Before we get into the content
    of the course,
  • 1:21 - 1:21
    let's briefly go over the
    course information because there
  • 1:25 - 1:25
    are some administrative things
    that we sort of have to do.
  • 1:30 - 1:30
    As you can see,
    this term we have a big staff.
  • 1:34 - 1:34
    Take a look at the handout
    here.
  • 1:36 - 1:36
    It includes, this term, six TAs,
    which is two more TAs than we
  • 1:40 - 1:40
    normally get for this course.
    That means recitations will be
  • 1:45 - 1:45
    particularly small.
    There is a World Wide Web page,
  • 1:49 - 1:49
    and you should bookmark that
    and go there regularly because
  • 1:54 - 1:54
    that is where everything will be
    distributed.
  • 1:58 - 1:58
    Email.
    You should not be emailing
  • 2:00 - 2:00
    directly, even though we give
    you our email addresses,
  • 2:04 - 2:04
    to the individual members of
    the staff.
  • 2:07 - 2:07
    You should email us generally.
    And the reason is you will get
  • 2:11 - 2:11
    much faster response.
    And also, for any
  • 2:14 - 2:14
    communications,
    generally we like to monitor
  • 2:17 - 2:17
    what the communications are so
    it's helpful to have emails
  • 2:21 - 2:21
    coming to everybody on the
    course staff.
  • 2:23 - 2:23
    As I mentioned,
    we will be doing distance
  • 2:26 - 2:26
    learning this term.
    And so you can watch lectures
  • 2:29 - 2:29
    online if you choose to do that.
    I would recommend,
  • 2:34 - 2:34
    for people who have the
    opportunity to watch,
  • 2:38 - 2:38
    to come live.
    It's better live.
  • 2:41 - 2:41
    You get to interact.
    There's an intangible that
  • 2:44 - 2:44
    comes with having it live.
    In fact, in addition to the
  • 2:49 - 2:49
    videos, I meet weekly with the
    Singapore students so that they
  • 2:54 - 2:54
    have a live session as well.
    Prerequisites.
  • 2:58 - 2:58
    The prerequisites for this
    course are 6.042,
  • 3:02 - 3:02
    which is Math for Computer
    Science, and 6.001.
  • 3:05 - 3:05
    You basically need discrete
    mathematics and probability,
  • 3:10 - 3:10
    as well as programming
    experience to take this course
  • 3:14 - 3:14
    successfully.
    People who do not have that
  • 3:17 - 3:17
    background should not be in the
    class.
  • 3:20 - 3:20
    We will be checking
    prerequisites.
  • 3:23 - 3:23
    If you have any questions,
    please come to talk to us after
  • 3:27 - 3:27
    class.
    Let's see.
  • 3:30 - 3:30
    Lectures are here.
    For SMA students,
  • 3:32 - 3:32
    they have the videotapes and
    they will also have a weekly
  • 3:37 - 3:37
    meeting.
    Students must attend a one-hour
  • 3:40 - 3:40
    recitation session each week.
    There will be new material
  • 3:44 - 3:44
    presented in the recitation.
    Unlike the lectures,
  • 3:48 - 3:48
    they will not be online.
    Unlike the lectures,
  • 3:51 - 3:51
    there will not be lecture notes
    distributed for the recitations
  • 3:56 - 3:56
    in general.
    And, yet, there will be
  • 4:00 - 4:00
    material there that is directly
    on the exams.
  • 4:03 - 4:03
    And so every term we say oh,
    when did you cover that?
  • 4:07 - 4:07
    That was in recitation.
    You missed that one.
  • 4:10 - 4:10
    So, recitations are mandatory.
    And, in particular,
  • 4:14 - 4:14
    also let me just mention your
    recitation instructor is the one
  • 4:19 - 4:19
    who assigns your final grade.
    So we have a grade meeting and
  • 4:23 - 4:23
    keep everybody normalized,
    but your recitation instructor has the
  • 4:27 - 4:27
    final say on your grade.
    Handouts.
  • 4:31 - 4:31
    Handouts are available on the
    course Web page.
  • 4:35 - 4:35
    We will not generally,
    except for this one,
  • 4:38 - 4:38
    first handout,
    be bringing handouts to class.
  • 4:42 - 4:42
    The textbook is this book,
    Introduction to Algorithms.
  • 4:46 - 4:46
    MIT students can get it at any of
    the local bookstores,
  • 4:51 - 4:51
    including the MIT Coop.
    There is also a new online
  • 4:55 - 4:55
    service that provides textbooks.
    You can also get a discount if
  • 5:02 - 5:02
    you buy it at the MIT Press
    Bookstore.
  • 5:05 - 5:05
    There is a coupon in the MIT
    Student Telephone Directory for
  • 5:10 - 5:10
    a discount on MIT Press books.
    And you can use that to
  • 5:15 - 5:15
    purchase this book at a
    discount.
  • 5:18 - 5:18
    Course website.
    This is the course website.
  • 5:22 - 5:22
    It links to the Stellar
    website, which is where,
  • 5:26 - 5:26
    actually, everything will be
    kept.
  • 5:30 - 5:30
    And SMA students have their own
    website.
  • 5:33 - 5:33
    Some students find this course
    particularly challenging, so we
  • 5:38 - 5:38
    will have extra help.
    We will post weekly office
  • 5:42 - 5:42
    hours on the course website for
    the TAs.
  • 5:45 - 5:45
    And then as an experiment this
    term, we are going to offer
  • 5:49 - 5:49
    homework labs for this class.
    What a homework lab is,
  • 5:53 - 5:53
    is a place and a time you
    can go where other people in the
  • 5:58 - 5:58
    course will go to do homework.
    And there will be typically two
  • 6:04 - 6:04
    TAs who staff the lab.
    And so, as you're working on
  • 6:08 - 6:08
    your homework,
    you can get help from the TAs
  • 6:11 - 6:11
    if you need it.
    And it's generally a place,
  • 6:14 - 6:14
    we're going to schedule those,
    and they will be on the course
  • 6:18 - 6:18
    calendar for where it is and
    when it is that they will be
  • 6:22 - 6:22
    held, but usually Sundays 2:00
    to 4:00 pm, or else it will be
  • 6:26 - 6:26
    some evening.
    I think the first one is an
  • 6:29 - 6:29
    evening, right?
    Near to when the homework is
  • 6:33 - 6:33
    due.
    Your best bet is try to do the
  • 6:36 - 6:36
    homework in advance of the
    homework lab.
  • 6:39 - 6:39
    But then, if you want extra
    help, if you want to talk over
  • 6:44 - 6:44
    your solutions with people
    because as we will talk about
  • 6:48 - 6:48
    problem sets you can solve in
    collaboration with other people
  • 6:53 - 6:53
    in the class.
    In addition,
  • 6:55 - 6:55
    there are several peer
    assistance programs.
  • 7:00 - 7:00
    Also the Office of Minority
    Education has an assistance
  • 7:04 - 7:04
    program, and those usually get
    booked up pretty quickly.
  • 7:08 - 7:08
    If you're interested in those,
    it's a good idea to make an
  • 7:12 - 7:12
    appointment to get there and get
    help soon.
  • 7:15 - 7:15
    The homework labs,
    I hope a lot of people will try
  • 7:19 - 7:19
    that out.
    We've never done this.
  • 7:22 - 7:22
    I don't know of any other
    course.
  • 7:24 - 7:24
    Do other people know of courses
    at MIT that have done this?
  • 7:28 - 7:28
    6.011 did it,
    OK.
  • 7:31 - 7:31
    Good.
    And was it successful in that
  • 7:34 - 7:34
    class?
    It never went,
  • 7:36 - 7:36
    OK.
    Good.
  • 7:37 - 7:37
    [LAUGHTER] We will see.
    If it's not paying off then we
  • 7:42 - 7:42
    will just return to ordinary
    office hours for those TAs,
  • 7:47 - 7:47
    but I think for some students
    that is a good opportunity.
  • 7:53 - 7:53
    If you wish to be registered in
    this course, you must sign up on
  • 7:59 - 7:59
    the course Web page.
    So, that is requirement one.
  • 8:05 - 8:05
    It must be done today.
    You will find it difficult to
  • 8:09 - 8:09
    pass the course if you are not
    in the class.
  • 8:13 - 8:13
    And you should notify your TA
    if you decide to drop so that we
  • 8:19 - 8:19
    can get you off the list and stop the
    mailings, stop the spam.
  • 8:24 - 8:24
    And you should register today
    before 7:00 PM.
  • 8:29 - 8:29
    And then we're going to email
    your recitation assignment to
  • 8:33 - 8:33
    you before Noon tomorrow.
    And if you don't receive this
  • 8:37 - 8:37
    information by Thursday Noon,
    please send us an email to the
  • 8:41 - 8:41
    course staff generally,
    not to me individually,
  • 8:44 - 8:44
    saying that you didn't receive
    your recitation assignment.
  • 8:48 - 8:48
    And so if you haven't received
    it by Thursday Noon you want to let us know.
  • 8:53 - 8:53
    I think generally they are
    going to send them out tonight
  • 8:57 - 8:57
    or at least by tomorrow morning.
    Yeah.
  • 9:00 - 9:00
    OK.
    SMA students don't have to
  • 9:02 - 9:02
    worry about this.
    Problem sets.
  • 9:04 - 9:04
    We have nine problem sets that
    we project will be assigned
  • 9:07 - 9:07
    during the semester.
    A couple things about problem
  • 9:10 - 9:10
    sets.
    Late homeworks won't generally be
  • 9:12 - 9:12
    accepted; if you have
    extenuating circumstances you
  • 9:15 - 9:15
    should make prior arrangements
    with your recitation instructor.
  • 9:19 - 9:19
    In fact, almost all of the
    administrative stuff,
  • 9:22 - 9:22
    you shouldn't come to me to ask
    and say can I hand in something
  • 9:25 - 9:25
    late?
    You should be talking to your
  • 9:27 - 9:27
    recitation instructor.
    You can read the other things
  • 9:32 - 9:32
    about the form,
    but let me just mention that
  • 9:36 - 9:36
    there are exercises that should
    be solved but not handed in as
  • 9:41 - 9:41
    well to give you drill on the
    material.
  • 9:44 - 9:44
    I highly recommend doing
    the exercises.
  • 9:47 - 9:47
    They both test your
    understanding of the material,
  • 9:51 - 9:51
    and exercises have this way of
    finding themselves on quizzes.
  • 9:56 - 9:56
    You're often asked to describe
    algorithms.
  • 10:00 - 10:00
    And here is a little outline of
    what you can use to describe an
  • 10:06 - 10:06
    algorithm.
    The grading policy is something
  • 10:10 - 10:10
    that somehow I cover.
    And always every term there are
  • 10:15 - 10:15
    at least a couple of students
    who pretend like I never showed
  • 10:21 - 10:21
    them this.
    If you skip problems it has a
  • 10:25 - 10:25
    nonlinear effect on your grade.
    Nonlinear, OK?
  • 10:30 - 10:30
    If you don't skip any problems,
    no effect on your grade.
  • 10:34 - 10:34
    If you skip one problem,
    a hundredth of a letter grade,
  • 10:39 - 10:39
    we can handle that.
    But two problems it's a tenth.
  • 10:43 - 10:43
    And, as you see,
    by the time you have skipped
  • 10:46 - 10:46
    like five problems,
  • 10:50 - 10:50
    This is not problem sets,
    by the way.
  • 10:53 - 10:53
    This is problems,
    OK?
  • 10:55 - 10:55
    You're down a third of a letter
    grade.
  • 10:59 - 10:59
    And if you skip nine or
    more problems, so that's typically about
  • 11:03 - 11:03
    three to four problem sets,
    you don't pass the class.
  • 11:07 - 11:07
    I always have some students
    coming at the end of the year
  • 11:12 - 11:12
    saying oh, I didn't do any of my
    problems.
  • 11:15 - 11:15
    Can you just pass me because I
    did OK on the exams?
  • 11:19 - 11:19
    Answer no, a very simple answer
    because we've said it upfront.
  • 11:23 - 11:23
    So, the problem sets are an
    integral part of the course.
  • 11:27 - 11:27
    Collaboration policy.
    This is extremely important so
  • 11:32 - 11:32
    everybody pay attention.
    If you are asleep now wake up.
  • 11:36 - 11:36
    Like that's going to wake
    anybody up, right?
  • 11:39 - 11:39
    [LAUGHTER] The goal of
    homework.
  • 11:41 - 11:41
    Professor Demaine's and my
    philosophy is that the goal of
  • 11:45 - 11:45
    homework is to help you learn
    the material.
  • 11:48 - 11:48
    And one way of helping to learn
    is not to just be stuck and
  • 11:52 - 11:52
    unable to solve something
    because then you're in no better
  • 11:56 - 11:56
    shape when the exam rolls
    around, which is where we're
  • 12:00 - 12:00
    actually evaluating you.
    So, you're encouraged to
  • 12:05 - 12:05
    collaborate.
    But there are some commonsense
  • 12:08 - 12:08
    things about collaboration.
    If you go and you collaborate
  • 12:12 - 12:12
    to the extent that all you're
    doing is getting the information
  • 12:16 - 12:16
    from somebody else,
    you're not learning the
  • 12:19 - 12:19
    material and you're not going to
    do well on the exams.
  • 12:23 - 12:23
    In our experience,
    students who collaborate
  • 12:26 - 12:26
    generally do better than
    students who work alone.
  • 12:30 - 12:30
    But you owe it to yourself,
    if you're going to work in a
  • 12:34 - 12:34
    study group, to be prepared for
    your study group meeting.
  • 12:37 - 12:37
    And specifically you should
    spend a half an hour to 45
  • 12:41 - 12:41
    minutes on each problem before
    you go to group so you're up to
  • 12:45 - 12:45
    speed and you've tried out your
    ideas.
  • 12:47 - 12:47
    And you may have solutions to
    some, you may be stuck on some
  • 12:51 - 12:51
    other ones, but at least you
    applied yourself to it.
  • 12:54 - 12:54
    After 30 to 45 minutes,
    if you cannot get the problem,
  • 12:58 - 12:58
    just sitting there and banging
    your head against it makes no
  • 13:01 - 13:01
    sense.
    It's not a productive use of
  • 13:05 - 13:05
    your time.
    And I know most of you have
  • 13:07 - 13:07
    issues with having time on your
    hands, right?
  • 13:10 - 13:10
    Like it's not there.
    So, don't go banging your head
  • 13:13 - 13:13
    against problems that are too
    hard or where you don't
  • 13:16 - 13:16
    understand what's going on or
    whatever.
  • 13:19 - 13:19
    That's when the study group can
    help out.
  • 13:21 - 13:21
    And, as I mentioned,
    we'll have homework labs which
  • 13:24 - 13:24
    will give you an opportunity to
    go and do that and coordinate
  • 13:28 - 13:28
    with other students rather than
    necessarily having to form your
  • 13:32 - 13:32
    own group.
    And the TAs will be there.
  • 13:36 - 13:36
    If your group is unable to
    solve the problem then talk to
  • 13:40 - 13:40
    other groups or ask your
    recitation instructor.
  • 13:44 - 13:44
    And, that's how you go about
    solving them.
  • 13:47 - 13:47
    Writing up the problem sets,
    however, is your individual
  • 13:51 - 13:51
    responsibility and should be
    done alone.
  • 13:54 - 13:54
    You don't write up your problem
    solutions with other students,
  • 13:59 - 13:59
    you write them up on your own.
    And, on your problem
  • 14:04 - 14:04
    sets, because this is an
    academic place,
  • 14:08 - 14:08
    we understand that the source
    of academic information is very
  • 14:13 - 14:13
    important, if you collaborated
    on solutions you should write a
  • 14:18 - 14:18
    list of the collaborators.
    Say I worked with so and so on
  • 14:22 - 14:22
    this solution.
    It does not affect your grade.
  • 14:26 - 14:26
    It's just a question of being
    scholarly.
  • 14:30 - 14:30
    It is a violation of this
    policy to submit a problem
  • 14:34 - 14:34
    solution that you cannot orally
    explain to a member of the
  • 14:39 - 14:39
    course staff.
    You say oh, well,
  • 14:41 - 14:41
    my write-up is similar to that
    other person's.
  • 14:45 - 14:45
    I didn't copy them.
    We may ask you to orally
  • 14:48 - 14:48
    explain your solution.
    If you are unable to,
  • 14:52 - 14:52
    according to this policy,
    the presumption is that you
  • 14:56 - 14:56
    cheated.
    So, do not write up stuff that
  • 14:59 - 14:59
    you don't understand.
    You should be able to write up
  • 15:04 - 15:04
    the stuff that you understand.
    Understand why you're putting
  • 15:08 - 15:08
    down what you're putting down.
    If it isn't obvious,
  • 15:12 - 15:12
    no collaboration whatsoever is
    permitted on exams.
  • 15:16 - 15:16
    Exams are when we evaluate you.
    And now we're not interested in
  • 15:20 - 15:20
    evaluating other people,
    we're interested in evaluating
  • 15:24 - 15:24
    you.
    So, no collaboration on exams.
  • 15:26 - 15:26
    We will have a take-home exam
    for the second quiz.
  • 15:31 - 15:31
    You should look at the
    schedule.
  • 15:33 - 15:33
    If there are problems with the
    schedule of that,
  • 15:36 - 15:36
    we want to know early.
    And we will give you more
  • 15:40 - 15:40
    details about the collaboration
    in the lecture on Monday,
  • 15:44 - 15:44
    November 28th.
    Now, generally,
  • 15:46 - 15:46
    the lectures here,
    they're mandatory and you have
  • 15:49 - 15:49
    to know them,
    but I know that some people say
  • 15:52 - 15:52
    gee, 9:30 is kind of early,
    especially on a Monday or
  • 15:56 - 15:56
    whatever.
    It can be kind of early to get
  • 15:59 - 15:59
    up.
    However, on Monday,
  • 16:02 - 16:02
    November 28th,
    you fail the exam if you do not
  • 16:05 - 16:05
    show up to lecture on time.
    That one day you must show up.
  • 16:10 - 16:10
    Any questions about that?
    That one day you must show up
  • 16:14 - 16:14
    here, even if you've been
    watching them on the Web.
  • 16:19 - 16:19
    And generally,
    if you think you have
  • 16:21 - 16:21
    transgressed,
    the best is to come to us to
  • 16:25 - 16:25
    talk about it.
    We can usually work something
  • 16:28 - 16:28
    out.
    It's when we find somebody has
  • 16:32 - 16:32
    transgressed from a third-party
    or from obvious analyses that we
  • 16:38 - 16:38
    do with homeworks,
    that's when things get messy.
  • 16:42 - 16:42
    So, if you think,
    for some reason or other,
  • 16:45 - 16:45
    oh, I may have done something
    wrong, please come and talk to
  • 16:50 - 16:50
    us.
    We actually were students once,
  • 16:53 - 16:53
    too, albeit many years ago.
    Any questions?
  • 16:56 - 16:56
    So, this class has great
    material.
  • 17:00 - 17:00
    Fabulous material.
    And it's really fun,
  • 17:07 - 17:07
    but you do have to work hard.
    Let's talk content.
  • 17:29 - 17:29
    This is the topic of the first
    part of the course.
  • 17:32 - 17:32
    The first part of the course is
    focused on analysis.
  • 17:36 - 17:36
    The second part of the course
    is focused on design.
  • 17:39 - 17:39
    Before you can do design,
    you have to master a bunch of
  • 17:43 - 17:43
    techniques for analyzing
    algorithms.
  • 17:45 - 17:45
    And then you'll be in a
    position to design algorithms
  • 17:49 - 17:49
    that you can analyze and that
    are efficient.
  • 17:52 - 17:52
    The analysis of algorithms is
    the theoretical study --
  • 18:05 - 18:05
    -- of computer program
    performance --
  • 18:19 - 18:19
    -- and resource usage.
    And a particular focus on
  • 18:24 - 18:24
    performance.
    We're studying how to make
  • 18:28 - 18:28
    things fast.
    In particular,
  • 18:31 - 18:31
    computer programs.
    We also will discover and talk
  • 18:36 - 18:36
    about other resources such as
    communication,
  • 18:40 - 18:40
    such as memory,
    whether RAM memory or disk
  • 18:45 - 18:45
    memory.
    There are other resources that
  • 18:48 - 18:48
    we may care about,
    but predominantly we focus on
  • 18:53 - 18:53
    performance.
    Because this is a course about
  • 18:57 - 18:57
    performance, I like to put
    things in perspective a little
  • 19:03 - 19:03
    bit by starting out and asking,
    in programming,
  • 19:07 - 19:07
    what is more important than
    performance?
  • 19:13 - 19:13
    If you're in an engineering
    situation and writing code,
  • 19:19 - 19:19
    writing software,
    what's more important than
  • 19:24 - 19:24
    performance?
    Correctness.
  • 19:26 - 19:26
    Good.
    OK.
  • 19:27 - 19:27
    What else?
    Simplicity can be.
  • 19:31 - 19:31
    Very good.
    Yeah.
  • 19:33 - 19:33
    Maintainability is often much more
    important than performance.
  • 19:41 - 19:41
    Cost.
    And what type of cost are you
  • 19:45 - 19:45
    thinking?
    No, I mean cost of what?
  • 19:49 - 19:49
    We're talking software here,
    right?
  • 19:54 - 19:54
    What type of cost do you have
    in mind?
  • 20:00 - 20:00
    There are some costs that are
    involved when programming like
  • 20:05 - 20:05
    programmer time.
    So, programmer time is another
  • 20:09 - 20:09
    thing that might be more important.
    Stability.
  • 20:12 - 20:12
    Robustness of the software.
    Does it break all the time?
  • 20:16 - 20:16
    What else?
  • 20:25 - 20:25
    Come on.
    We've got a bunch of engineers
  • 20:28 - 20:28
    here.
    A lot of things.
  • 20:30 - 20:30
    How about features?
    Features can be more important.
  • 20:34 - 20:34
    Having a wider collection of
    features than your competitors.
  • 20:38 - 20:38
    Functionality.
    Modularity.
  • 20:40 - 20:40
    Is it designed in a way where
    you can make changes in a local
  • 20:44 - 20:44
    part of the code and you don't
    have to make changes across the
  • 20:49 - 20:49
    code in order to affect a simple
    change in the functionality?
  • 20:53 - 20:53
    There is one big one which
    definitely, especially in the
  • 20:57 - 20:57
    '90s, was like the big thing in
    computers.
  • 21:01 - 21:01
    The big thing.
    Well, security actually.
  • 21:04 - 21:04
    Good.
    I don't even have that one
  • 21:06 - 21:06
    down.
    Security is excellent.
  • 21:08 - 21:08
    That's actually been more in
    the 2000s.
  • 21:11 - 21:11
    Security has been far more
    important often than
  • 21:15 - 21:15
    performance.
    Scalability has been important,
  • 21:18 - 21:18
    although scalability,
    in some sense,
  • 21:21 - 21:21
    is performance related.
    But, yes, scalability is good.
  • 21:25 - 21:25
    What was the big breakthrough
    and why do people use Macintosh
  • 21:29 - 21:29
    rather than Windows,
    those people who are of that
  • 21:33 - 21:33
    religion?
    User-friendliness.
  • 21:36 - 21:36
    Wow.
    If you look at the number of
  • 21:39 - 21:39
    cycles of computers that went
    into user-friendliness in the
  • 21:44 - 21:44
    '90s, it grew from almost
    nothing to where it's now the
  • 21:48 - 21:48
    vast part of the computation
    goes into user-friendliness.
  • 21:52 - 21:52
    So, all those things are more
    important than performance.
  • 21:57 - 21:57
    This is a course on
    performance.
  • 22:00 - 22:00
    Then you can say OK,
    well, why do we bother and why
  • 22:06 - 22:06
    study algorithms and performance
    if it's at the bottom of the
  • 22:12 - 22:12
    heap?
    Almost always people would
  • 22:16 - 22:16
    rather have these other things
    than performance.
  • 22:21 - 22:21
    You go off and you say to
    somebody, would you rather have
  • 22:27 - 22:27
    performance or more
    user-friendliness?
  • 22:32 - 22:32
    It's almost always more
    important than performance.
  • 22:36 - 22:36
    Why do we care then?
    Yeah?
  • 22:44 - 22:44
    That wasn't user-friendly.
    Sometimes performance is
  • 22:48 - 22:48
    correlated with
    user-friendliness,
  • 22:50 - 22:50
    absolutely.
    Nothing is more frustrating
  • 22:53 - 22:53
    than sitting there waiting,
    right?
  • 22:55 - 22:55
    So, that's a good reason.
    What are some other reasons
  • 22:59 - 22:59
    why?
    Sometimes they have real-time
  • 23:03 - 23:03
    constraints so they don't
    actually work unless they
  • 23:08 - 23:08
    perform adequately.
    Yeah?
  • 23:10 - 23:10
    Hard to get,
    well, we don't usually quantify
  • 23:14 - 23:14
    user-friendliness so I'm not
    sure, but I understand what
  • 23:20 - 23:20
    you're saying.
    He said we don't get
  • 23:23 - 23:23
    exponential performance
    improvements in
  • 23:27 - 23:27
    user-friendliness.
    We often don't get that in
  • 23:32 - 23:32
    performance either,
    by the way.
  • 23:35 - 23:35
    [LAUGHTER] Sometimes we do,
    but that's good.
  • 23:38 - 23:38
    There are several reasons that
    I think are important.
  • 23:42 - 23:42
    One is that often performance
    measures the line between the
  • 23:47 - 23:47
    feasible and the infeasible.
    We have heard some of these
  • 23:52 - 23:52
    things.
    For example,
  • 23:53 - 23:53
    when there are real-time
    requirements,
  • 23:57 - 23:57
    if it's not fast enough it's
    simply not functional.
  • 24:02 - 24:02
    Or, if it uses too much memory
    it's simply not going to work
  • 24:06 - 24:06
    for you.
    And, as a consequence,
  • 24:08 - 24:08
    what you find is algorithms are
    on the cutting edge of
  • 24:11 - 24:11
    entrepreneurship.
    If you're talking about just
  • 24:14 - 24:14
    re-implementing stuff that
    people did ten years ago,
  • 24:17 - 24:17
    performance isn't that
    important at some level.
  • 24:20 - 24:20
    But, if you're talking about
    doing stuff that nobody has done
  • 24:23 - 24:23
    before, one of the reasons often
    that they haven't done it is
  • 24:27 - 24:27
    because it's too time-consuming.
    Things don't scale and so
  • 24:32 - 24:32
    forth.
    So, that's one reason,
  • 24:34 - 24:34
    is the feasible versus
    infeasible.
  • 24:37 - 24:37
    Another thing is that
    algorithms give you a language
  • 24:40 - 24:40
    for talking about program
    behavior, and that turns out to
  • 24:45 - 24:45
    be a language that has been
    pervasive through computer
  • 24:49 - 24:49
    science, is that the theoretical
    language is what gets adopted by
  • 24:54 - 24:54
    all the practitioners because
    it's a clean way of thinking
  • 24:58 - 24:58
    about things.
    A good way I think about
  • 25:02 - 25:02
    performance, and the reason it's
    on the bottom of the heap,
  • 25:08 - 25:08
    is sort of like performance is
    like money, it's like currency.
  • 25:13 - 25:13
    You say what good does a stack
    of hundred dollar bills do for
  • 25:19 - 25:19
    you?
    Would you rather have food or
  • 25:22 - 25:22
    water or shelter or whatever?
    And you're willing to pay those
  • 25:27 - 25:27
    hundred dollar bills,
    if you have hundred dollar
  • 25:31 - 25:31
    bills, for that commodity.
    Even though water is far more
  • 25:37 - 25:37
    important to your living.
    Well, similarly,
  • 25:40 - 25:40
    performance is what you use to
    pay for user-friendliness.
  • 25:44 - 25:44
    It's what you pay for security.
    And you hear people say,
  • 25:48 - 25:48
    for example,
    that I want greater
  • 25:51 - 25:51
    functionality,
    so people will program in Java,
  • 25:54 - 25:54
    even though it's much slower
    than C, because they'll say it
  • 25:58 - 25:58
    costs me maybe a factor of three
    or something in performance to
  • 26:03 - 26:03
    program in Java.
    But Java is worth it because
  • 26:07 - 26:07
    it's got all these
    object-oriented features and so
  • 26:10 - 26:10
    forth, exception mechanisms and
    so on.
  • 26:13 - 26:13
    And so people are willing to
    pay a factor of three in
  • 26:16 - 26:16
    performance.
    So, that's why you want
  • 26:18 - 26:18
    performance because you can use
    it to pay for these other things
  • 26:23 - 26:23
    that you want.
    And that's why,
  • 26:25 - 26:25
    in some sense,
    it's on the bottom of the heap,
  • 26:28 - 26:28
    because it's the universal
    thing that you quantify.
  • 26:32 - 26:32
    Do you want to spend a factor
    of two on this or spend a factor
  • 26:37 - 26:37
    of three on security,
    et cetera?
  • 26:40 - 26:40
    And, in addition,
    the lessons generalize to other
  • 26:44 - 26:44
    resource measures like
    communication,
  • 26:47 - 26:47
    like memory and so forth.
    And the last reason we study
  • 26:51 - 26:51
    algorithm performance is it's
    tons of fun.
  • 26:54 - 26:54
    Speed is always fun,
    right?
  • 26:56 - 26:56
    Why do people drive fast cars,
    race horses,
  • 27:00 - 27:00
    whatever?
    Rockets, et cetera,
  • 27:03 - 27:03
    why do we do that?
    Because speed is fun.
  • 27:06 - 27:06
    Ski.
    Who likes to ski?
  • 27:08 - 27:08
    I love to ski.
    I like going fast on those
  • 27:10 - 27:10
    skis.
    It's fun.
  • 27:11 - 27:11
    Hockey, fast sports,
    right?
  • 27:13 - 27:13
    We all like the fast sports.
    Not all of us,
  • 27:16 - 27:16
    I mean.
    Some people say he's not
  • 27:18 - 27:18
    talking to me.
    OK, let's move on.
  • 27:21 - 27:21
    That's sort of a little bit of
    a notion as to why we study
  • 27:25 - 27:25
    this, is that it does,
    in some sense,
  • 27:27 - 27:27
    form a common basis for all
    these other things we care
  • 27:31 - 27:31
    about.
    And so we want to understand
  • 27:35 - 27:35
    how can we generate money for
    ourselves in computation?
  • 27:40 - 27:40
    We're going to start out with a
    very simple problem.
  • 27:45 - 27:45
    It's one of the oldest problems
    that has been studied in
  • 27:49 - 27:49
    algorithms, is the problem of
    sorting.
  • 27:53 - 27:53
    We're going to actually study
    this for several lectures
  • 27:57 - 27:57
    because sorting contains many
    algorithmic techniques.
  • 28:03 - 28:03
    The sorting problem is the
    following.
  • 28:10 - 28:10
    We have a sequence a_1,
    a_2 up to a_n of numbers as
  • 28:21 - 28:21
    input.
    And our output is a permutation
  • 28:28 - 28:28
    of those numbers.
  • 28:42 - 28:42
    A permutation is a
    rearrangement of the numbers.
  • 28:47 - 28:47
    Every number appears exactly
    once in the rearrangement such
  • 28:53 - 28:53
    that, I sometimes use a dollar
    sign to mean "such that," a_1 is
  • 29:00 - 29:00
    less than or equal to a_2 prime, and so on.
    Such that they are
  • 29:07 - 29:07
    monotonically increasing in
    size.
  • 29:11 - 29:11
    Take a bunch of numbers,
    put them in order.
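    In symbols (this is just the standard statement of what was said, with
    primes marking the output permutation):

        Input:  a sequence \langle a_1, a_2, \ldots, a_n \rangle of numbers
        Output: a permutation \langle a_1', a_2', \ldots, a_n' \rangle of the input
                such that a_1' \le a_2' \le \cdots \le a_n'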
  • 29:18 - 29:18
    Here's an algorithm to do it.
    It's called insertion sort.
  • 29:40 - 29:40
    And we will write this
    algorithm in what we call
  • 29:44 - 29:44
    pseudocode.
    It's sort of a programming
  • 29:47 - 29:47
    language, except it's got
    English in there often.
  • 29:52 - 29:52
    And it's just a shorthand for
    writing, for being precise.
  • 29:57 - 29:57
    So this sorts A from 1 to n.
    And here is the code for it.
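    The board code itself is not captured in the transcript; a minimal
    Python sketch of the procedure as described below, with 0-based lists
    standing in for the lecture's 1-based A[1..n], might look like this:

        def insertion_sort(A):
            """Sort the list A in place by insertion sort."""
            n = len(A)
            # Invariant: A[0..j-1] is sorted at the start of each iteration.
            for j in range(1, n):        # the lecture's j = 2 to n, shifted to 0-based
                key = A[j]               # pull the key out
                i = j - 1
                while i >= 0 and A[i] > key:
                    A[i + 1] = A[i]      # copy values up to make room
                    i -= 1
                A[i + 1] = key           # insert the key where it goes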
  • 30:59 - 30:59
    This is what we call
    pseudocode.
  • 31:02 - 31:02
    And if you don't understand the
    pseudocode then you should ask
  • 31:07 - 31:07
    questions about any of the
    notations.
  • 31:10 - 31:10
    You will start to get used to
    it as we go on.
  • 31:14 - 31:14
    One thing is that in the
    pseudocode we use indentation,
  • 31:18 - 31:18
    where in most languages they
    have some kind of begin-end
  • 31:23 - 31:23
    delimiters like curly braces or
    something in Java or C,
  • 31:28 - 31:28
    for example.
    We just use indentation.
  • 31:32 - 31:32
    The whole idea of the
    pseudocode is to try to get the
  • 31:35 - 31:35
    algorithms as short as possible
    while still understanding what
  • 31:40 - 31:40
    the individual steps are.
    In practice,
  • 31:42 - 31:42
    there actually have been
    languages that use indentation
  • 31:46 - 31:46
    as a means of showing the
    nesting of things.
  • 31:49 - 31:49
    It's generally a bad idea,
    because if things go over one
  • 31:53 - 31:53
    page to another,
    for example,
  • 31:55 - 31:55
    you cannot tell what level of
    nesting it is.
  • 31:59 - 31:59
    Whereas, with explicit braces
    it's much easier to tell.
  • 32:04 - 32:04
    So, there are reasons why this
    is a bad notation if you were
  • 32:09 - 32:09
    doing software engineering.
    But it's a good one for us
  • 32:14 - 32:14
    because it just keeps things
    short and makes fewer things to
  • 32:20 - 32:20
    write down.
    So, this is insertion sort.
  • 32:23 - 32:23
    Let's try to figure out a
    little bit what this does.
  • 32:29 - 32:29
    It basically takes an array A
    and at any point the thing to
  • 32:37 - 32:37
    understand is,
    we're setting basically,
  • 32:42 - 32:42
    we're running the outer loop
    from j is 2 to n,
  • 32:48 - 32:48
    and the inner loop that starts
    at j minus 1 and then goes down
  • 32:57 - 32:57
    until it's zero.
    Basically, if we look at any
  • 33:02 - 33:02
    point in the algorithm,
    we essentially are looking at
  • 33:07 - 33:07
    some element here j.
    A of j, the jth element.
  • 33:11 - 33:11
    And what we do essentially is
    we pull a value out here that we
  • 33:16 - 33:16
    call the key.
    And at this point the important
  • 33:20 - 33:20
    thing to understand,
    and we'll talk more about this
  • 33:24 - 33:24
    in recitation on Friday,
    is that there is an invariant
  • 33:29 - 33:29
    that is being maintained by this
    loop each time through.
  • 33:35 - 33:35
    And the invariant is that this
    part of the array is sorted.
  • 33:41 - 33:41
    And the goal each time through
    the loop is to increase,
  • 33:46 - 33:46
    is to add one to the length of
    the things that are sorted.
  • 33:51 - 33:51
    And the way we do that is we
    pull out the key and we just
  • 33:57 - 33:57
    copy values up like this.
    And keep copying up until we
  • 34:02 - 34:02
    find the place where this key
    goes, and then we insert it in
  • 34:08 - 34:08
    that place.
    And that's why it's called
  • 34:11 - 34:11
    insertion sort.
    We just sort of move the
  • 34:14 - 34:14
    things, copy the things up until
    we find where it goes,
  • 34:19 - 34:19
    and then we put it into place.
    And now we have A from
  • 34:24 - 34:24
    one to j sorted,
    and now we can work on j plus
  • 34:29 - 34:29
    one.
    Let's give an example of that.
  • 34:34 - 34:34
    Imagine we are doing 8,
    2, 4, 9, 3, 6.
  • 34:38 - 34:38
    We start out with j equals 2.
    And we figure out that we want
  • 34:45 - 34:45
    to insert it there.
    Now we have 2,
  • 34:50 - 34:50
    8, 4, 9, 3, 6.
    Then we look at the four and
  • 34:55 - 34:55
    say oh, well,
    that goes over here.
  • 35:00 - 35:00
    We get 2, 4,
    8, 9, 3, 6 after the second
  • 35:04 - 35:04
    iteration of the outer loop.
    Then we look at 9 and discover
  • 35:09 - 35:09
    immediately it just goes right
    there.
  • 35:12 - 35:12
    Very little work to do on that
    step.
  • 35:16 - 35:16
    So, we have exactly the same
    output after that iteration.
  • 35:21 - 35:21
    Then we look at the 3 and
    that's going to be inserted over
  • 35:26 - 35:26
    there.
    2, 3, 4, 8, 9,
  • 35:30 - 35:30
    6.
    And finally we look at the 6
  • 35:34 - 35:34
    and that goes in there.
    2, 3, 4, 6, 8,
  • 35:40 - 35:40
    9.
    And at that point we are done.
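    Running the sketch above on the board's example reproduces this trace,
    one state per pass of the outer loop:

        A = [8, 2, 4, 9, 3, 6]
        insertion_sort(A)
        # [2, 8, 4, 9, 3, 6]   after inserting 2
        # [2, 4, 8, 9, 3, 6]   after inserting 4
        # [2, 4, 8, 9, 3, 6]   after inserting 9 (already in place)
        # [2, 3, 4, 8, 9, 6]   after inserting 3
        # [2, 3, 4, 6, 8, 9]   after inserting 6
        print(A)               # [2, 3, 4, 6, 8, 9]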
  • 35:45 - 35:45
    Question?
  • 35:58 - 35:58
    The array initially starts at
    one, yes.
  • 36:02 - 36:02
    A[1...n], OK?
    So, this is the insertion sort
  • 36:06 - 36:06
    algorithm.
    And it's the first algorithm
  • 36:10 - 36:10
    that we're going to analyze.
    And we're going to pull out
  • 36:15 - 36:15
    some tools that we have from our
    math background to help to
  • 36:21 - 36:21
    analyze it.
    First of all,
  • 36:23 - 36:23
    let's take a look at the issue
    of running time.
  • 36:29 - 36:29
    The running time of this
    algorithm depends on a
  • 36:36 - 36:36
    lot of things.
    One thing it depends on is the
  • 36:42 - 36:42
    input itself.
    For example,
  • 36:45 - 36:45
    if the input is already sorted
    --
  • 36:55 - 36:55
    -- then insertion sort has very
    little work to do.
  • 37:00 - 37:00
    Because every time through it's
    going to be like this case.
  • 37:04 - 37:04
    It doesn't have to shuffle too
    many guys over because they're
  • 37:09 - 37:09
    already in place.
    Whereas, in some sense,
  • 37:12 - 37:12
    what's the worst case for
    insertion sort?
  • 37:15 - 37:15
    If it is reverse sorted then
    it's going to have to do a lot
  • 37:19 - 37:19
    of work because it's going to
    have to shuffle everything over
  • 37:24 - 37:24
    on each step of the outer loop.
    In addition to the actual input
  • 37:29 - 37:29
    it depends, of course,
    on the input size.
  • 37:38 - 37:38
    Here, for example,
    we did six elements.
  • 37:42 - 37:42
    It's going to take longer if
    we, for example,
  • 37:46 - 37:46
    do six times ten to the ninth
    elements.
  • 37:50 - 37:50
    If we were sorting a lot more
    stuff, it's going to take us a
  • 37:57 - 37:57
    lot longer.
    Typically, the way we handle
  • 38:01 - 38:01
    that is we are going to
    parameterize things in the input
  • 38:06 - 38:06
    size.
    We are going to talk about time
  • 38:11 - 38:11
    as a function of the size of
    things that we are sorting so we
  • 38:16 - 38:16
    can look at what is the behavior
    of that.
  • 38:20 - 38:20
    And the last thing I want to
    say about running time is
  • 38:25 - 38:25
    generally we want upper bounds on
    the running time.
  • 38:30 - 38:30
    We want to know that the time
    is no more than a certain
  • 38:34 - 38:34
    amount.
    And the reason is because that
  • 38:37 - 38:37
    represents a guarantee to the
    user.
  • 38:40 - 38:40
    If I say it's not going to run,
    for example,
  • 38:44 - 38:44
    if I tell you here's a program
    and it won't run more than three
  • 38:49 - 38:49
    seconds, that gives you real
    information about how you could
  • 38:53 - 38:53
    use it, for example,
    in a real-time setting.
  • 38:58 - 38:58
    Whereas, if I said here's a
    program and it goes at least
  • 39:03 - 39:03
    three seconds,
    you don't know if it's going to
  • 39:06 - 39:06
    go for three years.
    It doesn't give you that much
  • 39:10 - 39:10
    guarantee if you are a user of
    it.
  • 39:13 - 39:13
    Generally we want upper bounds
    because it represents a
  • 39:17 - 39:17
    guarantee to the user.
  • 39:30 - 39:30
    There are different kinds of
    analyses that people do.
  • 39:44 - 39:44
    The one we're mostly going to
    focus on is what's called
  • 39:53 - 39:53
    worst-case analysis.
    And this is what we do usually
  • 40:02 - 40:02
    where we define T of n to be the
    maximum time on any input of
  • 40:12 - 40:12
    size n.
    So, it's the maximum input,
  • 40:16 - 40:16
    the maximum it could possibly
    cost us on an input of size n.
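    In symbols, the worst-case definition just given is

        T(n) = \max \{\, \mathrm{time}(x) : x \text{ an input of size } n \,\}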
  • 40:20 - 40:20
    What that does is,
    if you look at the fact that
  • 40:23 - 40:23
    sometimes the inputs are better
    and sometimes they're worse,
  • 40:27 - 40:27
    we're looking at the worst case
    of those because that's the way
  • 40:31 - 40:31
    we're going to be able to make a
    guarantee.
  • 40:34 - 40:34
    It always does something rather
    than just sometimes does
  • 40:38 - 40:38
    something.
    So, we're looking at the
  • 40:40 - 40:40
    maximum.
    Notice that if I didn't have
  • 40:43 - 40:43
    maximum then T(n) in some sense
    is a relation,
  • 40:46 - 40:46
    not a function,
    because the time on an input of
  • 40:50 - 40:50
    size n depends on which input of
    size n.
  • 40:52 - 40:52
    I could have many different
    times, but by putting the
  • 40:56 - 40:56
    maximum at it,
    it turns that relation into a
  • 40:59 - 40:59
    function because there's only
    one maximum time that it will
  • 41:03 - 41:03
    take.
    Sometimes we will talk about
  • 41:12 - 41:12
    average case.
    Sometimes we will do this.
  • 41:22 - 41:22
    Here T of n is then the
    expected time over all inputs of
  • 41:35 - 41:35
    size n.
    It's the expected time.
  • 41:40 - 41:40
    Now, if I talk about expected
    time, what else do I need to say
  • 41:45 - 41:45
    here?
    What does that mean,
  • 41:47 - 41:47
    expected time?
    I'm sorry.
  • 41:49 - 41:49
    Raise your hand.
    Expected inputs.
  • 41:52 - 41:52
    What does that mean,
    expected inputs?
  • 42:05 - 42:05
    I need more math.
    What do I need by expected time
  • 42:10 - 42:10
    here, math?
    You have to take the time of
  • 42:14 - 42:14
    every input and then average
    them, OK.
  • 42:18 - 42:18
    That's kind of what we mean by
    expected time.
  • 42:23 - 42:23
    Good.
    Not quite.
  • 42:24 - 42:24
    I mean, what you say is
    completely correct,
  • 42:29 - 42:29
    except it's not quite enough.
    Yeah?
  • 42:34 - 42:34
    It's the time of every input
    times the probability that it
  • 42:39 - 42:39
    will be that input.
    It's a way of taking a weighted
  • 42:45 - 42:45
    average, exactly right.
    How do I know what the
  • 42:49 - 42:49
    probability of every input is?
    How do I know what the
  • 42:55 - 42:55
    probability a particular input
    occurs is in a given situation?
  • 43:02 - 43:02
    I don't.
    I have to make an assumption.
  • 43:06 - 43:06
    What's that assumption called?
    What kind of assumption do I
  • 43:12 - 43:12
    have to meet?
    I need an assumption --
  • 43:24 - 43:24
    -- of the statistical
    distribution of inputs.
  • 43:28 - 43:28
    Otherwise, expected time
    doesn't mean anything because I
  • 43:34 - 43:34
    don't know what the probability
    of something is.
  • 43:38 - 43:38
    In order to do probability,
    you need some assumptions and
  • 43:44 - 43:44
    you've got to state those
    assumptions clearly.
  • 43:48 - 43:48
    One of the most common
    assumptions is that all inputs
  • 43:53 - 43:53
    are equally likely.
    That's called the uniform
  • 43:57 - 43:57
    distribution.
    Every input of size n is
  • 44:02 - 44:02
    equally likely,
    that kind of thing.
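    Putting that discussion into one formula: the expected time is the
    probability-weighted average over inputs, and the uniform assumption
    pins down the weights:

        T(n) = \sum_{x \,:\, |x| = n} \Pr[x] \cdot \mathrm{time}(x),
        \qquad \Pr[x] = 1 / \#\{\text{inputs of size } n\} \text{ (uniform)}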
  • 44:05 - 44:05
    But there are other ways that
    you could make that assumption,
  • 44:09 - 44:09
    and they may not all be true.
    This is much more complicated,
  • 44:14 - 44:14
    as you can see.
    Fortunately,
  • 44:16 - 44:16
    all of you have a strong
    probability background.
  • 44:20 - 44:20
    And so we will not have any
    trouble addressing these
  • 44:24 - 44:24
    probabilistic issues of dealing
    with expectations and such.
  • 44:30 - 44:30
    If you don't,
    time to go and say gee,
  • 44:34 - 44:34
    maybe I should take that
    Probability class that is a
  • 44:40 - 44:40
    prerequisite for this class.
    The last one I am going to
  • 44:47 - 44:47
    mention is best-case analysis.
    And this I claim is bogus.
  • 44:53 - 44:53
    Bogus.
    No good.
  • 44:55 - 44:55
    Why is best-case analysis
    bogus?
  • 45:00 - 45:00
    Yeah?
    The best-case probably doesn't
  • 45:03 - 45:03
    ever happen.
    Actually, it's interesting
  • 45:06 - 45:06
    because for the sorting problem,
    the most common things that get
  • 45:11 - 45:11
    sorted are things that are
    already sorted interestingly,
  • 45:15 - 45:15
    or at least almost sorted.
    For example,
  • 45:18 - 45:18
    one of the most common things
    that are sorted is check numbers
  • 45:23 - 45:23
    by banks.
    They tend to come in,
  • 45:26 - 45:26
    in the same order that they are
    written.
  • 45:30 - 45:30
    They're sorting things that are
    almost always sorted.
  • 45:37 - 45:37
    I mean, it's good.
    When upper bound,
  • 45:41 - 45:41
    not lower bound?
    Yeah, you want to make a
  • 45:46 - 45:46
    guarantee.
    And so why is this not a
  • 45:51 - 45:51
    guarantee?
    You're onto something there,
  • 45:55 - 45:55
    but we need a little more
    precision here.
  • 46:02 - 46:02
    How can I cheat?
    Yeah?
  • 46:04 - 46:04
    Yeah, you can cheat.
    You cheat.
  • 46:08 - 46:08
    You take any slow algorithm
    that you want and just check for
  • 46:14 - 46:14
    some particular input,
    and if it's that input,
  • 46:19 - 46:19
    then you say immediately yeah,
    OK, here is the answer.
  • 46:25 - 46:25
    And then it's got a good
    best-case.
  • 46:30 - 46:30
    But I didn't tell you anything
    about the vast majority of what
  • 46:37 - 46:37
    is going on.
    So, you can cheat with a slow
  • 46:42 - 46:42
    algorithm that works fast on
    some input.
  • 46:47 - 46:47
    It doesn't really do much for
    you so we normally don't worry
  • 46:54 - 46:54
    about that.
    Let's see.
  • 46:56 - 46:56
    What is insertion sort's
    worst-case time?
  • 47:02 - 47:02
    Now we get into some sort of
    funny issues.
  • 47:07 - 47:07
    First of all,
    it sort of depends on the
  • 47:12 - 47:12
    computer you're running on.
    Whose computer,
  • 47:18 - 47:18
    right?
    Is it a big supercomputer or is
  • 47:23 - 47:23
    it your wristwatch?
    They have different
  • 47:28 - 47:28
    computational abilities.
    And when we compare algorithms,
  • 47:34 - 47:34
    we compare them typically for
    relative speed.
  • 47:38 - 47:38
    This is if you compared two
    algorithms on the same machine.
  • 47:42 - 47:42
    You could argue,
    well, it doesn't really matter
  • 47:46 - 47:46
    what the machine is because I
    will just look at their relative
  • 47:51 - 47:51
    speed.
    But, of course,
  • 47:52 - 47:52
    I may also be interested in
    absolute speed.
  • 47:56 - 47:56
    Is one algorithm actually
    better no matter what machine
  • 48:00 - 48:00
    it's run on?
  • 48:08 - 48:08
    And so this kind of gets sort
    of confusing as to how I can
  • 48:13 - 48:13
    talk about the worst-case time
    of an algorithm of a piece of
  • 48:18 - 48:18
    software when I am not talking
    about the hardware because,
  • 48:23 - 48:23
    clearly, if I had run on a
    faster machine,
  • 48:27 - 48:27
    my algorithms are going to go
    faster.
  • 48:30 - 48:30
    So, this is where you get the
    big idea of algorithms.
  • 48:36 - 48:36
    Which is why algorithms is such
    a huge field,
  • 48:39 - 48:39
    why it spawns companies like
    Google, like Akamai,
  • 48:43 - 48:43
    like Amazon.
    Why algorithmic analysis,
  • 48:46 - 48:46
    throughout the history of
    computing, has been such a huge
  • 48:51 - 48:51
    success, is our ability to
    master and to be able to take
  • 48:55 - 48:55
    what is apparently a really
    messy, complicated situation and
  • 49:00 - 49:00
    reduce it to being able to do
    some mathematics.
  • 49:05 - 49:05
    And that idea is called
    asymptotic analysis.
  • 49:17 - 49:17
    And the basic idea of
    asymptotic analysis is to ignore
  • 49:22 - 49:22
    machine-dependent constants --
  • 49:34 - 49:34
    -- and, instead of the actual
    running time,
  • 49:39 - 49:39
    look at the growth of the
    running time.
  • 49:59 - 49:59
    So, we don't look at the actual
    running time.
  • 50:03 - 50:03
    We look at the growth.
    Let's see what we mean by that.
  • 50:07 - 50:07
    This is a huge idea.
    It's not a hard idea,
  • 50:11 - 50:11
    otherwise I wouldn't be able to
    teach it in the first lecture,
  • 50:16 - 50:16
    but it's a huge idea.
    We are going to spend a couple
  • 50:20 - 50:20
    of lectures understanding the
    implications of that and will
  • 50:25 - 50:25
    basically be doing it throughout
    the term.
  • 50:30 - 50:30
    And if you go on to be
    practicing engineers,
  • 50:34 - 50:34
    you will be doing it all the
    time.
  • 50:36 - 50:36
    In order to do that,
    we adopt some notations that
  • 50:41 - 50:41
    are going to help us.
    In particular,
  • 50:44 - 50:44
    we will adopt asymptotic
    notation.
  • 50:46 - 50:46
    Most of you have seen some kind
    of asymptotic notation.
  • 50:51 - 50:51
    Maybe a few of you haven't,
    but mostly you should have seen
  • 50:56 - 50:56
    a little bit.
    The one we're going to be using
  • 51:00 - 51:00
    in this class predominantly is
    theta notation.
  • 51:05 - 51:05
    And theta notation is pretty
    easy notation to master because
  • 51:13 - 51:13
    all you do is,
    from a formula,
  • 51:16 - 51:16
    just drop low order terms and
    ignore leading constants.
  • 51:30 - 51:30
    For example,
    if I have a formula like 3n^3 +
  • 51:36 - 51:36
    90n^2 - 5n + 6046,
    I say, well,
  • 51:40 - 51:40
    what low-order terms do I drop?
    Well, n^3 is a bigger term
  • 51:48 - 51:48
    than n^2.
    I am going to drop all these
  • 51:53 - 51:53
    terms and ignore the leading
    constant, so I say that's
  • 52:00 - 52:00
    Theta(n^3).
    That's pretty easy.
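    Written out, the manipulation just described is

        3n^3 + 90n^2 - 5n + 6046 = \Theta(n^3)

    (drop the low-order terms 90n^2, -5n, and 6046, then ignore the
    leading constant 3).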
  • 52:05 - 52:05
    So, that's theta notation.
    Now, this is an engineering way
  • 52:09 - 52:09
    of manipulating theta notation.
    There is actually a
  • 52:13 - 52:13
    mathematical definition for
    this, which we are going to talk
  • 52:18 - 52:18
    about next time,
    which is a definition in terms
  • 52:22 - 52:22
    of sets of functions.
    And, you are going to be
  • 52:26 - 52:26
    responsible, this is both a math
    and a computer science
  • 52:30 - 52:30
    engineering class.
    Throughout the course you are
  • 52:35 - 52:35
    going to be responsible both for
    mathematical rigor as if it were
  • 52:39 - 52:39
    a math course and engineering
    commonsense because it's an
  • 52:43 - 52:43
    engineering course.
    We are going to be doing both.
  • 52:46 - 52:46
    This is the engineering way of
    understanding what you do,
  • 52:50 - 52:50
    so you're responsible for being
    able to do these manipulations.
  • 52:54 - 52:54
    You're also going to be
    responsible for understanding
  • 52:58 - 52:58
    the mathematical definition of
    theta notation and of its related
  • 53:02 - 53:02
    O notation and omega notation.
    If I take a look as n
  • 53:08 - 53:08
    approaches infinity,
    a Theta(n^2) algorithm always
  • 53:15 - 53:15
    beats, eventually,
    a Theta(n^3) algorithm.
  • 53:20 - 53:20
    As n gets bigger,
    it doesn't matter what these
  • 53:27 - 53:27
    other terms were if I were
    describing the absolute precise
  • 53:34 - 53:34
    behavior in terms of a formula.
    If I had a Theta(n^2)
  • 53:42 - 53:42
    algorithm, it would always be
    faster for sufficiently large n
  • 53:46 - 53:46
    than a Theta(n^3) algorithm.
    It wouldn't matter what those
  • 53:51 - 53:51
    low-order terms were.
    It wouldn't matter what the
  • 53:54 - 53:54
    leading constant was.
    This one will always be faster.
  • 53:59 - 53:59
    Even if you ran the Theta(n^2)
    algorithm on a slow computer and
  • 54:05 - 54:05
    the Theta(n^3) algorithm on a
    fast computer.
  • 54:09 - 54:09
    The great thing about
    asymptotic notation is it
  • 54:14 - 54:14
    satisfies our issue of being
    able to compare both relative
  • 54:20 - 54:20
    and absolute speed,
    because we are able to do this
  • 54:24 - 54:24
    no matter what the computer
    platform.
  • 54:29 - 54:29
    On different platforms we may
    get different constants here,
  • 54:35 - 54:35
    machine-dependent constants for
    the actual running time,
  • 54:40 - 54:40
    but if I look at the growth as
    the size of the input gets
  • 54:45 - 54:45
    larger, the asymptotics
    generally won't change.
  • 54:50 - 54:50
    For example,
    I will just draw that as a
  • 54:54 - 54:54
    picture.
    If I have n on this axis and
  • 54:57 - 54:57
    T(n) on this axis.
    This may be,
  • 55:01 - 55:01
    for example,
    a Theta(n^3) algorithm and this
  • 55:05 - 55:05
    may be a Theta(n^2) algorithm.
    There is always going to be
  • 55:09 - 55:09
    some point n_0 where for
    everything larger the Theta(n^2)
  • 55:14 - 55:14
    algorithm is going to be cheaper
    than the Theta(n^3) algorithm
  • 55:19 - 55:19
    no matter how much advantage
    you give it at the beginning in
  • 55:24 - 55:24
    terms of the speed of the
    computer you are running on.
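    As a toy illustration with made-up machine constants (the 0.01 and 100
    below are assumptions for the sake of the picture, not measurements):

        # Theta(n^3) algorithm on a fast machine vs. Theta(n^2) on a slow one.
        fast_cubic = lambda n: 0.01 * n**3
        slow_square = lambda n: 100 * n**2

        for n in [10, 100, 10_000, 20_000]:
            print(n, fast_cubic(n) < slow_square(n))
        # Prints True, True, False, False: the fast machine's head start
        # wins for small n, but at and beyond the crossover n_0 = 10,000
        # the Theta(n^2) algorithm is cheaper whatever the constants.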
  • 55:30 - 55:30
    Now, from an engineering point
    of view, there are some issues
  • 55:34 - 55:34
    we have to deal with because
    sometimes it could be that that
  • 55:39 - 55:39
    n_0 is so large that the
    computers aren't big enough to
  • 55:43 - 55:43
    run the problem.
    That's why we,
  • 55:45 - 55:45
    nevertheless,
    are interested in some of the
  • 55:48 - 55:48
    slower algorithms,
    because some of the slower
  • 55:51 - 55:51
    algorithms, even though they may
    not asymptotically be slower,
  • 55:56 - 55:56
    I mean asymptotically they will
    be slower.
  • 56:00 - 56:00
    They may still be faster on
    reasonable sizes of things.
  • 56:04 - 56:04
    And so we have to both balance
    our mathematical understanding
  • 56:08 - 56:08
    with our engineering commonsense
    in order to do good programming.
  • 56:12 - 56:12
    So, just having done analysis
    of algorithms doesn't
  • 56:16 - 56:16
    automatically make you a good
    programmer.
  • 56:18 - 56:18
    You also need to learn how to
    program and use these tools in
  • 56:23 - 56:23
    practice to understand when they
    are relevant and when they are
  • 56:27 - 56:27
    not relevant.
    There is a saying.
  • 56:30 - 56:30
    If you want to be a good
    programmer, you just program every
  • 56:35 - 56:35
    day for two years,
    you will be an excellent
  • 56:38 - 56:38
    programmer.
    If you want to be a world-class
  • 56:42 - 56:42
    programmer, you can program
    every day for ten years,
  • 56:47 - 56:47
    or you can program every day
    for two years and take an
  • 56:51 - 56:51
    algorithms class.
    Let's get back to what we were
  • 56:55 - 56:55
    doing, which is analyzing
    insertion sort.
  • 57:00 - 57:00
    We are going to look at the
    worst case.
  • 57:16 - 57:16
    Which, as we mentioned before,
    is when the input is reverse
  • 57:21 - 57:21
    sorted.
    The biggest element comes first
  • 57:25 - 57:25
    and the smallest last because
    now every time you do the
  • 57:30 - 57:30
    insertion you've got to shuffle
    everything over.
  • 57:35 - 57:35
    You can write down the running
    time by looking at the nesting
  • 57:39 - 57:39
    of loops.
    What we do is we sum up.
  • 57:41 - 57:41
    What we assume is that every
    operation, every elementary
  • 57:44 - 57:44
    operation is going to take some
    constant amount of time.
  • 57:47 - 57:47
    But we don't have to worry
    about what that constant is
  • 57:50 - 57:50
    because we're going to be doing
    asymptotic analysis.
  • 57:53 - 57:53
    As I say, the beauty of the
    method is that it causes all
  • 57:57 - 57:57
    these things that are real
    distinctions to sort of vanish.
  • 58:01 - 58:01
    We sort of look at them from
    30,000 feet rather than from
  • 58:05 - 58:05
    three millimeters or something.
    Each of these operations is
  • 58:10 - 58:10
    going to sort of be a basic
    operation.
  • 58:13 - 58:13
    One way to think about this,
    in terms of counting
  • 58:17 - 58:17
    operations, is counting memory
    references.
  • 58:20 - 58:20
    How many times do you actually
    access some variable?
  • 58:24 - 58:24
    That's another way of sort of
    thinking about this model.
  • 58:29 - 58:29
    When we do that,
    well, we're going to go through
  • 58:34 - 58:34
    this loop, j is going from 2 to
    n, and then we're going to add
  • 58:40 - 58:40
    up the work that we do within
    the loop.
  • 58:44 - 58:44
    We can sort of write that in
    math as summation of j equals 2
  • 58:50 - 58:50
    to n.
    And then what is the work that
  • 58:53 - 58:53
    is going on in this loop?
    Well, the work that is going on
  • 58:59 - 58:59
    in this loop varies,
    but in the worst case how many
  • 59:04 - 59:04
    operations are going on here for
    each value of j?
  • 59:10 - 59:10
    For a given value of j,
    how much work goes on in this
  • 59:17 - 59:17
    loop?
    Can somebody tell me?
  • 59:21 - 59:21
    Asymptotically.
    It's j times some constant,
  • 59:27 - 59:27
    so it's Theta(j).
    So, there is Theta(j) work going
  • 59:33 - 59:33
    on here because this loop starts
    out with i being j minus 1,
  • 59:39 - 59:39
    and then it's doing just a
    constant amount of stuff for
  • 59:44 - 59:44
    each step of the value of i,
    and i is running from j minus
  • 59:50 - 59:50
    one down to zero.
    So, we can say that is Theta(j)
  • 59:55 - 59:55
    work that is going on.
    Do people follow that?
  • 60:00 - 60:00
    OK.
    And now we have a formula we
  • 60:03 - 60:03
    can evaluate.
    What is the evaluation?
  • 60:06 - 60:06
    If I want to simplify this
    formula, what is that equal to?
  • 60:20 - 60:20
    Sorry. In the back there.
  • 60:28 - 60:28
    Yeah. OK.
    That's just Theta(n^2),
  • 60:34 - 60:34
    good.
    Because when you say it is
  • 60:36 - 60:36
    the sum of consecutive numbers,
    you mean what?
  • 60:40 - 60:40
    What's the mathematical term we
    have for that so we can
  • 60:43 - 60:43
    communicate?
    You've got to know these things
  • 60:47 - 60:47
    so you can communicate.
    It's called what type of
  • 60:50 - 60:50
    sequence?
    It's actually a series,
  • 60:52 - 60:52
    but that's OK.
    What type of series is this
  • 60:56 - 60:56
    called?
    Arithmetic series,
  • 60:57 - 60:57
    good.
    Wow, we've got some sharp
  • 61:00 - 61:00
    people who can communicate.
    This is an arithmetic series.
  • 61:06 - 61:06
    You're basically summing 1 + 2
    + 3 + 4, some constants in
  • 61:12 - 61:12
    there, but basically it's 1 + 2
    + 3 + 4 + 5 + 6 up to n.
  • 61:17 - 61:17
    That's Theta(n^2).
    If you don't know this math,
  • 61:22 - 61:22
    there is a chapter in the book,
    or you could have taken the
  • 61:28 - 61:28
    prerequisite.
    Arithmetic series.
  • 61:31 - 61:31
    People have this vague
    recollection.
  • 61:34 - 61:34
    Oh, yeah.
    Good.
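    Spelled out in LaTeX, the evaluation he is describing is the
    standard arithmetic-series bound:

        \sum_{j=2}^{n} \Theta(j)
          = \Theta\Big( \sum_{j=2}^{n} j \Big)
          = \Theta\Big( \frac{n(n+1)}{2} - 1 \Big)
          = \Theta(n^2).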
  • 61:35 - 61:35
    Now, you have to learn these
    manipulations.
  • 61:38 - 61:38
    We will talk about it a bit next
    time, but you have to learn your
  • 61:43 - 61:43
    theta manipulations for what
    works with theta.
  • 61:46 - 61:46
    And you have to be very careful
    because theta is a weak
  • 61:50 - 61:50
    notation.
    A strong notation is something
  • 61:53 - 61:53
    like Leibniz notation from
    calculus where the chain rule is
  • 61:57 - 61:57
    just canceling two things.
    It's just fabulous that you can
  • 62:02 - 62:02
    cancel in the chain rule.
    And Leibniz notation just
  • 62:05 - 62:05
    expresses that so directly you
    can manipulate.
  • 62:08 - 62:08
    Theta notation is not like
    that.
  • 62:09 - 62:09
    If you think it is like that
    you are in trouble.
  • 62:12 - 62:12
    You really have to think of
    what is going on under the theta
  • 62:16 - 62:16
    notation.
    And it is more of a descriptive
  • 62:18 - 62:18
    notation than it is a
    manipulative notation.
  • 62:21 - 62:21
    There are manipulations you can
    do with it, but unless you
  • 62:24 - 62:24
    understand what is really going
    on under the theta notation you
  • 62:28 - 62:28
    will find yourself in trouble.
    And next time we will talk a
  • 62:33 - 62:33
    little bit more about theta
    notation.
  • 62:36 - 62:36
    Is insertion sort fast?
  • 62:49 - 62:49
    Well, it turns out for small n
    it is moderately fast.
  • 63:02 - 63:02
    But it is not at all fast for large
    n.
  • 63:18 - 63:18
    So, I am going to give you an
    algorithm that is faster.
  • 63:22 - 63:22
    It's called merge sort.
    I wonder if I should leave
  • 63:25 - 63:25
    insertion sort up.
    Why not.
  • 63:46 - 63:46
    I am going to write on this
    later, so if you are taking
  • 63:52 - 63:52
    notes, leave some space on the
    left.
  • 63:56 - 63:56
    Here is merge sort of an array
    A from 1 up to n.
  • 64:02 - 64:02
    And it is basically three
    steps.
  • 64:06 - 64:06
    If n equals 1 we are done.
    Sorting one element,
  • 64:12 - 64:12
    it is already sorted.
    All right.
  • 64:16 - 64:16
    Recursive algorithm.
    Otherwise, what we do is we
  • 64:22 - 64:22
    recursively sort A from 1 up to
    the ceiling of n over 2.
  • 64:30 - 64:30
    And the array A from the ceiling
    of n over 2 plus one up to n.
  • 64:39 - 64:39
    So, we sort two halves of the
    input.
  • 64:45 - 64:45
    And then, three,
    we take those two lists that we
  • 64:52 - 64:52
    have done and we merge them.
  • 65:03 - 65:03
    And, to do that,
    we use a merge subroutine which
  • 65:06 - 65:06
    I will show you.
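    A minimal Python sketch of those three steps (this version
    returns a new sorted list rather than sorting A[1 .. n] in
    place, and it relies on a merge helper like the one sketched
    after the walkthrough below):

        import math

        def merge_sort(A):
            n = len(A)
            if n <= 1:                     # step 1: one element is already sorted
                return A
            mid = math.ceil(n / 2)         # split at the ceiling of n/2
            left = merge_sort(A[:mid])     # step 2: sort the two halves
            right = merge_sort(A[mid:])
            return merge(left, right)      # step 3: merge the sorted halves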
  • 65:14 - 65:14
    The key subroutine here is
    merge, and it works like this.
  • 65:20 - 65:20
    I have two lists.
    Let's say one of them is 20.
  • 65:26 - 65:26
    I am doing this in reverse
    order.
  • 65:30 - 65:30
    I have sorted this like this.
    And then I sort another one.
  • 65:35 - 65:35
    I don't know why I do it in this
    order, but anyway.
  • 65:40 - 65:40
    Here is my other list.
    I have my two lists that I have
  • 65:45 - 65:45
    sorted.
    So, this is A[1] to A[⌈n/2⌉]
  • 65:48 - 65:48
    and A[⌈n/2⌉+1] to A[n] for the
    way it will be called in this
  • 65:54 - 65:54
    program.
    And now to merge these two,
  • 65:57 - 65:57
    what I want to do is produce a
    sorted list out of both of them.
  • 66:04 - 66:04
    What I do is first observe
    where is the smallest element of
  • 66:09 - 66:09
    any two lists that are already
    sorted?
  • 66:12 - 66:12
    It's in one of two places,
    the head of the first list or
  • 66:16 - 66:16
    the head of the second list.
    I look at those two elements
  • 66:21 - 66:21
    and say which one is smaller?
    This one is smaller.
  • 66:25 - 66:25
    Then what I do is output into
    my output array the smaller of
  • 66:29 - 66:29
    the two.
    And I cross it off.
  • 66:32 - 66:32
    And now where is the next
    smallest element?
  • 66:36 - 66:36
    And the answer is it's going to
    be the head of one of these two
  • 66:40 - 66:40
    lists.
    Then I cross out this guy and
  • 66:43 - 66:43
    put him here and circle this
    one.
  • 66:46 - 66:46
    Now I look at these two guys.
    This one is smaller so I output
  • 66:50 - 66:50
    that and circle that one.
    Now I look at these two guys,
  • 66:54 - 66:54
    output 9.
    So, every step here is some
  • 66:57 - 66:57
    fixed number of operations that
    is independent of the size of
  • 67:02 - 67:02
    the arrays at each step.
    Each individual step is just me
  • 67:08 - 67:08
    looking at two elements and
    picking out the smallest and
  • 67:14 - 67:14
    advancing some pointers into the
    array so that I know where the
  • 67:20 - 67:20
    current head of that list is.
    And so, therefore,
  • 67:25 - 67:25
    the time is order n on n total
    elements.
  • 67:30 - 67:30
    The time to actually go through
    this and merge two lists is
  • 67:35 - 67:35
    order n.
    We sometimes call this linear
  • 67:38 - 67:38
    time because it's not quadratic
    or whatever.
  • 67:41 - 67:41
    It is proportional to n,
    proportional to the input size.
  • 67:45 - 67:45
    It's linear time.
    I go through and just do this
  • 67:49 - 67:49
    simple operation,
    just working up these lists,
  • 67:53 - 67:53
    and in the end I have done
    essentially n operations,
  • 67:57 - 67:57
    order n operations each of
    which costs constant time.
  • 68:02 - 68:02
    That's a total of order n time.
    Everybody with me?
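    Here is a minimal sketch of that merge in Python; i and j
    play the role of the pointers into the two sorted lists:

        def merge(L, R):
            out = []
            i = j = 0
            # Compare the two current heads and output the smaller;
            # each step is a constant amount of work.
            while i < len(L) and j < len(R):
                if L[i] <= R[j]:
                    out.append(L[i])
                    i += 1
                else:
                    out.append(R[j])
                    j += 1
            out.extend(L[i:])   # one list is exhausted; the other's
            out.extend(R[j:])   # tail is already sorted
            return out          # Theta(n) total for n elements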
  • 68:07 - 68:07
    OK.
    So, this is a recursive
  • 68:10 - 68:10
    program.
    We can actually now write what
  • 68:13 - 68:13
    is called a recurrence for this
    program.
  • 68:17 - 68:17
    The way we do that is say let's
    let the time to sort n elements
  • 68:24 - 68:24
    be T(n).
    Then how long does it take to
  • 68:28 - 68:28
    do step one?
  • 68:35 - 68:35
    That's just constant time.
    We just check to see if n is 1,
  • 68:39 - 68:39
    and if it is we return.
    That's independent of the size
  • 68:43 - 68:43
    of anything that we are doing.
    It just takes a certain number
  • 68:48 - 68:48
    of machine instructions on
    whatever machine and we say it
  • 68:52 - 68:52
    is constant time.
    We call that theta one.
  • 68:55 - 68:55
    This is actually a little bit
    of an abuse if you get into it.
  • 69:00 - 69:00
    And the reason is because
    typically in order to say it you
  • 69:04 - 69:04
    need to say what it is growing
    with.
  • 69:07 - 69:07
    Nevertheless,
    we use this as an abuse of the
  • 69:11 - 69:11
    notation just to mean it is a
    constant.
  • 69:14 - 69:14
    So, that's an abuse just so
    people know.
  • 69:17 - 69:17
    But it simplifies things if I
    can just write theta one.
  • 69:21 - 69:21
    And it basically means the same
    thing.
  • 69:24 - 69:24
    Now we recursively sort these
    two things.
  • 69:27 - 69:27
    How can I describe that?
    The time to do this,
  • 69:33 - 69:33
    I can describe recursively as
    T(⌈n/2⌉) + T(n - ⌈n/2⌉).
  • 69:41 - 69:41
    That is actually kind of messy,
  • 69:48 - 69:48
    so what we will do is just be
    sloppy and write 2T(n/2).
  • 69:55 - 69:55
    So, this is just us being
    sloppy.
  • 70:00 - 70:00
    And we will see on Friday in
    recitation that it is OK to be
  • 70:04 - 70:04
    sloppy.
    That's the great thing about
  • 70:07 - 70:07
    algorithms.
    As long as you are rigorous and
  • 70:10 - 70:10
    precise, you can be as sloppy as
    you want.
  • 70:13 - 70:13
    [LAUGHTER] This is sloppy
    because I didn't worry about
  • 70:16 - 70:16
    what was going on,
    because it turns out it doesn't
  • 70:20 - 70:20
    make any difference.
    And we are going to actually
  • 70:23 - 70:23
    see that that is the case.
    And, finally,
  • 70:27 - 70:27
    I have to merge the two sorted
    lists which have a total of n
  • 70:30 - 70:30
    elements.
    And we just analyze that using
  • 70:32 - 70:32
    the merge subroutine.
    And that takes us to theta n
  • 70:34 - 70:34
    time.
  • 70:40 - 70:40
    That allows us now to write a
    recurrence for the performance
  • 70:44 - 70:44
    of merge sort.
  • 70:57 - 70:57
    Which is to say that T of n is
    equal to theta 1 if n equals 1
  • 71:05 - 71:05
    and 2T of n over 2 plus theta of
    n if n is bigger than 1.
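    Written out in notation, the recurrence on the board is

        T(n) =
        \begin{cases}
          \Theta(1)             & \text{if } n = 1, \\
          2\,T(n/2) + \Theta(n) & \text{if } n > 1.
        \end{cases}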
  • 71:12 - 71:12
    Because either I am doing step
    one or I am doing all steps one,
  • 71:20 - 71:20
    two and three.
    Here I am doing step one and I
  • 71:26 - 71:26
    return and I am done.
    Or else I am doing step one,
  • 71:32 - 71:32
    I don't return,
    and then I also do steps two
  • 71:36 - 71:36
    and three.
    So, I add those together.
  • 71:39 - 71:39
    I could say theta n plus theta
    1, but theta n plus theta 1 is
  • 71:44 - 71:44
    just theta n because theta 1 is
    a lower order term than theta n
  • 71:49 - 71:49
    and I can throw it away.
    It is either theta 1 or it is
  • 71:53 - 71:53
    2T of n over 2 plus theta n.
    Now, typically we won't be
  • 71:58 - 71:58
    writing this.
    Usually we omit this.
  • 72:01 - 72:01
    If it makes no difference to
    the solution of the recurrence,
  • 72:06 - 72:06
    we will usually omit constant
    base cases.
  • 72:08 - 72:08
    In algorithms,
    it's not true generally in
  • 72:11 - 72:11
    mathematics, but in algorithms
    if you are running something on
  • 72:16 - 72:16
    a constant-size input it always
    takes constant time.
  • 72:19 - 72:19
    So, we don't worry about what
    this value is.
  • 72:22 - 72:22
    And it turns out it has no real
    impact on the asymptotic
  • 72:26 - 72:26
    solution of the recurrence.
    How do we solve a recurrence
  • 72:31 - 72:31
    like this?
    I now have T of n expressed in
  • 72:35 - 72:35
    terms of T of n over 2.
    That's in the book and it is
  • 72:39 - 72:39
    also in Lecture 2.
    We are going to solve that in
  • 72:43 - 72:43
    Lecture 2, but in the meantime
    what I am going to do is give
  • 72:48 - 72:48
    you a visual way of
    understanding what this costs,
  • 72:52 - 72:52
    which is one of the techniques
    we will elaborate on next time.
  • 72:58 - 72:58
    It is called a recursion tree
    technique.
  • 73:02 - 73:02
    And I will use it for the
    actual recurrence that is almost
  • 73:08 - 73:08
    the same: 2T(n/2)
    plus some constant c times n,
  • 73:12 - 73:12
    where c is a constant greater
    than zero.
  • 73:17 - 73:17
    I am going to write the constant
    explicitly, because I want
  • 73:23 - 73:23
    you to see where it occurs.
    So, we are going to look at
  • 73:26 - 73:26
    this recurrence with a base case
    of order one.
  • 73:32 - 73:32
    I am just making the constant
    in here, the upper bound on the
  • 73:36 - 73:36
    constant be explicit rather than
    implicit.
  • 73:39 - 73:39
    And the way you do a recursion
    tree is the following.
  • 73:43 - 73:43
    You start out by writing down
    the left-hand side of the
  • 73:47 - 73:47
    recurrence.
    And then what you do is you say
  • 73:50 - 73:50
    well, that is equal to,
    and now let's write it as a
  • 73:54 - 73:54
    tree.
    I do cn work plus now I am
  • 73:56 - 73:56
    going to have to do work on each
    of my two children.
  • 74:01 - 74:01
    T of n over 2 and T of n over
    2.
  • 74:05 - 74:05
    If I sum up what is in here,
    I get this because that is what
  • 74:11 - 74:11
    the recurrence says,
    T(n)=2T(n/2)+cn.
  • 74:15 - 74:15
    I have 2T(n/2)+cn.
    Then I do it again.
  • 74:20 - 74:20
    I have cn here.
    I now have here cn/2.
  • 74:24 - 74:24
    And here is cn/2.
    And each of these now has a
  • 74:29 - 74:29
    T(n/4).
  • 74:36 - 74:36
    And these each have a T(n/4).
    And this has a T(n/4).
  • 74:43 - 74:43
    And I keep doing that,
    the dangerous dot,
  • 74:49 - 74:49
    dot, dots.
    And, if I keep doing that,
  • 74:54 - 74:54
    I end up with it looking like
    this.
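    The picture being drawn looks roughly like this (a text
    rendering of the recursion tree being described):

                             cn
                           /    \
                       cn/2      cn/2
                      /   \      /   \
                  cn/4  cn/4  cn/4  cn/4
                   ...                ...
                 T(1) T(1)  . . .  T(1) T(1)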
  • 75:18 - 75:18
    And I keep going down until I
    get to a leaf.
  • 75:23 - 75:23
    And a leaf, I have essentially
    a T(1).
  • 75:28 - 75:28
    That is T(1).
    And so the first question I ask
  • 75:34 - 75:34
    here is, what is the height of
    this tree?
  • 75:39 - 75:39
    Yeah.
    It's log n.
  • 75:41 - 75:41
    It's actually very close to
    exactly log n because I am
  • 75:48 - 75:48
    starting out at the top with n
    and then I go to n/2 and n/4 and
  • 75:56 - 75:56
    all the way down until I get to
    1.
  • 76:01 - 76:01
    The number of halvings of n
    until I get to 1 is log n so the
  • 76:05 - 76:05
    height here is log n.
    It's OK if it is constant times
  • 76:09 - 76:09
    log n.
    It doesn't matter.
  • 76:11 - 76:11
    How many leaves are in this
    tree, by the way?
  • 76:25 - 76:25
    How many leaves does this tree
    have?
  • 76:28 - 76:28
    Yeah.
    The number of leaves,
  • 76:31 - 76:31
    once again, is actually pretty
    close.
  • 76:34 - 76:34
    It's actually n.
    If you took it all the way
  • 76:38 - 76:38
    down.
    Let's make some simplifying
  • 76:41 - 76:41
    assumption.
    n is a perfect power of 2,
  • 76:45 - 76:45
    so it is an integer power of 2.
    Then this is exactly log n to
  • 76:51 - 76:51
    get down to T(1).
    And then there are exactly n
  • 76:55 - 76:55
    leaves, because the number of
    leaves here, the number of nodes
  • 77:01 - 77:01
    at this level is 1,
    2, 4, 8.
  • 77:05 - 77:05
    And if I go down height h,
    I have 2 to the h leaves,
  • 77:11 - 77:11
    2 to the log n,
    that is just n.
  • 77:14 - 77:14
    We are doing math here,
    right?
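    In symbols, with n an exact power of 2:

        h = \lg n, \qquad \text{number of leaves} = 2^{h} = 2^{\lg n} = n.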
  • 77:17 - 77:17
    Now let's figure out how much
    work, if I look at adding up
  • 77:23 - 77:23
    everything in this tree I am
    going to get T(n),
  • 77:29 - 77:29
    so let's add that up.
    Well, let's add it up level by
  • 77:36 - 77:36
    level.
    How much do we have in the
  • 77:40 - 77:40
    first level?
    Just cn.
  • 77:42 - 77:42
    If I add up the second level,
    how much do I have?
  • 77:48 - 77:48
    cn.
    How about if I add up the third
  • 77:52 - 77:52
    level?
    cn.
  • 77:53 - 77:53
    How about if I add up all the
    leaves?
  • 77:57 - 77:57
    Theta n.
    It is not necessarily cn
  • 78:03 - 78:03
    because the boundary case may
    have a different constant.
  • 78:09 - 78:09
    It is actually theta n,
    but cn all the way here.
  • 78:15 - 78:15
    If I add up the total amount,
    that is equal to cn times log
  • 78:22 - 78:22
    n, because that's the height,
    that is how many cn's I have
  • 78:29 - 78:29
    here, plus theta n.
    And this is a higher order term
  • 78:36 - 78:36
    than this, so this goes away,
    get rid of the constants,
  • 78:42 - 78:42
    that is equal to theta(n lg n).
    And theta(n lg n) is
  • 78:48 - 78:48
    asymptotically faster than
    theta(n^2).
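    The level-by-level total, written out:

        T(n) = \underbrace{cn + cn + \cdots + cn}_{\lg n \text{ levels}}
               + \Theta(n)
             = cn \lg n + \Theta(n)
             = \Theta(n \lg n).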
  • 78:53 - 78:53
    So, merge sort,
    on a large enough input size,
  • 78:58 - 78:58
    is going to beat insertion
    sort.
  • 79:03 - 79:03
    Merge sort is going to be a
    faster algorithm.
  • 79:07 - 79:07
    Sorry, you guys,
    I didn't realize you couldn't
  • 79:12 - 79:12
    see over there.
    You should speak up if you
  • 79:16 - 79:16
    cannot see.
    So, this is a faster algorithm
  • 79:20 - 79:20
    because theta(n lg n) grows more
    slowly than theta(n^2).
  • 79:25 - 79:25
    And merge sort asymptotically
    beats insertion sort.
  • 79:31 - 79:31
    Even if you ran insertion sort
    on a supercomputer,
  • 79:35 - 79:35
    somebody running on a PC with
    merge sort for sufficient large
  • 79:41 - 79:41
    input will clobber them because
    actually n^2 is way bigger than
  • 79:46 - 79:46
    n log n once you get the n's to
    be large.
  • 79:50 - 79:50
    And, in practice,
    merge sort tends to win here
  • 79:54 - 79:54
    for n bigger than,
    say, 30 or so.
  • 79:58 - 79:58
    If you have a very small input
    like 30 elements,
  • 80:02 - 80:02
    insertion sort is a perfectly
    decent sort to use.
  • 80:06 - 80:06
    But merge sort is going to be a
    lot faster even for something
  • 80:11 - 80:11
    that is only a few dozen
    elements.
  • 80:14 - 80:14
    It is going to actually be a
    faster algorithm.
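    A rough way to check that crossover yourself, using the
    Python sketches above (actual timings depend on your machine
    and its constants; the ~30-element figure is only a rule of
    thumb):

        import random
        import timeit

        for n in [10, 30, 100, 1000]:
            data = [random.random() for _ in range(n)]
            t_ins = timeit.timeit(lambda: insertion_sort(data[:]), number=200)
            t_mrg = timeit.timeit(lambda: merge_sort(data[:]), number=200)
            print(f"n={n:5d}  insertion={t_ins:.4f}s  merge={t_mrg:.4f}s")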
  • 80:18 - 80:18
    Those are sort of the lessons,
    OK?
  • 80:21 - 80:21
    Remember to get your
    recitation assignments and
  • 80:25 - 80:25
    attend recitation on Friday.
    Because we are going to be
  • 80:31 - 80:31
    going through a bunch of the
    things that I have left on the
  • 80:36 - 80:36
    table here.
    And see you next Monday.