>> PARLANTE: So, in this section, I want to
play up this idea of modules of existing code
-
that you might want to use to just sort of
solve common problems. In this case, I'm going
-
to show you some file system interface stuff
and also how you call an external process
-
and capture its output and the like to do
something. So you can imagine using Python
-
sort of--you might use Bash but just--it's
sort of a better Bash to sort of glue something
-
together some sort of--some sort of utility.
So I'll start off in the interpreter here,
-
fire up Python. And the first module I want
to talk about is the os module, which stands for
operating system, I think, and I'm just going
to do a dir on it. So I import the os module.
-
I'm going to look inside of there, and you
can see there are all sorts of functions in
-
there. There's obviously, you know, "setpgid"
and "nice." There are obviously kind of operating
-
system-oriented utilities, have a very kind
of a UNIX-y feeling. In theory, these are--these
-
try to be platform-independent. So if you
write a--wrote a Python program and it's running
-
on Windows, some of these are stubbed out
where you could call, you know, and try to
-
get the current time or whatever, and it's
going to translate it. I don't believe it's
-
done perfectly, but it tries. So going through
those, there's a--there's at least, theoretically,
-
a degree of platform independence. So I would
like to show you a couple--so obviously, there's
-
tons of stuff in here. The one I'd like to
show you for starters is listdir. That one.
-
So, actually, I could do help on it. Just
to show you how that works, so I say, "os.listdir."
-
So it says--okay, what this does, it takes--nice
summary--it takes a path and it's going to
-
give me a list of strings. So what it's going
to do is I give it a path to a directory and
-
then it's going to figure out what all the
filenames are in that directory and just return
-
it to me as a list of Python strings. So let
me go to the interpreter here. So to demonstrate
-
this, what I thought I'll do is I'll modify
the long-suffering hello.py example to just,
-
you know, I don't know, list files. So I'll
say "import os" here. I'll rename this List,
upper case L. Let's say this will take a directory.
So let's see, I'll say, "filenames = os.listdir(dir)"
and, just as I've been encouraging
you to do for the exercise, well, I'll just
print what that gives me for starters. So
here I'll say, List is here in the main.
-
I'll just leave it the way it is. So I'll
just assume that there's one command line
-
argument and I'll just list it. So, hopefully,
it's in the same directory. So if I say, "hello."
-
and you could see it's a, you know, it's found
so I'll do an ls; that way, we can access
-
this information. So there's this .DS_Store
thing that the Macintosh, like, pathologically
-
puts everywhere, and other than that, you'll
see there's just kind of regular file names.
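As a minimal sketch of the listdir step just described: the helper name `list_names` and the two demo filenames are made up for illustration, and the code is written in Python 3 syntax rather than the Python 2 print statement used in the talk.

```python
import os
import tempfile


def list_names(directory):
    """Return the bare filenames in directory, sorted for stable output."""
    return sorted(os.listdir(directory))


# Build a throwaway directory so the example does not depend on local files.
demo_dir = tempfile.mkdtemp()
for name in ("hello.py", "notes.txt"):
    with open(os.path.join(demo_dir, name), "w") as f:
        f.write("")

filenames = list_names(demo_dir)
print(filenames)
```

Note that the result holds only names, not paths; the directory they came from is not part of each string.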
-
So let me make this code do something a little
more interesting. So at least I printed that
-
it's there. So here, I'll--let's loop through
them. So I'll say, "for filename in filenames:"
-
sort of typical kind of thing. So one thing
I can do, if I want to make a path out of
-
this--but what's important to understand is
that when you do a listdir to get file names
-
out of a directory, just that filename on
its own, just out in space, is not a valid
-
path, right? It needs to be connected to the
directory it came from to make a valid path.
-
So the way I could do that--as soon as you
call listdir, you're disconnecting the filename
from its path. So you have to realize, you've
-
immediately--now, with the current directory
as dot or something, you might be able to
-
kind of fudge around some of these but then
you'd have a bug if someone was running in
-
a different directory. So in the os package,
there's a subpart called os.path.
And inside of os.path, there are utilities
for manipulating file paths; taking them apart,
putting them together, that kind of stuff.
And again, these are a little bit platform-independent,
so on Windows or whatever, like, there's some
chance this might work. I'm sorry, it
would definitely work. So, "join" takes a
directory and a filename and then puts them
-
back together in a platform-valid way, so
that makes a valid path. So I could say, like,
-
"print." I'll just print that. And then also,
there's an os.path.abspath; I'll do that on
-
path. What that's going to do--it's kind of
like a PWD--oh, I'm sorry, it's path there.
-
It takes the path and it's going to just fill
it out to be replete. So let's try that. So
-
if I run that on dot and there's a module--okay,
what did I do wrong there? OS--oh, it's not
-
absbath. All right. Well, let it be said,
my demos do not lack for realism. All right.
-
So here, I'm running it on the directory dot,
and so then here, the, you know, the first
-
line is it just puts it back together with
a slash. I think--let's just try--if I said
-
like, "./" notice, it's smart about not doubling
up the slash there where if you just put it
-
together with a plus, you would have done
the wrong thing in there; it would have said
-
".//". Anyway, it's a nicety of going through
the real utility to do it right. And then
-
here is, you know, this is just on my--oh
no, this is on my unit--my Google box or whatever.
-
That's, you know, that's the full path of
the thing. And I'm sort of cheating on--just--I
-
could have--you know, I could say, like, "/tmp"
and like, whatever--God knows what that is.
-
It's some source thing or whatever. Anyway,
so I can--as an argument, I can just give
-
it any directory. It's going to list it up.
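The join and abspath calls from this demo could be sketched roughly like this. The helper name `full_paths` is hypothetical, and the syntax is Python 3.

```python
import os


def full_paths(directory, filenames):
    """Reconnect each bare filename to its directory, then make it absolute."""
    paths = []
    for filename in filenames:
        path = os.path.join(directory, filename)  # platform-valid separator
        paths.append(os.path.abspath(path))       # independent of how we got here
    return paths


# join does not double up the separator when the directory already ends in one.
joined = os.path.join("./", "hello.py")
print(joined)

# "hello.py" need not exist; abspath only manipulates the string.
print(full_paths(".", ["hello.py"]))
```

Gluing the pieces with a plus, as in `"./" + "/" + "hello.py"`, is what would produce the doubled slash mentioned above.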
All right, so let me go back to--so, mundane
-
yet useful, all right? You want to be able
to manipulate list, do stuff with directory
-
paths; take them apart; put them together.
I've shown you just a few of the utilities.
-
You can look in there--you know, there's--yeah,
there's all the imaginable utilities you would
-
want for manipulating that kind of stuff.
So let me show you--I'm going to drop back
-
into Python here--I want to show you one other
one--just one I want. There's an os.path.exists('/tmp/foo').
-
That's hopefully--I don't actually know if
that exists--oh, it does. All right, of course.
-
How about "baz"? Oh, okay, that's not there.
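A small sketch of the exists check, assuming a temporary directory instead of the real /tmp so the result is predictable. The names "foo" and "baz" mirror the demo.

```python
import os
import tempfile

demo_dir = tempfile.mkdtemp()
present = os.path.join(demo_dir, "foo")
missing = os.path.join(demo_dir, "baz")

# Create "foo" only; "baz" is deliberately left out.
open(present, "w").close()

print(os.path.exists(present))
print(os.path.exists(missing))
```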
There's also os.--I'm not going to run this
one--"os.mkdir"; you give it a path if you want
to make a directory. And then finally, one that
-
you would never in a million years
think to look for, but there's a module
called "shutil," which I think, historically,
was sort of like shell utilities. And in shutil,
there is a ".copy" and what this does is file
copying for you. So you give it a source path
-
and a dest path and it just kind of like goes
right in there. Obviously, you could do it
-
manually by reading the bytes of the file
or whatever, but if--yeah, it's--you just--yeah,
-
as I was saying, living higher on the food
chain, yeah, you just want to call the thing
-
that does that. I think the name of shutil
also shows how, I think, Python has grown
sort of organically, right? With
Java, I feel like it's much more of a top-down
design with the names and stuff. Here,
it's not that a committee got together
and said, "Well, I think we should have a
file copying utility" and, you know, "Here's
what the names should be." Instead--I'm
just guessing--like, some guy said, "Oh, here,
I've made this shutil thing," you know,
didn't really give a lot of thought to the
name, and it was just kind of useful and it's
open source so it just kind of got picked
up. And so now, by historical accident,
that's the slightly obscure name for that
utility. So it's typical kind of community-driven
-
open source, you know? It's kind of lovable
and powerful, but yet like a little bit undisciplined.
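To make the os.mkdir and shutil.copy calls mentioned earlier concrete, here is a rough sketch. The helper `copy_into` and the file and directory names are invented for the example, and everything happens inside a temporary directory.

```python
import os
import shutil
import tempfile


def copy_into(src_path, dest_dir):
    """Copy one file into dest_dir, creating dest_dir first if needed."""
    if not os.path.exists(dest_dir):
        os.mkdir(dest_dir)
    shutil.copy(src_path, dest_dir)


base = tempfile.mkdtemp()
src = os.path.join(base, "source.txt")
with open(src, "w") as f:
    f.write("some data\n")

dest_dir = os.path.join(base, "backup")
copy_into(src, dest_dir)

# When the destination is a directory, the copy keeps the source's base name.
copied = os.path.join(dest_dir, "source.txt")
print(os.path.exists(copied))
```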
-
The--all righty, so that is the--that stuff
I wanted to show you with OS. Now, I wanted
-
to show you another--I'm going to stick, I'm
going to stick with doing stuff in the interpreter
-
just to reinforce that though. So, the other
thing I wanted to show you is how you launch
-
an external process and wait for it to finish
like very common kind of, you know, utility,
-
get things done, sort of things to do. There
are a bunch of Python modules that do this,
-
a bizarrely large number. I'm going to show
you what I--if you only knew one, I think
-
this is the most useful one. There's a module
called "commands." And inside of commands,
there's a function called "getstatusoutput."
I'll do help on it. Oh, boy, the help is pretty
short. What it does is it runs that command.
So it's going to shell out as an external
process, it's going to run that command and
you're going to block. So it causes you to
wait. It's going to wait for that external
process to exit. And the standard out and
standard error of that subprocess
are captured; they're not just written
onto your standard out and standard error. So the
thing is kind of sealed. So once the
thing exits, what getstatusoutput
does is it returns a tuple of length two.
Returning a tuple is kind of the Python way
of saying, "Well look, I wanted to return
two things," or two or three things or whatever,
so you could just return a tuple. The tuple
that it returns is--the first is the int
exit code. So just in a very typical UNIX-y
kind of way where, you know, you can recover
the exit code out of there. And then the second
is a big string, which is all of the output
of this thing. And in this case, I think it's
both the standard output and the standard
error kind of combined together. Now,
there are a bunch of variants of this if you
want to capture the standard error separately
-
or--all sorts of permutations are covered,
but this is the one we're going to use today.
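A hedged sketch of the call described above. The commands module exists only in Python 2; this example swaps in `subprocess.getstatusoutput`, which is where the same function lives in Python 3. The nonexistent path is made up purely to force a failure.

```python
import subprocess

# A command that should succeed: status is 0 and output is the captured text.
status, output = subprocess.getstatusoutput("echo hello")
print(status)
print(output)

# A command that should fail: the path below is hypothetical and assumed absent.
bad_status, bad_output = subprocess.getstatusoutput("ls /this/path/should/not/exist")
print(bad_status != 0)
print(bad_output)  # the error message is captured too, not written to our stderr
```

The caller blocks until each command exits, and the tuple unpacks directly into two names.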
-
And so I'll get out of here and I think what
I'm going to do is I'm going to modify my--well,
-
here, we'll leave this as list but I'm going
to, I'm going to have it work differently
-
now. So I'm going to say, let's make this
command; I'll say, "'ls -l ' + dir." It's kind
-
of weird, right? So as a string, I'm putting
together like, "Oh, here's the thing I'd like
-
to shell out and have it, like, launch the
ls program." And so then, I'm going to write
-
a Tuple so I'll say, "status, output" is equal
to--actually, no here, I'm going to--let's
-
skip this stuff. So the way I like to do these--well,
I'll do this one. So the way you call it is
-
I'll say, "(status, output)," that's the tuple,
"= commands.getstatusoutput" and I'll just
-
pass in the command I want to do. And then
here, we'll just like, you know, print the
-
output. Get rid of all these. And for a--normally,
I would forget to do the import and go through
-
that but just--since we're short of time,
I'll just--I'll go ahead and do it correctly
-
so import commands. All right, yeah, I think
that might work. All right. So I'd enter the
Python. So if I say--I'm just going to give
it a dot again. Oh, there we go. So what that
did is it put together the ls -l. It went through
the commands module. It launched it. My Python
program waits, blocks. Eventually, the thing
ran. It produced your, you know, typical ls -l
-
sort of output. And then, then I'm done. All
right, so now this is--now, I'm going to fix
-
this up in a couple of ways. I'll regard this
as like, "Not quite right." So one thing I
-
want to do is I want to notice if this thing
failed. And the way I'm going to do that,
-
the simplest way is I'm just going to say,
"if status," if the status is non-zero, then
-
I want to notice that there was an error. So
because status is coming through as an int,
if, you know, if it's zero, that's going
to count as false and any other value counts
as true. So that's sort of the most primitive
way of detecting an error here. So then I'm
-
going to say something like print--I think
I could refer to "sys.stderr," you know, there's,
-
you know, whatever. There was an error. Now,
I'm being little picky here because when you
-
capture the standard error of a subprocess,
if I were to sort of squelch it, if I was
-
just try to kind of eat it and hide it, it
makes the system undebuggable. I mean if you
-
think about software systems where it's, you
know, some big thing with a lot of parts,
-
the key piece of information when it's used
incorrectly, which of course it is, is that
whatever the lowest level was that ran into
the error reports it. It raises some kind of message
like, "Hey, this didn't work," and you are really
dependent on that low level letting you know.
Or put the other way, if the low level fails
and remains silent, it's very, very difficult
to debug. And I'm pointing this out because
this is the rare case where we are capturing
the standard error of that thing. And so we
are kind of responsible for making sure that
it gets reported. So, I--and, I'll just say
something like that. And then I'm going to
-
say "sys.exit(1)" I'm just going to be like,
yes, we are--I'm just--I'm giving up--I'm
-
terminating. So, that is one--one thing I
would want to do. Now, the other thing I'm
-
going to change here is when I'm--like suppose
you have a bug in your baby names code, you
-
know, you like did the regular expression
wrong. And like, really, what are the consequences
-
of that? Oh, well, you know, whatever, some
of the baby name data is a little bit incorrect
-
or you missed something. But having--an error
in your code, you just get like slightly bad
-
data which is not that bad, I'm going to say.
Now, what if I have a bug here in the string
-
where I'm putting together a command which
I'm about to shell out and run as me? And
-
I just wanted to point out, the ramifications
of doing that wrong are potentially much worse.
-
All right, so whenever I write a command,
I'm immediately in this slightly heightened
state of paying attention. I'm like, "Okay,
well, yeah, I could really delete everything
or whatever." So just to demonstrate that,
what if I were to change this to say, "'rm
-rf *'" or let's say, you know, why stop there, "/*,"
right? Oh, I'm sorry, the directory is already
there. It's an argument. Okay, there, all
right? Here, I'll--here, I'll show you. I'm
going to save it, all right? Now, if you're
anything like me, you may be like, "Oh,
okay, it sounds good." All right, so here's
what I recommend doing: when I'm writing this
kind of stuff I'll say "print 'about to do
this:'," and then there's the command. And then
-
I'll just, like, return, whatever, just don't
get to the stuff below, because you can sort
of debug your program, all the other reading
directories or whatever kind of stuff, and
you can still have it print, here's the
command it's going to do. And so it's more
pleasing I think to debug it that way. So
let's just try this. So I'm going to save
-
it and that definitely returns, right? Oh,
hey, you know, the snapshot directory will
-
be out here. It's unscrapable. All right,
sorry. I just got on the wrong part. All right,
-
so what I meant to do is go down here, "hello.py."
There was a--what's the problem with that?
-
Did I forget a "if, print, if status print"--oh,
oh, oh, oh--all right, okay, this one's--okay,
-
never mind that. Let me just--I'm not used
to write some text for--so, let me just get
-
rid of that for now. All right, okay, so I'm
about to do this, "'rm -rf.'" So I'll be like,
-
"Oh, oh, wait a minute, I didn't mean rf--rm-rf,
I meant "'ls -l'" so that's our--that's what
-
we're going to do. So that's just kind of--I
mean, you know, in your next exercise, I'm
-
going to ask you to shell out and so just,
you know, just for like saying it or whatever.
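Putting the two habits together, printing the command before running it and reporting a non-zero status, might look like the sketch below. It uses Python 3's `subprocess.getstatusoutput` in place of the Python 2 commands module, the `dry_run` parameter is a hypothetical addition, and it assumes an `ls` program is available.

```python
import subprocess
import sys


def run_listing(directory, dry_run=True):
    """Build the ls command, show it, and only run it when dry_run is False."""
    cmd = "ls -l " + directory  # note the space after -l
    print("about to do this: " + cmd)
    if dry_run:
        return None  # stop here while the command string is still being debugged

    status, output = subprocess.getstatusoutput(cmd)
    if status:  # any non-zero exit code counts as true
        # We captured the child's error text, so we are the ones who must report it.
        sys.stderr.write("there was an error: " + output + "\n")
        sys.exit(1)
    return output


preview = run_listing(".")                # prints the command, runs nothing
listing = run_listing(".", dry_run=False)  # actually shells out
print(listing)
```

The directory is pasted into the string unquoted, so a name containing spaces or shell characters would change what the shell runs; that is part of why looking at the printed command first is worthwhile.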
-
Now, this error--I'll try to do it the other
way. The print syntax for writing--like normally,
-
when we say "print," it just goes to standard
out. But printing to another file handle,
-
the syntax is sort of terrible. I'm going
to--I think--I think I can do "dot" right
-
there. I'll put this together with a plus.
I think that's better. So let's see. Now,
-
it's doing "'ls -l'," all right. Anyway, so
that's the--that is the better syntax for
-
that. All righty, so let me show you--so those
are the two module things I want you to work
-
on for this next bit. So let me show you our
next exercise. All right, so I'm going to
-
go into day 2 here and the next one I want
to work on is "copyspecial." So as before,
-
there's a printed form of the description
of this. So I'm just going to kind of demo
-
through it, but then you really want to look
at the printed directions. So, you know, it's
going to have a Part A and a Part B. This
one's a little smaller and so I want to spend
like a little bit less time on this one. If you
don't get to Part B, that's okay because then
-
the third assignment, the last one I think
is the most interesting and that's one of
-
the bigger ones so I want to make sure we
save time for that. Okay, so here's the idea
-
with this. The idea is in the file system,
there are certain file names which are special.
-
In particular, I'm going to say that a file
name is special if it has the pattern that
somewhere in the file name there are two underbars
and then one or more word characters followed
by two underbars. And so for example, in this
directory, there are two special files. There's
-
the "hello" and the something and then the
solution directory and copyspecial--well,
-
those aren't special. So, this is, you know,
sort of a Google admin kind of thing. You've got
these directories scattered all over the
place and you want to move them around and
-
stuff. So the first thing I'd like to do,
let's see--now, if you run the command with
-
no arguments, it always kind of tells you
what the--what the arguments are. So in this
-
case, I can run it just with a directory.
So here, I'm going to run it on "dot" as the
-
current directory. So if I run it on "dot"
what I want you to do--so it takes a directory
-
as an argument. What I want you to do is I
want you to find all the special files and
-
just list them. And oh, in particular, list
them by their absolute paths. The absolute
-
path is something--if you were to write a path
to a file or whatever, that's the path that's
-
nice because it's independent of the process
that produced it. It doesn't depend on the
-
notion of current directory. It's like this
really is where that file is. So that's the--that's
-
the simplest case. Just find them, list them.
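One possible shape for this first part, offered as a sketch and not as the exercise's official solution: the function name `get_special_paths` and the demo filenames are assumptions, and the pattern is a direct reading of "two underbars, one or more word characters, two underbars."

```python
import os
import re
import tempfile


def get_special_paths(directory):
    """Return sorted absolute paths of files whose names contain __word__."""
    special = []
    for filename in os.listdir(directory):
        if re.search(r"__\w+__", filename):
            path = os.path.join(directory, filename)
            special.append(os.path.abspath(path))
    return sorted(special)


# Hypothetical demo files: two special names and one ordinary one.
demo_dir = tempfile.mkdtemp()
for name in ("xyz__hello__.txt", "zz__something__.jpg", "copyspecial.py"):
    open(os.path.join(demo_dir, name), "w").close()

special_paths = get_special_paths(demo_dir)
for path in special_paths:
    print(path)
```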
The next most complicated thing I want you
-
to do is this thing takes a "--todir"
argument. So I'll say "/tmp/"--now, I'm thinking
of some random word I haven't used. What
day is it today? Thursday, I'll say Thursday.
All right, so in that case, what I want it
to do is find all the special files and create
-
that directory if it doesn't exist and copy
all the special files to it. So, I'll find
-
out "cd /tmp/ thus" and do it--oh, somebody
checked it out over there. All right, I'll
-
go back. That's Part B. So then, I'm pretty
happy if you get that. But if you have
enough time, then I also want you to have
a "--tozip," which is very similar to the "--todir,"
-
but instead, now I want to be able to say,
"blah.zip" and what I want it to do is I want
-
to find all the special files, invoke the
zip utility to zip them all up into the zip
-
file named here. So if I call that, and you
can see actually my debugging is still in
-
here, right? "Command I'm about to do," and
then, oops, and then here is the zip command.
-
Zip incidentally by the way--I think the worst
man page ever written. I defy you to find
-
one less useful. It just talks about all the
stuff you would never want to do. And it never
-
talks about the thing that you want to do--my
personal experience. So it turns out the command
-
you want is "zip -j" and then the name of
the zip file and then you just--and then you
-
just have all the paths. Now in this case
I used absolute paths--really the zip is going
-
to have the same current directory as me,
so you could do the shorthand--anyway--depending
-
on your tolerance for that kind of fragility.
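The zip command string described here could be assembled along these lines. This is only a sketch: `zip_command` is a hypothetical helper, the paths are made up, and the command is printed, not executed, in keeping with the look-before-you-run habit from earlier.

```python
def zip_command(zip_path, paths):
    """Build 'zip -j <zipfile> <path> <path> ...' as a single string."""
    return "zip -j " + zip_path + " " + " ".join(paths)


# Hypothetical absolute paths standing in for the special files that were found.
paths = ["/tmp/demo/xyz__hello__.txt", "/tmp/demo/zz__something__.jpg"]
cmd = zip_command("blah.zip", paths)
print("Command I'm about to do: " + cmd)
```

Paths containing spaces would need quoting before a string like this is handed to a shell.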
It's fine. So that will--that will zip it
-
up. Okay, so that is--that's the next exercise.
So I'd like you to go ahead and get started
-
on that. And then let's say I'll pull you
guys back here a little before 2, and then
-
we'll do the next exercises. All right, go.