English subtitles for clip: File:10 - 3 - Worked Exercise Lists (16 35).webm

1
00:00:01,210 --> 00:00:04,130
Hello, and welcome to yet another worked
exercise.

2
00:00:04,130 --> 00:00:07,220
I'm the author of the book Python for
Informatics: Exploring Information,

3
00:00:07,220 --> 00:00:12,430
as well as your host for this
particular worked exercise.

4
00:00:13,470 --> 00:00:16,640
As always, the book, this audio and video,

5
00:00:16,640 --> 00:00:19,860
and these slides are copyright Creative
Commons Attribution, and

6
00:00:19,860 --> 00:00:22,840
I hope you find exciting and interesting
ways to

7
00:00:22,840 --> 00:00:26,320
reuse them and re-purpose them and add to
them.

8
00:00:26,320 --> 00:00:31,610
So pythonlearn.com is my support website
for the Python for Informatics book.

9
00:00:31,610 --> 00:00:33,750
And if you, hopefully by now you've got
all this

10
00:00:33,750 --> 00:00:37,280
worked out, but it does teach you how to
get started.

11
00:00:37,280 --> 00:00:40,770
So all the things that you need
to know,

12
00:00:40,770 --> 00:00:45,240
to edit files and use the command line to run
Python programs you've got figured out.

13
00:00:45,240 --> 00:00:45,890
Okay?

14
00:00:45,890 --> 00:00:49,300
So our program for today is from the list
chapter.

15
00:00:49,300 --> 00:00:52,180
And our program is actually, we're
going to debug a program

16
00:00:52,180 --> 00:00:55,550
rather than just writing a program from
scratch.

17
00:00:55,550 --> 00:00:59,870
And so our task in this
program is we're looking

18
00:00:59,870 --> 00:01:05,360
for words that start with from,
looking for lines that start with from.

19
00:01:05,360 --> 00:01:07,840
And on the lines that start with from,
we're going to pull out the

20
00:01:07,840 --> 00:01:10,330
day of the week that this particular

21
00:01:10,330 --> 00:01:14,370
email message was sent. Saturday, Friday,
whatever.

22
00:01:16,620 --> 00:01:19,460
And so the structure of the program
is pretty standard.

23
00:01:19,460 --> 00:01:21,010
We're going to open a file at the top.

24
00:01:21,010 --> 00:01:23,980
We're going to loop through the file.

25
00:01:23,980 --> 00:01:25,660
Strip the newlines at the end, right?

26
00:01:25,660 --> 00:01:28,370
There's a little newline at the end of
each line.

27
00:01:28,370 --> 00:01:29,830
And then we're going to use split to split
it into words,

28
00:01:29,830 --> 00:01:34,810
which means that'll make this the
zero word, the one word,

29
00:01:34,810 --> 00:01:36,350
and the two word.

30
00:01:36,350 --> 00:01:38,820
And then we're going to check the zero
word to see if it's from,

31
00:01:38,820 --> 00:01:42,040
and if the zero word's not from we're
uninterested in that line so we're going

32
00:01:42,040 --> 00:01:46,740
to skip back up. And if it is good, if we
find a line

33
00:01:46,740 --> 00:01:49,870
we're interested in, we're going to skip
the majority of the lines in this file.

34
00:01:51,090 --> 00:01:53,130
Less than one in 100 of these lines
actually

35
00:01:53,130 --> 00:01:55,770
have From as the first word.

36
00:01:55,770 --> 00:01:57,680
And then the second word is what we're
interested in

37
00:01:57,680 --> 00:01:59,480
so we're going to print the day of
the week.
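A minimal Python 3 sketch of the program as described so far, not a copy of the slide. It assumes the day of the week is the word at index 2 (as the zero/one/two counting suggests). The sample lines and the name `days_unguarded` are made up so the block runs without mbox-short.txt, and it collects the days in a list rather than printing each one.

```python
# Made-up lines in the style of mbox-short.txt (assumption, not the real file).
SAMPLE_LINES = [
    "From someone@example.com Sat Jan  5 09:14:16 2008\n",
    "Return-Path: <postmaster@example.com>\n",
    "\n",  # a blank line
    "From: someone@example.com\n",
]


def days_unguarded(lines):
    """Collect the day-of-week word from each line whose first word is 'From'."""
    days = []
    for line in lines:
        line = line.rstrip()      # strip the newline at the end
        words = line.split()      # split the line into words
        if words[0] != "From":    # not a line we are interested in
            continue
        days.append(words[2])     # word 0 is From, word 1 the address, word 2 the day
    return days


# Works as long as no blank line has been reached.
print(days_unguarded(SAMPLE_LINES[:2]))  # ['Sat']
```

Calling `days_unguarded(SAMPLE_LINES)` on the full list raises `IndexError: list index out of range` once the loop reaches the blank line.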

38
00:02:00,910 --> 00:02:05,300
So this program as we type it in is going
to have a

39
00:02:05,300 --> 00:02:08,280
traceback and we go like, you know, as soon
as you see traceback

40
00:02:08,280 --> 00:02:12,290
you immediately are drawn to the word
traceback and maybe you've gotten to

41
00:02:12,290 --> 00:02:16,910
the point where you start to read the why,
list index out of range.

42
00:02:16,910 --> 00:02:19,870
And then, you know, it says line five, so
that's kind of helpful.

43
00:02:19,870 --> 00:02:25,660
But the first thing to not get distracted
about is, is this.

44
00:02:25,660 --> 00:02:27,960
Our program actually ran a little bit.

45
00:02:27,960 --> 00:02:31,810
So when you look at a traceback, look
right above it and make sure, and

46
00:02:31,810 --> 00:02:33,550
maybe you won't see any output, but maybe

47
00:02:33,550 --> 00:02:36,680
your program will have partially run,
and so

48
00:02:36,680 --> 00:02:39,580
don't immediately assume the program is
totally broken. So this

49
00:02:39,580 --> 00:02:43,770
is actually successful output of one of
the many from lines.

50
00:02:43,770 --> 00:02:45,670
And it, it dies later.

51
00:02:45,670 --> 00:02:51,490
And so, as we debug it, we'll sort of come
up with techniques to sort of figure out

52
00:02:51,490 --> 00:02:53,400
how much your program has done before it
died,

53
00:02:53,400 --> 00:02:57,200
because that might be an important,
important question to answer.

54
00:02:57,200 --> 00:03:02,100
How far did your program get, because if
it dies before the first line, or on

55
00:03:02,200 --> 00:03:03,850
the first line, that's something different
than if

56
00:03:03,850 --> 00:03:07,000
it goes like 300 lines and then dies.

57
00:03:07,110 --> 00:03:10,730
If it goes a long ways, 300 lines, and
then dies,

58
00:03:10,730 --> 00:03:14,540
maybe there's something subtle or weird
about the line it's dying on,

59
00:03:14,540 --> 00:03:15,840
rather than just your program is

60
00:03:15,840 --> 00:03:17,710
just like totally broken.

61
00:03:17,710 --> 00:03:22,430
So you got, look for something where your
program actually partially worked,

62
00:03:22,430 --> 00:03:25,370
and helps you kind of narrow down
your suspicions as

63
00:03:25,370 --> 00:03:26,730
to what might be wrong with this.

64
00:03:26,730 --> 00:03:29,590
So let's get on with the, let's get on
with the programming.

65
00:03:32,020 --> 00:03:34,750
And so I will simply steal this.

66
00:03:34,750 --> 00:03:35,780
I'll cut and paste it.

67
00:03:36,970 --> 00:03:38,570
And that will be that.

68
00:03:38,570 --> 00:03:42,640
So I'm going to cut and paste this into my
TextWrangler.

69
00:03:42,640 --> 00:03:44,610
Now it's always dangerous cutting and
pasting.

70
00:03:44,610 --> 00:03:47,200
But I'm cutting and pasting from a slide
which seems

71
00:03:47,200 --> 00:03:49,020
to work a little better than cutting and
pasting from PDFs.

72
00:03:49,020 --> 00:03:51,390
And you can even get away with cutting and

73
00:03:51,390 --> 00:03:53,580
pasting with PDFs as long as you don't
freak out

74
00:03:53,580 --> 00:03:54,760
when the first thing you see is a bunch

75
00:03:54,760 --> 00:03:59,430
of crazy syntax errors because of
characters being coded improperly.

76
00:03:59,430 --> 00:04:01,760
So now I'm in this file, and I'm going to
save this.

77
00:04:01,760 --> 00:04:05,735
Save As, and I'll put this on my Desktop

78
00:04:05,735 --> 00:04:11,567
in the lists folder.

79
00:04:11,567 --> 00:04:19,440
And we'll call this day of the week .py in the
lists folder.

80
00:04:21,420 --> 00:04:23,310
Okay, and of course it's, syntax

81
00:04:23,310 --> 00:04:25,590
colored it, which makes my TextWrangler

82
00:04:25,590 --> 00:04:27,410
syntax colored it, and so I've got sitting

83
00:04:27,410 --> 00:04:30,760
in this lists folder, I've got the
day of the week .py.

84
00:04:30,760 --> 00:04:35,200
Now one of the things that we've got to do
is we've actually got to get our data file.

85
00:04:35,200 --> 00:04:37,200
See this mbox-short.txt?

86
00:04:37,200 --> 00:04:42,730
So the mbox-short.txt is sitting here on
pythonlearn.com.

87
00:04:42,730 --> 00:04:48,110
I'll go to the book page and you'll see
this, all the stuff and code samples.

88
00:04:48,110 --> 00:04:51,085
Including maybe a worked version of this
example, you just never know.

89
00:04:51,085 --> 00:04:54,790
[NOISE] It doesn't look like
this one's here but

90
00:04:54,790 --> 00:05:00,030
the file I'm looking for is mbox dot
short dot t-x-t.

91
00:05:01,030 --> 00:05:01,635
So there it is.

92
00:05:01,635 --> 00:05:03,870
It's a mailing thing and here's one of
those lines

93
00:05:03,870 --> 00:05:05,750
we're looking at, and
there's many of them.

94
00:05:05,750 --> 00:05:07,320
This happens to be the first line of the
file.

95
00:05:07,320 --> 00:05:11,950
So I'm going to make sure that I save this
into the folder.

96
00:05:11,950 --> 00:05:17,370
Go back to my Desktop and put it in lists
and then save it.

97
00:05:18,550 --> 00:05:19,350
And so.

98
00:05:21,000 --> 00:05:26,640
At that point I should be able to go back
and see ah-hah, I have mbox-short.txt.

99
00:05:26,640 --> 00:05:31,830
Now the interesting thing is if you read
this file, [SOUND] if you open this file,

100
00:05:35,310 --> 00:05:36,610
TextWrangler's perfectly happy.

101
00:05:36,610 --> 00:05:37,920
So we see from.

102
00:05:37,920 --> 00:05:44,270
So if I was to do sort of a search, and I
want to search for the string

103
00:05:44,270 --> 00:05:47,211
capital from space. Okay?

104
00:05:47,211 --> 00:05:51,540
And then I'm going to search next.

105
00:05:53,310 --> 00:05:54,260
Oh, there's lots of them.

106
00:05:54,260 --> 00:05:56,370
I'm going to make them case sensitive,
okay?

107
00:05:56,370 --> 00:05:59,180
So it'll only be the froms that are case
sensitive.

108
00:06:00,390 --> 00:06:04,899
There we go, and so in a sense our program is
doing this little from where it's looking.

109
00:06:05,920 --> 00:06:08,420
It's throwing away all the lines except
those that start with

110
00:06:08,420 --> 00:06:10,580
from and a space, because of the way the
split works.
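A small Python 3 illustration of why the split matters here: with no arguments, `split()` breaks on runs of whitespace, so a `From:` header yields the first word `From:` and does not match `From`. The two sample lines are made up for the example.

```python
# Made-up lines in the style of the mail file (assumption).
envelope = "From someone@example.com Sat Jan  5 09:14:16 2008".split()
header = "From: someone@example.com".split()

print(envelope[0], envelope[0] == "From")  # From True
print(header[0], header[0] == "From")      # From: False
print(envelope[2])                         # Sat
```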

111
00:06:10,580 --> 00:06:15,680
And then it's going to pull these, these
days of the week, these Fridays.

112
00:06:15,680 --> 00:06:18,680
So I'm kind of doing, if I was doing it by
hand it

113
00:06:18,680 --> 00:06:22,820
would be like find the line that starts
with from blank, and then

114
00:06:22,820 --> 00:06:25,990
grab this little text, and that's what I'm
interested in.

115
00:06:25,990 --> 00:06:28,500
Maybe I'm curious as to whether or not
these

116
00:06:28,500 --> 00:06:31,690
folks work on the weekends or on the
weekdays.

117
00:06:31,690 --> 00:06:33,860
And that's the purpose of this.

118
00:06:33,860 --> 00:06:34,540
Okay?

119
00:06:34,540 --> 00:06:35,460
So I'll close this.

120
00:06:35,460 --> 00:06:39,240
Just, it's, that's what, that's the, Close
Document.

121
00:06:39,240 --> 00:06:39,760
There we go.

122
00:06:39,760 --> 00:06:41,880
So here's my little program.

123
00:06:41,880 --> 00:06:46,700
And so, let's, I can get rid of this now.

124
00:06:46,700 --> 00:06:48,890
So now I'm going to go into my Desktop.

125
00:06:48,890 --> 00:06:50,570
I'm in my home directory.

126
00:06:50,570 --> 00:06:56,320
cd Desktop, cd lists. If I do ls, there
I am.

127
00:06:56,320 --> 00:07:00,830
I've got this day, day of the week Python
program, and mbox-short.txt sitting here.

128
00:07:00,830 --> 00:07:09,160
So I'm going to run it [SOUND]
and there it works.

129
00:07:09,160 --> 00:07:11,800
I'll get, I mean, well, works for some
value of work.

130
00:07:11,800 --> 00:07:14,140
So we get the same traceback the lecture
slide suggested

131
00:07:14,140 --> 00:07:17,640
we would, and thankfully that probably
means the lecture slide's right.

132
00:07:17,640 --> 00:07:19,450
So, so here we go.

133
00:07:19,450 --> 00:07:22,360
We got, we got this one thing where the
day of the

134
00:07:22,360 --> 00:07:26,340
week is kind of coming out and then it
dies on line five.

135
00:07:26,340 --> 00:07:28,150
And it even gives us the line it's dying on.

136
00:07:28,150 --> 00:07:30,750
So, it's complaining about list index out
of range, and you

137
00:07:30,750 --> 00:07:32,570
might be able to stare at this and maybe
you're smart.

138
00:07:32,570 --> 00:07:34,290
You're, you have enough skill already that

139
00:07:34,290 --> 00:07:35,930
you're seeing these kinds of errors and
you

140
00:07:35,930 --> 00:07:39,360
just read it right away, know what the
problem is, but that's not so much fun.

141
00:07:39,360 --> 00:07:41,330
So here I am in line five, right?

142
00:07:41,330 --> 00:07:43,400
I'm going to go right to line five.

143
00:07:43,400 --> 00:07:46,980
And, and so the first thing I want to do
on line five,

144
00:07:46,980 --> 00:07:50,810
is I often add a print statement right
before the line that's dying.

145
00:07:50,810 --> 00:07:54,258
And I'll just print out something [SOUND].

146
00:07:54,258 --> 00:07:55,830
Something random.

147
00:07:55,830 --> 00:07:59,110
Just, it, you know, sometimes I print the
letter a out, right?

148
00:07:59,110 --> 00:08:00,310
And so now I'll save this.

149
00:08:01,490 --> 00:08:03,120
And now I'm going to run it again,

150
00:08:05,380 --> 00:08:10,650
and this tells me, just by putting a print
statement in, what's going on here.

151
00:08:10,650 --> 00:08:11,970
It's like, okay wait a sec.

152
00:08:11,970 --> 00:08:13,090
This is the good line.

153
00:08:13,090 --> 00:08:15,150
This is the line, I mean, this is the line
I'm interested in.

154
00:08:15,150 --> 00:08:17,150
As a matter of fact, I'm going to make
another change

155
00:08:17,150 --> 00:08:19,820
just to help me sort of visually see
what's going on.

156
00:08:19,820 --> 00:08:22,572
[SOUND] I'm going to put like a bunch of
equal signs at

157
00:08:22,572 --> 00:08:25,770
the beginning of this line and now I'm
going to run it again.

158
00:08:26,950 --> 00:08:29,650
So if I look, it's like, oh, dude, that's
the good line.

159
00:08:29,650 --> 00:08:30,680
That's the line I'm liking.

160
00:08:30,680 --> 00:08:33,720
And here's my debugging, Something random,
Something random, Something random.

161
00:08:33,720 --> 00:08:35,750
So a lot of stuff is going on here.

162
00:08:35,750 --> 00:08:38,860
And then finally it dies, right?

163
00:08:38,860 --> 00:08:40,550
So finally, it dies here.

164
00:08:40,550 --> 00:08:42,760
Now, it's line six because I added a line.

165
00:08:42,760 --> 00:08:44,570
So the, so perhaps instead of printing

166
00:08:44,570 --> 00:08:48,370
out Something random, I'll print something
useful.

167
00:08:49,590 --> 00:08:53,490
So, the first thing to do is to look at
this statement, the one that's dying.

168
00:08:53,490 --> 00:08:57,910
And say, what is the most suspicious thing
in here, okay?

169
00:08:57,910 --> 00:09:01,240
And, and so, if something's going wrong
with this

170
00:09:01,240 --> 00:09:04,330
words sub zero, it's saying index out of
range, right?

171
00:09:04,330 --> 00:09:06,120
Something going wrong with words sub zero.

172
00:09:06,120 --> 00:09:07,350
So what's the deal?

173
00:09:07,350 --> 00:09:10,210
So I'm not going to print out words sub
zero, I'm going to print out words.

174
00:09:10,210 --> 00:09:14,330
I'm going to say, like, what is in this
words list at this point?

175
00:09:14,330 --> 00:09:15,200
So now let me save that.

176
00:09:15,200 --> 00:09:19,650
So instead of printing something random,
we'll see a bunch of words go by.

177
00:09:19,650 --> 00:09:21,860
And so here's that first line broken

178
00:09:21,870 --> 00:09:26,020
into words, where from Stephen Marquard,
Saturday.

179
00:09:26,020 --> 00:09:27,960
And here's our good line.

180
00:09:27,960 --> 00:09:31,090
And then here's one that doesn't have from
as the first word, so we skip it.

181
00:09:31,090 --> 00:09:32,160
Here's one that doesn't.

182
00:09:32,160 --> 00:09:33,850
So let's continue onto where it's, you
know,

183
00:09:33,850 --> 00:09:36,820
we see lots of lines, everything's
cruising along.

184
00:09:36,820 --> 00:09:40,260
We actually see one that says from colon,
but that's not what we're looking for.

185
00:09:40,260 --> 00:09:41,940
So that one gets skipped too.

186
00:09:41,940 --> 00:09:43,140
So make sure you skip that.

187
00:09:43,140 --> 00:09:43,840
Now, here we are.

188
00:09:45,250 --> 00:09:46,370
Here's the words.

189
00:09:49,140 --> 00:09:51,930
Here's that list of words that we've
split.

190
00:09:51,930 --> 00:09:53,850
Now, this was the line before, here's the
next line.

191
00:09:53,850 --> 00:09:55,630
It's like, okay, what's going on here?

192
00:09:56,870 --> 00:09:58,880
That looks like an empty list.

193
00:09:58,880 --> 00:10:02,540
Okay, so I'm a little more curious, I'm
going to,

194
00:10:02,540 --> 00:10:05,608
I'm going to print, I'm going to print
line out too.

195
00:10:05,608 --> 00:10:09,050
[SOUND] And I'm going to put, just so I
know what's

196
00:10:09,050 --> 00:10:12,120
going on, I'll stick some, a couple pluses
in front of that.

197
00:10:12,120 --> 00:10:14,590
So the line will have pluses on it, and
it'll print the word.
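A sketch of this debugging step in Python 3, assuming a hypothetical helper `debug_trace` that gathers what the two print statements would show: the stripped line marked with pluses, then the list of words. The sample lines are made up to resemble the spam probability line followed by a blank line.

```python
# Made-up sample lines (assumption): a header line, then a blank line.
SAMPLE_LINES = [
    "X-DSPAM-Probability: 0.0000\n",
    "\n",
]


def debug_trace(lines):
    """Return the debug text for each line: the marked line, then its words."""
    trace = []
    for line in lines:
        line = line.rstrip()
        trace.append("++ " + line)        # pluses make the raw line easy to spot
        trace.append(str(line.split()))   # the list that words[0] would index
    return trace


for entry in debug_trace(SAMPLE_LINES):
    print(entry)
```

The last two entries are a bare `++ ` and `[]`, which is the blank line and its empty word list.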

198
00:10:14,590 --> 00:10:15,350
So let's run this again.

199
00:10:15,350 --> 00:10:18,160
So I'm adding junk to this, right?

200
00:10:18,160 --> 00:10:21,890
So now when I look at my things, the
pluses mean that's the line I'm seeing.

201
00:10:21,890 --> 00:10:22,060
Right?

202
00:10:22,060 --> 00:10:23,480
I just put those pluses in for my own

203
00:10:23,480 --> 00:10:26,630
visible, so I can visually see it more
naturally.

204
00:10:26,630 --> 00:10:29,460
That's the line and you can see how nicely
it breaks

205
00:10:29,460 --> 00:10:32,190
it into words, so it breaks this line at
the space.

206
00:10:32,190 --> 00:10:32,970
Oops, sorry.

207
00:10:32,970 --> 00:10:34,540
That was not so good.

208
00:10:34,540 --> 00:10:40,040
It, it breaks this line, spam probability
line, at this space and gives me two words.

209
00:10:40,040 --> 00:10:40,540
Right?

210
00:10:41,810 --> 00:10:42,640
So here we are.

211
00:10:42,640 --> 00:10:43,620
Two, here we go.

212
00:10:43,620 --> 00:10:47,720
Here is this space, and then it breaks
it into one word, and another word.

213
00:10:47,720 --> 00:10:48,950
And so that's working.

214
00:10:48,950 --> 00:10:51,760
But if I look here, that's the line that
I'm on.

215
00:10:51,760 --> 00:10:53,510
Oh, well that's a blank line.

216
00:10:53,510 --> 00:10:55,400
Hang on a sec, hang on a sec. So let's

217
00:10:55,400 --> 00:10:58,050
just real quick take a look at that file
again.

218
00:11:01,320 --> 00:11:05,520
Let's look at that file, and so here we
go, here we go.

219
00:11:08,670 --> 00:11:10,660
So that's cool, that's cool, that's cool,
let's look at the line.

220
00:11:13,640 --> 00:11:16,220
And here we go, so if you look.

221
00:11:16,220 --> 00:11:19,440
The last thing that we read successfully
was this line right here.

222
00:11:20,940 --> 00:11:26,410
This line is a blank line, so our program
is choking on blank lines.

223
00:11:26,410 --> 00:11:29,650
It's not failing on lines that work right,
and it's

224
00:11:29,650 --> 00:11:32,050
not failing on lines that don't have from
in them.

225
00:11:32,050 --> 00:11:36,350
It can handle the From line fine, it can
handle the, any other line fine.

226
00:11:36,350 --> 00:11:41,640
But it freaks out on the simplest of
things, blank lines.

227
00:11:41,640 --> 00:11:44,805
So the question is, how do we fix this?

228
00:11:44,805 --> 00:11:47,240
[SOUND] Right?

229
00:11:47,240 --> 00:11:50,050
How do we deal with this fact that
this bit

230
00:11:50,050 --> 00:11:57,530
of code, right here, is dying whenever
there's a blank line, okay?

231
00:11:57,530 --> 00:12:00,135
So, the way we do this is what's called
the guardian pattern.

232
00:12:00,135 --> 00:12:02,180
[SOUND] I'm going to get rid of this.

233
00:12:03,910 --> 00:12:07,210
So one of the guardian patterns is it
looks like this.

234
00:12:07,210 --> 00:12:08,770
So, we don't want to get to this.

235
00:12:08,770 --> 00:12:10,350
So we want to put something in front of it
to

236
00:12:10,350 --> 00:12:13,180
make sure we never hit it in a dangerous
situation.

237
00:12:13,180 --> 00:12:17,370
So it's if hmm, hmm, hmm, something,
continue.

238
00:12:17,370 --> 00:12:24,730
And this is if, we'll call this thing the
safety check. [SOUND]

239
00:12:24,730 --> 00:12:25,370
Check.

240
00:12:27,240 --> 00:12:29,330
If some safety check, continue.

241
00:12:29,330 --> 00:12:35,210
So that means that, you know, you know, if
the safety check matches,

242
00:12:35,210 --> 00:12:38,370
we're going to not fall through and not do
this dangerous line.

243
00:12:38,370 --> 00:12:40,830
So the question is what would we put in as
the

244
00:12:40,830 --> 00:12:45,390
safety check here to protect. This line is
guarding this line.

245
00:12:45,390 --> 00:12:47,470
That's what the guardian pattern means, is do

246
00:12:47,470 --> 00:12:50,130
something before the scary thing that
hurts you.

247
00:12:51,520 --> 00:12:52,111
Okay?

248
00:12:52,111 --> 00:12:59,060
So, one thing we can do is we can say,
okay, well, what's words?

249
00:12:59,060 --> 00:13:00,700
Well words is an empty list.

250
00:13:00,700 --> 00:13:01,620
You know what?

251
00:13:01,620 --> 00:13:05,820
If, if I got an empty list of words, I
have no interest in this.

252
00:13:05,820 --> 00:13:11,340
So if I say words is equal to an empty
list, continue.

253
00:13:11,340 --> 00:13:12,526
So what this basically says is.

254
00:13:12,526 --> 00:13:13,097
You know.

255
00:13:13,097 --> 00:13:15,148
You know.

256
00:13:15,148 --> 00:13:18,630
Read the line, strip it, split it into
words.

257
00:13:18,630 --> 00:13:19,630
We'll print it.

258
00:13:19,630 --> 00:13:26,230
If it's an empty list, continue and go up
to the next line.

259
00:13:26,230 --> 00:13:29,770
And then, if it is a non-empty list, it
continues here.

260
00:13:29,770 --> 00:13:33,010
And checks to see if the first word is
From.

261
00:13:33,010 --> 00:13:36,110
And if it is not, it continues, and then if
it works, it works.

262
00:13:36,110 --> 00:13:40,850
So this here, this line here, is
the guardian line, it's

263
00:13:40,850 --> 00:13:44,240
guarding the, this other line and you
gotta do it in order.

264
00:13:44,240 --> 00:13:45,220
If you do the guard, you have to

265
00:13:45,220 --> 00:13:48,650
have the guardian happen before the line
in question.
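A minimal Python 3 sketch of the guardian line just described, placed before the `words[0]` check. The sample lines and the function name `days_guarded` are made up for the example, and the days are collected in a list instead of printed.

```python
# Made-up lines in the style of mbox-short.txt (assumption, not the real file).
SAMPLE_LINES = [
    "From someone@example.com Sat Jan  5 09:14:16 2008\n",
    "\n",
    "From: someone@example.com\n",
    "From another@example.com Fri Jan  4 18:10:48 2008\n",
]


def days_guarded(lines):
    """Collect the day word from 'From' lines, skipping blank lines safely."""
    days = []
    for line in lines:
        line = line.rstrip()
        words = line.split()
        if words == []:           # guardian: a blank line gives an empty list
            continue
        if words[0] != "From":    # safe now, words has at least one element
            continue
        days.append(words[2])
    return days


print(days_guarded(SAMPLE_LINES))  # ['Sat', 'Fri']
```

Swapping the two `if` statements would bring the traceback back, because `words[0]` would run before the guard.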

266
00:13:48,650 --> 00:13:49,520
So now let's run it.

267
00:13:52,360 --> 00:13:53,850
Oops.

268
00:13:53,850 --> 00:13:56,280
Let's run it, and now it works great.

269
00:13:56,280 --> 00:14:02,860
It's, we got a little too much crap, so we
better get rid of this print statement.

270
00:14:04,150 --> 00:14:05,570
It didn't give me a traceback, right?

271
00:14:05,570 --> 00:14:08,690
So now I'm going to get rid of that print
statement and look at that.

272
00:14:08,690 --> 00:14:11,900
I'm getting the line Saturday, Friday,
Friday, Friday, Friday, Thursday.

273
00:14:11,900 --> 00:14:14,140
So it doesn't look like these people work
weekends.

274
00:14:14,140 --> 00:14:18,320
But it does look like they like finishing
up stuff toward the end of the week.

275
00:14:18,320 --> 00:14:19,620
So, there we go.

276
00:14:19,620 --> 00:14:24,000
I mean, it's a little bit of, it's not
much data to conclude anything solidly.

277
00:14:24,000 --> 00:14:28,140
Maybe if we look at the bigger mailbox
file, we can find something else.

278
00:14:29,300 --> 00:14:30,590
Okay, so now it works.

279
00:14:30,590 --> 00:14:32,930
Let me show you a couple of other things

280
00:14:32,930 --> 00:14:34,890
that you could have used as a guardian
pattern.

281
00:14:34,890 --> 00:14:37,910
So we could say if the words we got back is
an empty list.

282
00:14:37,910 --> 00:14:43,640
I could say if the number of elements in
words is less than one, continue.

283
00:14:43,640 --> 00:14:47,680
That basically says, look, if the words
that I got, if I got fewer than

284
00:14:47,680 --> 00:14:49,230
one word, then I'm not going to look

285
00:14:49,230 --> 00:14:51,680
for the first word, because that's quite
dangerous.

286
00:14:51,680 --> 00:14:54,810
So if I run that, it again works.
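The same guard written with a length check, as a Python 3 sketch. The sample lines and the name `days_guarded_len` are made up for the example.

```python
# Made-up lines in the style of mbox-short.txt (assumption, not the real file).
SAMPLE_LINES = [
    "From someone@example.com Sat Jan  5 09:14:16 2008\n",
    "\n",
    "Received: from example.com\n",
]


def days_guarded_len(lines):
    """Collect the day word from 'From' lines, guarding with len(words)."""
    days = []
    for line in lines:
        words = line.rstrip().split()
        if len(words) < 1:        # guardian: fewer than one word, skip the line
            continue
        if words[0] != "From":
            continue
        days.append(words[2])
    return days


print(days_guarded_len(SAMPLE_LINES))  # ['Sat']
```

One possible extension, not covered in the video: a line holding only the word `From` would still fail at `words[2]`, and comparing the length against 3 instead of 1 would guard that case as well.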

287
00:14:54,810 --> 00:15:03,280
Another more direct, even potentially more
direct, would be if let's move this up.

288
00:15:06,770 --> 00:15:13,630
I could say if line is equal to empty
string, continue.

289
00:15:14,690 --> 00:15:17,590
So now what it says is, you know, if I'm
getting a blank line,

290
00:15:17,590 --> 00:15:19,220
I'm not even going to bother splitting it

291
00:15:19,220 --> 00:15:21,240
because I know exactly what's going to
happen.

292
00:15:21,240 --> 00:15:22,455
So let's see if this works.

293
00:15:29,455 --> 00:15:31,560
[SOUND]
So this one works as well.

294
00:15:31,560 --> 00:15:37,310
So now I've got, you know, a couple of
codes, a couple of code paths through here.

295
00:15:39,030 --> 00:15:43,650
Right? If, if the blank line, I do it after
the strip to make sure it is really empty.

296
00:15:43,650 --> 00:15:47,060
I skip blank lines this way, I come all
the way down

297
00:15:47,060 --> 00:15:50,360
and split and then skip lines that don't
start with from this way.

298
00:15:50,360 --> 00:15:53,280
And the only way I make it all the way
down to here is if it is

299
00:15:53,280 --> 00:15:58,620
a non-blank line and if the first word is
a From, and then I do my thing.
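A Python 3 sketch of this last variant, where the guard moves up and tests the stripped line before it is split. The sample lines and the name `days_guarded_line` are made up for the example.

```python
# Made-up lines in the style of mbox-short.txt (assumption, not the real file).
SAMPLE_LINES = [
    "From someone@example.com Sat Jan  5 09:14:16 2008\n",
    "   \n",   # only spaces: rstrip() leaves an empty string
    "\n",
    "Subject: a commit message\n",
    "From another@example.com Thu Jan  3 16:23:48 2008\n",
]


def days_guarded_line(lines):
    """Collect the day word from 'From' lines, skipping blank lines before split."""
    days = []
    for line in lines:
        line = line.rstrip()      # strip first so the line is really empty
        if line == "":            # guardian: do not even bother splitting
            continue
        words = line.split()
        if words[0] != "From":    # skip lines that do not start with From
            continue
        days.append(words[2])     # only non-blank From lines get this far
    return days


print(days_guarded_line(SAMPLE_LINES))  # ['Sat', 'Thu']
```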

300
00:15:58,620 --> 00:16:02,370
So that basically is the notion of

301
00:16:05,230 --> 00:16:08,330
the guardian pattern, and the pattern is
simply,

302
00:16:11,340 --> 00:16:15,980
if there is some code that might have a
problem depending on perhaps user input,

303
00:16:15,980 --> 00:16:18,200
put something in before it

304
00:16:18,200 --> 00:16:20,755
that makes sure that you never get to the
dangerous code.

305
00:16:20,755 --> 00:16:22,980
And don't use a try/except for this.

306
00:16:22,980 --> 00:16:27,356
That just would be tacky, tacky, tacky,
tacky.

307
00:16:27,356 --> 00:16:32,600
So that's the end of this
presentation.

308
00:16:32,600 --> 00:16:34,860
So thanks a lot, and see you on the net.