\documentstyle[twoside,11pt,myformat]{report}

\title{Python Tutorial}
Python is a simple, yet powerful programming language that bridges the
gap between C and shell programming, and is thus ideally suited for
``throw-away programming'' and rapid prototyping. Its syntax is put
together from constructs borrowed from a variety of other languages;
most prominent are influences from ABC, C, Modula-3 and Icon.

The Python interpreter is easily extended with new functions and data
types implemented in C. Python is also suitable as an extension
language for highly customizable C applications such as editors or

Python is available for various operating systems, amongst which
several flavors of {\UNIX}, Amoeba, the Apple Macintosh O.S.,

This tutorial introduces the reader informally to the basic concepts
and features of the Python language and system. It helps to have a
Python interpreter handy for hands-on experience, but as the examples
are self-contained, the tutorial can be read off-line as well.

For a description of standard objects and modules, see the {\em Python
Library Reference} document. The {\em Python Reference Manual} gives
a more formal definition of the language.
\pagenumbering{arabic}

\chapter{Whetting Your Appetite}
If you have ever written a large shell script, you probably know this
feeling: you'd love to add yet another feature, but it's already so
slow, and so big, and so complicated; or the feature involves a system
call or other function that is only accessible from C \ldots Usually
the problem at hand isn't serious enough to warrant rewriting the
script in C; perhaps because the problem requires variable-length
strings or other data types (like sorted lists of file names) that are
easy in the shell but lots of work to implement in C; or perhaps just
because you're not sufficiently familiar with C.

In such cases, Python may be just the language for you. Python is
simple to use, but it is a real programming language, offering much
more structure and support for large programs than the shell has. On
the other hand, it also offers much more error checking than C, and,
being a {\em very-high-level language}, it has high-level data types
built in, such as flexible arrays and dictionaries that would cost you
days to implement efficiently in C. Because of its more general data
types Python is applicable to a much larger problem domain than {\em
Awk} or even {\em Perl}, yet many things are at least as easy in
Python as in those languages.
Python allows you to split up your program into modules that can be
reused in other Python programs. It comes with a large collection of
standard modules that you can use as the basis of your programs, or
as examples to start learning to program in Python. There are also
built-in modules that provide things like file I/O, system calls,
sockets, and even a generic interface to window systems (STDWIN).

Python is an interpreted language, which can save you considerable time
during program development because no compilation and linking is
necessary. The interpreter can be used interactively, which makes it
easy to experiment with features of the language, to write throw-away
programs, or to test functions during bottom-up program development.
It is also a handy desk calculator.
Python allows writing very compact and readable programs. Programs
written in Python are typically much shorter than equivalent C
programs, for several reasons:
\begin{itemize}
\item
the high-level data types allow you to express complex operations in a
single statement;
\item
statement grouping is done by indentation instead of begin/end;
\item
no variable or argument declarations are necessary.
\end{itemize}
Python is {\em extensible}: if you know how to program in C it is easy
to add a new built-in module to the interpreter, either to
perform critical operations at maximum speed, or to link Python
programs to libraries that may only be available in binary form (such
as a vendor-specific graphics library). Once you are really hooked,
you can link the Python interpreter into an application written in C
and use it as an extension or command language for that application.

By the way, the language is named after the BBC show ``Monty
Python's Flying Circus'' and has nothing to do with nasty reptiles...
\section{Where From Here}

Now that you are all excited about Python, you'll want to examine it
in some more detail. Since the best way to learn a language is
to use it, you are invited here to do so.

In the next chapter, the mechanics of using the interpreter are
explained. This is rather mundane information, but essential for
trying out the examples shown later.

The rest of the tutorial introduces various features of the Python
language and system through examples, beginning with simple
expressions, statements and data types, through functions and modules,
and finally touching upon advanced concepts like exceptions
and user-defined classes.

When you're through with the tutorial (or just getting bored), you
should read the Library Reference, which gives complete (though terse)
reference material about built-in and standard types, functions and
modules that can save you a lot of time when writing Python programs.
142 \chapter{Using the Python Interpreter
}
144 \section{Invoking the Interpreter
}
146 The Python interpreter is usually installed as
{\tt /usr/local/bin/python
}
147 on those machines where it is available; putting
{\tt /usr/local/bin
} in
148 your
{\UNIX} shell's search path makes it possible to start it by
151 \bcode\begin{verbatim
}
155 to the shell. Since the choice of the directory where the interpreter
156 lives is an installation option, other places are possible; check with
157 your local Python guru or system administrator. (E.g.,
{\tt
158 /usr/local/python
} is a popular alternative location.)
The interpreter operates somewhat like the {\UNIX} shell: when called
with standard input connected to a tty device, it reads and executes
commands interactively; when called with a file name argument or with
a file as standard input, it reads and executes a {\em script} from
that file.

A third way of starting the interpreter is
``{\tt python -c command [arg] ...}'', which
executes the statement(s) in {\tt command}, analogous to the shell's
{\tt -c} option. Since Python statements often contain spaces or other
characters that are special to the shell, it is best to quote {\tt
command} in its entirety with double quotes.

Note that there is a difference between ``{\tt python file}'' and
``{\tt python $<$file}''. In the latter case, input requests from the
program, such as calls to {\tt input()} and {\tt raw\_input()}, are
satisfied from {\em file}. Since this file has already been read
until the end by the parser before the program starts executing, the
program will encounter EOF immediately. In the former case (which is
usually what you want) they are satisfied from whatever file or device
is connected to standard input of the Python interpreter.
182 When a script file is used, it is sometimes useful to be able to run
183 the script and enter interactive mode afterwards. This can be done by
184 passing
{\tt -i
} before the script. (This does not work if the script
185 is read from standard input, for the same reason as explained in the
188 \subsection{Argument Passing
}
190 When known to the interpreter, the script name and additional
191 arguments thereafter are passed to the script in the variable
{\tt
192 sys.argv
}, which is a list of strings. Its length is at least one;
193 when no script and no arguments are given,
{\tt sys.argv
[0]} is an
194 empty string. When the script name is given as
{\tt '-'
} (meaning
195 standard input),
{\tt sys.argv
[0]} is set to
{\tt '-'
}. When
{\tt -c
196 command
} is used,
{\tt sys.argv
[0]} is set to
{\tt '-c'
}. Options
197 found after
{\tt -c command
} are not consumed by the Python
198 interpreter's option processing but left in
{\tt sys.argv
} for the
201 \subsection{Interactive Mode
}
203 When commands are read from a tty, the interpreter is said to be in
204 {\em interactive\ mode
}. In this mode it prompts for the next command
205 with the
{\em primary\ prompt
}, usually three greater-than signs (
{\tt
206 >>>
}); for continuation lines it prompts with the
{\em secondary\
207 prompt
}, by default three dots (
{\tt ...
}). Typing an EOF (Control-D)
208 at the primary prompt causes the interpreter to exit with a zero exit
211 The interpreter prints a welcome message stating its version number
212 and a copyright notice before printing the first prompt, e.g.:
214 \bcode\begin{verbatim
}
Python 1.1 (Oct 6 1994)
Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam
\section{The Interpreter and its Environment}

\subsection{Error Handling}

When an error occurs, the interpreter prints an error
message and a stack trace. In interactive mode, it then returns to
the primary prompt; when input came from a file, it exits with a
nonzero exit status after printing
the stack trace. (Exceptions handled by an {\tt except} clause in a
{\tt try} statement are not errors in this context.) Some errors are
unconditionally fatal and cause an exit with a nonzero exit status; this
applies to internal inconsistencies and some cases of running out of
memory. All error messages are written to the standard error stream;
normal output from the executed commands is written to standard
output.
237 Typing the interrupt character (usually Control-C or DEL) to the
238 primary or secondary prompt cancels the input and returns to the
241 A problem with the GNU Readline package may prevent this.
243 Typing an interrupt while a command is executing raises the
{\tt
244 KeyboardInterrupt
} exception, which may be handled by a
{\tt try
}
\subsection{The Module Search Path}

When a module named {\tt spam} is imported, the interpreter searches
for a file named {\tt spam.py} in the list of directories specified by
the environment variable {\tt PYTHONPATH}. It has the same syntax as
the {\UNIX} shell variable {\tt PATH}, i.e., a colon-separated list of
directory names. When {\tt PYTHONPATH} is not set, or when the file
is not found there, the search continues in an installation-dependent
default path, usually {\tt .:/usr/local/lib/python}.

Actually, modules are searched for in the list of directories given by
the variable {\tt sys.path}, which is initialized from {\tt PYTHONPATH}
and the installation-dependent default. This allows Python programs that
know what they're doing to modify or replace the module search path.
See the section on Standard Modules later.
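
As a hedged illustration of a program adjusting the search path, here is
a minimal sketch. It is written in present-day Python 3 syntax, which is
an assumption on my part, since this tutorial describes an older
interpreter. The directory name and the helper {\tt addsearchdir()} are
hypothetical; only {\tt sys.path} itself comes from the text above.

```python
import sys

def addsearchdir(directory, path=None):
    """Put `directory` at the front of a module search path, unless present."""
    if path is None:
        path = sys.path          # default: the interpreter's real search path
    if directory not in path:
        path.insert(0, directory)
    return path

# Work on a copy so the demonstration leaves the running interpreter alone.
demo_path = addsearchdir('/hypothetical/pylib', list(sys.path))
print(demo_path[0])
```

Inserting at the front means the new directory is searched before the
defaults; appending instead would make it a fallback.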
263 \subsection{``Compiled'' Python files
}
265 As an important speed-up of the start-up time for short programs that
266 use a lot of standard modules, if a file called
{\tt spam.pyc
} exists
267 in the directory where
{\tt spam.py
} is found, this is assumed to
268 contain an already-``compiled'' version of the module
{\tt spam
}. The
269 modification time of the version of
{\tt spam.py
} used to create
{\tt
270 spam.pyc
} is recorded in
{\tt spam.pyc
}, and the file is ignored if
273 Whenever
{\tt spam.py
} is successfully compiled, an attempt is made to
274 write the compiled version to
{\tt spam.pyc
}. It is not an error if
275 this attempt fails; if for any reason the file is not written
276 completely, the resulting
{\tt spam.pyc
} file will be recognized as
277 invalid and thus ignored later.
\subsection{Executable Python scripts}

On BSD'ish {\UNIX} systems, Python scripts can be made directly
executable, like shell scripts, by putting the line

\bcode\begin{verbatim}
#! /usr/local/bin/python
\end{verbatim}\ecode

(assuming that's the name of the interpreter) at the beginning of the
script and giving the file an executable mode. The {\tt \#!} must be
the first two characters of the file.
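
To make this concrete, here is a small sketch of an executable script
that also inspects {\tt sys.argv}. It uses Python 3 syntax, and its first
line names {\tt /usr/bin/env python3}; both are assumptions about a
modern installation, not the path used above. The function
{\tt describeargs()} is a hypothetical helper.

```python
#!/usr/bin/env python3
# The interpreter path above is an assumption; use whatever path your
# installation provides, as discussed in the text.
import sys

def describeargs(argv):
    """Summarize a sys.argv-style list: script name first, arguments after."""
    script = argv[0] or '<no script>'   # argv[0] may be an empty string
    extra = argv[1:]
    return '%s got %d argument(s): %s' % (script, len(extra), extra)

if __name__ == '__main__':
    print(describeargs(sys.argv))
```

After giving the file an executable mode, the shell can start it by name
and any further words on the command line end up in {\tt sys.argv}.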
292 \subsection{The Interactive Startup File
}
294 When you use Python interactively, it is frequently handy to have some
295 standard commands executed every time the interpreter is started. You
296 can do this by setting an environment variable named
{\tt
297 PYTHONSTARTUP
} to the name of a file containing your start-up
298 commands. This is similar to the
{\tt .profile
} feature of the UNIX
301 This file is only read in interactive sessions, not when Python reads
302 commands from a script, and not when
{\tt /dev/tty
} is given as the
303 explicit source of commands (which otherwise behaves like an
304 interactive session). It is executed in the same name space where
305 interactive commands are executed, so that objects that it defines or
306 imports can be used without qualification in the interactive session.
307 You can also change the prompts
{\tt sys.ps1
} and
{\tt sys.ps2
} in
310 If you want to read an additional start-up file from the current
311 directory, you can program this in the global start-up file, e.g.
312 \verb\execfile('.pythonrc')\. If you want to use the startup file
313 in a script, you must write this explicitly in the script, e.g.
314 \verb\import os;\
\verb\execfile(os.environ
['PYTHONSTARTUP'
])\.
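
For readers on a current interpreter, here is a hedged sketch of the
same idea. Python 3 has no {\tt execfile()}, so the sketch reads the
file and passes it to {\tt exec()} instead; that substitution, the
helper name {\tt runstartup()} and the temporary file are all
assumptions made for illustration.

```python
import os
import tempfile

def runstartup(namespace, environ=os.environ):
    """Execute the file named by PYTHONSTARTUP in `namespace`, if there is one."""
    filename = environ.get('PYTHONSTARTUP')
    if not filename or not os.path.isfile(filename):
        return False
    with open(filename) as f:
        source = f.read()
    # Compiling with the file name keeps tracebacks pointing at the file.
    exec(compile(source, filename, 'exec'), namespace)
    return True

# Demonstration with a temporary start-up file and a fake environment.
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as tmp:
    tmp.write("greeting = 'hello from the start-up file'\n")
session = {}
ran = runstartup(session, {'PYTHONSTARTUP': tmp.name})
os.remove(tmp.name)
print(ran, session.get('greeting'))
```

Names defined by the start-up file land in the dictionary passed in,
which mirrors the way the interactive name space is shared.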
316 \section{Interactive Input Editing and History Substitution
}
318 Some versions of the Python interpreter support editing of the current
319 input line and history substitution, similar to facilities found in
320 the Korn shell and the GNU Bash shell. This is implemented using the
321 {\em GNU\ Readline
} library, which supports Emacs-style and vi-style
322 editing. This library has its own documentation which I won't
323 duplicate here; however, the basics are easily explained.
325 Perhaps the quickest check to see whether command line editing is
326 supported is typing Control-P to the first Python prompt you get. If
327 it beeps, you have command line editing. If nothing appears to
328 happen, or if
\verb/^P/ is echoed, you can skip the rest of this
331 \subsection{Line Editing
}
333 If supported, input line editing is active whenever the interpreter
334 prints a primary or secondary prompt. The current line can be edited
335 using the conventional Emacs control characters. The most important
336 of these are: C-A (Control-A) moves the cursor to the beginning of the
337 line, C-E to the end, C-B moves it one position to the left, C-F to
338 the right. Backspace erases the character to the left of the cursor,
339 C-D the character to its right. C-K kills (erases) the rest of the
340 line to the right of the cursor, C-Y yanks back the last killed
341 string. C-underscore undoes the last change you made; it can be
342 repeated for cumulative effect.
344 \subsection{History Substitution
}
346 History substitution works as follows. All non-empty input lines
347 issued are saved in a history buffer, and when a new prompt is given
348 you are positioned on a new line at the bottom of this buffer. C-P
349 moves one line up (back) in the history buffer, C-N moves one down.
350 Any line in the history buffer can be edited; an asterisk appears in
351 front of the prompt to mark a line as modified. Pressing the Return
352 key passes the current line to the interpreter. C-R starts an
353 incremental reverse search; C-S starts a forward search.
355 \subsection{Key Bindings
}
357 The key bindings and some other parameters of the Readline library can
358 be customized by placing commands in an initialization file called
359 {\tt \$HOME/.inputrc
}. Key bindings have the form
361 \bcode\begin{verbatim
}
362 key-name: function-name
367 \bcode\begin{verbatim
}
368 "string": function-name
371 and options can be set with
373 \bcode\begin{verbatim
}
374 set option-name value
379 \bcode\begin{verbatim
}
380 # I prefer vi-style editing:
382 # Edit using a single line:
383 set horizontal-scroll-mode On
385 Meta-h: backward-kill-word
386 "
\C-u": universal-argument
387 "
\C-x
\C-r": re-read-init-file
390 Note that the default binding for TAB in Python is to insert a TAB
391 instead of Readline's default filename completion function. If you
392 insist, you can override this by putting
394 \bcode\begin{verbatim
}
398 in your
{\tt \$HOME/.inputrc
}. (Of course, this makes it hard to type
399 indented continuation lines...)
401 \subsection{Commentary
}
403 This facility is an enormous step forward compared to previous
404 versions of the interpreter; however, some wishes are left: It would
405 be nice if the proper indentation were suggested on continuation lines
406 (the parser knows if an indent token is required next). The
407 completion mechanism might use the interpreter's symbol table. A
408 command to check (or even suggest) matching parentheses, quotes etc.
409 would also be useful.
412 \chapter{An Informal Introduction to Python
}
414 In the following examples, input and output are distinguished by the
415 presence or absence of prompts (
{\tt >>>
} and
{\tt ...
}): to repeat
416 the example, you must type everything after the prompt, when the
417 prompt appears; lines that do not begin with a prompt are output from
420 I'd prefer to use different fonts to distinguish input
421 from output, but the amount of LaTeX hacking that would require
422 is currently beyond my ability.
424 Note that a secondary prompt on a line by itself in an example means
425 you must type a blank line; this is used to end a multi-line command.
427 \section{Using Python as a Calculator
}
429 Let's try some simple Python commands. Start the interpreter and wait
430 for the primary prompt,
{\tt >>>
}. (It shouldn't take long.)
434 The interpreter acts as a simple calculator: you can type an
435 expression at it and it will write the value. Expression syntax is
436 straightforward: the operators
{\tt +
},
{\tt -
},
{\tt *
} and
{\tt /
}
437 work just like in most other languages (e.g., Pascal or C); parentheses
438 can be used for grouping. For example:
440 \bcode\begin{verbatim
}
443 >>> # This is a comment
446 >>>
2+
2 # and a comment on the same line as code
450 >>> # Integer division returns the floor:
458 Like in C, the equal sign (
{\tt =
}) is used to assign a value to a
459 variable. The value of an assignment is not written:
461 \bcode\begin{verbatim
}
469 A value can be assigned to several variables simultaneously:
471 \bcode\begin{verbatim
}
472 >>> x = y = z =
0 # Zero x, y and z
482 There is full support for floating point; operators with mixed type
483 operands convert the integer operand to floating point:
485 \bcode\begin{verbatim
}
495 Besides numbers, Python can also manipulate strings, enclosed in
496 single quotes or double quotes:
498 \bcode\begin{verbatim
}
505 >>> '"Yes," he said.'
507 >>> "\"Yes,\" he said."
509 >>> '"Isn\'t," she said.'
510 '"Isn\'t," she said.'
Strings are written the same way as they are typed for input: inside
quotes and with quotes and other funny characters escaped by backslashes,
to show the precise value. The string is enclosed in double quotes if
the string contains a single quote and no double quotes, else it's
enclosed in single quotes. (The {\tt print} statement, described later,
can be used to write strings without quotes or escapes.)
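
The quoting rule can be checked with a short sketch. It assumes Python 3,
where {\tt print} is a function rather than a statement, and it uses the
built-in {\tt repr()} to obtain the quoted form that the interpreter
echoes; the sample strings are merely illustrative.

```python
samples = ['spam eggs', "doesn't", '"Yes," he said.', '"Isn\'t," she said.']

# repr() gives the quoted, escaped form; print() shows the bare text.
quoted = [repr(s) for s in samples]
for text, shown in zip(samples, quoted):
    print(shown, '->', text)
```

Only the string containing a single quote and no double quotes comes
back in double quotes; the one containing both kinds is shown in single
quotes with the inner single quote escaped.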
521 Strings can be concatenated (glued together) with the
{\tt +
}
522 operator, and repeated with
{\tt *
}:
524 \bcode\begin{verbatim
}
525 >>> word = 'Help' + 'A'
528 >>> '<' + word*
5 + '>'
529 '<HelpAHelpAHelpAHelpAHelpA>'
Strings can be subscripted (indexed); like in C, the first character of
a string has subscript (index) 0.

There is no separate character type; a character is simply a string of
size one. Like in Icon, substrings can be specified with the {\em
slice} notation: two indices separated by a colon.
540 \bcode\begin{verbatim
}
Slice indices have useful defaults; an omitted first index defaults to
zero, an omitted second index defaults to the size of the string being
sliced.
554 \bcode\begin{verbatim
}
555 >>> word
[:
2] # The first two characters
557 >>> word
[2:
] # All but the first two characters
Here's a useful invariant of slice operations: \verb\s[:i] + s[i:]\
equals \verb\s\.
565 \bcode\begin{verbatim
}
566 >>> word
[:
2] + word
[2:
]
568 >>> word
[:
3] + word
[3:
]
573 Degenerate slice indices are handled gracefully: an index that is too
574 large is replaced by the string size, an upper bound smaller than the
575 lower bound returns an empty string.
577 \bcode\begin{verbatim
}
587 Indices may be negative numbers, to start counting from the right.
590 \bcode\begin{verbatim
}
591 >>> word
[-
1] # The last character
593 >>> word
[-
2] # The last-but-one character
595 >>> word
[-
2:
] # The last two characters
597 >>> word
[:-
2] # All but the last two characters
But note that -0 is really the same as 0, so it does not count from
the right!
605 \bcode\begin{verbatim
}
606 >>> word
[-
0] # (since -
0 equals
0)
611 Out-of-range negative slice indices are truncated, but don't try this
612 for single-element (non-slice) indices:
614 \bcode\begin{verbatim
}
617 >>> word
[-
10] # error
618 Traceback (innermost last):
619 File "<stdin>", line
1
620 IndexError: string index out of range
The best way to remember how slices work is to think of the indices as
pointing {\em between} characters, with the left edge of the first
character numbered 0. Then the right edge of the last character of a
string of {\tt n} characters has index {\tt n}, for example:

\bcode\begin{verbatim}
 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1
\end{verbatim}\ecode

The first row of numbers gives the position of the indices 0...5 in
the string; the second row gives the corresponding negative indices.
The slice from \verb\i\ to \verb\j\ consists of all characters between
the edges labeled \verb\i\ and \verb\j\, respectively.

For nonnegative indices, the length of a slice is the difference of
the indices, if both are within bounds, e.g., the length of
\verb\word[1:3]\ is 2.
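
The slice rules described so far can be collected in one small sketch.
It assumes Python 3 syntax for {\tt print}; the variable names other
than {\tt word} are my own.

```python
word = 'HelpA'

firsttwo = word[:2]      # omitted first index defaults to 0
rest = word[2:]          # omitted second index defaults to len(word)
lasttwo = word[-2:]      # negative indices count from the right
clipped = word[1:100]    # a too-large index is replaced by the string size
empty = word[3:1]        # upper bound below lower bound gives ''

# The invariant s[:i] + s[i:] == s holds at every edge, both ends included.
for i in range(len(word) + 1):
    assert word[:i] + word[i:] == word

print(firsttwo, rest, lasttwo, clipped, repr(empty), len(word[1:3]))
```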
646 The built-in function
{\tt len()
} returns the length of a string:
648 \bcode\begin{verbatim
}
649 >>> s = 'supercalifragilisticexpialidocious'
657 Python knows a number of
{\em compound
} data types, used to group
658 together other values. The most versatile is the
{\em list
}, which
659 can be written as a list of comma-separated values (items) between
660 square brackets. List items need not all have the same type.
662 \bcode\begin{verbatim
}
663 >>> a =
['spam', 'eggs',
100,
1234]
665 ['spam', 'eggs',
100,
1234]
669 Like string indices, list indices start at
0, and lists can be sliced,
670 concatenated and so on:
672 \bcode\begin{verbatim
}
681 >>> a
[:
2] +
['bacon',
2*
2]
682 ['spam', 'eggs', 'bacon',
4]
683 >>>
3*a
[:
3] +
['Boe!'
]
684 ['spam', 'eggs',
100, 'spam', 'eggs',
100, 'spam', 'eggs',
100, 'Boe!'
]
688 Unlike strings, which are
{\em immutable
}, it is possible to change
689 individual elements of a list:
691 \bcode\begin{verbatim
}
693 ['spam', 'eggs',
100,
1234]
696 ['spam', 'eggs',
123,
1234]
700 Assignment to slices is also possible, and this can even change the size
703 \bcode\begin{verbatim
}
704 >>> # Replace some items:
713 ... a
[1:
1] =
['bletch', 'xyzzy'
]
715 [123, 'bletch', 'xyzzy',
1234]
716 >>> a
[:
0] = a # Insert (a copy of) itself at the beginning
718 [123, 'bletch', 'xyzzy',
1234,
123, 'bletch', 'xyzzy',
1234]
722 The built-in function
{\tt len()
} also applies to lists:
724 \bcode\begin{verbatim
}
730 It is possible to nest lists (create lists containing other lists),
733 \bcode\begin{verbatim
}
742 >>> p
[1].append('xtra') # See section
5.1
744 [1,
[2,
3, 'xtra'
],
4]
Note that in the last example, {\tt p[1]} and {\tt q} really refer to
the same object! We'll come back to {\em object semantics} later.
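
A hedged sketch of what sharing an object means in practice, in Python 3
syntax. The names {\tt p} and {\tt q} follow the text; {\tt pcopy} is a
hypothetical extra that uses a slice to copy the inner list.

```python
q = [2, 3]
p = [1, q, 4]            # p[1] is the very same list object as q

p[1].append('xtra')      # visible through both names
sameobject = p[1] is q

pcopy = [1, q[:], 4]     # q[:] makes a separate copy of the inner list
q.append('more')         # changes q and p[1], but not pcopy[1]

print(p, q, pcopy, sameobject)
```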
753 \section{First Steps Towards Programming
}
755 Of course, we can use Python for more complicated tasks than adding
756 two and two together. For instance, we can write an initial
757 subsequence of the
{\em Fibonacci
} series as follows:
759 \bcode\begin{verbatim
}
760 >>> # Fibonacci series:
761 ... # the sum of two elements defines the next
This example introduces several new features.

The first line contains a {\em multiple assignment}: the variables
{\tt a} and {\tt b} simultaneously get the new values 0 and 1. On the
last line this is used again, demonstrating that the expressions on
the right-hand side are all evaluated first before any of the
assignments take place.

The {\tt while} loop executes as long as the condition (here: {\tt b <
10}) remains true. In Python, like in C, any non-zero integer value is
true; zero is false. The condition may also be a string or list value,
in fact any sequence; anything with a non-zero length is true, empty
sequences are false. The test used in the example is a simple
comparison. The standard comparison operators are written the same as
in C: {\tt <}, {\tt >}, {\tt ==}, {\tt <=}, {\tt >=} and {\tt !=}.

The {\em body} of the loop is {\em indented}: indentation is Python's
way of grouping statements. Python does not (yet!) provide an
intelligent input line editing facility, so you have to type a tab or
space(s) for each indented line. In practice you will prepare more
complicated input for Python with a text editor; most text editors have
an auto-indent facility. When a compound statement is entered
interactively, it must be followed by a blank line to indicate
completion (since the parser cannot guess when you have typed the last
line).
808 The
{\tt print
} statement writes the value of the expression(s) it is
809 given. It differs from just writing the expression you want to write
810 (as we did earlier in the calculator examples) in the way it handles
811 multiple expressions and strings. Strings are printed without quotes,
812 and a space is inserted between items, so you can format things nicely,
815 \bcode\begin{verbatim
}
817 >>> print 'The value of i is', i
818 The value of i is
65536
822 A trailing comma avoids the newline after the output:
824 \bcode\begin{verbatim
}
830 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
834 Note that the interpreter inserts a newline before it prints the next
835 prompt if the last line was not completed.
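
The features discussed above (multiple assignment, the {\tt while} loop
and output without a newline) can be put together in one sketch. It
assumes Python 3, where suppressing the newline is spelled
{\tt end=' '} instead of a trailing comma; the function name
{\tt fibbelow()} is hypothetical.

```python
def fibbelow(limit):
    """Return the Fibonacci numbers smaller than `limit`, starting 1, 1."""
    values = []
    a, b = 0, 1                 # multiple assignment
    while b < limit:
        values.append(b)
        a, b = b, a + b         # right-hand side is evaluated first
    return values

# The same evaluation order makes swapping two variables a one-liner.
x, y = 1, 2
x, y = y, x

for value in fibbelow(1000):
    print(value, end=' ')       # stay on one line
print()
```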
840 \chapter{More Control Flow Tools
}
842 Besides the
{\tt while
} statement just introduced, Python knows the
843 usual control flow statements known from other languages, with some
846 \section{If Statements
}
848 Perhaps the most well-known statement type is the
{\tt if
} statement.
851 \bcode\begin{verbatim
}
854 ... print 'Negative changed to zero'
There can be zero or more {\tt elif} parts, and the {\tt else} part is
optional. The keyword `{\tt elif}' is short for `{\tt else if}', and is
useful to avoid excessive indentation. An {\tt if...elif...elif...}
sequence is a substitute for the {\em switch} or {\em case} statements
found in other languages.
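
As a small sketch of an {\tt if...elif...else} chain doing the work of a
switch, consider the following. It uses Python 3 syntax, and the
function {\tt classify()} with its labels is a hypothetical example, not
taken from the tutorial's own listing.

```python
def classify(x):
    """Pick exactly one branch, tested from top to bottom."""
    if x < 0:
        return 'negative'
    elif x == 0:
        return 'zero'
    elif x == 1:
        return 'single'
    else:
        return 'more'

for number in (-5, 0, 1, 42):
    print(number, classify(number))
```

All branches sit at the same indentation level, which is exactly what
{\tt elif} buys you over nested {\tt else: if} constructs.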
\section{For Statements}

The {\tt for} statement in Python differs a bit from what you may be
used to in C or Pascal. Rather than always iterating over an
arithmetic progression of numbers (like in Pascal), or leaving the user
completely free in the iteration test and step (as in C), Python's {\tt
for} statement iterates over the items of any sequence (e.g., a list
or a string), in the order that they appear in the sequence. For
example (no pun intended):
880 \bcode\begin{verbatim
}
881 >>> # Measure some strings:
882 ... a =
['cat', 'window', 'defenestrate'
]
892 It is not safe to modify the sequence being iterated over in the loop
893 (this can only happen for mutable sequence types, i.e., lists). If
894 you need to modify the list you are iterating over, e.g., duplicate
895 selected items, you must iterate over a copy. The slice notation
896 makes this particularly convenient:
898 \bcode\begin{verbatim
}
899 >>> for x in a
[:
]: # make a slice copy of the entire list
900 ... if len(x) >
6: a.insert(
0, x)
903 ['defenestrate', 'cat', 'window', 'defenestrate'
]
\section{The {\tt range()} Function}

If you do need to iterate over a sequence of numbers, the built-in
function {\tt range()} comes in handy. It generates lists containing
arithmetic progressions, e.g.:

\bcode\begin{verbatim}
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
\end{verbatim}\ecode

The given end point is never part of the generated list; {\tt range(10)}
generates a list of 10 values, exactly the legal indices for items of a
sequence of length 10. It is possible to let the range start at another
number, or to specify a different increment (even negative):
924 \bcode\begin{verbatim
}
929 >>> range(-
10, -
100, -
30)
934 To iterate over the indices of a sequence, combine
{\tt range()
} and
935 {\tt len()
} as follows:
937 \bcode\begin{verbatim
}
938 >>> a =
['Mary', 'had', 'a', 'little', 'lamb'
]
939 >>> for i in range(len(a)):
\section{Break and Continue Statements, and Else Clauses on Loops}

The {\tt break} statement, like in C, breaks out of the smallest
enclosing {\tt for} or {\tt while} loop.

The {\tt continue} statement, also borrowed from C, continues with the
next iteration of the loop.

Loop statements may have an {\tt else} clause; it is executed when the
loop terminates through exhaustion of the list (with {\tt for}) or when
the condition becomes false (with {\tt while}), but not when the loop is
terminated by a {\tt break} statement. This is exemplified by the
following loop, which searches for prime numbers:
964 \bcode\begin{verbatim
}
965 >>> for n in range(
2,
10):
966 ... for x in range(
2, n):
968 ... print n, 'equals', x, '*', n/x
971 ... print n, 'is a prime number'
984 \section{Pass Statements
}
986 The
{\tt pass
} statement does nothing.
987 It can be used when a statement is required syntactically but the
988 program requires no action.
991 \bcode\begin{verbatim
}
993 ... pass # Busy-wait for keyboard interrupt
997 \section{Defining Functions
}
999 We can create a function that writes the Fibonacci series to an
1002 \bcode\begin{verbatim
}
1003 >>> def fib(n): # write Fibonacci series up to n
1009 >>> # Now call the function we just defined:
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
\end{verbatim}\ecode
The keyword {\tt def} introduces a function {\em definition}. It must
be followed by the function name and the parenthesized list of formal
parameters. The statements that form the body of the function start at
the next line, indented by a tab stop.

The {\em execution} of a function introduces a new symbol table used
for the local variables of the function. More precisely, all variable
assignments in a function store the value in the local symbol table;
variable references first look in the local symbol table, then
in the global symbol table, and then in the table of built-in names.
Thus, global variables cannot be directly assigned a value within a
function (unless named in a {\tt global} statement), although
they may be referenced.

The actual parameters (arguments) to a function call are introduced in
the local symbol table of the called function when it is called; thus,
arguments are passed using {\em call\ by\ value}.
Actually, {\em call by object reference} would be a better
description, since if a mutable object is passed, the caller
will see any changes the callee makes to it (e.g., items
inserted into a list).

When a function calls another function, a new local symbol table is
created for that call.
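
The symbol table rules are easier to believe after seeing them at work.
The following sketch assumes Python 3 syntax; every function and
variable name in it is hypothetical and chosen only for illustration.

```python
counter = 0

def setlocal():
    counter = 99          # assignment creates a new local name
    return counter

def setglobal():
    global counter        # now the assignment targets the global name
    counter = counter + 1
    return counter

def additem(items):
    items.append('new')   # changes the object the caller passed in
    items = ['rebound']   # rebinds only the local name
    return items

localresult = setlocal()
afterlocal = counter      # the global is untouched by setlocal()
globalresult = setglobal()

shopping = ['spam']
returned = additem(shopping)

print(localresult, afterlocal, globalresult, counter)
print(shopping, returned)
```

The caller sees the appended item but not the rebinding, which is why
``call by object reference'' describes the behavior better than plain
call by value.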
A function definition introduces the function name in the
symbol table. The value of the function name
has a type that is recognized by the interpreter as a user-defined
function. This value can be assigned to another name which can then
also be used as a function. This serves as a general renaming
mechanism:
1052 \bcode\begin{verbatim
}
1054 <function object at
10042ed0>
1057 1 1 2 3 5 8 13 21 34 55 89
1059 \end{verbatim
}\ecode
1061 You might object that
{\tt fib
} is not a function but a procedure. In
1062 Python, like in C, procedures are just functions that don't return a
1063 value. In fact, technically speaking, procedures do return a value,
1064 albeit a rather boring one. This value is called
{\tt None
} (it's a
1065 built-in name). Writing the value
{\tt None
} is normally suppressed by
1066 the interpreter if it would be the only value written. You can see it
1067 if you really want to:
\bcode\begin{verbatim}
>>> print fib(0)
None
\end{verbatim}\ecode

1075 It is simple to write a function that returns a list of the numbers of
1076 the Fibonacci series, instead of printing it:
\bcode\begin{verbatim}
>>> def fib2(n): # return Fibonacci series up to n
...     result = []
...     a, b = 0, 1
...     while b <= n:
...           result.append(b)    # see below
...           a, b = b, a+b
...     return result
... 
>>> f100 = fib2(100)    # call it
>>> f100                # write the result
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
\end{verbatim}\ecode

1093 This example, as usual, demonstrates some new Python features:
1098 The
{\tt return
} statement returns with a value from a function.
{\tt
1099 return
} without an expression argument is used to return from the middle
1100 of a procedure (falling off the end also returns from a procedure), in
1101 which case the
{\tt None
} value is returned.
1104 The statement
{\tt result.append(b)
} calls a
{\em method
} of the list
1105 object
{\tt result
}. A method is a function that `belongs' to an
1106 object and is named
{\tt obj.methodname
}, where
{\tt obj
} is some
1107 object (this may be an expression), and
{\tt methodname
} is the name
1108 of a method that is defined by the object's type. Different types
1109 define different methods. Methods of different types may have the
same name without causing ambiguity.  (It is possible to define your
own object types and methods, using {\em classes}, as discussed later
in this tutorial.)
The method {\tt append} shown in the example is defined for
list objects; it adds a new element at the end of the list.  In this
example it is equivalent to {\tt result = result + [b]}, but more efficient.
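
The two points above (the value returned by a procedure, and {\tt append} versus concatenation) can be checked with a short sketch. It assumes Python 3 syntax; {\tt no\_return} and {\tt alias} are hypothetical names added for the illustration.

```python
def fib2(n):
    """Return a list of the Fibonacci numbers up to n."""
    result = []
    a, b = 0, 1
    while b <= n:
        result.append(b)
        a, b = b, a + b
    return result

def no_return():
    # Falling off the end returns None.
    pass

result = [1, 1]
alias = result                 # second name for the same list object
result.append(2)               # extends that object in place
same_after_append = alias is result
result = result + [3]          # builds a new list and rebinds the name
same_after_concat = alias is result
print(fib2(100), no_return(), same_after_append, same_after_concat)
```

The concatenation gives the same list contents as {\tt append} would, but it copies the list each time, which is one way to read the remark that {\tt append} is more efficient.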
1121 \chapter{Odds and Ends
}
1123 This chapter describes some things you've learned about already in
1124 more detail, and adds some new things as well.
1126 \section{More on Lists
}
The list data type has some more methods.  Here are all of the methods
of list objects:

\begin{description}

\item[{\tt insert(i, x)}]
Insert an item at a given position.  The first argument is the index of
the element before which to insert, so {\tt a.insert(0, x)} inserts at
the front of the list, and {\tt a.insert(len(a), x)} is equivalent to
{\tt a.append(x)}.

\item[{\tt append(x)}]
Equivalent to {\tt a.insert(len(a), x)}.

\item[{\tt index(x)}]
Return the index in the list of the first item whose value is {\tt x}.
It is an error if there is no such item.

\item[{\tt remove(x)}]
Remove the first item from the list whose value is {\tt x}.
It is an error if there is no such item.

\item[{\tt sort()}]
Sort the items of the list, in place.

\item[{\tt reverse()}]
Reverse the elements of the list, in place.

\item[{\tt count(x)}]
Return the number of times {\tt x} appears in the list.

\end{description}

1161 An example that uses all list methods:
\bcode\begin{verbatim}
>>> a = [66.6, 333, 333, 1, 1234.5]
>>> print a.count(333), a.count(66.6), a.count('x')
2 1 0
>>> a.insert(2, -1)
>>> a.append(333)
>>> a
[66.6, 333, -1, 333, 1, 1234.5, 333]
>>> a.index(333)
1
>>> a.remove(333)
>>> a
[66.6, -1, 333, 1, 1234.5, 333]
>>> a.reverse()
>>> a
[333, 1234.5, 1, 333, -1, 66.6]
>>> a.sort()
>>> a
[-1, 1, 66.6, 333, 333, 1234.5]
\end{verbatim}\ecode

1185 \section{The
{\tt del
} statement
}
1187 There is a way to remove an item from a list given its index instead
1188 of its value: the
{\tt del
} statement. This can also be used to
1189 remove slices from a list (which we did earlier by assignment of an
1190 empty list to the slice). For example:
\bcode\begin{verbatim}
>>> a
[-1, 1, 66.6, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.6, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.6, 1234.5]
\end{verbatim}\ecode

1204 {\tt del
} can also be used to delete entire variables:
\bcode\begin{verbatim}
>>> del a
\end{verbatim}\ecode

Referencing the name {\tt a} hereafter is an error (at least until
another value is assigned to it).  We'll find other uses for {\tt del}
later.

1215 \section{Tuples and Sequences
}
1217 We saw that lists and strings have many common properties, e.g.,
1218 indexing and slicing operations. They are two examples of
{\em
1219 sequence
} data types. Since Python is an evolving language, other
1220 sequence data types may be added. There is also another standard
1221 sequence data type: the
{\em tuple
}.
A tuple consists of a number of values separated by commas, for
instance:

\bcode\begin{verbatim}
>>> t = 12345, 54321, 'hello!'
>>> t
(12345, 54321, 'hello!')
>>> # Tuples may be nested:
... u = t, (1, 2, 3, 4, 5)
>>> u
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
\end{verbatim}\ecode

As you see, on output tuples are always enclosed in parentheses, so
that nested tuples are interpreted correctly; they may be input with
or without surrounding parentheses, although often parentheses are
necessary anyway (if the tuple is part of a larger expression).

Tuples have many uses, e.g., (x, y) coordinate pairs, employee records
from a database, etc.  Tuples, like strings, are immutable: it is not
possible to assign to the individual items of a tuple (you can
simulate much of the same effect with slicing and concatenation,
though).

A special problem is the construction of tuples containing 0 or 1
items: the syntax has some extra quirks to accommodate these.  Empty
tuples are constructed by an empty pair of parentheses; a tuple with
one item is constructed by following a value with a comma
(it is not sufficient to enclose a single value in parentheses).
Ugly, but effective.  For example:

\bcode\begin{verbatim}
>>> empty = ()
>>> singleton = 'hello',    # <-- note trailing comma
>>> len(empty)
0
>>> len(singleton)
1
>>> singleton
('hello',)
\end{verbatim}\ecode
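
As a cross-check of the quirk described above, here is a small sketch in Python 3 syntax (an assumption); the name {\tt not\_a\_tuple} is hypothetical. It shows that parentheses alone do not make a one-item tuple; the trailing comma does.

```python
empty = ()
singleton = 'hello',          # the trailing comma makes the tuple
not_a_tuple = ('hello')       # just a parenthesized string

print(len(empty), len(singleton), type(not_a_tuple).__name__)
```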
The statement {\tt t = 12345, 54321, 'hello!'} is an example of {\em
tuple packing}: the values {\tt 12345}, {\tt 54321} and {\tt 'hello!'}
are packed together in a tuple.  The reverse operation is also
possible, e.g.:

\bcode\begin{verbatim}
>>> x, y, z = t
\end{verbatim}\ecode

This is called, appropriately enough, {\em tuple unpacking}.  Tuple
unpacking requires that the list of variables on the left has the same
number of elements as the length of the tuple.  Note that multiple
assignment is really just a combination of tuple packing and tuple
unpacking!

Occasionally, the corresponding operation on lists is useful: {\em list
unpacking}.  This is supported by enclosing the list of variables in
square brackets:

\bcode\begin{verbatim}
>>> a = ['spam', 'eggs', 100, 1234]
>>> [a1, a2, a3, a4] = a
\end{verbatim}\ecode
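
The following sketch, in Python 3 syntax (an assumption), puts packing and unpacking side by side. The swap and the {\tt mismatch} flag are illustrative additions; in Python 3 a length mismatch is reported as a {\tt ValueError}.

```python
t = 12345, 54321, 'hello!'    # tuple packing
x, y, z = t                   # tuple unpacking

a, b = 1, 2
a, b = b, a                   # multiple assignment: pack, then unpack

try:
    p, q = t                  # two names, three items
except ValueError:
    mismatch = True
else:
    mismatch = False

[a1, a2, a3, a4] = ['spam', 'eggs', 100, 1234]   # list unpacking
print(x, y, z, a, b, mismatch, a1, a4)
```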
1295 \section{Dictionaries
}
1297 Another useful data type built into Python is the
{\em dictionary
}.
1298 Dictionaries are sometimes found in other languages as ``associative
1299 memories'' or ``associative arrays''. Unlike sequences, which are
1300 indexed by a range of numbers, dictionaries are indexed by
{\em keys
},
1301 which are strings (the use of non-string values as keys
1302 is supported, but beyond the scope of this tutorial).
1303 It is best to think of a dictionary as an unordered set of
1304 {\em key:value
} pairs, with the requirement that the keys are unique
1305 (within one dictionary).
1306 A pair of braces creates an empty dictionary:
\verb/
{}/.
1307 Placing a comma-separated list of key:value pairs within the
1308 braces adds initial key:value pairs to the dictionary; this is also the
1309 way dictionaries are written on output.
The main operations on a dictionary are storing a value with some key
and extracting the value given the key.  It is also possible to delete
a key:value pair with {\tt del}.
If you store using a key that is already in use, the old value
associated with that key is forgotten.  It is an error to extract a
value using a non-existent key.

The {\tt keys()} method of a dictionary object returns a list of all the
keys used in the dictionary, in arbitrary order (if you want it sorted,
just apply the {\tt sort()} method to the list of keys).  To check
whether a single key is in the dictionary, use the \verb/has_key()/
method of the dictionary.

1325 Here is a small example using a dictionary:
\bcode\begin{verbatim}
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> tel.keys()
['guido', 'irv', 'jack']
>>> tel.has_key('guido')
1
\end{verbatim}\ecode
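
For readers trying this on a current interpreter, here is a hedged Python 3 rendering of the same session. It assumes Python 3, where \verb/has_key()/ is no longer available and the {\tt in} operator is used instead, and where {\tt keys()} returns a view rather than a list, so {\tt sorted()} is used to get a sorted list. The names {\tt names}, {\tt has\_guido} and {\tt missing} are illustrative.

```python
tel = {'jack': 4098, 'sape': 4139}
tel['guido'] = 4127           # store a value under a new key
del tel['sape']               # delete a key:value pair
tel['irv'] = 4127

names = sorted(tel.keys())    # sorted list of the keys
has_guido = 'guido' in tel    # membership test on the keys

try:
    tel['nobody']             # extracting with a non-existent key
except KeyError:
    missing = True
else:
    missing = False

print(names, has_guido, missing)
```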
1345 \section{More on Conditions
}
1347 The conditions used in
{\tt while
} and
{\tt if
} statements above can
1348 contain other operators besides comparisons.
The comparison operators {\tt in} and {\tt not in} check whether a value
occurs (does not occur) in a sequence.  The operators {\tt is} and {\tt
is not} compare whether two objects are really the same object; this
only matters for mutable objects like lists.  All comparison operators
have the same priority, which is lower than that of all numerical
operators.

1357 Comparisons can be chained: e.g.,
{\tt a < b == c
} tests whether
{\tt a
}
1358 is less than
{\tt b
} and moreover
{\tt b
} equals
{\tt c
}.
1360 Comparisons may be combined by the Boolean operators
{\tt and
} and
{\tt
1361 or
}, and the outcome of a comparison (or of any other Boolean
1362 expression) may be negated with
{\tt not
}. These all have lower
1363 priorities than comparison operators again; between them,
{\tt not
} has
1364 the highest priority, and
{\tt or
} the lowest, so that
1365 {\tt A and not B or C
} is equivalent to
{\tt (A and (not B)) or C
}. Of
1366 course, parentheses can be used to express the desired composition.
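
Both rules (chaining and the relative priorities of {\tt not}, {\tt and} and {\tt or}) can be verified mechanically. The sketch below assumes Python 3 syntax; the variable names are illustrative, and the check simply enumerates all eight truth-value combinations.

```python
from itertools import product

a, b, c = 1, 2, 2
chained = a < b == c                 # chained comparison
spelled_out = (a < b) and (b == c)   # what the chain means

# "A and not B or C" should group as "(A and (not B)) or C"
# for every combination of truth values.
agree = all(
    (A and not B or C) == ((A and (not B)) or C)
    for A, B, C in product([False, True], repeat=3)
)
print(chained, spelled_out, agree)
```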
1368 The Boolean operators
{\tt and
} and
{\tt or
} are so-called
{\em
1369 shortcut
} operators: their arguments are evaluated from left to right,
1370 and evaluation stops as soon as the outcome is determined. E.g., if
1371 {\tt A
} and
{\tt C
} are true but
{\tt B
} is false,
{\tt A and B and C
}
1372 does not evaluate the expression C. In general, the return value of a
1373 shortcut operator, when used as a general value and not as a Boolean, is
1374 the last evaluated argument.
1376 It is possible to assign the result of a comparison or other Boolean
1377 expression to a variable. For example,
\bcode\begin{verbatim}
>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
>>> non_null = string1 or string2 or string3
>>> non_null
'Trondheim'
\end{verbatim}\ecode
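
To see the shortcut behaviour itself, one can record which operands actually get evaluated. This is a sketch in Python 3 syntax (an assumption); the helper {\tt note()} and the list {\tt calls} are hypothetical.

```python
calls = []

def note(label, value):
    # Record that this operand was evaluated, then return its value.
    calls.append(label)
    return value

# B is false, so C is never evaluated; the result is the last
# evaluated argument, i.e. the value of B.
outcome = note('A', 1) and note('B', 0) and note('C', 2)

string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
non_null = string1 or string2 or string3   # first true operand
print(calls, outcome, non_null)
```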
1387 Note that in Python, unlike C, assignment cannot occur inside expressions.
1389 \section{Comparing Sequences and Other Types
}
Sequence objects may be compared to other objects with the same
sequence type.  The comparison uses {\em lexicographical} ordering:
first the first two items are compared, and if they differ this
determines the outcome of the comparison; if they are equal, the next
two items are compared, and so on, until either sequence is exhausted.
If two items to be compared are themselves sequences of the same type,
the lexicographical comparison is carried out recursively.  If all
items of two sequences compare equal, the sequences are considered
equal.  If one sequence is an initial subsequence of the other, the
shorter sequence is the smaller one.  Lexicographical ordering for
strings uses the ASCII ordering for individual characters.  Some
examples of comparisons between sequences with the same types:

\bcode\begin{verbatim}
(1, 2, 3)              < (1, 2, 4)
[1, 2, 3]              < [1, 2, 4]
'ABC' < 'C' < 'Pascal' < 'Python'
(1, 2, 3, 4)           < (1, 2, 4)
(1, 2, 3)              == (1.0, 2.0, 3.0)
(1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
\end{verbatim}\ecode
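
These comparisons can be evaluated directly. The sketch below assumes Python 3 syntax and collects them in a list called {\tt checks} (a hypothetical name); the fifth entry is an added illustration of the initial-subsequence rule stated above.

```python
checks = [
    (1, 2, 3) < (1, 2, 4),
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',
    (1, 2, 3, 4) < (1, 2, 4),
    (1, 2) < (1, 2, -1),          # an initial subsequence is smaller
    (1, 2, 3) == (1.0, 2.0, 3.0),
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),
]
print(all(checks))
```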
Note that comparing objects of different types is legal.  The outcome
is deterministic but arbitrary: the types are ordered by their name.
Thus, a list is always smaller than a string, a string is always
smaller than a tuple, etc.  Mixed numeric types are compared according
to their numeric value, so 0 equals 0.0, etc.  (The rules for comparing
objects of different types should
not be relied upon; they may change in a future version of
the language.)

\chapter{Modules}

If you quit from the Python interpreter and enter it again, the
1429 definitions you have made (functions and variables) are lost.
1430 Therefore, if you want to write a somewhat longer program, you are
1431 better off using a text editor to prepare the input for the interpreter
1432 and running it with that file as input instead. This is known as creating a
1433 {\em script
}. As your program gets longer, you may want to split it
1434 into several files for easier maintenance. You may also want to use a
1435 handy function that you've written in several programs without copying
1436 its definition into each program.
1438 To support this, Python has a way to put definitions in a file and use
1439 them in a script or in an interactive instance of the interpreter.
1440 Such a file is called a
{\em module
}; definitions from a module can be
1441 {\em imported
} into other modules or into the
{\em main
} module (the
1442 collection of variables that you have access to in a script
1443 executed at the top level
1444 and in calculator mode).
1446 A module is a file containing Python definitions and statements. The
1447 file name is the module name with the suffix
{\tt .py
} appended. Within
1448 a module, the module's name (as a string) is available as the value of
1449 the global variable
{\tt __name__
}. For instance, use your favorite text
1450 editor to create a file called
{\tt fibo.py
} in the current directory
1451 with the following contents:
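\bcode\begin{verbatim}
# Fibonacci numbers module

def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while b <= n:
          print b,
          a, b = b, a+b

def fib2(n): # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b <= n:
          result.append(b)
          a, b = b, a+b
    return result
\end{verbatim}\ecode
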
Now enter the Python interpreter and import this module with the
following command:

\bcode\begin{verbatim}
>>> import fibo
\end{verbatim}\ecode

This does not enter the names of the functions defined in
{\tt fibo}
directly in the current symbol table; it only enters the module name
{\tt fibo}
there.
Using the module name you can access the functions:

\bcode\begin{verbatim}
>>> fibo.fib(1000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
>>> fibo.fib2(100)
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
\end{verbatim}\ecode

1496 If you intend to use a function often you can assign it to a local name:
\bcode\begin{verbatim}
>>> fib = fibo.fib
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
\end{verbatim}\ecode

1505 \section{More on Modules
}
A module can contain executable statements as well as function
definitions.
These statements are intended to initialize the module.
They are executed only the
{\em first}
time the module is imported somewhere.  (In fact function definitions
are also `statements' that are
`executed'; the execution enters the function name in the
module's global symbol table.)

Each module has its own private symbol table, which is used as the
global symbol table by all functions defined in the module.
Thus, the author of a module can use global variables in the module
without worrying about accidental clashes with a user's global
variables.
On the other hand, if you know what you are doing you can touch a
module's global variables with the same notation used to refer to its
functions,
{\tt modname.itemname}.

Modules can import other modules.
It is customary but not required to place all
{\tt import}
statements at the beginning of a module (or script, for that matter).
The imported module names are placed in the importing module's global
symbol table.

There is a variant of the
{\tt import}
statement that imports names from a module directly into the importing
module's symbol table.
For example:

\bcode\begin{verbatim}
>>> from fibo import fib, fib2
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
\end{verbatim}\ecode

This does not introduce the module name from which the imports are taken
in the local symbol table (so in the example, {\tt fibo} is not
defined).

There is even a variant to import all names that a module defines:

\bcode\begin{verbatim}
>>> from fibo import *
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
\end{verbatim}\ecode

This imports all names except those beginning with an underscore
(\verb/_/).

1565 \section{Standard Modules
}
1567 Python comes with a library of standard modules, described in a separate
1568 document (Python Library Reference). Some modules are built into the
1569 interpreter; these provide access to operations that are not part of the
1570 core of the language but are nevertheless built in, either for
1571 efficiency or to provide access to operating system primitives such as
1572 system calls. The set of such modules is a configuration option; e.g.,
1573 the
{\tt amoeba
} module is only provided on systems that somehow support
Amoeba primitives.  One particular module deserves some attention: {\tt
sys}, which is built into every Python interpreter.  The variables {\tt
sys.ps1} and {\tt sys.ps2} define the strings used as primary and
secondary prompts:

\bcode\begin{verbatim}
>>> import sys
>>> sys.ps1
'>>> '
>>> sys.ps2
'... '
\end{verbatim}\ecode

These two variables are only defined if the interpreter is in
interactive mode.

The variable
{\tt sys.path}
is a list of strings that determine the interpreter's search path for
modules.
It is initialized to a default path taken from the environment variable
{\tt PYTHONPATH},
or from a built-in default if
{\tt PYTHONPATH}
is not set.
You can modify it using standard list operations, e.g.:

\bcode\begin{verbatim}
>>> import sys
>>> sys.path.append('/ufs/guido/lib/python')
\end{verbatim}\ecode
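
To see the search path at work without touching your own directories, the following sketch writes a small module into a temporary directory, appends that directory to {\tt sys.path}, and imports it. It assumes Python 3; the module name \verb/fibo_demo/ and the temporary directory are made up for this illustration, and only standard library calls are used.

```python
import importlib
import os
import sys
import tempfile

# Source text of a tiny module (Python 3 syntax, hypothetical name fibo_demo).
module_source = '''\
def fib2(n):
    result = []
    a, b = 0, 1
    while b <= n:
        result.append(b)
        a, b = b, a + b
    return result
'''

module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'fibo_demo.py'), 'w') as f:
    f.write(module_source)

sys.path.append(module_dir)          # extend the module search path
importlib.invalidate_caches()
fibo_demo = importlib.import_module('fibo_demo')

series = fibo_demo.fib2(100)         # access through the module name
fib2 = fibo_demo.fib2                # or bind the function to a local name
print(fibo_demo.__name__, series)
```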
1611 \section{The
{\tt dir()
} function
}
1613 The built-in function
{\tt dir
} is used to find out which names a module
1614 defines. It returns a sorted list of strings:
\bcode\begin{verbatim}
>>> import fibo, sys
>>> dir(fibo)
['__name__', 'fib', 'fib2']
>>> dir(sys)
['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit',
'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace',
'stderr', 'stdin', 'stdout', 'version']
\end{verbatim}\ecode

1627 Without arguments,
{\tt dir()
} lists the names you have defined currently:
\bcode\begin{verbatim}
>>> a = [1, 2, 3, 4, 5]
>>> import fibo, sys
>>> fib = fibo.fib
>>> dir()
['__name__', 'a', 'fib', 'fibo', 'sys']
\end{verbatim}\ecode

1638 Note that it lists all types of names: variables, modules, functions, etc.
1640 {\tt dir()
} does not list the names of built-in functions and variables.
If you want a list of those, they are defined in the standard module
\verb/__builtin__/:

\bcode\begin{verbatim}
>>> import __builtin__
>>> dir(__builtin__)
['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError',
'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError',
'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError',
'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce',
'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float',
'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long',
'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input',
'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange']
\end{verbatim}\ecode
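
On a current interpreter the same inspection might look as follows. This is a sketch that assumes Python 3, where the module of built-in names is called {\tt builtins}; the exact lists differ from those printed above, so the example only checks for a few names rather than showing the full output.

```python
import builtins   # Python 3 name for the module of built-in names
import math

math_names = dir(math)          # sorted list of strings
builtin_names = dir(builtins)

print('sqrt' in math_names, 'len' in builtin_names)
```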
1660 \chapter{Output Formatting
}
1662 So far we've encountered two ways of writing values:
{\em expression
1663 statements
} and the
{\tt print
} statement. (A third way is using the
1664 {\tt write
} method of file objects; the standard output file can be
1665 referenced as
{\tt sys.stdout
}. See the Library Reference for more
1666 information on this.)
1668 Often you'll want more control over the formatting of your output than
1669 simply printing space-separated values. The key to nice formatting in
1670 Python is to do all the string handling yourself; using string slicing
1671 and concatenation operations you can create any lay-out you can imagine.
1672 The standard module
{\tt string
} contains some useful operations for
1673 padding strings to a given column width; these will be discussed shortly.
Finally, the \code{\%} operator (modulo) with a string left argument
interprets this string as a C sprintf format string to be applied to the
right argument, and returns the string resulting from this formatting
operation.

1679 One question remains, of course: how do you convert values to strings?
1680 Luckily, Python has a way to convert any value to a string: just write
1681 the value between reverse quotes (
\verb/``/). Some examples:
\bcode\begin{verbatim}
>>> x = 10 * 3.14
>>> y = 200*200
>>> s = 'The value of x is ' + `x` + ', and y is ' + `y` + '...'
>>> print s
The value of x is 31.4, and y is 40000...
>>> # Reverse quotes work on other types besides numbers:
... p = [x, y]
>>> ps = `p`
>>> ps
'[31.4, 40000]'
>>> # Converting a string adds string quotes and backslashes:
... hello = 'hello, world\n'
>>> hellos = `hello`
>>> print hellos
'hello, world\012'
>>> # The argument of reverse quotes may be a tuple:
... `x, y, ('spam', 'eggs')`
"(31.4, 40000, ('spam', 'eggs'))"
\end{verbatim}\ecode
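
Reverse quotes are not accepted by Python 3; there, the built-in function {\tt repr()} performs the same conversion. The sketch below assumes Python 3 and assigns {\tt x = 31.4} directly, since the exact digits printed for a computed floating point value can differ between interpreter versions.

```python
x = 31.4
y = 200 * 200
s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'

hello = 'hello, world\n'
hellos = repr(hello)                       # adds quotes and a backslash escape
as_tuple = repr((x, y, ('spam', 'eggs')))  # works on a tuple, too
print(s)
print(hellos)
print(as_tuple)
```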
1705 Here are two ways to write a table of squares and cubes:
\bcode\begin{verbatim}
>>> import string
>>> for x in range(1, 11):
...     print string.rjust(`x`, 2), string.rjust(`x*x`, 3),
...     # Note trailing comma on previous line
...     print string.rjust(`x*x*x`, 4)
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
>>> for x in range(1, 11):
...     print '%2d %3d %4d' % (x, x*x, x*x*x)
... 
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
\end{verbatim}\ecode

1740 (Note that one space between each column was added by the way
{\tt print
}
1741 works: it always adds spaces between its arguments.)
1743 This example demonstrates the function
{\tt string.rjust()
}, which
1744 right-justifies a string in a field of a given width by padding it with
1745 spaces on the left. There are similar functions
{\tt string.ljust()
}
1746 and
{\tt string.center()
}. These functions do not write anything, they
1747 just return a new string. If the input string is too long, they don't
1748 truncate it, but return it unchanged; this will mess up your column
1749 lay-out but that's usually better than the alternative, which would be
1750 lying about a value. (If you really want truncation you can always add
1751 a slice operation, as in
{\tt string.ljust(x,~n)
[0:n
]}.)
There is another function, {\tt string.zfill}, which pads a numeric
string on the left with zeros.  It understands about plus and minus
signs:

\bcode\begin{verbatim}
>>> string.zfill('12', 5)
'00012'
>>> string.zfill('-3.14', 7)
'-003.14'
>>> string.zfill('3.14159265359', 5)
'3.14159265359'
\end{verbatim}\ecode
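
In Python 3 the padding functions discussed above are available as methods of string objects rather than as functions in the {\tt string} module. The sketch below assumes Python 3 and uses illustrative names; it builds the table of squares and cubes both ways and checks the behaviour for over-long input.

```python
# The table built with the % operator ...
rows = ['%2d %3d %4d' % (x, x*x, x*x*x) for x in range(1, 11)]

# ... and with right-justification, joined by single spaces.
rows_rjust = [
    ' '.join([repr(x).rjust(2), repr(x*x).rjust(3), repr(x*x*x).rjust(4)])
    for x in range(1, 11)
]

too_long = 'abcdefgh'.rjust(4)        # returned unchanged, not truncated
truncated = 'abcdefgh'.ljust(4)[0:4]  # explicit truncation with a slice

zero_filled = ['12'.zfill(5), '-3.14'.zfill(7), '3.14159265359'.zfill(5)]
print('\n'.join(rows))
```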
1768 \chapter{Errors and Exceptions
}
1770 Until now error messages haven't been more than mentioned, but if you
1771 have tried out the examples you have probably seen some. There are
1772 (at least) two distinguishable kinds of errors:
{\em syntax\ errors
}
1773 and
{\em exceptions
}.
1775 \section{Syntax Errors
}
1777 Syntax errors, also known as parsing errors, are perhaps the most common
1778 kind of complaint you get while you are still learning Python:
\bcode\begin{verbatim}
>>> while 1 print 'Hello world'
  File "<stdin>", line 1
    while 1 print 'Hello world'
                ^
SyntaxError: invalid syntax
\end{verbatim}\ecode

The parser repeats the offending line and displays a little `arrow'
pointing at the earliest point in the line where the error was detected.
The error is caused by (or at least detected at) the token
{\em preceding}
the arrow: in the example, the error is detected at the keyword
{\tt print}, since a colon ({\tt :}) is missing before it.
File name and line number are printed so you know where to look in case
the input came from a script.

1798 \section{Exceptions
}
1800 Even if a statement or expression is syntactically correct, it may
1801 cause an error when an attempt is made to execute it.
1802 Errors detected during execution are called
{\em exceptions
} and are
1803 not unconditionally fatal: you will soon learn how to handle them in
1804 Python programs. Most exceptions are not handled by programs,
1805 however, and result in error messages as shown here:
\bcode\small\begin{verbatim}
>>> 10 * (1/0)
Traceback (innermost last):
  File "<stdin>", line 1
ZeroDivisionError: integer division or modulo
>>> 4 + foo*3
Traceback (innermost last):
  File "<stdin>", line 1
NameError: foo
>>> '2' + 2
Traceback (innermost last):
  File "<stdin>", line 1
TypeError: illegal argument type for built-in operation
\end{verbatim}\ecode

The last line of the error message indicates what happened.
Exceptions come in different types, and the type is printed as part of
the message: the types in the example are
{\tt ZeroDivisionError},
{\tt NameError}
and
{\tt TypeError}.
The string printed as the exception type is the name of the built-in
exception that occurred.  This is true for all built-in
exceptions, but need not be true for user-defined exceptions (although
it is a useful convention).
Standard exception names are built-in identifiers (not reserved
keywords).

The rest of the line is a detail whose interpretation depends on the
exception type.

The preceding part of the error message shows the context where the
exception happened, in the form of a stack backtrace.
In general it lists source lines; however,
it will not display lines read from standard input.

The Python library reference manual lists the built-in exceptions and
their meanings.

1848 \section{Handling Exceptions
}
1850 It is possible to write programs that handle selected exceptions.
1851 Look at the following example, which prints a table of inverses of
1852 some floating point numbers:
\bcode\begin{verbatim}
>>> numbers = [0.3333, 2.5, 0, 10]
>>> for x in numbers:
...     print x,
...     try:
...         print 1.0 / x
...     except ZeroDivisionError:
...         print '*** has no inverse ***'
... 
0.3333 3.00030003
2.5 0.4
0 *** has no inverse ***
10 0.1
\end{verbatim}\ecode

The {\tt try} statement works as follows.
\begin{itemize}
\item
First, the
{\em try\ clause}
(the statement(s) between the {\tt try} and {\tt except} keywords) is
executed.
\item
If no exception occurs, the
{\em except\ clause}
is skipped and execution of the {\tt try} statement is finished.
\item
If an exception occurs during execution of the try clause,
the rest of the clause is skipped.  Then if
its type matches the exception named after the {\tt except} keyword,
the except clause is executed,
and then execution continues after the {\tt try} statement.
\item
If an exception occurs which does not match the exception named in the
except clause, it is passed on to outer try statements; if no handler is
found, it is an
{\em unhandled\ exception}
and execution stops with a message as shown above.
\end{itemize}

1894 A
{\tt try
} statement may have more than one except clause, to specify
1895 handlers for different exceptions.
1896 At most one handler will be executed.
1897 Handlers only handle exceptions that occur in the corresponding try
1898 clause, not in other handlers of the same
{\tt try
} statement.
An except clause may name multiple exceptions as a parenthesized list,
e.g.:

\bcode\begin{verbatim}
... except (RuntimeError, TypeError, NameError):
...     pass
\end{verbatim}\ecode

The last except clause may omit the exception name(s), to serve as a
wildcard.
Use this with extreme caution, since it is easy to mask a real
programming error in this way!

When an exception occurs, it may have an associated value, also known as
the exception's
{\em argument}.
The presence and type of the argument depend on the exception type.
For exception types which have an argument, the except clause may
specify a variable after the exception name (or list) to receive the
argument's value, as follows:

\bcode\begin{verbatim}
>>> try:
...     spam()
... except NameError, x:
...     print 'name', x, 'undefined'
... 
name spam undefined
\end{verbatim}\ecode

If an exception has an argument, it is printed as the last part
(`detail') of the message for unhandled exceptions.

Exception handlers don't just handle exceptions if they occur
immediately in the try clause, but also if they occur inside functions
that are called (even indirectly) in the try clause.
For example:

\bcode\begin{verbatim}
>>> def this_fails():
...     x = 1/0
... 
>>> try:
...     this_fails()
... except ZeroDivisionError, detail:
...     print 'Handling run-time error:', detail
... 
Handling run-time error: integer division or modulo
\end{verbatim}\ecode
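
The handler rules of this section can be exercised with the sketch below. It assumes Python 3, where the variable receiving the exception's argument is introduced with {\tt as} instead of a comma; the names {\tt inverse\_table}, {\tt message} and {\tt caught} are hypothetical.

```python
def this_fails():
    return 1 / 0

def inverse_table(numbers):
    # Pair each number with its inverse, or None if it has none.
    table = []
    for x in numbers:
        try:
            table.append((x, 1.0 / x))
        except ZeroDivisionError:
            table.append((x, None))
    return table

# The exception is raised inside a called function,
# but handled by the try statement here.
try:
    this_fails()
except ZeroDivisionError as detail:
    message = 'Handling run-time error: ' + str(detail)

# One except clause naming several exceptions.
try:
    undefined_name
except (RuntimeError, TypeError, NameError) as err:
    caught = type(err).__name__

table = inverse_table([0.5, 2.5, 0, 10])
print(message, caught, table)
```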
1951 \section{Raising Exceptions
}
The {\tt raise} statement allows the programmer to force a specified
exception to occur.
For example:

\bcode\begin{verbatim}
>>> raise NameError, 'HiThere'
Traceback (innermost last):
  File "<stdin>", line 1
NameError: HiThere
\end{verbatim}\ecode

The first argument to {\tt raise} names the exception to be raised.
The optional second argument specifies the exception's argument.

1968 \section{User-defined Exceptions
}
Programs may name their own exceptions by assigning a string to a
variable.
For example:

\bcode\begin{verbatim}
>>> my_exc = 'my_exc'
>>> try:
...     raise my_exc, 2*2
... except my_exc, val:
...     print 'My exception occurred, value:', val
... 
My exception occurred, value: 4
>>> raise my_exc, 1
Traceback (innermost last):
  File "<stdin>", line 1
my_exc: 1
\end{verbatim}\ecode

Many standard modules use this to report errors that may occur in
functions they define.
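
String exceptions as shown above are not accepted by Python 3, where an exception has to be an instance of a class derived from {\tt BaseException}. A rough Python 3 counterpart of the example, with a hypothetical class name {\tt MyError}, might look like this:

```python
class MyError(Exception):
    """Hypothetical user-defined exception; the name is illustrative."""

try:
    raise MyError(2 * 2)
except MyError as err:
    value = err.args[0]       # the exception's argument
    print('My exception occurred, value:', value)
```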
1992 \section{Defining Clean-up Actions
}
The {\tt try} statement has another optional clause which is intended to
define clean-up actions that must be executed under all circumstances.
For example:

\bcode\begin{verbatim}
>>> try:
...     raise KeyboardInterrupt
... finally:
...     print 'Goodbye, world!'
... 
Goodbye, world!
Traceback (innermost last):
  File "<stdin>", line 2
KeyboardInterrupt
\end{verbatim}\ecode

A {\tt finally} clause is executed whether or not an exception has
occurred in the {\tt try} clause.  When an exception has occurred, it
is re-raised after the {\tt finally} clause is executed.  The
{\tt finally} clause is also executed ``on the way out'' when the
{\tt try} statement is left via a {\tt break} or {\tt return}
statement.

A {\tt try} statement must either have one or more {\tt except}
clauses or one {\tt finally} clause, but not both.
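
The order of events can be made visible by recording them in a list. The sketch assumes Python 3 syntax; {\tt events} and the two function names are hypothetical. The handler is placed in an enclosing {\tt try} statement, in keeping with the rule above.

```python
events = []

def leave_via_return():
    try:
        return 'returned'
    finally:
        # Runs "on the way out", before the function actually returns.
        events.append('cleanup after return')

def leave_via_exception():
    try:
        raise KeyboardInterrupt
    finally:
        # Runs first; then the exception is re-raised.
        events.append('cleanup after exception')

result = leave_via_return()
try:
    leave_via_exception()
except KeyboardInterrupt:
    events.append('re-raised')

print(result, events)
```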
\chapter{Classes}

Python's class mechanism adds classes to the language with a minimum
2025 of new syntax and semantics. It is a mixture of the class mechanisms
2026 found in
\Cpp{} and Modula-
3. As is true for modules, classes in Python
2027 do not put an absolute barrier between definition and user, but rather
2028 rely on the politeness of the user not to ``break into the
2029 definition.'' The most important features of classes are retained
2030 with full power, however: the class inheritance mechanism allows
2031 multiple base classes, a derived class can override any methods of its
2032 base class(es), a method can call the method of a base class with the
2033 same name. Objects can contain an arbitrary amount of private data.
2035 In
\Cpp{} terminology, all class members (including the data members) are
2036 {\em public
}, and all member functions are
{\em virtual
}. There are
2037 no special constructors or destructors. As in Modula-
3, there are no
2038 shorthands for referencing the object's members from its methods: the
2039 method function is declared with an explicit first argument
2040 representing the object, which is provided implicitly by the call. As
2041 in Smalltalk, classes themselves are objects, albeit in the wider
2042 sense of the word: in Python, all data types are objects. This
2043 provides semantics for importing and renaming. But, just like in
\Cpp{}
2044 or Modula-
3, built-in types cannot be used as base classes for
2045 extension by the user. Also, like in
\Cpp{} but unlike in Modula-
3, most
2046 built-in operators with special syntax (arithmetic operators,
2047 subscripting etc.) can be redefined for class members.
\section{A word about terminology}

Lacking universally accepted terminology to talk about classes, I'll
make occasional use of Smalltalk and \Cpp{} terms. (I'd use Modula-3
terms, since its object-oriented semantics are closer to those of
Python than \Cpp{}, but I expect that few readers have heard of it...)

I also have to warn you that there's a terminological pitfall for
object-oriented readers: the word ``object'' in Python does not
necessarily mean a class instance. Like \Cpp{} and Modula-3, and unlike
Smalltalk, not all types in Python are classes: the basic built-in
types like integers and lists aren't, and even somewhat more exotic
types like files aren't. However, {\em all} Python types share a little
bit of common semantics that is best described by using the word
object.

Objects have individuality, and multiple names (in multiple scopes)
can be bound to the same object. This is known as aliasing in other
languages. This is usually not appreciated on a first glance at
Python, and can be safely ignored when dealing with immutable basic
types (numbers, strings, tuples). However, aliasing has an
(intended!) effect on the semantics of Python code involving mutable
objects such as lists, dictionaries, and most types representing
entities outside the program (files, windows, etc.). This is usually
used to the benefit of the program, since aliases behave like pointers
in some respects. For example, passing an object is cheap since only
a pointer is passed by the implementation; and if a function modifies
an object passed as an argument, the caller will see the change --- this
obviates the need for two different argument passing mechanisms as in
Pascal.

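
A small sketch of aliasing, with hypothetical names of my own
(\verb\add_entry\, \verb\shopping\, \verb\alias\): two names are bound to
one list, a function modifies the list it is handed, and an immutable
number is shown for contrast.

```python
def add_entry(entries, item):
    # Modifies the list object passed in; no copy is made.
    entries.append(item)

shopping = ['eggs']
alias = shopping            # a second name bound to the same list object

add_entry(alias, 'bread')   # the caller sees the change through either name

count = 3
other = count               # aliasing an immutable number is harmless:
other = other + 1           # this rebinds other, count is untouched
```

After this runs, \verb\shopping\ and \verb\alias\ both show the appended
item, while \verb\count\ keeps its old value.
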
\section{Python scopes and name spaces}

Before introducing classes, I first have to tell you something about
Python's scope rules. Class definitions play some neat tricks with
name spaces, and you need to know how scopes and name spaces work to
fully understand what's going on. Incidentally, knowledge about this
subject is useful for any advanced Python programmer.

Let's begin with some definitions.

A {\em name space} is a mapping from names to objects. Most name
spaces are currently implemented as Python dictionaries, but that's
normally not noticeable in any way (except for performance), and it
may change in the future. Examples of name spaces are: the set of
built-in names (functions such as \verb\abs()\, and built-in exception
names); the global names in a module; and the local names in a
function invocation. In a sense the set of attributes of an object
also forms a name space. The important thing to know about name
spaces is that there is absolutely no relation between names in
different name spaces; for instance, two different modules may both
define a function ``maximize'' without confusion --- users of the
modules must prefix it with the module name.

By the way, I use the word {\em attribute} for any name following a
dot --- for example, in the expression \verb\z.real\, \verb\real\ is
an attribute of the object \verb\z\. Strictly speaking, references to
names in modules are attribute references: in the expression
\verb\modname.funcname\, \verb\modname\ is a module object and
\verb\funcname\ is an attribute of it. In this case there happens to
be a straightforward mapping between the module's attributes and the
global names defined in the module: they share the same name space!%
\footnote{
	Except for one thing. Module objects have a secret read-only
	attribute called {\tt __dict__} which returns the dictionary
	used to implement the module's name space; the name
	{\tt __dict__} is an attribute but not a global name.
	Obviously, using this violates the abstraction of name space
	implementation, and should be restricted to things like
	post-mortem debuggers...
}

Attributes may be read-only or writable. In the latter case,
assignment to attributes is possible. Module attributes are writable:
you can write \verb\modname.the_answer = 42\. Writable attributes may
also be deleted with the {\tt del} statement, e.g.
\verb\del modname.the_answer\.

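
As a minimal sketch of writable module attributes, the standard
\verb\math\ module stands in for \verb\modname\ here; that choice and the
variable names \verb\found\ and \verb\still_there\ are assumptions made
only for illustration.

```python
import math   # any imported module will do; math is only an example

math.the_answer = 42                  # module attributes are writable
found = math.the_answer               # ordinary attribute reference

del math.the_answer                   # writable attributes can be deleted
still_there = hasattr(math, 'the_answer')
```
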
Name spaces are created at different moments and have different
lifetimes. The name space containing the built-in names is created
when the Python interpreter starts up, and is never deleted. The
global name space for a module is created when the module definition
is read in; normally, module name spaces also last until the
interpreter quits. The statements executed by the top-level
invocation of the interpreter, either read from a script file or
interactively, are considered part of a module called \verb\__main__\,
so they have their own global name space. (The built-in names
actually also live in a module; this is called \verb\__builtin__\.)

The local name space for a function is created when the function is
called, and deleted when the function returns or raises an exception
that is not handled within the function. (Actually, forgetting would
be a better way to describe what actually happens.) Of course,
recursive invocations each have their own local name space.

A {\em scope} is a textual region of a Python program where a name space
is directly accessible. ``Directly accessible'' here means that an
unqualified reference to a name attempts to find the name in the name
space.

Although scopes are determined statically, they are used dynamically.
At any time during execution, exactly three nested scopes are in use
(i.e., exactly three name spaces are directly accessible): the
innermost scope, which is searched first, contains the local names,
the middle scope, searched next, contains the current module's global
names, and the outermost scope (searched last) is the name space
containing built-in names.

Usually, the local scope references the local names of the (textually)
current function. Outside of functions, the local scope references
the same name space as the global scope: the module's name space.
Class definitions place yet another name space in the local scope.

It is important to realize that scopes are determined textually: the
global scope of a function defined in a module is that module's name
space, no matter from where or by what alias the function is called.
On the other hand, the actual search for names is done dynamically, at
run time --- however, the language definition is evolving towards
static name resolution, at ``compile'' time, so don't rely on dynamic
name resolution! (In fact, local variables are already determined
statically.)

A special quirk of Python is that assignments always go into the
innermost scope. Assignments do not copy data --- they just
bind names to objects. The same is true for deletions: the statement
\verb\del x\ removes the binding of \verb\x\ from the name space
referenced by the local scope. In fact, all operations that introduce
new names use the local scope: in particular, import statements and
function definitions bind the module or function name in the local
scope. (The \verb\global\ statement can be used to indicate that
particular variables live in the global scope.)

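
The rule that assignments go into the innermost scope can be sketched
like this. The names \verb\counter\, \verb\bump_local\ and
\verb\bump_global\ are hypothetical and chosen only for this illustration.

```python
counter = 0          # a global name of this module

def bump_local():
    counter = 100    # assignment binds a new local name; the global is untouched
    return counter

def bump_global():
    global counter   # declare that counter lives in the global scope
    counter = counter + 1
    return counter

local_result = bump_local()
after_local = counter
global_result = bump_global()
after_global = counter
```

Only the function containing the \verb\global\ statement changes the
module-level binding.
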
\section{A first look at classes}

Classes introduce a little bit of new syntax, three new object types,
and some new semantics.

\subsection{Class definition syntax}

The simplest form of class definition looks like this:

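
In sketch form, a class definition is the keyword \verb\class\, a name,
a colon, and an indented sequence of statements. The name
\verb\ClassName\ is the one the text refers to; the two statements in the
body are placeholders of my own.

```python
class ClassName:
    # any sequence of statements may appear in the class body
    greeting = 'hello'            # an assignment (placeholder)
    def describe(self):           # a function definition (placeholder)
        return self.greeting
```
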
Class definitions, like function definitions (\verb\def\ statements)
must be executed before they have any effect. (You could conceivably
place a class definition in a branch of an \verb\if\ statement, or
inside a function.)

In practice, the statements inside a class definition will usually be
function definitions, but other statements are allowed, and sometimes
useful --- we'll come back to this later. The function definitions
inside a class normally have a peculiar form of argument list,
dictated by the calling conventions for methods --- again, this is
explained later.

When a class definition is entered, a new name space is created, and
used as the local scope --- thus, all assignments to local variables
go into this new name space. In particular, function definitions bind
the name of the new function here.

When a class definition is left normally (via the end), a {\em class
object} is created. This is basically a wrapper around the contents
of the name space created by the class definition; we'll learn more
about class objects in the next section. The original local scope
(the one in effect just before the class definition was entered) is
reinstated, and the class object is bound here to the class name given in
the class definition header (\verb\ClassName\ in the example).

\subsection{Class objects}

Class objects support two kinds of operations: attribute references
and instantiation.

{\em Attribute references} use the standard syntax used for all
attribute references in Python: \verb\obj.name\. Valid attribute
names are all the names that were in the class's name space when the
class object was created. So, if the class definition looked like
this:

\bcode\begin{verbatim}
class MyClass:
    i = 12345
    def f(x):
        return 'hello world'
\end{verbatim}\ecode

then \verb\MyClass.i\ and \verb\MyClass.f\ are valid attribute
references, returning an integer and a function object, respectively.
Class attributes can also be assigned to, so you can change the
value of \verb\MyClass.i\ by assignment.

Class {\em instantiation} uses function notation. Just pretend that
the class object is a parameterless function that returns a new
instance of the class. For example (assuming the above class),
\verb\x = MyClass()\
creates a new {\em instance} of the class and assigns this object to
the local variable \verb\x\.

\subsection{Instance objects}

Now what can we do with instance objects? The only operations
understood by instance objects are attribute references. There are
two kinds of valid attribute names.

The first I'll call {\em data attributes}. These correspond to
``instance variables'' in Smalltalk, and to ``data members'' in \Cpp{}.
Data attributes need not be declared; like local variables, they
spring into existence when they are first assigned to. For example,
if \verb\x\ is the instance of \verb\MyClass\ created above, the
following piece of code will print the value 16, without leaving a
trace:

\bcode\begin{verbatim}
x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print x.counter
del x.counter
\end{verbatim}\ecode

The second kind of attribute references understood by instance objects
are {\em methods}. A method is a function that ``belongs to'' an
object. (In Python, the term method is not unique to class instances:
other object types can have methods as well, e.g., list objects have
methods called append, insert, remove, sort, and so on. However,
below, we'll use the term method exclusively to mean methods of class
instance objects, unless explicitly stated otherwise.)

Valid method names of an instance object depend on its class. By
definition, all attributes of a class that are (user-defined) function
objects define corresponding methods of its instances. So in our
example, \verb\x.f\ is a valid method reference, since
\verb\MyClass.f\ is a function, but \verb\x.i\ is not, since
\verb\MyClass.i\ is not. But \verb\x.f\ is not the
same thing as \verb\MyClass.f\ --- it is a {\em method object}, not a
function object.

\subsection{Method objects}

Usually, a method is called immediately, e.g.\ \verb\x.f()\.
In our example, this will return the string \verb\'hello world'\.
However, it is not necessary to call a method right away: \verb\x.f\
is a method object, and can be stored away and called at a later
moment. For example, after the assignment \verb\xf = x.f\, a loop that
repeatedly executes \verb\print xf()\
will continue to print \verb\hello world\ until the end of time.

What exactly happens when a method is called? You may have noticed
that \verb\x.f()\ was called without an argument above, even though
the function definition for \verb\f\ specified an argument. What
happened to the argument? Surely Python raises an exception when a
function that requires an argument is called without any --- even if
the argument isn't actually used...

Actually, you may have guessed the answer: the special thing about
methods is that the object is passed as the first argument of the
function. In our example, the call \verb\x.f()\ is exactly equivalent
to \verb\MyClass.f(x)\. In general, calling a method with a list of
{\em n} arguments is equivalent to calling the corresponding function
with an argument list that is created by inserting the method's object
before the first argument.

If you still don't understand how methods work, a look at the
implementation can perhaps clarify matters. When an instance
attribute is referenced that isn't a data attribute, its class is
searched. If the name denotes a valid class attribute that is a
function object, a method object is created by packing (pointers to)
the instance object and the function object just found together in an
abstract object: this is the method object. When the method object is
called with an argument list, it is unpacked again, a new argument
list is constructed from the instance object and the original argument
list, and the function object is called with this new argument list.

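
The equivalence between a method call and the corresponding function
call can be sketched as follows. The method \verb\scale\ and the variable
names are hypothetical additions for this illustration; \verb\MyClass.f\
follows the example in the text.

```python
class MyClass:
    def f(x):
        return 'hello world'
    def scale(self, factor):
        return factor * 2

x = MyClass()
xf = x.f                          # a method object, stored for later use
later = xf()                      # calling it supplies x as the first argument
direct = MyClass.f(x)             # the equivalent explicit call

# With n arguments, the instance is inserted before the first one.
via_method = x.scale(21)
via_function = MyClass.scale(x, 21)
```
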
\section{Random remarks}

[These should perhaps be placed more carefully...]

Data attributes override method attributes with the same name; to
avoid accidental name conflicts, which may cause hard-to-find bugs in
large programs, it is wise to use some kind of convention that
minimizes the chance of conflicts, e.g., capitalize method names,
prefix data attribute names with a small unique string (perhaps just
an underscore), or use verbs for methods and nouns for data attributes.

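
A minimal sketch of such a name conflict, using a hypothetical class
\verb\Account\ whose method \verb\balance\ is shadowed by a data attribute
on one instance only:

```python
class Account:
    def balance(self):
        # a method whose name is about to be shadowed on one instance
        return 100

a = Account()
b = Account()
b.balance = 250          # a data attribute with the same name as the method

from_method = a.balance()        # a still finds the method in the class
from_data = b.balance            # b finds its own data attribute first
```

Calling \verb\b.balance()\ would now fail, since the name refers to a
number on that instance; this is the kind of bug a naming convention
helps to avoid.
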
Data attributes may be referenced by methods as well as by ordinary
users (``clients'') of an object. In other words, classes are not
usable to implement pure abstract data types. In fact, nothing in
Python makes it possible to enforce data hiding --- it is all based
upon convention. (On the other hand, the Python implementation,
written in C, can completely hide implementation details and control
access to an object if necessary; this can be used by extensions to
Python written in C.)

Clients should use data attributes with care --- clients may mess up
invariants maintained by the methods by stamping on their data
attributes. Note that clients may add data attributes of their own to
an instance object without affecting the validity of the methods, as
long as name conflicts are avoided --- again, a naming convention can
save a lot of headaches here.

There is no shorthand for referencing data attributes (or other
methods!) from within methods. I find that this actually increases
the readability of methods: there is no chance of confusing local
variables and instance variables when glancing through a method.

Conventionally, the first argument of methods is often called
\verb\self\. This is nothing more than a convention: the name
\verb\self\ has absolutely no special meaning to Python. (Note,
however, that by not following the convention your code may be less
readable by other Python programmers, and it is also conceivable that
a {\em class browser} program be written which relies upon such a
convention.)

Any function object that is a class attribute defines a method for
instances of that class. It is not necessary that the function
definition is textually enclosed in the class definition: assigning a
function object to a local variable in the class is also ok. For
example:

\bcode\begin{verbatim}
# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1
    def g(self):
        return 'hello world'
    h = g
\end{verbatim}\ecode

Now \verb\f\, \verb\g\ and \verb\h\ are all attributes of class
\verb\C\ that refer to function objects, and consequently they are all
methods of instances of \verb\C\ --- \verb\h\ being exactly equivalent
to \verb\g\. Note that this practice usually only serves to confuse
the reader of a program.

Methods may call other methods by using method attributes of the
\verb\self\ argument, e.g.:

\bcode\begin{verbatim}
class Bag:
    def empty(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)
\end{verbatim}\ecode

The instantiation operation (``calling'' a class object) creates an
empty object. Many classes like to create objects in a known initial
state. Therefore a class may define a special method named
\verb\__init__\, like this:

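
In sketch form, assuming a \verb\Bag\ class whose items live in a list
attribute called \verb\data\ (an illustrative name, as is the variable
\verb\b\):

```python
class Bag:
    def __init__(self):
        # called automatically for each newly created instance
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)

b = Bag()            # instantiation invokes __init__, so b.data already exists
b.addtwice('spam')
```

Without the \verb\__init__\ method, the first call to \verb\add\ would
fail because the instance would have no \verb\data\ attribute yet.
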
When a class defines an \verb\__init__\ method, class instantiation
automatically invokes \verb\__init__\ for the newly-created class
instance. So in the \verb\Bag\ example, a new and initialized instance
can be obtained by \verb\x = Bag()\.

Of course, the \verb\__init__\ method may have arguments for greater
flexibility. In that case, arguments given to the class instantiation
operator are passed on to \verb\__init__\. For example,

\bcode\begin{verbatim}
>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
... 
>>> x = Complex(3.0,-4.5)
>>> x.r, x.i
(3.0, -4.5)
\end{verbatim}\ecode

Methods may reference global names in the same way as ordinary
functions. The global scope associated with a method is the module
containing the class definition. (The class itself is never used as a
global scope!) While one rarely encounters a good reason for using
global data in a method, there are many legitimate uses of the global
scope: for one thing, functions and modules imported into the global
scope can be used by methods, as well as functions and classes defined
in it. Usually, the class containing the method is itself defined in
this global scope, and in the next section we'll find some good
reasons why a method would want to reference its own class!

\section{Inheritance}

Of course, a language feature would not be worthy of the name ``class''
without supporting inheritance. The syntax for a derived class
definition looks as follows:

\bcode\begin{verbatim}
class DerivedClassName(BaseClassName):
    <statement-1>
    ...
    <statement-N>
\end{verbatim}\ecode

The name \verb\BaseClassName\ must be defined in a scope containing
the derived class definition. Instead of a base class name, an
expression is also allowed. This is useful when the base class is
defined in another module, e.g.,

\bcode\begin{verbatim}
class DerivedClassName(modname.BaseClassName):
\end{verbatim}\ecode

Execution of a derived class definition proceeds the same as for a
base class. When the class object is constructed, the base class is
remembered. This is used for resolving attribute references: if a
requested attribute is not found in the class, it is searched in the
base class. This rule is applied recursively if the base class itself
is derived from some other class.

There's nothing special about instantiation of derived classes:
\verb\DerivedClassName()\ creates a new instance of the class. Method
references are resolved as follows: the corresponding class attribute
is searched, descending down the chain of base classes if necessary,
and the method reference is valid if this yields a function object.

Derived classes may override methods of their base classes. Because
methods have no special privileges when calling other methods of the
same object, a method of a base class that calls another method
defined in the same base class, may in fact end up calling a method of
a derived class that overrides it. (For \Cpp{} programmers: all methods
in Python are ``virtual functions''.)

An overriding method in a derived class may in fact want to extend
rather than simply replace the base class method of the same name.
There is a simple way to call the base class method directly: just
call \verb\BaseClassName.methodname(self, arguments)\. This is
occasionally useful to clients as well. (Note that this only works if
the base class is defined or imported directly in the global scope.)

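
Overriding and extending can be sketched with two hypothetical classes,
\verb\Greeter\ and \verb\LoudGreeter\ (names of my own choosing):

```python
class Greeter:
    def name(self):
        return 'world'
    def greet(self):
        # calls self.name(), which a derived class may override
        return 'hello ' + self.name()

class LoudGreeter(Greeter):
    def name(self):                        # overrides Greeter.name
        return 'WORLD'
    def greet(self):                       # extends rather than replaces
        return Greeter.greet(self) + '!'   # direct call to the base class method

plain = Greeter().greet()
loud = LoudGreeter().greet()
```

Note that the base class \verb\greet\ ends up calling the derived
\verb\name\ when it is invoked on a \verb\LoudGreeter\ instance, which is
the ``virtual function'' behaviour described above.
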
\subsection{Multiple inheritance}

Python supports a limited form of multiple inheritance as well. A
class definition with multiple base classes looks as follows:

\bcode\begin{verbatim}
class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    ...
    <statement-N>
\end{verbatim}\ecode

The only rule necessary to explain the semantics is the resolution
rule used for class attribute references. This is depth-first,
left-to-right. Thus, if an attribute is not found in
\verb\DerivedClassName\, it is searched in \verb\Base1\, then
(recursively) in the base classes of \verb\Base1\, and only if it is
not found there, it is searched in \verb\Base2\, and so on.

(To some people breadth first---searching \verb\Base2\ and
\verb\Base3\ before the base classes of \verb\Base1\---looks more
natural. However, this would require you to know whether a particular
attribute of \verb\Base1\ is actually defined in \verb\Base1\ or in
one of its base classes before you can figure out the consequences of
a name conflict with an attribute of \verb\Base2\. The depth-first
rule makes no difference between direct and inherited attributes of
\verb\Base1\.)

It is clear that indiscriminate use of multiple inheritance is a
maintenance nightmare, given the reliance in Python on conventions to
avoid accidental name conflicts. A well-known problem with multiple
inheritance is a class derived from two classes that happen to have a
common base class. While it is easy enough to figure out what happens
in this case (the instance will have a single copy of ``instance
variables'' or data attributes used by the common base class), it is
not clear that these semantics are in any way useful.

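
The resolution rule can be sketched with a small hypothetical hierarchy
(all class and method names here are my own). The hierarchy deliberately
has no common base class, since that is the case where implementations
may resolve names differently from the simple depth-first rule.

```python
class Root1:
    def who(self):
        return 'Root1'        # inherited by Base1, not defined in it

class Base1(Root1):
    pass

class Base2:
    def who(self):
        return 'Base2'
    def only_in_base2(self):
        return 'Base2 only'

class Derived(Base1, Base2):
    pass

d = Derived()
first_match = d.who()             # found via Base1's own base class, before Base2
fallback = d.only_in_base2()      # not found under Base1, so Base2 is searched
```
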
\section{Odds and ends}

Sometimes it is useful to have a data type similar to the Pascal
``record'' or C ``struct'', bundling together a couple of named data
items. An empty class definition will do nicely, e.g.:

\bcode\begin{verbatim}
class Employee:
    pass

john = Employee() # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
\end{verbatim}\ecode

A piece of Python code that expects a particular abstract data type
can often be passed an instance of a class that emulates the methods
of that data type instead. For instance, if you have a function that
formats some data from a file object, you can define a class with
methods \verb\read()\ and \verb\readline()\ that gets the data from a
string buffer instead, and pass it as an argument. (Unfortunately, this
technique has its limitations: a class can't define operations that
are accessed by special syntax such as sequence subscripting or
arithmetic operators, and assigning such a ``pseudo-file'' to
\verb\sys.stdin\ will not cause the interpreter to read further input
from it.)

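
A minimal sketch of such a ``pseudo-file'', written for a current
interpreter. The class \verb\StringFile\ and the function
\verb\count_lines\ are hypothetical names; only the method names
\verb\read\ and \verb\readline\ come from the text.

```python
class StringFile:
    # Hypothetical stand-in for a file object, reading from a string buffer.
    def __init__(self, text):
        self.buffer = text
    def readline(self):
        # Return the next line including its newline, or '' at end of data.
        end = self.buffer.find('\n') + 1
        if end == 0:
            end = len(self.buffer)
        line = self.buffer[:end]
        self.buffer = self.buffer[end:]
        return line
    def read(self):
        # Return everything that has not been read yet.
        rest = self.buffer
        self.buffer = ''
        return rest

def count_lines(f):
    # Works with anything that offers readline(), real file or not.
    n = 0
    while f.readline() != '':
        n = n + 1
    return n

total = count_lines(StringFile('one\ntwo\nthree'))
```

The function never checks what kind of object it was given; it relies
only on the presence of a \verb\readline\ method.
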
Instance method objects have attributes, too: \verb\m.im_self\ is the
instance object to which the method belongs, and \verb\m.im_func\ is the
function object corresponding to the method.

\chapter{Recent Additions}

Python is an evolving language. Since this tutorial was last
thoroughly revised, several new features have been added to the
language. While ideally I should revise the tutorial to incorporate
them in the mainline of the text, lack of time currently requires me
to take a more modest approach. In this chapter I will briefly list the
most important improvements to the language and how you can use them
to your benefit.

\section{The Last Printed Expression}

In interactive mode, the last printed expression is assigned to the
variable \code{_}. This means that when you are using Python as a
desk calculator, it is somewhat easier to continue calculations, for
example:

\bcode\begin{verbatim}
>>> tax = 17.5 / 100
\end{verbatim}\ecode

For reasons too embarrassing to explain, this variable is implemented
as a built-in (living in the module \code{__builtin__}), so it should
be treated as read-only by the user. I.e. don't explicitly assign a
value to it --- you would create an independent local variable with
the same name masking the built-in variable with its magic behavior.

\section{String Literals}

\subsection{Double Quotes}

Python can now also use double quotes to surround string literals,
e.g. \verb\"this doesn't hurt a bit"\. There is no semantic
difference between strings surrounded by single or double quotes.

\subsection{Continuation Of String Literals}

String literals can span multiple lines by escaping newlines with
backslashes, e.g.

\bcode\begin{verbatim}
hello = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
    Note that whitespace at the beginning of the line is\
 significant.\n"
print hello
\end{verbatim}\ecode

which would print the following:

\bcode\begin{verbatim}
This is a rather long string containing
several lines of text just as you would do in C.
    Note that whitespace at the beginning of the line is significant.
\end{verbatim}\ecode

\subsection{Triple-quoted strings}

In some cases, when you need to include really long strings (e.g.
containing several paragraphs of informational text), it is annoying
that you have to terminate each line with \verb@\n\@, especially if
you would like to reformat the text occasionally with a powerful text
editor like Emacs. For such situations, ``triple-quoted'' strings can
be used, e.g.

\bcode\begin{verbatim}
hello = """

    This string is bounded by triple double quotes (3 times ").
Unescaped newlines in the string are retained, though \
it is still possible\nto use all normal escape sequences.

    Whitespace at the beginning of a line is
significant.  If you need to include three opening quotes
you have to escape at least one of them, e.g. \""".

    This string ends in a newline.
"""
\end{verbatim}\ecode

Triple-quoted strings can be surrounded by three single quotes as
well, again without semantic difference.

\subsection{String Literal Juxtaposition}

One final twist: you can juxtapose multiple string literals. Two or
more adjacent string literals (but not arbitrary expressions!)
separated only by whitespace will be concatenated (without intervening
whitespace) into a single string object at compile time. This makes
it possible to continue a long string on the next line without
sacrificing indentation or performance, unlike the use of the string
concatenation operator \verb\+\ or the continuation of the literal
itself on the next line (since leading whitespace is significant
inside all types of string literals). Note that this feature, like
all string features except triple-quoted strings, is borrowed from
ANSI C.

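
A short sketch of juxtaposition in use. The function \verb\usage\ and the
program name in its message are hypothetical; the parentheses are used
here only so that the expression can continue on the next line.

```python
def usage():
    # Adjacent literals are joined into one string; the indentation of
    # the continuation lines does not become part of the result.
    return ("usage: frobnicate [-v] file ...\n"
            "  -v   print progress messages\n")

message = usage()
joined = 'spam' 'eggs'      # same as 'spameggs'
```
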
\section{The Formatting Operator}

\subsection{Basic Usage}

The chapter on output formatting is really out of date: there is now
an almost complete interface to C-style printf formats. This is done
by overloading the modulo operator (\verb\%\) for a left operand
which is a string, e.g.

\bcode\begin{verbatim}
>>> import math
>>> print 'The value of PI is approximately %5.3f.' % math.pi
The value of PI is approximately 3.142.
\end{verbatim}\ecode

If there is more than one format in the string you pass a tuple as
right operand, e.g.

\bcode\begin{verbatim}
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> for name, phone in table.items():
...     print '%-10s ==> %10d' % (name, phone)
... 
\end{verbatim}\ecode

Most formats work exactly as in C and require that you pass the proper
type (however, if you don't you get an exception, not a core dump).
The \verb\%s\ format is more relaxed: if the corresponding argument is
not a string object, it is converted to string using the \verb\str()\
built-in function. Using \verb\*\ to pass the width or precision in
as a separate (integer) argument is supported. The C formats
\verb\%n\ and \verb\%p\ are not supported.

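
The \verb\*\ width and the relaxed \verb\%s\ format can be sketched as
follows; the sample data and the names \verb\entries\, \verb\width\,
\verb\rows\ and \verb\mixed\ are assumptions for illustration.

```python
entries = [('Jack', 4098), ('Sjoerd', 4127)]   # sample data for the sketch

width = 8
rows = []
for name, phone in entries:
    # '*' takes the field width from the argument tuple, before the value
    rows.append('%-*s|%*d' % (width, name, width, phone))

# %s accepts any object and converts it with str()
mixed = '%s and %s' % (42, [1, 2])
```

Passing the width as an argument keeps the column size in one variable
instead of repeating it inside every format string.
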
\subsection{Referencing Variables By Name}

If you have a really long format string that you don't want to split
up, it would be nice if you could reference the variables to be
formatted by name instead of by position. This can be done by using
an extension of C formats using the form \verb\%(name)format\, e.g.

\bcode\begin{verbatim}
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
Jack: 4098; Sjoerd: 4127; Dcab: 8637678
\end{verbatim}\ecode

This is particularly useful in combination with the new built-in
\verb\vars()\ function, which returns a dictionary containing all
local variables.

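
A minimal sketch of that combination, with a hypothetical function
\verb\describe\ whose arguments are formatted by name:

```python
def describe(name, phone):
    # vars() with no argument returns the local variables as a dictionary,
    # so the format string can refer to them by name.
    return '%(name)s can be reached at %(phone)d' % vars()

line = describe('Jack', 4098)
```
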
\section{Optional Function Arguments}

It is now possible to define functions with a variable number of
arguments. There are two forms, which can be combined.

\subsection{Default Argument Values}

The most useful form is to specify a default value for one or more
arguments. This creates a function that can be called with fewer
arguments than it is defined to accept, e.g.

\bcode\begin{verbatim}
def ask_ok(prompt, retries = 4, complaint = 'Yes or no, please!'):
    while 1:
        ok = raw_input(prompt)
        if ok in ('y', 'ye', 'yes'): return 1
        if ok in ('n', 'no', 'nop', 'nope'): return 0
        retries = retries - 1
        if retries < 0: raise IOError, 'refusenik user'
        print complaint
\end{verbatim}\ecode

This function can be called either like this:
\verb\ask_ok('Do you really want to quit?')\ or like this:
\verb\ask_ok('OK to overwrite the file?', 2)\.

The default values are evaluated at the point of function definition
in the {\em defining} scope, so that e.g.

\bcode\begin{verbatim}
i = 5
def f(arg = i): print arg
i = 6
f()
\end{verbatim}\ecode

will print \verb\5\.

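
The same rule in a form that returns values instead of printing them.
The function \verb\clamp\ and the variable names are hypothetical.

```python
limit = 5

def clamp(value, upper=limit):
    # upper defaults to the value limit had when the def statement ran
    if value > upper:
        return upper
    return value

limit = 100                 # rebinding the global later does not change the default
with_default = clamp(42)
with_override = clamp(42, 50)
```
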
\subsection{Arbitrary Argument Lists}

It is also possible to specify that a function can be called with an
arbitrary number of arguments. These arguments will be wrapped up in
a tuple. Before the variable number of arguments, zero or more normal
arguments may occur, e.g.

\bcode\begin{verbatim}
def fprintf(file, format, *args):
    file.write(format % args)
\end{verbatim}\ecode

This feature may be combined with the previous, e.g.

\bcode\begin{verbatim}
def but_is_it_useful(required, optional = None, *remains):
    print "I don't know"
\end{verbatim}\ecode

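
To see what ends up in the tuple, here is a sketch that calls
\verb\fprintf\ with a hypothetical \verb\Collector\ object in place of a
real file, and a helper \verb\count_extra\ (also my own) that combines a
default value with an arbitrary argument list.

```python
def fprintf(out, format, *args):
    # args is a tuple holding every argument after format
    out.write(format % args)

class Collector:
    # Hypothetical stand-in for a file: it only offers write().
    def __init__(self):
        self.chunks = []
    def write(self, text):
        self.chunks.append(text)

def count_extra(required, optional=None, *remains):
    return len(remains)

c = Collector()
fprintf(c, '%s has %d items\n', 'bag', 3)
fprintf(c, 'done\n')                    # no extra arguments: args is ()
extras = count_extra(1, 2, 3, 4, 5)
```
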
\section{Lambda And Functional Programming Tools}

\subsection{Lambda Forms}

By popular demand, a few features commonly found in functional
programming languages and Lisp have been added to Python. With the
\verb\lambda\ keyword, small anonymous functions can be created.
Here's a function that returns the sum of its two arguments:
\verb\lambda a, b: a+b\. Lambda forms can be used wherever function
objects are required. They are syntactically restricted to a single
expression. Semantically, they are just syntactic sugar for a normal
function definition. Like nested function definitions, lambda forms
cannot reference variables from the containing scope, but this can be
overcome through the judicious use of default argument values, e.g.

\bcode\begin{verbatim}
def make_incrementor(n):
    return lambda x, incr=n: x+incr
\end{verbatim}\ecode

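
A brief usage sketch for \verb\make_incrementor\; the names
\verb\add_five\, \verb\add_ten\ and \verb\results\ are my own.

```python
def make_incrementor(n):
    # incr=n captures the current value of n as a default argument value
    return lambda x, incr=n: x+incr

add_five = make_incrementor(5)
add_ten = make_incrementor(10)
results = [add_five(1), add_ten(1), add_five(0)]
```

Each returned function carries its own increment, because the default
value was fixed when the lambda form was evaluated.
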
\subsection{Map, Reduce and Filter}

Three new built-in functions on sequences are good candidates to pass
lambda forms to.

\subsubsection{Map.}

\verb\map(function, sequence)\ calls \verb\function(item)\ for each of
the sequence's items and returns a list of the return values. For
example, to compute some cubes:

\begin{verbatim}
>>> map(lambda x: x*x*x, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
\end{verbatim}

More than one sequence may be passed; the function must then have as
many arguments as there are sequences and is called with the
corresponding item from each sequence (or \verb\None\ if some sequence
is shorter than another). If \verb\None\ is passed for the function,
a function returning its argument(s) is substituted.

Combining these two special cases, we see that
\verb\map(None, list1, list2)\ is a convenient way of turning a pair
of lists into a list of pairs. For example:

\begin{verbatim}
>>> seq = range(8)
>>> map(None, seq, map(lambda x: x*x, seq))
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
\end{verbatim}

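For comparison, here is a sketch of the same two computations on a
current Python~3 interpreter (an assumption; this is not the release
described here). There \code{map()} returns a lazy iterator instead of
a list, and \code{None} is no longer accepted as the function, so
\code{list()} and \code{zip()} are used to obtain the same results.

```python
# Cubes of 1..10; list() forces the lazy map object into a list.
cubes = list(map(lambda x: x * x * x, range(1, 11)))
print(cubes)

# Pair each number with its square; zip() plays the role of map(None, ...).
seq = range(8)
pairs = list(zip(seq, map(lambda x: x * x, seq)))
print(pairs)
```

One difference is worth keeping in mind: \code{zip()} stops at the
shortest sequence, whereas \code{map(None, ...)} as described above
pads shorter sequences with \code{None}.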
\subsubsection{Filter.}

\verb\filter(function, sequence)\ returns a sequence (of the same
type, if possible) consisting of those items from the sequence for
which \verb\function(item)\ is true. For example, to compute some
primes:

\begin{verbatim}
>>> filter(lambda x: x%2 != 0 and x%3 != 0, range(2, 25))
[5, 7, 11, 13, 17, 19, 23]
\end{verbatim}

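The same selection can be sketched for a current Python~3 interpreter
(an assumption beyond the release described here), where
\code{filter()} returns a lazy iterator. The helper name
\code{not_divisible_by_2_or_3} is illustrative only.

```python
def not_divisible_by_2_or_3(x):
    return x % 2 != 0 and x % 3 != 0

# list() forces the lazy filter object into a list.
survivors = list(filter(not_divisible_by_2_or_3, range(2, 25)))
print(survivors)   # [5, 7, 11, 13, 17, 19, 23]
```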
\subsubsection{Reduce.}

\verb\reduce(function, sequence)\ returns a single value constructed
by calling the (binary) function on the first two items of the
sequence, then on the result and the next item, and so on. For
example, to compute the sum of the numbers 1 through 10:

\begin{verbatim}
>>> reduce(lambda x, y: x+y, range(1, 11))
55
\end{verbatim}

If there's only one item in the sequence, its value is returned; if
the sequence is empty, an exception is raised.

A third argument can be passed to indicate the starting value. In this
case the starting value is returned for an empty sequence, and the
function is first applied to the starting value and the first sequence
item, then to the result and the next item, and so on. For example,

\begin{verbatim}
>>> def sum(seq):
...     return reduce(lambda x, y: x+y, seq, 0)
...
>>> sum(range(1, 11))
55
>>> sum([])
0
\end{verbatim}

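A sketch of the starting-value behaviour for a current Python~3
interpreter (an assumption; there \code{reduce()} has to be imported
from the \code{functools} module). The function is called \code{total}
here only to avoid shadowing the built-in \code{sum()}.

```python
from functools import reduce

def total(seq):
    # The third argument, 0, is the starting value.
    return reduce(lambda x, y: x + y, seq, 0)

print(total(range(1, 11)))  # 55
print(total([]))            # 0, the starting value
```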
\section{Continuation Lines Without Backslashes}

While the general mechanism for continuation of a source line on the
next physical line remains to place a backslash on the end of the
line, expressions inside matched parentheses (or square brackets, or
curly braces) can now also be continued without using a backslash.
This is particularly useful for calls to functions with many
arguments, and for initializations of large tables.

\begin{verbatim}
month_names = ['Januari', 'Februari', 'Maart',
               'April',   'Mei',      'Juni',
               'Juli',    'Augustus', 'September',
               'Oktober', 'November', 'December']
\end{verbatim}

2952 CopyInternalHyperLinks(self.context.hyperlinks,
2953 copy.context.hyperlinks,
\section{Regular Expressions}

While C's printf-style output formats, transformed into Python, are
adequate for most output formatting jobs, C's scanf-style input
formats are not very powerful. Instead of scanf-style input, Python
offers Emacs-style regular expressions as a powerful input and
scanning mechanism. Read the corresponding section in the Library
Reference for a full description.

\section{Generalized Dictionaries}

The keys of dictionaries are no longer restricted to strings --- they
can be any immutable basic type including strings, numbers, tuples, or
(certain) class instances. (Lists and dictionaries are not acceptable
as dictionary keys, in order to avoid problems when the object used as
a key is modified.)

Dictionaries have two new methods: \verb\d.values()\ returns a list of
the dictionary's values, and \verb\d.items()\ returns a list of the
dictionary's (key, value) pairs. Like \verb\d.keys()\, these
operations are slow for large dictionaries. Examples:

\begin{verbatim}
>>> d = {100: 'honderd', 1000: 'duizend', 10: 'tien'}
>>> d.values()
['honderd', 'tien', 'duizend']
>>> d.items()
[(100, 'honderd'), (10, 'tien'), (1000, 'duizend')]
\end{verbatim}

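The rule about acceptable keys can be illustrated with a short sketch,
assuming a current Python~3 interpreter (not the release described
here). The dictionary \code{grid} and its contents are made up for the
example.

```python
# Tuples and numbers are acceptable keys; a list is not.
grid = {(0, 0): 'origin', (1, 2): 'somewhere else', 10: 'tien'}
print(grid[(1, 2)])
print(sorted(grid.values()))

try:
    grid[[3, 4]] = 'a list as key'
except TypeError as err:
    print('rejected:', err)
```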
\section{Miscellaneous New Built-in Functions}

The function \verb\vars()\ returns a dictionary containing the current
local variables. With a module argument, it returns that module's
global variables. The old function \verb\dir(x)\ returns
\verb\vars(x).keys()\.

The function \verb\round(x)\ returns a floating point number rounded
to the nearest integer (but still expressed as a floating point
number). E.g. \verb\round(3.4) == 3.0\ and \verb\round(3.5) == 4.0\.
With a second argument it rounds to the specified number of digits,
e.g. \verb\round(math.pi, 4) == 3.1416\ or even
\verb\round(123.4, -2) == 100.0\.

The function \verb\hash(x)\ returns a hash value for an object.
All object types acceptable as dictionary keys have a hash value (and
it is this hash value that the dictionary implementation uses).

The function \verb\id(x)\ returns a unique identifier for an object.
For two objects x and y, \verb\id(x) == id(y)\ if and only if
\verb\x is y\. (In fact the object's address is used.)

The function \verb\hasattr(x, name)\ returns whether an object has an
attribute with the given name (a string value). The function
\verb\getattr(x, name)\ returns the object's attribute with the given
name. The function \verb\setattr(x, name, value)\ assigns a value to
an object's attribute with the given name. These three functions are
useful if the attribute names are not known beforehand. Note that
\verb\getattr(x, 'spam')\ is equivalent to \verb\x.spam\, and
\verb\setattr(x, 'spam', y)\ is equivalent to \verb\x.spam = y\. By
definition, \verb\hasattr(x, name)\ returns true if and only if
\verb\getattr(x, name)\ returns without raising an exception.

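A small sketch of the three attribute functions at work, assuming a
current Python~3 interpreter. The class \code{Settings} and the
attribute names are hypothetical; the point is that the names come
from data rather than from the program text.

```python
class Settings:
    """Hypothetical container whose attribute names come from data."""

settings = Settings()
for name, value in [('width', 80), ('height', 24)]:
    setattr(settings, name, value)      # same effect as settings.width = 80

print(settings.width)                   # 80
print(getattr(settings, 'height'))      # 24
print(hasattr(settings, 'depth'))       # False
```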
\section{Else Clause For Try Statement}

The \verb\try...except\ statement now has an optional \verb\else\
clause, which must follow all \verb\except\ clauses. It is useful for
code that must be executed if the \verb\try\ clause does not
raise an exception. For example:

\begin{verbatim}
for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except IOError:
        print 'cannot open', arg
    else:
        print arg, 'has', len(f.readlines()), 'lines'
        f.close()
\end{verbatim}

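The same pattern can be sketched as a self-contained function for a
current Python~3 interpreter (an assumption; there a failed
\code{open()} raises \code{OSError}). The function name
\code{count\_lines} and the temporary files are illustrative only.

```python
import os
import tempfile

def count_lines(path):
    """Return the number of lines in path, or None if it cannot be opened."""
    try:
        f = open(path)
    except OSError:
        return None
    else:
        # Reached only when open() succeeded.
        with f:
            return len(f.readlines())

# Usage: one file that exists and one that does not.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, 'example.txt')
    with open(existing, 'w') as out:
        out.write('one\ntwo\nthree\n')
    results = [count_lines(existing),
               count_lines(os.path.join(tmp, 'missing.txt'))]

print(results)   # [3, None]
```

Keeping the line count in the \code{else} clause means that an error
raised while reading the file is not mistaken for a failure to open it.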
\section{New Class Features in Release 1.1}

Some changes have been made to classes: the operator overloading
mechanism is more flexible, providing more support for non-numeric use
of operators (including calling an object as if it were a function),
and it is possible to trap attribute accesses.

\subsection{New Operator Overloading}

It is no longer necessary to coerce both sides of an operator to the
same class or type. A class may still provide a \code{__coerce__}
method, but this method may return objects of different types or
classes if it feels like it. If no \code{__coerce__} is defined, any
argument type or class is acceptable.

In order to make it possible to implement binary operators where the
right-hand side is a class instance but the left-hand side is not,
without using coercions, right-hand versions of all binary operators
may be defined. These have an `r' prepended to their name,
e.g. \code{__radd__}.

For example, here's a very simple class for representing times. Times
are initialized from a number of seconds (like time.time()). Times
are printed like this: \code{Thu Oct 6 14:20:06 1994}. Subtracting
two Times gives their difference in seconds. Adding or subtracting a
Time and a number gives a new Time. You can't add two times, nor can
you subtract a Time from a number.

\begin{verbatim}
import time

class Time:
    def __init__(self, seconds):
        self.seconds = seconds
    def __repr__(self):
        return time.ctime(self.seconds)
    def __add__(self, x):
        return Time(self.seconds + x)
    __radd__ = __add__            # support for x+t
    def __sub__(self, x):
        if hasattr(x, 'seconds'): # test if x could be a Time
            return self.seconds - x.seconds
        else:
            return Time(self.seconds - x)   # Time minus number is a Time

now = Time(time.time())
tomorrow = 24*3600 + now
yesterday = now - 24*3600
print tomorrow - yesterday        # prints 172800
\end{verbatim}

\subsection{Trapping Attribute Access}

You can define three new ``magic'' methods in a class now:
\code{__getattr__(self, name)}, \code{__setattr__(self, name, value)}
and \code{__delattr__(self, name)}.

The \code{__getattr__} method is called when an attribute access fails,
i.e. when an attribute access would otherwise raise AttributeError ---
this is {\em after} the instance's dictionary and its class hierarchy
have been searched for the named attribute. Note that if this method
attempts to access any undefined instance attribute it will be called
recursively!

The \code{__setattr__} and \code{__delattr__} methods are called when
assignment to or deletion of an attribute is attempted.
They are called {\em instead} of the normal action (which is to insert
or delete the attribute in the instance dictionary). If either of
these methods must set or delete any attribute, it can only do so by
using the instance dictionary directly --- \code{self.__dict__} --- else
it would be called recursively.

For example, here's a near-universal ``Wrapper'' class that passes all
its attribute accesses to another object. Note how the
\code{__init__} method inserts the wrapped object in
\code{self.__dict__} in order to avoid endless recursion
(\code{__setattr__} would call \code{__getattr__} which would call
itself recursively).

\begin{verbatim}
import sys

class Wrapper:
    def __init__(self, wrapped):
        self.__dict__['wrapped'] = wrapped
    def __getattr__(self, name):
        return getattr(self.wrapped, name)
    def __setattr__(self, name, value):
        setattr(self.wrapped, name, value)
    def __delattr__(self, name):
        delattr(self.wrapped, name)

f = Wrapper(sys.stdout)
f.write('hello world\n')          # prints 'hello world'
\end{verbatim}

A simpler example of \code{__getattr__} is an attribute that is
computed each time (or the first time) it is accessed. For instance:

\begin{verbatim}
from math import pi

class Circle:
    def __init__(self, radius):
        self.radius = radius
    def __getattr__(self, name):
        if name == 'circumference':
            return 2 * pi * self.radius
        if name == 'diameter':
            return 2 * self.radius
        if name == 'area':
            return pi * pow(self.radius, 2)
        raise AttributeError, name
\end{verbatim}

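The ``first time'' variant mentioned above can be sketched as follows,
assuming a current Python~3 interpreter. The counter \code{computed}
is an addition made purely to show how often \code{__getattr__} runs;
it is not part of the technique.

```python
from math import pi

class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.computed = 0   # counts __getattr__ computations, for illustration

    def __getattr__(self, name):
        # Called only when the normal attribute lookup fails.
        if name == 'area':
            self.computed += 1
            value = pi * self.radius ** 2
            # Store the result on the instance; later accesses find it
            # there and never reach __getattr__ again.
            self.area = value
            return value
        raise AttributeError(name)

c = Circle(2)
print(c.area, c.area)   # computed once, then read from the instance
print(c.computed)       # 1
```

The price of caching is that the stored value goes stale if
\code{radius} is changed afterwards, so this form suits attributes
that are expensive to compute and do not change.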
\subsection{Calling a Class Instance}

If a class defines a method \code{__call__} it is possible to call its
instances as if they were functions. For example:

\begin{verbatim}
class PresetSomeArguments:
    def __init__(self, func, *args):
        self.func, self.args = func, args
    def __call__(self, *args):
        return apply(self.func, self.args + args)

f = PresetSomeArguments(pow, 2)    # f(i) computes powers of 2
for i in range(10): print f(i),    # prints 1 2 4 8 16 32 64 128 256 512
print                              # append newline
\end{verbatim}

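A sketch of the same class for a current Python~3 interpreter (an
assumption beyond the release described here), where the
\code{apply()} built-in no longer exists and argument unpacking is
used instead. The variable name \code{power\_of\_two} is illustrative.

```python
class PresetSomeArguments:
    def __init__(self, func, *args):
        self.func, self.args = func, args

    def __call__(self, *args):
        # Argument unpacking takes the place of apply().
        return self.func(*(self.args + args))

power_of_two = PresetSomeArguments(pow, 2)   # power_of_two(i) computes 2**i
print([power_of_two(i) for i in range(10)])
```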
\chapter{New in Release 1.2}

This chapter describes even more recent additions to the Python
language and library.

\section{New Class Features}

The semantics of \code{__coerce__} have been changed to be more
reasonable. As an example, the new standard module \code{Complex}
implements fairly complete complex numbers using this. Additional
examples of classes with and without \code{__coerce__} methods can be
found in the \code{Demo/classes} subdirectory, including the module
\code{Rat}.

If a class defines no \code{__coerce__} method, this is equivalent to
the following definition:

\begin{verbatim}
def __coerce__(self, other): return self, other
\end{verbatim}

If \code{__coerce__} coerces itself to an object of a different type,
the operation is carried out using that type --- in release 1.1, this
would cause an error.

Comparisons involving class instances now invoke \code{__coerce__}
exactly as if \code{cmp(x, y)} were a binary operator like \code{+}
(except if \code{x} and \code{y} are the same object).

\section{Unix Signal Handling}

On Unix, Python now supports signal handling. The module
\code{signal} exports functions \code{signal}, \code{pause} and
\code{alarm}, which act similarly to their Unix counterparts. The
module also exports the conventional names for the various signal
classes (also usable with \code{os.kill()}) and \code{SIG_IGN} and
\code{SIG_DFL}. See the section on \code{signal} in the Library
Reference Manual for more information.

\section{Exceptions Can Be Classes}

User-defined exceptions are no longer limited to being string objects
--- they can be identified by classes as well. Using this mechanism it
is possible to create extensible hierarchies of exceptions.

There are two new valid (semantic) forms for the raise statement:

\begin{verbatim}
raise Class, instance

raise instance
\end{verbatim}

In the first form, \code{instance} must be an instance of \code{Class}
or of a class derived from it. The second form is a shorthand for

\begin{verbatim}
raise instance.__class__, instance
\end{verbatim}

An except clause may list classes as well as string objects. A class
in an except clause is compatible with an exception if it is the same
class or a base class thereof (but not the other way around --- an
except clause listing a derived class is not compatible with a base
class). For example, the following code will print B, C, D in that
order:

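A minimal sketch of such a hierarchy, written for a current Python~3
interpreter, which is an assumption beyond the release described:
there, exception classes must derive from \code{Exception}. The list
\code{caught} is added only so that the outcome can be inspected.

```python
class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass

caught = []
for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        caught.append('D')
    except C:
        caught.append('C')
    except B:
        caught.append('B')

print(', '.join(caught))   # B, C, D
```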
Note that if the except clauses were reversed (with ``\code{except B}''
first), it would have printed B, B, B --- the first matching except
clause is triggered.

When an error message is printed for an unhandled exception which is a
class, the class name is printed, then a colon and a space, and
finally the instance converted to a string using the built-in function
\code{str()}.

In this release, the built-in exceptions are still strings.

\section{Object Persistency and Object Copying}

Two new modules, \code{pickle} and \code{shelve}, support storage and
retrieval of (almost) arbitrary Python objects on disk, using the
\code{dbm} package. A third module, \code{copy}, provides flexible
object copying operations. More information on these modules is
provided in the Library Reference Manual.

\subsection{Persistent Objects}

The module \code{pickle} provides a general framework for objects to
disassemble themselves into a stream of bytes and to reassemble such a
stream back into an object. It copes with reference sharing,
recursive objects and instances of user-defined classes, but not
(directly) with objects that have ``magical'' links into the operating
system such as open files, sockets or windows.

The \code{pickle} module defines a simple protocol whereby
user-defined classes can control how they are disassembled and
assembled. The method \code{__getinitargs__()}, if defined, returns
the argument list for the constructor to be used at assembly time (by
default the constructor is called without arguments). The methods
\code{__getstate__()} and \code{__setstate__()} are used to pass
additional state from disassembly to assembly; by default the
instance's \code{__dict__} is passed and restored.

Note that \code{pickle} does not open or close any files --- it can be
used equally well for moving objects around on a network or storing
them in a database. For ease of debugging, and the inevitable occasional
manual patch-up, the constructed byte streams consist of printable
ASCII characters only (though they are not designed to be pretty).

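The claims about reference sharing and recursive objects can be
checked with a short sketch, assuming a current Python~3 interpreter
and its \code{pickle.dumps()} and \code{pickle.loads()} functions,
which work on in-memory byte strings. The data is made up for the
example.

```python
import pickle

shared = ['spam', 'eggs']
original = {'first': shared, 'second': shared}
original['self'] = original              # a recursive object

stream = pickle.dumps(original)          # disassemble into a byte string
restored = pickle.loads(stream)          # reassemble an equivalent object

print(restored['first'] is restored['second'])   # True: sharing survives
print(restored['self'] is restored)              # True: so does recursion
print(restored['first'] is shared)               # False: it is a new object
```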
The module \code{shelve} provides a simple model for storing objects
on files. The operation \code{shelve.open(filename)} returns a
``shelf'', which is a simple persistent database with a
dictionary-like interface. Database keys are strings; objects stored
in the database can be anything that \code{pickle} will handle.

\subsection{Copying Objects}

The module \code{copy} exports two functions: \code{copy()} and
\code{deepcopy()}. The \code{copy()} function returns a ``shallow''
copy of an object; \code{deepcopy()} returns a ``deep'' copy. The
difference between shallow and deep copying is only relevant for
compound objects (objects that contain other objects, such as lists).

A shallow copy constructs a new compound object and then (to the
extent possible) inserts {\em the same objects} into it that the
original contains.

A deep copy constructs a new compound object and then, recursively,
inserts {\em copies} into it of the objects found in the original.

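The difference becomes visible as soon as a contained object is
modified. The following sketch assumes a current Python~3 interpreter;
the variable names are illustrative.

```python
import copy

inner = [1, 2]
original = [inner, 'text']

shallow = copy.copy(original)        # new outer list, same inner list
deep = copy.deepcopy(original)       # new outer list, copied inner list

inner.append(3)                      # modify the contained object

print(shallow[0])   # [1, 2, 3]: the shallow copy shares the inner list
print(deep[0])      # [1, 2]: the deep copy has its own
```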
Both functions have the same restrictions and use the same protocols
as \code{pickle} --- user-defined classes can control how they are
copied by providing methods named \code{__getinitargs__()},
\code{__getstate__()} and \code{__setstate__()}.

\section{Documentation Strings}

A variety of objects now have a new attribute, \code{__doc__}, which
is supposed to contain a documentation string (if no documentation is
present, the attribute is \code{None}). New syntax, compatible with
the old interpreter, allows for convenient initialization of the
\code{__doc__} attribute of modules, classes and functions by placing
a string literal by itself as the first statement in the suite. It
must be a literal --- an expression yielding a string object is not
accepted as a documentation string, since future tools may need to
derive documentation from source by parsing.

Here is a hypothetical, amply documented module called \code{Spam}:

\begin{verbatim}
"""The Spam Module.

This module exports two classes, a function and an exception:

class Spam: full Spam functionality --- three can sizes
class SpamLight: limited Spam functionality --- only one can size

def open(filename): open a file and return a corresponding Spam or
SpamLight object

GoneOff: exception raised for errors; should never happen

Note that it is always possible to convert a SpamLight object to a
Spam object by a simple method call, but that the reverse operation is
generally costly and may fail for a number of reasons.
"""

class SpamLight:
    """Limited spam functionality.

    Supports a single can size, no flavor, and only hard disks.
    """

    def __init__(self, size=12):
        """Construct a new SpamLight instance.

        Argument is the can size.
        """

class Spam(SpamLight):
    """Full spam functionality.

    Supports three can sizes, two flavor varieties, and all floppy
    disk formats still supported by current hardware.
    """

    def __init__(self, size1=8, size2=12, size3=20):
        """Construct a new Spam instance.

        Arguments are up to three can sizes.
        """

def open(filename = "/dev/null"):
    """Open a can of Spam.

    Argument must be an existing file.
    """

class GoneOff:
    """Class used for Spam exceptions.

    There shouldn't be any.
    """
\end{verbatim}

After executing ``\code{import Spam}'', the following expressions
return the various documentation strings from the module:

\begin{verbatim}
Spam.__doc__
Spam.SpamLight.__doc__
Spam.SpamLight.__init__.__doc__
Spam.Spam.__doc__
Spam.Spam.__init__.__doc__
Spam.open.__doc__
Spam.GoneOff.__doc__
\end{verbatim}

There are emerging conventions about the content and formatting of
documentation strings.

The first line should always be a short, concise summary of the
object's purpose. For brevity, it should not explicitly state the
object's name or type, since these are available by other means
(except if the name happens to be a verb describing a function's
operation). This line should begin with a capital letter and end with
a period.

If there are more lines in the documentation string, the second line
should be blank, visually separating the summary from the rest of the
description. The following lines should be one or more paragraphs
describing the object's calling conventions, its side effects, etc.

Some people like to copy the Emacs convention of using UPPER CASE for
function parameters --- this often saves a few words or lines.

The Python parser does not strip indentation from multi-line string
literals, so tools that process documentation have to strip
indentation. This is done using the following convention. The first
non-blank line {\em after} the first line of the string determines the
amount of indentation for the entire documentation string. (We can't
use the first line since it is generally adjacent to the string's
opening quotes so its indentation is not apparent in the string
literal.) Whitespace ``equivalent'' to this indentation is then
stripped from the start of all lines of the string. Lines that are
indented less should not occur, but if they occur all their leading
whitespace should be stripped. Equivalence of whitespace should be
tested after expansion of tabs (to 8 spaces, normally).

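The convention just described can be sketched as a small function.
This is one possible reading of the rules, written for a current
Python~3 interpreter; the name \code{strip\_doc\_indent} and the
sample text are hypothetical and not part of any library.

```python
def strip_doc_indent(doc, tabsize=8):
    """Strip documentation-string indentation following the convention above."""
    # Tabs are expanded first so that whitespace can be compared by width.
    lines = doc.expandtabs(tabsize).split('\n')

    # The first non-blank line after the first line sets the indentation.
    indent = 0
    for line in lines[1:]:
        if line.strip():
            indent = len(line) - len(line.lstrip())
            break

    result = [lines[0].strip()]
    for line in lines[1:]:
        if len(line) - len(line.lstrip()) >= indent:
            result.append(line[indent:].rstrip())
        else:
            # Less-indented lines lose all of their leading whitespace.
            result.append(line.strip())
    return '\n'.join(result)

raw = '''Open a can of Spam.

    Argument must be an existing file.
\tA tab expands to eight spaces before comparison.
    '''
print(strip_doc_indent(raw))
```

In the sample, the tab-indented line keeps four spaces of extra
indentation, because only the amount set by the first indented line is
removed. Current Python versions also ship a comparable helper,
\code{inspect.cleandoc()}.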
In this release, few of the built-in or standard functions and modules
have documentation strings.

\section{Customizing Import and Built-Ins}

In preparation for a ``restricted execution mode'' which will be
usable to run code received from an untrusted source (such as a WWW
server or client), the mechanism by which modules are imported has
been redesigned. It is now possible to provide your own function
\code{__import__} which is called whenever an \code{import} statement
is executed. There's a built-in function \code{__import__} which
provides the default implementation, but more interestingly, the various
steps it takes are available separately from the new built-in module
\code{imp}. (See the section on \code{imp} in the Library Reference
Manual for more information on this module.)

When you do \code{dir()} in a fresh interactive interpreter you will
see another ``secret'' object that's present in every module:
\code{__builtins__}. This is either a dictionary or a module
containing the set of built-in objects used by functions defined in
the current module. Although normally all modules are initialized with a
reference to the same dictionary, it is now possible to use a
different set of built-ins on a per-module basis. Together with the
fact that the \code{import} statement uses the \code{__import__}
function it finds in the importing module's dictionary of built-ins,
this forms the basis for a future restricted execution mode.

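As a rough sketch of providing your own \code{__import__}, the
following replaces the built-in temporarily with a function that
records every requested module name. It assumes a current Python~3
interpreter, where the built-ins are reachable through the
\code{builtins} module and \code{__import__} takes the five arguments
shown; the names \code{logging\_import} and \code{imported} are
hypothetical.

```python
import builtins

imported = []                          # module names seen by the hook
default_import = builtins.__import__   # keep the default implementation

def logging_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Record the request, then defer to the default implementation.
    imported.append(name)
    return default_import(name, globals, locals, fromlist, level)

builtins.__import__ = logging_import
try:
    import string                      # this statement goes through the hook
finally:
    builtins.__import__ = default_import   # always restore the default

print('string' in imported)            # True
```

A restricted environment would refuse some names instead of merely
recording them; restoring the default in a \code{finally} clause keeps
the experiment from affecting the rest of the program.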
\section{Python and the World-Wide Web}

There is a growing number of modules available for writing WWW tools.
The previous release already sported modules \code{gopherlib},
\code{ftplib}, \code{httplib} and \code{urllib} (which unifies the
other three) for accessing data through the commonest WWW protocols.
This release also provides \code{cgi}, to ease the writing of
server-side scripts that use the Common Gateway Interface protocol,
supported by most WWW servers. The module \code{urlparse} provides
precise parsing of a URL string into its components (address scheme,
network location, path, parameters, query, and fragment identifier).

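The six components can be seen in a short sketch. It assumes a current
Python~3 interpreter, where the same function is found in
\code{urllib.parse} rather than in a separate \code{urlparse} module;
the URL is a made-up placeholder.

```python
from urllib.parse import urlparse

# example.org is a reserved documentation domain, used as a placeholder.
parts = urlparse('http://www.example.org/docs/tut.html;v=1?chapter=3#lambda')

print(parts.scheme)    # http
print(parts.netloc)    # www.example.org
print(parts.path)      # /docs/tut.html
print(parts.params)    # v=1
print(parts.query)     # chapter=3
print(parts.fragment)  # lambda
```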
A rudimentary parser for HTML files is available in the module
\code{htmllib}. It currently supports a subset of HTML 1.0 (if you
bring it up to date, I'd love to receive your fixes!). Unfortunately
Python seems to be too slow for real-time parsing and formatting of
HTML such as required by interactive WWW browsers --- but it's good
enough to write a ``robot'' (an automated WWW browser that searches
the web for information).

\section{Miscellaneous}

The \code{socket} module now exports all the needed constants used for
socket operations, such as \code{SO_BROADCAST}.

The functions \code{popen()} and \code{fdopen()} in the \code{os}
module now follow the pattern of the built-in function \code{open()}:
the default mode argument is \code{'r'} and the optional third
argument specifies the buffer size, where \code{0} means unbuffered,
\code{1} means line-buffered, and any larger number means the size of
the buffer in bytes.