\documentstyle[twoside,11pt,myformat]{report}

\title{Python Tutorial}
Python is a simple, yet powerful programming language that bridges the
gap between C and shell programming, and is thus ideally suited for
``throw-away programming'' and rapid prototyping.  Its syntax is put
together from constructs borrowed from a variety of other languages;
most prominent are influences from ABC, C, Modula-3 and Icon.

The Python interpreter is easily extended with new functions and data
types implemented in C.  Python is also suitable as an extension
language for highly customizable C applications such as editors or

Python is available for various operating systems, amongst which
several flavors of {\UNIX}, Amoeba, the Apple Macintosh O.S.,

This tutorial introduces the reader informally to the basic concepts
and features of the Python language and system.  It helps to have a
Python interpreter handy for hands-on experience, but as the examples
are self-contained, the tutorial can be read off-line as well.

For a description of standard objects and modules, see the {\em Python
Library Reference} document.  The {\em Python Reference Manual} gives
a more formal definition of the language.
\pagenumbering{arabic}

\chapter{Whetting Your Appetite}

If you ever wrote a large shell script, you probably know this
feeling: you'd love to add yet another feature, but it's already so
slow, and so big, and so complicated; or the feature involves a system
call or other function that is only accessible from C \ldots Usually
the problem at hand isn't serious enough to warrant rewriting the
script in C; perhaps because the problem requires variable-length
strings or other data types (like sorted lists of file names) that are
easy in the shell but lots of work to implement in C; or perhaps just
because you're not sufficiently familiar with C.

In such cases, Python may be just the language for you.  Python is
simple to use, but it is a real programming language, offering much
more structure and support for large programs than the shell has.  On
the other hand, it also offers much more error checking than C, and,
being a {\em very-high-level language}, it has high-level data types
built in, such as flexible arrays and dictionaries that would cost you
days to implement efficiently in C.  Because of its more general data
types Python is applicable to a much larger problem domain than {\em
Awk} or even {\em Perl}, yet many things are at least as easy in
Python as in those languages.

Python allows you to split up your program into modules that can be
reused in other Python programs.  It comes with a large collection of
standard modules that you can use as the basis of your programs --- or
as examples to start learning to program in Python.  There are also
built-in modules that provide things like file I/O, system calls,
sockets, and even a generic interface to window systems (STDWIN).

Python is an interpreted language, which can save you considerable time
during program development because no compilation and linking are
necessary.  The interpreter can be used interactively, which makes it
easy to experiment with features of the language, to write throw-away
programs, or to test functions during bottom-up program development.
It is also a handy desk calculator.

Python allows writing very compact and readable programs.  Programs
written in Python are typically much shorter than equivalent C
programs, for several reasons:
\begin{itemize}
\item
the high-level data types allow you to express complex operations in a
\item
statement grouping is done by indentation instead of begin/end
\item
no variable or argument declarations are necessary.
\end{itemize}

Python is {\em extensible}: if you know how to program in C it is easy
to add a new built-in module to the interpreter, either to
perform critical operations at maximum speed, or to link Python
programs to libraries that may only be available in binary form (such
as a vendor-specific graphics library).  Once you are really hooked,
you can link the Python interpreter into an application written in C
and use it as an extension or command language for that application.

By the way, the language is named after the BBC show ``Monty
Python's Flying Circus'' and has nothing to do with nasty reptiles...
\section{Where From Here}

Now that you are all excited about Python, you'll want to examine it
in some more detail.  Since the best way to learn a language is
to use it, you are invited here to do so.

In the next chapter, the mechanics of using the interpreter are
explained.  This is rather mundane information, but essential for
trying out the examples shown later.

The rest of the tutorial introduces various features of the Python
language and system through examples, beginning with simple
expressions, statements and data types, through functions and modules,
and finally touching upon advanced concepts like exceptions
and user-defined classes.

When you're through with the tutorial (or just getting bored), you
should read the Library Reference, which gives complete (though terse)
reference material about built-in and standard types, functions and
modules that can save you a lot of time when writing Python programs.
\chapter{Using the Python Interpreter}

\section{Invoking the Interpreter}

The Python interpreter is usually installed as {\tt /usr/local/bin/python}
on those machines where it is available; putting {\tt /usr/local/bin} in
your {\UNIX} shell's search path makes it possible to start it by
typing the command

\bcode\begin{verbatim}
python
\end{verbatim}\ecode

to the shell.  Since the choice of the directory where the interpreter
lives is an installation option, other places are possible; check with
your local Python guru or system administrator.  (E.g., {\tt
/usr/local/python} is a popular alternative location.)

The interpreter operates somewhat like the {\UNIX} shell: when called
with standard input connected to a tty device, it reads and executes
commands interactively; when called with a file name argument or with
a file as standard input, it reads and executes a {\em script} from
that file.

A third way of starting the interpreter is
``{\tt python -c command [arg] ...}'', which
executes the statement(s) in {\tt command}, analogous to the shell's
{\tt -c} option.  Since Python statements often contain spaces or other
characters that are special to the shell, it is best to quote {\tt
command} in its entirety with double quotes.

Note that there is a difference between ``{\tt python file}'' and
``{\tt python $<$file}''.  In the latter case, input requests from the
program, such as calls to {\tt input()} and {\tt raw\_input()}, are
satisfied from {\em file}.  Since this file has already been read
until the end by the parser before the program starts executing, the
program will encounter EOF immediately.  In the former case (which is
usually what you want) they are satisfied from whatever file or device
is connected to standard input of the Python interpreter.

When a script file is used, it is sometimes useful to be able to run
the script and enter interactive mode afterwards.  This can be done by
passing {\tt -i} before the script.  (This does not work if the script
is read from standard input, for the same reason as explained in the
previous paragraph.)
\subsection{Argument Passing}

When known to the interpreter, the script name and additional
arguments thereafter are passed to the script in the variable {\tt
sys.argv}, which is a list of strings.  Its length is at least one;
when no script and no arguments are given, {\tt sys.argv[0]} is an
empty string.  When the script name is given as {\tt '-'} (meaning
standard input), {\tt sys.argv[0]} is set to {\tt '-'}.  When {\tt -c
command} is used, {\tt sys.argv[0]} is set to {\tt '-c'}.  Options
found after {\tt -c command} are not consumed by the Python
interpreter's option processing but left in {\tt sys.argv} for the
command to handle.
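
The rules for {\tt sys.argv[0]} can be sketched in a few lines.  This is
only an illustration: the function name {\tt describe} is made up, and
the code is written for a current Python~3 interpreter (an assumption;
the listings in this tutorial use the older {\tt print} statement).

```python
import sys

def describe(argv):
    """Return (how the interpreter was started, the remaining arguments).

    Hypothetical helper, not part of any library."""
    name = argv[0]
    if name == '':
        how = 'interactive, no script'
    elif name == '-':
        how = 'script read from standard input'
    elif name == '-c':
        how = 'command given with -c'
    else:
        how = 'script file ' + name
    return how, argv[1:]

how, args = describe(sys.argv)
print(how)
print(args)
```

Run as a script with a few extra arguments, this should report the
script's own name followed by those arguments as a list of strings.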
\subsection{Interactive Mode}

When commands are read from a tty, the interpreter is said to be in
{\em interactive\ mode}.  In this mode it prompts for the next command
with the {\em primary\ prompt}, usually three greater-than signs ({\tt
>>>}); for continuation lines it prompts with the
{\em secondary\ prompt},
by default three dots ({\tt ...}).  Typing an EOF (Control-D)
at the primary prompt causes the interpreter to exit with a zero exit
status.

The interpreter prints a welcome message stating its version number
and a copyright notice before printing the first prompt, e.g.:

\bcode\begin{verbatim}
Python 1.3 (Oct 13 1995)
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
\end{verbatim}\ecode
\section{The Interpreter and its Environment}

\subsection{Error Handling}

When an error occurs, the interpreter prints an error
message and a stack trace.  In interactive mode, it then returns to
the primary prompt; when input came from a file, it exits with a
nonzero exit status after printing
the stack trace.  (Exceptions handled by an {\tt except} clause in a
{\tt try} statement are not errors in this context.)  Some errors are
unconditionally fatal and cause an exit with a nonzero exit status; this
applies to internal inconsistencies and some cases of running out of
memory.  All error messages are written to the standard error stream;
normal output from the executed commands is written to standard
output.

Typing the interrupt character (usually Control-C or DEL) to the
primary or secondary prompt cancels the input and returns to the
primary prompt.
A problem with the GNU Readline package may prevent this.

Typing an interrupt while a command is executing raises the {\tt
KeyboardInterrupt} exception, which may be handled by a {\tt try}
statement.
\subsection{The Module Search Path}

When a module named {\tt spam} is imported, the interpreter searches
for a file named {\tt spam.py} in the list of directories specified by
the environment variable {\tt PYTHONPATH}.  It has the same syntax as
the {\UNIX} shell variable {\tt PATH}, i.e., a colon-separated list of
directory names.  When {\tt PYTHONPATH} is not set, or when the file
is not found there, the search continues in an installation-dependent
default path, usually {\tt .:/usr/local/lib/python}.

Actually, modules are searched for in the list of directories given by
the variable {\tt sys.path}, which is initialized from {\tt PYTHONPATH}
and the installation-dependent default.  This allows Python programs that
know what they're doing to modify or replace the module search path.
See the section on Standard Modules later.
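
As a minimal sketch of a program modifying its own search path, the
lines below put one extra directory in front of {\tt sys.path}.  The
directory name is invented for the example, and the code assumes a
current Python~3 interpreter rather than the version described here.

```python
import sys

# Hypothetical directory holding private modules; adjust to taste.
extra_dir = '/home/user/python-modules'

# Put it at the front so that it is searched before the default directories.
if extra_dir not in sys.path:
    sys.path.insert(0, extra_dir)

# Show the directories in the order in which they will be searched.
for directory in sys.path:
    print(directory)
```

A change made this way lasts only for the current run of the
interpreter; the environment variable {\tt PYTHONPATH} is left untouched.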
\subsection{``Compiled'' Python files}

As an important speed-up of the start-up time for short programs that
use a lot of standard modules, if a file called {\tt spam.pyc} exists
in the directory where {\tt spam.py} is found, this is assumed to
contain an already-``compiled'' version of the module {\tt spam}.  The
modification time of the version of {\tt spam.py} used to create {\tt
spam.pyc} is recorded in {\tt spam.pyc}, and the file is ignored if
these don't match.

Whenever {\tt spam.py} is successfully compiled, an attempt is made to
write the compiled version to {\tt spam.pyc}.  It is not an error if
this attempt fails; if for any reason the file is not written
completely, the resulting {\tt spam.pyc} file will be recognized as
invalid and thus ignored later.
\subsection{Executable Python scripts}

On BSD'ish {\UNIX} systems, Python scripts can be made directly
executable, like shell scripts, by putting the line

\bcode\begin{verbatim}
#! /usr/local/bin/python
\end{verbatim}\ecode

(assuming that's the name of the interpreter) at the beginning of the
script and giving the file an executable mode.  The {\tt \#!} must be
the first two characters of the file.
\subsection{The Interactive Startup File}

When you use Python interactively, it is frequently handy to have some
standard commands executed every time the interpreter is started.  You
can do this by setting an environment variable named {\tt
PYTHONSTARTUP} to the name of a file containing your start-up
commands.  This is similar to the {\tt .profile} feature of the {\UNIX}
shells.

This file is only read in interactive sessions, not when Python reads
commands from a script, and not when {\tt /dev/tty} is given as the
explicit source of commands (which otherwise behaves like an
interactive session).  It is executed in the same name space where
interactive commands are executed, so that objects that it defines or
imports can be used without qualification in the interactive session.
You can also change the prompts {\tt sys.ps1} and {\tt sys.ps2} in
this file.

If you want to read an additional start-up file from the current
directory, you can program this in the global start-up file, e.g.\
\verb\execfile('.pythonrc')\.  If you want to use the startup file
in a script, you must write this explicitly in the script, e.g.\
\verb\import os;\ \verb\execfile(os.environ['PYTHONSTARTUP'])\.
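
A start-up file might look like the sketch below.  Its contents are
purely illustrative: the prompt strings and the choice of imported module
are assumptions, and {\tt PYTHONSTARTUP} must be set to wherever the
file is saved.

```python
# Contents of a hypothetical start-up file named by PYTHONSTARTUP.
import sys
import os            # available in every session without typing the import

sys.ps1 = 'py> '     # primary prompt
sys.ps2 = '..> '     # secondary prompt, used on continuation lines
```

Because the file runs in the same name space as the interactive commands,
{\tt os} and {\tt sys} can then be used directly at the prompt.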
\section{Interactive Input Editing and History Substitution}

Some versions of the Python interpreter support editing of the current
input line and history substitution, similar to facilities found in
the Korn shell and the GNU Bash shell.  This is implemented using the
{\em GNU\ Readline} library, which supports Emacs-style and vi-style
editing.  This library has its own documentation which I won't
duplicate here; however, the basics are easily explained.

Perhaps the quickest check to see whether command line editing is
supported is typing Control-P to the first Python prompt you get.  If
it beeps, you have command line editing.  If nothing appears to
happen, or if \verb/^P/ is echoed, you can skip the rest of this
section.
\subsection{Line Editing}

If supported, input line editing is active whenever the interpreter
prints a primary or secondary prompt.  The current line can be edited
using the conventional Emacs control characters.  The most important
of these are: C-A (Control-A) moves the cursor to the beginning of the
line, C-E to the end, C-B moves it one position to the left, C-F to
the right.  Backspace erases the character to the left of the cursor,
C-D the character to its right.  C-K kills (erases) the rest of the
line to the right of the cursor, C-Y yanks back the last killed
string.  C-underscore undoes the last change you made; it can be
repeated for cumulative effect.

\subsection{History Substitution}

History substitution works as follows.  All non-empty input lines
issued are saved in a history buffer, and when a new prompt is given
you are positioned on a new line at the bottom of this buffer.  C-P
moves one line up (back) in the history buffer, C-N moves one down.
Any line in the history buffer can be edited; an asterisk appears in
front of the prompt to mark a line as modified.  Pressing the Return
key passes the current line to the interpreter.  C-R starts an
incremental reverse search; C-S starts a forward search.
\subsection{Key Bindings}

The key bindings and some other parameters of the Readline library can
be customized by placing commands in an initialization file called
{\tt \$HOME/.inputrc}.  Key bindings have the form

\bcode\begin{verbatim}
key-name: function-name
\end{verbatim}\ecode

or

\bcode\begin{verbatim}
"string": function-name
\end{verbatim}\ecode

and options can be set with

\bcode\begin{verbatim}
set option-name value
\end{verbatim}\ecode

For example:

\bcode\begin{verbatim}
# I prefer vi-style editing:
# Edit using a single line:
set horizontal-scroll-mode On
Meta-h: backward-kill-word
"\C-u": universal-argument
"\C-x\C-r": re-read-init-file
\end{verbatim}\ecode

Note that the default binding for TAB in Python is to insert a TAB
instead of Readline's default filename completion function.  If you
insist, you can override this by putting

\bcode\begin{verbatim}
\end{verbatim}\ecode

in your {\tt \$HOME/.inputrc}.  (Of course, this makes it hard to type
indented continuation lines...)
\subsection{Commentary}

This facility is an enormous step forward compared to previous
versions of the interpreter; however, some wishes are left: It would
be nice if the proper indentation were suggested on continuation lines
(the parser knows if an indent token is required next).  The
completion mechanism might use the interpreter's symbol table.  A
command to check (or even suggest) matching parentheses, quotes etc.
would also be useful.
\chapter{An Informal Introduction to Python}

In the following examples, input and output are distinguished by the
presence or absence of prompts ({\tt >>>} and {\tt ...}): to repeat
the example, you must type everything after the prompt, when the
prompt appears; lines that do not begin with a prompt are output from
the interpreter.

I'd prefer to use different fonts to distinguish input
from output, but the amount of LaTeX hacking that would require
is currently beyond my ability.

Note that a secondary prompt on a line by itself in an example means
you must type a blank line; this is used to end a multi-line command.
\section{Using Python as a Calculator}

Let's try some simple Python commands.  Start the interpreter and wait
for the primary prompt, {\tt >>>}.  (It shouldn't take long.)

The interpreter acts as a simple calculator: you can type an
expression at it and it will write the value.  Expression syntax is
straightforward: the operators {\tt +}, {\tt -}, {\tt *} and {\tt /}
work just like in most other languages (e.g., Pascal or C); parentheses
can be used for grouping.  For example:

\bcode\begin{verbatim}
>>> # This is a comment
>>> 2+2  # and a comment on the same line as code
>>> # Integer division returns the floor:
\end{verbatim}\ecode

As in C, the equal sign ({\tt =}) is used to assign a value to a
variable.  The value of an assignment is not written:

\bcode\begin{verbatim}
\end{verbatim}\ecode

A value can be assigned to several variables simultaneously:

\bcode\begin{verbatim}
>>> x = y = z = 0  # Zero x, y and z
\end{verbatim}\ecode

There is full support for floating point; operators with mixed type
operands convert the integer operand to floating point:

\bcode\begin{verbatim}
\end{verbatim}\ecode
Besides numbers, Python can also manipulate strings, enclosed in
single quotes or double quotes:

\bcode\begin{verbatim}
>>> '"Yes," he said.'
>>> "\"Yes,\" he said."
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
\end{verbatim}\ecode

Strings are written the same way as they are typed for input: inside
quotes and with quotes and other funny characters escaped by backslashes,
to show the precise value.  The string is enclosed in double quotes if
the string contains a single quote and no double quotes, else it's
enclosed in single quotes.  (The {\tt print} statement, described later,
can be used to write strings without quotes or escapes.)

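
The difference between the quoted form and the bare characters can be
sketched as follows.  The example leans on the built-in function
{\tt repr()}, which this chapter does not introduce, and on the
function-call form of {\tt print} found in a current Python~3
interpreter; both are assumptions made for the sake of a runnable
illustration.

```python
s = '"Isn\'t," she said.'

# repr() yields the quoted, escaped form that the interpreter echoes.
echoed = repr(s)
print(echoed)

# print() writes the bare characters, without quotes or escapes.
print(s)

# A string holding a single quote and no double quotes
# is echoed inside double quotes.
t = "doesn't"
print(repr(t))
```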
Strings can be concatenated (glued together) with the {\tt +}
operator, and repeated with {\tt *}:

\bcode\begin{verbatim}
>>> word = 'Help' + 'A'
>>> '<' + word*5 + '>'
'<HelpAHelpAHelpAHelpAHelpA>'
\end{verbatim}\ecode

Strings can be subscripted (indexed); as in C, the first character of
a string has subscript (index) 0.

There is no separate character type; a character is simply a string of
size one.  As in Icon, substrings can be specified with the {\em
slice} notation: two indices separated by a colon.

\bcode\begin{verbatim}
\end{verbatim}\ecode

Slice indices have useful defaults; an omitted first index defaults to
zero, an omitted second index defaults to the size of the string being
sliced.

\bcode\begin{verbatim}
>>> word[:2]    # The first two characters
>>> word[2:]    # All but the first two characters
\end{verbatim}\ecode

Here's a useful invariant of slice operations: \verb\s[:i] + s[i:]\
equals \verb\s\.

\bcode\begin{verbatim}
>>> word[:2] + word[2:]
>>> word[:3] + word[3:]
\end{verbatim}\ecode

Degenerate slice indices are handled gracefully: an index that is too
large is replaced by the string size, an upper bound smaller than the
lower bound returns an empty string.

\bcode\begin{verbatim}
\end{verbatim}\ecode
Indices may be negative numbers, to start counting from the right.

\bcode\begin{verbatim}
>>> word[-1]     # The last character
>>> word[-2]     # The last-but-one character
>>> word[-2:]    # The last two characters
>>> word[:-2]    # All but the last two characters
\end{verbatim}\ecode

But note that -0 is really the same as 0, so it does not count from
the right!

\bcode\begin{verbatim}
>>> word[-0]     # (since -0 equals 0)
\end{verbatim}\ecode

Out-of-range negative slice indices are truncated, but don't try this
for single-element (non-slice) indices:

\bcode\begin{verbatim}
>>> word[-10]    # error
Traceback (innermost last):
  File "<stdin>", line 1
IndexError: string index out of range
\end{verbatim}\ecode

The best way to remember how slices work is to think of the indices as
pointing {\em between} characters, with the left edge of the first
character numbered 0.  Then the right edge of the last character of a
string of {\tt n} characters has index {\tt n}, for example:

\bcode\begin{verbatim}
 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1
\end{verbatim}\ecode

The first row of numbers gives the position of the indices 0...5 in
the string; the second row gives the corresponding negative indices.
The slice from \verb\i\ to \verb\j\ consists of all characters between
the edges labeled \verb\i\ and \verb\j\, respectively.

For nonnegative indices, the length of a slice is the difference of
the indices, if both are within bounds, e.g., the length of
\verb\word[1:3]\ is 2.

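
The slice rules described so far can be checked mechanically.  The
following sketch reuses the string {\tt 'HelpA'} from the examples; it
assumes a current Python~3 interpreter, and the {\tt assert} statement
it relies on is not covered in this tutorial.

```python
word = 'HelpA'

# Omitted indices default to 0 and to the size of the string.
print(word[:2])       # the first two characters
print(word[2:])       # all but the first two characters

# The invariant s[:i] + s[i:] == s holds for every i, in range or not.
for i in range(-10, 11):
    assert word[:i] + word[i:] == word

# For nonnegative, in-range indices the slice length is j - i.
i, j = 1, 3
assert len(word[i:j]) == j - i

# Degenerate slices are handled gracefully.
print(word[1:100])    # too-large upper bound is replaced by the string size
print(word[2:1])      # upper bound below the lower bound: empty string
```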
The built-in function {\tt len()} returns the length of a string:

\bcode\begin{verbatim}
>>> s = 'supercalifragilisticexpialidocious'
\end{verbatim}\ecode

Python knows a number of {\em compound} data types, used to group
together other values.  The most versatile is the {\em list}, which
can be written as a list of comma-separated values (items) between
square brackets.  List items need not all have the same type.

\bcode\begin{verbatim}
>>> a = ['spam', 'eggs', 100, 1234]
['spam', 'eggs', 100, 1234]
\end{verbatim}\ecode

Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on:

\bcode\begin{verbatim}
>>> a[:2] + ['bacon', 2*2]
['spam', 'eggs', 'bacon', 4]
>>> 3*a[:3] + ['Boe!']
['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boe!']
\end{verbatim}\ecode

Unlike strings, which are {\em immutable}, it is possible to change
individual elements of a list:

\bcode\begin{verbatim}
['spam', 'eggs', 100, 1234]
['spam', 'eggs', 123, 1234]
\end{verbatim}\ecode

Assignment to slices is also possible, and this can even change the size
of the list:

\bcode\begin{verbatim}
>>> # Replace some items:
... a[1:1] = ['bletch', 'xyzzy']
[123, 'bletch', 'xyzzy', 1234]
>>> a[:0] = a     # Insert (a copy of) itself at the beginning
[123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
\end{verbatim}\ecode
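
As a compact recap of slice assignment, the sketch below replaces,
removes and inserts items in one list.  The starting values are chosen
for illustration only, and the code assumes a current Python~3
interpreter.

```python
a = ['spam', 'eggs', 100, 1234]

a[0:2] = [1, 12]                # replace two items by two others
a[0:2] = []                     # remove them again; the list shrinks
a[1:1] = ['bletch', 'xyzzy']    # insert items without replacing any
middle = a[:]                   # remember a copy of the list at this point

a[:0] = a                       # insert (a copy of) the list at its own start
print(middle)
print(a)
```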
The built-in function {\tt len()} also applies to lists:

\bcode\begin{verbatim}
\end{verbatim}\ecode

It is possible to nest lists (create lists containing other lists),
for example:

\bcode\begin{verbatim}
>>> p[1].append('xtra')     # See section 5.1
[1, [2, 3, 'xtra'], 4]
\end{verbatim}\ecode

Note that in the last example, {\tt p[1]} and {\tt q} really refer to
the same object!  We'll come back to {\em object semantics} later.
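
The sharing between {\tt p[1]} and {\tt q} can be made visible with a
short sketch.  It assumes a current Python~3 interpreter and uses the
{\tt is} operator, which has not been introduced at this point, to test
whether two names refer to the very same object.

```python
q = [2, 3]
p = [1, q, 4]

p[1].append('xtra')    # changes the one list that q also names
print(p)
print(q)
print(p[1] is q)       # the same object, not merely an equal one

# A slice copy is a separate object: later changes to q do not reach it.
r = q[:]
q.append('more')
print(r)
```

Taking a slice copy, as in the last lines, is one way to avoid the
sharing when it is not wanted.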
\section{First Steps Towards Programming}

Of course, we can use Python for more complicated tasks than adding
two and two together.  For instance, we can write an initial
subsequence of the {\em Fibonacci} series as follows:

\bcode\begin{verbatim}
>>> # Fibonacci series:
... # the sum of two elements defines the next
\end{verbatim}\ecode

This example introduces several new features.

\begin{itemize}

\item
The first line contains a {\em multiple assignment}: the variables
{\tt a} and {\tt b} simultaneously get the new values 0 and 1.  On the
last line this is used again, demonstrating that the expressions on
the right-hand side are all evaluated first before any of the
assignments take place.

\item
The {\tt while} loop executes as long as the condition (here: {\tt b <
10}) remains true.  In Python, as in C, any non-zero integer value is
true; zero is false.  The condition may also be a string or list value,
in fact any sequence; anything with a non-zero length is true, empty
sequences are false.  The test used in the example is a simple
comparison.  The standard comparison operators are written the same as
in C: {\tt <}, {\tt >}, {\tt ==}, {\tt <=}, {\tt >=} and {\tt !=}.

\item
The {\em body} of the loop is {\em indented}: indentation is Python's
way of grouping statements.  Python does not (yet!) provide an
intelligent input line editing facility, so you have to type a tab or
space(s) for each indented line.  In practice you will prepare more
complicated input for Python with a text editor; most text editors have
an auto-indent facility.  When a compound statement is entered
interactively, it must be followed by a blank line to indicate
completion (since the parser cannot guess when you have typed the last
line).

\item
The {\tt print} statement writes the value of the expression(s) it is
given.  It differs from just writing the expression you want to write
(as we did earlier in the calculator examples) in the way it handles
multiple expressions and strings.  Strings are printed without quotes,
and a space is inserted between items, so you can format things nicely,
like this:

\bcode\begin{verbatim}
>>> print 'The value of i is', i
The value of i is 65536
\end{verbatim}\ecode

A trailing comma avoids the newline after the output:

\bcode\begin{verbatim}
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
\end{verbatim}\ecode

Note that the interpreter inserts a newline before it prints the next
prompt if the last line was not completed.

\end{itemize}
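
Two of the points above, multiple assignment and the truth value of
numbers and sequences, can be tried out with the sketch below.  The
function name {\tt truth} is made up, function definitions are only
covered in the next chapter, and the code assumes a current Python~3
interpreter.

```python
# Multiple assignment: the right-hand side is evaluated completely
# before any name on the left-hand side is rebound.
a, b = 0, 1
a, b = b, a + b
x, y = 1, 2
x, y = y, x          # swap two values without a temporary

def truth(value):
    """Report how a value behaves when used as a condition."""
    if value:
        return 'true'
    else:
        return 'false'

# Zero and empty sequences count as false, everything else here as true.
for value in [0, 7, '', 'x', [], [0]]:
    print(repr(value) + ' counts as ' + truth(value))
```

Note that the list {\tt [0]} counts as true: it has length one, even
though its only item is zero.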
\chapter{More Control Flow Tools}

Besides the {\tt while} statement just introduced, Python knows the
usual control flow statements known from other languages, with some

\section{If Statements}

Perhaps the best-known statement type is the {\tt if} statement.

\bcode\begin{verbatim}
...     print 'Negative changed to zero'
\end{verbatim}\ecode

There can be zero or more {\tt elif} parts, and the {\tt else} part is
optional.  The keyword `{\tt elif}' is short for `{\tt else if}', and is
useful to avoid excessive indentation.  An {\tt if...elif...elif...}
sequence is a substitute for the {\em switch} or {\em case} statements
found in other languages.
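
A chain of {\tt elif} parts standing in for a {\em switch} might look
like the sketch below.  It is wrapped in a function with the made-up
name {\tt classify} so that it can be tried on several values (function
definitions are covered later in this chapter), and it assumes a current
Python~3 interpreter.

```python
def classify(x):
    """Name the case that an integer falls into (illustrative only)."""
    if x < 0:
        return 'negative'
    elif x == 0:
        return 'zero'
    elif x == 1:
        return 'single'
    else:
        return 'more'

for x in [-5, 0, 1, 42]:
    print(classify(x))
```

Only the first condition that holds is acted upon; the {\tt else} part
catches everything that the earlier tests let through.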
\section{For Statements}

The {\tt for} statement in Python differs a bit from what you may be
used to in C or Pascal.  Rather than always iterating over an
arithmetic progression of numbers (as in Pascal), or leaving the user
completely free in the iteration test and step (as in C), Python's {\tt
for} statement iterates over the items of any sequence (e.g., a list
or a string), in the order that they appear in the sequence.  For
example (no pun intended):

\bcode\begin{verbatim}
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
\end{verbatim}\ecode

It is not safe to modify the sequence being iterated over in the loop
(this can only happen for mutable sequence types, i.e., lists).  If
you need to modify the list you are iterating over, e.g., duplicate
selected items, you must iterate over a copy.  The slice notation
makes this particularly convenient:

\bcode\begin{verbatim}
>>> for x in a[:]: # make a slice copy of the entire list
...    if len(x) > 6: a.insert(0, x)
['defenestrate', 'cat', 'window', 'defenestrate']
\end{verbatim}\ecode
\section{The {\tt range()} Function}

If you do need to iterate over a sequence of numbers, the built-in
function {\tt range()} comes in handy.  It generates lists containing
arithmetic progressions, e.g.:

\bcode\begin{verbatim}
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
\end{verbatim}\ecode

The given end point is never part of the generated list; {\tt range(10)}
generates a list of 10 values, exactly the legal indices for items of a
sequence of length 10.  It is possible to let the range start at another
number, or to specify a different increment (even negative):

\bcode\begin{verbatim}
>>> range(-10, -100, -30)
\end{verbatim}\ecode

To iterate over the indices of a sequence, combine {\tt range()} and
{\tt len()} as follows:

\bcode\begin{verbatim}
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
\end{verbatim}\ecode
\section{Break and Continue Statements, and Else Clauses on Loops}

The {\tt break} statement, like in C, breaks out of the smallest
enclosing {\tt for} or {\tt while} loop.

The {\tt continue} statement, also borrowed from C, continues with the
next iteration of the loop.

Loop statements may have an {\tt else} clause; it is executed when the
loop terminates through exhaustion of the list (with {\tt for}) or when
the condition becomes false (with {\tt while}), but not when the loop is
terminated by a {\tt break} statement.  This is exemplified by the
following loop, which searches for prime numbers:

\bcode\begin{verbatim}
>>> for n in range(2, 10):
...     for x in range(2, n):
...         if n % x == 0:
...             print n, 'equals', x, '*', n/x
...             break
...     else:
...         print n, 'is a prime number'
\end{verbatim}\ecode
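
The interplay of {\tt break} and the {\tt else} clause can also be
packaged as a function.  The name {\tt factor} is made up for this
sketch, which assumes a current Python~3 interpreter rather than the
version described in this tutorial.

```python
def factor(n):
    """Return the smallest factor of n between 2 and n-1, or None."""
    for x in range(2, n):
        if n % x == 0:
            found = x
            break              # leaves the loop and skips the else clause
    else:
        # Reached only when the loop ran to exhaustion without a break.
        found = None
    return found

for n in range(2, 10):
    print(n)
    print(factor(n))
```

A result of {\tt None} means that no factor was found, in other words
that {\tt n} is a prime number.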
\section{Pass Statements}

The {\tt pass} statement does nothing.
It can be used when a statement is required syntactically but the
program requires no action.

\bcode\begin{verbatim}
...       pass # Busy-wait for keyboard interrupt
\end{verbatim}\ecode
\section{Defining Functions}

We can create a function that writes the Fibonacci series to an

\bcode\begin{verbatim}
>>> def fib(n):    # write Fibonacci series up to n
>>> # Now call the function we just defined:
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
\end{verbatim}\ecode

The keyword {\tt def} introduces a function {\em definition}.  It must
be followed by the function name and the parenthesized list of formal
parameters.  The statements that form the body of the function start at
the next line, indented by a tab stop.

The {\em execution} of a function introduces a new symbol table used
for the local variables of the function.  More precisely, all variable
assignments in a function store the value in the local symbol table;
variable references first look in the local symbol table, then
in the global symbol table, and then in the table of built-in names.
Thus, global variables cannot be directly assigned a value within a
function (unless named in a {\tt global} statement), although
they may be referenced.

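
The difference between referencing a global variable, assigning to a
local one of the same name, and using a {\tt global} statement can be
sketched as follows.  All names ({\tt counter}, {\tt peek},
{\tt shadow}, {\tt bump}) are made up for the illustration, which
assumes a current Python~3 interpreter.

```python
counter = 0

def peek():
    return counter + 1     # a reference: found in the global symbol table

def shadow():
    counter = 100          # an assignment: creates a local variable
    return counter

def bump():
    global counter         # name the global explicitly before assigning
    counter = counter + 1
    return counter

# The list items are evaluated from left to right.
results = [peek(), shadow(), counter, bump(), counter]
print(results)
```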
The actual parameters (arguments) to a function call are introduced in
the local symbol table of the called function when it is called; thus,
arguments are passed using {\em call\ by\ value}.%
\footnote{
Actually, {\em call by object reference} would be a better
description, since if a mutable object is passed, the caller
will see any changes the callee makes to it (e.g., items
inserted into a list).
}

When a function calls another function, a new local symbol table is
created for that call.

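
What the caller does and does not see can be sketched with two small
functions.  The names {\tt grow}, {\tt rebind} and {\tt basket} are made
up, and the code assumes a current Python~3 interpreter.

```python
def grow(items):
    items.append('new')       # changes the object the caller passed in

def rebind(items):
    items = ['replaced']      # rebinds the local name only
    return items

basket = ['old']
grow(basket)                  # the caller sees the inserted item
other = rebind(basket)        # the caller's list is left alone
print(basket)
print(other)
```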
1044 A function definition introduces the function name in the
1046 symbol table. The value
1047 of the function name
1048 has a type that is recognized by the interpreter as a user-defined
1049 function. This value can be assigned to another name which can then
also be used as a function. This serves as a general renaming
mechanism:
1053 \bcode\begin{verbatim
}
1055 <function object at
10042ed0>
1058 1 1 2 3 5 8 13 21 34 55 89
1060 \end{verbatim
}\ecode
1062 You might object that
{\tt fib
} is not a function but a procedure. In
1063 Python, like in C, procedures are just functions that don't return a
1064 value. In fact, technically speaking, procedures do return a value,
1065 albeit a rather boring one. This value is called
{\tt None
} (it's a
1066 built-in name). Writing the value
{\tt None
} is normally suppressed by
1067 the interpreter if it would be the only value written. You can see it
1068 if you really want to:
1070 \bcode\begin{verbatim
}
1074 \end{verbatim
}\ecode
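
For readers with a current Python~3 interpreter (an assumption; {\tt print} is a function there rather than a statement), the renaming mechanism and the {\tt None} return value can be tried together. The body of {\tt fib} below is a sketch that follows the comment in the definition above.

```python
def fib(n):
    """Write the Fibonacci series up to n."""
    a, b = 0, 1
    while b < n:
        print(b, end=' ')
        a, b = b, a + b
    print()

f = fib          # the function value gets a second name
f(100)
result = fib(0)  # a procedure still returns a value: None
print(result)
```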
1076 It is simple to write a function that returns a list of the numbers of
1077 the Fibonacci series, instead of printing it:
1079 \bcode\begin{verbatim
}
1080 >>> def fib2(n): # return Fibonacci series up to n
1084 ... result.append(b) # see below
1088 >>> f100 = fib2(
100) # call it
1089 >>> f100 # write the result
1090 [1,
1,
2,
3,
5,
8,
13,
21,
34,
55,
89]
1092 \end{verbatim
}\ecode
1094 This example, as usual, demonstrates some new Python features:
1099 The
{\tt return
} statement returns with a value from a function.
{\tt
1100 return
} without an expression argument is used to return from the middle
1101 of a procedure (falling off the end also returns from a procedure), in
1102 which case the
{\tt None
} value is returned.
1105 The statement
{\tt result.append(b)
} calls a
{\em method
} of the list
1106 object
{\tt result
}. A method is a function that `belongs' to an
1107 object and is named
{\tt obj.methodname
}, where
{\tt obj
} is some
1108 object (this may be an expression), and
{\tt methodname
} is the name
1109 of a method that is defined by the object's type. Different types
1110 define different methods. Methods of different types may have the
1111 same name without causing ambiguity. (It is possible to define your
1112 own object types and methods, using
{\em classes
}, as discussed later in this tutorial.)
1114 The method
{\tt append
} shown in the example is defined for
list objects; it adds a new element at the end of the list. In this example
1117 it is equivalent to
{\tt result = result +
[b
]}, but more efficient.
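
The two ways of growing the result list can be compared directly. The sketch below assumes a current Python~3 interpreter; {\tt fib2concat} is a hypothetical variant added only for the comparison.

```python
def fib2(n):
    """Return a list containing the Fibonacci series up to n."""
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)       # grows the existing list in place
        a, b = b, a + b
    return result

def fib2concat(n):
    """Same result, but builds a new list object on every step."""
    result = []
    a, b = 0, 1
    while b < n:
        result = result + [b]
        a, b = b, a + b
    return result

f100 = fib2(100)
print(f100)
print(fib2concat(100) == f100)
```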
1122 \chapter{Odds and Ends
}
1124 This chapter describes some things you've learned about already in
1125 more detail, and adds some new things as well.
1127 \section{More on Lists
}
The list data type has some more methods. Here are all of the methods
of list objects:
1134 \item[{\tt insert(i, x)
}]
1135 Insert an item at a given position. The first argument is the index of
1136 the element before which to insert, so
{\tt a.insert(
0, x)
} inserts at
1137 the front of the list, and
{\tt a.insert(len(a), x)
} is equivalent to {\tt a.append(x)}.
1140 \item[{\tt append(x)
}]
1141 Equivalent to
{\tt a.insert(len(a), x)
}.
1143 \item[{\tt index(x)
}]
1144 Return the index in the list of the first item whose value is
{\tt x
}.
1145 It is an error if there is no such item.
1147 \item[{\tt remove(x)
}]
1148 Remove the first item from the list whose value is
{\tt x
}.
1149 It is an error if there is no such item.
\item[{\tt sort()}]
Sort the items of the list, in place.
1154 \item[{\tt reverse()
}]
1155 Reverse the elements of the list, in place.
1157 \item[{\tt count(x)
}]
1158 Return the number of times
{\tt x
} appears in the list.
1162 An example that uses all list methods:
1164 \bcode\begin{verbatim
}
1165 >>> a =
[66.6,
333,
333,
1,
1234.5]
1166 >>> print a.count(
333), a.count(
66.6), a.count('x')
1171 [66.6,
333, -
1,
333,
1,
1234.5,
333]
1176 [66.6, -
1,
333,
1,
1234.5,
333]
1179 [333,
1234.5,
1,
333, -
1,
66.6]
1182 [-
1,
1,
66.6,
333,
333,
1234.5]
1184 \end{verbatim
}\ecode
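
The same sequence of method calls, written as a script for a current Python~3 interpreter (an assumption), reproduces the lists shown above. The helper names {\tt counts}, {\tt where} and {\tt backwards} are introduced only for this sketch.

```python
a = [66.6, 333, 333, 1, 1234.5]
counts = (a.count(333), a.count(66.6), a.count('x'))
a.insert(2, -1)        # insert before index 2
a.append(333)          # add at the end
where = a.index(333)   # index of the first 333
a.remove(333)          # remove the first 333
a.reverse()            # reverse in place
backwards = list(a)    # keep a copy of the reversed list
a.sort()               # sort in place
print(counts, where)
print(backwards)
print(a)
```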
1186 \section{The
{\tt del
} statement
}
1188 There is a way to remove an item from a list given its index instead
1189 of its value: the
{\tt del
} statement. This can also be used to
1190 remove slices from a list (which we did earlier by assignment of an
1191 empty list to the slice). For example:
1193 \bcode\begin{verbatim
}
1195 [-
1,
1,
66.6,
333,
333,
1234.5]
1198 [1,
66.6,
333,
333,
1234.5]
1203 \end{verbatim
}\ecode
1205 {\tt del
} can also be used to delete entire variables:
1207 \bcode\begin{verbatim
}
1210 \end{verbatim
}\ecode
1212 Referencing the name
{\tt a
} hereafter is an error (at least until
1213 another value is assigned to it). We'll find other uses for
{\tt del
} later.
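
The three uses of {\tt del} described in this section can be sketched as follows, assuming a current Python~3 interpreter; the names {\tt afteritem}, {\tt afterslice} and {\tt gone} are hypothetical.

```python
a = [-1, 1, 66.6, 333, 333, 1234.5]
del a[0]               # remove one item, given its index
afteritem = list(a)
del a[2:4]             # remove a slice
afterslice = list(a)
del a                  # remove the variable itself
try:
    a                  # referencing the name is now an error
    gone = False
except NameError:
    gone = True
print(afteritem, afterslice, gone)
```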
1216 \section{Tuples and Sequences
}
1218 We saw that lists and strings have many common properties, e.g.,
1219 indexing and slicing operations. They are two examples of
{\em
1220 sequence
} data types. Since Python is an evolving language, other
1221 sequence data types may be added. There is also another standard
1222 sequence data type: the
{\em tuple
}.
A tuple consists of a number of values separated by commas, for
instance:
1227 \bcode\begin{verbatim
}
1228 >>> t =
12345,
54321, 'hello!'
1232 (
12345,
54321, 'hello!')
1233 >>> # Tuples may be nested:
1234 ... u = t, (
1,
2,
3,
4,
5)
1236 ((
12345,
54321, 'hello!'), (
1,
2,
3,
4,
5))
1238 \end{verbatim
}\ecode
As you see, on output tuples are always enclosed in parentheses, so
1241 that nested tuples are interpreted correctly; they may be input with
1242 or without surrounding parentheses, although often parentheses are
1243 necessary anyway (if the tuple is part of a larger expression).
1245 Tuples have many uses, e.g., (x, y) coordinate pairs, employee records
1246 from a database, etc. Tuples, like strings, are immutable: it is not
1247 possible to assign to the individual items of a tuple (you can
simulate much of the same effect with slicing and concatenation,
though).
1251 A special problem is the construction of tuples containing
0 or
1
1252 items: the syntax has some extra quirks to accommodate these. Empty
1253 tuples are constructed by an empty pair of parentheses; a tuple with
1254 one item is constructed by following a value with a comma
1255 (it is not sufficient to enclose a single value in parentheses).
1256 Ugly, but effective. For example:
1258 \bcode\begin{verbatim
}
1260 >>> singleton = 'hello', # <-- note trailing comma
1268 \end{verbatim
}\ecode
1270 The statement
{\tt t =
12345,
54321, 'hello!'
} is an example of
{\em
1271 tuple packing
}: the values
{\tt 12345},
{\tt 54321} and
{\tt 'hello!'
}
are packed together in a tuple. The reverse operation is also
possible:
1275 \bcode\begin{verbatim
}
1278 \end{verbatim
}\ecode
1280 This is called, appropriately enough,
{\em tuple unpacking
}. Tuple
1281 unpacking requires that the list of variables on the left has the same
1282 number of elements as the length of the tuple. Note that multiple
assignment is really just a combination of tuple packing and tuple
unpacking.
1286 Occasionally, the corresponding operation on lists is useful:
{\em list
1287 unpacking
}. This is supported by enclosing the list of variables in
square brackets:
1290 \bcode\begin{verbatim
}
1291 >>> a =
['spam', 'eggs',
100,
1234]
1292 >>>
[a1, a2, a3, a4
] = a
1294 \end{verbatim
}\ecode
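
Packing, unpacking and the two special tuple forms can be exercised in a few lines. This is a sketch for a current Python~3 interpreter (an assumption); the names {\tt plain} and {\tt empty} are hypothetical.

```python
t = 12345, 54321, 'hello!'     # tuple packing
x, y, z = t                    # tuple unpacking
empty = ()                     # empty pair of parentheses
singleton = 'hello',           # note the trailing comma
plain = ('hello')              # parentheses alone do not make a tuple
[a1, a2, a3, a4] = ['spam', 'eggs', 100, 1234]   # list unpacking
x, y = y, x                    # multiple assignment: pack, then unpack
print(len(empty), len(singleton), type(plain).__name__)
print(x, y, a3)
```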
1296 \section{Dictionaries
}
1298 Another useful data type built into Python is the
{\em dictionary
}.
1299 Dictionaries are sometimes found in other languages as ``associative
1300 memories'' or ``associative arrays''. Unlike sequences, which are
1301 indexed by a range of numbers, dictionaries are indexed by
{\em keys
},
1302 which are strings (the use of non-string values as keys
1303 is supported, but beyond the scope of this tutorial).
1304 It is best to think of a dictionary as an unordered set of
1305 {\em key:value
} pairs, with the requirement that the keys are unique
1306 (within one dictionary).
1307 A pair of braces creates an empty dictionary:
\verb/
{}/.
1308 Placing a comma-separated list of key:value pairs within the
1309 braces adds initial key:value pairs to the dictionary; this is also the
1310 way dictionaries are written on output.
1312 The main operations on a dictionary are storing a value with some key
and extracting the value given the key. It is also possible to delete
a key:value pair with {\tt del}.
1316 If you store using a key that is already in use, the old value
1317 associated with that key is forgotten. It is an error to extract a
1318 value using a non-existent key.
1320 The
{\tt keys()
} method of a dictionary object returns a list of all the
keys used in the dictionary, in arbitrary order (if you want it sorted,
1322 just apply the
{\tt sort()
} method to the list of keys). To check
1323 whether a single key is in the dictionary, use the
\verb/has_key()/
1324 method of the dictionary.
1326 Here is a small example using a dictionary:
1328 \bcode\begin{verbatim
}
1329 >>> tel =
{'jack':
4098, 'sape':
4139}
1330 >>> tel
['guido'
] =
4127
1332 {'sape':
4139, 'guido':
4127, 'jack':
4098}
1336 >>> tel
['irv'
] =
4127
1338 {'guido':
4127, 'irv':
4127, 'jack':
4098}
1340 ['guido', 'irv', 'jack'
]
1341 >>> tel.has_key('guido')
1344 \end{verbatim
}\ecode
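
The main dictionary operations can be sketched as a script. It assumes a current Python~3 interpreter, where the membership test is written with the {\tt in} operator instead of the key-testing method used above; the names {\tt names}, {\tt known} and {\tt missing} are hypothetical.

```python
tel = {'jack': 4098, 'sape': 4139}
tel['guido'] = 4127          # store a value under a new key
jack = tel['jack']           # extract the value for a key
del tel['sape']              # delete a key:value pair
tel['irv'] = 4127
names = sorted(tel.keys())   # sort the keys for a predictable order
known = 'guido' in tel       # membership test
try:
    tel['nobody']            # extracting with a non-existent key is an error
    missing = False
except KeyError:
    missing = True
print(names, jack, known, missing)
```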
1346 \section{More on Conditions
}
1348 The conditions used in
{\tt while
} and
{\tt if
} statements above can
1349 contain other operators besides comparisons.
1351 The comparison operators
{\tt in
} and
{\tt not in
} check whether a value
1352 occurs (does not occur) in a sequence. The operators
{\tt is
} and
{\tt
1353 is not
} compare whether two objects are really the same object; this
1354 only matters for mutable objects like lists. All comparison operators
have the same priority, which is lower than that of all numerical
operators.
1358 Comparisons can be chained: e.g.,
{\tt a < b == c
} tests whether
{\tt a
}
1359 is less than
{\tt b
} and moreover
{\tt b
} equals
{\tt c
}.
1361 Comparisons may be combined by the Boolean operators
{\tt and
} and
{\tt
1362 or
}, and the outcome of a comparison (or of any other Boolean
1363 expression) may be negated with
{\tt not
}. These all have lower
1364 priorities than comparison operators again; between them,
{\tt not
} has
1365 the highest priority, and
{\tt or
} the lowest, so that
1366 {\tt A and not B or C
} is equivalent to
{\tt (A and (not B)) or C
}. Of
1367 course, parentheses can be used to express the desired composition.
1369 The Boolean operators
{\tt and
} and
{\tt or
} are so-called
{\em
1370 shortcut
} operators: their arguments are evaluated from left to right,
1371 and evaluation stops as soon as the outcome is determined. E.g., if
1372 {\tt A
} and
{\tt C
} are true but
{\tt B
} is false,
{\tt A and B and C
}
1373 does not evaluate the expression C. In general, the return value of a
1374 shortcut operator, when used as a general value and not as a Boolean, is
1375 the last evaluated argument.
1377 It is possible to assign the result of a comparison or other Boolean
1378 expression to a variable. For example,
1380 \bcode\begin{verbatim
}
1381 >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
1382 >>> non_null = string1 or string2 or string3
1386 \end{verbatim
}\ecode
1388 Note that in Python, unlike C, assignment cannot occur inside expressions.
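
The rules of this section can be checked with a short sketch, assuming a current Python~3 interpreter. The helper {\tt probe} is hypothetical; it records which operands are actually evaluated, so the shortcut behaviour becomes visible.

```python
a, b, c = 1, 2, 2
chained = a < b == c           # same as (a < b) and (b == c)

A, B, C = True, False, True
grouped = (A and not B or C) == ((A and (not B)) or C)

calls = []
def probe(name, value):
    calls.append(name)         # remember that this operand was evaluated
    return value

outcome = probe('A', True) and probe('B', False) and probe('C', True)

string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
chosen = string1 or string2 or string3   # the last evaluated argument
print(chained, grouped, calls, outcome, chosen)
```

Because {\tt B} is false, the third operand is never evaluated, and the value of the whole expression is the value of {\tt B}.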
1390 \section{Comparing Sequences and Other Types
}
1392 Sequence objects may be compared to other objects with the same
1393 sequence type. The comparison uses
{\em lexicographical
} ordering:
1394 first the first two items are compared, and if they differ this
1395 determines the outcome of the comparison; if they are equal, the next
1396 two items are compared, and so on, until either sequence is exhausted.
1397 If two items to be compared are themselves sequences of the same type,
1398 the lexicographical comparison is carried out recursively. If all
1399 items of two sequences compare equal, the sequences are considered
1400 equal. If one sequence is an initial subsequence of the other, the
shorter sequence is the smaller one. Lexicographical ordering for
1402 strings uses the
\ASCII{} ordering for individual characters. Some
1403 examples of comparisons between sequences with the same types:
1405 \bcode\begin{verbatim
}
1406 (
1,
2,
3) < (
1,
2,
4)
1407 [1,
2,
3] <
[1,
2,
4]
1408 'ABC' < 'C' < 'Pascal' < 'Python'
1409 (
1,
2,
3,
4) < (
1,
2,
4)
1411 (
1,
2,
3) == (
1.0,
2.0,
3.0)
1412 (
1,
2, ('aa', 'ab')) < (
1,
2, ('abc', 'a'),
4)
1413 \end{verbatim
}\ecode
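
Each of these comparisons can be evaluated directly. The sketch below, for a current Python~3 interpreter (an assumption), collects them in a hypothetical list {\tt checks} and adds one case for the initial-subsequence rule.

```python
checks = [
    (1, 2, 3) < (1, 2, 4),
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',
    (1, 2, 3, 4) < (1, 2, 4),
    (1, 2) < (1, 2, -1),          # an initial subsequence is the smaller one
    (1, 2, 3) == (1.0, 2.0, 3.0),
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),
]
print(all(checks))
```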
1415 Note that comparing objects of different types is legal. The outcome
1416 is deterministic but arbitrary: the types are ordered by their name.
1417 Thus, a list is always smaller than a string, a string is always
1418 smaller than a tuple, etc. Mixed numeric types are compared according
1419 to their numeric value, so
0 equals
0.0, etc.
%
1421 The rules for comparing objects of different types should
not be relied upon; they may change in a future version of
the language.
\chapter{Modules}

If you quit from the Python interpreter and enter it again, the
1430 definitions you have made (functions and variables) are lost.
1431 Therefore, if you want to write a somewhat longer program, you are
1432 better off using a text editor to prepare the input for the interpreter
1433 and running it with that file as input instead. This is known as creating a
1434 {\em script
}. As your program gets longer, you may want to split it
1435 into several files for easier maintenance. You may also want to use a
1436 handy function that you've written in several programs without copying
1437 its definition into each program.
1439 To support this, Python has a way to put definitions in a file and use
1440 them in a script or in an interactive instance of the interpreter.
1441 Such a file is called a
{\em module
}; definitions from a module can be
1442 {\em imported
} into other modules or into the
{\em main
} module (the
1443 collection of variables that you have access to in a script
1444 executed at the top level
1445 and in calculator mode).
1447 A module is a file containing Python definitions and statements. The
1448 file name is the module name with the suffix
{\tt .py
} appended. Within
1449 a module, the module's name (as a string) is available as the value of
1450 the global variable
{\tt __name__
}. For instance, use your favorite text
1451 editor to create a file called
{\tt fibo.py
} in the current directory
1452 with the following contents:
1454 \bcode\begin{verbatim
}
1455 # Fibonacci numbers module
1457 def fib(n): # write Fibonacci series up to n
1463 def fib2(n): # return Fibonacci series up to n
1470 \end{verbatim
}\ecode
Now enter the Python interpreter and import this module with the
following command:
1475 \bcode\begin{verbatim
}
1478 \end{verbatim
}\ecode
This does not enter the names of the functions defined in {\tt fibo}
directly in the current symbol table; it only enters the module name
{\tt fibo} there.
1485 Using the module name you can access the functions:
1487 \bcode\begin{verbatim
}
1489 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
1491 [1,
1,
2,
3,
5,
8,
13,
21,
34,
55,
89]
1495 \end{verbatim
}\ecode
1497 If you intend to use a function often you can assign it to a local name:
1499 \bcode\begin{verbatim
}
1502 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1504 \end{verbatim
}\ecode
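
The whole round trip (write a module file, make it findable, import it, use its functions) can be sketched in one script. The sketch assumes a current Python~3 interpreter and uses a scratch directory in place of the current directory; the module name {\tt fibodemo} is hypothetical and chosen so that it cannot clash with a real {\tt fibo.py}.

```python
import importlib
import os
import sys
import tempfile

source = '''# Fibonacci numbers module

def fib2(n):  # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a + b
    return result
'''

# Assumption: a scratch directory stands in for the current directory.
moduledir = tempfile.mkdtemp()
with open(os.path.join(moduledir, 'fibodemo.py'), 'w') as f:
    f.write(source)

sys.path.append(moduledir)       # extend the module search path
importlib.invalidate_caches()    # make sure the new file is noticed
import fibodemo                  # enters only the module name

print(fibodemo.__name__)
print(fibodemo.fib2(100))
fib2 = fibodemo.fib2             # assign to a local name for frequent use
print(fib2(50))
```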
1506 \section{More on Modules
}
A module can contain executable statements as well as function
definitions.
These statements are intended to initialize the module.
They are executed only the {\em first}
time the module is imported somewhere.
%
1515 In fact function definitions are also `statements' that are
1516 `executed'; the execution enters the function name in the
1517 module's global symbol table.
1520 Each module has its own private symbol table, which is used as the
1521 global symbol table by all functions defined in the module.
1522 Thus, the author of a module can use global variables in the module
without worrying about accidental clashes with a user's global
variables.
On the other hand, if you know what you are doing you can touch a
module's global variables with the same notation used to refer to its
functions,
1528 {\tt modname.itemname
}.
1530 Modules can import other modules.
It is customary but not required to place all {\tt import}
statements at the beginning of a module (or script, for that matter).
The imported module names are placed in the importing module's global
symbol table.
There is a variant of the {\tt import}
statement that imports names from a module directly into the importing
module's symbol table. For example:
1543 \bcode\begin{verbatim
}
1544 >>> from fibo import fib, fib2
1546 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1548 \end{verbatim
}\ecode
1550 This does not introduce the module name from which the imports are taken
1551 in the local symbol table (so in the example,
{\tt fibo
} is not defined).
1554 There is even a variant to import all names that a module defines:
1556 \bcode\begin{verbatim
}
1557 >>> from fibo import *
1559 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1561 \end{verbatim
}\ecode
This imports all names except those beginning with an underscore.
1566 \section{Standard Modules
}
1568 Python comes with a library of standard modules, described in a separate
1569 document (Python Library Reference). Some modules are built into the
1570 interpreter; these provide access to operations that are not part of the
1571 core of the language but are nevertheless built in, either for
1572 efficiency or to provide access to operating system primitives such as
1573 system calls. The set of such modules is a configuration option; e.g.,
1574 the
{\tt amoeba
} module is only provided on systems that somehow support
1575 Amoeba primitives. One particular module deserves some attention:
{\tt
1576 sys
}, which is built into every Python interpreter. The variables
{\tt
1577 sys.ps1
} and
{\tt sys.ps2
} define the strings used as primary and
secondary prompts:
1580 \bcode\begin{verbatim
}
1590 \end{verbatim
}\ecode
These two variables are only defined if the interpreter is in
interactive mode.

The variable {\tt sys.path}
is a list of strings that determine the interpreter's search path for
modules.
It is initialized to a default path taken from the environment variable
{\tt PYTHONPATH},
or from a built-in default if {\tt PYTHONPATH} is not set.
1604 You can modify it using standard list operations, e.g.:
1606 \bcode\begin{verbatim
}
1608 >>> sys.path.append('/ufs/guido/lib/python')
1610 \end{verbatim
}\ecode
1612 \section{The
{\tt dir()
} function
}
1614 The built-in function
{\tt dir
} is used to find out which names a module
1615 defines. It returns a sorted list of strings:
1617 \bcode\begin{verbatim
}
1618 >>> import fibo, sys
1620 ['__name__', 'fib', 'fib2'
]
1622 ['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit',
1623 'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace',
1624 'stderr', 'stdin', 'stdout', 'version'
]
1626 \end{verbatim
}\ecode
1628 Without arguments,
{\tt dir()
} lists the names you have defined currently:
1630 \bcode\begin{verbatim
}
1631 >>> a =
[1,
2,
3,
4,
5]
1632 >>> import fibo, sys
1635 ['__name__', 'a', 'fib', 'fibo', 'sys'
]
1637 \end{verbatim
}\ecode
1639 Note that it lists all types of names: variables, modules, functions, etc.
1641 {\tt dir()
} does not list the names of built-in functions and variables.
If you want a list of those, they are defined in the standard module
{\tt __builtin__}:
1645 \bcode\begin{verbatim
}
1646 >>> import __builtin__
1647 >>> dir(__builtin__)
1648 ['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError',
1649 'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
1650 'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError',
1651 'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError',
1652 'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce',
1653 'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float',
1654 'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long',
1655 'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input',
1656 'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange'
]
1658 \end{verbatim
}\ecode
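
A short sketch of the same inspection for a current Python~3 interpreter (an assumption). There the module holding the built-in names is called {\tt builtins}, and the exact lists differ from the ones printed above, so the sketch only tests for a few names.

```python
import sys
import builtins   # Python 3 name for the module of built-in names

a = [1, 2, 3]
sysnames = dir(sys)          # names defined by a module
localnames = dir()           # names defined in the current scope
print('path' in sysnames, 'stdout' in sysnames)
print(sysnames == sorted(sysnames))
print('a' in localnames, 'sys' in localnames)
print('len' in dir(builtins))
```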
1661 \chapter{Output Formatting
}
1663 So far we've encountered two ways of writing values:
{\em expression
1664 statements
} and the
{\tt print
} statement. (A third way is using the
1665 {\tt write
} method of file objects; the standard output file can be
1666 referenced as
{\tt sys.stdout
}. See the Library Reference for more
1667 information on this.)
1669 Often you'll want more control over the formatting of your output than
1670 simply printing space-separated values. The key to nice formatting in
1671 Python is to do all the string handling yourself; using string slicing
1672 and concatenation operations you can create any lay-out you can imagine.
1673 The standard module
{\tt string
} contains some useful operations for
1674 padding strings to a given column width; these will be discussed shortly.
1675 Finally, the
\code{\%
} operator (modulo) with a string left argument
1676 interprets this string as a C sprintf format string to be applied to the
right argument, and returns the string resulting from this formatting
operation.
1680 One question remains, of course: how do you convert values to strings?
1681 Luckily, Python has a way to convert any value to a string: just write
1682 the value between reverse quotes (
\verb/``/). Some examples:
1684 \bcode\begin{verbatim
}
1687 >>> s = 'The value of x is ' + `x` + ', and y is ' + `y` + '...'
1689 The value of x is
31.4, and y is
40000...
1690 >>> # Reverse quotes work on other types besides numbers:
1695 >>> # Converting a string adds string quotes and backslashes:
1696 ... hello = 'hello, world
\n'
1697 >>> hellos = `hello`
1700 >>> # The argument of reverse quotes may be a tuple:
1701 ... `x, y, ('spam', 'eggs')`
1702 "(
31.4,
40000, ('spam', 'eggs'))"
1704 \end{verbatim
}\ecode
1706 Here are two ways to write a table of squares and cubes:
1708 \bcode\begin{verbatim
}
1710 >>> for x in range(
1,
11):
1711 ... print string.rjust(`x`,
2), string.rjust(`x*x`,
3),
1712 ... # Note trailing comma on previous line
1713 ... print string.rjust(`x*x*x`,
4)
1725 >>> for x in range(
1,
11):
1726 ... print '
%2d %3d %4d' % (x, x*x, x*x*x)
1739 \end{verbatim
}\ecode
1741 (Note that one space between each column was added by the way
{\tt print
}
1742 works: it always adds spaces between its arguments.)
1744 This example demonstrates the function
{\tt string.rjust()
}, which
1745 right-justifies a string in a field of a given width by padding it with
1746 spaces on the left. There are similar functions
{\tt string.ljust()
}
1747 and
{\tt string.center()
}. These functions do not write anything, they
1748 just return a new string. If the input string is too long, they don't
1749 truncate it, but return it unchanged; this will mess up your column
1750 lay-out but that's usually better than the alternative, which would be
1751 lying about a value. (If you really want truncation you can always add
1752 a slice operation, as in
{\tt string.ljust(x,~n)
[0:n
]}.)
1754 There is another function,
{\tt string.zfill
}, which pads a numeric
string on the left with zeros. It understands about plus and minus
signs:
1758 \bcode\begin{verbatim
}
1759 >>> string.zfill('
12',
5)
1761 >>> string.zfill('-
3.14',
7)
1763 >>> string.zfill('
3.14159265359',
5)
1766 \end{verbatim
}\ecode
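
The formatting techniques of this chapter carry over to a current Python~3 interpreter with small changes, which the following sketch assumes: the padding operations are methods of string objects, and {\tt repr()} takes the place of reverse quotes. The names {\tt byhand}, {\tt byformat}, {\tt padded}, {\tt toolong} and {\tt cut} are hypothetical.

```python
byhand = []
byformat = []
for x in range(1, 11):
    # Convert each value to a string and pad it by hand...
    byhand.append(repr(x).rjust(2) + ' ' + repr(x*x).rjust(3) + ' ' + repr(x*x*x).rjust(4))
    # ...or let the formatting operator apply a sprintf-style format.
    byformat.append('%2d %3d %4d' % (x, x*x, x*x*x))
print('\n'.join(byhand))

padded = ['12'.zfill(5), '-3.14'.zfill(7), '3.14159265359'.zfill(5)]
toolong = 'abcdef'.ljust(3)      # too long: returned unchanged
cut = 'abcdef'.ljust(3)[0:3]     # explicit truncation with a slice
print(padded, toolong, cut)
```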
1769 \chapter{Errors and Exceptions
}
1771 Until now error messages haven't been more than mentioned, but if you
1772 have tried out the examples you have probably seen some. There are
1773 (at least) two distinguishable kinds of errors:
{\em syntax\ errors
}
1774 and
{\em exceptions
}.
1776 \section{Syntax Errors
}
1778 Syntax errors, also known as parsing errors, are perhaps the most common
1779 kind of complaint you get while you are still learning Python:
1781 \bcode\begin{verbatim
}
1782 >>> while
1 print 'Hello world'
1783 File "<stdin>", line
1
1784 while
1 print 'Hello world'
1786 SyntaxError: invalid syntax
1788 \end{verbatim
}\ecode
1790 The parser repeats the offending line and displays a little `arrow'
1791 pointing at the earliest point in the line where the error was detected.
The error is caused by (or at least detected at) the token
{\em preceding}
the arrow: in the example, the error is detected at the keyword
1795 {\tt print
}, since a colon (
{\tt :
}) is missing before it.
1796 File name and line number are printed so you know where to look in case
1797 the input came from a script.
1799 \section{Exceptions
}
1801 Even if a statement or expression is syntactically correct, it may
1802 cause an error when an attempt is made to execute it.
1803 Errors detected during execution are called
{\em exceptions
} and are
1804 not unconditionally fatal: you will soon learn how to handle them in
1805 Python programs. Most exceptions are not handled by programs,
1806 however, and result in error messages as shown here:
1808 \bcode\small\begin{verbatim
}
1810 Traceback (innermost last):
1811 File "<stdin>", line
1
1812 ZeroDivisionError: integer division or modulo
1814 Traceback (innermost last):
1815 File "<stdin>", line
1
1818 Traceback (innermost last):
1819 File "<stdin>", line
1
1820 TypeError: illegal argument type for built-in operation
1822 \end{verbatim
}\ecode
1824 The last line of the error message indicates what happened.
1825 Exceptions come in different types, and the type is printed as part of
1826 the message: the types in the example are
1827 {\tt ZeroDivisionError
},
The string printed as the exception type is the name of the built-in
exception that occurred. This is true for all built-in
exceptions, but need not be true for user-defined exceptions (although
it is a useful convention).
Standard exception names are built-in identifiers (not reserved
keywords).

The rest of the line is a detail whose interpretation depends on the
exception type.
The preceding part of the error message shows the context where the
exception happened, in the form of a stack backtrace.
In general the backtrace lists source lines; however,
it will not display lines read from standard input.
The Python library reference manual lists the built-in exceptions and
their meanings.
1849 \section{Handling Exceptions
}
1851 It is possible to write programs that handle selected exceptions.
1852 Look at the following example, which prints a table of inverses of
1853 some floating point numbers:
1855 \bcode\begin{verbatim
}
1856 >>> numbers =
[0.3333,
2.5,
0,
10]
1857 >>> for x in numbers:
1861 ... except ZeroDivisionError:
1862 ... print '*** has no inverse ***'
1866 0 *** has no inverse ***
1869 \end{verbatim
}\ecode
1871 The
{\tt try
} statement works as follows.
\begin{itemize}
\item
First, the {\em try\ clause}
(the statement(s) between the {\tt try} and {\tt except} keywords) is
executed.
\item
If no exception occurs, the {\em except\ clause}
is skipped and execution of the {\tt try} statement is finished.
\item
If an exception occurs during execution of the try clause,
the rest of the clause is skipped. Then, if
its type matches the exception named after the {\tt except} keyword,
the except clause is executed,
and then execution continues after the {\tt try} statement.
\item
If an exception occurs which does not match the exception named in the
except clause, it is passed on to outer try statements; if no handler is
found, it is an {\em unhandled\ exception}
and execution stops with a message as shown above.
\end{itemize}
1895 A
{\tt try
} statement may have more than one except clause, to specify
1896 handlers for different exceptions.
1897 At most one handler will be executed.
1898 Handlers only handle exceptions that occur in the corresponding try
1899 clause, not in other handlers of the same
{\tt try
} statement.
An except clause may name multiple exceptions as a parenthesized list,
e.g.:
1903 \bcode\begin{verbatim
}
1904 ... except (RuntimeError, TypeError, NameError):
1906 \end{verbatim
}\ecode
The last except clause may omit the exception name(s), to serve as a
wildcard.
1910 Use this with extreme caution, since it is easy to mask a real
1911 programming error in this way!
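
The handling rules above can be sketched as a script for a current Python~3 interpreter (an assumption). The helper {\tt classify} and the lists {\tt table} and {\tt results} are hypothetical; {\tt classify} shows one {\tt try} statement with several except clauses, of which at most one is executed.

```python
numbers = [0.3333, 2.5, 0, 10]
table = []
for x in numbers:
    try:
        table.append((x, 1.0 / x))
    except ZeroDivisionError:
        table.append((x, '*** has no inverse ***'))

def classify(action):
    try:
        return action()
    except (TypeError, NameError):     # one handler for several exceptions
        return 'type or name'
    except ZeroDivisionError:
        return 'division'

results = [classify(lambda: 1 + '1'), classify(lambda: 1 / 0), classify(lambda: 4)]
print(table)
print(results)
```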
When an exception occurs, it may have an associated value, also known as
the exception's {\em argument}.
1916 The presence and type of the argument depend on the exception type.
1917 For exception types which have an argument, the except clause may
1918 specify a variable after the exception name (or list) to receive the
1919 argument's value, as follows:
1921 \bcode\begin{verbatim
}
1924 ... except NameError, x:
1925 ... print 'name', x, 'undefined'
1929 \end{verbatim
}\ecode
1931 If an exception has an argument, it is printed as the last part
1932 (`detail') of the message for unhandled exceptions.
1934 Exception handlers don't just handle exceptions if they occur
1935 immediately in the try clause, but also if they occur inside functions
1936 that are called (even indirectly) in the try clause.
1939 \bcode\begin{verbatim
}
1940 >>> def this_fails():
1945 ... except ZeroDivisionError, detail:
1946 ... print 'Handling run-time error:', detail
1948 Handling run-time error: integer division or modulo
1950 \end{verbatim
}\ecode
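
In a current Python~3 interpreter (an assumption) the variable that receives the exception's argument is introduced with the keyword {\tt as} rather than a comma, and the wording of the detail text differs from the messages shown above. A sketch, with hypothetical names {\tt message} and {\tt namemessage}:

```python
def this_fails():
    x = 1 / 0

try:
    this_fails()               # the error occurs inside the called function
except ZeroDivisionError as detail:
    message = 'Handling run-time error: ' + str(detail)

try:
    spam                       # hypothetical, deliberately undefined name
except NameError as detail:
    namemessage = str(detail)

print(message)
print(namemessage)
```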
1952 \section{Raising Exceptions
}
1954 The
{\tt raise
} statement allows the programmer to force a specified
exception to occur. For example:
1958 \bcode\begin{verbatim
}
1959 >>> raise NameError, 'HiThere'
1960 Traceback (innermost last):
1961 File "<stdin>", line
1
1964 \end{verbatim
}\ecode
1966 The first argument to
{\tt raise
} names the exception to be raised.
1967 The optional second argument specifies the exception's argument.
1969 \section{User-defined Exceptions
}
Programs may name their own exceptions by assigning a string to a
variable. For example:
1975 \bcode\begin{verbatim
}
1976 >>> my_exc = 'my_exc'
1978 ... raise my_exc,
2*
2
1979 ... except my_exc, val:
1980 ... print 'My exception occurred, value:', val
1982 My exception occurred, value:
4
1984 Traceback (innermost last):
1985 File "<stdin>", line
1
1988 \end{verbatim
}\ecode
1990 Many standard modules use this to
report errors that may occur in
1991 functions they define.
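
Current Python~3 interpreters no longer accept strings as exceptions, so the following sketch swaps in a different technique: the user-defined exception is a class derived from the built-in {\tt Exception}. The class name {\tt MyError} and the variables {\tt caught} and {\tt argument} are hypothetical.

```python
class MyError(Exception):
    """Hypothetical user-defined exception carrying one argument."""

try:
    raise MyError(2 * 2)           # raise with an argument
except MyError as val:
    caught = val.args[0]

try:
    raise NameError('HiThere')     # built-in exceptions take arguments too
except NameError as err:
    argument = err.args[0]

print('My exception occurred, value:', caught)
print(argument)
```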
1993 \section{Defining Clean-up Actions
}
1995 The
{\tt try
} statement has another optional clause which is intended to
1996 define clean-up actions that must be executed under all circumstances.
1999 \bcode\begin{verbatim
}
2001 ... raise KeyboardInterrupt
2003 ... print 'Goodbye, world!'
2006 Traceback (innermost last):
2007 File "<stdin>", line
2
2010 \end{verbatim
}\ecode
2012 A
{\tt finally
} clause is executed whether or not an exception has
2013 occurred in the
{\tt try
} clause. When an exception has occurred, it
2014 is re-raised after the
{\tt finally
} clause is executed. The
2015 {\tt finally
} clause is also executed ``on the way out'' when the
2016 {\tt try
} statement is left via a
{\tt break
} or
{\tt return
}
statement.
2019 A
{\tt try
} statement must either have one or more
{\tt except
}
2020 clauses or one
{\tt finally
} clause, but not both.
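
Both ways of leaving a {\tt try} statement through its {\tt finally} clause can be sketched as follows, assuming a current Python~3 interpreter (which, unlike the version described here, also accepts {\tt except} and {\tt finally} clauses on the same {\tt try} statement). The function names and the list {\tt events} are hypothetical.

```python
events = []

def leave_early():
    try:
        events.append('try')
        return 'returned'
    finally:
        events.append('cleanup on return')      # runs on the way out

def fail():
    try:
        raise KeyboardInterrupt
    finally:
        events.append('cleanup on exception')   # runs, then the exception is re-raised

value = leave_early()
try:
    fail()
except KeyboardInterrupt:
    events.append('re-raised')
print(value, events)
```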
\chapter{Classes}

Python's class mechanism adds classes to the language with a minimum
2026 of new syntax and semantics. It is a mixture of the class mechanisms
2027 found in
\Cpp{} and Modula-
3. As is true for modules, classes in Python
2028 do not put an absolute barrier between definition and user, but rather
2029 rely on the politeness of the user not to ``break into the
2030 definition.'' The most important features of classes are retained
2031 with full power, however: the class inheritance mechanism allows
2032 multiple base classes, a derived class can override any methods of its
2033 base class(es), a method can call the method of a base class with the
2034 same name. Objects can contain an arbitrary amount of private data.
2036 In
\Cpp{} terminology, all class members (including the data members) are
2037 {\em public
}, and all member functions are
{\em virtual
}. There are
2038 no special constructors or destructors. As in Modula-
3, there are no
2039 shorthands for referencing the object's members from its methods: the
2040 method function is declared with an explicit first argument
2041 representing the object, which is provided implicitly by the call. As
2042 in Smalltalk, classes themselves are objects, albeit in the wider
2043 sense of the word: in Python, all data types are objects. This
2044 provides semantics for importing and renaming. But, just like in
\Cpp{}
2045 or Modula-
3, built-in types cannot be used as base classes for
2046 extension by the user. Also, like in
\Cpp{} but unlike in Modula-
3, most
2047 built-in operators with special syntax (arithmetic operators,
2048 subscripting etc.) can be redefined for class members.
\section{A word about terminology}

Lacking universally accepted terminology to talk about classes, I'll
make occasional use of Smalltalk and \Cpp{} terms.  (I'd use Modula-3
terms, since its object-oriented semantics are closer to those of
Python than those of \Cpp{}, but I expect that few readers have heard
of it...)

I also have to warn you that there's a terminological pitfall for
object-oriented readers: the word ``object'' in Python does not
necessarily mean a class instance.  Like \Cpp{} and Modula-3, and unlike
Smalltalk, not all types in Python are classes: the basic built-in
types like integers and lists aren't, and even somewhat more exotic
types like files aren't.  However, {\em all} Python types share a little
bit of common semantics that is best described by using the word
object.

Objects have individuality, and multiple names (in multiple scopes)
can be bound to the same object.  This is known as aliasing in other
languages.  This is usually not appreciated at first glance, and can
be safely ignored when dealing with immutable basic
types (numbers, strings, tuples).  However, aliasing has an
(intended!) effect on the semantics of Python code involving mutable
objects such as lists, dictionaries, and most types representing
entities outside the program (files, windows, etc.).  This is usually
used to the benefit of the program, since aliases behave like pointers
in some respects.  For example, passing an object is cheap since only
a pointer is passed by the implementation; and if a function modifies
an object passed as an argument, the caller will see the change --- this
obviates the need for two different argument passing mechanisms as in
Pascal.
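
A minimal sketch of aliasing with a mutable object (the names
\verb\a\, \verb\b\ and \verb\append_marker\ are made up for this
illustration): two names are bound to one list, and a function that
modifies its argument changes the object the caller sees.

\bcode\begin{verbatim}
def append_marker(seq):
    # Modifies the list object it was handed; no copy is made.
    seq.append('marker')

a = [1, 2, 3]
b = a              # b is a second name for the same list object
append_marker(b)   # the change is visible through a as well
\end{verbatim}\ecode
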
\section{Python scopes and name spaces}

Before introducing classes, I first have to tell you something about
Python's scope rules.  Class definitions play some neat tricks with
name spaces, and you need to know how scopes and name spaces work to
fully understand what's going on.  Incidentally, knowledge about this
subject is useful for any advanced Python programmer.

Let's begin with some definitions.

A {\em name space} is a mapping from names to objects.  Most name
spaces are currently implemented as Python dictionaries, but that's
normally not noticeable in any way (except for performance), and it
may change in the future.  Examples of name spaces are: the set of
built-in names (functions such as \verb\abs()\, and built-in exception
names); the global names in a module; and the local names in a
function invocation.  In a sense the set of attributes of an object
also forms a name space.  The important thing to know about name
spaces is that there is absolutely no relation between names in
different name spaces; for instance, two different modules may both
define a function ``maximize'' without confusion --- users of the
modules must prefix it with the module name.
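
As an illustration, assuming an interpreter that provides both the
\verb\math\ and \verb\cmath\ modules: each defines a function called
\verb\sqrt\, and the module prefix says which one is meant.  The
variable names are invented for the example.

\bcode\begin{verbatim}
import math
import cmath

# Same function name, two unrelated name spaces.
real_root = math.sqrt(4.0)
complex_root = cmath.sqrt(-4.0)
\end{verbatim}\ecode
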

By the way, I use the word {\em attribute} for any name following a
dot --- for example, in the expression \verb\z.real\, \verb\real\ is
an attribute of the object \verb\z\.  Strictly speaking, references to
names in modules are attribute references: in the expression
\verb\modname.funcname\, \verb\modname\ is a module object and
\verb\funcname\ is an attribute of it.  In this case there happens to
be a straightforward mapping between the module's attributes and the
global names defined in the module: they share the same name space!%
\footnote{
Except for one thing.  Module objects have a secret read-only
attribute called {\tt __dict__} which returns the dictionary
used to implement the module's name space; the name
{\tt __dict__} is an attribute but not a global name.
Obviously, using this violates the abstraction of name space
implementation, and should be restricted to things like
post-mortem debuggers...
}

Attributes may be read-only or writable.  In the latter case,
assignment to attributes is possible.  Module attributes are writable:
you can write \verb\modname.the_answer = 42\.  Writable attributes may
also be deleted with the \verb\del\ statement, e.g.
\verb\del modname.the_answer\.

Name spaces are created at different moments and have different
lifetimes.  The name space containing the built-in names is created
when the Python interpreter starts up, and is never deleted.  The
global name space for a module is created when the module definition
is read in; normally, module name spaces also last until the
interpreter quits.  The statements executed by the top-level
invocation of the interpreter, either read from a script file or
interactively, are considered part of a module called \verb\__main__\,
so they have their own global name space.  (The built-in names
actually also live in a module; this is called \verb\__builtin__\.)

The local name space for a function is created when the function is
called, and deleted when the function returns or raises an exception
that is not handled within the function.  (Actually, forgetting would
be a better way to describe what actually happens.)  Of course,
recursive invocations each have their own local name space.

A {\em scope} is a textual region of a Python program where a name space
is directly accessible.  ``Directly accessible'' here means that an
unqualified reference to a name attempts to find the name in the name
space.

Although scopes are determined statically, they are used dynamically.
At any time during execution, exactly three nested scopes are in use
(i.e., exactly three name spaces are directly accessible): the
innermost scope, which is searched first, contains the local names,
the middle scope, searched next, contains the current module's global
names, and the outermost scope (searched last) is the name space
containing built-in names.

Usually, the local scope references the local names of the (textually)
current function.  Outside of functions, the local scope references
the same name space as the global scope: the module's name space.
Class definitions place yet another name space in the local scope.

It is important to realize that scopes are determined textually: the
global scope of a function defined in a module is that module's name
space, no matter from where or by what alias the function is called.
On the other hand, the actual search for names is done dynamically, at
run time --- however, the language definition is evolving towards
static name resolution, at ``compile'' time, so don't rely on dynamic
name resolution!  (In fact, local variables are already determined
statically.)

A special quirk of Python is that assignments always go into the
innermost scope.  Assignments do not copy data --- they just
bind names to objects.  The same is true for deletions: the statement
\verb\del x\ removes the binding of \verb\x\ from the name space
referenced by the local scope.  In fact, all operations that introduce
new names use the local scope: in particular, import statements and
function definitions bind the module or function name in the local
scope.  (The \verb\global\ statement can be used to indicate that
particular variables live in the global scope.)
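
The three scopes and the effect of the \verb\global\ statement can be
sketched as follows.  The names \verb\counter\, \verb\bump_local\,
\verb\bump_global\ and \verb\largest\ are invented for this
illustration; \verb\max\ is the built-in function.

\bcode\begin{verbatim}
counter = 0                # a global name in this module

def bump_local():
    counter = 100          # assignment binds a new local name
    return counter

def bump_global():
    global counter         # counter now refers to the global name
    counter = counter + 1
    return counter

def largest(values):
    # max is found neither locally nor globally,
    # so the built-in name space supplies it.
    return max(values)

local_result = bump_local()
after_local = counter      # the global is untouched so far
global_result = bump_global()
\end{verbatim}\ecode
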
\section{A first look at classes}

Classes introduce a little bit of new syntax, three new object types,
and some new semantics.

\subsection{Class definition syntax}

The simplest form of class definition looks like this:
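
As one concrete instance of this form (the attribute and method names
are invented for illustration; only the header naming the class,
followed by an indented block of statements, is essential):

\bcode\begin{verbatim}
class ClassName:
    # The body is a sequence of statements,
    # most often function definitions.
    greeting = 'hello'

    def describe(self):
        return self.greeting
\end{verbatim}\ecode
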

Class definitions, like function definitions (\verb\def\ statements)
must be executed before they have any effect.  (You could conceivably
place a class definition in a branch of an \verb\if\ statement, or
inside a function.)

In practice, the statements inside a class definition will usually be
function definitions, but other statements are allowed, and sometimes
useful --- we'll come back to this later.  The function definitions
inside a class normally have a peculiar form of argument list,
dictated by the calling conventions for methods --- again, this is
explained later.

When a class definition is entered, a new name space is created, and
used as the local scope --- thus, all assignments to local variables
go into this new name space.  In particular, function definitions bind
the name of the new function here.

When a class definition is left normally (via the end), a {\em class
object} is created.  This is basically a wrapper around the contents
of the name space created by the class definition; we'll learn more
about class objects in the next section.  The original local scope
(the one in effect just before the class definition was entered) is
reinstated, and the class object is bound here to the class name given
in the class definition header (\verb\ClassName\ in the example).
\subsection{Class objects}

Class objects support two kinds of operations: attribute references
and instantiation.

{\em Attribute references} use the standard syntax used for all
attribute references in Python: \verb\obj.name\.  Valid attribute
names are all the names that were in the class's name space when the
class object was created.  So, if the class definition looked like
this:

\bcode\begin{verbatim}
class MyClass:
    i = 12345
    def f(x):
        return 'hello world'
\end{verbatim}\ecode

then \verb\MyClass.i\ and \verb\MyClass.f\ are valid attribute
references, returning an integer and a function object, respectively.
Class attributes can also be assigned to, so you can change the
value of \verb\MyClass.i\ by assignment.

Class {\em instantiation} uses function notation.  Just pretend that
the class object is a parameterless function that returns a new
instance of the class.  For example, (assuming the above class):

\bcode\begin{verbatim}
x = MyClass()
\end{verbatim}\ecode

creates a new {\em instance} of the class and assigns this object to
the local variable \verb\x\.
\subsection{Instance objects}

Now what can we do with instance objects?  The only operations
understood by instance objects are attribute references.  There are
two kinds of valid attribute names.

The first I'll call {\em data attributes}.  These correspond to
``instance variables'' in Smalltalk, and to ``data members'' in \Cpp{}.
Data attributes need not be declared; like local variables, they
spring into existence when they are first assigned to.  For example,
if \verb\x\ is the instance of \verb\MyClass\ created above, the
following piece of code will print the value 16, without leaving a
trace:

\bcode\begin{verbatim}
x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print x.counter
del x.counter
\end{verbatim}\ecode

The second kind of attribute references understood by instance objects
are {\em methods}.  A method is a function that ``belongs to'' an
object.  (In Python, the term method is not unique to class instances:
other object types can have methods as well, e.g., list objects have
methods called append, insert, remove, sort, and so on.  However,
below, we'll use the term method exclusively to mean methods of class
instance objects, unless explicitly stated otherwise.)

Valid method names of an instance object depend on its class.  By
definition, all attributes of a class that are (user-defined) function
objects define corresponding methods of its instances.  So in our
example, \verb\x.f\ is a valid method reference, since
\verb\MyClass.f\ is a function, but \verb\x.i\ is not, since
\verb\MyClass.i\ is not.  But \verb\x.f\ is not the
same thing as \verb\MyClass.f\ --- it is a {\em method object}, not a
function object.
\subsection{Method objects}

Usually, a method is called immediately, e.g.:

\bcode\begin{verbatim}
x.f()
\end{verbatim}\ecode

In our example, this will return the string \verb\'hello world'\.
However, it is not necessary to call a method right away: \verb\x.f\
is a method object, and can be stored away and called at a later
moment, for example:

\bcode\begin{verbatim}
xf = x.f
while 1:
    print xf()
\end{verbatim}\ecode

will continue to print \verb\hello world\ until the end of time.

What exactly happens when a method is called?  You may have noticed
that \verb\x.f()\ was called without an argument above, even though
the function definition for \verb\f\ specified an argument.  What
happened to the argument?  Surely Python raises an exception when a
function that requires an argument is called without any --- even if
the argument isn't actually used...

Actually, you may have guessed the answer: the special thing about
methods is that the object is passed as the first argument of the
function.  In our example, the call \verb\x.f()\ is exactly equivalent
to \verb\MyClass.f(x)\.  In general, calling a method with a list of
{\em n} arguments is equivalent to calling the corresponding function
with an argument list that is created by inserting the method's object
before the first argument.

If you still don't understand how methods work, a look at the
implementation can perhaps clarify matters.  When an instance
attribute is referenced that isn't a data attribute, its class is
searched.  If the name denotes a valid class attribute that is a
function object, a method object is created by packing (pointers to)
the instance object and the function object just found together in an
abstract object: this is the method object.  When the method object is
called with an argument list, it is unpacked again, a new argument
list is constructed from the instance object and the original argument
list, and the function object is called with this new argument list.
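
The equivalence described above can be checked with a short sketch.
The class repeats the running example, with the first argument named
\verb\self\ by choice; the variable names \verb\xf\, \verb\direct\
and \verb\explicit\ are invented here.

\bcode\begin{verbatim}
class MyClass:
    i = 12345
    def f(self):
        return 'hello world'

x = MyClass()
xf = x.f                  # a method object: instance plus function
direct = xf()             # called later, with no explicit argument
explicit = MyClass.f(x)   # the instance passed as first argument
\end{verbatim}\ecode
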
\section{Random remarks}

[These should perhaps be placed more carefully...]

Data attributes override method attributes with the same name; to
avoid accidental name conflicts, which may cause hard-to-find bugs in
large programs, it is wise to use some kind of convention that
minimizes the chance of conflicts, e.g., capitalize method names,
prefix data attribute names with a small unique string (perhaps just
an underscore), or use verbs for methods and nouns for data attributes.
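
A minimal sketch of such a conflict, with a class and attribute name
invented for the purpose: once an instance has a data attribute called
\verb\area\, the method of the same name can no longer be reached
through that instance.

\bcode\begin{verbatim}
class Shape:
    def area(self):
        return 0

plain = Shape()
shadowed = Shape()
shadowed.area = 42     # data attribute hides the method on this instance
\end{verbatim}\ecode

Other instances of \verb\Shape\ are unaffected, which is part of what
makes such bugs hard to find.
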

Data attributes may be referenced by methods as well as by ordinary
users (``clients'') of an object.  In other words, classes are not
usable to implement pure abstract data types.  In fact, nothing in
Python makes it possible to enforce data hiding --- it is all based
upon convention.  (On the other hand, the Python implementation,
written in C, can completely hide implementation details and control
access to an object if necessary; this can be used by extensions to
Python written in C.)

Clients should use data attributes with care --- clients may mess up
invariants maintained by the methods by stamping on their data
attributes.  Note that clients may add data attributes of their own to
an instance object without affecting the validity of the methods, as
long as name conflicts are avoided --- again, a naming convention can
save a lot of headaches here.

There is no shorthand for referencing data attributes (or other
methods!) from within methods.  I find that this actually increases
the readability of methods: there is no chance of confusing local
variables and instance variables when glancing through a method.

Conventionally, the first argument of methods is often called
\verb\self\.  This is nothing more than a convention: the name
\verb\self\ has absolutely no special meaning to Python.  (Note,
however, that by not following the convention your code may be less
readable by other Python programmers, and it is also conceivable that
a {\em class browser} program be written which relies upon such a
convention.)

Any function object that is a class attribute defines a method for
instances of that class.  It is not necessary that the function
definition is textually enclosed in the class definition: assigning a
function object to a local variable in the class is also ok.  For
example:

\bcode\begin{verbatim}
# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1
    def g(self):
        return 'hello world'
    h = g
\end{verbatim}\ecode

Now \verb\f\, \verb\g\ and \verb\h\ are all attributes of class
\verb\C\ that refer to function objects, and consequently they are all
methods of instances of \verb\C\ --- \verb\h\ being exactly equivalent
to \verb\g\.  Note that this practice usually only serves to confuse
the reader of a program.

Methods may call other methods by using method attributes of the
\verb\self\ argument, e.g.:

\bcode\begin{verbatim}
class Bag:
    def empty(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)
\end{verbatim}\ecode

The instantiation operation (``calling'' a class object) creates an
empty object.  Many classes like to create objects in a known initial
state.  Therefore a class may define a special method named
\verb\__init__\, like this:

\bcode\begin{verbatim}
    def __init__(self):
        self.empty()
\end{verbatim}\ecode

When a class defines an \verb\__init__\ method, class instantiation
automatically invokes \verb\__init__\ for the newly-created class
instance.  So in the \verb\Bag\ example, a new and initialized instance
can be obtained by:

\bcode\begin{verbatim}
x = Bag()
\end{verbatim}\ecode

Of course, the \verb\__init__\ method may have arguments for greater
flexibility.  In that case, arguments given to the class instantiation
operator are passed on to \verb\__init__\.  For example,

\bcode\begin{verbatim}
>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
... 
>>> x = Complex(3.0,-4.5)
>>> x.r, x.i
(3.0, -4.5)
>>> 
\end{verbatim}\ecode
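
Putting these pieces together, here is a sketch of a \verb\Bag\ whose
\verb\__init__\ method takes an optional argument.  The
\verb\items\ parameter is an addition made for this illustration, not
part of the class as described above.

\bcode\begin{verbatim}
class Bag:
    def __init__(self, items=()):
        # Start from a known state, then add any initial items.
        self.data = []
        for item in items:
            self.add(item)
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        # One method calling another through self.
        self.add(x)
        self.add(x)

empty_bag = Bag()
bag = Bag(['spam'])
bag.addtwice('eggs')
\end{verbatim}\ecode
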

Methods may reference global names in the same way as ordinary
functions.  The global scope associated with a method is the module
containing the class definition.  (The class itself is never used as a
global scope!)  While one rarely encounters a good reason for using
global data in a method, there are many legitimate uses of the global
scope: for one thing, functions and modules imported into the global
scope can be used by methods, as well as functions and classes defined
in it.  Usually, the class containing the method is itself defined in
this global scope, and in the next section we'll find some good
reasons why a method would want to reference its own class!
\section{Inheritance}

Of course, a language feature would not be worthy of the name ``class''
without supporting inheritance.  The syntax for a derived class
definition looks as follows:

\bcode\begin{verbatim}
class DerivedClassName(BaseClassName):
    <statement-1>
    .
    .
    .
    <statement-N>
\end{verbatim}\ecode

The name \verb\BaseClassName\ must be defined in a scope containing
the derived class definition.  Instead of a base class name, an
expression is also allowed.  This is useful when the base class is
defined in another module, e.g.,

\bcode\begin{verbatim}
class DerivedClassName(modname.BaseClassName):
\end{verbatim}\ecode

Execution of a derived class definition proceeds the same as for a
base class.  When the class object is constructed, the base class is
remembered.  This is used for resolving attribute references: if a
requested attribute is not found in the class, it is searched in the
base class.  This rule is applied recursively if the base class itself
is derived from some other class.

There's nothing special about instantiation of derived classes:
\verb\DerivedClassName()\ creates a new instance of the class.  Method
references are resolved as follows: the corresponding class attribute
is searched, descending down the chain of base classes if necessary,
and the method reference is valid if this yields a function object.

Derived classes may override methods of their base classes.  Because
methods have no special privileges when calling other methods of the
same object, a method of a base class that calls another method
defined in the same base class may in fact end up calling a method of
a derived class that overrides it.  (For \Cpp{} programmers: all methods
in Python are ``virtual functions''.)

An overriding method in a derived class may in fact want to extend
rather than simply replace the base class method of the same name.
There is a simple way to call the base class method directly: just
call \verb\BaseClassName.methodname(self, arguments)\.  This is
occasionally useful to clients as well.  (Note that this only works if
the base class is defined or imported directly in the global scope.)
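
Both overriding and extending can be seen in a small sketch.  The
classes \verb\Base\ and \verb\Derived\ and their methods are invented
for this illustration.

\bcode\begin{verbatim}
class Base:
    def greet(self):
        # Calls name() through self, so an override is honoured.
        return 'hello from ' + self.name()
    def name(self):
        return 'Base'

class Derived(Base):
    def name(self):
        # Overrides Base.name.
        return 'Derived'
    def greet(self):
        # Extends Base.greet by calling it directly.
        return Base.greet(self) + '!'

base_greeting = Base().greet()
derived_greeting = Derived().greet()
\end{verbatim}\ecode

Note that \verb\Base.greet\ ends up calling \verb\Derived.name\ when
it is invoked for an instance of \verb\Derived\.
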
\subsection{Multiple inheritance}

Python supports a limited form of multiple inheritance as well.  A
class definition with multiple base classes looks as follows:

\bcode\begin{verbatim}
class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>
\end{verbatim}\ecode

The only rule necessary to explain the semantics is the resolution
rule used for class attribute references.  This is depth-first,
left-to-right.  Thus, if an attribute is not found in
\verb\DerivedClassName\, it is searched in \verb\Base1\, then
(recursively) in the base classes of \verb\Base1\, and only if it is
not found there, it is searched in \verb\Base2\, and so on.

(To some people breadth first---searching \verb\Base2\ and
\verb\Base3\ before the base classes of \verb\Base1\---looks more
natural.  However, this would require you to know whether a particular
attribute of \verb\Base1\ is actually defined in \verb\Base1\ or in
one of its base classes before you can figure out the consequences of
a name conflict with an attribute of \verb\Base2\.  The depth-first
rule makes no difference between direct and inherited attributes of
\verb\Base1\.)

It is clear that indiscriminate use of multiple inheritance is a
maintenance nightmare, given the reliance in Python on conventions to
avoid accidental name conflicts.  A well-known problem with multiple
inheritance is a class derived from two classes that happen to have a
common base class.  While it is easy enough to figure out what happens
in this case (the instance will have a single copy of ``instance
variables'' or data attributes used by the common base class), it is
not clear that these semantics are in any way useful.
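
The resolution rule can be tried out with a sketch like the following,
in which all class and method names are invented.  The hierarchy
deliberately has no common base class, because present-day Python
resolves that case differently from the depth-first rule described
here.

\bcode\begin{verbatim}
class Root1:
    def who(self):
        return 'Root1'

class Base1(Root1):
    pass

class Base2:
    def who(self):
        return 'Base2'
    def only_here(self):
        return 'Base2 only'

class Derived(Base1, Base2):
    pass

d = Derived()
found = d.who()            # Base1, then its base Root1, before Base2
fallback = d.only_here()   # not in Base1 or Root1, so Base2 is used
\end{verbatim}\ecode
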
\section{Odds and ends}

Sometimes it is useful to have a data type similar to the Pascal
``record'' or C ``struct'', bundling together a couple of named data
items.  An empty class definition will do nicely, e.g.:

\bcode\begin{verbatim}
class Employee:
    pass

john = Employee() # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
john.salary = 1000
\end{verbatim}\ecode

A piece of Python code that expects a particular abstract data type
can often be passed a class that emulates the methods of that data
type instead.  For instance, if you have a function that formats some
data from a file object, you can define a class with methods
\verb\read()\ and \verb\readline()\ that gets the data from a string
buffer instead, and pass it as an argument.  (Unfortunately, this
technique has its limitations: a class can't define operations that
are accessed by special syntax such as sequence subscripting or
arithmetic operators, and assigning such a ``pseudo-file'' to
\verb\sys.stdin\ will not cause the interpreter to read further input
from it.)
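
A sketch of such a pseudo-file follows.  The class
\verb\StringReader\ and the function \verb\first_line_upper\ are
hypothetical names chosen for this illustration, and only the two
methods mentioned above are emulated.

\bcode\begin{verbatim}
class StringReader:
    # Serves data from a string buffer instead of a file.
    def __init__(self, text):
        self.buffer = text
        self.pos = 0
    def read(self):
        # Return everything that has not been read yet.
        data = self.buffer[self.pos:]
        self.pos = len(self.buffer)
        return data
    def readline(self):
        # Return text up to and including the next newline.
        end = self.buffer.find('\n', self.pos)
        if end < 0:
            end = len(self.buffer)
        else:
            end = end + 1
        line = self.buffer[self.pos:end]
        self.pos = end
        return line

def first_line_upper(f):
    # Works for anything that has a readline() method.
    return f.readline().strip().upper()

reader = StringReader('alpha\nbeta\n')
heading = first_line_upper(reader)
\end{verbatim}\ecode
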

Instance method objects have attributes, too: \verb\m.im_self\ is the
object of which the method is an instance, and \verb\m.im_func\ is the
function object corresponding to the method.

\chapter{Recent Additions}

Python is an evolving language.  Since this tutorial was last
thoroughly revised, several new features have been added to the
language.  While ideally I should revise the tutorial to incorporate
them in the mainline of the text, lack of time currently requires me
to take a more modest approach.  In this chapter I will briefly list the
most important improvements to the language and how you can use them.
\section{The Last Printed Expression}

In interactive mode, the last printed expression is assigned to the
variable \code{_}.  This means that when you are using Python as a
desk calculator, it is somewhat easier to continue calculations, for
example:

\bcode\begin{verbatim}
>>> tax = 17.5 / 100
>>> price = 3.50
>>> price * tax
0.6125
>>> price + _
4.1125
>>> round(_, 2)
4.11
>>> 
\end{verbatim}\ecode

For reasons too embarrassing to explain, this variable is implemented
as a built-in (living in the module \code{__builtin__}), so it should
be treated as read-only by the user.  I.e. don't explicitly assign a
value to it --- you would create an independent local variable with
the same name masking the built-in variable with its magic behavior.
\section{String Literals}

\subsection{Double Quotes}

Python can now also use double quotes to surround string literals,
e.g. \verb\"this doesn't hurt a bit"\.  There is no semantic
difference between strings surrounded by single or double quotes.

\subsection{Continuation Of String Literals}

String literals can span multiple lines by escaping newlines with
backslashes, e.g.

\bcode\begin{verbatim}
hello = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
    Note that whitespace at the beginning of the line is\
 significant.\n"
print hello
\end{verbatim}\ecode

which would print the following:

\bcode\begin{verbatim}
This is a rather long string containing
several lines of text just as you would do in C.
    Note that whitespace at the beginning of the line is significant.
\end{verbatim}\ecode
\subsection{Triple-quoted strings}

In some cases, when you need to include really long strings (e.g.
containing several paragraphs of informational text), it is annoying
that you have to terminate each line with \verb@\n\@, especially if
you would like to reformat the text occasionally with a powerful text
editor like Emacs.  For such situations, ``triple-quoted'' strings can
be used, e.g.

\bcode\begin{verbatim}
hello = """

    This string is bounded by triple double quotes (3 times ").
Unescaped newlines in the string are retained, though \
it is still possible\nto use all normal escape sequences.

    Whitespace at the beginning of a line is
significant.  If you need to include three opening quotes
you have to escape at least one of them, e.g. \""".

    This string ends in a newline.
"""
\end{verbatim}\ecode

Triple-quoted strings can be surrounded by three single quotes as
well, again without semantic difference.
\subsection{String Literal Juxtaposition}

One final twist: you can juxtapose multiple string literals.  Two or
more adjacent string literals (but not arbitrary expressions!)
separated only by whitespace will be concatenated (without intervening
whitespace) into a single string object at compile time.  This makes
it possible to continue a long string on the next line without
sacrificing indentation or performance, unlike the use of the string
concatenation operator \verb\+\ or the continuation of the literal
itself on the next line (since leading whitespace is significant
inside all types of string literals).  Note that this feature, like
all string features except triple-quoted strings, is borrowed from
ANSI C.
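
For instance, in the following sketch (the variable name and the text
are invented) three adjacent literals, two with double quotes and one
with single quotes, form a single string; the surrounding parentheses
only serve to let the expression span several lines.

\bcode\begin{verbatim}
message = ("This is a rather long string "
           "continued on the next line "
           'without a backslash.')
\end{verbatim}\ecode
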
\section{The Formatting Operator}

\subsection{Basic Usage}

The chapter on output formatting is really out of date: there is now
an almost complete interface to C-style printf formats.  This is done
by overloading the modulo operator (\verb\%\) for a left operand
which is a string, e.g.

\bcode\begin{verbatim}
>>> import math
>>> print 'The value of PI is approximately %5.3f.' % math.pi
The value of PI is approximately 3.142.
>>> 
\end{verbatim}\ecode

If there is more than one format in the string you pass a tuple as
right operand, e.g.

\bcode\begin{verbatim}
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> for name, phone in table.items():
...     print '%-10s ==> %10d' % (name, phone)
... 
Jack       ==>       4098
Dcab       ==>    8637678
Sjoerd     ==>       4127
>>> 
\end{verbatim}\ecode

Most formats work exactly as in C and require that you pass the proper
type (however, if you don't you get an exception, not a core dump).
The \verb\%s\ format is more relaxed: if the corresponding argument is
not a string object, it is converted to string using the \verb\str()\
built-in function.  Using \verb\*\ to pass the width or precision in
as a separate (integer) argument is supported.  The C formats
\verb\%n\ and \verb\%p\ are not supported.
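
The points above can be collected in a short sketch that builds the
strings without printing them; the variable names are invented for the
illustration.

\bcode\begin{verbatim}
import math

# A single format takes a single value as right operand.
pi_text = 'The value of PI is approximately %5.3f.' % math.pi

# More than one format takes a tuple.
row = '%-10s ==> %10d' % ('Jack', 4098)

# %s converts arguments that are not strings.
relaxed = '%s and %s' % (42, [1, 2])

# * takes the field width from the argument tuple.
starred = '%*d' % (6, 42)
\end{verbatim}\ecode
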
\subsection{Referencing Variables By Name}

If you have a really long format string that you don't want to split
up, it would be nice if you could reference the variables to be
formatted by name instead of by position.  This can be done by using
an extension of C formats using the form \verb\%(name)format\, e.g.

\bcode\begin{verbatim}
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
Jack: 4098; Sjoerd: 4127; Dcab: 8637678
>>> 
\end{verbatim}\ecode

This is particularly useful in combination with the new built-in
\verb\vars()\ function, which returns a dictionary containing all
local variables.
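
A sketch of that combination, using a made-up function
\verb\describe\: inside the function, \verb\vars()\ supplies the
arguments under their own names.

\bcode\begin{verbatim}
def describe(name, phone):
    # vars() returns the local variables as a dictionary,
    # here {'name': ..., 'phone': ...}.
    return '%(name)s can be reached at %(phone)d' % vars()

text = describe('Jack', 4098)
\end{verbatim}\ecode
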
\section{Optional Function Arguments}

It is now possible to define functions with a variable number of
arguments.  There are two forms, which can be combined.

\subsection{Default Argument Values}

The most useful form is to specify a default value for one or more
arguments.  This creates a function that can be called with fewer
arguments than it is defined with, e.g.

\bcode\begin{verbatim}
def ask_ok(prompt, retries = 4, complaint = 'Yes or no, please!'):
    while 1:
        ok = raw_input(prompt)
        if ok in ('y', 'ye', 'yes'): return 1
        if ok in ('n', 'no', 'nop', 'nope'): return 0
        retries = retries - 1
        if retries < 0: raise IOError, 'refusenik user'
        print complaint
\end{verbatim}\ecode

This function can be called either like this:
\verb\ask_ok('Do you really want to quit?')\ or like this:
\verb\ask_ok('OK to overwrite the file?', 2)\.

The default values are evaluated at the point of function definition
in the {\em defining} scope, so that e.g.

\bcode\begin{verbatim}
i = 5
def f(arg = i): print arg
i = 6
f()
\end{verbatim}\ecode

will print \verb\5\.
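
The same behaviour can be checked without any printing.  In this
sketch the names \verb\base\ and \verb\scaled\ are invented; the
default for \verb\factor\ is fixed when the \verb\def\ statement is
executed, so the later assignment to \verb\base\ has no effect on it.

\bcode\begin{verbatim}
base = 5

def scaled(value, factor=base):
    return value * factor

base = 6                     # too late to change the default

with_default = scaled(2)     # uses factor = 5
with_explicit = scaled(2, 3)
\end{verbatim}\ecode
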
\subsection{Arbitrary Argument Lists}

It is also possible to specify that a function can be called with an
arbitrary number of arguments.  These arguments will be wrapped up in
a tuple.  Before the variable number of arguments, zero or more normal
arguments may occur, e.g.

\bcode\begin{verbatim}
def fprintf(file, format, *args):
    file.write(format % args)
\end{verbatim}\ecode

This feature may be combined with the previous, e.g.

\bcode\begin{verbatim}
def but_is_it_useful(required, optional = None, *remains):
    print "I don't know"
\end{verbatim}\ecode
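
To see what ends up where, here is a sketch with two invented
functions: \verb\total\ adds up however many arguments it receives,
and \verb\describe\ simply returns its three kinds of argument.

\bcode\begin{verbatim}
def total(first, *rest):
    # rest is a tuple holding the extra arguments, possibly empty.
    result = first
    for value in rest:
        result = result + value
    return result

def describe(required, optional=None, *remains):
    return (required, optional, remains)
\end{verbatim}\ecode
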
\section{Lambda And Functional Programming Tools}

\subsection{Lambda Forms}

By popular demand, a few features commonly found in functional
programming languages and Lisp have been added to Python.  With the
\verb\lambda\ keyword, small anonymous functions can be created.
Here's a function that returns the sum of its two arguments:
\verb\lambda a, b: a+b\.  Lambda forms can be used wherever function
objects are required.  They are syntactically restricted to a single
expression.  Semantically, they are just syntactic sugar for a normal
function definition.  Like nested function definitions, lambda forms
cannot reference variables from the containing scope, but this can be
overcome through the judicious use of default argument values, e.g.

\bcode\begin{verbatim}
def make_incrementor(n):
    return lambda x, incr=n: x+incr
\end{verbatim}\ecode
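
A brief usage sketch (the names \verb\add_five\ and \verb\result\ are
invented): each call of \verb\make_incrementor\ returns a new function
that carries its own value of \verb\incr\ as a default argument.

\bcode\begin{verbatim}
def make_incrementor(n):
    return lambda x, incr=n: x + incr

add_five = make_incrementor(5)
result = add_five(10)
\end{verbatim}\ecode
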
2853 \subsection{Map, Reduce and Filter
}
2855 Three new built-in functions on sequences are good candidate to pass
2858 \subsubsection{Map.
}
2860 \verb\map(function, sequence)\ calls
\verb\function(item)\ for each of
2861 the sequence's items and returns a list of the return values. For
2862 example, to compute some cubes:
>>> map(lambda x: x*x*x, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
2870 More than one sequence may be passed; the function must then have as
2871 many arguments as there are sequences and is called with the
2872 corresponding item from each sequence (or
\verb\None\ if some sequence
2873 is shorter than another). If
\verb\None\ is passed for the function,
2874 a function returning its argument(s) is substituted.
2876 Combining these two special cases, we see that
2877 \verb\map(None, list1, list2)\ is a convenient way of turning a pair
2878 of lists into a list of pairs. For example:
>>> map(None, seq, map(lambda x: x*x, seq))
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
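For readers trying these examples on a current interpreter, the following sketch restates them in Python 3 (an assumption; it is not the release described here). In Python 3 \code{map()} returns an iterator, stops at the shortest sequence, and no longer accepts \code{None} as the function, so \code{itertools.zip_longest} is used as my closest substitute for \code{map(None, list1, list2)}.

```python
from itertools import zip_longest

seq = range(8)

# One sequence: list() collects the results into a list.
cubes = list(map(lambda x: x*x*x, range(1, 11)))

# Two sequences: the function receives one item from each.
sums = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20, 30]))

# Pairing up two sequences, padding the shorter one with None.
pairs = list(zip_longest(seq, map(lambda x: x*x, seq)))
padded = list(zip_longest([1, 2, 3], ['a']))

print(cubes)
print(sums)
print(pairs)
print(padded)
```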
2887 \subsubsection{Filter.
}
2889 \verb\filter(function, sequence)\ returns a sequence (of the same
2890 type, if possible) consisting of those items from the sequence for
2891 which
\verb\function(item)\ is true. For example, to compute some primes:
>>> filter(lambda x: x%2 != 0 and x%3 != 0, range(2, 25))
[5, 7, 11, 13, 17, 19, 23]
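The same selection can be sketched in present-day Python 3 (an assumption). There \code{filter()} returns an iterator rather than a sequence of the original type, so the result is converted explicitly. The helper name \code{not\_multiple\_of\_2\_or\_3} is mine.

```python
def not_multiple_of_2_or_3(x):
    # True for numbers divisible by neither 2 nor 3
    return x % 2 != 0 and x % 3 != 0

numbers = list(filter(not_multiple_of_2_or_3, range(2, 25)))

# Filtering a string: the surviving characters are joined back together.
letters = "".join(filter(str.isalpha, "a1b2c3"))

print(numbers)
print(letters)
```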
2900 \subsubsection{Reduce.
}
2902 \verb\reduce(function, sequence)\ returns a single value constructed
2903 by calling the (binary) function on the first two items of the
2904 sequence, then on the result and the next item, and so on. For
example, to compute the sum of the numbers 1 through 10:
>>> reduce(lambda x, y: x+y, range(1, 11))
2913 If there's only one item in the sequence, its value is returned; if
2914 the sequence is empty, an exception is raised.
2916 A third argument can be passed to indicate the starting value. In this
2917 case the starting value is returned for an empty sequence, and the
2918 function is first applied to the starting value and the first sequence
2919 item, then to the result and the next item, and so on. For example,
...     return reduce(lambda x, y: x+y, seq, 0)
>>> sum(range(1, 11))
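The order in which \code{reduce()} combines items, and the effect of the starting value, can be made visible with a small sketch. It assumes present-day Python 3, where \code{reduce} has moved to the \code{functools} module. The helper names \code{total} and \code{left\_fold\_trace} are illustrative.

```python
from functools import reduce

def total(seq):
    # The starting value 0 is returned unchanged for an empty sequence.
    return reduce(lambda x, y: x + y, seq, 0)

def left_fold_trace(seq):
    # Builds a string that records how the items were grouped.
    return reduce(lambda x, y: "(%s-%s)" % (x, y), seq)

print(total(range(1, 11)))
print(total([]))
print(left_fold_trace([1, 2, 3, 4]))

try:
    reduce(lambda x, y: x + y, [])     # empty sequence, no starting value
except TypeError as err:
    print("error:", err)
```

The trace shows that the function is applied from the left: first to the first two items, then to that result and the third item, and so on.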
2932 \section{Continuation Lines Without Backslashes
}
2934 While the general mechanism for continuation of a source line on the
2935 next physical line remains to place a backslash on the end of the
2936 line, expressions inside matched parentheses (or square brackets, or
2937 curly braces) can now also be continued without using a backslash.
2938 This is particularly useful for calls to functions with many
2939 arguments, and for initializations of large tables.
month_names = ['Januari', 'Februari', 'Maart',
               'April',   'Mei',      'Juni',
               'Juli',    'Augustus', 'September',
               'Oktober', 'November', 'December']
2953 CopyInternalHyperLinks(self.context.hyperlinks,
2954 copy.context.hyperlinks,
2958 \section{Regular Expressions
}
2960 While C's printf-style output formats, transformed into Python, are
2961 adequate for most output formatting jobs, C's scanf-style input
2962 formats are not very powerful. Instead of scanf-style input, Python
2963 offers Emacs-style regular expressions as a powerful input and
2964 scanning mechanism. Read the corresponding section in the Library
2965 Reference for a full description.
2967 \section{Generalized Dictionaries
}
2969 The keys of dictionaries are no longer restricted to strings --- they
2970 can be any immutable basic type including strings, numbers, tuples, or
(certain) class instances. (Lists and dictionaries are not acceptable
as dictionary keys, in order to avoid problems when the object used as
a key is modified.)
2975 Dictionaries have two new methods:
\verb\d.values()\ returns a list of
2976 the dictionary's values, and
\verb\d.items()\ returns a list of the
2977 dictionary's (key, value) pairs. Like
\verb\d.keys()\, these
2978 operations are slow for large dictionaries. Examples:
>>> d = {100: 'honderd', 1000: 'duizend', 10: 'tien'}
>>> d.values()
['honderd', 'tien', 'duizend']
>>> d.items()
[(100, 'honderd'), (10, 'tien'), (1000, 'duizend')]
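A short sketch of non-string keys, in present-day Python 3 syntax (an assumption). The dictionary \code{grid} and its contents are made up for illustration. In Python 3, \code{values()} and \code{items()} return views rather than lists, so they are sorted into lists here for display.

```python
d = {100: 'honderd', 1000: 'duizend', 10: 'tien'}

# Tuples of immutable items are acceptable keys.
grid = {(0, 0): 'origin', (1, 2): 'knight move'}

# Lists are mutable and are rejected as keys.
try:
    d[[1, 2]] = 'list key'
except TypeError as err:
    rejected = str(err)

print(sorted(d.values()))
print(sorted(d.items()))
print(grid[(1, 2)])
print(rejected)
```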
2991 \section{Miscellaneous New Built-in Functions
}
2993 The function
\verb\vars()\ returns a dictionary containing the current
2994 local variables. With a module argument, it returns that module's
2995 global variables. The old function
\verb\dir(x)\ returns
2996 \verb\vars(x).keys()\.
The function \verb\round(x)\ returns a floating point number rounded
to the nearest integer (but still expressed as a floating point
number). E.g. \verb\round(3.4) == 3.0\ and \verb\round(3.5) == 4.0\.
With a second argument it rounds to the specified number of digits,
e.g. \verb\round(math.pi, 4) == 3.1416\ or even
\verb\round(123.4, -2) == 100.0\.
3005 The function
\verb\hash(x)\ returns a hash value for an object.
3006 All object types acceptable as dictionary keys have a hash value (and
3007 it is this hash value that the dictionary implementation uses).
The function \verb\id(x)\ returns a unique identifier for an object.
3010 For two objects x and y,
\verb\id(x) == id(y)\ if and only if
3011 \verb\x is y\. (In fact the object's address is used.)
3013 The function
\verb\hasattr(x, name)\ returns whether an object has an
3014 attribute with the given name (a string value). The function
3015 \verb\getattr(x, name)\ returns the object's attribute with the given
3016 name. The function
\verb\setattr(x, name, value)\ assigns a value to
3017 an object's attribute with the given name. These three functions are
3018 useful if the attribute names are not known beforehand. Note that
3019 \verb\getattr(x, 'spam')\ is equivalent to
\verb\x.spam\, and
3020 \verb\setattr(x, 'spam', y)\ is equivalent to
\verb\x.spam = y\. By
3021 definition,
\verb\hasattr(x, name)\ returns true if and only if
3022 \verb\getattr(x, name)\ returns without raising an exception.
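These three functions are most useful when attribute names arrive as data. The following sketch, in present-day Python 3 syntax (an assumption), uses a hypothetical class \code{Record} and a hypothetical helper \code{lookup}.

```python
class Record:
    pass

r = Record()
for name, value in [('spam', 1), ('eggs', 2)]:
    setattr(r, name, value)        # same effect as r.spam = 1 and r.eggs = 2

def lookup(obj, name, default=None):
    # hasattr(obj, name) is true exactly when getattr(obj, name) succeeds
    if hasattr(obj, name):
        return getattr(obj, name)
    return default

print(r.spam, lookup(r, 'eggs'), lookup(r, 'ham', 'missing'))
```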
3024 \section{Else Clause For Try Statement
}
The \verb\try...except\ statement now has an optional \verb\else\
clause, which must follow all \verb\except\ clauses. It is useful for
code that must be executed if the \verb\try\ clause does not
raise an exception. For example:
for arg in sys.argv:
    try:
        f = open(arg, 'r')
    except IOError:
        print 'cannot open', arg
    else:
        print arg, 'has', len(f.readlines()), 'lines'
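To show why the \code{else} clause is preferable to adding more code inside the \code{try} clause, here is a self-contained sketch in present-day Python 3 syntax (an assumption). The function \code{count\_lines} and the temporary file are mine.

```python
import os
import tempfile

def count_lines(path):
    try:
        f = open(path, 'r')
    except IOError:
        return None                # the file could not be opened
    else:
        # Runs only when open() succeeded; an error raised here is
        # not mistaken for a failure to open the file.
        with f:
            return len(f.readlines())

# Create a small file to count, then remove it again.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as out:
    out.write('one\ntwo\nthree\n')

found = count_lines(path)
os.remove(path)
missing = count_lines(path)        # the file is gone now

print(found, missing)
```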
3043 \section{New Class Features in Release
1.1}
3045 Some changes have been made to classes: the operator overloading
3046 mechanism is more flexible, providing more support for non-numeric use
3047 of operators (including calling an object as if it were a function),
3048 and it is possible to trap attribute accesses.
3050 \subsection{New Operator Overloading
}
3052 It is no longer necessary to coerce both sides of an operator to the
3053 same class or type. A class may still provide a
\code{__coerce__
}
3054 method, but this method may return objects of different types or
3055 classes if it feels like it. If no
\code{__coerce__
} is defined, any
3056 argument type or class is acceptable.
3058 In order to make it possible to implement binary operators where the
3059 right-hand side is a class instance but the left-hand side is not,
3060 without using coercions, right-hand versions of all binary operators
3061 may be defined. These have an `r' prepended to their name,
3062 e.g.
\code{__radd__
}.
For example, here's a very simple class for representing times. Times
are initialized from a number of seconds (like time.time()). Times
are printed like this: \code{Wed Mar 15 12:28:48 1995}. Subtracting
two Times gives their difference in seconds. Adding or subtracting a
Time and a number gives a new Time. You can't add two Times, nor can
you subtract a Time from a number.
import time

class Time:
    def __init__(self, seconds):
        self.seconds = seconds
    def __repr__(self):
        return time.ctime(self.seconds)
    def __add__(self, x):
        return Time(self.seconds + x)
    __radd__ = __add__            # support for x+t
    def __sub__(self, x):
        if hasattr(x, 'seconds'): # test if x could be a Time
            return self.seconds - x.seconds
        else:
            return Time(self.seconds - x)

now = Time(time.time())
tomorrow = 24*3600 + now
yesterday = now - 24*3600
print tomorrow - yesterday        # prints 172800
3094 \subsection{Trapping Attribute Access
}
3096 You can define three new ``magic'' methods in a class now:
3097 \code{__getattr__(self, name)
},
\code{__setattr__(self, name, value)
}
3098 and
\code{__delattr__(self, name)
}.
3100 The
\code{__getattr__
} method is called when an attribute access fails,
3101 i.e. when an attribute access would otherwise raise AttributeError ---
3102 this is
{\em after
} the instance's dictionary and its class hierarchy
have been searched for the named attribute. Note that if this method
attempts to access any undefined instance attribute it will be called
recursively.
The \code{__setattr__} and \code{__delattr__} methods are called when
assignment to or deletion of an attribute is attempted.
They are called {\em instead} of the normal action (which is to insert
or delete the attribute in the instance dictionary). If either of
these methods must set or delete any attribute, it can only do so by
using the instance dictionary directly (\code{self.__dict__}), or else
it would be called recursively.
3115 For example, here's a near-universal ``Wrapper'' class that passes all
3116 its attribute accesses to another object. Note how the
3117 \code{__init__
} method inserts the wrapped object in
3118 \code{self.__dict__
} in order to avoid endless recursion
3119 (
\code{__setattr__
} would call
\code{__getattr__
} which would call
3120 itself recursively).
import sys

class Wrapper:
    def __init__(self, wrapped):
        self.__dict__['wrapped'] = wrapped
    def __getattr__(self, name):
        return getattr(self.wrapped, name)
    def __setattr__(self, name, value):
        setattr(self.wrapped, name, value)
    def __delattr__(self, name):
        delattr(self.wrapped, name)

f = Wrapper(sys.stdout)
f.write('hello world\n')          # prints 'hello world'
A simpler example of \code{__getattr__} is an attribute that is
computed each time (or the first time) it is accessed. For instance:
from math import pi

class Circle:
    def __init__(self, radius):
        self.radius = radius
    def __getattr__(self, name):
        if name == 'circumference':
            return 2 * pi * self.radius
        if name == 'diameter':
            return 2 * self.radius
        if name == 'area':
            return pi * pow(self.radius, 2)
        raise AttributeError, name
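The difference between computing an attribute each time and computing it only the first time can be sketched as follows, in present-day Python 3 syntax (an assumption). The class name \code{Circle} and the choice of which attribute to cache are illustrative, not taken from the original.

```python
from math import pi

class Circle:
    def __init__(self, radius):
        self.radius = radius
    def __getattr__(self, name):
        # Called only after the normal lookup has failed.
        if name == 'diameter':
            return 2 * self.radius           # recomputed on every access
        if name == 'area':
            value = pi * self.radius ** 2
            # Store the result in the instance dictionary; later
            # accesses find it there and never reach __getattr__.
            self.__dict__['area'] = value
            return value
        raise AttributeError(name)

c = Circle(2)
cached_before = 'area' in c.__dict__
first_area = c.area
cached_after = 'area' in c.__dict__
print(c.diameter, first_area, cached_before, cached_after)
```

Caching trades freshness for speed: after the first access, a change to \code{radius} is no longer reflected in \code{area}.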
3157 \subsection{Calling a Class Instance
}
3159 If a class defines a method
\code{__call__
} it is possible to call its
3160 instances as if they were functions. For example:
class PresetSomeArguments:
    def __init__(self, func, *args):
        self.func, self.args = func, args
    def __call__(self, *args):
        return apply(self.func, self.args + args)

f = PresetSomeArguments(pow, 2)    # f(i) computes powers of 2
for i in range(10): print f(i),    # prints 1 2 4 8 16 32 64 128 256 512
print                              # append newline
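On a present-day Python 3 interpreter (an assumption; \code{apply()} is not available there) the same class can be sketched with the \code{func(*args)} call syntax instead. The list \code{powers} is added only to make the result easy to inspect.

```python
class PresetSomeArguments:
    def __init__(self, func, *args):
        self.func, self.args = func, args
    def __call__(self, *args):
        # func(*seq) plays the role of apply(func, seq)
        return self.func(*(self.args + args))

f = PresetSomeArguments(pow, 2)      # f(i) computes powers of 2
powers = [f(i) for i in range(10)]
print(powers)
```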
3175 \chapter{New in Release
1.2}
3178 This chapter describes even more recent additions to the Python
3179 language and library.
3182 \section{New Class Features
}
3184 The semantics of
\code{__coerce__
} have been changed to be more
3185 reasonable. As an example, the new standard module
\code{Complex
}
3186 implements fairly complete complex numbers using this. Additional
3187 examples of classes with and without
\code{__coerce__
} methods can be
3188 found in the
\code{Demo/classes
} subdirectory, modules
\code{Rat
} and
3191 If a class defines no
\code{__coerce__
} method, this is equivalent to
3192 the following definition:
3195 def __coerce__(self, other): return self, other
3198 If
\code{__coerce__
} coerces itself to an object of a different type,
3199 the operation is carried out using that type --- in release
1.1, this
3200 would cause an error.
3202 Comparisons involving class instances now invoke
\code{__coerce__
}
3203 exactly as if
\code{cmp(x, y)
} were a binary operator like
\code{+
}
3204 (except if
\code{x
} and
\code{y
} are the same object).
3206 \section{Unix Signal Handling
}
On Unix, Python now supports signal handling. The module
\code{signal} exports functions \code{signal}, \code{pause} and
\code{alarm}, which act similarly to their Unix counterparts. The
module also exports the conventional names for the various signal
classes (also usable with \code{os.kill()}) and \code{SIG_IGN} and
\code{SIG_DFL}. See the section on \code{signal} in the Library
Reference Manual for more information.
3216 \section{Exceptions Can Be Classes
}
3218 User-defined exceptions are no longer limited to being string objects
3219 --- they can be identified by classes as well. Using this mechanism it
3220 is possible to create extensible hierarchies of exceptions.
3222 There are two new valid (semantic) forms for the raise statement:
3225 raise Class, instance
3230 In the first form,
\code{instance
} must be an instance of
\code{Class
}
3231 or of a class derived from it. The second form is a shorthand for
3234 raise instance.__class__, instance
3237 An except clause may list classes as well as string objects. A class
3238 in an except clause is compatible with an exception if it is the same
3239 class or a base class thereof (but not the other way around --- an
3240 except clause listing a derived class is not compatible with a base
3241 class). For example, the following code will print B, C, D in that
3263 Note that if the except clauses were reversed (with ``
\code{except B
}''
3264 first), it would have printed B, B, B --- the first matching except
3265 clause is triggered.
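The effect of clause order can be sketched as follows. This uses present-day Python 3 syntax, where exception classes must derive from \code{BaseException}, so the classes below inherit from \code{Exception}; that base class and the two helper functions are my assumptions, not part of the release described here.

```python
class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass

def derived_first(cls):
    try:
        raise cls()
    except D:
        return 'D'
    except C:
        return 'C'
    except B:
        return 'B'

def base_first(cls):
    try:
        raise cls()
    except B:
        return 'B'      # matches B and every class derived from it
    except C:
        return 'C'      # never reached
    except D:
        return 'D'      # never reached

print([derived_first(c) for c in (B, C, D)])
print([base_first(c) for c in (B, C, D)])
```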
When an error message is printed for an unhandled exception which is a
class, the class name is printed, then a colon and a space, and
finally the instance converted to a string using the built-in function
\code{str()}.
3272 In this release, the built-in exceptions are still strings.
3275 \section{Object Persistency and Object Copying
}
3277 Two new modules,
\code{pickle
} and
\code{shelve
}, support storage and
3278 retrieval of (almost) arbitrary Python objects on disk, using the
3279 \code{dbm
} package. A third module,
\code{copy
}, provides flexible
3280 object copying operations. More information on these modules is
3281 provided in the Library Reference Manual.
3283 \subsection{Persistent Objects
}
3285 The module
\code{pickle
} provides a general framework for objects to
3286 disassemble themselves into a stream of bytes and to reassemble such a
3287 stream back into an object. It copes with reference sharing,
3288 recursive objects and instances of user-defined classes, but not
3289 (directly) with objects that have ``magical'' links into the operating
3290 system such as open files, sockets or windows.
3292 The
\code{pickle
} module defines a simple protocol whereby
3293 user-defined classes can control how they are disassembled and
3294 assembled. The method
\code{__getinitargs__()
}, if defined, returns
3295 the argument list for the constructor to be used at assembly time (by
3296 default the constructor is called without arguments). The methods
3297 \code{__getstate__()
} and
\code{__setstate__()
} are used to pass
3298 additional state from disassembly to assembly; by default the
3299 instance's
\code{__dict__
} is passed and restored.
Note that \code{pickle} does not open or close any files, so it can be
used equally well for moving objects around on a network or storing them
in a database. For ease of debugging, and the inevitable occasional
manual patch-up, the constructed byte streams consist of printable
\ASCII{} characters only (though they are not designed to be pretty).
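A minimal round trip, sketched for a present-day Python 3 interpreter (an assumption), shows reference sharing and a recursive object surviving disassembly and reassembly without any file being involved. Protocol 0 is requested explicitly because it is the text format; current versions default to a binary one. The variable names are illustrative.

```python
import pickle

shared = [1, 2, 3]
original = {'a': shared, 'b': shared}
original['self'] = original            # a recursive object

stream = pickle.dumps(original, protocol=0)   # a byte string, no file needed
restored = pickle.loads(stream)
text = stream.decode('ascii')          # protocol 0 output is plain ASCII here

print(restored['a'])
print(restored['a'] is restored['b'])  # sharing is preserved
print(restored['self'] is restored)    # so is the recursion
```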
3307 The module
\code{shelve
} provides a simple model for storing objects
3308 on files. The operation
\code{shelve.open(filename)
} returns a
3309 ``shelf'', which is a simple persistent database with a
dictionary-like interface. Database keys are strings; objects stored
in the database can be anything that \code{pickle} will handle.
3313 \subsection{Copying Objects
}
3315 The module
\code{copy
} exports two functions:
\code{copy()
} and
3316 \code{deepcopy()
}. The
\code{copy()
} function returns a ``shallow''
3317 copy of an object;
\code{deepcopy()
} returns a ``deep'' copy. The
3318 difference between shallow and deep copying is only relevant for
3319 compound objects (objects that contain other objects, like lists or
A shallow copy constructs a new compound object and then (to the
extent possible) inserts {\em the same objects} into it that the
original contains.
3330 A deep copy constructs a new compound object and then, recursively,
3331 inserts
{\em copies
} into it of the objects found in the original.
3335 Both functions have the same restrictions and use the same protocols
3336 as
\code{pickle
} --- user-defined classes can control how they are
3337 copied by providing methods named
\code{__getinitargs__()
},
3338 \code{__getstate__()
} and
\code{__setstate__()
}.
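The difference between the two kinds of copy shows up as soon as a contained object is modified. A small sketch, in present-day Python 3 syntax (an assumption), with illustrative variable names:

```python
import copy

inner = [1, 2]
original = [inner, 'label']

shallow = copy.copy(original)       # new outer list, same inner list
deep = copy.deepcopy(original)      # new outer list, new inner list

inner.append(3)                     # modify the contained object

print(shallow[0])                   # sees the change
print(deep[0])                      # does not
```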
3341 \section{Documentation Strings
}
3343 A variety of objects now have a new attribute,
\code{__doc__
}, which
3344 is supposed to contain a documentation string (if no documentation is
3345 present, the attribute is
\code{None
}). New syntax, compatible with
3346 the old interpreter, allows for convenient initialization of the
3347 \code{__doc__
} attribute of modules, classes and functions by placing
3348 a string literal by itself as the first statement in the suite. It
3349 must be a literal --- an expression yielding a string object is not
3350 accepted as a documentation string, since future tools may need to
3351 derive documentation from source by parsing.
3353 Here is a hypothetical, amply documented module called
\code{Spam
}:
3358 This module exports two classes, a function and an exception:
3360 class Spam: full Spam functionality --- three can sizes
3361 class SpamLight: limited Spam functionality --- only one can size
3363 def open(filename): open a file and return a corresponding Spam or
3366 GoneOff: exception raised for errors; should never happen
3368 Note that it is always possible to convert a SpamLight object to a
3369 Spam object by a simple method call, but that the reverse operation is
3370 generally costly and may fail for a number of reasons.
3374 """Limited spam functionality.
3376 Supports a single can size, no flavor, and only hard disks.
3379 def __init__(self, size=
12):
3380 """Construct a new SpamLight instance.
3382 Argument is the can size.
3388 class Spam(SpamLight):
3389 """Full spam functionality.
3391 Supports three can sizes, two flavor varieties, and all floppy
3392 disk formats still supported by current hardware.
3395 def __init__(self, size1=
8, size2=
12, size3=
20):
3396 """Construct a new Spam instance.
3398 Arguments are up to three can sizes.
3404 def open(filename = "/dev/null"):
3405 """Open a can of Spam.
3407 Argument must be an existing file.
3412 """Class used for Spam exceptions.
3414 There shouldn't be any.
3419 After executing ``
\code{import Spam
}'', the following expressions
3420 return the various documentation strings from the module:
3424 Spam.SpamLight.__doc__
3425 Spam.SpamLight.__init__.__doc__
3427 Spam.Spam.__init__.__doc__
3429 Spam.GoneOff.__doc__
3432 There are emerging conventions about the content and formatting of
3433 documentation strings.
3435 The first line should always be a short, concise summary of the
3436 object's purpose. For brevity, it should not explicitly state the
3437 object's name or type, since these are available by other means
3438 (except if the name happens to be a verb describing a function's
3439 operation). This line should begin with a capital letter and end with
If there are more lines in the documentation string, the second line
should be blank, visually separating the summary from the rest of the
description. The following lines should be one or more paragraphs
describing the object's calling conventions, its side effects, etc.
3447 Some people like to copy the Emacs convention of using UPPER CASE for
3448 function parameters --- this often saves a few words or lines.
The Python parser does not strip indentation from multi-line string
literals, so tools that process documentation have to strip
3452 indentation. This is done using the following convention. The first
3453 non-blank line
{\em after
} the first line of the string determines the
3454 amount of indentation for the entire documentation string. (We can't
3455 use the first line since it is generally adjacent to the string's
3456 opening quotes so its indentation is not apparent in the string
3457 literal.) Whitespace ``equivalent'' to this indentation is then
3458 stripped from the start of all lines of the string. Lines that are
3459 indented less should not occur, but if they occur all their leading
3460 whitespace should be stripped. Equivalence of whitespace should be
3461 tested after expansion of tabs (to
8 spaces, normally).
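The stripping convention described above can be sketched as a small function. This is only an illustration in present-day Python 3 syntax (an assumption); the name \code{strip\_doc\_indent}, its \code{tabsize} parameter and the sample string are mine, and no existing tool is claimed to work exactly this way.

```python
def strip_doc_indent(doc, tabsize=8):
    # Compare whitespace only after expanding tabs.
    lines = doc.expandtabs(tabsize).split('\n')

    # The first non-blank line after the first line sets the indentation.
    indent = 0
    for line in lines[1:]:
        if line.strip():
            indent = len(line) - len(line.lstrip())
            break

    result = [lines[0].strip()]
    for line in lines[1:]:
        if line[:indent].strip():
            # Indented less than expected: drop all leading whitespace.
            result.append(line.lstrip())
        else:
            result.append(line[indent:])
    return '\n'.join(result)

raw = ("Do nothing.\n"
       "\n"
       "    Longer description.\n"
       "\tA detail after a tab.\n"
       "  Badly indented.\n"
       "    ")
cleaned = strip_doc_indent(raw)
print(cleaned)
```

Relative indentation inside the description is kept: the line that began with a tab ends up four spaces deeper than the line that set the indentation.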
3463 In this release, few of the built-in or standard functions and modules
3464 have documentation strings.
3467 \section{Customizing Import and Built-Ins
}
3469 In preparation for a ``restricted execution mode'' which will be
3470 usable to run code received from an untrusted source (such as a WWW
3471 server or client), the mechanism by which modules are imported has
3472 been redesigned. It is now possible to provide your own function
3473 \code{__import__
} which is called whenever an
\code{import
} statement
3474 is executed. There's a built-in function
\code{__import__
} which
provides the default implementation, but more interestingly, the various
3476 steps it takes are available separately from the new built-in module
3477 \code{imp
}. (See the section on
\code{imp
} in the Library Reference
3478 Manual for more information on this module -- it also contains a
3479 complete example of how to write your own
\code{__import__
} function.)
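As a rough sketch of the idea, the following replaces \code{__import__} with a function that records each requested module name and then defers to the default implementation. It assumes a present-day Python 3 interpreter, where the built-ins live in a module named \code{builtins}; the names \code{logging\_import} and \code{imported\_names} are hypothetical.

```python
import builtins

imported_names = []
original_import = builtins.__import__

def logging_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Record the request, then let the default implementation do the work.
    imported_names.append(name)
    return original_import(name, globals, locals, fromlist, level)

builtins.__import__ = logging_import
try:
    import string                  # goes through logging_import
finally:
    builtins.__import__ = original_import   # always restore the default

print('string' in imported_names)
```

Restoring the original function in a \code{finally} clause keeps the rest of the program unaffected if the import fails.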
3481 When you do
\code{dir()
} in a fresh interactive interpreter you will
3482 see another ``secret'' object that's present in every module:
3483 \code{__builtins__
}. This is either a dictionary or a module
containing the set of built-in objects used by functions defined in
the current module. Although normally all modules are initialized with a
3486 reference to the same dictionary, it is now possible to use a
3487 different set of built-ins on a per-module basis. Together with the
3488 fact that the
\code{import
} statement uses the
\code{__import__
}
function it finds in the importing module's dictionary of built-ins,
3490 this forms the basis for a future restricted execution mode.
3493 \section{Python and the World-Wide Web
}
3495 There is a growing number of modules available for writing WWW tools.
3496 The previous release already sported modules
\code{gopherlib
},
3497 \code{ftplib
},
\code{httplib
} and
\code{urllib
} (which unifies the
3498 other three) for accessing data through the commonest WWW protocols.
3499 This release also provides
\code{cgi
}, to ease the writing of
3500 server-side scripts that use the Common Gateway Interface protocol,
3501 supported by most WWW servers. The module
\code{urlparse
} provides
3502 precise parsing of a URL string into its components (address scheme,
3503 network location, path, parameters, query, and fragment identifier).
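The six components can be seen in a short sketch. It assumes a present-day Python 3 interpreter, where the parsing function is available as \code{urllib.parse.urlparse} rather than in a module named \code{urlparse}; the URL itself is made up.

```python
from urllib.parse import urlparse

url = 'http://www.example.org/docs/tut.html;v=1?chapter=2&sec=3#lambda'
parts = urlparse(url)

# Address scheme, network location, path, parameters, query, fragment
for field in ('scheme', 'netloc', 'path', 'params', 'query', 'fragment'):
    print(field, '=', getattr(parts, field))
```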
A rudimentary parser for HTML files is available in the module
3506 \code{htmllib
}. It currently supports a subset of HTML
1.0 (if you
3507 bring it up to date, I'd love to receive your fixes!). Unfortunately
3508 Python seems to be too slow for real-time parsing and formatting of
3509 HTML such as required by interactive WWW browsers --- but it's good
3510 enough to write a ``robot'' (an automated WWW browser that searches
3511 the web for information).
3514 \section{Miscellaneous
}
3519 The
\code{socket
} module now exports all the needed constants used for
3520 socket operations, such as
\code{SO_BROADCAST
}.
3523 The functions
\code{popen()
} and
\code{fdopen()
} in the
\code{os
}
3524 module now follow the pattern of the built-in function
\code{open()
}:
3525 the default mode argument is
\code{'r'
} and the optional third
3526 argument specifies the buffer size, where
\code{0} means unbuffered,
3527 \code{1} means line-buffered, and any larger number means the size of
3528 the buffer in bytes.
3533 \chapter{New in Release
1.3}
3536 This chapter describes yet more recent additions to the Python
3537 language and library.
3540 \section{Keyword Arguments
}
3542 Functions and methods written in Python can now be called using
3543 keyword arguments of the form
\code{\var{keyword
} =
\var{value
}}. For
3544 instance, the following function:
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    print "-- This parrot wouldn't", action,
    print "if you put", voltage, "Volts through it."
    print "-- Lovely plumage, the", type
    print "-- It's", state, "!"
3554 could be called in any of the following ways:
parrot(action = 'VOOOOOM', voltage = 1000000)
3559 parrot('a thousand', state = 'pushing up the daisies')
3560 parrot('a million', 'bereft of life', 'jump')
3563 but the following calls would all be invalid:
parrot()                     # required argument missing
parrot(voltage=5.0, 'dead')  # non-keyword argument following keyword
parrot(110, voltage=220)     # duplicate value for argument
parrot(actor='John Cleese')  # unknown keyword
3572 In general, an argument list must have the form: zero or more
3573 positional arguments followed by zero or more keyword arguments, where
3574 the keywords must be chosen from the formal parameter names. It's not
3575 important whether a formal parameter has a default value or not. No
3576 argument must receive a value more than once -- formal parameter names
3577 corresponding to positional arguments cannot be used as keywords in
3580 Note that no special syntax is required to allow a function to be
3581 called with keyword arguments. The additional costs incurred by
3582 keyword arguments are only present when a call uses them.
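The rules can be exercised with a sketch in present-day Python 3 syntax (an assumption). Here \code{parrot} returns a string instead of printing, and the helper \code{rejected} is mine; it reports whether a call is refused at run time. A positional argument written after a keyword argument is rejected earlier, when the source is compiled, so it cannot be tested this way.

```python
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    return ("This %s wouldn't %s if you put %s volts through it. It's %s!"
            % (type, action, voltage, state))

def rejected(*args, **kwargs):
    # True when parrot() refuses the call with a TypeError.
    try:
        parrot(*args, **kwargs)
    except TypeError:
        return True
    return False

# Keyword order does not matter, and keywords may name any formal parameter.
same = parrot(action='VOOOOOM', voltage=1000000) == parrot(1000000, action='VOOOOOM')

print(same)
print(rejected())                       # required argument missing
print(rejected(110, voltage=220))       # duplicate value for argument
print(rejected(actor='John Cleese'))    # unknown keyword
print(rejected(1000))                   # a valid call
```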
(As far as I know, these rules are exactly the same as used by
Modula-3, even if they are enforced by totally different means. This
3588 When a final formal parameter of the form
\code{**
\var{name
}} is
3589 present, it receives a dictionary containing all keyword arguments
3590 whose keyword doesn't correspond to a formal parameter. This may be
3591 combined with a formal parameter of the form
\code{*
\var{name
}} which
3592 receives a tuple containing the positional arguments beyond the formal
3593 parameter list. (
\code{*
\var{name
}} must occur before
3594 \code{**
\var{name
}}.) For example, if we define a function like this:
def cheeseshop(kind, *arguments, **keywords):
    print "-- Do you have any", kind, '?'
    print "-- I'm sorry, we're all out of", kind
    for arg in arguments: print arg
    print '-'*40
    for kw in keywords.keys(): print kw, ':', keywords[kw]
3605 It could be called like this:
3608 cheeseshop('Limburger', "It's very runny, sir.",
3609 "It's really very, VERY runny, sir.",
3610 client='John Cleese',
3611 shopkeeper='Michael Palin',
3612 sketch='Cheese Shop Sketch')
3615 and of course it would print:
3618 -- Do you have any Limburger ?
3619 -- I'm sorry, we're all out of Limburger
3620 It's very runny, sir.
3621 It's really very, VERY runny, sir.
3622 ----------------------------------------
3623 client : John Cleese
3624 shopkeeper : Michael Palin
3625 sketch : Cheese Shop Sketch
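To make it easier to see what ends up where, here is a variant sketched in present-day Python 3 syntax (an assumption) that returns the lines instead of printing them. Current interpreters keep keyword arguments in the order they were passed; that ordering is a property of recent versions, not of the release described here.

```python
def cheeseshop(kind, *arguments, **keywords):
    lines = ["-- Do you have any %s ?" % kind,
             "-- I'm sorry, we're all out of %s" % kind]
    lines.extend(arguments)              # extra positional arguments: a tuple
    lines.append('-' * 40)
    for kw in keywords:                  # extra keyword arguments: a dictionary
        lines.append("%s : %s" % (kw, keywords[kw]))
    return lines

out = cheeseshop('Limburger', "It's very runny, sir.",
                 client='John Cleese',
                 sketch='Cheese Shop Sketch')
print('\n'.join(out))
```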
3628 Consequences of this change include:
3633 The built-in function
\code{apply()
} now has an optional third
3634 argument, which is a dictionary specifying any keyword arguments to be
3635 passed. For example,
apply(parrot, (), {'voltage': 20, 'action': 'voomm'})
parrot(voltage=20, action='voomm')
3645 There is also a mechanism for functions and methods defined in an
3646 extension module (i.e., implemented in C or C++) to receive a
3647 dictionary of their keyword arguments. By default, such functions do
3648 not accept keyword arguments, since the argument names are not
3649 available to the interpreter.
3652 In the effort of implementing keyword arguments, function and
3653 especially method calls have been sped up significantly -- for a
3654 method with ten formal parameters, the call overhead has been cut in
half; for a function with one formal parameter, the overhead has been
3659 The format of
\code{.pyc
} files has changed (again).
3662 The
\code{access
} statement has been disabled. The syntax is still
3663 recognized but no code is generated for it. (There were some
3664 unpleasant interactions with changes for keyword arguments, and my
3665 plan is to get rid of
\code{access
} altogether in favor of a different
3670 \section{Changes to the WWW and Internet tools
}
3675 The
\code{htmllib
} module has been rewritten in an incompatible
fashion. The new version is considerably more complete (HTML 2.0
except forms, but including all ISO-8859-1 entity definitions), and
easy to use. Small changes to \code{sgmllib} have also been made, to
3679 better match the tokenization of HTML as recognized by other web
3683 A new module
\code{formatter
} has been added, for use with the new
3684 \code{htmllib
} module.
The \code{urllib} and \code{httplib} modules have been changed somewhat
3688 to allow overriding unknown URL types and to support authentication.
3689 They now use
\code{mimetools.Message
} instead of
\code{rfc822.Message
}
3690 to parse headers. The
\code{endrequest()
} method has been removed
3691 from the HTTP class since it breaks the interaction with some servers.
3694 The
\code{rfc822.Message
} class has been changed to allow a flag to be
3695 passed in that says that the file is unseekable.
3698 The
\code{ftplib
} module has been fixed to be (hopefully) more robust
3702 Several new operations that are optionally supported by servers have
3703 been added to
\code{nntplib
}:
\code{xover
},
\code{xgtitle
},
3704 \code{xpath
} and
\code{date
}.
% thanks to Kevan Heydon
3708 \section{Other Language Changes
}
3713 The
\code{raise
} statement now takes an optional argument which
3714 specifies the traceback to be used when printing the exception's stack
3715 trace. This must be a traceback object, such as found in
3716 \code{sys.exc_traceback
}. When omitted or given as
\code{None
}, the
3717 old behavior (to generate a stack trace entry for the current stack
3721 The tokenizer is now more tolerant of alien whitespace. Control-L in
3722 the leading whitespace of a line resets the column number to zero,
3723 while Control-R just before the end of the line is ignored.
3727 \section{Changes to Built-in Operations
}
3732 For file objects,
\code{\var{f
}.read(
0)
} and
3733 \code{\var{f
}.readline(
0)
} now return an empty string rather than
3734 reading an unlimited number of bytes. For the latter, omit the
3735 argument altogether or pass a negative value.
3738 A new system variable,
\code{sys.platform
}, has been added. It
3739 specifies the current platform, e.g.
\code{sunos5
} or
\code{linux1
}.
3742 The built-in functions
\code{input()
} and
\code{raw_input()
} now use
3743 the GNU readline library when it has been configured (formerly, only
3744 interactive input to the interpreter itself was read using GNU
3745 readline). The GNU readline library provides elaborate line editing
3746 and history. The Python debugger (
\code{pdb
}) is the first
3747 beneficiary of this change.
3750 Two new built-in functions,
\code{globals()
} and
\code{locals()
},
provide access to dictionaries containing current global and local
3752 variables, respectively. (These augment rather than replace
3753 \code{vars()
}, which returns the current local variables when called
3754 without an argument, and a module's global variables when called with
3755 an argument of type module.)

The built-in function \code{compile()} now takes a third possible
value for the kind of code to be compiled: specifying \code{'single'}
generates code for a single interactive statement, which prints the
output of expression statements that evaluate to something other than
\code{None}.
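
A small sketch of the \code{'single'} mode, assuming a Python 3 interpreter, where executing such code prints the value of an expression statement unless that value is \code{None}. The helper \code{run\_single} and the file name \code{<example>} are hypothetical.

```python
import contextlib
import io


def run_single(source):
    """Compile one interactive statement and return whatever it printed."""
    code = compile(source, "<example>", "single")
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, {})
    return buffer.getvalue()


printed_value = run_single("6 * 7")       # expression with a non-None value
printed_none = run_single("None")         # expression evaluating to None
printed_assign = run_single("x = 6 * 7")  # not an expression statement

print(repr(printed_value), repr(printed_none), repr(printed_assign))
```

Only the first call produces output; the other two statements run silently.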

\section{Library Changes}

There are new modules \code{ni} and \code{ihooks} that support
importing modules with hierarchical names such as \code{A.B.C}. This
is enabled by writing \code{import ni; ni.ni()} at the very top of the
main program. These modules are amply documented in the Python
source.

The module \code{rexec} has been rewritten (incompatibly) to define a
class and to use \code{ihooks}.

The \code{string.split()} and \code{string.splitfields()} functions
are now the same function (the presence or absence of the second
argument determines which operation is invoked); similarly for
\code{string.join()} and \code{string.joinfields()}.
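
The two splitting behaviors can be illustrated with a sketch. It assumes a present-day Python 3 interpreter, where these \code{string} module functions have been replaced by string methods; the separator is now the only argument to \code{split()}, and \code{join()} is called on the separator itself. The sample strings are made up.

```python
text = "alpha  beta\tgamma"
record = "alpha,,gamma"

# No separator: split on runs of whitespace (the old split() behavior).
words = text.split()

# Separator given: split at every occurrence, keeping empty fields
# (the old splitfields() behavior).
fields = record.split(",")

joined_words = " ".join(words)
joined_fields = ",".join(fields)

print(words, fields, joined_words, joined_fields)
```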

The \code{Tkinter} module and its helper \code{Dialog} have been
revamped to use keyword arguments. Tk 4.0 is now the standard. A new
module \code{FileDialog} has been added which implements standard file
selection dialogs.

The optional built-in modules \code{dbm} and \code{gdbm} are more
coordinated: their \code{open()} functions now take the same values
for their \var{flag} argument, and the \var{flag} and \var{mode}
arguments have default values (to open the database for reading only,
and to create the database with mode \code{0666} minus the umask,
respectively). The memory leaks have finally been fixed.

A new dbm-like module, \code{bsddb}, has been added, which uses the
BSD DB package's hash method.
% thanks to David Ely

A portable (though slow) dbm-clone, implemented in Python, has been
added for systems where none of the above is provided. It is aptly
dubbed \code{dumbdbm}.

The module \code{anydbm} provides a unified interface to \code{bsddb},
\code{gdbm}, \code{dbm}, and \code{dumbdbm}, choosing the first one
available.
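
As a hedged sketch of the unified interface: in present-day Python 3 the role of \code{anydbm} is played by the \code{dbm} package, and \code{bsddb} is no longer shipped. The example assumes Python 3 and keeps the \var{flag} and \var{mode} arguments in the same positions; the file name \code{example\_db} is hypothetical.

```python
import dbm
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_db")  # hypothetical file name

    # 'c' opens for reading and writing, creating the database if needed;
    # 0o666 is the permission mode, which the umask then reduces.
    with dbm.open(path, "c", 0o666) as db:
        db[b"language"] = b"Python"

    # With no flag given, the database is opened for reading only.
    with dbm.open(path) as db:
        stored = db[b"language"]

    # Name of the backend module that was actually chosen on this system.
    backend = dbm.whichdb(path)

print(stored, backend)
```

Which backend gets picked depends on what the interpreter was built with, so the second printed value varies between installations.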

A new extension module, \code{binascii}, provides a variety of
operations for conversion of text-encoded binary data.

There are three new or rewritten companion modules implemented in
Python that can encode and decode the most common such formats:
\code{uu} (uuencode), \code{base64} and \code{binhex}.
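
To give a feel for these conversions, here is a minimal sketch assuming a present-day Python 3 interpreter. It is restricted to \code{binascii} and \code{base64}, and the payload bytes are arbitrary.

```python
import base64
import binascii

payload = b"\x00\xffPython"  # arbitrary binary data

# Low-level conversions from the extension module.
hex_text = binascii.hexlify(payload)
b64_line = binascii.b2a_base64(payload)  # one line, newline-terminated

# The companion module offers the same encoding without the newline.
b64_text = base64.b64encode(payload)
round_trip = base64.b64decode(b64_text)

print(hex_text, b64_text, round_trip == payload)
```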

A module to handle the MIME encoding quoted-printable has also been
added: \code{quopri}.
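
A short sketch of a quoted-printable round trip, assuming a Python 3 interpreter, where \code{quopri} works on bytes. The sample text is invented; it contains one non-ASCII byte and one literal equals sign, both of which must be escaped.

```python
import quopri

original = b"caf\xe9 = coffee"  # Latin-1 bytes with one non-ASCII character

encoded = quopri.encodestring(original)
decoded = quopri.decodestring(encoded)

print(encoded, decoded == original)
```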

The parser module (which provides an interface to the Python parser's
abstract syntax trees) has been rewritten (incompatibly) by Fred
Drake. It now lets you change the parse tree and compile the result!

The \code{syslog} module has been upgraded and documented.
% thanks to Steve Clift

\section{Other Changes}

The dynamic module loader recognizes the fact that different filenames
point to the same shared library and loads the library only once, so
you can have a single shared library that defines multiple modules.
(SunOS / SVR4 style shared libraries only.)

Jim Fulton's ``abstract object interface'' has been incorporated into
the run-time API. For more details, read the files
\code{Include/abstract.h} and \code{Objects/abstract.c}.

The Macintosh version is much more robust now.

Numerous things I have forgotten or that are so obscure no-one will
notice them anyway :-)