% Add a section on file I/O
% Write a chapter entitled ``Some Useful Modules''
% Should really move the Python startup file info to an appendix
\title{Python Tutorial}

\chapter*{Front Matter\label{front}}
Python is an easy-to-learn, powerful programming language. It has
efficient high-level data structures and a simple but effective
approach to object-oriented programming. Python's elegant syntax and
dynamic typing, together with its interpreted nature, make it an ideal
language for scripting and rapid application development in many areas.

The Python interpreter and the extensive standard library are freely
available in source or binary form for all major platforms from the
Python web site, \url{http://www.python.org}, and can be freely
distributed. The same site also contains distributions of and
pointers to many free third-party Python modules, programs and tools,
and additional documentation.

The Python interpreter is easily extended with new functions and data
types implemented in C or \Cpp{} (or other languages callable from C).
Python is also suitable as an extension language for customizable
applications.

This tutorial introduces the reader informally to the basic concepts
and features of the Python language and system. It helps to have a
Python interpreter handy for hands-on experience, but all examples are
self-contained, so the tutorial can be read off-line as well.

For a description of standard objects and modules, see the
\emph{Python Library Reference} document. The \emph{Python Reference
Manual} gives a more formal definition of the language. To write
extensions in C or \Cpp{}, read the \emph{Extending and Embedding} and
\emph{Python/C API} manuals. There are also several books covering
Python in depth.

This tutorial does not attempt to be comprehensive and cover every
single feature, or even every commonly used feature. Instead, it
introduces many of Python's most noteworthy features, and will give
you a good idea of the language's flavor and style. After reading it,
you will be able to read and write Python modules and programs, and
you will be ready to learn more about the various Python library
modules described in the \emph{Python Library Reference}.

\chapter{Whetting Your Appetite\label{intro}}

If you have ever written a large shell script, you probably know this
feeling: you'd love to add yet another feature, but it's already so
slow, and so big, and so complicated; or the feature involves a system
call or other function that is only accessible from C\ldots{} Usually
the problem at hand isn't serious enough to warrant rewriting the
script in C; perhaps the problem requires variable-length strings or
other data types (like sorted lists of file names) that are easy in
the shell but lots of work to implement in C, or perhaps you're not
sufficiently familiar with C.

Another situation: perhaps you have to work with several C libraries,
and the usual C write/compile/test/re-compile cycle is too slow. You
need to develop software more quickly. Alternatively, perhaps you've
written a program that could use an extension language, and you don't
want to design a language, write and debug an interpreter for it, then
tie it into your application.

In such cases, Python may be just the language for you. Python is
simple to use, but it is a real programming language, offering much
more structure and support for large programs than the shell has. On
the other hand, it also offers much more error checking than C, and,
being a \emph{very-high-level language}, it has high-level data types
built in, such as flexible arrays and dictionaries that would cost you
days to implement efficiently in C. Because of its more general data
types, Python is applicable to a much larger problem domain than
\emph{Awk} or even \emph{Perl}, yet many things are at least as easy
in Python as in those languages.

Python allows you to split up your program into modules that can be
reused in other Python programs. It comes with a large collection of
standard modules that you can use as the basis of your programs, or
as examples to start learning to program in Python. There are also
built-in modules that provide things like file I/O, system calls,
sockets, and even interfaces to GUI toolkits like Tk.

Python is an interpreted language, which can save you considerable time
during program development because no compilation and linking is
necessary. The interpreter can be used interactively, which makes it
easy to experiment with features of the language, to write throw-away
programs, or to test functions during bottom-up program development.
It is also a handy desk calculator.

Python allows writing very compact and readable programs. Programs
written in Python are typically much shorter than equivalent C
programs, for several reasons:
\begin{itemize}
\item
the high-level data types allow you to express complex operations in a
single statement;
\item
statement grouping is done by indentation instead of begin/end
brackets;
\item
no variable or argument declarations are necessary.
\end{itemize}

Python is \emph{extensible}: if you know how to program in C it is easy
to add a new built-in function or module to the interpreter, either to
perform critical operations at maximum speed, or to link Python
programs to libraries that may only be available in binary form (such
as a vendor-specific graphics library). Once you are really hooked,
you can link the Python interpreter into an application written in C
and use it as an extension or command language for that application.

By the way, the language is named after the BBC show ``Monty Python's
Flying Circus'' and has nothing to do with nasty reptiles. Making
references to Monty Python skits in documentation is not only allowed,
it is encouraged!

\section{Where From Here\label{where}}

Now that you are all excited about Python, you'll want to examine it
in some more detail. Since the best way to learn a language is
to use it, you are invited to do so here.

In the next chapter, the mechanics of using the interpreter are
explained. This is rather mundane information, but essential for
trying out the examples shown later.

The rest of the tutorial introduces various features of the Python
language and system through examples, beginning with simple
expressions, statements and data types, through functions and modules,
and finally touching upon advanced concepts like exceptions
and user-defined classes.

\chapter{Using the Python Interpreter\label{using}}

\section{Invoking the Interpreter\label{invoking}}

The Python interpreter is usually installed as
\file{/usr/local/bin/python} on those machines where it is available;
putting \file{/usr/local/bin} in your \UNIX{} shell's search path makes
it possible to start it by typing the command

\begin{verbatim}
python
\end{verbatim}

to the shell. Since the choice of the directory where the interpreter
lives is an installation option, other places are possible; check with
your local Python guru or system administrator. (E.g.,
\file{/usr/local/python} is a popular alternative location.)

Typing an EOF character (Control-D on \UNIX{}, Control-Z on DOS
or Windows) at the primary prompt causes the interpreter to exit with
a zero exit status. If that doesn't work, you can exit the
interpreter by typing the following commands: \samp{import sys;
sys.exit()}.

The interpreter's line-editing features usually aren't very
sophisticated. On \UNIX{}, whoever installed the interpreter may have
enabled support for the GNU readline library, which adds more
elaborate interactive editing and history features. Perhaps the
quickest check to see whether command line editing is supported is
typing Control-P to the first Python prompt you get. If it beeps, you
have command line editing; see Appendix A for an introduction to the
keys. If nothing appears to happen, or if \code{\^P} is echoed,
command line editing isn't available; you'll only be able to use
backspace to remove characters from the current line.

The interpreter operates somewhat like the \UNIX{} shell: when called
with standard input connected to a tty device, it reads and executes
commands interactively; when called with a file name argument or with
a file as standard input, it reads and executes a \emph{script} from
that file.

A third way of starting the interpreter is
\samp{python -c command [arg] ...}, which
executes the statement(s) in \code{command}, analogous to the shell's
\code{-c} option. Since Python statements often contain spaces or other
characters that are special to the shell, it is best to quote
\code{command} in its entirety with double quotes.

Note that there is a difference between \samp{python file} and
\samp{python <file}. In the latter case, input requests from the
program, such as calls to \code{input()} and \code{raw_input()}, are
satisfied from \emph{file}. Since this file has already been read
until the end by the parser before the program starts executing, the
program will encounter EOF immediately. In the former case (which is
usually what you want) they are satisfied from whatever file or device
is connected to standard input of the Python interpreter.

When a script file is used, it is sometimes useful to be able to run
the script and enter interactive mode afterwards. This can be done by
passing \code{-i} before the script. (This does not work if the script
is read from standard input, for the same reason as explained in the
previous paragraph.)

\subsection{Argument Passing\label{argPassing}}

When known to the interpreter, the script name and additional
arguments thereafter are passed to the script in the variable
\code{sys.argv}, which is a list of strings. Its length is at least
one; when no script and no arguments are given, \code{sys.argv[0]} is
an empty string. When the script name is given as \code{'-'} (meaning
standard input), \code{sys.argv[0]} is set to \code{'-'}. When \code{-c
command} is used, \code{sys.argv[0]} is set to \code{'-c'}. Options
found after \code{-c command} are not consumed by the Python
interpreter's option processing but left in \code{sys.argv} for the
command to handle.

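The \code{-c} rule above can be checked from within Python itself. The
following is only a sketch: it assumes a recent interpreter that ships the
\code{subprocess} module (not covered in this tutorial) and that
\code{sys.executable} names a working interpreter. The arguments
\code{spam} and \code{-x} are made up for the illustration.

```python
import subprocess
import sys

# Start a child interpreter with -c and two extra arguments.
# The child prints its own sys.argv so we can inspect it.
child_code = "import sys; print(repr(sys.argv))"
output = subprocess.check_output(
    [sys.executable, "-c", child_code, "spam", "-x"]
)

# sys.argv[0] is '-c'; '-x' was left alone by the option processing.
argv_text = output.decode().strip()
print(argv_text)
```

Note that \code{-x} reaches the command untouched, even though it looks like
an interpreter option.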
\subsection{Interactive Mode\label{interactive}}

When commands are read from a tty, the interpreter is said to be in
\emph{interactive mode}. In this mode it prompts for the next command
with the \emph{primary prompt}, usually three greater-than signs
(\samp{>>>}); for continuation lines it prompts with the
\emph{secondary prompt}, by default three dots (\samp{...}).

The interpreter prints a welcome message stating its version number
and a copyright notice before printing the first prompt, e.g.:

\begin{verbatim}
Python 1.5.2b2 (#1, Feb 28 1999, 00:02:06)  [GCC 2.8.1] on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
\end{verbatim}

\section{The Interpreter and Its Environment\label{interp}}

\subsection{Error Handling\label{error}}

When an error occurs, the interpreter prints an error
message and a stack trace. In interactive mode, it then returns to
the primary prompt; when input came from a file, it exits with a
nonzero exit status after printing
the stack trace. (Exceptions handled by an \code{except} clause in a
\code{try} statement are not errors in this context.) Some errors are
unconditionally fatal and cause an exit with a nonzero exit status; this
applies to internal inconsistencies and some cases of running out of
memory. All error messages are written to the standard error stream;
normal output from the executed commands is written to standard
output.

Typing the interrupt character (usually Control-C or DEL) to the
primary or secondary prompt cancels the input and returns to the
primary prompt.\footnote{
        A problem with the GNU Readline package may prevent this.
}
Typing an interrupt while a command is executing raises the
\code{KeyboardInterrupt} exception, which may be handled by a
\code{try} statement.

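As a rough illustration of handling \code{KeyboardInterrupt} with a
\code{try} statement, consider the sketch below. The helper names
\code{run_until_interrupted} and \code{fake_step} are invented for this
example, and the interrupt is simulated by raising the exception directly so
that the code can run unattended; a real Control-C would arrive the same way.

```python
def run_until_interrupted(step):
    """Call step() repeatedly; return how many calls completed."""
    completed = 0
    try:
        while True:
            step()
            completed += 1
    except KeyboardInterrupt:
        # The interrupt is handled here instead of producing a stack trace.
        return completed


calls = []

def fake_step():
    # Stand-in for real work: the third call behaves like a Control-C.
    calls.append(1)
    if len(calls) == 3:
        raise KeyboardInterrupt

finished = run_until_interrupted(fake_step)
print(finished)
```

Two calls complete before the simulated interrupt, so the function returns 2
and the program carries on normally.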
\subsection{Executable Python Scripts\label{scripts}}

On BSD'ish \UNIX{} systems, Python scripts can be made directly
executable, like shell scripts, by putting the line

\begin{verbatim}
#! /usr/bin/env python
\end{verbatim}

(assuming that the interpreter is on the user's \envvar{PATH}) at the
beginning of the script and giving the file an executable mode. The
\samp{\#!} must be the first two characters of the file.

\subsection{The Interactive Startup File\label{startup}}

% XXX This should probably be dumped in an appendix, since most people
% don't use Python interactively in non-trivial ways.

When you use Python interactively, it is frequently handy to have some
standard commands executed every time the interpreter is started. You
can do this by setting an environment variable named
\envvar{PYTHONSTARTUP} to the name of a file containing your start-up
commands. This is similar to the \file{.profile} feature of the \UNIX{}
shells.

This file is only read in interactive sessions, not when Python reads
commands from a script, and not when \file{/dev/tty} is given as the
explicit source of commands (which otherwise behaves like an
interactive session). It is executed in the same name space where
interactive commands are executed, so that objects that it defines or
imports can be used without qualification in the interactive session.
You can also change the prompts \code{sys.ps1} and \code{sys.ps2} in
this file.

If you want to read an additional start-up file from the current
directory, you can program this in the global start-up file,
e.g.\ \samp{execfile('.pythonrc.py')}\indexii{.pythonrc.py}{file}. If
you want to use the startup file in a script, you must do this
explicitly in the script:

\begin{verbatim}
import os
if os.environ.get('PYTHONSTARTUP') \
   and os.path.isfile(os.environ['PYTHONSTARTUP']):
    execfile(os.environ['PYTHONSTARTUP'])
\end{verbatim}

\chapter{An Informal Introduction to Python\label{informal}}

In the following examples, input and output are distinguished by the
presence or absence of prompts (\samp{>>>} and \samp{...}): to repeat
the example, you must type everything after the prompt, when the
prompt appears; lines that do not begin with a prompt are output from
the interpreter.

% I'd prefer to use different fonts to distinguish input
% from output, but the amount of LaTeX hacking that would require
% is currently beyond my ability.

Note that a secondary prompt on a line by itself in an example means
you must type a blank line; this is used to end a multi-line command.

\section{Using Python as a Calculator\label{calculator}}

Let's try some simple Python commands. Start the interpreter and wait
for the primary prompt, \samp{>>>}. (It shouldn't take long.)

\subsection{Numbers\label{numbers}}

The interpreter acts as a simple calculator: you can type an
expression at it and it will write the value. Expression syntax is
straightforward: the operators \code{+}, \code{-}, \code{*} and \code{/}
work just like in most other languages (e.g., Pascal or C); parentheses
can be used for grouping. For example:

356 >>> # This is a comment
359 >>>
2+
2 # and a comment on the same line as code
363 >>> # Integer division returns the floor:
370 Like in C, the equal sign (
\character{=
}) is used to assign a value to a
371 variable. The value of an assignment is not written:
380 A value can be assigned to several variables simultaneously:
383 >>> x = y = z =
0 # Zero x, y and z
392 There is full support for floating point; operators with mixed type
393 operands convert the integer operand to floating point:
Complex numbers are also supported; imaginary numbers are written with
a suffix of \samp{j} or \samp{J}. Complex numbers with a nonzero
real component are written as \samp{(\var{real}+\var{imag}j)}, or can
be created with the \samp{complex(\var{real}, \var{imag})} function.

\begin{verbatim}
>>> 1j * complex(0,1)
(-1+0j)
\end{verbatim}

Complex numbers are always represented as two floating point numbers,
the real and imaginary part. To extract these parts from a complex
number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.

The conversion functions to floating point and integer
(\function{float()}, \function{int()} and \function{long()}) don't
work for complex numbers; there is no one correct way to convert a
complex number to a real number. Use \code{abs(\var{z})} to get its
magnitude (as a float) or \code{z.real} to get its real part.

\begin{verbatim}
Traceback (innermost last):
  File "<stdin>", line 1, in ?
TypeError: can't convert complex to float; use e.g. abs(z)
\end{verbatim}

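The points above about complex numbers can be tried out with a short
sketch. The variable names are chosen only for this illustration, and the
exact wording of the error message differs between Python versions, so the
example only checks which exception is raised.

```python
z = complex(3, 4)          # the same value as the literal 3+4j

real_part = z.real         # 3.0
imag_part = z.imag         # 4.0
magnitude = abs(z)         # 5.0, the distance from the origin

# float() refuses to pick one "correct" real value for a complex number.
try:
    float(z)
    converted = True
except TypeError:
    converted = False
```

Here \code{converted} ends up false, while \code{abs(z)} and \code{z.real}
both return ordinary floats.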
In interactive mode, the last printed expression is assigned to the
variable \code{_}. This means that when you are using Python as a
desk calculator, it is somewhat easier to continue calculations, for
example:

This variable should be treated as read-only by the user. Don't
explicitly assign a value to it; you would create an independent
local variable with the same name masking the built-in variable with
its magic behavior.

\subsection{Strings\label{strings}}

Besides numbers, Python can also manipulate strings, which can be
expressed in several ways. They can be enclosed in single quotes or
double quotes:

\begin{verbatim}
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
\end{verbatim}

String literals can span multiple lines in several ways. Newlines can
be escaped with backslashes, e.g.:

\begin{verbatim}
hello = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
    Note that whitespace at the beginning of the line is\
 significant.\n"
print hello
\end{verbatim}

which would print the following:

\begin{verbatim}
This is a rather long string containing
several lines of text just as you would do in C.
    Note that whitespace at the beginning of the line is significant.
\end{verbatim}

Alternatively, strings can be surrounded by a pair of matching
triple-quotes: \code{"""} or \code{'''}. Ends of lines do not need to be
escaped when using triple-quotes, but they will be included in the string.

\begin{verbatim}
print """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""
\end{verbatim}

produces the following output:

\begin{verbatim}
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
\end{verbatim}

The interpreter prints the result of string operations in the same way
as they are typed for input: inside quotes, and with quotes and other
funny characters escaped by backslashes, to show the precise
value. The string is enclosed in double quotes if the string contains
a single quote and no double quotes, else it's enclosed in single
quotes. (The \keyword{print} statement, described later, can be used
to write strings without quotes or escapes.)

Strings can be concatenated (glued together) with the \code{+}
operator, and repeated with \code{*}:

\begin{verbatim}
>>> word = 'Help' + 'A'
>>> '<' + word*5 + '>'
'<HelpAHelpAHelpAHelpAHelpA>'
\end{verbatim}

Two string literals next to each other are automatically concatenated;
the first line above could also have been written \samp{word = 'Help'
'A'}; this only works with two literals, not with arbitrary string
expressions:

\begin{verbatim}
>>> import string
>>> 'str' 'ing'                   #  <-  This is ok
'string'
>>> string.strip('str') + 'ing'   #  <-  This is ok
'string'
>>> string.strip('str') 'ing'     #  <-  This is invalid
  File "<stdin>", line 1
    string.strip('str') 'ing'
                            ^
SyntaxError: invalid syntax
\end{verbatim}

Strings can be subscripted (indexed); like in C, the first character
of a string has subscript (index) 0. There is no separate character
type; a character is simply a string of size one. Like in Icon,
substrings can be specified with the \emph{slice notation}: two indices
separated by a colon.

Slice indices have useful defaults; an omitted first index defaults to
zero, an omitted second index defaults to the size of the string being
sliced.

\begin{verbatim}
>>> word[:2]    # The first two characters
'He'
>>> word[2:]    # All but the first two characters
'lpA'
\end{verbatim}

Here's a useful invariant of slice operations: \code{s[:i] + s[i:]}
equals \code{s}.

\begin{verbatim}
>>> word[:2] + word[2:]
'HelpA'
>>> word[:3] + word[3:]
'HelpA'
\end{verbatim}

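The invariant can also be checked mechanically for every split point. This is
a small sketch using the same example value \code{'HelpA'} as the text; the
loop bounds are chosen arbitrarily to include out-of-range and negative
positions.

```python
word = 'HelpA'

# s[:i] + s[i:] == s for every i, including negative and too-large values.
for i in range(-7, 8):
    assert word[:i] + word[i:] == word

first_two = word[:2]     # 'He'
rest = word[2:]          # 'lpA'
print(first_two + rest)
```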
Degenerate slice indices are handled gracefully: an index that is too
large is replaced by the string size, an upper bound smaller than the
lower bound returns an empty string.
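A few degenerate slices, sketched with the example string \code{'HelpA'}
(the variable names are only for illustration):

```python
word = 'HelpA'

too_large = word[1:100]      # upper bound clipped to the string size
past_the_end = word[10:]     # lower bound beyond the end: empty string
reversed_bounds = word[2:1]  # upper bound below lower bound: empty string
```

None of these raises an exception; \code{too_large} is \code{'elpA'} and the
other two are empty strings.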
Indices may be negative numbers, to start counting from the right.
For example:

\begin{verbatim}
>>> word[-1]     # The last character
'A'
>>> word[-2]     # The last-but-one character
'p'
>>> word[-2:]    # The last two characters
'pA'
>>> word[:-2]    # All but the last two characters
'Hel'
\end{verbatim}

But note that -0 is really the same as 0, so it does not count from
the right!

\begin{verbatim}
>>> word[-0]     # (since -0 equals 0)
'H'
\end{verbatim}

Out-of-range negative slice indices are truncated, but don't try this
for single-element (non-slice) indices:

\begin{verbatim}
>>> word[-10]    # error
Traceback (innermost last):
  File "<stdin>", line 1
IndexError: string index out of range
\end{verbatim}

The best way to remember how slices work is to think of the indices as
pointing \emph{between} characters, with the left edge of the first
character numbered 0. Then the right edge of the last character of a
string of \var{n} characters has index \var{n}, for example:

\begin{verbatim}
 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1
\end{verbatim}

The first row of numbers gives the position of the indices 0...5 in
the string; the second row gives the corresponding negative indices.
The slice from \var{i} to \var{j} consists of all characters between
the edges labeled \var{i} and \var{j}, respectively.

For nonnegative indices, the length of a slice is the difference of
the indices, if both are within bounds, e.g., the length of
\code{word[1:3]} is 2.
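The two rules just stated (slice length equals the difference of in-bounds
indices, and a negative index \var{k} names the same character as
\var{n}+\var{k}) can be verified exhaustively for a short string. A minimal
sketch, again assuming the example value \code{'HelpA'}:

```python
word = 'HelpA'
n = len(word)

# For in-bounds indices with i <= j, the slice has exactly j - i characters.
for i in range(n + 1):
    for j in range(i, n + 1):
        assert len(word[i:j]) == j - i

# A negative index k refers to the same character as index n + k.
for k in range(-n, 0):
    assert word[k] == word[n + k]

middle = word[1:3]
```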
672 The built-in function
\function{len()
} returns the length of a string:
675 >>> s = 'supercalifragilisticexpialidocious'
\subsection{Lists\label{lists}}

Python knows a number of \emph{compound} data types, used to group
together other values. The most versatile is the \emph{list}, which
can be written as a list of comma-separated values (items) between
square brackets. List items need not all have the same type.

\begin{verbatim}
>>> a = ['spam', 'eggs', 100, 1234]
>>> a
['spam', 'eggs', 100, 1234]
\end{verbatim}

Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on:

\begin{verbatim}
>>> a[:2] + ['bacon', 2*2]
['spam', 'eggs', 'bacon', 4]
>>> 3*a[:3] + ['Boe!']
['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boe!']
\end{verbatim}

Unlike strings, which are \emph{immutable}, lists allow their
individual elements to be changed:
716 ['spam', 'eggs',
100,
1234]
719 ['spam', 'eggs',
123,
1234]
Assignment to slices is also possible, and this can even change the size
of the list:
726 >>> # Replace some items:
735 ... a
[1:
1] =
['bletch', 'xyzzy'
]
737 [123, 'bletch', 'xyzzy',
1234]
738 >>> a
[:
0] = a # Insert (a copy of) itself at the beginning
740 [123, 'bletch', 'xyzzy',
1234,
123, 'bletch', 'xyzzy',
1234]
743 The built-in function
\function{len()
} also applies to lists:
It is possible to nest lists (create lists containing other lists),
for example:
762 >>> p
[1].append('xtra') # See section
5.1
764 [1,
[2,
3, 'xtra'
],
4]
Note that in the last example, \code{p[1]} and \code{q} really refer to
the same object! We'll come back to \emph{object semantics} later.
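The sharing between \code{p[1]} and \code{q} can be made visible with the
\code{is} operator, which tests whether two names refer to the same object.
The sketch below assumes the values \code{q = [2, 3]} and
\code{p = [1, q, 4]}, consistent with the output shown above; the name
\code{copy_of_q} is introduced only for contrast.

```python
q = [2, 3]
p = [1, q, 4]

p[1].append('xtra')        # modifies the one list that both names reach
same_object = p[1] is q    # True: no copy was made when p was built

# A slice copy is a separate object, so changing it leaves q alone.
copy_of_q = q[:]
copy_of_q.append('more')
```

After this, \code{q} contains \code{'xtra'} but not \code{'more'}.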
\section{First Steps Towards Programming\label{firstSteps}}

Of course, we can use Python for more complicated tasks than adding
two and two together. For instance, we can write an initial
subsequence of the \emph{Fibonacci} series as follows:

\begin{verbatim}
>>> # Fibonacci series:
... # the sum of two elements defines the next
\end{verbatim}

This example introduces several new features.

\begin{itemize}

\item
The first line contains a \emph{multiple assignment}: the variables
\code{a} and \code{b} simultaneously get the new values 0 and 1. On the
last line this is used again, demonstrating that the expressions on
the right-hand side are all evaluated first before any of the
assignments take place.

\item
The \keyword{while} loop executes as long as the condition (here:
\code{b < 10}) remains true. In Python, like in C, any non-zero
integer value is true; zero is false. The condition may also be a
string or list value, in fact any sequence; anything with a non-zero
length is true, empty sequences are false. The test used in the
example is a simple comparison. The standard comparison operators are
written the same as in C: \code{<}, \code{>}, \code{==}, \code{<=},
\code{>=} and \code{!=}.

\item
The \emph{body} of the loop is \emph{indented}: indentation is Python's
way of grouping statements. Python does not (yet!) provide an
intelligent input line editing facility, so you have to type a tab or
space(s) for each indented line. In practice you will prepare more
complicated input for Python with a text editor; most text editors have
an auto-indent facility. When a compound statement is entered
interactively, it must be followed by a blank line to indicate
completion (since the parser cannot guess when you have typed the last
line).

\item
The \keyword{print} statement writes the value of the expression(s) it is
given. It differs from just writing the expression you want to write
(as we did earlier in the calculator examples) in the way it handles
multiple expressions and strings. Strings are printed without quotes,
and a space is inserted between items, so you can format things nicely,
like this:

\begin{verbatim}
>>> i = 256*256
>>> print 'The value of i is', i
The value of i is 65536
\end{verbatim}

A trailing comma avoids the newline after the output:

\begin{verbatim}
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
\end{verbatim}

Note that the interpreter inserts a newline before it prints the next
prompt if the last line was not completed.

\end{itemize}
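Two of the points in the list above lend themselves to a quick experiment:
that the right-hand side of a multiple assignment is evaluated in full before
anything is assigned, and that empty sequences and zero count as false. The
following sketch assumes an interpreter that provides the built-in
\code{bool()} function (a later addition to the language); the variable names
are illustrative only.

```python
# Multiple assignment: both right-hand expressions use the OLD a and b.
a, b = 0, 1
a, b = b, a + b
after_step = (a, b)          # (1, 1), not (1, 2)

# The same rule makes swapping two values a one-liner.
x, y = 'left', 'right'
x, y = y, x

# Truth values: zero and empty sequences are false, everything else is true.
truth = [bool(v) for v in (0, 7, '', 'a', [], [0])]
```

Had the two assignments been executed one after the other, \code{b} would
have ended up as 2 instead of 1.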
\chapter{More Control Flow Tools\label{moreControl}}

Besides the \keyword{while} statement just introduced, Python knows
the usual control flow statements known from other languages, with
some twists.

863 \section{\keyword{if
} Statements
\label{if
}}
865 Perhaps the most well-known statement type is the
\keyword{if
}
866 statement. For example:
869 >>> #
[Code which sets 'x' to a value...
]
872 ... print 'Negative changed to zero'
There can be zero or more \keyword{elif} parts, and the \keyword{else}
part is optional. The keyword `\keyword{elif}' is short for `else
if', and is useful to avoid excessive indentation. An
\keyword{if} \ldots\ \keyword{elif} \ldots\ \keyword{elif}
\ldots\ sequence is a substitute for the \emph{switch} or
% Weird spacings happen here if the wrapping of the source text
% gets changed in the wrong way.
\emph{case} statements found in other languages.

\section{\keyword{for} Statements\label{for}}

The \keyword{for}\stindex{for} statement in Python differs a bit from
what you may be used to in C or Pascal. Rather than always
iterating over an arithmetic progression of numbers (as in Pascal),
or giving the user the ability to define both the iteration step and
halting condition (as in C), Python's \keyword{for}\stindex{for}
statement iterates over the items of any sequence (e.g., a list or a
string), in the order that they appear in the sequence. For example:

% One suggestion was to give a real C example here, but that may only
% serve to confuse non-C programmers.

\begin{verbatim}
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
\end{verbatim}

It is not safe to modify the sequence being iterated over in the loop
(this can only happen for mutable sequence types, i.e., lists). If
you need to modify the list you are iterating over, e.g., duplicate
selected items, you must iterate over a copy. The slice notation
makes this particularly convenient:

\begin{verbatim}
>>> for x in a[:]: # make a slice copy of the entire list
...    if len(x) > 6: a.insert(0, x)
... 
>>> a
['defenestrate', 'cat', 'window', 'defenestrate']
\end{verbatim}

\section{The \function{range()} Function\label{range}}

If you do need to iterate over a sequence of numbers, the built-in
function \function{range()} comes in handy. It generates lists
containing arithmetic progressions, e.g.:

\begin{verbatim}
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
\end{verbatim}

The given end point is never part of the generated list;
\code{range(10)} generates a list of 10 values, exactly the legal
indices for items of a sequence of length 10. It is possible to let
the range start at another number, or to specify a different increment
(even negative):

\begin{verbatim}
>>> range(-10, -100, -30)
[-10, -40, -70]
\end{verbatim}

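The start, stop and step arguments can be explored with a short sketch. One
assumption is worth flagging: in later Python versions \function{range()}
returns a lazy object rather than a list, so the calls are wrapped in
\code{list()} here, which is harmless where \function{range()} already
returns a list.

```python
tens = list(range(10))                   # 0 up to, but not including, 10
from_five = list(range(5, 10))           # explicit starting point
by_three = list(range(0, 10, 3))         # explicit increment
downward = list(range(-10, -100, -30))   # negative increment

# The end point is never included, so len(range(n)) is exactly n.
count = len(tens)
```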
To iterate over the indices of a sequence, combine \function{range()}
and \function{len()} as follows:

\begin{verbatim}
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
\end{verbatim}

\section{\keyword{break} and \keyword{continue} Statements, and
         \keyword{else} Clauses on Loops}

The \keyword{break} statement, like in C, breaks out of the smallest
enclosing \keyword{for} or \keyword{while} loop.

The \keyword{continue} statement, also borrowed from C, continues
with the next iteration of the loop.

Loop statements may have an \code{else} clause; it is executed when
the loop terminates through exhaustion of the list (with
\keyword{for}) or when the condition becomes false (with
\keyword{while}), but not when the loop is terminated by a
\keyword{break} statement. This is exemplified by the following loop,
which searches for prime numbers:

\begin{verbatim}
>>> for n in range(2, 10):
...     for x in range(2, n):
...         if n % x == 0:
...            print n, 'equals', x, '*', n/x
...            break
...     else:
...          print n, 'is a prime number'
\end{verbatim}
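The same search can be wrapped in a function that collects its results
instead of printing them, which makes the role of the \code{else} clause easy
to test. This is a sketch only: the function name \code{classify} is
invented, and the \code{//} floor-division operator is assumed to be
available (it was added to the language after this tutorial was written).

```python
def classify(limit):
    """Split the numbers 2..limit-1 into primes and factored composites."""
    primes = []
    composites = []
    for n in range(2, limit):
        for x in range(2, n):
            if n % x == 0:
                composites.append((n, x, n // x))
                break            # skips the else clause below
        else:
            # Reached only when the inner loop ran out without a break.
            primes.append(n)
    return primes, composites


primes, composites = classify(10)
```

Each composite is recorded with its smallest factor, because the
\keyword{break} leaves the inner loop at the first divisor found.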
\section{\keyword{pass} Statements\label{pass}}

The \keyword{pass} statement does nothing.
It can be used when a statement is required syntactically but the
program requires no action. For example:

\begin{verbatim}
>>> while 1:
...       pass # Busy-wait for keyboard interrupt
\end{verbatim}

\section{Defining Functions\label{functions}}

We can create a function that writes the Fibonacci series to an
arbitrary boundary:
1028 >>> def fib(n): # write Fibonacci series up to n
1029 ... "Print a Fibonacci series up to n"
1035 >>> # Now call the function we just defined:
1037 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
The keyword \keyword{def} introduces a function \emph{definition}. It
must be followed by the function name and the parenthesized list of
formal parameters. The statements that form the body of the function
start at the next line, indented by a tab stop. The first statement
of the function body can optionally be a string literal; this string
literal is the function's documentation string, or \dfn{docstring}.
There are tools which use docstrings to automatically produce printed
documentation, or to let the user interactively browse through code;
it's good practice to include docstrings in code that you write, so
try to make a habit of it.

The \emph{execution} of a function introduces a new symbol table used
for the local variables of the function. More precisely, all variable
assignments in a function store the value in the local symbol table;
whereas variable references first look in the local symbol table, then
in the global symbol table, and then in the table of built-in names.
Thus, global variables cannot be directly assigned a value within a
function (unless named in a \keyword{global} statement), although
they may be referenced.
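The difference between assigning to a name with and without a
\keyword{global} statement can be seen in a small sketch. The names
\code{counter}, \code{bump_local} and \code{bump_global} are made up for this
example, and the code is assumed to run at the top level of a module or
script.

```python
counter = 0

def bump_local():
    counter = 99            # creates a new local name; the global is untouched
    return counter

def bump_global():
    global counter          # assignments below now target the global name
    counter = counter + 1
    return counter


local_result = bump_local()
after_local = counter       # still 0
global_result = bump_global()
after_global = counter      # now 1
```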
1060 The actual parameters (arguments) to a function call are introduced in
1061 the local symbol table of the called function when it is called; thus,
1062 arguments are passed using
\emph{call by value
}.
\footnote{
         Actually, \emph{call by object reference} would be a better
         description, since if a mutable object is passed, the caller
         will see any changes the callee makes to it (e.g., items
         inserted into a list).
}
1068 When a function calls another function, a new local symbol table is
1069 created for that call.
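A short sketch of what this means in practice, using the hypothetical functions \code{add_item} and \code{rebind}: changing a mutable argument in place is visible to the caller, while assigning to the parameter name only changes the callee's local symbol table.

```python
def add_item(items):
    items.append('spam')      # mutates the caller's list object in place

def rebind(items):
    items = ['eggs']          # rebinds only the local name; the caller sees nothing

shopping = []
add_item(shopping)
rebind(shopping)
```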
1071 A function definition introduces the function name in the current
1072 symbol table. The value of the function name
1073 has a type that is recognized by the interpreter as a user-defined
1074 function. This value can be assigned to another name which can then
also be used as a function.  This serves as a general renaming
mechanism:
<function object at 10042ed0>
1083 1 1 2 3 5 8 13 21 34 55 89
1086 You might object that
\code{fib
} is not a function but a procedure. In
1087 Python, like in C, procedures are just functions that don't return a
1088 value. In fact, technically speaking, procedures do return a value,
1089 albeit a rather boring one. This value is called
\code{None
} (it's a
1090 built-in name). Writing the value
\code{None
} is normally suppressed by
1091 the interpreter if it would be the only value written. You can see it
1092 if you really want to:
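A small sketch of both points, with the hypothetical names \code{greet} and \code{say_hello}: the function value is bound to a second name, and the result of a call that has no \keyword{return} statement is inspected.  As an assumption for portability, \code{print} is called with one parenthesized argument.

```python
def greet(name):
    print('Hello, ' + name)      # performs an action, but has no return statement

say_hello = greet                # the same function object under a second name
result = say_hello('world')      # prints: Hello, world
print(result)                    # prints: None
```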
1099 It is simple to write a function that returns a list of the numbers of
1100 the Fibonacci series, instead of printing it:
1103 >>> def fib2(n): # return Fibonacci series up to n
1104 ... "Return a list containing the Fibonacci series up to n"
1108 ... result.append(b) # see below
>>> f100 = fib2(100)    # call it
>>> f100                # write the result
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
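For reference, a complete version of such a function might look as follows.  This is a sketch: the lines of the body that are not shown above (the initialization, the loop and the final \keyword{return}) are assumed to mirror the printing version.

```python
def fib2(n):    # return Fibonacci series up to n
    "Return a list containing the Fibonacci series up to n"
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)    # add the current value at the end of the list
        a, b = b, a + b
    return result

f100 = fib2(100)    # call it
```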
1117 This example, as usual, demonstrates some new Python features:
1122 The
\keyword{return
} statement returns with a value from a function.
1123 \keyword{return
} without an expression argument is used to return from
1124 the middle of a procedure (falling off the end also returns from a
1125 procedure), in which case the
\code{None
} value is returned.
1128 The statement
\code{result.append(b)
} calls a
\emph{method
} of the list
1129 object
\code{result
}. A method is a function that `belongs' to an
1130 object and is named
\code{obj.methodname
}, where
\code{obj
} is some
1131 object (this may be an expression), and
\code{methodname
} is the name
1132 of a method that is defined by the object's type. Different types
1133 define different methods. Methods of different types may have the
1134 same name without causing ambiguity. (It is possible to define your
1135 own object types and methods, using
\emph{classes
}, as discussed later
in this tutorial.)
1137 The method
\method{append()
} shown in the example, is defined for
1138 list objects; it adds a new element at the end of the list. In this
1139 example it is equivalent to
\samp{result = result + [b]}, but more
efficient.
1144 \section{More on Defining Functions
\label{defining
}}
1146 It is also possible to define functions with a variable number of
1147 arguments. There are three forms, which can be combined.
1149 \subsection{Default Argument Values
\label{defaultArgs
}}
1151 The most useful form is to specify a default value for one or more
1152 arguments. This creates a function that can be called with fewer
arguments than it is defined to allow, e.g.
def ask_ok(prompt, retries=4, complaint='Yes or no, please!'):
    while 1:
        ok = raw_input(prompt)
        if ok in ('y', 'ye', 'yes'): return 1
        if ok in ('n', 'no', 'nop', 'nope'): return 0
        retries = retries - 1
        if retries < 0: raise IOError, 'refusenik user'
        print complaint
1166 This function can be called either like this:
1167 \code{ask_ok('Do you really want to quit?')
} or like this:
1168 \code{ask_ok('OK to overwrite the file?',
2)
}.
1170 The default values are evaluated at the point of function definition
1171 in the
\emph{defining
} scope, so that e.g.
1175 def f(arg = i): print arg
1180 will print
\code{5}.
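The same behaviour can be checked without printing.  In this sketch the function returns its argument so that the stored default can be inspected directly.

```python
i = 5

def f(arg=i):      # the default is the value i has right now, namely 5
    return arg

i = 6              # rebinding i later does not affect the stored default
value = f()
```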
1182 \strong{Important warning:
} The default value is evaluated only once.
1183 This makes a difference when the default is a mutable object such as a
1184 list or dictionary. For example, the following function accumulates
1185 the arguments passed to it on subsequent calls:
1204 If you don't want the default to be shared between subsequent calls,
1205 you can write the function like this instead:
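The two behaviours can be compared side by side.  This is a minimal sketch; the names \code{accumulate} and \code{fresh} are hypothetical, and the second function uses \code{None} as a sentinel so that a new list is built on every call.

```python
def accumulate(a, l=[]):       # the single default list is shared by all calls
    l.append(a)
    return l

first = list(accumulate(1))    # copy, so later calls do not alter this snapshot
second = list(accumulate(2))   # the default list still remembers the 1

def fresh(a, l=None):          # None marks "no list given"
    if l is None:
        l = []                 # a new list for this call only
    l.append(a)
    return l
```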
1215 \subsection{Keyword Arguments
\label{keywordArgs
}}
1217 Functions can also be called using
1218 keyword arguments of the form
\samp{\var{keyword
} =
\var{value
}}. For
1219 instance, the following function:
1222 def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
1223 print "-- This parrot wouldn't", action,
1224 print "if you put", voltage, "Volts through it."
1225 print "-- Lovely plumage, the", type
1226 print "-- It's", state, "!"
1229 could be called in any of the following ways:
parrot(action = 'VOOOOOM', voltage = 1000000)
parrot('a thousand', state = 'pushing up the daisies')
parrot('a million', 'bereft of life', 'jump')
1238 but the following calls would all be invalid:
parrot()                     # required argument missing
parrot(voltage=5.0, 'dead')  # non-keyword argument following keyword
parrot(110, voltage=220)     # duplicate value for argument
parrot(actor='John Cleese')  # unknown keyword
1247 In general, an argument list must have any positional arguments
1248 followed by any keyword arguments, where the keywords must be chosen
1249 from the formal parameter names. It's not important whether a formal
parameter has a default value or not.  No argument may receive a
1251 value more than once --- formal parameter names corresponding to
1252 positional arguments cannot be used as keywords in the same calls.
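These rules can be tried out with a variant of \code{parrot} that returns its arguments instead of printing them (an assumption made so that the bindings are easy to inspect).  A positional argument after a keyword argument is rejected when the code is compiled, so that case is left out of the sketch.

```python
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    # Return the four values so that the binding of each argument is visible.
    return (voltage, state, action, type)

ok = parrot(action='VOOOOOM', voltage=1000000)

errors = []
for bad_call in (lambda: parrot(),                      # required argument missing
                 lambda: parrot(110, voltage=220),      # duplicate value for argument
                 lambda: parrot(actor='John Cleese')):  # unknown keyword
    try:
        bad_call()
    except TypeError:
        errors.append('TypeError')
```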
1254 When a final formal parameter of the form
\code{**
\var{name
}} is
1255 present, it receives a dictionary containing all keyword arguments
1256 whose keyword doesn't correspond to a formal parameter. This may be
1257 combined with a formal parameter of the form
\code{*
\var{name
}}
1258 (described in the next subsection) which receives a tuple containing
1259 the positional arguments beyond the formal parameter list.
1260 (
\code{*
\var{name
}} must occur before
\code{**
\var{name
}}.) For
1261 example, if we define a function like this:
1264 def cheeseshop(kind, *arguments, **keywords):
1265 print "-- Do you have any", kind, '?'
1266 print "-- I'm sorry, we're all out of", kind
1267 for arg in arguments: print arg
    for kw in keywords.keys(): print kw, ':', keywords[kw]
1272 It could be called like this:
1275 cheeseshop('Limburger', "It's very runny, sir.",
1276 "It's really very, VERY runny, sir.",
1277 client='John Cleese',
1278 shopkeeper='Michael Palin',
1279 sketch='Cheese Shop Sketch')
1282 and of course it would print:
1285 -- Do you have any Limburger ?
1286 -- I'm sorry, we're all out of Limburger
1287 It's very runny, sir.
1288 It's really very, VERY runny, sir.
1289 ----------------------------------------
1290 client : John Cleese
1291 shopkeeper : Michael Palin
1292 sketch : Cheese Shop Sketch
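To see exactly what the two special parameters receive, here is a sketch of the same signature that simply returns them rather than printing.

```python
def cheeseshop(kind, *arguments, **keywords):
    # arguments is a tuple of the extra positional arguments;
    # keywords is a dictionary of the extra keyword arguments.
    return kind, arguments, keywords

kind, arguments, keywords = cheeseshop('Limburger',
                                       "It's very runny, sir.",
                                       client='John Cleese',
                                       sketch='Cheese Shop Sketch')
```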
1295 \subsection{Arbitrary Argument Lists
\label{arbitraryArgs
}}
1297 Finally, the least frequently used option is to specify that a
1298 function can be called with an arbitrary number of arguments. These
1299 arguments will be wrapped up in a tuple. Before the variable number
1300 of arguments, zero or more normal arguments may occur.
def fprintf(file, format, *args):
    file.write(format % args)
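A usage sketch for \code{fprintf}, assuming a current interpreter where \code{io.StringIO} provides an in-memory object with a \code{write()} method; any file object opened for writing would do as well.

```python
import io

def fprintf(file, format, *args):
    file.write(format % args)     # args is a tuple, as the % operator expects

out = io.StringIO()               # in-memory stand-in for a real file
fprintf(out, '%s has %d items\n', 'basket', 3)
fprintf(out, 'no arguments at all\n')     # args is then the empty tuple
written = out.getvalue()
```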
1308 \subsection{Lambda Forms
\label{lambda
}}
1310 By popular demand, a few features commonly found in functional
1311 programming languages and Lisp have been added to Python. With the
1312 \keyword{lambda
} keyword, small anonymous functions can be created.
1313 Here's a function that returns the sum of its two arguments:
1314 \samp{lambda a, b: a+b
}. Lambda forms can be used wherever function
1315 objects are required. They are syntactically restricted to a single
1316 expression. Semantically, they are just syntactic sugar for a normal
1317 function definition. Like nested function definitions, lambda forms
1318 cannot reference variables from the containing scope, but this can be
1319 overcome through the judicious use of default argument values, e.g.
1322 def make_incrementor(n):
1323 return lambda x, incr=n: x+incr
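A brief usage sketch: each call to \code{make_incrementor} returns a new function that remembers its own \code{n} through the default argument.  The names \code{f}, \code{add}, \code{results} and \code{total} are chosen for illustration.

```python
def make_incrementor(n):
    return lambda x, incr=n: x + incr

f = make_incrementor(42)
results = [f(0), f(1)]        # 42 and 43

add = lambda a, b: a + b      # the two-argument sum mentioned above
total = add(2, 3)
```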
1326 \subsection{Documentation Strings
\label{docstrings
}}
1328 There are emerging conventions about the content and formatting of
1329 documentation strings.
1331 The first line should always be a short, concise summary of the
1332 object's purpose. For brevity, it should not explicitly state the
1333 object's name or type, since these are available by other means
1334 (except if the name happens to be a verb describing a function's
operation).  This line should begin with a capital letter and end with
a period.
1338 If there are more lines in the documentation string, the second line
1339 should be blank, visually separating the summary from the rest of the
1340 description. The following lines should be one or more paragraphs
1341 describing the object's calling conventions, its side effects, etc.
The Python parser does not strip indentation from multi-line string
literals, so tools that process documentation have to strip
1345 indentation. This is done using the following convention. The first
1346 non-blank line
\emph{after
} the first line of the string determines the
1347 amount of indentation for the entire documentation string. (We can't
1348 use the first line since it is generally adjacent to the string's
1349 opening quotes so its indentation is not apparent in the string
1350 literal.) Whitespace ``equivalent'' to this indentation is then
1351 stripped from the start of all lines of the string. Lines that are
1352 indented less should not occur, but if they occur all their leading
1353 whitespace should be stripped. Equivalence of whitespace should be
tested after expansion of tabs (to 8 spaces, normally).
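The convention can be sketched in a few lines of code.  The function \code{trim_docstring} below is hypothetical and written only to illustrate the rules just described; it is not taken from any particular documentation tool.

```python
def trim_docstring(doc, tabsize=8):
    "Strip indentation from a docstring, following the convention above."
    lines = doc.expandtabs(tabsize).split('\n')
    # The first non-blank line after the first line sets the indentation.
    indent = 0
    for line in lines[1:]:
        if line.strip():
            indent = len(line) - len(line.lstrip())
            break
    result = [lines[0].strip()]
    for line in lines[1:]:
        leading = len(line) - len(line.lstrip())
        if leading < indent:
            result.append(line.lstrip())             # indented less: strip it all
        else:
            result.append(line[indent:].rstrip())    # remove the common indentation
    return '\n'.join(result)

raw = ("Summary line.\n\n    Details are indented.\n"
       "\tA tab counts as 8 columns.\n  Less indented.\n")
cleaned = trim_docstring(raw)
```

Relative indentation beyond the common amount is kept, so the tab-indented line still starts four columns further in than its neighbours.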
1358 \chapter{Data Structures
\label{structures
}}
1360 This chapter describes some things you've learned about already in
1361 more detail, and adds some new things as well.
1363 \section{More on Lists
\label{moreLists
}}
1365 The list data type has some more methods. Here are all of the methods
1370 \item[\code{insert(i, x)
}]
Insert an item at a given position.  The first argument is the index of
the element before which to insert, so \code{a.insert(0, x)} inserts at
the front of the list, and \code{a.insert(len(a), x)} is equivalent to
\code{a.append(x)}.
1376 \item[\code{append(x)
}]
1377 Equivalent to
\code{a.insert(len(a), x)
}.
1379 \item[\code{index(x)
}]
1380 Return the index in the list of the first item whose value is
\code{x
}.
1381 It is an error if there is no such item.
1383 \item[\code{remove(x)
}]
1384 Remove the first item from the list whose value is
\code{x
}.
1385 It is an error if there is no such item.
1387 \item[\code{sort()
}]
1388 Sort the items of the list, in place.
1390 \item[\code{reverse()
}]
1391 Reverse the elements of the list, in place.
1393 \item[\code{count(x)
}]
1394 Return the number of times
\code{x
} appears in the list.
1398 An example that uses all list methods:
>>> a = [66.6, 333, 333, 1, 1234.5]
>>> print a.count(333), a.count(66.6), a.count('x')
[66.6, 333, -1, 333, 1, 1234.5, 333]
[66.6, -1, 333, 1, 1234.5, 333]
[333, 1234.5, 1, 333, -1, 66.6]
[-1, 1, 66.6, 333, 333, 1234.5]
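The list values shown above can be reproduced step by step.  In this sketch the method calls between the displayed results are inferred from those results, so treat the exact sequence as an assumption.

```python
a = [66.6, 333, 333, 1, 1234.5]
counts = (a.count(333), a.count(66.6), a.count('x'))
a.insert(2, -1)
a.append(333)              # [66.6, 333, -1, 333, 1, 1234.5, 333]
position = a.index(333)    # index of the first 333
a.remove(333)              # [66.6, -1, 333, 1, 1234.5, 333]
a.reverse()                # [333, 1234.5, 1, 333, -1, 66.6]
a.sort()                   # [-1, 1, 66.6, 333, 333, 1234.5]
```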
1421 \subsection{Functional Programming Tools
\label{functional
}}
1423 There are three built-in functions that are very useful when used with
1424 lists:
\function{filter()
},
\function{map()
}, and
\function{reduce()
}.
1426 \samp{filter(
\var{function
},
\var{sequence
})
} returns a sequence (of
1427 the same type, if possible) consisting of those items from the
1428 sequence for which
\code{\var{function
}(
\var{item
})
} is true. For
1429 example, to compute some primes:
>>> def f(x): return x % 2 != 0 and x % 3 != 0
>>> filter(f, range(2, 25))
[5, 7, 11, 13, 17, 19, 23]
1438 \samp{map(
\var{function
},
\var{sequence
})
} calls
1439 \code{\var{function
}(
\var{item
})
} for each of the sequence's items and
returns a list of the return values.  For example, to compute some
cubes:
>>> def cube(x): return x*x*x
>>> map(cube, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
1450 More than one sequence may be passed; the function must then have as
1451 many arguments as there are sequences and is called with the
1452 corresponding item from each sequence (or
\code{None
} if some sequence
1453 is shorter than another). If
\code{None
} is passed for the function,
1454 a function returning its argument(s) is substituted.
1456 Combining these two special cases, we see that
1457 \samp{map(None,
\var{list1
},
\var{list2
})
} is a convenient way of
1458 turning a pair of lists into a list of pairs. For example:
>>> seq = range(8)
>>> def square(x): return x*x
>>> map(None, seq, map(square, seq))
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
1468 \samp{reduce(
\var{func
},
\var{sequence
})
} returns a single value
1469 constructed by calling the binary function
\var{func
} on the first two
1470 items of the sequence, then on the result and the next item, and so
on.  For example, to compute the sum of the numbers 1 through 10:
>>> def add(x,y): return x+y
>>> reduce(add, range(1, 11))
55
1480 If there's only one item in the sequence, its value is returned; if
1481 the sequence is empty, an exception is raised.
1483 A third argument can be passed to indicate the starting value. In this
1484 case the starting value is returned for an empty sequence, and the
1485 function is first applied to the starting value and the first sequence
1486 item, then to the result and the next item, and so on. For example,
>>> def sum(seq):
...     def add(x,y): return x+y
...     return reduce(add, seq, 0)
...
>>> sum(range(1, 11))
55
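The three tools can be exercised together in one short script.  This sketch assumes a current interpreter, where \function{filter()} and \function{map()} return iterators (hence the \code{list()} calls) and \function{reduce()} is imported from the \module{functools} module.

```python
from functools import reduce     # reduce() is a built-in in older interpreters

def f(x): return x % 2 != 0 and x % 3 != 0
def cube(x): return x * x * x
def add(x, y): return x + y

survivors = list(filter(f, range(2, 25)))   # numbers divisible by neither 2 nor 3
cubes = list(map(cube, range(1, 11)))
total = reduce(add, range(1, 11))           # ((1+2)+3)+...+10
empty_total = reduce(add, [], 0)            # the starting value covers the empty case
```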
1499 \section{The
\keyword{del
} statement
\label{del
}}
1501 There is a way to remove an item from a list given its index instead
1502 of its value: the
\code{del
} statement. This can also be used to
1503 remove slices from a list (which we did earlier by assignment of an
1504 empty list to the slice). For example:
[-1, 1, 66.6, 333, 333, 1234.5]
[1, 66.6, 333, 333, 1234.5]
1517 \keyword{del
} can also be used to delete entire variables:
1523 Referencing the name
\code{a
} hereafter is an error (at least until
1524 another value is assigned to it). We'll find other uses for
1525 \keyword{del
} later.
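The three uses of \keyword{del} described in this section can be sketched as follows.  The names \code{remaining} and \code{name_is_gone} are hypothetical helpers that record the intermediate state.

```python
a = [-1, 1, 66.6, 333, 333, 1234.5]
del a[0]              # remove one item by index
del a[2:4]            # remove a slice
remaining = list(a)   # copy the list before the name disappears

del a                 # delete the variable itself
try:
    a
except NameError:     # referencing the name is now an error
    name_is_gone = True
```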
1527 \section{Tuples and Sequences
\label{tuples
}}
1529 We saw that lists and strings have many common properties, e.g.,
1530 indexing and slicing operations. They are two examples of
1531 \emph{sequence
} data types. Since Python is an evolving language,
1532 other sequence data types may be added. There is also another
1533 standard sequence data type: the
\emph{tuple
}.
1535 A tuple consists of a number of values separated by commas, for
>>> t = 12345, 54321, 'hello!'
(12345, 54321, 'hello!')
>>> # Tuples may be nested:
... u = t, (1, 2, 3, 4, 5)
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
As you see, on output tuples are always enclosed in parentheses, so
1551 that nested tuples are interpreted correctly; they may be input with
1552 or without surrounding parentheses, although often parentheses are
1553 necessary anyway (if the tuple is part of a larger expression).
1555 Tuples have many uses, e.g., (x, y) coordinate pairs, employee records
1556 from a database, etc. Tuples, like strings, are immutable: it is not
1557 possible to assign to the individual items of a tuple (you can
simulate much of the same effect with slicing and concatenation,
though).
A special problem is the construction of tuples containing 0 or 1
items: the syntax has some extra quirks to accommodate these.  Empty
1563 tuples are constructed by an empty pair of parentheses; a tuple with
1564 one item is constructed by following a value with a comma
1565 (it is not sufficient to enclose a single value in parentheses).
1566 Ugly, but effective. For example:
1570 >>> singleton = 'hello', # <-- note trailing comma
1579 The statement
\code{t =
12345,
54321, 'hello!'
} is an example of
1580 \emph{tuple packing
}: the values
\code{12345},
\code{54321} and
1581 \code{'hello!'
} are packed together in a tuple. The reverse operation
1582 is also possible, e.g.:
1588 This is called, appropriately enough,
\emph{tuple unpacking
}. Tuple
1589 unpacking requires that the list of variables on the left has the same
1590 number of elements as the length of the tuple. Note that multiple
assignment is really just a combination of tuple packing and tuple
unpacking!
1594 % XXX This is no longer necessary!
Occasionally, the corresponding operation on lists is useful: \emph{list
unpacking}.  This is supported by enclosing the list of variables in
square brackets:
>>> a = ['spam', 'eggs', 100, 1234]
>>> [a1, a2, a3, a4] = a
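Packing, unpacking and the two special tuple forms fit in a few lines.  The swap at the end is a small illustration, added here, of multiple assignment as packing followed by unpacking.

```python
t = 12345, 54321, 'hello!'      # tuple packing
x, y, z = t                     # tuple unpacking

empty = ()                      # a tuple with no items
singleton = 'hello',            # note the trailing comma

a, b = 1, 2
a, b = b, a                     # the right-hand side is packed, then unpacked
```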
1604 % XXX Add a bit on the difference between tuples and lists.
1605 % XXX Also explain that a tuple can *contain* a mutable object!
1607 \section{Dictionaries
\label{dictionaries
}}
1609 Another useful data type built into Python is the
\emph{dictionary
}.
1610 Dictionaries are sometimes found in other languages as ``associative
1611 memories'' or ``associative arrays''. Unlike sequences, which are
1612 indexed by a range of numbers, dictionaries are indexed by
\emph{keys
},
which can be any immutable type; strings and numbers can always be
1614 keys. Tuples can be used as keys if they contain only strings,
1615 numbers, or tuples. You can't use lists as keys, since lists can be
1616 modified in place using their
\code{append()
} method.
1618 It is best to think of a dictionary as an unordered set of
1619 \emph{key:value
} pairs, with the requirement that the keys are unique
1620 (within one dictionary).
A pair of braces creates an empty dictionary: \code{\{\}}.
1622 Placing a comma-separated list of key:value pairs within the
1623 braces adds initial key:value pairs to the dictionary; this is also the
1624 way dictionaries are written on output.
1626 The main operations on a dictionary are storing a value with some key
1627 and extracting the value given the key. It is also possible to delete
1630 If you store using a key that is already in use, the old value
1631 associated with that key is forgotten. It is an error to extract a
1632 value using a non-existent key.
1634 The
\code{keys()
} method of a dictionary object returns a list of all the
keys used in the dictionary, in arbitrary order (if you want it sorted,
1636 just apply the
\code{sort()
} method to the list of keys). To check
1637 whether a single key is in the dictionary, use the
\code{has_key()
}
1638 method of the dictionary.
1640 Here is a small example using a dictionary:
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['irv'] = 4127
{'guido': 4127, 'irv': 4127, 'jack': 4098}
['guido', 'irv', 'jack']
>>> tel.has_key('guido')
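The same operations can be written as a script.  This sketch assumes a current interpreter: it uses the built-in \code{sorted()} and the \code{in} operator, which are newer spellings of sorting the key list and of calling \code{has_key()}.  The deletion step is inferred from the displayed results.

```python
tel = {'jack': 4098, 'sape': 4139}
tel['guido'] = 4127              # store a value under a new key
del tel['sape']                  # delete a key:value pair
tel['irv'] = 4127

names = sorted(tel.keys())       # the keys in a defined order
has_guido = 'guido' in tel       # membership test on the keys
```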
1659 \section{More on Conditions
\label{conditions
}}
1661 The conditions used in
\code{while
} and
\code{if
} statements above can
1662 contain other operators besides comparisons.
1664 The comparison operators
\code{in
} and
\code{not in
} check whether a value
1665 occurs (does not occur) in a sequence. The operators
\code{is
} and
1666 \code{is not
} compare whether two objects are really the same object; this
1667 only matters for mutable objects like lists. All comparison operators
1668 have the same priority, which is lower than that of all numerical
1671 Comparisons can be chained: e.g.,
\code{a < b == c
} tests whether
\code{a
}
1672 is less than
\code{b
} and moreover
\code{b
} equals
\code{c
}.
1674 Comparisons may be combined by the Boolean operators
\code{and
} and
1675 \code{or
}, and the outcome of a comparison (or of any other Boolean
1676 expression) may be negated with
\code{not
}.  These all have lower
priorities than comparison operators; between them,
\code{not
} has
1678 the highest priority, and
\code{or
} the lowest, so that
1679 \code{A and not B or C
} is equivalent to
\code{(A and (not B)) or C
}. Of
1680 course, parentheses can be used to express the desired composition.
1682 The Boolean operators
\code{and
} and
\code{or
} are so-called
1683 \emph{shortcut
} operators: their arguments are evaluated from left to
1684 right, and evaluation stops as soon as the outcome is determined.
1685 E.g., if
\code{A
} and
\code{C
} are true but
\code{B
} is false,
\code{A
1686 and B and C
} does not evaluate the expression \code{C}.  In general, the
1687 return value of a shortcut operator, when used as a general value and
1688 not as a Boolean, is the last evaluated argument.
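Both properties can be observed directly.  In this sketch the hypothetical helper \code{check} records which operands were actually evaluated.

```python
calls = []

def check(label, value):
    calls.append(label)      # remember that this operand was evaluated
    return value

# B is false, so C is never evaluated and the value of B is the result.
result = check('A', 1) and check('B', 0) and check('C', 1)

# For "or", the first true operand is the result.
fallback = '' or 'Trondheim' or 'Hammer Dance'
```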
1690 It is possible to assign the result of a comparison or other Boolean
1691 expression to a variable. For example,
1694 >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
1695 >>> non_null = string1 or string2 or string3
1700 Note that in Python, unlike C, assignment cannot occur inside expressions.
1702 \section{Comparing Sequences and Other Types
\label{comparing
}}
1704 Sequence objects may be compared to other objects with the same
1705 sequence type. The comparison uses
\emph{lexicographical
} ordering:
1706 first the first two items are compared, and if they differ this
1707 determines the outcome of the comparison; if they are equal, the next
1708 two items are compared, and so on, until either sequence is exhausted.
1709 If two items to be compared are themselves sequences of the same type,
1710 the lexicographical comparison is carried out recursively. If all
1711 items of two sequences compare equal, the sequences are considered
1712 equal. If one sequence is an initial subsequence of the other, the
shorter sequence is the smaller one.  Lexicographical ordering for
1714 strings uses the
\ASCII{} ordering for individual characters. Some
1715 examples of comparisons between sequences with the same types:
(1, 2, 3)              < (1, 2, 4)
[1, 2, 3]              < [1, 2, 4]
'ABC' < 'C' < 'Pascal' < 'Python'
(1, 2, 3, 4)           < (1, 2, 4)
(1, 2, 3)             == (1.0, 2.0, 3.0)
(1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
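Each of these comparisons can be evaluated by the interpreter.  The sketch below collects them in a list; the entry comparing \code{(1, 2)} with \code{(1, 2, -1)} is an extra example, added here, of the initial-subsequence rule.

```python
checks = [
    (1, 2, 3) < (1, 2, 4),
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',
    (1, 2, 3, 4) < (1, 2, 4),
    (1, 2) < (1, 2, -1),              # an initial subsequence is the smaller one
    (1, 2, 3) == (1.0, 2.0, 3.0),
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),
]
all_true = all(checks)
```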
1727 Note that comparing objects of different types is legal. The outcome
1728 is deterministic but arbitrary: the types are ordered by their name.
1729 Thus, a list is always smaller than a string, a string is always
1730 smaller than a tuple, etc. Mixed numeric types are compared according
to their numeric value, so 0 equals 0.0, etc.\footnote{
        The rules for comparing objects of different types should
        not be relied upon; they may change in a future version of
        the language.
}
1738 \chapter{Modules
\label{modules
}}
1740 If you quit from the Python interpreter and enter it again, the
1741 definitions you have made (functions and variables) are lost.
1742 Therefore, if you want to write a somewhat longer program, you are
1743 better off using a text editor to prepare the input for the interpreter
1744 and running it with that file as input instead. This is known as creating a
1745 \emph{script
}. As your program gets longer, you may want to split it
1746 into several files for easier maintenance. You may also want to use a
1747 handy function that you've written in several programs without copying
1748 its definition into each program.
1750 To support this, Python has a way to put definitions in a file and use
1751 them in a script or in an interactive instance of the interpreter.
1752 Such a file is called a
\emph{module
}; definitions from a module can be
1753 \emph{imported
} into other modules or into the
\emph{main
} module (the
1754 collection of variables that you have access to in a script
1755 executed at the top level
1756 and in calculator mode).
1758 A module is a file containing Python definitions and statements. The
1759 file name is the module name with the suffix
\file{.py
} appended. Within
1760 a module, the module's name (as a string) is available as the value of
1761 the global variable
\code{__name__
}. For instance, use your favorite text
1762 editor to create a file called
\file{fibo.py
} in the current directory
1763 with the following contents:
1766 # Fibonacci numbers module
1768 def fib(n): # write Fibonacci series up to n
1774 def fib2(n): # return Fibonacci series up to n
1783 Now enter the Python interpreter and import this module with the
This does not enter the names of the functions defined in \code{fibo}
directly in the current symbol table; it only enters the module name
\code{fibo} there.
1795 Using the module name you can access the functions:
1799 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
1806 If you intend to use a function often you can assign it to a local name:
1811 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1815 \section{More on Modules
\label{moreModules
}}
A module can contain executable statements as well as function
definitions.
These statements are intended to initialize the module.
They are executed only the
\emph{first}
time the module is imported somewhere.\footnote{
        In fact function definitions are also `statements' that are
        `executed'; the execution enters the function name in the
        module's global symbol table.
}
1828 Each module has its own private symbol table, which is used as the
1829 global symbol table by all functions defined in the module.
1830 Thus, the author of a module can use global variables in the module
without worrying about accidental clashes with a user's global
variables.
On the other hand, if you know what you are doing you can touch a
module's global variables with the same notation used to refer to its
functions,
\code{modname.itemname}.
1838 Modules can import other modules.
It is customary but not required to place all
\keyword{import}
statements at the beginning of a module (or script, for that matter).
The imported module names are placed in the importing module's global
symbol table.

There is a variant of the
\keyword{import}
statement that imports names from a module directly into the importing
module's symbol table.
1852 >>> from fibo import fib, fib2
1854 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1857 This does not introduce the module name from which the imports are taken
1858 in the local symbol table (so in the example,
\code{fibo
} is not
defined).
1861 There is even a variant to import all names that a module defines:
1864 >>> from fibo import *
1866 1 1 2 3 5 8 13 21 34 55 89 144 233 377
This imports all names except those beginning with an underscore
(\code{_}).
1872 \subsection{The Module Search Path
\label{searchPath
}}
1874 % XXX Need to document that a lone .pyc/.pyo is acceptable too!
1876 \indexiii{module
}{search
}{path
}
1877 When a module named
\module{spam
} is imported, the interpreter searches
1878 for a file named
\file{spam.py
} in the current directory,
1879 and then in the list of directories specified by
1880 the environment variable
\envvar{PYTHONPATH
}. This has the same syntax as
1881 the shell variable
\envvar{PATH
}, i.e., a list of
1882 directory names. When
\envvar{PYTHONPATH
} is not set, or when the file
1883 is not found there, the search continues in an installation-dependent
1884 default path; on
\UNIX{}, this is usually
\file{.:/usr/local/lib/python
}.
1886 Actually, modules are searched in the list of directories given by the
1887 variable
\code{sys.path
} which is initialized from the directory
1888 containing the input script (or the current directory),
1889 \envvar{PYTHONPATH
} and the installation-dependent default. This allows
1890 Python programs that know what they're doing to modify or replace the
1891 module search path. See the section on Standard Modules later.
1893 \subsection{``Compiled'' Python files
}
1895 As an important speed-up of the start-up time for short programs that
1896 use a lot of standard modules, if a file called
\file{spam.pyc
} exists
1897 in the directory where
\file{spam.py
} is found, this is assumed to
1898 contain an already-``byte-compiled'' version of the module
\module{spam
}.
1899 The modification time of the version of
\file{spam.py
} used to create
1900 \file{spam.pyc
} is recorded in
\file{spam.pyc
}, and the file is
1901 ignored if these don't match.
1903 Normally, you don't need to do anything to create the
\file{spam.pyc
} file.
1904 Whenever
\file{spam.py
} is successfully compiled, an attempt is made to
1905 write the compiled version to
\file{spam.pyc
}. It is not an error if
1906 this attempt fails; if for any reason the file is not written
1907 completely, the resulting
\file{spam.pyc
} file will be recognized as
1908 invalid and thus ignored later. The contents of the
\file{spam.pyc
}
file are platform independent, so a Python module directory can be
1910 shared by machines of different architectures.
1912 Some tips for experts:
1917 When the Python interpreter is invoked with the
\code{-O
} flag,
1918 optimized code is generated and stored in
\file{.pyo
} files.
1919 The optimizer currently doesn't help much; it only removes
1920 \keyword{assert
} statements and
\code{SET_LINENO
} instructions.
1921 When
\code{-O
} is used,
\emph{all
} bytecode is optimized;
\code{.pyc
}
1922 files are ignored and
\code{.py
} files are compiled to optimized
bytecode.
1926 Passing two
\code{-O
} flags to the Python interpreter (
\code{-OO
})
1927 will cause the bytecode compiler to perform optimizations that could
1928 in some rare cases result in malfunctioning programs. Currently only
1929 \code{__doc__
} strings are removed from the bytecode, resulting in more
1930 compact
\file{.pyo
} files. Since some programs may rely on having
these available, you should only use this option if you know what
you're doing.
1935 A program doesn't run any faster when it is read from a
1936 \file{.pyc
} or
\file{.pyo
} file than when it is read from a
\file{.py
}
1937 file; the only thing that's faster about
\file{.pyc
} or
\file{.pyo
}
1938 files is the speed with which they are loaded.
1941 When a script is run by giving its name on the command line, the
1942 bytecode for the script is never written to a
\file{.pyc
} or
1943 \file{.pyo
} file. Thus, the startup time of a script may be reduced
1944 by moving most of its code to a module and having a small bootstrap
1945 script that imports that module.
1948 It is possible to have a file called
\file{spam.pyc
} (or
1949 \file{spam.pyo
} when
\code{-O
} is used) without a file
\file{spam.py} for the same module.  This can be used to distribute
a library of Python code in a form that is moderately hard to reverse
engineer.
1955 The module
\module{compileall
}\refstmodindex{compileall
} can create
1956 \file{.pyc
} files (or
\file{.pyo
} files when
\code{-O
} is used) for
1957 all modules in a directory.
1962 \section{Standard Modules
\label{standardModules
}}
1964 Python comes with a library of standard modules, described in a separate
1965 document, the
\emph{Python Library Reference
} (``Library Reference''
1966 hereafter). Some modules are built into the interpreter; these
1967 provide access to operations that are not part of the core of the
1968 language but are nevertheless built in, either for efficiency or to
1969 provide access to operating system primitives such as system calls.
1970 The set of such modules is a configuration option; e.g., the
1971 \module{amoeba
} module is only provided on systems that somehow
1972 support Amoeba primitives. One particular module deserves some
1973 attention:
\module{sys
}\refstmodindex{sys
}, which is built into every
1974 Python interpreter. The variables
\code{sys.ps1
} and
1975 \code{sys.ps2
} define the strings used as primary and secondary
prompts.

These two variables are only defined if the interpreter is in
interactive mode.
1993 The variable
\code{sys.path
} is a list of strings that determine the
1994 interpreter's search path for modules. It is initialized to a default
1995 path taken from the environment variable
\envvar{PYTHONPATH
}, or from
1996 a built-in default if
\envvar{PYTHONPATH
} is not set. You can modify
1997 it using standard list operations, e.g.:
2001 >>> sys.path.append('/ufs/guido/lib/python')
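As a minimal sketch of the ``standard list operations'' mentioned
above, the lines below append a directory to \code{sys.path}, then
move it to the front of the list. The directory name
\file{/tmp/my_modules} is a made-up placeholder, not a path this
tutorial prescribes; directories nearer the front of the list are
searched first.

\begin{verbatim}
import sys

extra_dir = '/tmp/my_modules'      # hypothetical directory, for illustration

sys.path.append(extra_dir)         # searched after every existing entry
last_entry = sys.path[-1]

sys.path.remove(extra_dir)         # undo, using an ordinary list method
sys.path.insert(0, extra_dir)      # now searched before every other entry
first_entry = sys.path[0]

sys.path.remove(extra_dir)         # leave the search path as we found it
\end{verbatim}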
2004 \section{The
\function{dir()
} Function
\label{dir
}}
2006 The built-in function
\function{dir()
} is used to find out which names
2007 a module defines. It returns a sorted list of strings:
2010 >>> import fibo, sys
2012 ['__name__', 'fib', 'fib2'
]
2014 ['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit',
2015 'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace',
2016 'stderr', 'stdin', 'stdout', 'version'
]
2019 Without arguments,
\function{dir()
} lists the names you have defined
>>> a = [1, 2, 3, 4, 5]
2024 >>> import fibo, sys
2027 ['__name__', 'a', 'fib', 'fibo', 'sys'
]
2030 Note that it lists all types of names: variables, modules, functions, etc.
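Because \function{dir()} returns an ordinary list of strings, the
result can be processed like any other list. The sketch below keeps
only the names that do not start with an underscore; the helper name
\code{public_names} is invented for this example and is not part of
Python.

\begin{verbatim}
import math

def public_names(module):
    # Collect the names dir() reports, skipping those that start with '_'.
    result = []
    for name in dir(module):
        if name[:1] != '_':
            result.append(name)
    return result

names = public_names(math)         # includes 'pi' and 'sqrt', among others
\end{verbatim}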
2032 \function{dir()
} does not list the names of built-in functions and
2033 variables. If you want a list of those, they are defined in the
2034 standard module
\module{__builtin__
}\refbimodindex{__builtin__
}:
2037 >>> import __builtin__
2038 >>> dir(__builtin__)
2039 ['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError',
2040 'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
2041 'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError',
2042 'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError',
2043 'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce',
2044 'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float',
2045 'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long',
2046 'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input',
2047 'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange'
]
2050 \section{Packages
\label{packages
}}
2052 Packages are a way of structuring Python's module namespace
2053 by using ``dotted module names''. For example, the module name
2054 \module{A.B
} designates a submodule named
\samp{B
} in a package named
2055 \samp{A
}. Just like the use of modules saves the authors of different
2056 modules from having to worry about each other's global variable names,
2057 the use of dotted module names saves the authors of multi-module
2058 packages like NumPy or PIL from having to worry about each other's
2061 Suppose you want to design a collection of modules (a ``package'') for
2062 the uniform handling of sound files and sound data. There are many
2063 different sound file formats (usually recognized by their extension,
2064 e.g.
\file{.wav
},
\file{.aiff
},
\file{.au
}), so you may need to create
2065 and maintain a growing collection of modules for the conversion
2066 between the various file formats. There are also many different
2067 operations you might want to perform on sound data (e.g. mixing,
2068 adding echo, applying an equalizer function, creating an artificial
2069 stereo effect), so in addition you will be writing a never-ending
2070 stream of modules to perform these operations. Here's a possible
2071 structure for your package (expressed in terms of a hierarchical
2075 Sound/ Top-level package
2076 __init__.py Initialize the sound package
2077 Formats/ Subpackage for file format conversions
2086 Effects/ Subpackage for sound effects
2092 Filters/ Subpackage for filters
2099 The
\file{__init__.py
} files are required to make Python treat the
2100 directories as containing packages; this is done to prevent
2101 directories with a common name, such as
\samp{string
}, from
2102 unintentionally hiding valid modules that occur later on the module
2103 search path. In the simplest case,
\file{__init__.py
} can just be an
2104 empty file, but it can also execute initialization code for the
2105 package or set the
\code{__all__
} variable, described later.
2107 Users of the package can import individual modules from the
2108 package, for example:
2111 import Sound.Effects.echo
2113 This loads the submodule
\module{Sound.Effects.echo
}. It must be referenced
2114 with its full name, e.g.
Sound.Effects.echo.echofilter(input, output, delay=0.7, atten=4)
2119 An alternative way of importing the submodule is:
2122 from Sound.Effects import echo
2124 This also loads the submodule
\module{echo
}, and makes it available without
2125 its package prefix, so it can be used as follows:
echo.echofilter(input, output, delay=0.7, atten=4)
2131 Yet another variation is to import the desired function or variable directly:
2134 from Sound.Effects.echo import echofilter
2137 Again, this loads the submodule
\module{echo
}, but this makes its function
\function{echofilter()} directly available:
echofilter(input, output, delay=0.7, atten=4)
2144 Note that when using
\code{from
\var{package
} import
\var{item
}}, the
2145 item can be either a submodule (or subpackage) of the package, or some
2146 other name defined in the package, like a function, class or
2147 variable. The
\code{import
} statement first tests whether the item is
2148 defined in the package; if not, it assumes it is a module and attempts
2149 to load it. If it fails to find it,
\exception{ImportError
} is raised.
Conversely, when using syntax like
\code{import
2152 \var{item.subitem.subsubitem
}}, each item except for the last must be
2153 a package; the last item can be a module or a package but can't be a
2154 class or function or variable defined in the previous item.
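The three import forms can be tried out without a real sound library.
The following sketch builds a throw-away \module{Sound} package in a
temporary directory and imports from it. The body of
\function{echofilter()} is a stand-in written for this example (the
real function is not shown in this tutorial), and the use of
\function{tempfile.mkdtemp()} assumes an interpreter that provides it.

\begin{verbatim}
import os, sys, tempfile

# Build a tiny copy of the package layout in a scratch directory.
root = tempfile.mkdtemp()
effects_dir = os.path.join(root, 'Sound', 'Effects')
os.makedirs(effects_dir)

def write_file(path, text):
    f = open(path, 'w')
    f.write(text)
    f.close()

# Empty __init__.py files mark the directories as packages.
write_file(os.path.join(root, 'Sound', '__init__.py'), '')
write_file(os.path.join(effects_dir, '__init__.py'), '')
# Stand-in for the real echo module: it just returns its arguments.
write_file(os.path.join(effects_dir, 'echo.py'),
           'def echofilter(input, output, delay=0.7, atten=4):\n'
           '    return (input, output, delay, atten)\n')

sys.path.insert(0, root)            # make the scratch package importable

import Sound.Effects.echo                     # full dotted name required
r1 = Sound.Effects.echo.echofilter('in', 'out')

from Sound.Effects import echo                # submodule, no package prefix
r2 = echo.echofilter('in', 'out', delay=0.5)

from Sound.Effects.echo import echofilter     # the function itself
r3 = echofilter('in', 'out', atten=2)

try:
    import Sound.Effects.echo.echofilter      # last item is a function
except ImportError:
    import_failed = 1
else:
    import_failed = 0
\end{verbatim}

The last statement fails because, in the plain \code{import} form, the
final item must be a module or a package.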
2156 \subsection{Importing * From a Package
\label{pkg-import-star
}}
2157 %The \code{__all__} Attribute
2159 Now what happens when the user writes
\code{from Sound.Effects import
2160 *
}? Ideally, one would hope that this somehow goes out to the
2161 filesystem, finds which submodules are present in the package, and
2162 imports them all. Unfortunately, this operation does not work very
2163 well on Mac and Windows platforms, where the filesystem does not
2164 always have accurate information about the case of a filename! On
2165 these platforms, there is no guaranteed way to know whether a file
2166 \file{ECHO.PY
} should be imported as a module
\module{echo
},
2167 \module{Echo
} or
\module{ECHO
}. (For example, Windows
95 has the
2168 annoying practice of showing all file names with a capitalized first
2169 letter.) The DOS
8+
3 filename restriction adds another interesting
2170 problem for long module names.
2172 The only solution is for the package author to provide an explicit
2173 index of the package. The import statement uses the following
2174 convention: if a package's
\file{__init__.py
} code defines a list named
2175 \code{__all__
}, it is taken to be the list of module names that should be imported
2176 when
\code{from
\var{package
} import *
} is
2177 encountered. It is up to the package author to keep this list
2178 up-to-date when a new version of the package is released. Package
2179 authors may also decide not to support it, if they don't see a use for
2180 importing * from their package. For example, the file
\code{Sound/Effects/__init__.py
} could contain the following code:
__all__ = ["echo", "surround", "reverse"]
This would mean that \code{from Sound.Effects import *} would
import the three named submodules of the \module{Sound.Effects} package.
2190 If
\code{__all__
} is not defined, the statement
\code{from Sound.Effects
2191 import *
} does
\emph{not
} import all submodules from the package
2192 \module{Sound.Effects
} into the current namespace; it only ensures that the
2193 package
\module{Sound.Effects
} has been imported (possibly running its
2194 initialization code,
\file{__init__.py
}) and then imports whatever names are
2195 defined in the package. This includes any names defined (and
2196 submodules explicitly loaded) by
\file{__init__.py
}. It also includes any
2197 submodules of the package that were explicitly loaded by previous
2198 import statements, e.g.
2201 import Sound.Effects.echo
2202 import Sound.Effects.surround
2203 from Sound.Effects import *
2207 In this example, the echo and surround modules are imported in the
2208 current namespace because they are defined in the
\module{Sound.Effects
}
2209 package when the
\code{from...import
} statement is executed. (This also
2210 works when
\code{__all__
} is defined.)
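The effect of \code{__all__} can be observed with a scratch package.
In the sketch below the package name \module{DemoSound} and the
contents of its submodules are invented for illustration; its
\file{__init__.py} lists only two of the three submodules, and the
import is run in a fresh dictionary so that the imported names are
easy to inspect.

\begin{verbatim}
import os, sys, tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'DemoSound', 'Effects')
os.makedirs(pkg_dir)

def write_file(path, text):
    f = open(path, 'w')
    f.write(text)
    f.close()

write_file(os.path.join(root, 'DemoSound', '__init__.py'), '')
# The package author lists two of the three submodules in __all__.
write_file(os.path.join(pkg_dir, '__init__.py'),
           '__all__ = ["echo", "surround"]\n')
for name in ('echo', 'surround', 'reverse'):
    write_file(os.path.join(pkg_dir, name + '.py'),
               'NAME = "%s"\n' % name)

sys.path.insert(0, root)

# Run the import in a fresh namespace so the result is easy to see.
namespace = {}
exec('from DemoSound.Effects import *', namespace)

imported = []
for key in namespace.keys():
    if key[:2] != '__':            # skip bookkeeping entries
        imported.append(key)
imported.sort()                    # only the names listed in __all__
\end{verbatim}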
Note that in general the practice of importing * from a module or
2213 package is frowned upon, since it often causes poorly readable code.
2214 However, it is okay to use it to save typing in interactive sessions,
2215 and certain modules are designed to export only names that follow
2218 Remember, there is nothing wrong with using
\code{from Package
2219 import specific_submodule
}! In fact, this is the
2220 recommended notation unless the importing module needs to use
2221 submodules with the same name from different packages.
2224 \subsection{Intra-package References
}
2226 The submodules often need to refer to each other. For example, the
2227 \module{surround
} module might use the
\module{echo
} module. In fact, such references
2228 are so common that the
\code{import
} statement first looks in the
2229 containing package before looking in the standard module search path.
2230 Thus, the surround module can simply use
\code{import echo
} or
2231 \code{from echo import echofilter
}. If the imported module is not
2232 found in the current package (the package of which the current module
2233 is a submodule), the
\code{import
} statement looks for a top-level module
2234 with the given name.
2236 When packages are structured into subpackages (as with the
\module{Sound
}
2237 package in the example), there's no shortcut to refer to submodules of
sibling packages; the full name of the subpackage must be used. For
2239 example, if the module
\module{Sound.Filters.vocoder
} needs to use the
\module{echo
}
2240 module in the
\module{Sound.Effects
} package, it can use
\code{from
2241 Sound.Effects import echo
}.
2243 %(One could design a notation to refer to parent packages, similar to
2244 %the use of ".." to refer to the parent directory in Unix and Windows
2245 %filesystems. In fact, the \module{ni} module, which was the
2246 %ancestor of this package system, supported this using \code{__} for
2247 %the package containing the current module,
2248 %\code{__.__} for the parent package, and so on. This feature was dropped
2249 %because of its awkwardness; since most packages will have a relative
2250 %shallow substructure, this is no big loss.)
2254 \chapter{Input and Output
\label{io
}}
2256 There are several ways to present the output of a program; data can be
2257 printed in a human-readable form, or written to a file for future use.
2258 This chapter will discuss some of the possibilities.
2261 \section{Fancier Output Formatting
\label{formatting
}}
2263 So far we've encountered two ways of writing values:
\emph{expression
2264 statements
} and the
\keyword{print
} statement. (A third way is using
2265 the
\method{write()
} method of file objects; the standard output file
2266 can be referenced as
\code{sys.stdout
}. See the Library Reference for
2267 more information on this.)
2269 Often you'll want more control over the formatting of your output than
2270 simply printing space-separated values. There are two ways to format
your output. The first way is to do all the string handling yourself:
using string slicing and concatenation operations you can create any
lay-out you can imagine. The standard module
2274 \module{string
}\refstmodindex{string
} contains some useful operations
2275 for padding strings to a given column width;
2276 these will be discussed shortly. The second way is to use the
2277 \code{\%
} operator with a string as the left argument.
\code{\%
}
2278 interprets the left argument as a C
\cfunction{sprintf()
}-style
2279 format string to be applied to the right argument, and returns the
2280 string resulting from this formatting operation.
2282 One question remains, of course: how do you convert values to strings?
2283 Luckily, Python has a way to convert any value to a string: pass it to
2284 the
\function{repr()
} function, or just write the value between
2285 reverse quotes (
\code{``
}). Some examples:
2290 >>> s = 'The value of x is ' + `x` + ', and y is ' + `y` + '...'
The value of x is 31.4, and y is 40000...
2293 >>> # Reverse quotes work on other types besides numbers:
2298 >>> # Converting a string adds string quotes and backslashes:
... hello = 'hello, world\n'
2300 >>> hellos = `hello`
2303 >>> # The argument of reverse quotes may be a tuple:
2304 ... `x, y, ('spam', 'eggs')`
"(31.4, 40000, ('spam', 'eggs'))"
2308 Here are two ways to write a table of squares and cubes:
>>> for x in range(1, 11):
...     print string.rjust(`x`, 2), string.rjust(`x*x`, 3),
...     # Note trailing comma on previous line
...     print string.rjust(`x*x*x`, 4)
>>> for x in range(1,11):
...     print '%2d %3d %4d' % (x, x*x, x*x*x)
2342 (Note that one space between each column was added by the way
2343 \keyword{print
} works: it always adds spaces between its arguments.)
2345 This example demonstrates the function
\function{string.rjust()
},
2346 which right-justifies a string in a field of a given width by padding
2347 it with spaces on the left. There are similar functions
2348 \function{string.ljust()
} and
\function{string.center()
}. These
functions do not write anything; they just return a new string. If
2350 the input string is too long, they don't truncate it, but return it
2351 unchanged; this will mess up your column lay-out but that's usually
2352 better than the alternative, which would be lying about a value. (If
2353 you really want truncation you can always add a slice operation, as in
2354 \samp{string.ljust(x,~n)
[0:n
]}.)
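The padding and truncation behaviour described above can be sketched
as follows. The sketch assumes an interpreter on which strings have
\method{rjust()} and \method{ljust()} methods; the \module{string}
module functions shown earlier behave the same way but take the string
as their first argument.

\begin{verbatim}
short = 'abc'
wide = 'a value that is too long'

padded = short.rjust(6)            # '   abc': spaces added on the left
unchanged = wide.rjust(6)          # too long: returned as is, not truncated
truncated = wide.ljust(6)[0:6]     # 'a valu': the slice forces the width
\end{verbatim}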
2356 There is another function,
\function{string.zfill()
}, which pads a
2357 numeric string on the left with zeros. It understands about plus and
>>> string.zfill('12', 5)
>>> string.zfill('-3.14', 7)
>>> string.zfill('3.14159265359', 5)
2369 Using the
\code{\%
} operator looks like this:
>>> print 'The value of PI is approximately %5.3f.' % math.pi
The value of PI is approximately 3.142.
2377 If there is more than one format in the string you pass a tuple as
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> for name, phone in table.items():
...     print '%-10s ==> %10d' % (name, phone)
2390 Most formats work exactly as in C and require that you pass the proper
2391 type; however, if you don't you get an exception, not a core dump.
2392 The
\code{\%s
} format is more relaxed: if the corresponding argument is
2393 not a string object, it is converted to string using the
2394 \function{str()
} built-in function. Using
\code{*
} to pass the width
2395 or precision in as a separate (integer) argument is supported. The
2396 C formats
\code{\%n
} and
\code{\%p
} are not supported.
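The rules in the preceding paragraph can be illustrated with a few
expressions. This is only a sketch; the variable names are chosen for
the example.

\begin{verbatim}
# A tuple supplies one value per format.
row = '%-10s ==> %10d' % ('Jack', 4098)

# %s accepts any object and converts it with str().
mixed = 'list: %s, number: %s' % ([1, 2], 7)

# '*' takes the field width from the argument tuple.
starred = '%*d' % (6, 42)          # '    42'

# Passing the wrong type raises an exception instead of crashing.
try:
    '%d' % 'not a number'
except TypeError:
    wrong_type = 1
else:
    wrong_type = 0
\end{verbatim}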
2398 If you have a really long format string that you don't want to split
2399 up, it would be nice if you could reference the variables to be
2400 formatted by name instead of by position. This can be done by using
2401 an extension of C formats using the form
\code{\%(name)format
}, e.g.
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
Jack: 4098; Sjoerd: 4127; Dcab: 8637678
2409 This is particularly useful in combination with the new built-in
2410 \function{vars()
} function, which returns a dictionary containing all
2413 \section{Reading and Writing Files
\label{files
}}
2416 \function{open()
}\bifuncindex{open
} returns a file
2417 object
\obindex{file
}, and is most commonly used with two arguments:
2418 \samp{open(
\var{filename
},
\var{mode
})
}.
2421 >>> f=open('/tmp/workfile', 'w')
2423 <open file '/tmp/workfile', mode 'w' at
80a0960>
2426 The first argument is a string containing the filename. The second
2427 argument is another string containing a few characters describing the
2428 way in which the file will be used.
\var{mode
} can be
\code{'r'
} when
2429 the file will only be read,
\code{'w'
} for only writing (an existing
2430 file with the same name will be erased), and
\code{'a'
} opens the file
2431 for appending; any data written to the file is automatically added to
2432 the end.
\code{'r+'
} opens the file for both reading and writing.
2433 The
\var{mode
} argument is optional;
\code{'r'
} will be assumed if
2436 On Windows and the Macintosh,
\code{'b'
} appended to the
2437 mode opens the file in binary mode, so there are also modes like
2438 \code{'rb'
},
\code{'wb'
}, and
\code{'r+b'
}. Windows makes a
2439 distinction between text and binary files; the end-of-line characters
2440 in text files are automatically altered slightly when data is read or
2441 written. This behind-the-scenes modification to file data is fine for
2442 \ASCII{} text files, but it'll corrupt binary data like that in JPEGs or
2443 \file{.EXE
} files. Be very careful to use binary mode when reading and
2444 writing such files. (Note that the precise semantics of text mode on
2445 the Macintosh depends on the underlying C library being used.)
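A minimal sketch of a binary round trip follows. The file name
\file{data.bin} is hypothetical, and the \code{b'...'} notation for the
data assumes an interpreter that accepts it; on others an ordinary
string literal would take its place.

\begin{verbatim}
import os, tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.bin')  # hypothetical name
payload = b'\x00\x01\r\n\xff'     # bytes that text mode could alter

f = open(path, 'wb')              # 'b': write the bytes untouched
f.write(payload)
f.close()

f = open(path, 'rb')              # read them back the same way
data = f.read()
f.close()
\end{verbatim}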
2447 \subsection{Methods of File Objects
\label{fileMethods
}}
2449 The rest of the examples in this section will assume that a file
2450 object called
\code{f
} has already been created.
2452 To read a file's contents, call
\code{f.read(
\var{size
})
}, which reads
2453 some quantity of data and returns it as a string.
\var{size
} is an
2454 optional numeric argument. When
\var{size
} is omitted or negative,
2455 the entire contents of the file will be read and returned; it's your
2456 problem if the file is twice as large as your machine's memory.
2457 Otherwise, at most
\var{size
} bytes are read and returned. If the end
2458 of the file has been reached,
\code{f.read()
} will return an empty
2459 string (
\code {""
}).
2462 'This is the entire file.
\012'
2467 \code{f.readline()
} reads a single line from the file; a newline
2468 character (
\code{\e n
}) is left at the end of the string, and is only
2469 omitted on the last line of the file if the file doesn't end in a
2470 newline. This makes the return value unambiguous; if
2471 \code{f.readline()
} returns an empty string, the end of the file has
2472 been reached, while a blank line is represented by
\code{'
\e n'
}, a
2473 string containing only a single newline.
2477 'This is the first line of the file.
\012'
2479 'Second line of the file
\012'
2484 \code{f.readlines()
} uses
\code{f.readline()
} repeatedly, and returns
2485 a list containing all the lines of data in the file.
2489 ['This is the first line of the file.
\012', 'Second line of the file
\012'
]
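The end-of-file convention of \code{f.readline()} leads to a simple
reading loop, sketched below with a scratch file whose name and
contents are made up for the example. The loop stops on the empty
string, while the blank line in the middle of the file arrives as a
string holding a single newline.

\begin{verbatim}
import os, tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('first line\n\nlast line without newline')
f.close()

lines = []
f = open(path, 'r')
while 1:
    line = f.readline()
    if line == '':                # empty string: end of file
        break
    lines.append(line)            # a blank line arrives as '\n'
f.close()

f = open(path, 'r')
same_lines = f.readlines()        # the whole file as a list of lines
f.close()
\end{verbatim}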
2492 \code{f.write(
\var{string
})
} writes the contents of
\var{string
} to
2493 the file, returning
\code{None
}.
2496 >>> f.write('This is a test
\n')
2499 \code{f.tell()
} returns an integer giving the file object's current
2500 position in the file, measured in bytes from the beginning of the
2501 file. To change the file object's position, use
2502 \samp{f.seek(
\var{offset
},
\var{from_what
})
}. The position is
2503 computed from adding
\var{offset
} to a reference point; the reference
2504 point is selected by the
\var{from_what
} argument. A
\var{from_what
}
2505 value of
0 measures from the beginning of the file,
1 uses the current
2506 file position, and
2 uses the end of the file as the reference point.
2507 \var{from_what
} can be omitted and defaults to
0, using the beginning
2508 of the file as the reference point.
>>> f=open('/tmp/workfile', 'r+')
>>> f.write('0123456789abcdef')
>>> f.seek(5)     # Go to the 6th byte in the file
>>> f.seek(-3, 2) # Go to the 3rd byte before the end
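The three reference points can be compared in one short sketch. It
uses a scratch file opened in binary mode (\code{'w+b'}), which is an
assumption made for the example so that offsets count bytes exactly.

\begin{verbatim}
import os, tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w+b')             # binary mode, reading and writing
f.write(b'0123456789abcdef')

f.seek(5)                         # offset 5 from the beginning
pos = f.tell()
at_five = f.read(1)               # the byte at offset 5

f.seek(-3, 2)                     # 3 bytes before the end of the file
near_end = f.read(1)

f.seek(2)
f.seek(4, 1)                      # 4 bytes forward from the current position
relative_pos = f.tell()
f.close()
\end{verbatim}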
2521 When you're done with a file, call
\code{f.close()
} to close it and
2522 free up any system resources taken up by the open file. After calling
2523 \code{f.close()
}, attempts to use the file object will automatically fail.
2528 Traceback (innermost last):
2529 File "<stdin>", line
1, in ?
2530 ValueError: I/O operation on closed file
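A program can test for this failure instead of letting it end the
program. The sketch below uses a scratch file, and the flag name
\code{failed_after_close} is invented for the example.

\begin{verbatim}
import os, tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('This is a test\n')
f.close()                         # release the file

try:
    f.write('more')               # the file object is closed now
except ValueError:
    failed_after_close = 1
else:
    failed_after_close = 0
\end{verbatim}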
2533 File objects have some additional methods, such as
\method{isatty()
}
2534 and
\method{truncate()
} which are less frequently used; consult the
2535 Library Reference for a complete guide to file objects.
2537 \subsection{The
\module{pickle
} Module
\label{pickle
}}
2538 \refstmodindex{pickle
}
2540 Strings can easily be written to and read from a file. Numbers take a
2541 bit more effort, since the
\method{read()
} method only returns
2542 strings, which will have to be passed to a function like
2543 \function{string.atoi()
}, which takes a string like
\code{'
123'
} and
2544 returns its numeric value
123. However, when you want to save more
2545 complex data types like lists, dictionaries, or class instances,
2546 things get a lot more complicated.
2548 Rather than have users be constantly writing and debugging code to
2549 save complicated data types, Python provides a standard module called
2550 \module{pickle
}. This is an amazing module that can take almost
2551 any Python object (even some forms of Python code!), and convert it to
2552 a string representation; this process is called
\dfn{pickling
}.
2553 Reconstructing the object from the string representation is called
2554 \dfn{unpickling
}. Between pickling and unpickling, the string
2555 representing the object may have been stored in a file or data, or
2556 sent over a network connection to some distant machine.
2558 If you have an object
\code{x
}, and a file object
\code{f
} that's been
2559 opened for writing, the simplest way to pickle the object takes only
2566 To unpickle the object again, if
\code{f
} is a file object which has
2567 been opened for reading:
2573 (There are other variants of this, used when pickling many objects or
2574 when you don't want to write the pickled data to a file; consult the
2575 complete documentation for
\module{pickle
} in the Library Reference.)
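A complete round trip might look like the following sketch. The file
name and the sample object are made up, and the files are opened in
binary mode, which is an assumption made here because some
interpreters require it for pickled data. The last two lines show one
of the variants mentioned above, which pickles to a string instead of
a file.

\begin{verbatim}
import os, pickle, tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.pickle')  # hypothetical name
x = {'name': 'spam', 'sizes': [1, 2, 3], 'pair': (31.4, 40000)}

f = open(path, 'wb')              # binary mode: an assumption, see above
pickle.dump(x, f)                 # pickle x and write it to f
f.close()

f = open(path, 'rb')
y = pickle.load(f)                # rebuild an equal object from the file
f.close()

s = pickle.dumps(x)               # variant: pickle without a file
z = pickle.loads(s)
\end{verbatim}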
2577 \module{pickle
} is the standard way to make Python objects which can be
2578 stored and reused by other programs or by a future invocation of the
2579 same program; the technical term for this is a
\dfn{persistent
}
2580 object. Because
\module{pickle
} is so widely used, many authors who
2581 write Python extensions take care to ensure that new data types such
2582 as matrices can be properly pickled and unpickled.
2586 \chapter{Errors and Exceptions
\label{errors
}}
2588 Until now error messages haven't been more than mentioned, but if you
2589 have tried out the examples you have probably seen some. There are
2590 (at least) two distinguishable kinds of errors:
\emph{syntax errors
}
2591 and
\emph{exceptions
}.
2593 \section{Syntax Errors
\label{syntaxErrors
}}
2595 Syntax errors, also known as parsing errors, are perhaps the most common
2596 kind of complaint you get while you are still learning Python:
>>> while 1 print 'Hello world'
  File "<stdin>", line 1
    while 1 print 'Hello world'
2603 SyntaxError: invalid syntax
2606 The parser repeats the offending line and displays a little `arrow'
2607 pointing at the earliest point in the line where the error was detected.
2608 The error is caused by (or at least detected at) the token
2610 the arrow: in the example, the error is detected at the keyword
2611 \keyword{print
}, since a colon (
\character{:
}) is missing before it.
2612 File name and line number are printed so you know where to look in case
2613 the input came from a script.
2615 \section{Exceptions
\label{exceptions
}}
2617 Even if a statement or expression is syntactically correct, it may
2618 cause an error when an attempt is made to execute it.
2619 Errors detected during execution are called
\emph{exceptions
} and are
2620 not unconditionally fatal: you will soon learn how to handle them in
2621 Python programs. Most exceptions are not handled by programs,
2622 however, and result in error messages as shown here:
2626 Traceback (innermost last):
2627 File "<stdin>", line
1
2628 ZeroDivisionError: integer division or modulo
2630 Traceback (innermost last):
2631 File "<stdin>", line
1
2634 Traceback (innermost last):
2635 File "<stdin>", line
1
2636 TypeError: illegal argument type for built-in operation
2639 The last line of the error message indicates what happened.
2640 Exceptions come in different types, and the type is printed as part of
2641 the message: the types in the example are
2642 \exception{ZeroDivisionError
},
2643 \exception{NameError
}
2645 \exception{TypeError
}.
The string printed as the exception type is the name of the built-in
exception that occurred. This is true for all built-in
2648 exceptions, but need not be true for user-defined exceptions (although
2649 it is a useful convention).
2650 Standard exception names are built-in identifiers (not reserved
The rest of the line is a detail whose interpretation depends on the
exception type.
The preceding part of the error message shows the context where the
exception happened, in the form of a stack backtrace.
In general it lists source lines; however,
it will not display lines read from standard input.
2661 The Library Reference lists the built-in exceptions and their
2664 \section{Handling Exceptions
\label{handling
}}
2666 It is possible to write programs that handle selected exceptions.
2667 Look at the following example, which prints a table of inverses of
2668 some floating point numbers:
>>> numbers = [0.3333, 2.5, 0, 10]
2672 >>> for x in numbers:
2676 ... except ZeroDivisionError:
2677 ... print '*** has no inverse ***'
2681 0 *** has no inverse ***
2685 The
\keyword{try
} statement works as follows.
2688 First, the
\emph{try clause
}
2689 (the statement(s) between the
\keyword{try
} and
\keyword{except
}
2690 keywords) is executed.
2692 If no exception occurs, the
2693 \emph{except\ clause
}
2694 is skipped and execution of the
\keyword{try
} statement is finished.
If an exception occurs during execution of the try clause,
the rest of the clause is skipped. Then if its type matches the
exception named after the \keyword{except} keyword, the except clause
is executed, and then execution continues after the \keyword{try} statement.
2702 If an exception occurs which does not match the exception named in the
2703 except clause, it is passed on to outer
\keyword{try
} statements; if
2704 no handler is found, it is an
\emph{unhandled exception
}
2705 and execution stops with a message as shown above.
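The steps above can be sketched as a small function that collects its
output lines in a list instead of printing them. The function name
\code{inverse_table} and the message format are chosen for this
example.

\begin{verbatim}
def inverse_table(numbers):
    # Return one output line per number; zero gets a marker instead.
    rows = []
    for x in numbers:
        try:
            # If the division fails, the rest of this clause is skipped.
            rows.append('%s %s' % (x, 1.0 / x))
        except ZeroDivisionError:
            rows.append('%s *** has no inverse ***' % x)
    return rows

rows = inverse_table([2.5, 0, 10])
\end{verbatim}

Execution continues with the next number after the handler has run, so
the list still gets one entry per input value.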
2707 A
\keyword{try
} statement may have more than one except clause, to
2708 specify handlers for different exceptions.
2709 At most one handler will be executed.
2710 Handlers only handle exceptions that occur in the corresponding try
2711 clause, not in other handlers of the same
\keyword{try
} statement.
2712 An except clause may name multiple exceptions as a parenthesized list,
2716 ... except (RuntimeError, TypeError, NameError):
2720 The last except clause may omit the exception name(s), to serve as a
2722 Use this with extreme caution, since it is easy to mask a real
2723 programming error in this way!
2725 The
\keyword{try
} \ldots\
\keyword{except
} statement has an optional
2726 \emph{else clause
}, which must follow all except clauses. It is
2727 useful to place code that must be executed if the try clause does not
2728 raise an exception. For example:
for arg in sys.argv[1:]:
2735 print 'cannot open', arg
2737 print arg, 'has', len(f.readlines()), 'lines'
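The same pattern can be written as a function that returns its
messages, which makes it easy to try on a file that exists and on one
that does not. The function name \code{count_lines} and the scratch
file names are invented for this sketch.

\begin{verbatim}
import os, tempfile

def count_lines(names):
    # Return one message per file name.
    messages = []
    for arg in names:
        try:
            f = open(arg, 'r')
        except IOError:
            messages.append('cannot open ' + arg)
        else:
            # Reached only when open() raised no exception.
            messages.append('%s has %d lines' % (arg, len(f.readlines())))
            f.close()
    return messages

good = os.path.join(tempfile.mkdtemp(), 'two_lines.txt')
f = open(good, 'w')
f.write('one\ntwo\n')
f.close()
missing = os.path.join(tempfile.mkdtemp(), 'no_such_file')

messages = count_lines([good, missing])
\end{verbatim}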
2742 When an exception occurs, it may have an associated value, also known as
the exception's
\emph{argument
}.
2744 The presence and type of the argument depend on the exception type.
2745 For exception types which have an argument, the except clause may
2746 specify a variable after the exception name (or list) to receive the
2747 argument's value, as follows:
2752 ... except NameError, x:
2753 ... print 'name', x, 'undefined'
2758 If an exception has an argument, it is printed as the last part
2759 (`detail') of the message for unhandled exceptions.
2761 Exception handlers don't just handle exceptions if they occur
2762 immediately in the try clause, but also if they occur inside functions
2763 that are called (even indirectly) in the try clause.
2767 >>> def this_fails():
2772 ... except ZeroDivisionError, detail:
2773 ... print 'Handling run-time error:', detail
2775 Handling run-time error: integer division or modulo
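The sketch below adds one more level of calls to show that the
exception passes through intermediate functions untouched. It is
written with the \code{as} spelling, which some interpreters use in
place of the comma shown above; that choice, and the function name
\code{call_indirectly}, are assumptions of this example.

\begin{verbatim}
def this_fails():
    x = 1 / 0                     # raises ZeroDivisionError

def call_indirectly():
    this_fails()                  # the exception passes through here

try:
    call_indirectly()
except ZeroDivisionError as detail:
    caught = str(detail)          # the argument, converted to a string
\end{verbatim}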
2779 \section{Raising Exceptions
\label{raising
}}
2781 The
\keyword{raise
} statement allows the programmer to force a
2782 specified exception to occur.
2786 >>> raise NameError, 'HiThere'
2787 Traceback (innermost last):
2788 File "<stdin>", line
1
2792 The first argument to
\keyword{raise
} names the exception to be
2793 raised. The optional second argument specifies the exception's
2797 \section{User-defined Exceptions
\label{userExceptions
}}
2799 Programs may name their own exceptions by assigning a string to a
2804 >>> my_exc = 'my_exc'
...     raise my_exc, 2*2
2807 ... except my_exc, val:
2808 ... print 'My exception occurred, value:', val
2810 My exception occurred, value:
4
2812 Traceback (innermost last):
2813 File "<stdin>", line
1
2817 Many standard modules use this to
report errors that may occur in
2818 functions they define.
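On interpreters where exceptions are classes rather than strings, the
same pattern can be sketched with a small class derived from
\exception{Exception}. This is an assumption about the Python version
in use, not the mechanism described above, and the name
\code{MyError} is invented for the example.

\begin{verbatim}
class MyError(Exception):
    pass                          # no behaviour beyond the base class

try:
    raise MyError(2 * 2)
except MyError as val:
    value = val.args[0]           # the argument given when raising
\end{verbatim}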
2821 \section{Defining Clean-up Actions
\label{cleanup
}}
2823 The
\keyword{try
} statement has another optional clause which is
2824 intended to define clean-up actions that must be executed under all
2825 circumstances. For example:
2829 ... raise KeyboardInterrupt
2831 ... print 'Goodbye, world!'
2834 Traceback (innermost last):
2835 File "<stdin>", line
2
2839 A
\emph{finally clause
} is executed whether or not an exception has
2840 occurred in the try clause. When an exception has occurred, it is
2841 re-raised after the finally clause is executed. The finally clause is
2842 also executed ``on the way out'' when the
\keyword{try
} statement is
2843 left via a
\keyword{break
} or
\keyword{return
} statement.
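Both situations can be observed in a short sketch: a finally clause
that runs when the function returns early, and one that runs before an
exception continues outward. The function and the \code{log} list are
invented for this example.

\begin{verbatim}
log = []

def find_first_negative(numbers):
    try:
        for n in numbers:
            if n < 0:
                return n          # leaves the try statement early
        return None
    finally:
        log.append('cleanup')     # still runs on the way out

result = find_first_negative([3, -1, 4])

try:
    try:
        raise KeyboardInterrupt
    finally:
        log.append('goodbye')     # runs first, then the exception goes on
except KeyboardInterrupt:
    log.append('re-raised')
\end{verbatim}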
2845 A
\keyword{try
} statement must either have one or more except clauses
2846 or one finally clause, but not both.
2848 \chapter{Classes
\label{classes
}}
2850 Python's class mechanism adds classes to the language with a minimum
2851 of new syntax and semantics. It is a mixture of the class mechanisms
2852 found in
\Cpp{} and Modula-
3. As is true for modules, classes in Python
2853 do not put an absolute barrier between definition and user, but rather
2854 rely on the politeness of the user not to ``break into the
2855 definition.'' The most important features of classes are retained
2856 with full power, however: the class inheritance mechanism allows
2857 multiple base classes, a derived class can override any methods of its
base class or classes, and a method can call the method of a base class with the
2859 same name. Objects can contain an arbitrary amount of private data.

In \Cpp{} terminology, all class members (including the data members) are
\emph{public}, and all member functions are \emph{virtual}.  There are
no special constructors or destructors.  As in Modula-3, there are no
shorthands for referencing the object's members from its methods: the
method function is declared with an explicit first argument
representing the object, which is provided implicitly by the call.  As
in Smalltalk, classes themselves are objects, albeit in the wider
sense of the word: in Python, all data types are objects.  This
provides semantics for importing and renaming.  But, just like in \Cpp{}
or Modula-3, built-in types cannot be used as base classes for
extension by the user.  Also, like in \Cpp{} but unlike in Modula-3, most
built-in operators with special syntax (arithmetic operators,
subscripting etc.) can be redefined for class instances.

\section{A Word About Terminology \label{terminology}}

Lacking universally accepted terminology to talk about classes, I will
make occasional use of Smalltalk and \Cpp{} terms.  (I would use Modula-3
terms, since its object-oriented semantics are closer to those of
Python than \Cpp{}, but I expect that few readers have heard of it.)

I also have to warn you that there's a terminological pitfall for
object-oriented readers: the word ``object'' in Python does not
necessarily mean a class instance.  Like \Cpp{} and Modula-3, and
unlike Smalltalk, not all types in Python are classes: the basic
built-in types like integers and lists are not, and even somewhat more
exotic types like files aren't.  However, \emph{all} Python types
share a little bit of common semantics that is best described by using
the word object.

Objects have individuality, and multiple names (in multiple scopes)
can be bound to the same object.  This is known as aliasing in other
languages.  This is usually not appreciated on a first glance at
Python, and can be safely ignored when dealing with immutable basic
types (numbers, strings, tuples).  However, aliasing has an
(intended!) effect on the semantics of Python code involving mutable
objects such as lists, dictionaries, and most types representing
entities outside the program (files, windows, etc.).  This is usually
used to the benefit of the program, since aliases behave like pointers
in some respects.  For example, passing an object is cheap since only
a pointer is passed by the implementation; and if a function modifies
an object passed as an argument, the caller will see the change --- this
obviates the need for two different argument passing mechanisms as in
Pascal.
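
The effect of aliasing on mutable objects can be seen in a short
sketch.  The names \code{a}, \code{b} and \code{add_item} and all the
values are illustrative assumptions, and the code is written for a
current Python 3 interpreter:

```python
# Two names bound to the same list object: an alias, not a copy.
a = [1, 2, 3]
b = a                  # b is another name for the same object
b.append(4)            # the change is visible through both names

def add_item(seq, item):
    """Modify the list passed in; the caller sees the change."""
    seq.append(item)

add_item(a, 5)
same_object = a is b   # True: still one object with two names

# Immutable objects cannot be changed in place, so aliasing is harmless.
s = "spam"
t = s
t = t + "!"            # rebinds t to a new string; s is untouched
```

Only a reference to the list is handed to \code{add_item()}, which is
why the caller's list grows without any copying.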

\section{Python Scopes and Name Spaces \label{scopes}}

Before introducing classes, I first have to tell you something about
Python's scope rules.  Class definitions play some neat tricks with
name spaces, and you need to know how scopes and name spaces work to
fully understand what's going on.  Incidentally, knowledge about this
subject is useful for any advanced Python programmer.

Let's begin with some definitions.

A \emph{name space} is a mapping from names to objects.  Most name
spaces are currently implemented as Python dictionaries, but that's
normally not noticeable in any way (except for performance), and it
may change in the future.  Examples of name spaces are: the set of
built-in names (functions such as \function{abs()}, and built-in exception
names); the global names in a module; and the local names in a
function invocation.  In a sense the set of attributes of an object
also forms a name space.  The important thing to know about name
spaces is that there is absolutely no relation between names in
different name spaces; for instance, two different modules may both
define a function ``maximize'' without confusion --- users of the
modules must prefix it with the module name.
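
As a concrete case, two standard library modules happen to define a
function with the same name.  This sketch uses \module{math} and
\module{cmath} in place of the hypothetical ``maximize'' modules; the
variable names are assumptions made for illustration:

```python
import math
import cmath

# Both modules define a function named sqrt; the module prefix selects one.
real_root = math.sqrt(4.0)        # the real-valued square root
complex_root = cmath.sqrt(-4.0)   # the complex-valued square root

# The two names are unrelated objects living in separate name spaces.
same_function = math.sqrt is cmath.sqrt
```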

By the way, I use the word \emph{attribute} for any name following a
dot --- for example, in the expression \code{z.real}, \code{real} is
an attribute of the object \code{z}.  Strictly speaking, references to
names in modules are attribute references: in the expression
\code{modname.funcname}, \code{modname} is a module object and
\code{funcname} is an attribute of it.  In this case there happens to
be a straightforward mapping between the module's attributes and the
global names defined in the module: they share the same name
space!\footnote{
        Except for one thing.  Module objects have a secret read-only
        attribute called \code{__dict__} which returns the dictionary
        used to implement the module's name space; the name
        \code{__dict__} is an attribute but not a global name.
        Obviously, using this violates the abstraction of name space
        implementation, and should be restricted to things like
        post-mortem debuggers.
}

Attributes may be read-only or writable.  In the latter case,
assignment to attributes is possible.  Module attributes are writable:
you can write \samp{modname.the_answer = 42}.  Writable attributes may
also be deleted with the \keyword{del} statement, e.g.
\samp{del modname.the_answer}.
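
A minimal sketch of writing and deleting a module attribute follows.
Since \code{modname} is only a placeholder, the sketch builds an empty
module object with \code{types.ModuleType} as a stand-in for an
imported module; that stand-in and the helper variable names are
assumptions:

```python
import types

modname = types.ModuleType("modname")   # stand-in for an imported module

modname.the_answer = 42                 # module attributes are writable
had_answer = hasattr(modname, "the_answer")

# The attribute lives in the dictionary that implements the name space.
answer_in_dict = modname.__dict__["the_answer"]

del modname.the_answer                  # ...and writable attributes can be deleted
has_answer_after_del = hasattr(modname, "the_answer")
```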

Name spaces are created at different moments and have different
lifetimes.  The name space containing the built-in names is created
when the Python interpreter starts up, and is never deleted.  The
global name space for a module is created when the module definition
is read in; normally, module name spaces also last until the
interpreter quits.  The statements executed by the top-level
invocation of the interpreter, either read from a script file or
interactively, are considered part of a module called
\module{__main__}, so they have their own global name space.  (The
built-in names actually also live in a module; this is called
\module{__builtin__}.)

The local name space for a function is created when the function is
called, and deleted when the function returns or raises an exception
that is not handled within the function.  (Actually, forgetting would
be a better way to describe what happens.)  Of course,
recursive invocations each have their own local name space.

A \emph{scope} is a textual region of a Python program where a name space
is directly accessible.  ``Directly accessible'' here means that an
unqualified reference to a name attempts to find the name in the name
space.

Although scopes are determined statically, they are used dynamically.
At any time during execution, exactly three nested scopes are in use
(i.e., exactly three name spaces are directly accessible): the
innermost scope, which is searched first, contains the local names;
the middle scope, searched next, contains the current module's global
names; and the outermost scope (searched last) is the name space
containing built-in names.
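
The search order can be watched in a small sketch.  The names
\code{greeting}, \code{lookup} and \code{lookup_global} are
hypothetical, and the code assumes a current Python 3 interpreter
(later Python versions additionally search the scopes of enclosing
functions, which this example does not use):

```python
greeting = "global"            # lives in the module's global name space

def lookup():
    greeting = "local"         # a local name hides the global one
    # len is not local or global, so it is found last, among the built-ins
    return greeting, len("abc")

def lookup_global():
    return greeting            # no local binding, so the global is used

local_view = lookup()
global_view = lookup_global()
```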

Usually, the local scope references the local names of the (textually)
current function.  Outside of functions, the local scope references
the same name space as the global scope: the module's name space.
Class definitions place yet another name space in the local scope.

It is important to realize that scopes are determined textually: the
global scope of a function defined in a module is that module's name
space, no matter from where or by what alias the function is called.
On the other hand, the actual search for names is done dynamically, at
run time --- however, the language definition is evolving towards
static name resolution, at ``compile'' time, so don't rely on dynamic
name resolution!  (In fact, local variables are already determined
statically.)

A special quirk of Python is that assignments always go into the
innermost scope.  Assignments do not copy data --- they just
bind names to objects.  The same is true for deletions: the statement
\samp{del x} removes the binding of \code{x} from the name space
referenced by the local scope.  In fact, all operations that introduce
new names use the local scope: in particular, import statements and
function definitions bind the module or function name in the local
scope.  (The \keyword{global} statement can be used to indicate that
particular variables live in the global scope.)
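
The difference between a plain assignment and one preceded by a
\keyword{global} statement can be sketched as follows; the names
\code{counter}, \code{bump_local} and \code{bump_global} are
assumptions chosen for the illustration:

```python
counter = 0

def bump_local():
    counter = 99        # binds a new local name; the global is untouched
    return counter

def bump_global():
    global counter      # assignments below now target the module's name space
    counter = counter + 1
    return counter

local_result = bump_local()
after_local = counter       # the global still has its old value
global_result = bump_global()
after_global = counter      # the global has been rebound
```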

\section{A First Look at Classes \label{firstClasses}}

Classes introduce a little bit of new syntax, three new object types,
and some new semantics.

\subsection{Class Definition Syntax \label{classDefinition}}

The simplest form of class definition looks like this:
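
A minimal runnable sketch of that form, for a current Python 3
interpreter: the name \class{ClassName} matches the one used in the
text, while the attribute \code{limit} and the method
\code{describe()} are illustrative assumptions standing in for
arbitrary statements:

```python
class ClassName:
    "An optional documentation string"
    limit = 10                      # an assignment: becomes a class attribute

    def describe(self):             # a function definition: becomes a method
        return "limit is %d" % self.limit

# After the definition has been executed, the name ClassName is bound
# to the new class object.
description = ClassName().describe()
```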

Class definitions, like function definitions (\keyword{def}
statements) must be executed before they have any effect.  (You could
conceivably place a class definition in a branch of an \keyword{if}
statement, or inside a function.)

In practice, the statements inside a class definition will usually be
function definitions, but other statements are allowed, and sometimes
useful --- we'll come back to this later.  The function definitions
inside a class normally have a peculiar form of argument list,
dictated by the calling conventions for methods --- again, this is
explained later.

When a class definition is entered, a new name space is created, and
used as the local scope --- thus, all assignments to local variables
go into this new name space.  In particular, function definitions bind
the name of the new function here.

When a class definition is left normally (via the end), a \emph{class
object} is created.  This is basically a wrapper around the contents
of the name space created by the class definition; we'll learn more
about class objects in the next section.  The original local scope
(the one in effect just before the class definition was entered) is
reinstated, and the class object is bound here to the class name given
in the class definition header (\class{ClassName} in the example).

\subsection{Class Objects \label{classObjects}}

Class objects support two kinds of operations: attribute references
and instantiation.

\emph{Attribute references} use the standard syntax used for all
attribute references in Python: \code{obj.name}.  Valid attribute
names are all the names that were in the class's name space when the
class object was created.  So, if the class definition looked like
this:

\begin{verbatim}
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'
\end{verbatim}
then \code{MyClass.i} and \code{MyClass.f} are valid attribute
references, returning an integer and a function object, respectively.
Class attributes can also be assigned to, so you can change the value
of \code{MyClass.i} by assignment.  \code{__doc__} is also a valid
attribute that's read-only, returning the docstring belonging to
the class: \code{"A simple example class"}.
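
These attribute references can be tried out directly.  The sketch
below defines a class along the lines described above so that it runs
on its own in a current Python 3 interpreter; the integer values and
the helper variable names are assumptions:

```python
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'

value = MyClass.i            # attribute reference yielding the integer
func = MyClass.f             # attribute reference yielding a function object
MyClass.i = 54321            # class attributes can be assigned to
changed = MyClass.i
doc = MyClass.__doc__        # the docstring of the class
```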

Class \emph{instantiation} uses function notation.  Just pretend that
the class object is a parameterless function that returns a new
instance of the class.  For example (assuming the above class), the
statement \samp{x = MyClass()} creates a new \emph{instance} of the
class and assigns this object to the local variable \code{x}.

\subsection{Instance Objects \label{instanceObjects}}

Now what can we do with instance objects?  The only operations
understood by instance objects are attribute references.  There are
two kinds of valid attribute names.

The first I'll call \emph{data attributes}.  These correspond to
``instance variables'' in Smalltalk, and to ``data members'' in
\Cpp{}.  Data attributes need not be declared; like local variables,
they spring into existence when they are first assigned to.  For
example, if \code{x} is the instance of \class{MyClass} created above,
the following piece of code will print the value \code{16}, without
leaving a trace:

\begin{verbatim}
x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print x.counter
del x.counter
\end{verbatim}

The second kind of attribute references understood by instance objects
are \emph{methods}.  A method is a function that ``belongs to'' an
object.  (In Python, the term method is not unique to class instances:
other object types can have methods as well, e.g., list objects have
methods called append, insert, remove, sort, and so on.  However,
below, we'll use the term method exclusively to mean methods of class
instance objects, unless explicitly stated otherwise.)

Valid method names of an instance object depend on its class.  By
definition, all attributes of a class that are (user-defined) function
objects define corresponding methods of its instances.  So in our
example, \code{x.f} is a valid method reference, since
\code{MyClass.f} is a function, but \code{x.i} is not, since
\code{MyClass.i} is not.  But \code{x.f} is not the same thing as
\code{MyClass.f} --- it is a \emph{method object}, not a function
object.

\subsection{Method Objects \label{methodObjects}}

Usually, a method is called immediately, e.g., \code{x.f()}.
In our example, this will return the string \code{'hello world'}.
However, it is not necessary to call a method right away:
\code{x.f} is a method object, and can be stored away and called at a
later time.  For example, after the assignment \samp{xf = x.f}, a loop
that repeatedly executes \samp{print xf()} will continue to print
\samp{hello world} until the end of time.

What exactly happens when a method is called?  You may have noticed
that \code{x.f()} was called without an argument above, even though
the function definition for \method{f} specified an argument.  What
happened to the argument?  Surely Python raises an exception when a
function that requires an argument is called without any --- even if
the argument isn't actually used...

Actually, you may have guessed the answer: the special thing about
methods is that the object is passed as the first argument of the
function.  In our example, the call \code{x.f()} is exactly equivalent
to \code{MyClass.f(x)}.  In general, calling a method with a list of
\var{n} arguments is equivalent to calling the corresponding function
with an argument list that is created by inserting the method's object
before the first argument.

If you still don't understand how methods work, a look at the
implementation can perhaps clarify matters.  When an instance
attribute is referenced that isn't a data attribute, its class is
searched.  If the name denotes a valid class attribute that is a
function object, a method object is created by packing (pointers to)
the instance object and the function object just found together in an
abstract object: this is the method object.  When the method object is
called with an argument list, it is unpacked again, a new argument
list is constructed from the instance object and the original argument
list, and the function object is called with this new argument list.
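
The packing and unpacking just described can be observed from Python
itself.  This sketch assumes a current Python 3 interpreter, where the
two packed parts are exposed as \code{__self__} and \code{__func__};
the class repeats the running example so that the code is
self-contained, and the variable names are assumptions:

```python
class MyClass:
    "A simple example class"
    def f(self):
        return 'hello world'

x = MyClass()
direct = x.f()               # the instance is passed implicitly
explicit = MyClass.f(x)      # equivalent call with the instance spelled out

xf = x.f                     # a method object can be stored away...
later = xf()                 # ...and called at a later time

packed_instance = xf.__self__    # the instance packed into the method object
packed_function = xf.__func__    # the function object found in the class
```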

\section{Random Remarks \label{remarks}}

[These should perhaps be placed more carefully...]

Data attributes override method attributes with the same name; to
avoid accidental name conflicts, which may cause hard-to-find bugs in
large programs, it is wise to use some kind of convention that
minimizes the chance of conflicts, e.g., capitalize method names,
prefix data attribute names with a small unique string (perhaps just
an underscore), or use verbs for methods and nouns for data attributes.
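
A short sketch of such a conflict, using a hypothetical
\class{Account} class with a method named \code{balance} (all names
and values here are assumptions made for illustration):

```python
class Account:
    def balance(self):          # method attribute named 'balance'
        return 100

acct = Account()
before = acct.balance()         # the method is found through the class

acct.balance = 250              # data attribute with the same name
after = acct.balance            # the data attribute now hides the method
still_callable = callable(acct.balance)
```

After the assignment, \code{acct.balance()} would fail because an
integer is not callable, which is the kind of hard-to-find bug a
naming convention helps to avoid.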

Data attributes may be referenced by methods as well as by ordinary
users (``clients'') of an object.  In other words, classes are not
usable to implement pure abstract data types.  In fact, nothing in
Python makes it possible to enforce data hiding --- it is all based
upon convention.  (On the other hand, the Python implementation,
written in C, can completely hide implementation details and control
access to an object if necessary; this can be used by extensions to
Python written in C.)

Clients should use data attributes with care --- clients may mess up
invariants maintained by the methods by stamping on their data
attributes.  Note that clients may add data attributes of their own to
an instance object without affecting the validity of the methods, as
long as name conflicts are avoided --- again, a naming convention can
save a lot of headaches here.

There is no shorthand for referencing data attributes (or other
methods!) from within methods.  I find that this actually increases
the readability of methods: there is no chance of confusing local
variables and instance variables when glancing through a method.

Conventionally, the first argument of methods is often called
\code{self}.  This is nothing more than a convention: the name
\code{self} has absolutely no special meaning to Python.  (Note,
however, that by not following the convention your code may be less
readable by other Python programmers, and it is also conceivable that
a \emph{class browser} program be written which relies upon such a
convention.)

Any function object that is a class attribute defines a method for
instances of that class.  It is not necessary that the function
definition be textually enclosed in the class definition: assigning a
function object to a local variable in the class is also ok.  For
example:

\begin{verbatim}
# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1
    def g(self):
        return 'hello world'
    h = g
\end{verbatim}

Now \code{f}, \code{g} and \code{h} are all attributes of class
\class{C} that refer to function objects, and consequently they are all
methods of instances of \class{C} --- \code{h} being exactly equivalent
to \code{g}.  Note that this practice usually only serves to confuse
the reader of a program.
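
Methods may call other methods by using method attributes of the
\code{self} argument, e.g.:

\begin{verbatim}
class Bag:
    def empty(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)
\end{verbatim}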

The instantiation operation (``calling'' a class object) creates an
empty object.  Many classes like to create objects in a known initial
state.  Therefore a class may define a special method named
\method{__init__()}, which takes the new instance as its first
argument and puts it into that initial state.

When a class defines an \method{__init__()} method, class
instantiation automatically invokes \method{__init__()} for the
newly-created class instance.  So in the \class{Bag} example, a new
and initialized instance can be obtained by \samp{x = Bag()}.

Of course, the \method{__init__()} method may have arguments for
greater flexibility.  In that case, arguments given to the class
instantiation operator are passed on to \method{__init__()}.  For
example,

\begin{verbatim}
>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
... 
>>> x = Complex(3.0,-4.5)
>>> x.r, x.i
(3.0, -4.5)
\end{verbatim}

Methods may reference global names in the same way as ordinary
functions.  The global scope associated with a method is the module
containing the class definition.  (The class itself is never used as a
global scope!)  While one rarely encounters a good reason for using
global data in a method, there are many legitimate uses of the global
scope: for one thing, functions and modules imported into the global
scope can be used by methods, as well as functions and classes defined
in it.  Usually, the class containing the method is itself defined in
this global scope, and in the next section we'll find some good
reasons why a method would want to reference its own class!

\section{Inheritance \label{inheritance}}

Of course, a language feature would not be worthy of the name ``class''
without supporting inheritance.  The syntax for a derived class
definition looks as follows:

\begin{verbatim}
class DerivedClassName(BaseClassName):
    <statement-1>
    .
    .
    .
    <statement-N>
\end{verbatim}

The name \class{BaseClassName} must be defined in a scope containing
the derived class definition.  Instead of a base class name, an
expression is also allowed.  This is useful when the base class is
defined in another module, e.g.,

\begin{verbatim}
class DerivedClassName(modname.BaseClassName):
\end{verbatim}

Execution of a derived class definition proceeds the same as for a
base class.  When the class object is constructed, the base class is
remembered.  This is used for resolving attribute references: if a
requested attribute is not found in the class, it is searched in the
base class.  This rule is applied recursively if the base class itself
is derived from some other class.

There's nothing special about instantiation of derived classes:
\code{DerivedClassName()} creates a new instance of the class.  Method
references are resolved as follows: the corresponding class attribute
is searched, descending the chain of base classes if necessary,
and the method reference is valid if this yields a function object.

Derived classes may override methods of their base classes.  Because
methods have no special privileges when calling other methods of the
same object, a method of a base class that calls another method
defined in the same base class, may in fact end up calling a method of
a derived class that overrides it.  (For \Cpp{} programmers: all methods
in Python are ``virtual functions''.)

An overriding method in a derived class may in fact want to extend
rather than simply replace the base class method of the same name.
There is a simple way to call the base class method directly: just
call \samp{BaseClassName.methodname(self, arguments)}.  This is
occasionally useful to clients as well.  (Note that this only works if
the base class is defined or imported directly in the global scope.)
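
Both points, overriding and extending, fit into one small sketch.
The classes \class{Base} and \class{Derived} and their methods are
hypothetical names chosen for the illustration:

```python
class Base:
    def greet(self):
        # Calls whichever name() applies to the actual object ("virtual").
        return "hello from " + self.name()
    def name(self):
        return "Base"

class Derived(Base):
    def name(self):                  # overrides Base.name
        return "Derived"
    def greet(self):                 # extends rather than replaces Base.greet
        return Base.greet(self) + "!"

base_greeting = Base().greet()
derived_greeting = Derived().greet()
```

Note how \code{Base.greet()} ends up calling \code{Derived.name()}
when it is run on a \class{Derived} instance.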

\subsection{Multiple Inheritance \label{multiple}}

Python supports a limited form of multiple inheritance as well.  A
class definition with multiple base classes looks as follows:

\begin{verbatim}
class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>
\end{verbatim}

The only rule necessary to explain the semantics is the resolution
rule used for class attribute references.  This is depth-first,
left-to-right.  Thus, if an attribute is not found in
\class{DerivedClassName}, it is searched in \class{Base1}, then
(recursively) in the base classes of \class{Base1}, and only if it is
not found there, it is searched in \class{Base2}, and so on.
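
The rule can be checked on a small hierarchy.  The class names below
are assumptions.  The sketch runs on a current Python 3 interpreter,
which computes the search order with the C3 linearization; for a
hierarchy without shared ancestors, like this one, that order
coincides with depth-first, left-to-right, but it can differ when two
base classes have a common base class:

```python
class Root1:
    def who(self):
        return "Root1"

class Base1(Root1):          # inherits who() from Root1
    pass

class Base2:
    def who(self):
        return "Base2"

class Derived(Base1, Base2):
    pass

# Base1 is searched first, then Base1's own base classes, then Base2.
found = Derived().who()
order = [cls.__name__ for cls in Derived.__mro__]
```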

(To some people breadth first --- searching \class{Base2} and
\class{Base3} before the base classes of \class{Base1} --- looks more
natural.  However, this would require you to know whether a particular
attribute of \class{Base1} is actually defined in \class{Base1} or in
one of its base classes before you can figure out the consequences of
a name conflict with an attribute of \class{Base2}.  The depth-first
rule makes no difference between direct and inherited attributes of
\class{Base1}.)

It is clear that indiscriminate use of multiple inheritance is a
maintenance nightmare, given the reliance in Python on conventions to
avoid accidental name conflicts.  A well-known problem with multiple
inheritance is a class derived from two classes that happen to have a
common base class.  While it is easy enough to figure out what happens
in this case (the instance will have a single copy of ``instance
variables'' or data attributes used by the common base class), it is
not clear that these semantics are in any way useful.

\section{Private Variables \label{private}}

There is limited support for class-private
identifiers.  Any identifier of the form \code{__spam} (at least two
leading underscores, at most one trailing underscore) is now textually
replaced with \code{_classname__spam}, where \code{classname} is the
current class name with leading underscore(s) stripped.  This mangling
is done without regard to the syntactic position of the identifier, so
it can be used to define class-private instance and class variables,
methods, as well as globals, and even to store instance variables
private to this class on instances of \emph{other} classes.  Truncation
may occur when the mangled name would be longer than 255 characters.
Outside classes, or when the class name consists of only underscores,
no mangling occurs.

Name mangling is intended to give classes an easy way to define
``private'' instance variables and methods, without having to worry
about instance variables defined by derived classes, or mucking with
instance variables by code outside the class.  Note that the mangling
rules are designed mostly to avoid accidents; it still is possible for
a determined soul to access or modify a variable that is considered
private.  This can even be useful, e.g. for the debugger, and that's
one reason why this loophole is not closed.  (Buglet: derivation of a
class with the same name as the base class makes use of private
variables of the base class possible.)
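
A minimal sketch of mangling at work, for a current Python 3
interpreter.  The classes \class{Mapping} and \class{MappingSubclass}
and the attribute \code{__items} are hypothetical names:

```python
class Mapping:
    def __init__(self):
        self.__items = []            # stored as _Mapping__items

class MappingSubclass(Mapping):
    def __init__(self):
        Mapping.__init__(self)
        self.__items = "unrelated"   # stored as _MappingSubclass__items

m = MappingSubclass()
names = sorted(vars(m))              # the two attributes do not collide
base_items = m._Mapping__items       # the mangled name is still reachable
```

The subclass can use the same private name without disturbing the base
class, yet a determined client can still reach either attribute
through its mangled name.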

Notice that code passed to \code{exec}, \code{eval()} or
\code{execfile()} does not consider the classname of the invoking
class to be the current class; this is similar to the effect of the
\code{global} statement, the effect of which is likewise restricted to
code that is byte-compiled together.  The same restriction applies to
\code{getattr()}, \code{setattr()} and \code{delattr()}, as well as
when referencing \code{__dict__} directly.

Here's an example of a class that implements its own
\code{__getattr__} and \code{__setattr__} methods and stores all
attributes in a private variable, in a way that works in Python 1.4 as
well as in previous versions:

\begin{verbatim}
class VirtualAttributes:
    __vdict = None
    __vdict_name = locals().keys()[0]

    def __init__(self):
        self.__dict__[self.__vdict_name] = {}

    def __getattr__(self, name):
        return self.__vdict[name]

    def __setattr__(self, name, value):
        self.__vdict[name] = value
\end{verbatim}

%\emph{Warning: this is an experimental feature.}  To avoid all
%potential problems, refrain from using identifiers starting with
%double underscore except for predefined uses like \code{__init__}.  To
%use private names while maintaining future compatibility: refrain from
%using the same private name in classes related via subclassing; avoid
%explicit (manual) mangling/unmangling; and assume that at some point
%in the future, leading double underscore will revert to being just a
%naming convention.  Discussion on extensive compile-time declarations
%are currently underway, and it is impossible to predict what solution
%will eventually be chosen for private names.  Double leading
%underscore is still a candidate, of course --- just not the only one.
%It is placed in the distribution in the belief that it is useful, and
%so that widespread experience with its use can be gained.  It will not
%be removed without providing a better solution and a migration path.

\section{Odds and Ends \label{odds}}

Sometimes it is useful to have a data type similar to the Pascal
``record'' or C ``struct'', bundling together a couple of named data
items.  An empty class definition will do nicely, e.g.:

\begin{verbatim}
class Employee:
    pass

john = Employee() # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
\end{verbatim}

A piece of Python code that expects a particular abstract data type
can often be passed a class that emulates the methods of that data
type instead.  For instance, if you have a function that formats some
data from a file object, you can define a class with methods
\method{read()} and \method{readline()} that gets the data from a string
buffer instead, and pass it as an argument.
% (Unfortunately, this
%technique has its limitations: a class can't define operations that
%are accessed by special syntax such as sequence subscripting or
%arithmetic operators, and assigning such a ``pseudo-file'' to
%\code{sys.stdin} will not cause the interpreter to read further input
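
A sketch of such a stand-in class follows.  The class
\class{StringSource} and the function \code{first_word()} are
hypothetical; in current Python the standard library's
\code{io.StringIO} already provides a ready-made string-backed file
object, so this is only meant to show the idea:

```python
class StringSource:
    """File-like object that reads from a string instead of a file."""
    def __init__(self, text):
        self.lines = text.splitlines(True)   # keep the line endings
    def readline(self):
        # Return the next line, or '' at end of data, like a file object.
        return self.lines.pop(0) if self.lines else ''
    def read(self):
        rest = ''.join(self.lines)
        self.lines = []
        return rest

def first_word(f):
    """Works with anything that has a readline() method."""
    return f.readline().split()[0]

source = StringSource("spam eggs\nham\n")
word = first_word(source)
remainder = source.read()
```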

Instance method objects have attributes, too: \code{m.im_self} is the
instance object to which the method \code{m} is bound, and
\code{m.im_func} is the function object corresponding to the method.

\subsection{Exceptions Can Be Classes \label{exceptionClasses}}

User-defined exceptions are no longer limited to being string objects
--- they can be identified by classes as well.  Using this mechanism it
is possible to create extensible hierarchies of exceptions.

There are two new valid (semantic) forms for the raise statement:

\begin{verbatim}
raise Class, instance

raise instance
\end{verbatim}

In the first form, \code{instance} must be an instance of \class{Class}
or of a class derived from it.  The second form is a shorthand for

\begin{verbatim}
raise instance.__class__, instance
\end{verbatim}

An except clause may list classes as well as string objects.  A class
in an except clause is compatible with an exception if it is the same
class or a base class thereof (but not the other way around --- an
except clause listing a derived class is not compatible with a base
class).  For example, the following code will print B, C, D in that
order:
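
A sketch of such code in current Python 3 syntax, which is an
assumption on my part: there, exception classes must derive from
\class{Exception} (or another built-in exception class) and the
two-argument form of \keyword{raise} is not available.  The class names
\class{B}, \class{C} and \class{D} follow the text; the list
\code{printed} is added only to record the output:

```python
class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass

printed = []
for cls in [B, C, D]:
    try:
        raise cls()
    except D:                 # most derived class listed first
        printed.append("D")
    except C:
        printed.append("C")
    except B:                 # base class listed last
        printed.append("B")

print(", ".join(printed))
```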

Note that if the except clauses were reversed (with
\samp{except B} first), it would have printed B, B, B --- the first
matching except clause is triggered.

When an error message is printed for an unhandled exception which is a
class, the class name is printed, then a colon and a space, and
finally the instance converted to a string using the built-in function
\function{str()}.

\chapter{What Now? \label{whatNow}}

Hopefully reading this tutorial has reinforced your interest in using
Python.  Now what should you do?

You should read, or at least page through, the Library Reference,
which gives complete (though terse) reference material about types,
functions, and modules that can save you a lot of time when writing
Python programs.  The standard Python distribution includes a
\emph{lot} of code in both C and Python; there are modules to read
\UNIX{} mailboxes, retrieve documents via HTTP, generate random
numbers, parse command-line options, write CGI programs, compress
data, and a lot more; skimming through the Library Reference will give
you an idea of what's available.

The major Python Web site is \url{http://www.python.org}; it contains
code, documentation, and pointers to Python-related pages around the
Web.  This web site is mirrored in various places around the
world, such as Europe, Japan, and Australia; a mirror may be faster
than the main site, depending on your geographical location.  A more
informal site is \url{http://starship.skyport.net}, which contains a
bunch of Python-related personal home pages; many people have
downloadable software here.

For Python-related questions and problem reports, you can post to the
newsgroup \newsgroup{comp.lang.python}, or send them to the mailing
list at \email{python-list@cwi.nl}.  The newsgroup and mailing list
are gatewayed, so messages posted to one will automatically be
forwarded to the other.  There are around 35--45 postings a day,
% Postings figure based on average of last six months activity as
% reported by www.findmail.com; Oct. '97 - Mar. '98:  7480 msgs / 182
% days = 41.1 msgs / day.
asking (and answering) questions, suggesting new features, and
announcing new modules.  Before posting, be sure to check the list of
Frequently Asked Questions (also called the FAQ), at
\url{http://www.python.org/doc/FAQ.html}, or look for it in the
\file{Misc/} directory of the Python source distribution.  The FAQ
answers many of the questions that come up again and again, and may
already contain the solution for your problem.

You can support the Python community by joining the Python Software
Activity, which runs the python.org web, ftp and email servers, and
organizes Python workshops.  See \url{http://www.python.org/psa/} for
information on how to join.
3617 \chapter{Interactive Input Editing and History Substitution
3618 \label{interacting
}}
Some versions of the Python interpreter support editing of the current
input line and history substitution, similar to facilities found in
the Korn shell and the GNU Bash shell. This is implemented using the
\emph{GNU Readline} library, which supports Emacs-style and vi-style
editing. This library has its own documentation which I won't
duplicate here; however, the basics are easily explained. The
interactive editing and history described here are optionally
available in the \UNIX{} and CygWin versions of the interpreter.

This chapter does \emph{not} document the editing facilities of Mark
Hammond's PythonWin package or the Tk-based environment, IDLE,
distributed with Python. The command line history recall which
operates within DOS boxes on NT and some other DOS and Windows flavors
is yet another beast.

\section{Line Editing \label{lineEditing}}

If supported, input line editing is active whenever the interpreter
prints a primary or secondary prompt. The current line can be edited
using the conventional Emacs control characters. The most important
of these are: C-A (Control-A) moves the cursor to the beginning of the
line, C-E to the end, C-B moves it one position to the left, C-F to
the right. Backspace erases the character to the left of the cursor,
C-D the character to its right. C-K kills (erases) the rest of the
line to the right of the cursor, C-Y yanks back the last killed
string. C-underscore undoes the last change you made; it can be
repeated for cumulative effect.

\section{History Substitution \label{history}}

History substitution works as follows. All non-empty input lines
issued are saved in a history buffer, and when a new prompt is given
you are positioned on a new line at the bottom of this buffer. C-P
moves one line up (back) in the history buffer, C-N moves one down.
Any line in the history buffer can be edited; an asterisk appears in
front of the prompt to mark a line as modified. Pressing the Return
key passes the current line to the interpreter. C-R starts an
incremental reverse search; C-S starts a forward search.

\section{Key Bindings \label{keyBindings}}

The key bindings and some other parameters of the Readline library can
be customized by placing commands in an initialization file called
\file{\$HOME/.inputrc}. Key bindings have the form

\begin{verbatim}
key-name: function-name
\end{verbatim}

or

\begin{verbatim}
"string": function-name
\end{verbatim}

and options can be set with

\begin{verbatim}
set option-name value
\end{verbatim}

For example:

\begin{verbatim}
# I prefer vi-style editing:
set editing-mode vi

# Edit using a single line:
set horizontal-scroll-mode On

Meta-h: backward-kill-word
"\C-u": universal-argument
"\C-x\C-r": re-read-init-file
\end{verbatim}

Note that the default binding for TAB in Python is to insert a TAB
instead of Readline's default filename completion function. If you
insist, you can override this by putting

\begin{verbatim}
Tab: complete
\end{verbatim}

in your \file{\$HOME/.inputrc}. (Of course, this makes it hard to type
indented continuation lines...)

Automatic completion of variable and module names is optionally
available. To enable it in the interpreter's interactive mode, add
the following to your \file{\$HOME/.pythonrc.py} file:% $ <- bow to font-lock
\indexii{.pythonrc.py}{file}
\refstmodindex{rlcompleter}
\refbimodindex{readline}

\begin{verbatim}
import rlcompleter, readline
readline.parse_and_bind('tab: complete')
\end{verbatim}

This binds the TAB key to the completion function, so hitting the TAB
key twice suggests completions; it looks at Python statement names,
the current local variables, and the available module names. For
dotted expressions such as \code{string.a}, it will evaluate the
expression up to the final \character{.} and then suggest completions
from the attributes of the resulting object. Note that this may
execute application-defined code if an object with a
\method{__getattr__()} method is part of the expression.

\section{Commentary \label{commentary}}

This facility is an enormous step forward compared to previous
versions of the interpreter; however, some wishes are left: It would
be nice if the proper indentation were suggested on continuation lines
(the parser knows if an indent token is required next). The
completion mechanism might use the interpreter's symbol table. A
command to check (or even suggest) matching parentheses, quotes etc.
would also be useful.

% XXX Lele Gaifax's readline module, which adds name completion...