Doc/tut.tex

   1 \documentstyle[twoside,11pt,myformat]{report}
   2
   3 \title{Python Tutorial}
   4
   5 \input{boilerplate}
   6
   7 \begin{document}
   8
   9 \pagenumbering{roman}
  10
  11 \maketitle
  12
  13 \input{copyright}
  14
  15 \begin{abstract}
  16
  17 \noindent
  18 Python is a simple, yet powerful programming language that bridges the
  19 gap between C and shell programming, and is thus ideally suited for
  20 ``throw-away programming''
  21 and rapid prototyping.  Its syntax is put
  22 together from constructs borrowed from a variety of other languages;
  23 most prominent are influences from ABC, C, Modula-3 and Icon.
  24
  25 The Python interpreter is easily extended with new functions and data
  26 types implemented in C.  Python is also suitable as an extension
  27 language for highly customizable C applications such as editors or
  28 window managers.
  29
  30 Python is available for various operating systems, including
  31 several flavors of {\UNIX}, Amoeba, the Apple Macintosh O.S.,
  32 and MS-DOS.
  33
  34 This tutorial introduces the reader informally to the basic concepts
  35 and features of the Python language and system.  It helps to have a
  36 Python interpreter handy for hands-on experience, but as the examples
  37 are self-contained, the tutorial can be read off-line as well.
  38
  39 For a description of standard objects and modules, see the {\em Python
  40 Library Reference} document.  The {\em Python Reference Manual} gives
  41 a more formal definition of the language.
  42
  43 \end{abstract}
  44
  45 \pagebreak
  46 {
  47 \parskip = 0mm
  48 \tableofcontents
  49 }
  50
  51 \pagebreak
  52
  53 \pagenumbering{arabic}
  54
  55
  56 \chapter{Whetting Your Appetite}
  57
  58 If you ever wrote a large shell script, you probably know this
  59 feeling: you'd love to add yet another feature, but it's already so
  60 slow, and so big, and so complicated; or the feature involves a system
  61 call or other function that is only accessible from C \ldots  Usually
  62 the problem at hand isn't serious enough to warrant rewriting the
  63 script in C; perhaps because the problem requires variable-length
  64 strings or other data types (like sorted lists of file names) that are
  65 easy in the shell but lots of work to implement in C; or perhaps just
  66 because you're not sufficiently familiar with C.
  67
  68 In such cases, Python may be just the language for you.  Python is
  69 simple to use, but it is a real programming language, offering much
  70 more structure and support for large programs than the shell has.  On
  71 the other hand, it also offers much more error checking than C, and,
  72 being a {\em very-high-level language}, it has high-level data types
  73 built in, such as flexible arrays and dictionaries that would cost you
  74 days to implement efficiently in C.  Because of its more general data
  75 types, Python is applicable to a much larger problem domain than {\em
  76 Awk} or even {\em Perl}, yet many things are at least as easy in
  77 Python as in those languages.
  78
  79 Python allows you to split up your program into modules that can be
  80 reused in other Python programs.  It comes with a large collection of
  81 standard modules that you can use as the basis of your programs --- or
  82 as examples to start learning to program in Python.  There are also
  83 built-in modules that provide things like file I/O, system calls,
  84 sockets, and even a generic interface to window systems (STDWIN).
  85
  86 Python is an interpreted language, which can save you considerable time
  87 during program development because no compilation or linking is
  88 necessary.  The interpreter can be used interactively, which makes it
  89 easy to experiment with features of the language, to write throw-away
  90 programs, or to test functions during bottom-up program development.
  91 It is also a handy desk calculator.
  92
  93 Python allows writing very compact and readable programs.  Programs
  94 written in Python are typically much shorter than equivalent C
  95 programs, for several reasons:
  96 \begin{itemize}
  97 \item
  98 the high-level data types allow you to express complex operations in a
  99 single statement;
 100 \item
 101 statement grouping is done by indentation instead of begin/end
 102 brackets;
 103 \item
 104 no variable or argument declarations are necessary.
 105 \end{itemize}
 106
 107 Python is {\em extensible}: if you know how to program in C it is easy
 108 to add a new built-in
 109 function or
 110 module to the interpreter, either to
 111 perform critical operations at maximum speed, or to link Python
 112 programs to libraries that may only be available in binary form (such
 113 as a vendor-specific graphics library).  Once you are really hooked,
 114 you can link the Python interpreter into an application written in C
 115 and use it as an extension or command language for that application.
 116
 117 By the way, the language is named after the BBC show ``Monty
 118 Python's Flying Circus'' and has nothing to do with nasty reptiles...
 119
 120 \section{Where From Here}
 121
 122 Now that you are all excited about Python, you'll want to examine it
 123 in some more detail.  Since the best way to learn a language is
 124 to use it, you are invited to do so here.
 125
 126 In the next chapter, the mechanics of using the interpreter are
 127 explained.  This is rather mundane information, but essential for
 128 trying out the examples shown later.
 129
 130 The rest of the tutorial introduces various features of the Python
 131 language and system through examples, beginning with simple
 132 expressions, statements and data types, through functions and modules,
 133 and finally touching upon advanced concepts like exceptions
 134 and user-defined classes.
 135
 136 When you're through with the tutorial (or just getting bored), you
 137 should read the Library Reference, which gives complete (though terse)
 138 reference material about built-in and standard types, functions and
 139 modules that can save you a lot of time when writing Python programs.
 140
 141
 142 \chapter{Using the Python Interpreter}
 143
 144 \section{Invoking the Interpreter}
 145
 146 The Python interpreter is usually installed as {\tt /usr/local/bin/python}
 147 on those machines where it is available; putting {\tt /usr/local/bin} in
 148 your {\UNIX} shell's search path makes it possible to start it by
 149 typing the command
 150
 151 \bcode\begin{verbatim}
 152 python
 153 \end{verbatim}\ecode
 154 %
 155 to the shell.  Since the choice of the directory where the interpreter
 156 lives is an installation option, other places are possible; check with
 157 your local Python guru or system administrator.  (E.g., {\tt
 158 /usr/local/python} is a popular alternative location.)
 159
 160 The interpreter operates somewhat like the {\UNIX} shell: when called
 161 with standard input connected to a tty device, it reads and executes
 162 commands interactively; when called with a file name argument or with
 163 a file as standard input, it reads and executes a {\em script} from
 164 that file.
 165
 166 A third way of starting the interpreter is
 167 ``{\tt python -c command [arg] ...}'', which
 168 executes the statement(s) in {\tt command}, analogous to the shell's
 169 {\tt -c} option.  Since Python statements often contain spaces or other
 170 characters that are special to the shell, it is best to quote {\tt
 171 command} in its entirety with double quotes.
 172
 173 Note that there is a difference between ``{\tt python file}'' and
 174 ``{\tt python $<$file}''.  In the latter case, input requests from the
 175 program, such as calls to {\tt input()} and {\tt raw\_input()}, are
 176 satisfied from {\em file}.  Since this file has already been read
 177 until the end by the parser before the program starts executing, the
 178 program will encounter EOF immediately.  In the former case (which is
 179 usually what you want) they are satisfied from whatever file or device
 180 is connected to standard input of the Python interpreter.
 181
 182 When a script file is used, it is sometimes useful to be able to run
 183 the script and enter interactive mode afterwards.  This can be done by
 184 passing {\tt -i} before the script.  (This does not work if the script
 185 is read from standard input, for the same reason as explained in the
 186 previous paragraph.)
 187
 188 \subsection{Argument Passing}
 189
 190 When known to the interpreter, the script name and additional
 191 arguments thereafter are passed to the script in the variable {\tt
 192 sys.argv}, which is a list of strings.  Its length is at least one;
 193 when no script and no arguments are given, {\tt sys.argv[0]} is an
 194 empty string.  When the script name is given as {\tt '-'} (meaning
 195 standard input), {\tt sys.argv[0]} is set to {\tt '-'}.  When {\tt -c
 196 command} is used, {\tt sys.argv[0]} is set to {\tt '-c'}.  Options
 197 found after {\tt -c command} are not consumed by the Python
 198 interpreter's option processing but left in {\tt sys.argv} for the
 199 command to handle.
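
The rules above can be sketched as a small classifier.  This is only an
illustration, written for a current Python 3 interpreter rather than the
version this tutorial describes; the function name
{\tt describe\_invocation} is a made-up helper, not part of Python.

```python
import sys

def describe_invocation(argv):
    """Classify how the interpreter was started, given an argv-style list.

    Hypothetical helper for illustration; it only encodes the rules
    stated in the text for sys.argv[0].
    """
    if not argv or argv[0] == "":
        return "interactive session (no script, no arguments)"
    if argv[0] == "-":
        return "script read from standard input"
    if argv[0] == "-c":
        return "command given with -c"
    return "script file " + argv[0]

# Inspect the real argument list of the running interpreter.
print(describe_invocation(sys.argv))

# Options after "-c command" stay in sys.argv for the command to handle.
print(describe_invocation(["-c", "-x", "spam"]))
```

The second call passes a hand-written list that mimics what a command
started with {\tt -c} would find in {\tt sys.argv}.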
 200
 201 \subsection{Interactive Mode}
 202
 203 When commands are read from a tty, the interpreter is said to be in
 204 {\em interactive\ mode}.  In this mode it prompts for the next command
 205 with the {\em primary\ prompt}, usually three greater-than signs ({\tt
 206 >>>}); for continuation lines it prompts with the {\em secondary\
 207 prompt}, by default three dots ({\tt ...}).  Typing an EOF (Control-D)
 208 at the primary prompt causes the interpreter to exit with a zero exit
 209 status.
 210
 211 The interpreter prints a welcome message stating its version number
 212 and a copyright notice before printing the first prompt, e.g.:
 213
 214 \bcode\begin{verbatim}
 215 python
 216 Python 1.1 (Oct  6 1994)
 217 Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam
 218 >>>
 219 \end{verbatim}\ecode
 220
 221 \section{The Interpreter and its Environment}
 222
 223 \subsection{Error Handling}
 224
 225 When an error occurs, the interpreter prints an error
 226 message and a stack trace.  In interactive mode, it then returns to
 227 the primary prompt; when input came from a file, it exits with a
 228 nonzero exit status after printing
 229 the stack trace.  (Exceptions handled by an {\tt except} clause in a
 230 {\tt try} statement are not errors in this context.)  Some errors are
 231 unconditionally fatal and cause an exit with a nonzero exit status; this
 232 applies to internal inconsistencies and some cases of running out of
 233 memory.  All error messages are written to the standard error stream;
 234 normal output from the executed commands is written to standard
 235 output.
 236
 237 Typing the interrupt character (usually Control-C or DEL) to the
 238 primary or secondary prompt cancels the input and returns to the
 239 primary prompt.%
 240 \footnote{
 241         A problem with the GNU Readline package may prevent this.
 242 }
 243 Typing an interrupt while a command is executing raises the {\tt
 244 KeyboardInterrupt} exception, which may be handled by a {\tt try}
 245 statement.
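
As a minimal sketch of handling an interrupt with a {\tt try} statement,
the following uses Python 3 syntax (not the syntax of the interpreter
version described here).  The names {\tt run\_guarded} and
{\tt simulated\_interrupt} are invented for the example, and the
interrupt is raised by hand instead of coming from the keyboard.

```python
def run_guarded(task):
    """Run task(); report whether it finished or was interrupted."""
    try:
        task()
    except KeyboardInterrupt:
        # Reached when the interrupt key is typed while task() runs.
        return "interrupted"
    return "finished"

def simulated_interrupt():
    # Stand-in for a user typing the interrupt character mid-computation.
    raise KeyboardInterrupt

print(run_guarded(simulated_interrupt))
print(run_guarded(lambda: None))
```

Without the {\tt except} clause the exception would propagate to the
top level, where the interpreter reports it like any other error.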
 246
 247 \subsection{The Module Search Path}
 248
 249 When a module named {\tt spam} is imported, the interpreter searches
 250 for a file named {\tt spam.py} in the list of directories specified by
 251 the environment variable {\tt PYTHONPATH}.  It has the same syntax as
 252 the {\UNIX} shell variable {\tt PATH}, i.e., a list of colon-separated
 253 directory names.  When {\tt PYTHONPATH} is not set, or when the file
 254 is not found there, the search continues in an installation-dependent
 255 default path, usually {\tt .:/usr/local/lib/python}.
 256
 257 Actually, modules are searched for in the list of directories given by the
 258 variable {\tt sys.path} which is initialized from {\tt PYTHONPATH} and
 259 the installation-dependent default.  This allows Python programs that
 260 know what they're doing to modify or replace the module search path.
 261 See the section on Standard Modules later.
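
A short sketch of how a program might inspect or extend the search path,
assuming a current Python 3 interpreter.  The directory names are
placeholders chosen for the example and need not exist.

```python
import os
import sys

# Split a PYTHONPATH-style string into directory names.
# os.pathsep is the separator used on the current platform (":" on UNIX).
example_value = os.pathsep.join(["/home/guido/lib", "/usr/local/lib/python"])
directories = example_value.split(os.pathsep)
print(directories)

# A program may also edit the search path directly at run time.
extra_dir = os.path.join(os.getcwd(), "my_modules")  # hypothetical directory
if extra_dir not in sys.path:
    sys.path.insert(0, extra_dir)
print(extra_dir in sys.path)
```

Inserting at position 0 makes the new directory the first place
searched; appending it instead would give it the lowest priority.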
 262
 263 \subsection{``Compiled'' Python files}
 264
 265 As an important speed-up of the start-up time for short programs that
 266 use a lot of standard modules, if a file called {\tt spam.pyc} exists
 267 in the directory where {\tt spam.py} is found, this is assumed to
 268 contain an already-``compiled'' version of the module {\tt spam}.  The
 269 modification time of the version of {\tt spam.py} used to create {\tt
 270 spam.pyc} is recorded in {\tt spam.pyc}, and the {\tt .pyc} file is
 271 ignored if this does not match the modification time of {\tt spam.py}.
 272
 273 Whenever {\tt spam.py} is successfully compiled, an attempt is made to
 274 write the compiled version to {\tt spam.pyc}.  It is not an error if
 275 this attempt fails; if for any reason the file is not written
 276 completely, the resulting {\tt spam.pyc} file will be recognized as
 277 invalid and thus ignored later.
 278
 279 \subsection{Executable Python scripts}
 280
 281 On BSD'ish {\UNIX} systems, Python scripts can be made directly
 282 executable, like shell scripts, by putting the line
 283
 284 \bcode\begin{verbatim}
 285 #! /usr/local/bin/python
 286 \end{verbatim}\ecode
 287 %
 288 (assuming that's the name of the interpreter) at the beginning of the
 289 script and giving the file an executable mode.  The {\tt \#!} must be
 290 the first two characters of the file.
 291
 292 \subsection{The Interactive Startup File}
 293
 294 When you use Python interactively, it is frequently handy to have some
 295 standard commands executed every time the interpreter is started.  You
 296 can do this by setting an environment variable named {\tt
 297 PYTHONSTARTUP} to the name of a file containing your start-up
 298 commands.  This is similar to the {\tt .profile} feature of the {\UNIX}
 299 shells.
 300
 301 This file is only read in interactive sessions, not when Python reads
 302 commands from a script, and not when {\tt /dev/tty} is given as the
 303 explicit source of commands (which otherwise behaves like an
 304 interactive session).  It is executed in the same name space where
 305 interactive commands are executed, so that objects that it defines or
 306 imports can be used without qualification in the interactive session.
 307 You can also change the prompts {\tt sys.ps1} and {\tt sys.ps2} in
 308 this file.
 309
 310 If you want to read an additional start-up file from the current
 311 directory, you can program this in the global start-up file, e.g.
 312 \verb\execfile('.pythonrc')\.  If you want to use the startup file
 313 in a script, you must write this explicitly in the script, e.g.
 314 \verb\import os;\ \verb\execfile(os.environ['PYTHONSTARTUP'])\.
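
In current Python 3 interpreters {\tt execfile()} no longer exists, so
the same idea has to be spelled with {\tt exec()}.  The sketch below is
one possible way to do it, not an official recipe;
{\tt run\_startup\_file} and its {\tt environ} parameter are hypothetical
names, and the demonstration uses a throw-away file in place of a real
start-up file.

```python
import os
import tempfile

def run_startup_file(namespace, environ=os.environ):
    """Execute the file named by PYTHONSTARTUP inside namespace.

    Returns True if a file was run, False if the variable is unset or
    does not name an existing file.  Hypothetical helper for illustration.
    """
    path = environ.get("PYTHONSTARTUP")
    if not path or not os.path.isfile(path):
        return False
    with open(path) as startup:
        source = startup.read()
    exec(compile(source, path, "exec"), namespace)
    return True

# Demonstration with a temporary start-up file and a private environment.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write("greeting = 'hello from the startup file'\n")

session = {}
ran = run_startup_file(session, {"PYTHONSTARTUP": handle.name})
print(ran, session.get("greeting"))
os.remove(handle.name)
```

Names defined by the start-up file end up in the dictionary passed as
{\tt namespace}, which mirrors how the interactive session sees them
without qualification.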
 315
 316 \section{Interactive Input Editing and History Substitution}
 317
 318 Some versions of the Python interpreter support editing of the current
 319 input line and history substitution, similar to facilities found in
 320 the Korn shell and the GNU Bash shell.  This is implemented using the
 321 {\em GNU\ Readline} library, which supports Emacs-style and vi-style
 322 editing.  This library has its own documentation which I won't
 323 duplicate here; however, the basics are easily explained.
 324
 325 Perhaps the quickest check to see whether command line editing is
 326 supported is typing Control-P to the first Python prompt you get.  If
 327 it beeps, you have command line editing.  If nothing appears to
 328 happen, or if \verb/^P/ is echoed, you can skip the rest of this
 329 section.
 330
 331 \subsection{Line Editing}
 332
 333 If supported, input line editing is active whenever the interpreter
 334 prints a primary or secondary prompt.  The current line can be edited
 335 using the conventional Emacs control characters.  The most important
 336 of these are: C-A (Control-A) moves the cursor to the beginning of the
 337 line, C-E to the end, C-B moves it one position to the left, C-F to
 338 the right.  Backspace erases the character to the left of the cursor,
 339 C-D the character to its right.  C-K kills (erases) the rest of the
 340 line to the right of the cursor, C-Y yanks back the last killed
 341 string.  C-underscore undoes the last change you made; it can be
 342 repeated for cumulative effect.
 343
 344 \subsection{History Substitution}
 345
 346 History substitution works as follows.  All non-empty input lines
 347 issued are saved in a history buffer, and when a new prompt is given
 348 you are positioned on a new line at the bottom of this buffer.  C-P
 349 moves one line up (back) in the history buffer, C-N moves one down.
 350 Any line in the history buffer can be edited; an asterisk appears in
 351 front of the prompt to mark a line as modified.  Pressing the Return
 352 key passes the current line to the interpreter.  C-R starts an
 353 incremental reverse search; C-S starts a forward search.
 354
 355 \subsection{Key Bindings}
 356
 357 The key bindings and some other parameters of the Readline library can
 358 be customized by placing commands in an initialization file called
 359 {\tt \$HOME/.inputrc}.  Key bindings have the form
 360
 361 \bcode\begin{verbatim}
 362 key-name: function-name
 363 \end{verbatim}\ecode
 364 %
 365 or
 366
 367 \bcode\begin{verbatim}
 368 "string": function-name
 369 \end{verbatim}\ecode
 370 %
 371 and options can be set with
 372
 373 \bcode\begin{verbatim}
 374 set option-name value
 375 \end{verbatim}\ecode
 376 %
 377 For example:
 378
 379 \bcode\begin{verbatim}
 380 # I prefer vi-style editing:
 381 set editing-mode vi
 382 # Edit using a single line:
 383 set horizontal-scroll-mode On
 384 # Rebind some keys:
 385 Meta-h: backward-kill-word
 386 "\C-u": universal-argument
 387 "\C-x\C-r": re-read-init-file
 388 \end{verbatim}\ecode
 389 %
 390 Note that the default binding for TAB in Python is to insert a TAB
 391 instead of Readline's default filename completion function.  If you
 392 insist, you can override this by putting
 393
 394 \bcode\begin{verbatim}
 395 TAB: complete
 396 \end{verbatim}\ecode
 397 %
 398 in your {\tt \$HOME/.inputrc}.  (Of course, this makes it hard to type
 399 indented continuation lines...)
 400
 401 \subsection{Commentary}
 402
 403 This facility is an enormous step forward compared to previous
 404 versions of the interpreter; however, some wishes are left: It would
 405 be nice if the proper indentation were suggested on continuation lines
 406 (the parser knows if an indent token is required next).  The
 407 completion mechanism might use the interpreter's symbol table.  A
 408 command to check (or even suggest) matching parentheses, quotes etc.
 409 would also be useful.
 410
 411
 412 \chapter{An Informal Introduction to Python}
 413
 414 In the following examples, input and output are distinguished by the
 415 presence or absence of prompts ({\tt >>>} and {\tt ...}): to repeat
 416 the example, you must type everything after the prompt, when the
 417 prompt appears; lines that do not begin with a prompt are output from
 418 the interpreter.%
 419 \footnote{
 420         I'd prefer to use different fonts to distinguish input
 421         from output, but the amount of LaTeX hacking that would require
 422         is currently beyond my ability.
 423 }
 424 Note that a secondary prompt on a line by itself in an example means
 425 you must type a blank line; this is used to end a multi-line command.
 426
 427 \section{Using Python as a Calculator}
 428
 429 Let's try some simple Python commands.  Start the interpreter and wait
 430 for the primary prompt, {\tt >>>}.  (It shouldn't take long.)
 431
 432 \subsection{Numbers}
 433
 434 The interpreter acts as a simple calculator: you can type an
 435 expression at it and it will write the value.  Expression syntax is
 436 straightforward: the operators {\tt +}, {\tt -}, {\tt *} and {\tt /}
 437 work just like in most other languages (e.g., Pascal or C); parentheses
 438 can be used for grouping.  For example:
 439
 440 \bcode\begin{verbatim}
 441 >>> 2+2
 442 4
 443 >>> # This is a comment
 444 ... 2+2
 445 4
 446 >>> 2+2  # and a comment on the same line as code
 447 4
 448 >>> (50-5*6)/4
 449 5
 450 >>> # Integer division returns the floor:
 451 ... 7/3
 452 2
 453 >>> 7/-3
 454 -3
 455 >>>
 456 \end{verbatim}\ecode
 457 %
 458 Like in C, the equal sign ({\tt =}) is used to assign a value to a
 459 variable.  The value of an assignment is not written:
 460
 461 \bcode\begin{verbatim}
 462 >>> width = 20
 463 >>> height = 5*9
 464 >>> width * height
 465 900
 466 >>>
 467 \end{verbatim}\ecode
 468 %
 469 A value can be assigned to several variables simultaneously:
 470
 471 \bcode\begin{verbatim}
 472 >>> x = y = z = 0  # Zero x, y and z
 473 >>> x
 474 0
 475 >>> y
 476 0
 477 >>> z
 478 0
 479 >>>
 480 \end{verbatim}\ecode
 481 %
 482 There is full support for floating point; operators with mixed type
 483 operands convert the integer operand to floating point:
 484
 485 \bcode\begin{verbatim}
 486 >>> 4 * 2.5 / 3.3
 487 3.0303030303
 488 >>> 7.0 / 2
 489 3.5
 490 >>>
 491 \end{verbatim}\ecode
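
For readers trying the calculator examples on a current Python 3
interpreter, note that the sessions in this section reflect the
interpreter version this tutorial describes.  In Python 3 a single slash
always produces a floating point result, and the flooring behaviour is
spelled {\tt //}.  A small sketch under that assumption:

```python
# Floor division rounds toward minus infinity, as in the examples above.
floor_positive = 7 // 3
floor_negative = 7 // -3
grouped = (50 - 5 * 6) // 4

# A single slash gives a float in Python 3, even for two integers.
true_division = 7 / 2

# With mixed operands the integer is converted to floating point.
mixed = 7.0 / 2

# One value assigned to several variables at once.
x = y = z = 0

print(floor_positive, floor_negative, grouped, true_division, mixed, x, y, z)
```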
 492
 493 \subsection{Strings}
 494
 495 Besides numbers, Python can also manipulate strings, enclosed in
 496 single quotes or double quotes:
 497
 498 \bcode\begin{verbatim}
 499 >>> 'spam eggs'
 500 'spam eggs'
 501 >>> 'doesn\'t'
 502 "doesn't"
 503 >>> "doesn't"
 504 "doesn't"
 505 >>> '"Yes," he said.'
 506 '"Yes," he said.'
 507 >>> "\"Yes,\" he said."
 508 '"Yes," he said.'
 509 >>> '"Isn\'t," she said.'
 510 '"Isn\'t," she said.'
 511 >>>
 512 \end{verbatim}\ecode
 513 %
 514 Strings are written the same way as they are typed for input: inside
 515 quotes and with quotes and other funny characters escaped by backslashes,
 516 to show the precise value.  The string is enclosed in double quotes if
 517 the string contains a single quote and no double quotes, else it's
 518 enclosed in single quotes.  (The {\tt print} statement, described later,
 519 can be used to write strings without quotes or escapes.)
 520
 521 Strings can be concatenated (glued together) with the {\tt +}
 522 operator, and repeated with {\tt *}:
 523
 524 \bcode\begin{verbatim}
 525 >>> word = 'Help' + 'A'
 526 >>> word
 527 'HelpA'
 528 >>> '<' + word*5 + '>'
 529 '<HelpAHelpAHelpAHelpAHelpA>'
 530 >>>
 531 \end{verbatim}\ecode
 532 %
 533 Strings can be subscripted (indexed); like in C, the first character of
 534 a string has subscript (index) 0.
 535
 536 There is no separate character type; a character is simply a string of
 537 size one.  Like in Icon, substrings can be specified with the {\em
 538 slice} notation: two indices separated by a colon.
 539
 540 \bcode\begin{verbatim}
 541 >>> word[4]
 542 'A'
 543 >>> word[0:2]
 544 'He'
 545 >>> word[2:4]
 546 'lp'
 547 >>>
 548 \end{verbatim}\ecode
 549 %
 550 Slice indices have useful defaults; an omitted first index defaults to
 551 zero, an omitted second index defaults to the size of the string being
 552 sliced.
 553
 554 \bcode\begin{verbatim}
 555 >>> word[:2]    # The first two characters
 556 'He'
 557 >>> word[2:]    # All but the first two characters
 558 'lpA'
 559 >>>
 560 \end{verbatim}\ecode
 561 %
 562 Here's a useful invariant of slice operations: \verb\s[:i] + s[i:]\
 563 equals \verb\s\.
 564
 565 \bcode\begin{verbatim}
 566 >>> word[:2] + word[2:]
 567 'HelpA'
 568 >>> word[:3] + word[3:]
 569 'HelpA'
 570 >>>
 571 \end{verbatim}\ecode
 572 %
 573 Degenerate slice indices are handled gracefully: an index that is too
 574 large is replaced by the string size, an upper bound smaller than the
 575 lower bound returns an empty string.
 576
 577 \bcode\begin{verbatim}
 578 >>> word[1:100]
 579 'elpA'
 580 >>> word[10:]
 581 ''
 582 >>> word[2:1]
 583 ''
 584 >>>
 585 \end{verbatim}\ecode
 586 %
 587 Indices may be negative numbers, to start counting from the right.
 588 For example:
 589
 590 \bcode\begin{verbatim}
 591 >>> word[-1]     # The last character
 592 'A'
 593 >>> word[-2]     # The last-but-one character
 594 'p'
 595 >>> word[-2:]    # The last two characters
 596 'pA'
 597 >>> word[:-2]    # All but the last two characters
 598 'Hel'
 599 >>>
 600 \end{verbatim}\ecode
 601 %
 602 But note that -0 is really the same as 0, so it does not count from
 603 the right!
 604
 605 \bcode\begin{verbatim}
 606 >>> word[-0]     # (since -0 equals 0)
 607 'H'
 608 >>>
 609 \end{verbatim}\ecode
 610 %
 611 Out-of-range negative slice indices are truncated, but don't try this
 612 for single-element (non-slice) indices:
 613
 614 \bcode\begin{verbatim}
 615 >>> word[-100:]
 616 'HelpA'
 617 >>> word[-10]    # error
 618 Traceback (innermost last):
 619   File "<stdin>", line 1
 620 IndexError: string index out of range
 621 >>>
 622 \end{verbatim}\ecode
 623 %
 624 The best way to remember how slices work is to think of the indices as
 625 pointing {\em between} characters, with the left edge of the first
 626 character numbered 0.  Then the right edge of the last character of a
 627 string of {\tt n} characters has index {\tt n}, for example:
 628
 629 \bcode\begin{verbatim}
 630  +---+---+---+---+---+
 631  | H | e | l | p | A |
 632  +---+---+---+---+---+
 633  0   1   2   3   4   5
 634 -5  -4  -3  -2  -1
 635 \end{verbatim}\ecode
 636 %
 637 The first row of numbers gives the position of the indices 0...5 in
 638 the string; the second row gives the corresponding negative indices.
 639 The slice from \verb\i\ to \verb\j\ consists of all characters between
 640 the edges labeled \verb\i\ and \verb\j\, respectively.
 641
 642 For nonnegative indices, the length of a slice is the difference of
 643 the indices, if both are within bounds, e.g., the length of
 644 \verb\word[1:3]\ is 2.
 645
 646 The built-in function {\tt len()} returns the length of a string:
 647
 648 \bcode\begin{verbatim}
 649 >>> s = 'supercalifragilisticexpialidocious'
 650 >>> len(s)
 651 34
 652 >>>
 653 \end{verbatim}\ecode
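
The slice rules in this section can be checked mechanically.  The
following sketch assumes a current Python 3 interpreter; the variable
names {\tt invariant\_holds} and {\tt lengths\_match} are invented for
the example.

```python
word = "HelpA"

# s[:i] + s[i:] equals s for every integer i, including negative and
# out-of-range values, because degenerate indices are truncated.
invariant_holds = all(word[:i] + word[i:] == word for i in range(-10, 11))

# For nonnegative, in-bounds indices i <= j, the slice length is j - i.
lengths_match = all(
    len(word[i:j]) == j - i
    for i in range(len(word) + 1)
    for j in range(i, len(word) + 1)
)

print(invariant_holds, lengths_match)
print(len(word[1:3]))
```

Thinking of indices as pointing between characters explains both
results: a slice simply collects whatever lies between two edges.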
 654
 655 \subsection{Lists}
 656
 657 Python knows a number of {\em compound} data types, used to group
 658 together other values.  The most versatile is the {\em list}, which
 659 can be written as a list of comma-separated values (items) between
 660 square brackets.  List items need not all have the same type.
 661
 662 \bcode\begin{verbatim}
 663 >>> a = ['spam', 'eggs', 100, 1234]
 664 >>> a
 665 ['spam', 'eggs', 100, 1234]
 666 >>>
 667 \end{verbatim}\ecode
 668 %
 669 Like string indices, list indices start at 0, and lists can be sliced,
 670 concatenated and so on:
 671
 672 \bcode\begin{verbatim}
 673 >>> a[0]
 674 'spam'
 675 >>> a[3]
 676 1234
 677 >>> a[-2]
 678 100
 679 >>> a[1:-1]
 680 ['eggs', 100]
 681 >>> a[:2] + ['bacon', 2*2]
 682 ['spam', 'eggs', 'bacon', 4]
 683 >>> 3*a[:3] + ['Boe!']
 684 ['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boe!']
 685 >>>
 686 \end{verbatim}\ecode
 687 %
 688 Unlike strings, which are {\em immutable}, it is possible to change
 689 individual elements of a list:
 690
 691 \bcode\begin{verbatim}
 692 >>> a
 693 ['spam', 'eggs', 100, 1234]
 694 >>> a[2] = a[2] + 23
 695 >>> a
 696 ['spam', 'eggs', 123, 1234]
 697 >>>
 698 \end{verbatim}\ecode
 699 %
 700 Assignment to slices is also possible, and this can even change the size
 701 of the list:
 702
 703 \bcode\begin{verbatim}
 704 >>> # Replace some items:
 705 ... a[0:2] = [1, 12]
 706 >>> a
 707 [1, 12, 123, 1234]
 708 >>> # Remove some:
 709 ... a[0:2] = []
 710 >>> a
 711 [123, 1234]
 712 >>> # Insert some:
 713 ... a[1:1] = ['bletch', 'xyzzy']
 714 >>> a
 715 [123, 'bletch', 'xyzzy', 1234]
 716 >>> a[:0] = a     # Insert (a copy of) itself at the beginning
 717 >>> a
 718 [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
 719 >>>
 720 \end{verbatim}\ecode
 721 %
 722 The built-in function {\tt len()} also applies to lists:
 723
 724 \bcode\begin{verbatim}
 725 >>> len(a)
 726 8
 727 >>>
 728 \end{verbatim}\ecode
 729 %
 730 It is possible to nest lists (create lists containing other lists),
 731 for example:
 732
 733 \bcode\begin{verbatim}
 734 >>> q = [2, 3]
 735 >>> p = [1, q, 4]
 736 >>> len(p)
 737 3
 738 >>> p[1]
 739 [2, 3]
 740 >>> p[1][0]
 741 2
 742 >>> p[1].append('xtra')     # See section 5.1
 743 >>> p
 744 [1, [2, 3, 'xtra'], 4]
 745 >>> q
 746 [2, 3, 'xtra']
 747 >>>
 748 \end{verbatim}\ecode
 749 %
 750 Note that in the last example, {\tt p[1]} and {\tt q} really refer to
 751 the same object!  We'll come back to {\em object semantics} later.
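
A brief sketch of what ``the same object'' means in practice, written
for a current Python 3 interpreter.  The {\tt is} operator tests object
identity; the variable names {\tt same\_object} and {\tt independent}
are invented for the example.

```python
q = [2, 3]
p = [1, q, 4]

# p[1] is not a copy of q; both names refer to one list object.
same_object = p[1] is q
p[1].append("xtra")
print(q)

# A full slice builds a new list, so changing it leaves q untouched.
independent = q[:]
independent.append("only here")
print(same_object, independent is q, len(q))
```

The append through {\tt p[1]} is visible through {\tt q}, while the
append to the slice copy is not.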
 752
 753 \section{First Steps Towards Programming}
 754
 755 Of course, we can use Python for more complicated tasks than adding
 756 two and two together.  For instance, we can write an initial
 757 subsequence of the {\em Fibonacci} series as follows:
 758
 759 \bcode\begin{verbatim}
 760 >>> # Fibonacci series:
 761 ... # the sum of two elements defines the next
 762 ... a, b = 0, 1
 763 >>> while b < 10:
 764 ...       print b
 765 ...       a, b = b, a+b
 766 ...
 767 1
 768 1
 769 2
 770 3
 771 5
 772 8
 773 >>>
 774 \end{verbatim}\ecode
 775 %
 776 This example introduces several new features.
 777
 778 \begin{itemize}
 779
 780 \item
 781 The first line contains a {\em multiple assignment}: the variables
 782 {\tt a} and {\tt b} simultaneously get the new values 0 and 1.  On the
 783 last line this is used again, demonstrating that the expressions on
 784 the right-hand side are all evaluated first before any of the
 785 assignments take place.
 786
 787 \item
 788 The {\tt while} loop executes as long as the condition (here: {\tt b <
 789 10}) remains true.  In Python, like in C, any non-zero integer value is
 790 true; zero is false.  The condition may also be a string or list value,
 791 in fact any sequence; anything with a non-zero length is true, empty
 792 sequences are false.  The test used in the example is a simple
 793 comparison.  The standard comparison operators are written the same as
 794 in C: {\tt <}, {\tt >}, {\tt ==}, {\tt <=}, {\tt >=} and {\tt !=}.
 795
 796 \item
 797 The {\em body} of the loop is {\em indented}: indentation is Python's
 798 way of grouping statements.  Python does not (yet!) provide an
 799 intelligent input line editing facility, so you have to type a tab or
 800 space(s) for each indented line.  In practice you will prepare more
 801 complicated input for Python with a text editor; most text editors have
 802 an auto-indent facility.  When a compound statement is entered
 803 interactively, it must be followed by a blank line to indicate
 804 completion (since the parser cannot guess when you have typed the last
 805 line).
 806
 807 \item
 808 The {\tt print} statement writes the value of the expression(s) it is
 809 given.  It differs from just writing the expression you want to write
 810 (as we did earlier in the calculator examples) in the way it handles
 811 multiple expressions and strings.  Strings are printed without quotes,
 812 and a space is inserted between items, so you can format things nicely,
 813 like this:
 814
 815 \bcode\begin{verbatim}
 816 >>> i = 256*256
 817 >>> print 'The value of i is', i
 818 The value of i is 65536
 819 >>>
 820 \end{verbatim}\ecode
 821 %
 822 A trailing comma avoids the newline after the output:
 823
 824 \bcode\begin{verbatim}
 825 >>> a, b = 0, 1
 826 >>> while b < 1000:
 827 ...     print b,
 828 ...     a, b = b, a+b
 829 ...
 830 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
 831 >>>
 832 \end{verbatim}\ecode
 833 %
 834 Note that the interpreter inserts a newline before it prints the next
 835 prompt if the last line was not completed.
 836
 837 \end{itemize}
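
The points in the list above can be tried out together.  This sketch
uses Python 3 syntax, where {\tt print} is a function and the trailing
comma is replaced by the {\tt end} argument; {\tt fib\_below} is a
hypothetical helper name, not a built-in function.

```python
# The right-hand side is evaluated completely before anything is assigned,
# so both expressions see the old values of a and b.
a, b = 1, 2
a, b = b, a + b
print(a, b)

def fib_below(limit):
    """Return the Fibonacci numbers smaller than limit (illustrative helper)."""
    a, b = 0, 1
    numbers = []
    while b < limit:
        numbers.append(b)
        a, b = b, a + b
    return numbers

# end=" " keeps the output on one line, like the trailing comma does
# in the print statement shown in the tutorial.
for n in fib_below(10):
    print(n, end=" ")
print()

# Zero and empty sequences are false; the other values here are true.
truth_values = [bool(0), bool(7), bool(""), bool("spam"), bool([]), bool([0])]
print(truth_values)
```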
 838
 839
 840 \chapter{More Control Flow Tools}
 841
 842 Besides the {\tt while} statement just introduced, Python knows the
 843 usual control flow statements known from other languages, with some
 844 twists.
 845
 846 \section{If Statements}
 847
 848 Perhaps the best-known statement type is the {\tt if} statement.
 849 For example:
 850
 851 \bcode\begin{verbatim}
 852 >>> if x < 0:
 853 ...      x = 0
 854 ...      print 'Negative changed to zero'
 855 ... elif x == 0:
 856 ...      print 'Zero'
 857 ... elif x == 1:
 858 ...      print 'Single'
 859 ... else:
 860 ...      print 'More'
 861 ...
 862 \end{verbatim}\ecode
 863 %
 864 There can be zero or more {\tt elif} parts, and the {\tt else} part is
 865 optional.  The keyword `{\tt elif}' is short for `{\tt else if}', and is
 866 useful to avoid excessive indentation.  An {\tt if...elif...elif...}
 867 sequence is a substitute for the {\em switch} or {\em case} statements
 868 found in other languages.
 869
 870 \section{For Statements}
 871
The {\tt for} statement in Python differs a bit from what you may be
used to in C or Pascal.  Rather than always iterating over an
arithmetic progression of numbers (as in Pascal), or leaving the user
completely free in the iteration test and step (as in C), Python's {\tt
for} statement iterates over the items of any sequence (e.g., a list
or a string), in the order that they appear in the sequence.  For
example (no pun intended):
 879
 880 \bcode\begin{verbatim}
 881 >>> # Measure some strings:
 882 ... a = ['cat', 'window', 'defenestrate']
 883 >>> for x in a:
 884 ...     print x, len(x)
 885 ...
 886 cat 3
 887 window 6
 888 defenestrate 12
 889 >>>
 890 \end{verbatim}\ecode
 891 %
 892 It is not safe to modify the sequence being iterated over in the loop
 893 (this can only happen for mutable sequence types, i.e., lists).  If
 894 you need to modify the list you are iterating over, e.g., duplicate
 895 selected items, you must iterate over a copy.  The slice notation
 896 makes this particularly convenient:
 897
 898 \bcode\begin{verbatim}
 899 >>> for x in a[:]: # make a slice copy of the entire list
 900 ...    if len(x) > 6: a.insert(0, x)
 901 ...
 902 >>> a
 903 ['defenestrate', 'cat', 'window', 'defenestrate']
 904 >>>
 905 \end{verbatim}\ecode
 906
 907 \section{The {\tt range()} Function}
 908
 909 If you do need to iterate over a sequence of numbers, the built-in
 910 function {\tt range()} comes in handy.  It generates lists containing
 911 arithmetic progressions, e.g.:
 912
 913 \bcode\begin{verbatim}
 914 >>> range(10)
 915 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 916 >>>
 917 \end{verbatim}\ecode
 918 %
 919 The given end point is never part of the generated list; {\tt range(10)}
 920 generates a list of 10 values, exactly the legal indices for items of a
 921 sequence of length 10.  It is possible to let the range start at another
 922 number, or to specify a different increment (even negative):
 923
 924 \bcode\begin{verbatim}
 925 >>> range(5, 10)
 926 [5, 6, 7, 8, 9]
 927 >>> range(0, 10, 3)
 928 [0, 3, 6, 9]
 929 >>> range(-10, -100, -30)
 930 [-10, -40, -70]
 931 >>>
 932 \end{verbatim}\ecode
 933 %
 934 To iterate over the indices of a sequence, combine {\tt range()} and
 935 {\tt len()} as follows:
 936
 937 \bcode\begin{verbatim}
 938 >>> a = ['Mary', 'had', 'a', 'little', 'lamb']
 939 >>> for i in range(len(a)):
 940 ...     print i, a[i]
 941 ...
 942 0 Mary
 943 1 had
 944 2 a
 945 3 little
 946 4 lamb
 947 >>>
 948 \end{verbatim}\ecode
 949
 950 \section{Break and Continue Statements, and Else Clauses on Loops}
 951
The {\tt break} statement, as in C, breaks out of the smallest
 953 enclosing {\tt for} or {\tt while} loop.
 954
 955 The {\tt continue} statement, also borrowed from C, continues with the
 956 next iteration of the loop.
 957
 958 Loop statements may have an {\tt else} clause; it is executed when the
 959 loop terminates through exhaustion of the list (with {\tt for}) or when
 960 the condition becomes false (with {\tt while}), but not when the loop is
 961 terminated by a {\tt break} statement.  This is exemplified by the
 962 following loop, which searches for prime numbers:
 963
 964 \bcode\begin{verbatim}
>>> for n in range(2, 10):
...     for x in range(2, n):
...         if n % x == 0:
...             print n, 'equals', x, '*', n/x
...             break
...     else:
...         print n, 'is a prime number'
...
 973 2 is a prime number
 974 3 is a prime number
 975 4 equals 2 * 2
 976 5 is a prime number
 977 6 equals 2 * 3
 978 7 is a prime number
 979 8 equals 2 * 4
 980 9 equals 3 * 3
 981 >>>
 982 \end{verbatim}\ecode
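
The {\tt continue} statement works in the same kind of loop.  The
following short script is only a sketch; the name {\tt odds} is made up
for the illustration.  It collects the odd numbers below 10 by skipping
the even ones:

\bcode\begin{verbatim}
odds = []
for n in range(10):
    if n % 2 == 0:
        continue            # even: go straight to the next value of n
    odds = odds + [n]       # only reached for odd n
print(odds)                 # writes [1, 3, 5, 7, 9]
\end{verbatim}\ecode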
 983
 984 \section{Pass Statements}
 985
 986 The {\tt pass} statement does nothing.
 987 It can be used when a statement is required syntactically but the
 988 program requires no action.
 989 For example:
 990
 991 \bcode\begin{verbatim}
 992 >>> while 1:
 993 ...       pass # Busy-wait for keyboard interrupt
 994 ...
 995 \end{verbatim}\ecode
 996
 997 \section{Defining Functions}
 998
 999 We can create a function that writes the Fibonacci series to an
1000 arbitrary boundary:
1001
1002 \bcode\begin{verbatim}
>>> def fib(n):    # write Fibonacci series up to n
...     a, b = 0, 1
...     while b < n:
...         print b,
...         a, b = b, a+b
...
1009 >>> # Now call the function we just defined:
1010 ... fib(2000)
1011 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
1012 >>>
1013 \end{verbatim}\ecode
1014 %
The keyword {\tt def} introduces a function {\em definition}.  It must
be followed by the function name and the parenthesized list of formal
parameters.  The statements that form the body of the function start at
the next line, indented by a tab stop.
1019
The {\em execution} of a function introduces a new symbol table used
for the local variables of the function.  More precisely, all variable
assignments in a function store the value in the local symbol table,
whereas variable references first look in the local symbol table, then
in the global symbol table, and then in the table of built-in names.
Thus, global variables cannot be directly assigned a value within a
function (unless named in a {\tt global} statement), although they may
be referenced.
1030
1031 The actual parameters (arguments) to a function call are introduced in
1032 the local symbol table of the called function when it is called; thus,
1033 arguments are passed using {\em call\ by\ value}.%
1034 \footnote{
1035          Actually, {\em call  by  object reference} would be a better
1036          description, since if a mutable object is passed, the caller
1037          will see any changes the callee makes to it (e.g., items
1038          inserted into a list).
1039 }
1040 When a function calls another function, a new local symbol table is
1041 created for that call.
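
These rules can be tried out with a small script.  It is only a
sketch; the names {\tt counter}, {\tt items}, {\tt setlocal},
{\tt setglobal} and {\tt additem} are invented for the illustration:

\bcode\begin{verbatim}
counter = 0
items = []

def setlocal():
    counter = 1             # creates a local name; the global is untouched

def setglobal():
    global counter
    counter = counter + 1   # rebinds the global name

def additem(seq):
    seq.insert(0, 'spam')   # changes the caller's list object itself

setlocal()
after_local = counter       # still 0
setglobal()
after_global = counter      # now 1
additem(items)              # items is now ['spam']
\end{verbatim}\ecode
%
Only the function containing the {\tt global} statement changes
{\tt counter}, while the list passed to {\tt additem} is changed for
the caller as well, as the footnote describes.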
1042
A function definition introduces the function name in the current
symbol table.  The value of the function name has a type that is
recognized by the interpreter as a user-defined function.  This value
can be assigned to another name which can then also be used as a
function.  This serves as a general renaming mechanism:
1051
1052 \bcode\begin{verbatim}
1053 >>> fib
1054 <function object at 10042ed0>
1055 >>> f = fib
1056 >>> f(100)
1057 1 1 2 3 5 8 13 21 34 55 89
1058 >>>
1059 \end{verbatim}\ecode
1060 %
You might object that {\tt fib} is not a function but a procedure.  In
Python, as in C, procedures are just functions that don't return a
1063 value.  In fact, technically speaking, procedures do return a value,
1064 albeit a rather boring one.  This value is called {\tt None} (it's a
1065 built-in name).  Writing the value {\tt None} is normally suppressed by
1066 the interpreter if it would be the only value written.  You can see it
1067 if you really want to:
1068
1069 \bcode\begin{verbatim}
1070 >>> print fib(0)
1071 None
1072 >>>
1073 \end{verbatim}\ecode
1074 %
1075 It is simple to write a function that returns a list of the numbers of
1076 the Fibonacci series, instead of printing it:
1077
1078 \bcode\begin{verbatim}
>>> def fib2(n): # return Fibonacci series up to n
...     result = []
...     a, b = 0, 1
...     while b < n:
...         result.append(b)    # see below
...         a, b = b, a+b
...     return result
...
1087 >>> f100 = fib2(100)    # call it
1088 >>> f100                # write the result
1089 [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
1090 >>>
1091 \end{verbatim}\ecode
1092 %
1093 This example, as usual, demonstrates some new Python features:
1094
1095 \begin{itemize}
1096
1097 \item
1098 The {\tt return} statement returns with a value from a function.  {\tt
1099 return} without an expression argument is used to return from the middle
1100 of a procedure (falling off the end also returns from a procedure), in
1101 which case the {\tt None} value is returned.
1102
1103 \item
1104 The statement {\tt result.append(b)} calls a {\em method} of the list
1105 object {\tt result}.  A method is a function that `belongs' to an
1106 object and is named {\tt obj.methodname}, where {\tt obj} is some
1107 object (this may be an expression), and {\tt methodname} is the name
1108 of a method that is defined by the object's type.  Different types
1109 define different methods.  Methods of different types may have the
1110 same name without causing ambiguity.  (It is possible to define your
1111 own object types and methods, using {\em classes}, as discussed later
1112 in this tutorial.)
The method {\tt append} shown in the example is defined for list
objects; it adds a new element at the end of the list.  In this example
it is equivalent to {\tt result = result + [b]}, but more efficient.
1117
1118 \end{itemize}
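
Both points can be seen in one function.  The following is a
hypothetical variant of {\tt fib2}; the name {\tt fib3} and the decision
to reject negative bounds are made up for this sketch:

\bcode\begin{verbatim}
def fib3(n):    # return Fibonacci series up to n, or None if n < 0
    if n < 0:
        return                  # bare return: the caller receives None
    result = []
    a, b = 0, 1
    while b < n:
        result = result + [b]   # same effect as result.append(b), but
                                # builds a new list each time round
        a, b = b, a+b
    return result
\end{verbatim}\ecode
%
With this definition {\tt fib3(20)} returns
{\tt [1, 1, 2, 3, 5, 8, 13]}, while {\tt fib3(-1)} returns {\tt None}.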
1119
1120
1121 \chapter{Odds and Ends}
1122
1123 This chapter describes some things you've learned about already in
1124 more detail, and adds some new things as well.
1125
1126 \section{More on Lists}
1127
The list data type has some more methods.  Here are all of the methods
of list objects:
1130
1131 \begin{description}
1132
1133 \item[{\tt insert(i, x)}]
1134 Insert an item at a given position.  The first argument is the index of
1135 the element before which to insert, so {\tt a.insert(0, x)} inserts at
1136 the front of the list, and {\tt a.insert(len(a), x)} is equivalent to
1137 {\tt a.append(x)}.
1138
1139 \item[{\tt append(x)}]
1140 Equivalent to {\tt a.insert(len(a), x)}.
1141
1142 \item[{\tt index(x)}]
1143 Return the index in the list of the first item whose value is {\tt x}.
1144 It is an error if there is no such item.
1145
1146 \item[{\tt remove(x)}]
1147 Remove the first item from the list whose value is {\tt x}.
1148 It is an error if there is no such item.
1149
1150 \item[{\tt sort()}]
1151 Sort the items of the list, in place.
1152
1153 \item[{\tt reverse()}]
1154 Reverse the elements of the list, in place.
1155
1156 \item[{\tt count(x)}]
1157 Return the number of times {\tt x} appears in the list.
1158
1159 \end{description}
1160
1161 An example that uses all list methods:
1162
1163 \bcode\begin{verbatim}
1164 >>> a = [66.6, 333, 333, 1, 1234.5]
1165 >>> print a.count(333), a.count(66.6), a.count('x')
1166 2 1 0
1167 >>> a.insert(2, -1)
1168 >>> a.append(333)
1169 >>> a
1170 [66.6, 333, -1, 333, 1, 1234.5, 333]
1171 >>> a.index(333)
1172 1
1173 >>> a.remove(333)
1174 >>> a
1175 [66.6, -1, 333, 1, 1234.5, 333]
1176 >>> a.reverse()
1177 >>> a
1178 [333, 1234.5, 1, 333, -1, 66.6]
1179 >>> a.sort()
1180 >>> a
1181 [-1, 1, 66.6, 333, 333, 1234.5]
1182 >>>
1183 \end{verbatim}\ecode
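
Since it is an error to call {\tt remove()} or {\tt index()} for an
item that is not in the list, it can be handy to test with
{\tt count()} first.  A minimal sketch ({\tt removeall} is a made-up
helper function, not a list method):

\bcode\begin{verbatim}
def removeall(seq, x):      # remove every occurrence of x from seq
    while seq.count(x) > 0:
        seq.remove(x)       # safe: count() says x is still there

a = [66.6, 333, 333, 1, 1234.5]
removeall(a, 333)           # a is now [66.6, 1, 1234.5]
removeall(a, 'x')           # no such item: nothing happens, no error
\end{verbatim}\ecode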
1184
\section{The {\tt del} Statement}
1186
1187 There is a way to remove an item from a list given its index instead
1188 of its value: the {\tt del} statement.  This can also be used to
1189 remove slices from a list (which we did earlier by assignment of an
1190 empty list to the slice).  For example:
1191
1192 \bcode\begin{verbatim}
1193 >>> a
1194 [-1, 1, 66.6, 333, 333, 1234.5]
1195 >>> del a[0]
1196 >>> a
1197 [1, 66.6, 333, 333, 1234.5]
1198 >>> del a[2:4]
1199 >>> a
1200 [1, 66.6, 1234.5]
1201 >>>
1202 \end{verbatim}\ecode
1203 %
1204 {\tt del} can also be used to delete entire variables:
1205
1206 \bcode\begin{verbatim}
1207 >>> del a
1208 >>>
1209 \end{verbatim}\ecode
1210 %
1211 Referencing the name {\tt a} hereafter is an error (at least until
1212 another value is assigned to it).  We'll find other uses for {\tt del}
1213 later.
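
That {\tt del} on a slice has the same effect as assigning an empty
list to that slice can be checked directly.  A small sketch, with
made-up variable names:

\bcode\begin{verbatim}
a = [1, 66.6, 333, 333, 1234.5]
b = [1, 66.6, 333, 333, 1234.5]
del a[2:4]          # remove items 2 and 3 with del
b[2:4] = []         # remove the same items by slice assignment
same = (a == b)     # true: both lists are now [1, 66.6, 1234.5]
\end{verbatim}\ecode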
1214
1215 \section{Tuples and Sequences}
1216
1217 We saw that lists and strings have many common properties, e.g.,
1218 indexing and slicing operations.  They are two examples of {\em
1219 sequence} data types.  Since Python is an evolving language, other
1220 sequence data types may be added.  There is also another standard
1221 sequence data type: the {\em tuple}.
1222
1223 A tuple consists of a number of values separated by commas, for
1224 instance:
1225
1226 \bcode\begin{verbatim}
1227 >>> t = 12345, 54321, 'hello!'
1228 >>> t[0]
1229 12345
1230 >>> t
1231 (12345, 54321, 'hello!')
1232 >>> # Tuples may be nested:
1233 ... u = t, (1, 2, 3, 4, 5)
1234 >>> u
1235 ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
1236 >>>
1237 \end{verbatim}\ecode
1238 %
As you see, on output tuples are always enclosed in parentheses, so
1240 that nested tuples are interpreted correctly; they may be input with
1241 or without surrounding parentheses, although often parentheses are
1242 necessary anyway (if the tuple is part of a larger expression).
1243
1244 Tuples have many uses, e.g., (x, y) coordinate pairs, employee records
1245 from a database, etc.  Tuples, like strings, are immutable: it is not
1246 possible to assign to the individual items of a tuple (you can
1247 simulate much of the same effect with slicing and concatenation,
1248 though).
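
For instance, a new tuple that differs from an old one in a single
item can be built as follows.  This is just a sketch of the
slicing-and-concatenation approach; the name {\tt t2} is made up, and
the one-item tuple {\tt (0,)} uses the trailing-comma syntax explained
in the next paragraph:

\bcode\begin{verbatim}
t = 12345, 54321, 'hello!'
# t[1] = 0 is not allowed, but a changed copy can be built:
t2 = t[:1] + (0,) + t[2:]   # t2 is (12345, 0, 'hello!')
# t itself still has its old value
\end{verbatim}\ecode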
1249
1250 A special problem is the construction of tuples containing 0 or 1
1251 items: the syntax has some extra quirks to accommodate these.  Empty
1252 tuples are constructed by an empty pair of parentheses; a tuple with
1253 one item is constructed by following a value with a comma
1254 (it is not sufficient to enclose a single value in parentheses).
1255 Ugly, but effective.  For example:
1256
1257 \bcode\begin{verbatim}
1258 >>> empty = ()
1259 >>> singleton = 'hello',    # <-- note trailing comma
1260 >>> len(empty)
1261 0
1262 >>> len(singleton)
1263 1
1264 >>> singleton
1265 ('hello',)
1266 >>>
1267 \end{verbatim}\ecode
1268 %
1269 The statement {\tt t = 12345, 54321, 'hello!'} is an example of {\em
1270 tuple packing}: the values {\tt 12345}, {\tt 54321} and {\tt 'hello!'}
1271 are packed together in a tuple.  The reverse operation is also
1272 possible, e.g.:
1273
1274 \bcode\begin{verbatim}
1275 >>> x, y, z = t
1276 >>>
1277 \end{verbatim}\ecode
1278 %
1279 This is called, appropriately enough, {\em tuple unpacking}.  Tuple
1280 unpacking requires that the list of variables on the left has the same
1281 number of elements as the length of the tuple.  Note that multiple
1282 assignment is really just a combination of tuple packing and tuple
1283 unpacking!
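
Packing and unpacking together give a simple way to let a function
return several values, and to swap two variables.  A sketch (the
function {\tt minmax} is made up for the example; {\tt min()} and
{\tt max()} are built-in functions):

\bcode\begin{verbatim}
def minmax(seq):
    return min(seq), max(seq)       # pack two results in one tuple

low, high = minmax([3, 1, 4, 1, 5]) # unpack them: low is 1, high is 5

a, b = 1, 2
a, b = b, a     # right-hand side is packed first, so this swaps a and b
\end{verbatim}\ecode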
1284
1285 Occasionally, the corresponding operation on lists is useful: {\em list
1286 unpacking}.  This is supported by enclosing the list of variables in
1287 square brackets:
1288
1289 \bcode\begin{verbatim}
1290 >>> a = ['spam', 'eggs', 100, 1234]
1291 >>> [a1, a2, a3, a4] = a
1292 >>>
1293 \end{verbatim}\ecode
1294
1295 \section{Dictionaries}
1296
1297 Another useful data type built into Python is the {\em dictionary}.
1298 Dictionaries are sometimes found in other languages as ``associative
1299 memories'' or ``associative arrays''.  Unlike sequences, which are
1300 indexed by a range of numbers, dictionaries are indexed by {\em keys},
1301 which are strings (the use of non-string values as keys
1302 is supported, but beyond the scope of this tutorial).
1303 It is best to think of a dictionary as an unordered set of
1304 {\em key:value} pairs, with the requirement that the keys are unique
1305 (within one dictionary).
1306 A pair of braces creates an empty dictionary: \verb/{}/.
1307 Placing a comma-separated list of key:value pairs within the
1308 braces adds initial key:value pairs to the dictionary; this is also the
1309 way dictionaries are written on output.
1310
1311 The main operations on a dictionary are storing a value with some key
1312 and extracting the value given the key.  It is also possible to delete
1313 a key:value pair
1314 with {\tt del}.
1315 If you store using a key that is already in use, the old value
1316 associated with that key is forgotten.  It is an error to extract a
1317 value using a non-existent key.
1318
The {\tt keys()} method of a dictionary object returns a list of all the
keys used in the dictionary, in arbitrary order (if you want it sorted,
just apply the {\tt sort()} method to the list of keys).  To check
1322 whether a single key is in the dictionary, use the \verb/has_key()/
1323 method of the dictionary.
1324
1325 Here is a small example using a dictionary:
1326
1327 \bcode\begin{verbatim}
1328 >>> tel = {'jack': 4098, 'sape': 4139}
1329 >>> tel['guido'] = 4127
1330 >>> tel
1331 {'sape': 4139, 'guido': 4127, 'jack': 4098}
1332 >>> tel['jack']
1333 4098
1334 >>> del tel['sape']
1335 >>> tel['irv'] = 4127
1336 >>> tel
1337 {'guido': 4127, 'irv': 4127, 'jack': 4098}
1338 >>> tel.keys()
1339 ['guido', 'irv', 'jack']
1340 >>> tel.has_key('guido')
1341 1
1342 >>>
1343 \end{verbatim}\ecode
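
A typical use of dictionaries is counting.  The following sketch
tallies how often each word occurs in a list.  The function name
{\tt tally} is made up, and the membership test is written with
{\tt in} on the list returned by {\tt keys()}
(\verb/counts.has_key(w)/ is the more direct spelling of the same
test):

\bcode\begin{verbatim}
def tally(words):
    counts = {}
    for w in words:
        if w in counts.keys():
            counts[w] = counts[w] + 1   # seen before: store a new value
        else:
            counts[w] = 1               # first occurrence: add the key
    return counts

counts = tally(['spam', 'eggs', 'spam'])
# counts['spam'] is 2 and counts['eggs'] is 1
\end{verbatim}\ecode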
1344
1345 \section{More on Conditions}
1346
1347 The conditions used in {\tt while} and {\tt if} statements above can
1348 contain other operators besides comparisons.
1349
1350 The comparison operators {\tt in} and {\tt not in} check whether a value
1351 occurs (does not occur) in a sequence.  The operators {\tt is} and {\tt
1352 is not} compare whether two objects are really the same object; this
1353 only matters for mutable objects like lists.  All comparison operators
1354 have the same priority, which is lower than that of all numerical
1355 operators.
1356
1357 Comparisons can be chained: e.g., {\tt a < b == c} tests whether {\tt a}
1358 is less than {\tt b} and moreover {\tt b} equals {\tt c}.
1359
1360 Comparisons may be combined by the Boolean operators {\tt and} and {\tt
1361 or}, and the outcome of a comparison (or of any other Boolean
1362 expression) may be negated with {\tt not}.  These all have lower
1363 priorities than comparison operators again; between them, {\tt not} has
1364 the highest priority, and {\tt or} the lowest, so that
1365 {\tt A and not B or C} is equivalent to {\tt (A and (not B)) or C}.  Of
1366 course, parentheses can be used to express the desired composition.
1367
1368 The Boolean operators {\tt and} and {\tt or} are so-called {\em
1369 shortcut} operators: their arguments are evaluated from left to right,
1370 and evaluation stops as soon as the outcome is determined.  E.g., if
1371 {\tt A} and {\tt C} are true but {\tt B} is false, {\tt A and B and C}
does not evaluate the expression {\tt C}.  In general, the return value of a
1373 shortcut operator, when used as a general value and not as a Boolean, is
1374 the last evaluated argument.
1375
1376 It is possible to assign the result of a comparison or other Boolean
1377 expression to a variable.  For example,
1378
1379 \bcode\begin{verbatim}
1380 >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
1381 >>> non_null = string1 or string2 or string3
1382 >>> non_null
1383 'Trondheim'
1384 >>>
1385 \end{verbatim}\ecode
1386 %
1387 Note that in Python, unlike C, assignment cannot occur inside expressions.
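
Chaining and shortcut evaluation are handy for writing guarded tests.
In the following sketch the function names are made up; both functions
return a true or false value:

\bcode\begin{verbatim}
def inrange(low, x, high):
    return low <= x < high          # same as low <= x and x < high

def divides(d, n):
    # The remainder is only computed when d is nonzero, so that
    # divides(0, n) is simply false instead of a division error.
    return d != 0 and n % d == 0
\end{verbatim}\ecode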
1388
1389 \section{Comparing Sequences and Other Types}
1390
1391 Sequence objects may be compared to other objects with the same
1392 sequence type.  The comparison uses {\em lexicographical} ordering:
1393 first the first two items are compared, and if they differ this
1394 determines the outcome of the comparison; if they are equal, the next
1395 two items are compared, and so on, until either sequence is exhausted.
1396 If two items to be compared are themselves sequences of the same type,
1397 the lexicographical comparison is carried out recursively.  If all
1398 items of two sequences compare equal, the sequences are considered
1399 equal.  If one sequence is an initial subsequence of the other, the
shorter sequence is the smaller one.  Lexicographical ordering for
1401 strings uses the ASCII ordering for individual characters.  Some
1402 examples of comparisons between sequences with the same types:
1403
1404 \bcode\begin{verbatim}
1405 (1, 2, 3)              < (1, 2, 4)
1406 [1, 2, 3]              < [1, 2, 4]
1407 'ABC' < 'C' < 'Pascal' < 'Python'
1408 (1, 2, 3, 4)           < (1, 2, 4)
1409 (1, 2)                 < (1, 2, -1)
(1, 2, 3)              == (1.0, 2.0, 3.0)
1411 (1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
1412 \end{verbatim}\ecode
1413 %
1414 Note that comparing objects of different types is legal.  The outcome
1415 is deterministic but arbitrary: the types are ordered by their name.
1416 Thus, a list is always smaller than a string, a string is always
1417 smaller than a tuple, etc.  Mixed numeric types are compared according
1418 to their numeric value, so 0 equals 0.0, etc.%
1419 \footnote{
1420         The rules for comparing objects of different types should
1421         not be relied upon; they may change in a future version of
1422         the language.
1423 }
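
One practical consequence of lexicographical ordering is that sorting
a list of tuples orders it by the first item, then by the second, and
so on.  A small sketch with made-up data:

\bcode\begin{verbatim}
records = [('jack', 4098), ('guido', 4127), ('guido', 4000)]
records.sort()
# records is now
# [('guido', 4000), ('guido', 4127), ('jack', 4098)]
\end{verbatim}\ecode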
1424
1425
1426 \chapter{Modules}
1427
1428 If you quit from the Python interpreter and enter it again, the
1429 definitions you have made (functions and variables) are lost.
1430 Therefore, if you want to write a somewhat longer program, you are
1431 better off using a text editor to prepare the input for the interpreter
1432 and running it with that file as input instead.  This is known as creating a
1433 {\em script}.  As your program gets longer, you may want to split it
1434 into several files for easier maintenance.  You may also want to use a
1435 handy function that you've written in several programs without copying
1436 its definition into each program.
1437
1438 To support this, Python has a way to put definitions in a file and use
1439 them in a script or in an interactive instance of the interpreter.
1440 Such a file is called a {\em module}; definitions from a module can be
1441 {\em imported} into other modules or into the {\em main} module (the
1442 collection of variables that you have access to in a script
1443 executed at the top level
1444 and in calculator mode).
1445
1446 A module is a file containing Python definitions and statements.  The
1447 file name is the module name with the suffix {\tt .py} appended.  Within
1448 a module, the module's name (as a string) is available as the value of
the global variable \verb/__name__/.  For instance, use your favorite text
1450 editor to create a file called {\tt fibo.py} in the current directory
1451 with the following contents:
1452
1453 \bcode\begin{verbatim}
1454 # Fibonacci numbers module
1455
def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print b,
        a, b = b, a+b

def fib2(n): # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a+b
    return result
1469 \end{verbatim}\ecode
1470 %
1471 Now enter the Python interpreter and import this module with the
1472 following command:
1473
1474 \bcode\begin{verbatim}
1475 >>> import fibo
1476 >>>
1477 \end{verbatim}\ecode
1478 %
This does not enter the names of the functions defined in {\tt fibo}
directly in the current symbol table; it only enters the module name
{\tt fibo} there.  Using the module name you can access the functions:
1485
1486 \bcode\begin{verbatim}
1487 >>> fibo.fib(1000)
1488 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
1489 >>> fibo.fib2(100)
1490 [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
1491 >>> fibo.__name__
1492 'fibo'
1493 >>>
1494 \end{verbatim}\ecode
1495 %
1496 If you intend to use a function often you can assign it to a local name:
1497
1498 \bcode\begin{verbatim}
1499 >>> fib = fibo.fib
1500 >>> fib(500)
1501 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1502 >>>
1503 \end{verbatim}\ecode
1504
1505 \section{More on Modules}
1506
1507 A module can contain executable statements as well as function
1508 definitions.
1509 These statements are intended to initialize the module.
1510 They are executed only the
1511 {\em first}
1512 time the module is imported somewhere.%
1513 \footnote{
1514         In fact function definitions are also `statements' that are
1515         `executed'; the execution enters the function name in the
1516         module's global symbol table.
1517 }
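
As a sketch, here is what a module with an initialization statement
might look like.  The file name {\tt squares.py} and the names in it
are hypothetical.  The {\tt for} loop at the top level runs once, when
the module is first imported, and the function then uses the table that
the loop built:

\bcode\begin{verbatim}
# Squares module (hypothetical example)

table = []
for i in range(10):             # executed at the first import only
    table.append(i*i)

def square(i):                  # look up i*i for 0 <= i < 10
    return table[i]
\end{verbatim}\ecode
%
After {\tt import squares}, the call {\tt squares.square(3)} returns 9.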
1518
1519 Each module has its own private symbol table, which is used as the
1520 global symbol table by all functions defined in the module.
1521 Thus, the author of a module can use global variables in the module
1522 without worrying about accidental clashes with a user's global
1523 variables.
1524 On the other hand, if you know what you are doing you can touch a
1525 module's global variables with the same notation used to refer to its
1526 functions,
1527 {\tt modname.itemname}.
1528
1529 Modules can import other modules.
1530 It is customary but not required to place all
1531 {\tt import}
1532 statements at the beginning of a module (or script, for that matter).
1533 The imported module names are placed in the importing module's global
1534 symbol table.
1535
1536 There is a variant of the
1537 {\tt import}
1538 statement that imports names from a module directly into the importing
1539 module's symbol table.
1540 For example:
1541
1542 \bcode\begin{verbatim}
1543 >>> from fibo import fib, fib2
1544 >>> fib(500)
1545 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1546 >>>
1547 \end{verbatim}\ecode
1548 %
1549 This does not introduce the module name from which the imports are taken
1550 in the local symbol table (so in the example, {\tt fibo} is not
1551 defined).
1552
1553 There is even a variant to import all names that a module defines:
1554
1555 \bcode\begin{verbatim}
1556 >>> from fibo import *
1557 >>> fib(500)
1558 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1559 >>>
1560 \end{verbatim}\ecode
1561 %
1562 This imports all names except those beginning with an underscore
(\verb/_/).
1564
1565 \section{Standard Modules}
1566
1567 Python comes with a library of standard modules, described in a separate
1568 document (Python Library Reference).  Some modules are built into the
1569 interpreter; these provide access to operations that are not part of the
1570 core of the language but are nevertheless built in, either for
1571 efficiency or to provide access to operating system primitives such as
1572 system calls.  The set of such modules is a configuration option; e.g.,
1573 the {\tt amoeba} module is only provided on systems that somehow support
1574 Amoeba primitives.  One particular module deserves some attention: {\tt
1575 sys}, which is built into every Python interpreter.  The variables {\tt
1576 sys.ps1} and {\tt sys.ps2} define the strings used as primary and
1577 secondary prompts:
1578
1579 \bcode\begin{verbatim}
1580 >>> import sys
1581 >>> sys.ps1
1582 '>>> '
1583 >>> sys.ps2
1584 '... '
1585 >>> sys.ps1 = 'C> '
1586 C> print 'Yuck!'
1587 Yuck!
1588 C>
1589 \end{verbatim}\ecode
1590 %
1591 These two variables are only defined if the interpreter is in
1592 interactive mode.
1593
The variable {\tt sys.path} is a list of strings that determines the
interpreter's search path for modules.  It is initialized to a default
path taken from the environment variable {\tt PYTHONPATH}, or from a
built-in default if {\tt PYTHONPATH} is not set.  You can modify it
using standard list operations, e.g.:
1604
1605 \bcode\begin{verbatim}
1606 >>> import sys
1607 >>> sys.path.append('/ufs/guido/lib/python')
1608 >>>
1609 \end{verbatim}\ecode
1610
\section{The {\tt dir()} Function}
1612
The built-in function {\tt dir()} is used to find out which names a module
1614 defines.  It returns a sorted list of strings:
1615
1616 \bcode\begin{verbatim}
1617 >>> import fibo, sys
1618 >>> dir(fibo)
1619 ['__name__', 'fib', 'fib2']
1620 >>> dir(sys)
1621 ['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit',
1622 'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace',
1623 'stderr', 'stdin', 'stdout', 'version']
1624 >>>
1625 \end{verbatim}\ecode
1626 %
Without arguments, {\tt dir()} lists the names you have currently defined:
1628
1629 \bcode\begin{verbatim}
1630 >>> a = [1, 2, 3, 4, 5]
1631 >>> import fibo, sys
1632 >>> fib = fibo.fib
1633 >>> dir()
1634 ['__name__', 'a', 'fib', 'fibo', 'sys']
1635 >>>
1636 \end{verbatim}\ecode
1637 %
1638 Note that it lists all types of names: variables, modules, functions, etc.
1639
1640 {\tt dir()} does not list the names of built-in functions and variables.
1641 If you want a list of those, they are defined in the standard module
\verb/__builtin__/:
1643
1644 \bcode\begin{verbatim}
1645 >>> import __builtin__
1646 >>> dir(__builtin__)
1647 ['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError',
1648 'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
1649 'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError',
1650 'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError',
1651 'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce',
1652 'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float',
1653 'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long',
1654 'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input',
1655 'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange']
1656 >>>
1657 \end{verbatim}\ecode
1658
1659
1660 \chapter{Output Formatting}
1661
1662 So far we've encountered two ways of writing values: {\em expression
1663 statements} and the {\tt print} statement.  (A third way is using the
1664 {\tt write} method of file objects; the standard output file can be
1665 referenced as {\tt sys.stdout}.  See the Library Reference for more
1666 information on this.)
1667
1668 Often you'll want more control over the formatting of your output than
1669 simply printing space-separated values.  The key to nice formatting in
1670 Python is to do all the string handling yourself; using string slicing
1671 and concatenation operations you can create any lay-out you can imagine.
1672 The standard module {\tt string} contains some useful operations for
1673 padding strings to a given column width; these will be discussed shortly.
Finally, the {\tt \%} operator (modulo) with a string left argument
interprets this string as a C {\tt sprintf()}-style format string to be
applied to the right argument, and returns the string resulting from
this formatting operation.
1678
1679 One question remains, of course: how do you convert values to strings?
1680 Luckily, Python has a way to convert any value to a string: just write
1681 the value between reverse quotes (\verb/``/).  Some examples:
1682
1683 \bcode\begin{verbatim}
1684 >>> x = 10 * 3.14
1685 >>> y = 200*200
1686 >>> s = 'The value of x is ' + `x` + ', and y is ' + `y` + '...'
1687 >>> print s
1688 The value of x is 31.4, and y is 40000...
1689 >>> # Reverse quotes work on other types besides numbers:
1690 ... p = [x, y]
1691 >>> ps = `p`
1692 >>> ps
1693 '[31.4, 40000]'
1694 >>> # Converting a string adds string quotes and backslashes:
1695 ... hello = 'hello, world\n'
1696 >>> hellos = `hello`
1697 >>> print hellos
1698 'hello, world\012'
1699 >>> # The argument of reverse quotes may be a tuple:
1700 ... `x, y, ('spam', 'eggs')`
1701 "(31.4, 40000, ('spam', 'eggs'))"
1702 >>>
1703 \end{verbatim}\ecode
1704 %
1705 Here are two ways to write a table of squares and cubes:
1706
1707 \bcode\begin{verbatim}
1708 >>> import string
1709 >>> for x in range(1, 11):
1710 ...     print string.rjust(`x`, 2), string.rjust(`x*x`, 3),
1711 ...     # Note trailing comma on previous line
1712 ...     print string.rjust(`x*x*x`, 4)
1713 ...
1714  1   1    1
1715  2   4    8
1716  3   9   27
1717  4  16   64
1718  5  25  125
1719  6  36  216
1720  7  49  343
1721  8  64  512
1722  9  81  729
1723 10 100 1000
1724 >>> for x in range(1,11):
1725 ...     print '%2d %3d %4d' % (x, x*x, x*x*x)
1726 ...
1727  1   1    1
1728  2   4    8
1729  3   9   27
1730  4  16   64
1731  5  25  125
1732  6  36  216
1733  7  49  343
1734  8  64  512
1735  9  81  729
1736 10 100 1000
1737 >>>
1738 \end{verbatim}\ecode
1739 %
1740 (Note that one space between each column was added by the way {\tt print}
1741 works: it always adds spaces between its arguments.)
1742
1743 This example demonstrates the function {\tt string.rjust()}, which
1744 right-justifies a string in a field of a given width by padding it with
1745 spaces on the left.  There are similar functions {\tt string.ljust()}
1746 and {\tt string.center()}.  These functions do not write anything; they
1747 just return a new string.  If the input string is too long, they don't
1748 truncate it, but return it unchanged; this will mess up your column
1749 lay-out but that's usually better than the alternative, which would be
1750 lying about a value.  (If you really want truncation you can always add
1751 a slice operation, as in {\tt string.ljust(x,~n)[0:n]}.)
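
The same pad-then-slice idea can be sketched with the \code{\%} operator
instead of {\tt string.ljust()}.  The helper name {\tt fit} and the sample
strings are invented for this illustration, and the sketch assumes that
the format accepts {\tt *} to take the field width from the argument
tuple, as C's {\tt sprintf} does:

\bcode\begin{verbatim}
def fit(text, width):
    # Pad on the right to at least 'width' characters ...
    padded = '%-*s' % (width, text)
    # ... then cut to exactly 'width' characters.
    return padded[0:width]

print('[' + fit('spam', 6) + ']')             # prints [spam  ]
print('[' + fit('spam and eggs', 6) + ']')    # prints [spam a]
\end{verbatim}\ecode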
1752
1753 There is another function, {\tt string.zfill}, which pads a numeric
1754 string on the left with zeros.  It understands plus and minus
1755 signs:
1756
1757 \bcode\begin{verbatim}
1758 >>> string.zfill('12', 5)
1759 '00012'
1760 >>> string.zfill('-3.14', 7)
1761 '-003.14'
1762 >>> string.zfill('3.14159265359', 5)
1763 '3.14159265359'
1764 >>>
1765 \end{verbatim}\ecode
1766
1767
1768 \chapter{Errors and Exceptions}
1769
1770 Until now error messages haven't been more than mentioned, but if you
1771 have tried out the examples you have probably seen some.  There are
1772 (at least) two distinguishable kinds of errors: {\em syntax\ errors}
1773 and {\em exceptions}.
1774
1775 \section{Syntax Errors}
1776
1777 Syntax errors, also known as parsing errors, are perhaps the most common
1778 kind of complaint you get while you are still learning Python:
1779
1780 \bcode\begin{verbatim}
1781 >>> while 1 print 'Hello world'
1782   File "<stdin>", line 1
1783     while 1 print 'Hello world'
1784                 ^
1785 SyntaxError: invalid syntax
1786 >>>
1787 \end{verbatim}\ecode
1788 %
1789 The parser repeats the offending line and displays a little `arrow'
1790 pointing at the earliest point in the line where the error was detected.
1791 The error is caused by (or at least detected at) the token
1792 {\em preceding}
1793 the arrow: in the example, the error is detected at the keyword
1794 {\tt print}, since a colon ({\tt :}) is missing before it.
1795 File name and line number are printed so you know where to look in case
1796 the input came from a script.
1797
1798 \section{Exceptions}
1799
1800 Even if a statement or expression is syntactically correct, it may
1801 cause an error when an attempt is made to execute it.
1802 Errors detected during execution are called {\em exceptions} and are
1803 not unconditionally fatal: you will soon learn how to handle them in
1804 Python programs.  Most exceptions are not handled by programs,
1805 however, and result in error messages as shown here:
1806
1807 \bcode\small\begin{verbatim}
1808 >>> 10 * (1/0)
1809 Traceback (innermost last):
1810   File "<stdin>", line 1
1811 ZeroDivisionError: integer division or modulo
1812 >>> 4 + spam*3
1813 Traceback (innermost last):
1814   File "<stdin>", line 1
1815 NameError: spam
1816 >>> '2' + 2
1817 Traceback (innermost last):
1818   File "<stdin>", line 1
1819 TypeError: illegal argument type for built-in operation
1820 >>>
1821 \end{verbatim}\ecode
1822 %
1823 The last line of the error message indicates what happened.
1824 Exceptions come in different types, and the type is printed as part of
1825 the message: the types in the example are
1826 {\tt ZeroDivisionError},
1827 {\tt NameError}
1828 and
1829 {\tt TypeError}.
1830 The string printed as the exception type is the name of the built-in
1831 exception that occurred.  This is true for all built-in
1832 exceptions, but need not be true for user-defined exceptions (although
1833 it is a useful convention).
1834 Standard exception names are built-in identifiers (not reserved
1835 keywords).
1836
1837 The rest of the line is a detail whose interpretation depends on the
1838 exception type.
1839
1840 The preceding part of the error message shows the context where the
1841 exception happened, in the form of a stack backtrace.
1842 In general the backtrace lists source lines; however,
1843 it will not display lines read from standard input.
1844
1845 The Python library reference manual lists the built-in exceptions and
1846 their meanings.
1847
1848 \section{Handling Exceptions}
1849
1850 It is possible to write programs that handle selected exceptions.
1851 Look at the following example, which prints a table of inverses of
1852 some floating point numbers:
1853
1854 \bcode\begin{verbatim}
1855 >>> numbers = [0.3333, 2.5, 0, 10]
1856 >>> for x in numbers:
1857 ...     print x,
1858 ...     try:
1859 ...         print 1.0 / x
1860 ...     except ZeroDivisionError:
1861 ...         print '*** has no inverse ***'
1862 ...
1863 0.3333 3.00030003
1864 2.5 0.4
1865 0 *** has no inverse ***
1866 10 0.1
1867 >>>
1868 \end{verbatim}\ecode
1869 %
1870 The {\tt try} statement works as follows.
1871 \begin{itemize}
1872 \item
1873 First, the
1874 {\em try\ clause}
1875 (the statement(s) between the {\tt try} and {\tt except} keywords) is
1876 executed.
1877 \item
1878 If no exception occurs, the
1879 {\em except\ clause}
1880 is skipped and execution of the {\tt try} statement is finished.
1881 \item
1882 If an exception occurs during execution of the try clause,
1883 the rest of the clause is skipped.  Then, if
1884 its type matches the exception named after the {\tt except} keyword,
1885 the except clause is executed,
1886 and execution continues after the {\tt try} statement.
1887 \item
1888 If an exception occurs which does not match the exception named in the
1889 except clause, it is passed on to outer try statements; if no handler is
1890 found, it is an
1891 {\em unhandled\ exception}
1892 and execution stops with a message as shown above.
1893 \end{itemize}
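The steps above can be traced in a small sketch with two nested
{\tt try} statements.  The function name {\tt describe} and the strings
it returns are made up for this illustration:

\bcode\begin{verbatim}
def describe(x):
    try:
        try:
            return 1.0 / x
        except TypeError:
            # Reached when x is not a number.
            return 'not a number'
    except ZeroDivisionError:
        # The inner except clause does not name ZeroDivisionError,
        # so the exception is passed on to this outer try statement.
        return 'no inverse'

print(describe(4))      # prints 0.25
print(describe('4'))    # prints not a number
print(describe(0))      # prints no inverse
\end{verbatim}\ecode
%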
1894 A {\tt try} statement may have more than one except clause, to specify
1895 handlers for different exceptions.
1896 At most one handler will be executed.
1897 Handlers only handle exceptions that occur in the corresponding try
1898 clause, not in other handlers of the same {\tt try} statement.
1899 An except clause may name multiple exceptions as a parenthesized list,
1900 e.g.:
1901
1902 \bcode\begin{verbatim}
1903 ... except (RuntimeError, TypeError, NameError):
1904 ...     pass
1905 \end{verbatim}\ecode
1906 %
1907 The last except clause may omit the exception name(s), to serve as a
1908 wildcard.
1909 Use this with extreme caution, since it is easy to mask a real
1910 programming error in this way!
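
As a rough sketch of how several except clauses combine, the
function below tries the handlers in order and runs at most one of
them.  All the names in it ({\tt classify} and the four small helper
functions) are invented for this illustration:

\bcode\begin{verbatim}
def divide_by_zero():
    return 1 / 0

def add_string_and_number():
    return '2' + 2

def index_past_end():
    return [1, 2, 3][10]

def harmless():
    return 42

def classify(function):
    try:
        function()
    except ZeroDivisionError:
        return 'division by zero'
    except (TypeError, NameError):
        return 'type or name problem'
    except:
        # Wildcard: catches everything not named above.
        return 'something else'
    return 'no exception'

print(classify(divide_by_zero))           # prints division by zero
print(classify(add_string_and_number))    # prints type or name problem
print(classify(index_past_end))           # prints something else
print(classify(harmless))                 # prints no exception
\end{verbatim}\ecode
%
Note how the wildcard clause silently swallows the indexing error; in
a real program that could easily hide a bug.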
1911
1912 When an exception occurs, it may have an associated value, also known as
1913 the exceptions's
1914 {\em argument}.
1915 The presence and type of the argument depend on the exception type.
1916 For exception types which have an argument, the except clause may
1917 specify a variable after the exception name (or list) to receive the
1918 argument's value, as follows:
1919
1920 \bcode\begin{verbatim}
1921 >>> try:
1922 ...     spam()
1923 ... except NameError, x:
1924 ...     print 'name', x, 'undefined'
1925 ...
1926 name spam undefined
1927 >>>
1928 \end{verbatim}\ecode
1929 %
1930 If an exception has an argument, it is printed as the last part
1931 (`detail') of the message for unhandled exceptions.
1932
1933 Exception handlers don't just handle exceptions if they occur
1934 immediately in the try clause, but also if they occur inside functions
1935 that are called (even indirectly) in the try clause.
1936 For example:
1937
1938 \bcode\begin{verbatim}
1939 >>> def this_fails():
1940 ...     x = 1/0
1941 ...
1942 >>> try:
1943 ...     this_fails()
1944 ... except ZeroDivisionError, detail:
1945 ...     print 'Handling run-time error:', detail
1946 ...
1947 Handling run-time error: integer division or modulo
1948 >>>
1949 \end{verbatim}\ecode
1950
1951 \section{Raising Exceptions}
1952
1953 The {\tt raise} statement allows the programmer to force a specified
1954 exception to occur.
1955 For example:
1956
1957 \bcode\begin{verbatim}
1958 >>> raise NameError, 'HiThere'
1959 Traceback (innermost last):
1960   File "<stdin>", line 1
1961 NameError: HiThere
1962 >>>
1963 \end{verbatim}\ecode
1964 %
1965 The first argument to {\tt raise} names the exception to be raised.
1966 The optional second argument specifies the exception's argument.
1967
1968 \section{User-defined Exceptions}
1969
1970 Programs may name their own exceptions by assigning a string to a
1971 variable.
1972 For example:
1973
1974 \bcode\begin{verbatim}
1975 >>> my_exc = 'my_exc'
1976 >>> try:
1977 ...     raise my_exc, 2*2
1978 ... except my_exc, val:
1979 ...     print 'My exception occurred, value:', val
1980 ...
1981 My exception occurred, value: 4
1982 >>> raise my_exc, 1
1983 Traceback (innermost last):
1984   File "<stdin>", line 1
1985 my_exc: 1
1986 >>>
1987 \end{verbatim}\ecode
1988 %
1989 Many standard modules use this to report errors that may occur in
1990 functions they define.
1991
1992 \section{Defining Clean-up Actions}
1993
1994 The {\tt try} statement has another optional clause which is intended to
1995 define clean-up actions that must be executed under all circumstances.
1996 For example:
1997
1998 \bcode\begin{verbatim}
1999 >>> try:
2000 ...     raise KeyboardInterrupt
2001 ... finally:
2002 ...     print 'Goodbye, world!'
2003 ...
2004 Goodbye, world!
2005 Traceback (innermost last):
2006   File "<stdin>", line 2
2007 KeyboardInterrupt
2008 >>>
2009 \end{verbatim}\ecode
2010 %
2011 A {\tt finally} clause is executed whether or not an exception has
2012 occurred in the {\tt try} clause.  When an exception has occurred, it
2013 is re-raised after the {\tt finally} clause is executed.  The
2014 {\tt finally} clause is also executed ``on the way out'' when the
2015 {\tt try} statement is left via a {\tt break} or {\tt return}
2016 statement.
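
The ``on the way out'' behaviour can be illustrated with a sketch in
which the {\tt try} clause is left via {\tt return}.  The names
\verb\first_positive\ and {\tt log} are hypothetical, chosen only for
this example:

\bcode\begin{verbatim}
log = []

def first_positive(numbers):
    for n in numbers:
        try:
            if n > 0:
                return n      # leaves the try statement via return
        finally:
            log.append(n)     # still executed, even on the return
    return None

result = first_positive([-2, 0, 7, 9])
print(result)    # prints 7
print(log)       # prints [-2, 0, 7]
\end{verbatim}\ecode
%
The value 9 never reaches the log because the function has already
returned by then.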
2017
2018 A {\tt try} statement must either have one or more {\tt except}
2019 clauses or one {\tt finally} clause, but not both.
2020
2021
2022 \chapter{Classes}
2023
2024 Python's class mechanism adds classes to the language with a minimum
2025 of new syntax and semantics.  It is a mixture of the class mechanisms
2026 found in \Cpp{} and Modula-3.  As is true for modules, classes in Python
2027 do not put an absolute barrier between definition and user, but rather
2028 rely on the politeness of the user not to ``break into the
2029 definition.''  The most important features of classes are retained
2030 with full power, however: the class inheritance mechanism allows
2031 multiple base classes, a derived class can override any methods of its
2032 base class(es), and a method can call the method of a base class with the
2033 same name.  Objects can contain an arbitrary amount of private data.
2034
2035 In \Cpp{} terminology, all class members (including the data members) are
2036 {\em public}, and all member functions are {\em virtual}.  There are
2037 no special constructors or destructors.  As in Modula-3, there are no
2038 shorthands for referencing the object's members from its methods: the
2039 method function is declared with an explicit first argument
2040 representing the object, which is provided implicitly by the call.  As
2041 in Smalltalk, classes themselves are objects, albeit in the wider
2042 sense of the word: in Python, all data types are objects.  This
2043 provides semantics for importing and renaming.  But, just like in \Cpp{}
2044 or Modula-3, built-in types cannot be used as base classes for
2045 extension by the user.  Also, like in \Cpp{} but unlike in Modula-3, most
2046 built-in operators with special syntax (arithmetic operators,
2047 subscripting etc.) can be redefined for class instances.
2048
2049
2050 \section{A word about terminology}
2051
2052 Lacking universally accepted terminology to talk about classes, I'll
2053 make occasional use of Smalltalk and \Cpp{} terms.  (I'd use Modula-3
2054 terms, since its object-oriented semantics are closer to those of
2055 Python than those of \Cpp{}, but I expect that few readers have heard of it...)
2056
2057 I also have to warn you that there's a terminological pitfall for
2058 object-oriented readers: the word ``object'' in Python does not
2059 necessarily mean a class instance.  Like \Cpp{} and Modula-3, and unlike
2060 Smalltalk, not all types in Python are classes: the basic built-in
2061 types like integers and lists aren't, and even somewhat more exotic
2062 types like files aren't.  However, {\em all} Python types share a little
2063 bit of common semantics that is best described by using the word
2064 object.
2065
2066 Objects have individuality, and multiple names (in multiple scopes)
2067 can be bound to the same object.  This is known as aliasing in other
2068 languages.  This is usually not appreciated on a first glance at
2069 Python, and can be safely ignored when dealing with immutable basic
2070 types (numbers, strings, tuples).  However, aliasing has an
2071 (intended!) effect on the semantics of Python code involving mutable
2072 objects such as lists, dictionaries, and most types representing
2073 entities outside the program (files, windows, etc.).  This is usually
2074 used to the benefit of the program, since aliases behave like pointers
2075 in some respects.  For example, passing an object is cheap since only
2076 a pointer is passed by the implementation; and if a function modifies
2077 an object passed as an argument, the caller will see the change --- this
2078 obviates the need for two different argument passing mechanisms as in
2079 Pascal.
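
A minimal sketch of the difference between mutable and immutable
objects under aliasing follows; the function name
\verb\append_twice\ and the variable names are made up for this
illustration:

\bcode\begin{verbatim}
def append_twice(items, value):
    # Modifies the list object the caller passed in; nothing is copied.
    items.append(value)
    items.append(value)

a = [1, 2]
b = a                  # b is a second name for the same list object
append_twice(b, 3)
print(a)               # prints [1, 2, 3, 3]

n = 10
m = n
m = m + 1              # rebinds m to a new number; n is unaffected
print(n)               # prints 10
\end{verbatim}\ecode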
2080
2081
2082 \section{Python scopes and name spaces}
2083
2084 Before introducing classes, I first have to tell you something about
2085 Python's scope rules.  Class definitions play some neat tricks with
2086 name spaces, and you need to know how scopes and name spaces work to
2087 fully understand what's going on.  Incidentally, knowledge about this
2088 subject is useful for any advanced Python programmer.
2089
2090 Let's begin with some definitions.
2091
2092 A {\em name space} is a mapping from names to objects.  Most name
2093 spaces are currently implemented as Python dictionaries, but that's
2094 normally not noticeable in any way (except for performance), and it
2095 may change in the future.  Examples of name spaces are: the set of
2096 built-in names (functions such as \verb\abs()\, and built-in exception
2097 names); the global names in a module; and the local names in a
2098 function invocation.  In a sense the set of attributes of an object
2099 also forms a name space.  The important thing to know about name
2100 spaces is that there is absolutely no relation between names in
2101 different name spaces; for instance, two different modules may both
2102 define a function ``maximize'' without confusion --- users of the
2103 modules must prefix it with the module name.
2104
2105 By the way, I use the word {\em attribute} for any name following a
2106 dot --- for example, in the expression \verb\z.real\, \verb\real\ is
2107 an attribute of the object \verb\z\.  Strictly speaking, references to
2108 names in modules are attribute references: in the expression
2109 \verb\modname.funcname\, \verb\modname\ is a module object and
2110 \verb\funcname\ is an attribute of it.  In this case there happens to
2111 be a straightforward mapping between the module's attributes and the
2112 global names defined in the module: they share the same name space!%
2113 \footnote{
2114         Except for one thing.  Module objects have a secret read-only
2115         attribute called {\tt \_\_dict\_\_} which returns the dictionary
2116         used to implement the module's name space; the name
2117         {\tt \_\_dict\_\_} is an attribute but not a global name.
2118         Obviously, using this violates the abstraction of name space
2119         implementation, and should be restricted to things like
2120         post-mortem debuggers...
2121 }
2122
2123 Attributes may be read-only or writable.  In the latter case,
2124 assignment to attributes is possible.  Module attributes are writable:
2125 you can write \verb\modname.the_answer = 42\.  Writable attributes may
2126 also be deleted with the {\tt del} statement, e.g.
2127 \verb\del modname.the_answer\.
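
As a small sketch, here is the same thing with the standard module
{\tt string} standing in for {\tt modname}.  That choice is arbitrary
and made only for this illustration; adding names to library modules
is rarely a good idea in real programs:

\bcode\begin{verbatim}
import string

string.the_answer = 42                # adds a new module attribute
print(string.the_answer)              # prints 42
del string.the_answer                 # removes the attribute again
if not hasattr(string, 'the_answer'):
    print('the_answer is gone')
\end{verbatim}\ecode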
2128
2129 Name spaces are created at different moments and have different
2130 lifetimes.  The name space containing the built-in names is created
2131 when the Python interpreter starts up, and is never deleted.  The
2132 global name space for a module is created when the module definition
2133 is read in; normally, module name spaces also last until the
2134 interpreter quits.  The statements executed by the top-level
2135 invocation of the interpreter, either read from a script file or
2136 interactively, are considered part of a module called \verb\__main__\,
2137 so they have their own global name space.  (The built-in names
2138 actually also live in a module; this is called \verb\__builtin__\.)
2139
2140 The local name space for a function is created when the function is
2141 called, and deleted when the function returns or raises an exception
2142 that is not handled within the function.  (Actually, forgetting would
2143 be a better way to describe what happens.)  Of course,
2144 recursive invocations each have their own local name space.
2145
2146 A {\em scope} is a textual region of a Python program where a name space
2147 is directly accessible.  ``Directly accessible'' here means that an
2148 unqualified reference to a name attempts to find the name in the name
2149 space.
2150
2151 Although scopes are determined statically, they are used dynamically.
2152 At any time during execution, exactly three nested scopes are in use
2153 (i.e., exactly three name spaces are directly accessible): the
2154 innermost scope, which is searched first, contains the local names;
2155 the middle scope, searched next, contains the current module's global
2156 names; and the outermost scope (searched last) is the name space
2157 containing built-in names.
2158
2159 Usually, the local scope references the local names of the (textually)
2160 current function.  Outside of functions, the local scope references
2161 the same name space as the global scope: the module's name space.
2162 Class definitions place yet another name space in the local scope.
2163
2164 It is important to realize that scopes are determined textually: the
2165 global scope of a function defined in a module is that module's name
2166 space, no matter from where or by what alias the function is called.
2167 On the other hand, the actual search for names is done dynamically, at
2168 run time --- however, the language definition is evolving towards
2169 static name resolution, at ``compile'' time, so don't rely on dynamic
2170 name resolution!  (In fact, local variables are already determined
2171 statically.)
2172
2173 A special quirk of Python is that assignments always go into the
2174 innermost scope.  Assignments do not copy data --- they just
2175 bind names to objects.  The same is true for deletions: the statement
2176 \verb\del x\ removes the binding of x from the name space referenced by the
2177 local scope.  In fact, all operations that introduce new names use the
2178 local scope: in particular, import statements and function definitions
2179 bind the module or function name in the local scope.  (The
2180 \verb\global\ statement can be used to indicate that particular
2181 variables live in the global scope.)
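
The following sketch shows the three scopes and the effect of the
\verb\global\ statement.  All names in it ({\tt count},
\verb\set_local\, \verb\increment_global\ and \verb\length_of\) are
invented for this illustration:

\bcode\begin{verbatim}
count = 0                 # a global name in this module

def set_local():
    count = 100           # assignment binds a new local name
    return count

def increment_global():
    global count          # count now means the module's global name
    count = count + 1
    return count

def length_of(items):
    # items is local; len is found last, among the built-in names
    return len(items)

set_local()
after_local = count       # the global is untouched
increment_global()
after_global = count      # the global has changed
print(after_local)        # prints 0
print(after_global)       # prints 1
\end{verbatim}\ecode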
2182
2183
2184 \section{A first look at classes}
2185
2186 Classes introduce a little bit of new syntax, three new object types,
2187 and some new semantics.
2188
2189
2190 \subsection{Class definition syntax}
2191
2192 The simplest form of class definition looks like this:
2193
2194 \begin{verbatim}
2195         class ClassName:
2196                 <statement-1>
2197                 .
2198                 .
2199                 .
2200                 <statement-N>
2201 \end{verbatim}
2202
2203 Class definitions, like function definitions (\verb\def\ statements)
2204 must be executed before they have any effect.  (You could conceivably
2205 place a class definition in a branch of an \verb\if\ statement, or
2206 inside a function.)
2207
2208 In practice, the statements inside a class definition will usually be
2209 function definitions, but other statements are allowed, and sometimes
2210 useful --- we'll come back to this later.  The function definitions
2211 inside a class normally have a peculiar form of argument list,
2212 dictated by the calling conventions for methods --- again, this is
2213 explained later.
2214
2215 When a class definition is entered, a new name space is created, and
2216 used as the local scope --- thus, all assignments to local variables
2217 go into this new name space.  In particular, function definitions bind
2218 the name of the new function here.
2219
2220 When a class definition is left normally (via the end), a {\em class
2221 object} is created.  This is basically a wrapper around the contents
2222 of the name space created by the class definition; we'll learn more
2223 about class objects in the next section.  The original local scope
2224 (the one in effect just before the class definition was entered) is
2225 reinstated, and the class object is bound here to the class name given in
2226 the class definition header (\verb\ClassName\ in the example).
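
To make this concrete, here is a sketch of a class whose body contains
ordinary assignments as well as a function definition.  The class name
{\tt Limits} and its attributes are hypothetical:

\bcode\begin{verbatim}
class Limits:
    low = 0
    high = low + 10        # low is found in the class's new name space
    def span(self):
        return Limits.high - Limits.low

# After the definition is left, the name Limits is bound to the class
# object, and the names assigned in the body are its attributes.
print(Limits.low)          # prints 0
print(Limits.high)         # prints 10
\end{verbatim}\ecode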
2227
2228
2229 \subsection{Class objects}
2230
2231 Class objects support two kinds of operations: attribute references
2232 and instantiation.
2233
2234 {\em Attribute references} use the standard syntax used for all
2235 attribute references in Python: \verb\obj.name\.  Valid attribute
2236 names are all the names that were in the class's name space when the
2237 class object was created.  So, if the class definition looked like
2238 this:
2239
2240 \begin{verbatim}
2241         class MyClass:
2242                 i = 12345
2243                 def f(x):
2244                         return 'hello world'
2245 \end{verbatim}
2246
2247 then \verb\MyClass.i\ and \verb\MyClass.f\ are valid attribute
2248 references, returning an integer and a function object, respectively.
2249 Class attributes can also be assigned to, so you can change the
2250 value of \verb\MyClass.i\ by assignment.
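
A short sketch of such an assignment, repeating the class from above;
the variable names {\tt before} and {\tt after} are made up here:

\bcode\begin{verbatim}
class MyClass:
    i = 12345
    def f(x):
        return 'hello world'

before = MyClass.i
MyClass.i = 54321          # rebinds the class attribute
after = MyClass.i
print(before)              # prints 12345
print(after)               # prints 54321
\end{verbatim}\ecode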
2251
2252 Class {\em instantiation} uses function notation.  Just pretend that
2253 the class object is a parameterless function that returns a new
2254 instance of the class.  For example (assuming the above class):
2255
2256 \begin{verbatim}
2257         x = MyClass()
2258 \end{verbatim}
2259
2260 creates a new {\em instance} of the class and assigns this object to
2261 the local variable \verb\x\.
2262
2263
2264 \subsection{Instance objects}
2265
2266 Now what can we do with instance objects?  The only operations
2267 understood by instance objects are attribute references.  There are
2268 two kinds of valid attribute names.
2269
2270 The first I'll call {\em data attributes}.  These correspond to
2271 ``instance variables'' in Smalltalk, and to ``data members'' in \Cpp{}.
2272 Data attributes need not be declared; like local variables, they
2273 spring into existence when they are first assigned to.  For example,
2274 if \verb\x\ is the instance of \verb\MyClass\ created above, the
2275 following piece of code will print the value 16, without leaving a
2276 trace:
2277
2278 \begin{verbatim}
2279         x.counter = 1
2280         while x.counter < 10:
2281                 x.counter = x.counter * 2
2282         print x.counter
2283         del x.counter
2284 \end{verbatim}
2285
2286 The second kind of attribute reference understood by instance objects
2287 is the {\em method}.  A method is a function that ``belongs to'' an
2288 object.  (In Python, the term method is not unique to class instances:
2289 other object types can have methods as well, e.g., list objects have
2290 methods called append, insert, remove, sort, and so on.  However,
2291 below, we'll use the term method exclusively to mean methods of class
2292 instance objects, unless explicitly stated otherwise.)
2293
2294 Valid method names of an instance object depend on its class.  By
2295 definition, all attributes of a class that are (user-defined) function
2296 objects define corresponding methods of its instances.  So in our
2297 example, \verb\x.f\ is a valid method reference, since
2298 \verb\MyClass.f\ is a function, but \verb\x.i\ is not, since
2299 \verb\MyClass.i\ is not.  But \verb\x.f\ is not the
2300 same thing as \verb\MyClass.f\ --- it is a {\em method object}, not a
2301 function object.
2302
2303
2304 \subsection{Method objects}
2305
2306 Usually, a method is called immediately, e.g.:
2307
2308 \begin{verbatim}
2309         x.f()
2310 \end{verbatim}
2311
2312 In our example, this will return the string \verb\'hello world'\.
2313 However, it is not necessary to call a method right away: \verb\x.f\
2314 is a method object, and can be stored away and called at a later
2315 moment, for example:
2316
2317 \begin{verbatim}
2318         xf = x.f
2319         while 1:
2320                 print xf()
2321 \end{verbatim}
2322
2323 will continue to print \verb\hello world\ until the end of time.
2324
2325 What exactly happens when a method is called?  You may have noticed
2326 that \verb\x.f()\ was called without an argument above, even though
2327 the function definition for \verb\f\ specified an argument.  What
2328 happened to the argument?  Surely Python raises an exception when a
2329 function that requires an argument is called without any --- even if
2330 the argument isn't actually used...
2331
2332 Actually, you may have guessed the answer: the special thing about
2333 methods is that the object is passed as the first argument of the
2334 function.  In our example, the call \verb\x.f()\ is exactly equivalent
2335 to \verb\MyClass.f(x)\.  In general, calling a method with a list of
2336 {\em n} arguments is equivalent to calling the corresponding function
2337 with an argument list that is created by inserting the method's object
2338 before the first argument.
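
The equivalence also holds for methods with more arguments, as this
sketch suggests.  The class {\tt Greeter} and its method are invented
for the purpose:

\bcode\begin{verbatim}
class Greeter:
    def greet(self, greeting, name):
        return greeting + ', ' + name + '!'

g = Greeter()
via_method = g.greet('hello', 'world')
via_function = Greeter.greet(g, 'hello', 'world')
print(via_method)      # prints hello, world!
print(via_function)    # prints hello, world!
\end{verbatim}\ecode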
2339
2340 If you still don't understand how methods work, a look at the
2341 implementation can perhaps clarify matters.  When an instance
2342 attribute is referenced that isn't a data attribute, its class is
2343 searched.  If the name denotes a valid class attribute that is a
2344 function object, a method object is created by packing (pointers to)
2345 the instance object and the function object just found together in an
2346 abstract object: this is the method object.  When the method object is
2347 called with an argument list, it is unpacked again, a new argument
2348 list is constructed from the instance object and the original argument
2349 list, and the function object is called with this new argument list.
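
The packing and unpacking can be imitated in Python itself.  The class
{\tt MethodSketch} below is only a hand-written model, not the
interpreter's actual implementation, and to keep it short it assumes a
method that takes exactly one argument besides the instance; the class
{\tt Point} is likewise made up:

\bcode\begin{verbatim}
class MethodSketch:
    # A stand-in for a method object: it only remembers the
    # instance and the function it was built from.
    def __init__(self, instance, function):
        self.instance = instance
        self.function = function
    def call(self, argument):
        # New argument list: the instance first, then the argument.
        return self.function(self.instance, argument)

class Point:
    def shifted(self, dx):
        return self.x + dx

p = Point()
p.x = 10
m = MethodSketch(p, Point.shifted)
print(m.call(5))       # prints 15
print(p.shifted(5))    # prints 15, via the real method object
\end{verbatim}\ecode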
2350
2351
2352 \section{Random remarks}
2353
2354
2355 [These should perhaps be placed more carefully...]
2356
2357
2358 Data attributes override method attributes with the same name; to
2359 avoid accidental name conflicts, which may cause hard-to-find bugs in
2360 large programs, it is wise to use some kind of convention that
2361 minimizes the chance of conflicts, e.g., capitalize method names,
2362 prefix data attribute names with a small unique string (perhaps just
2363 an underscore), or use verbs for methods and nouns for data attributes.
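
A sketch of such a conflict, using a hypothetical class
{\tt Account}:

\bcode\begin{verbatim}
class Account:
    def balance(self):
        return 0

a = Account()
before = a.balance()          # calls the method
a.balance = 50                # data attribute with the method's name
after = a.balance             # now yields the data attribute
other = Account().balance()   # other instances are unaffected
print(before)                 # prints 0
print(after)                  # prints 50
print(other)                  # prints 0
\end{verbatim}\ecode
%
After the assignment, \verb\a.balance()\ no longer works, since an
integer cannot be called.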
2364
2365
2366 Data attributes may be referenced by methods as well as by ordinary
2367 users (``clients'') of an object.  In other words, classes are not
2368 usable to implement pure abstract data types.  In fact, nothing in
2369 Python makes it possible to enforce data hiding --- it is all based
2370 upon convention.  (On the other hand, the Python implementation,
2371 written in C, can completely hide implementation details and control
2372 access to an object if necessary; this can be used by extensions to
2373 Python written in C.)
2374
2375
2376 Clients should use data attributes with care --- clients may mess up
2377 invariants maintained by the methods by stamping on their data
2378 attributes.  Note that clients may add data attributes of their own to
2379 an instance object without affecting the validity of the methods, as
2380 long as name conflicts are avoided --- again, a naming convention can
2381 save a lot of headaches here.
2382
2383
2384 There is no shorthand for referencing data attributes (or other
2385 methods!) from within methods.  I find that this actually increases
2386 the readability of methods: there is no chance of confusing local
2387 variables and instance variables when glancing through a method.
2388
2389
2390 Conventionally, the first argument of methods is often called
2391 \verb\self\.  This is nothing more than a convention: the name
2392 \verb\self\ has absolutely no special meaning to Python.  (Note,
2393 however, that by not following the convention your code may be less
2394 readable by other Python programmers, and it is also conceivable that
2395 a {\em class browser} program be written which relies upon such a
2396 convention.)
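
To see that the name really is arbitrary, consider this sketch (the
class {\tt Counter} is made up, and the name {\tt this} is used only to
make the point):

\bcode\begin{verbatim}
class Counter:
    def reset(this):              # unconventional, but it works
        this.count = 0
    def increment(self):          # the conventional name
        self.count = self.count + 1

c = Counter()
c.reset()
c.increment()
c.increment()
print(c.count)                    # prints 2
\end{verbatim}\ecode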
2397
2398
2399 Any function object that is a class attribute defines a method for
2400 instances of that class.  It is not necessary that the function
2401 definition is textually enclosed in the class definition: assigning a
2402 function object to a local variable in the class is also ok.  For
2403 example:
2404
2405 \begin{verbatim}
2406         # Function defined outside the class
2407         def f1(self, x, y):
2408                 return min(x, x+y)
2409
2410         class C:
2411                 f = f1
2412                 def g(self):
2413                         return 'hello world'
2414                 h = g
2415 \end{verbatim}
2416
2417 Now \verb\f\, \verb\g\ and \verb\h\ are all attributes of class
2418 \verb\C\ that refer to function objects, and consequently they are all
2419 methods of instances of \verb\C\ --- \verb\h\ being exactly equivalent
2420 to \verb\g\.  Note that this practice usually only serves to confuse
2421 the reader of a program.
2422
2423
2424 Methods may call other methods by using method attributes of the
2425 \verb\self\ argument, e.g.:
2426
2427 \begin{verbatim}
2428         class Bag:
2429                 def empty(self):
2430                         self.data = []
2431                 def add(self, x):
2432                         self.data.append(x)
2433                 def addtwice(self, x):
2434                         self.add(x)
2435                         self.add(x)
2436 \end{verbatim}
2437
2438
2439 The instantiation operation (``calling'' a class object) creates an
2440 empty object.  Many classes like to create objects in a known initial
2441 state.  Therefore a class may define a special method named
2442 \verb\__init__\, like this:
2443
2444 \begin{verbatim}
2445                 def __init__(self):
2446                         self.empty()
2447 \end{verbatim}
2448
2449 When a class defines an \verb\__init__\ method, class instantiation
2450 automatically invokes \verb\__init__\ for the newly-created class
2451 instance.  So in the \verb\Bag\ example, a new and initialized instance
2452 can be obtained by:
2453
2454 \begin{verbatim}
2455         x = Bag()
2456 \end{verbatim}
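
Put together, the complete class might look as follows.  The fragments
are the ones shown above; only the variable name {\tt b} and the sample
values are new:

\bcode\begin{verbatim}
class Bag:
    def __init__(self):
        self.empty()              # start out in a known state
    def empty(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)

b = Bag()                         # __init__ is invoked automatically
b.add('spam')
b.addtwice('eggs')
print(b.data)                     # prints ['spam', 'eggs', 'eggs']
\end{verbatim}\ecode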
2457
2458 Of course, the \verb\__init__\ method may have arguments for greater
2459 flexibility.  In that case, arguments given to the class instantiation
2460 operator are passed on to \verb\__init__\.  For example,
2461
2462 \bcode\begin{verbatim}
2463 >>> class Complex:
2464 ...     def __init__(self, realpart, imagpart):
2465 ...         self.r = realpart
2466 ...         self.i = imagpart
2467 ...
2468 >>> x = Complex(3.0,-4.5)
2469 >>> x.r, x.i
2470 (3.0, -4.5)
2471 >>>
2472 \end{verbatim}\ecode
2473 %
2474 Methods may reference global names in the same way as ordinary
2475 functions.  The global scope associated with a method is the module
2476 containing the class definition.  (The class itself is never used as a
2477 global scope!)  While one rarely encounters a good reason for using
2478 global data in a method, there are many legitimate uses of the global
2479 scope: for one thing, functions and modules imported into the global
2480 scope can be used by methods, as well as functions and classes defined
2481 in it.  Usually, the class containing the method is itself defined in
2482 this global scope, and in the next section we'll find some good
2483 reasons why a method would want to reference its own class!
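
The following sketch illustrates these points, assuming a current
Python 3 interpreter.  The names \verb\describe\ and \verb\Point\ are
hypothetical and serve only as illustration: the methods use an imported
module, an ordinary function, and the class itself, all found in the
module's global scope.

```python
import math                     # imported into the module's global scope

def describe(value):            # an ordinary function in the same module
    return 'value=%s' % value

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def norm(self):
        # 'math' is looked up in the global scope of the module
        return math.sqrt(self.x * self.x + self.y * self.y)
    def report(self):
        # so is the function 'describe'
        return describe(self.norm())
    def mirrored(self):
        # and so is the name of the class itself
        return Point(-self.x, -self.y)

p = Point(3, 4)
text = p.report()               # 'value=5.0'
q = p.mirrored()
```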
2484
2485
2486 \section{Inheritance}
2487
2488 Of course, a language feature would not be worthy of the name ``class''
2489 without supporting inheritance.  The syntax for a derived class
2490 definition looks as follows:
2491
2492 \begin{verbatim}
2493         class DerivedClassName(BaseClassName):
2494                 <statement-1>
2495                 .
2496                 .
2497                 .
2498                 <statement-N>
2499 \end{verbatim}
2500
2501 The name \verb\BaseClassName\ must be defined in a scope containing
2502 the derived class definition.  Instead of a base class name, an
2503 expression is also allowed.  This is useful when the base class is
2504 defined in another module, e.g.,
2505
2506 \begin{verbatim}
2507         class DerivedClassName(modname.BaseClassName):
2508 \end{verbatim}
2509
2510 Execution of a derived class definition proceeds the same as for a
2511 base class.  When the class object is constructed, the base class is
remembered.  This is used for resolving attribute references: if a
requested attribute is not found in the class, the search continues in
the base class.  This rule is applied recursively if the base class
itself is derived from some other class.
2516
2517 There's nothing special about instantiation of derived classes:
2518 \verb\DerivedClassName()\ creates a new instance of the class.  Method
2519 references are resolved as follows: the corresponding class attribute
is searched for, descending the chain of base classes if necessary,
2521 and the method reference is valid if this yields a function object.
2522
2523 Derived classes may override methods of their base classes.  Because
2524 methods have no special privileges when calling other methods of the
2525 same object, a method of a base class that calls another method
defined in the same base class may in fact end up calling a method of
2527 a derived class that overrides it.  (For \Cpp{} programmers: all methods
2528 in Python are ``virtual functions''.)
2529
2530 An overriding method in a derived class may in fact want to extend
2531 rather than simply replace the base class method of the same name.
2532 There is a simple way to call the base class method directly: just
2533 call \verb\BaseClassName.methodname(self, arguments)\.  This is
2534 occasionally useful to clients as well.  (Note that this only works if
2535 the base class is defined or imported directly in the global scope.)
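
A minimal sketch of both effects follows, assuming a current Python 3
interpreter.  The class \verb\CountingBag\ is a hypothetical name
invented for this example: it extends two methods of a simplified
\verb\Bag\ by calling the base class versions directly, and its
\verb\add\ is picked up by the inherited \verb\addtwice\.

```python
class Bag:
    def __init__(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)             # resolved through self, so a derived add() is used
        self.add(x)

class CountingBag(Bag):
    def __init__(self):
        Bag.__init__(self)      # extend, rather than replace, the base method
        self.count = 0
    def add(self, x):
        Bag.add(self, x)        # call the base class method directly
        self.count = self.count + 1

b = CountingBag()
b.addtwice('spam')              # Bag.addtwice ends up calling CountingBag.add twice
```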
2536
2537
2538 \subsection{Multiple inheritance}
2539
2540 Python supports a limited form of multiple inheritance as well.  A
2541 class definition with multiple base classes looks as follows:
2542
2543 \begin{verbatim}
2544         class DerivedClassName(Base1, Base2, Base3):
2545                 <statement-1>
2546                 .
2547                 .
2548                 .
2549                 <statement-N>
2550 \end{verbatim}
2551
2552 The only rule necessary to explain the semantics is the resolution
2553 rule used for class attribute references.  This is depth-first,
left-to-right.  Thus, if an attribute is not found in
\verb\DerivedClassName\, it is searched for in \verb\Base1\, then
(recursively) in the base classes of \verb\Base1\, and only if it is
not found there is it searched for in \verb\Base2\, and so on.
2558
2559 (To some people breadth first---searching \verb\Base2\ and
2560 \verb\Base3\ before the base classes of \verb\Base1\---looks more
2561 natural.  However, this would require you to know whether a particular
2562 attribute of \verb\Base1\ is actually defined in \verb\Base1\ or in
2563 one of its base classes before you can figure out the consequences of
2564 a name conflict with an attribute of \verb\Base2\.  The depth-first
rule makes no difference between direct and inherited attributes of
2566 \verb\Base1\.)
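
The resolution rule can be tried out with the sketch below.  The class
names are hypothetical.  Note the assumption: a current Python 3
interpreter uses a refined ordering (the C3 linearization) instead of
plain depth-first search, but for a hierarchy like this one, where the
base classes share no common user-defined ancestor, both give the same
answer.

```python
class Root:
    def who(self):
        return 'Root'

class Base1(Root):              # inherits who() from Root
    pass

class Base2:
    def who(self):
        return 'Base2'

class Derived(Base1, Base2):
    pass

# who() is found in Root, a base class of Base1, before Base2 is tried.
answer = Derived().who()
```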
2567
2568 It is clear that indiscriminate use of multiple inheritance is a
2569 maintenance nightmare, given the reliance in Python on conventions to
2570 avoid accidental name conflicts.  A well-known problem with multiple
2571 inheritance is a class derived from two classes that happen to have a
2572 common base class.  While it is easy enough to figure out what happens
2573 in this case (the instance will have a single copy of ``instance
2574 variables'' or data attributes used by the common base class), it is
2575 not clear that these semantics are in any way useful.
2576
2577
2578 \section{Odds and ends}
2579
2580 Sometimes it is useful to have a data type similar to the Pascal
2581 ``record'' or C ``struct'', bundling together a couple of named data
2582 items.  An empty class definition will do nicely, e.g.:
2583
2584 \begin{verbatim}
2585         class Employee:
2586                 pass
2587
2588         john = Employee() # Create an empty employee record
2589
2590         # Fill the fields of the record
2591         john.name = 'John Doe'
2592         john.dept = 'computer lab'
2593         john.salary = 1000
2594 \end{verbatim}
2595
2596
2597 A piece of Python code that expects a particular abstract data type
can often be passed an instance of a class that emulates the methods
of that data type instead.  For instance, if you have a function that
formats some data from a file object, you can define a class with
methods \verb\read()\ and \verb\readline()\ that get the data from a
string buffer instead, and pass an instance of it as an argument.
(Unfortunately, this
2603 technique has its limitations: a class can't define operations that
2604 are accessed by special syntax such as sequence subscripting or
2605 arithmetic operators, and assigning such a ``pseudo-file'' to
2606 \verb\sys.stdin\ will not cause the interpreter to read further input
2607 from it.)
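
As an illustration, here is a minimal sketch of such a pseudo-file.  The
class \verb\StringFile\ and the function \verb\count_lines\ are
hypothetical names invented for this example, and the code assumes a
current Python 3 interpreter.

```python
class StringFile:
    """Hypothetical pseudo-file that reads from a string buffer."""
    def __init__(self, text):
        self.text = text
        self.pos = 0
    def read(self):
        # Return everything that has not been read yet.
        rest = self.text[self.pos:]
        self.pos = len(self.text)
        return rest
    def readline(self):
        # Return the next line including its newline, or '' at end of data.
        end = self.text.find('\n', self.pos)
        if end < 0:
            end = len(self.text)
        else:
            end = end + 1
        line = self.text[self.pos:end]
        self.pos = end
        return line

def count_lines(f):
    """Works with any object that has a readline() method."""
    n = 0
    while f.readline():
        n = n + 1
    return n

total = count_lines(StringFile('one\ntwo\nthree\n'))
```

The function never checks the type of its argument; it only relies on
the \verb\readline()\ method being present.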
2608
2609
Instance method objects have attributes, too: \verb\m.im_self\ is the
instance object to which the method is bound, and \verb\m.im_func\ is
the function object corresponding to the method.
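
In current Python 3 releases the same information is available under
the names \verb\__self__\ and \verb\__func__\.  The sketch below uses
those spellings, which is an assumption about the interpreter at hand
rather than part of the release described here.

```python
class Bag:
    def add(self, x):
        return x

b = Bag()
m = b.add                                # a bound (instance) method object
same_instance = m.__self__ is b          # corresponds to m.im_self
same_function = m.__func__ is Bag.add    # corresponds to m.im_func
```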
2613
2614
2615 \chapter{Recent Additions}
2616
2617 Python is an evolving language.  Since this tutorial was last
2618 thoroughly revised, several new features have been added to the
2619 language.  While ideally I should revise the tutorial to incorporate
2620 them in the mainline of the text, lack of time currently requires me
2621 to take a more modest approach.  In this chapter I will briefly list the
2622 most important improvements to the language and how you can use them
2623 to your benefit.
2624
2625 \section{The Last Printed Expression}
2626
2627 In interactive mode, the last printed expression is assigned to the
2628 variable \code{_}.  This means that when you are using Python as a
2629 desk calculator, it is somewhat easier to continue calculations, for
2630 example:
2631
2632 \begin{verbatim}
2633         >>> tax = 17.5 / 100
2634         >>> price = 3.50
2635         >>> price * tax
2636         0.6125
2637         >>> price + _
2638         4.1125
2639         >>> round(_, 2)
2640         4.11
2641         >>>
2642 \end{verbatim}
2643
2644 For reasons too embarrassing to explain, this variable is implemented
2645 as a built-in (living in the module \code{__builtin__}), so it should
2646 be treated as read-only by the user.  I.e. don't explicitly assign a
2647 value to it --- you would create an independent local variable with
2648 the same name masking the built-in variable with its magic behavior.
2649
2650 \section{String Literals}
2651
2652 \subsection{Double Quotes}
2653
2654 Python can now also use double quotes to surround string literals,
2655 e.g. \verb\"this doesn't hurt a bit"\.  There is no semantic
2656 difference between strings surrounded by single or double quotes.
2657
2658 \subsection{Continuation Of String Literals}
2659
2660 String literals can span multiple lines by escaping newlines with
2661 backslashes, e.g.
2662
2663 \begin{verbatim}
2664         hello = "This is a rather long string containing\n\
2665         several lines of text just as you would do in C.\n\
2666             Note that whitespace at the beginning of the line is\
2667          significant.\n"
2668         print hello
2669 \end{verbatim}
2670
2671 which would print the following:
2672 \begin{verbatim}
2673         This is a rather long string containing
2674         several lines of text just as you would do in C.
2675             Note that whitespace at the beginning of the line is significant.
2676 \end{verbatim}
2677
2678 \subsection{Triple-quoted strings}
2679
2680 In some cases, when you need to include really long strings (e.g.
2681 containing several paragraphs of informational text), it is annoying
2682 that you have to terminate each line with \verb@\n\@, especially if
2683 you would like to reformat the text occasionally with a powerful text
2684 editor like Emacs.  For such situations, ``triple-quoted'' strings can
2685 be used, e.g.
2686
2687 \begin{verbatim}
2688         hello = """
2689
2690             This string is bounded by triple double quotes (3 times ").
2691         Unescaped newlines in the string are retained, though \
2692         it is still possible\nto use all normal escape sequences.
2693
2694             Whitespace at the beginning of a line is
2695         significant.  If you need to include three opening quotes
2696         you have to escape at least one of them, e.g. \""".
2697
2698             This string ends in a newline.
2699         """
2700 \end{verbatim}
2701
2702 Triple-quoted strings can be surrounded by three single quotes as
2703 well, again without semantic difference.
2704
2705 \subsection{String Literal Juxtaposition}
2706
2707 One final twist: you can juxtapose multiple string literals.  Two or
2708 more adjacent string literals (but not arbitrary expressions!)
2709 separated only by whitespace will be concatenated (without intervening
2710 whitespace) into a single string object at compile time.  This makes
2711 it possible to continue a long string on the next line without
2712 sacrificing indentation or performance, unlike the use of the string
2713 concatenation operator \verb\+\ or the continuation of the literal
2714 itself on the next line (since leading whitespace is significant
2715 inside all types of string literals).  Note that this feature, like
2716 all string features except triple-quoted strings, is borrowed from
2717 Standard C.
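
A short sketch of juxtaposition, combined with parentheses so that the
literals can sit on separate lines.  The text of the message is made
up, and the code assumes a current Python 3 interpreter.

```python
usage = ("usage: frobnicate [-v] file ...\n"
         "  -v   print progress messages\n")

# The same value written as one long literal, for comparison.
same = "usage: frobnicate [-v] file ...\n  -v   print progress messages\n"
```

The indentation in front of the second literal is not part of the
string, because it lies outside the quotes.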
2718
2719 \section{The Formatting Operator}
2720
2721 \subsection{Basic Usage}
2722
2723 The chapter on output formatting is really out of date: there is now
2724 an almost complete interface to C-style printf formats.  This is done
2725 by overloading the modulo operator (\verb\%\) for a left operand
2726 which is a string, e.g.
2727
2728 \begin{verbatim}
2729         >>> import math
2730         >>> print 'The value of PI is approximately %5.3f.' % math.pi
2731         The value of PI is approximately 3.142.
2732         >>>
2733 \end{verbatim}
2734
If there is more than one format in the string, you pass a tuple as
the right operand, e.g.
2737
2738 \begin{verbatim}
2739         >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
2740         >>> for name, phone in table.items():
2741         ...     print '%-10s ==> %10d' % (name, phone)
2742         ...
2743         Jack       ==>       4098
2744         Dcab       ==>    8637678
2745         Sjoerd     ==>       4127
2746         >>>
2747 \end{verbatim}
2748
2749 Most formats work exactly as in C and require that you pass the proper
2750 type (however, if you don't you get an exception, not a core dump).
2751 The \verb\%s\ format is more relaxed: if the corresponding argument is
2752 not a string object, it is converted to string using the \verb\str()\
2753 built-in function.  Using \verb\*\ to pass the width or precision in
2754 as a separate (integer) argument is supported.  The C formats
2755 \verb\%n\ and \verb\%p\ are not supported.
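
The relaxed \verb\%s\ format and the \verb\*\ notation can be sketched
as follows, assuming a current Python 3 interpreter (where the
formatting operator behaves the same way for these cases).  The values
are arbitrary.

```python
import math

as_text = '%s and %s' % (42, [1, 2])   # non-strings are converted via str()
padded = '%*d' % (6, 42)               # the width 6 is taken from the tuple
rounded = '%.*f' % (3, math.pi)        # the precision 3 is taken from the tuple
```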
2756
2757 \subsection{Referencing Variables By Name}
2758
2759 If you have a really long format string that you don't want to split
2760 up, it would be nice if you could reference the variables to be
2761 formatted by name instead of by position.  This can be done by using
2762 an extension of C formats using the form \verb\%(name)format\, e.g.
2763
2764 \begin{verbatim}
2765         >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
2766         >>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
2767         Jack: 4098; Sjoerd: 4127; Dcab: 8637678
2768         >>>
2769 \end{verbatim}
2770
2771 This is particularly useful in combination with the new built-in
2772 \verb\vars()\ function, which returns a dictionary containing all
2773 local variables.
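
A minimal sketch of that combination, assuming a current Python 3
interpreter.  The function \verb\describe\ is a hypothetical name
invented for this example.

```python
def describe(name, phone):
    # vars() without an argument returns the local variables as a
    # dictionary, here with the keys 'name' and 'phone'.
    return '%(name)-10s ==> %(phone)10d' % vars()

line = describe('Jack', 4098)
```

The format string refers to the arguments by name, so their order in
the function header does not matter.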
2774
2775 \section{Optional Function Arguments}
2776
2777 It is now possible to define functions with a variable number of
2778 arguments.  There are two forms, which can be combined.
2779
2780 \subsection{Default Argument Values}
2781
2782 The most useful form is to specify a default value for one or more
arguments.  This creates a function that can be called with fewer
arguments than it is defined to accept, e.g.
2785
2786 \begin{verbatim}
2787         def ask_ok(prompt, retries = 4, complaint = 'Yes or no, please!'):
2788                 while 1:
2789                         ok = raw_input(prompt)
2790                         if ok in ('y', 'ye', 'yes'): return 1
2791                         if ok in ('n', 'no', 'nop', 'nope'): return 0
2792                         retries = retries - 1
2793                         if retries < 0: raise IOError, 'refusenik user'
2794                         print complaint
2795 \end{verbatim}
2796
2797 This function can be called either like this:
2798 \verb\ask_ok('Do you really want to quit?')\ or like this:
2799 \verb\ask_ok('OK to overwrite the file?', 2)\.
2800
2801 The default values are evaluated at the point of function definition
2802 in the {\em defining} scope, so that e.g.
2803
2804 \begin{verbatim}
2805         i = 5
2806         def f(arg = i): print arg
2807         i = 6
2808         f()
2809 \end{verbatim}
2810
2811 will print \verb\5\.
2812
2813 \subsection{Arbitrary Argument Lists}
2814
2815 It is also possible to specify that a function can be called with an
2816 arbitrary number of arguments.  These arguments will be wrapped up in
2817 a tuple.  Before the variable number of arguments, zero or more normal
2818 arguments may occur, e.g.
2819
2820 \begin{verbatim}
2821         def fprintf(file, format, *args):
2822                 file.write(format % args)
2823 \end{verbatim}
2824
2825 This feature may be combined with the previous, e.g.
2826
2827 \begin{verbatim}
2828         def but_is_it_useful(required, optional = None, *remains):
2829                 print "I don't know"
2830 \end{verbatim}
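
To see what ends up in the tuple, consider the following sketch.  It
assumes a current Python 3 interpreter and uses \verb\io.StringIO\ as
an in-memory stand-in for a real file.  The function \verb\collect\ is
a hypothetical name that simply returns its arguments.

```python
import io

def fprintf(file, format, *args):
    file.write(format % args)         # args is a tuple of the extra arguments

def collect(required, optional=None, *remains):
    return required, optional, remains

out = io.StringIO()
fprintf(out, '%s has %d items\n', 'bag', 3)
fprintf(out, 'no extra arguments\n')  # args is the empty tuple here
written = out.getvalue()

only_required = collect(1)            # (1, None, ())
with_extras = collect(1, 2, 3, 4)     # (1, 2, (3, 4))
```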
2831
2832 \section{Lambda And Functional Programming Tools}
2833
2834 \subsection{Lambda Forms}
2835
2836 By popular demand, a few features commonly found in functional
2837 programming languages and Lisp have been added to Python.  With the
2838 \verb\lambda\ keyword, small anonymous functions can be created.
2839 Here's a function that returns the sum of its two arguments:
2840 \verb\lambda a, b: a+b\.  Lambda forms can be used wherever function
2841 objects are required.  They are syntactically restricted to a single
2842 expression.  Semantically, they are just syntactic sugar for a normal
2843 function definition.  Like nested function definitions, lambda forms
2844 cannot reference variables from the containing scope, but this can be
2845 overcome through the judicious use of default argument values, e.g.
2846
2847 \begin{verbatim}
2848         def make_incrementor(n):
2849                 return lambda x, incr=n: x+incr
2850 \end{verbatim}
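
A short usage sketch, assuming a current Python 3 interpreter (the
default argument idiom works there unchanged).  The numbers are
arbitrary.

```python
def make_incrementor(n):
    return lambda x, incr=n: x + incr    # n is captured as a default value

add_five = make_incrementor(5)
add_ten = make_incrementor(10)

# The last call overrides the default, which shows that incr is an
# ordinary argument of the lambda form.
results = (add_five(1), add_ten(1), add_five(1, 100))
```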
2851
2852 \subsection{Map, Reduce and Filter}
2853
Three new built-in functions on sequences are good candidates for
being passed lambda forms.
2856
2857 \subsubsection{Map.}
2858
2859 \verb\map(function, sequence)\ calls \verb\function(item)\ for each of
2860 the sequence's items and returns a list of the return values.  For
2861 example, to compute some cubes:
2862
2863 \begin{verbatim}
2864         >>> map(lambda x: x*x*x, range(1, 11))
2865         [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
2866         >>>
2867 \end{verbatim}
2868
2869 More than one sequence may be passed; the function must then have as
2870 many arguments as there are sequences and is called with the
2871 corresponding item from each sequence (or \verb\None\ if some sequence
2872 is shorter than another).  If \verb\None\ is passed for the function,
2873 a function returning its argument(s) is substituted.
2874
2875 Combining these two special cases, we see that
2876 \verb\map(None, list1, list2)\  is a convenient way of turning a pair
2877 of lists into a list of pairs.  For example:
2878
2879 \begin{verbatim}
2880         >>> seq = range(8)
2881         >>> map(None, seq, map(lambda x: x*x, seq))
2882         [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
2883         >>>
2884 \end{verbatim}
2885
2886 \subsubsection{Filter.}
2887
2888 \verb\filter(function, sequence)\ returns a sequence (of the same
2889 type, if possible) consisting of those items from the sequence for
2890 which \verb\function(item)\ is true.  For example, to compute some
2891 primes:
2892
2893 \begin{verbatim}
2894         >>> filter(lambda x: x%2 != 0 and x%3 != 0, range(2, 25))
2895         [5, 7, 11, 13, 17, 19, 23]
2896         >>>
2897 \end{verbatim}
2898
2899 \subsubsection{Reduce.}
2900
2901 \verb\reduce(function, sequence)\ returns a single value constructed
2902 by calling the (binary) function on the first two items of the
2903 sequence, then on the result and the next item, and so on.  For
2904 example, to compute the sum of the numbers 1 through 10:
2905
2906 \begin{verbatim}
2907         >>> reduce(lambda x, y: x+y, range(1, 11))
2908         55
2909         >>>
2910 \end{verbatim}
2911
2912 If there's only one item in the sequence, its value is returned; if
2913 the sequence is empty, an exception is raised.
2914
2915 A third argument can be passed to indicate the starting value.  In this
2916 case the starting value is returned for an empty sequence, and the
2917 function is first applied to the starting value and the first sequence
2918 item, then to the result and the next item, and so on.  For example,
2919
2920 \begin{verbatim}
2921         >>> def sum(seq):
2922         ...     return reduce(lambda x, y: x+y, seq, 0)
2923         ...
2924         >>> sum(range(1, 11))
2925         55
2926         >>> sum([])
2927         0
2928         >>>
2929 \end{verbatim}
2930
2931 \section{Continuation Lines Without Backslashes}
2932
2933 While the general mechanism for continuation of a source line on the
2934 next physical line remains to place a backslash on the end of the
2935 line, expressions inside matched parentheses (or square brackets, or
2936 curly braces) can now also be continued without using a backslash.
2937 This is particularly useful for calls to functions with many
2938 arguments, and for initializations of large tables.
2939
2940 For example:
2941
2942 \begin{verbatim}
2943         month_names = ['Januari', 'Februari', 'Maart',
2944                        'April',   'Mei',      'Juni',
2945                        'Juli',    'Augustus', 'September',
2946                        'Oktober', 'November', 'December']
2947 \end{verbatim}
2948
2949 and
2950
2951 \begin{verbatim}
2952         CopyInternalHyperLinks(self.context.hyperlinks,
2953                                copy.context.hyperlinks,
2954                                uidremap)
2955 \end{verbatim}
2956
2957 \section{Regular Expressions}
2958
2959 While C's printf-style output formats, transformed into Python, are
2960 adequate for most output formatting jobs, C's scanf-style input
2961 formats are not very powerful.  Instead of scanf-style input, Python
2962 offers Emacs-style regular expressions as a powerful input and
2963 scanning mechanism.  Read the corresponding section in the Library
2964 Reference for a full description.
2965
2966 \section{Generalized Dictionaries}
2967
2968 The keys of dictionaries are no longer restricted to strings --- they
2969 can be any immutable basic type including strings, numbers, tuples, or
2970 (certain) class instances.  (Lists and dictionaries are not acceptable
2971 as dictionary keys, in order to avoid problems when the object used as
2972 a key is modified.)
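
The sketch below tries several kinds of keys, assuming a current
Python 3 interpreter.  The dictionary contents are made up.

```python
grid = {(0, 0): 'origin', (1, 0): 'east'}    # tuples as keys
grid[42] = 'a number as key'
grid['name'] = 'a string as key'

try:
    grid[[0, 1]] = 'north'       # a list is mutable, so it is rejected
    list_accepted = True
except TypeError:
    list_accepted = False
```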
2973
2974 Dictionaries have two new methods: \verb\d.values()\ returns a list of
2975 the dictionary's values, and \verb\d.items()\ returns a list of the
2976 dictionary's (key, value) pairs.  Like \verb\d.keys()\, these
2977 operations are slow for large dictionaries.  Examples:
2978
2979 \begin{verbatim}
2980         >>> d = {100: 'honderd', 1000: 'duizend', 10: 'tien'}
2981         >>> d.keys()
2982         [100, 10, 1000]
2983         >>> d.values()
2984         ['honderd', 'tien', 'duizend']
2985         >>> d.items()
2986         [(100, 'honderd'), (10, 'tien'), (1000, 'duizend')]
2987         >>>
2988 \end{verbatim}
2989
2990 \section{Miscellaneous New Built-in Functions}
2991
2992 The function \verb\vars()\ returns a dictionary containing the current
2993 local variables.  With a module argument, it returns that module's
2994 global variables.  The old function \verb\dir(x)\ returns
2995 \verb\vars(x).keys()\.
2996
2997 The function \verb\round(x)\ returns a floating point number rounded
2998 to the nearest integer (but still expressed as a floating point
2999 number).  E.g. \verb\round(3.4) == 3.0\ and \verb\round(3.5) == 4.0\.
3000 With a second argument it rounds to the specified number of digits,
3001 e.g. \verb\round(math.pi, 4) == 3.1416\ or even
3002 \verb\round(123.4, -2) == 100.0\.
3003
3004 The function \verb\hash(x)\ returns a hash value for an object.
3005 All object types acceptable as dictionary keys have a hash value (and
3006 it is this hash value that the dictionary implementation uses).
3007
The function \verb\id(x)\ returns a unique identifier for an object.
3009 For two objects x and y, \verb\id(x) == id(y)\ if and only if
3010 \verb\x is y\.  (In fact the object's address is used.)
3011
3012 The function \verb\hasattr(x, name)\ returns whether an object has an
3013 attribute with the given name (a string value).  The function
3014 \verb\getattr(x, name)\ returns the object's attribute with the given
3015 name.  The function \verb\setattr(x, name, value)\ assigns a value to
3016 an object's attribute with the given name.  These three functions are
3017 useful if the attribute names are not known beforehand.  Note that
3018 \verb\getattr(x, 'spam')\ is equivalent to \verb\x.spam\, and
3019 \verb\setattr(x, 'spam', y)\ is equivalent to \verb\x.spam = y\.  By
3020 definition, \verb\hasattr(x, name)\ returns true if and only if
3021 \verb\getattr(x, name)\ returns without raising an exception.
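
For instance, the three functions make it possible to fill in a record
from a list of (name, value) pairs.  This is a minimal sketch, assuming
a current Python 3 interpreter; the field names are only illustrative.

```python
class Employee:
    pass

john = Employee()
fields = [('name', 'John Doe'), ('dept', 'computer lab'), ('salary', 1000)]
for attr, value in fields:
    setattr(john, attr, value)       # same effect as john.name = ..., etc.

has_dept = hasattr(john, 'dept')     # true
has_boss = hasattr(john, 'boss')     # false: getattr would raise AttributeError
salary = getattr(john, 'salary')     # same as john.salary
```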
3022
3023 \section{Else Clause For Try Statement}
3024
3025 The \verb\try...except\ statement now has an optional \verb\else\
clause, which must follow all \verb\except\ clauses.  It is useful
for code that must be executed if the \verb\try\ clause does not
raise an exception.  For example:
3029
3030 \begin{verbatim}
        for arg in sys.argv[1:]:
3032                 try:
3033                         f = open(arg, 'r')
3034                 except IOError:
3035                         print 'cannot open', arg
3036                 else:
3037                         print arg, 'has', len(f.readlines()), 'lines'
3038                         f.close()
3039 \end{verbatim}
3040
3041
3042 \section{New Class Features in Release 1.1}
3043
3044 Some changes have been made to classes: the operator overloading
3045 mechanism is more flexible, providing more support for non-numeric use
3046 of operators (including calling an object as if it were a function),
3047 and it is possible to trap attribute accesses.
3048
3049 \subsection{New Operator Overloading}
3050
3051 It is no longer necessary to coerce both sides of an operator to the
3052 same class or type.  A class may still provide a \code{__coerce__}
3053 method, but this method may return objects of different types or
3054 classes if it feels like it.  If no \code{__coerce__} is defined, any
3055 argument type or class is acceptable.
3056
3057 In order to make it possible to implement binary operators where the
3058 right-hand side is a class instance but the left-hand side is not,
3059 without using coercions, right-hand versions of all binary operators
3060 may be defined.  These have an `r' prepended to their name,
3061 e.g. \code{__radd__}.
3062
3063 For example, here's a very simple class for representing times.  Times
3064 are initialized from a number of seconds (like time.time()).  Times
3065 are printed like this: \code{Thu Oct 6 14:20:06 1994}.  Subtracting
3066 two Times gives their difference in seconds.  Adding or subtracting a
3067 Time and a number gives a new Time.  You can't add two times, nor can
3068 you subtract a Time from a number.
3069
3070 \begin{verbatim}
3071 import time
3072
3073 class Time:
3074     def __init__(self, seconds):
3075         self.seconds = seconds
3076     def __repr__(self):
3077         return time.ctime(self.seconds)
3078     def __add__(self, x):
3079         return Time(self.seconds + x)
3080     __radd__ = __add__            # support for x+t
3081     def __sub__(self, x):
3082         if hasattr(x, 'seconds'): # test if x could be a Time
3083             return self.seconds - x.seconds
3084         else:
            return Time(self.seconds - x)

now = Time(time.time())
tomorrow = 24*3600 + now
yesterday = now - 24*3600
print tomorrow - yesterday        # prints 172800
3091 \end{verbatim}
3092
3093 \subsection{Trapping Attribute Access}
3094
3095 You can define three new ``magic'' methods in a class now:
3096 \code{__getattr__(self, name)}, \code{__setattr__(self, name, value)}
3097 and \code{__delattr__(self, name)}.
3098
3099 The \code{__getattr__} method is called when an attribute access fails,
3100 i.e. when an attribute access would otherwise raise AttributeError ---
3101 this is {\em after} the instance's dictionary and its class hierarchy
3102 have been searched for the named attribute.  Note that if this method
3103 attempts to access any undefined instance attribute it will be called
3104 recursively!
3105
The \code{__setattr__} and \code{__delattr__} methods are called when
assignment to or deletion of an attribute is attempted, respectively.
They are called {\em instead} of the normal action (which is to insert
or delete the attribute in the instance dictionary).  If either of
these methods must set or delete any attribute, it can only do so by
using the instance dictionary directly (\code{self.__dict__}), else
it would be called recursively.
3113
3114 For example, here's a near-universal ``Wrapper'' class that passes all
3115 its attribute accesses to another object.  Note how the
3116 \code{__init__} method inserts the wrapped object in
3117 \code{self.__dict__} in order to avoid endless recursion
3118 (\code{__setattr__} would call \code{__getattr__} which would call
3119 itself recursively).
3120
3121 \begin{verbatim}
3122 class Wrapper:
3123     def __init__(self, wrapped):
3124         self.__dict__['wrapped'] = wrapped
3125     def __getattr__(self, name):
3126         return getattr(self.wrapped, name)
3127     def __setattr__(self, name, value):
3128         setattr(self.wrapped, name, value)
3129     def __delattr__(self, name):
3130         delattr(self.wrapped, name)
3131
3132 import sys
3133 f = Wrapper(sys.stdout)
3134 f.write('hello world\n')          # prints 'hello world'
3135 \end{verbatim}
3136
3137 A simpler example of \code{__getattr__} is an attribute that is
computed each time (or the first time) it is accessed.  For instance:
3139
3140 \begin{verbatim}
3141 from math import pi
3142
3143 class Circle:
3144     def __init__(self, radius):
3145         self.radius = radius
3146     def __getattr__(self, name):
3147         if name == 'circumference':
3148             return 2 * pi * self.radius
3149         if name == 'diameter':
3150             return 2 * self.radius
3151         if name == 'area':
            return pi * pow(self.radius, 2)
3153         raise AttributeError, name
3154 \end{verbatim}
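
The effect is easiest to see by changing the radius between two
accesses.  The sketch below is a shortened variant of the class,
adapted to a current Python 3 interpreter (an assumption; there the
exception has to be raised with call syntax).

```python
from math import pi

class Circle:
    def __init__(self, radius):
        self.radius = radius
    def __getattr__(self, name):
        # Only called when normal lookup fails, so an access to
        # self.radius never reaches this method.
        if name == 'diameter':
            return 2 * self.radius
        if name == 'area':
            return pi * self.radius ** 2
        raise AttributeError(name)

c = Circle(2.0)
before = c.diameter         # 4.0, computed on access
c.radius = 5.0
after = c.diameter          # 10.0, recomputed from the new radius
```

Nothing is stored under the computed names, so they can never become
stale.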
3155
3156 \subsection{Calling a Class Instance}
3157
3158 If a class defines a method \code{__call__} it is possible to call its
3159 instances as if they were functions.  For example:
3160
3161 \begin{verbatim}
3162 class PresetSomeArguments:
3163     def __init__(self, func, *args):
3164         self.func, self.args = func, args
3165     def __call__(self, *args):
3166         return apply(self.func, self.args + args)
3167
3168 f = PresetSomeArguments(pow, 2)    # f(i) computes powers of 2
3169 for i in range(10): print f(i),    # prints 1 2 4 8 16 32 64 128 256 512
3170 print                              # append newline
3171 \end{verbatim}
3172
3173
3174 \chapter{New in Release 1.2}
3175
3176
3177 This chapter describes even more recent additions to the Python
3178 language and library.
3179
3180
3181 \section{New Class Features}
3182
3183 The semantics of \code{__coerce__} have been changed to be more
3184 reasonable.  As an example, the new standard module \code{Complex}
3185 implements fairly complete complex numbers using this.  Additional
3186 examples of classes with and without \code{__coerce__} methods can be
3187 found in the \code{Demo/classes} subdirectory, modules \code{Rat} and
3188 \code{Dates}.
3189
3190 If a class defines no \code{__coerce__} method, this is equivalent to
3191 the following definition:
3192
3193 \begin{verbatim}
3194 def __coerce__(self, other): return self, other
3195 \end{verbatim}
3196
3197 If \code{__coerce__} coerces itself to an object of a different type,
3198 the operation is carried out using that type --- in release 1.1, this
3199 would cause an error.
3200
3201 Comparisons involving class instances now invoke \code{__coerce__}
3202 exactly as if \code{cmp(x, y)} were a binary operator like \code{+}
3203 (except if \code{x} and \code{y} are the same object).
3204
3205 \section{Unix Signal Handling}
3206
3207 On Unix, Python now supports signal handling.  The module
3208 \code{signal} exports functions \code{signal}, \code{pause} and
\code{alarm}, which act similarly to their Unix counterparts.  The
3210 module also exports the conventional names for the various signal
3211 classes (also usable with \code{os.kill()}) and \code{SIG_IGN} and
3212 \code{SIG_DFL}.  See the section on \code{signal} in the Library
3213 Reference Manual for more information.
3214
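As a minimal sketch of how these three functions fit together, the
following fragment installs a handler for \code{SIGALRM}, requests an
alarm, and waits for it with \code{pause()}.  The names
\code{received}, \code{on_alarm} and \code{wait_for_alarm} are made up
for this illustration, and the fragment assumes a current Python
interpreter on a Unix system rather than release 1.2.

```python
import signal

received = []                         # hypothetical log of delivered signals

def on_alarm(signum, frame):
    # A handler is called with the signal number and the current stack frame.
    received.append(signum)

def wait_for_alarm(seconds=1):
    # Install the handler, request SIGALRM, and sleep until it has arrived.
    del received[:]
    previous = signal.signal(signal.SIGALRM, on_alarm)
    try:
        signal.alarm(seconds)
        while not received:
            signal.pause()            # returns after a handler has run
    finally:
        signal.alarm(0)               # cancel any alarm still pending
        if previous is not None:
            signal.signal(signal.SIGALRM, previous)   # restore old handler
    return received[-1]
```

Calling \code{wait_for_alarm(1)} blocks for about one second and then
returns the number of the signal that was delivered.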
\section{Exceptions Can Be Classes}

User-defined exceptions are no longer limited to being string objects
--- they can be identified by classes as well.  Using this mechanism it
is possible to create extensible hierarchies of exceptions.

There are two new valid (semantic) forms for the raise statement:

\begin{verbatim}
raise Class, instance

raise instance
\end{verbatim}

In the first form, \code{instance} must be an instance of \code{Class}
or of a class derived from it.  The second form is a shorthand for

\begin{verbatim}
raise instance.__class__, instance
\end{verbatim}

An except clause may list classes as well as string objects.  A class
in an except clause is compatible with an exception if it is the same
class or a base class thereof (but not the other way around --- an
except clause listing a derived class is not compatible with a base
class).  For example, the following code will print B, C, D in that
order:

\begin{verbatim}
class B:
    pass
class C(B):
    pass
class D(C):
    pass

for c in [B, C, D]:
    try:
        raise c()
    except D:
        print "D"
    except C:
        print "C"
    except B:
        print "B"
\end{verbatim}

Note that if the except clauses were reversed (with ``\code{except B}''
first), it would have printed B, B, B --- the first matching except
clause is triggered.

When an error message is printed for an unhandled exception which is a
class, the class name is printed, then a colon and a space, and
finally the instance converted to a string using the built-in function
\code{str()}.

In this release, the built-in exceptions are still strings.


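To illustrate an extensible hierarchy, here is a minimal sketch built
around a made-up base class \code{SpamError}; every name in it is
hypothetical.  It is written for a current Python interpreter, which is
an assumption beyond release 1.2: there, exception classes have to
derive from the built-in class \code{Exception}, and a handler names
the instance with the keyword \code{as}.  The function
\code{describe()} formats a caught exception as the class name, a colon
and a space, and the instance converted with \code{str()}.

```python
class SpamError(Exception):
    """Base class for all errors in this hypothetical hierarchy."""

class CanSizeError(SpamError):
    """Raised for an unsupported can size."""

    def __init__(self, size):
        SpamError.__init__(self, size)
        self.size = size

    def __str__(self):
        return "unsupported can size: %d" % self.size

def open_can(size):
    # Hypothetical operation: only three can sizes are accepted.
    if size not in (8, 12, 20):
        raise CanSizeError(size)
    return size

def describe(size):
    # The handler lists only the base class, yet it also catches instances
    # of derived classes such as CanSizeError.
    try:
        open_can(size)
    except SpamError as err:
        return "%s: %s" % (err.__class__.__name__, str(err))
    return "ok"
```

New error classes can later be derived from \code{SpamError} without
changing any handler that lists the base class.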
\section{Object Persistency and Object Copying}

Two new modules, \code{pickle} and \code{shelve}, support storage and
retrieval of (almost) arbitrary Python objects on disk, using the
\code{dbm} package.  A third module, \code{copy}, provides flexible
object copying operations.  More information on these modules is
provided in the Library Reference Manual.

\subsection{Persistent Objects}

The module \code{pickle} provides a general framework for objects to
disassemble themselves into a stream of bytes and to reassemble such a
stream back into an object.  It copes with reference sharing,
recursive objects and instances of user-defined classes, but not
(directly) with objects that have ``magical'' links into the operating
system such as open files, sockets or windows.

The \code{pickle} module defines a simple protocol whereby
user-defined classes can control how they are disassembled and
assembled.  The method \code{__getinitargs__()}, if defined, returns
the argument list for the constructor to be used at assembly time (by
default the constructor is called without arguments).  The methods
\code{__getstate__()} and \code{__setstate__()} are used to pass
additional state from disassembly to assembly; by default the
instance's \code{__dict__} is passed and restored.

Note that \code{pickle} does not open or close any files --- it can be
used equally well for moving objects around on a network or storing
them in a database.  For ease of debugging, and the inevitable
occasional manual patch-up, the constructed byte streams consist of
printable ASCII characters only (though they are not designed to be
pretty).

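The state-passing part of this protocol can be sketched as follows.
The class \code{Counter} and its attributes are invented for the
example, and the calls \code{pickle.dumps()} and \code{pickle.loads()}
are used as found in a current Python interpreter, which is an
assumption beyond release 1.2; \code{__getinitargs__()} is left out of
the sketch.

```python
import pickle

class Counter:
    """Hypothetical class that keeps a cache it does not want to store."""

    def __init__(self, start=0):
        self.count = start
        self.cache = {}               # derived data, cheap to rebuild

    def __getstate__(self):
        # Called at disassembly time: return the state to be pickled.
        state = self.__dict__.copy()
        del state["cache"]
        return state

    def __setstate__(self, state):
        # Called at assembly time with whatever __getstate__() returned.
        self.__dict__.update(state)
        self.cache = {}

original = Counter(41)
original.count = original.count + 1
original.cache["answer"] = original.count

stream = pickle.dumps(original, 0)    # protocol 0: the ASCII-only format
restored = pickle.loads(stream)       # a new Counter with the saved count
```

Since \code{pickle} itself never touches a file, \code{stream} could
just as well be sent over a network connection or stored in a database
record.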
The module \code{shelve} provides a simple model for storing objects
on files.  The operation \code{shelve.open(filename)} returns a
``shelf'', which is a simple persistent database with a
dictionary-like interface.  Database keys are strings; objects stored
in the database can be anything that \code{pickle} will handle.

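A short session with a shelf might look like the sketch below.  The
file name \code{cans} and the stored dictionaries are made up, the
scratch directory only serves to keep the example away from real
files, and the code assumes a current Python interpreter.

```python
import os
import shelve
import shutil
import tempfile

# A scratch directory for the database file; "cans" is a made-up name.
directory = tempfile.mkdtemp()
filename = os.path.join(directory, "cans")

shelf = shelve.open(filename)         # creates the database if necessary
shelf["small"] = {"size": 8, "flavors": ["plain"]}
shelf["large"] = {"size": 20, "flavors": ["plain", "spicy"]}
shelf.close()

shelf = shelve.open(filename)         # reopen: the objects were kept on disk
large = shelf["large"]
names = sorted(shelf.keys())
shelf.close()

shutil.rmtree(directory, ignore_errors=True)   # remove the scratch files
```

The keys are plain strings, while the values are ordinary Python
objects that are pickled when they are stored.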
\subsection{Copying Objects}

The module \code{copy} exports two functions: \code{copy()} and
\code{deepcopy()}.  The \code{copy()} function returns a ``shallow''
copy of an object; \code{deepcopy()} returns a ``deep'' copy.  The
difference between shallow and deep copying is only relevant for
compound objects (objects that contain other objects, like lists or
class instances):

\begin{itemize}

\item
A shallow copy constructs a new compound object and then (to the
extent possible) inserts {\em the same objects} into it that the
original contains.

\item
A deep copy constructs a new compound object and then, recursively,
inserts {\em copies} into it of the objects found in the original.

\end{itemize}

Both functions have the same restrictions and use the same protocols
as \code{pickle} --- user-defined classes can control how they are
copied by providing methods named \code{__getinitargs__()},
\code{__getstate__()} and \code{__setstate__()}.


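The difference between the two kinds of copy shows up as soon as a
contained object is modified.  The following sketch uses a list that
contains another list; the variable names are chosen for the
illustration only.

```python
import copy

inner = [1, 2]
original = [inner, "spam"]

shallow = copy.copy(original)       # new outer list, same inner list
deep = copy.deepcopy(original)      # new outer list, new inner list

inner.append(3)                     # modify the object shared with original

shared = shallow[0] is inner        # the shallow copy sees the change
independent = deep[0] == [1, 2]     # the deep copy still has the old contents
```

Both copies are new outer lists, so appending to \code{original}
itself affects neither of them.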
\section{Documentation Strings}

A variety of objects now have a new attribute, \code{__doc__}, which
is supposed to contain a documentation string (if no documentation is
present, the attribute is \code{None}).  New syntax, compatible with
the old interpreter, allows for convenient initialization of the
\code{__doc__} attribute of modules, classes and functions by placing
a string literal by itself as the first statement in the suite.  It
must be a literal --- an expression yielding a string object is not
accepted as a documentation string, since future tools may need to
derive documentation from source by parsing.

Here is a hypothetical, amply documented module called \code{Spam}:

\begin{verbatim}
"""Spam operations.

This module exports two classes, a function and an exception:

class Spam: full Spam functionality --- three can sizes
class SpamLight: limited Spam functionality --- only one can size

def open(filename): open a file and return a corresponding Spam or
SpamLight object

GoneOff: exception raised for errors; should never happen

Note that it is always possible to convert a SpamLight object to a
Spam object by a simple method call, but that the reverse operation is
generally costly and may fail for a number of reasons.
"""

class SpamLight:
    """Limited spam functionality.

    Supports a single can size, no flavor, and only hard disks.
    """

    def __init__(self, size=12):
        """Construct a new SpamLight instance.

        Argument is the can size.
        """
        # etc.

    # etc.

class Spam(SpamLight):
    """Full spam functionality.

    Supports three can sizes, two flavor varieties, and all floppy
    disk formats still supported by current hardware.
    """

    def __init__(self, size1=8, size2=12, size3=20):
        """Construct a new Spam instance.

        Arguments are up to three can sizes.
        """
        # etc.

    # etc.

def open(filename = "/dev/null"):
    """Open a can of Spam.

    Argument must be an existing file.
    """
    # etc.

class GoneOff:
    """Class used for Spam exceptions.

    There shouldn't be any.
    """
    pass
\end{verbatim}

After executing ``\code{import Spam}'', the following expressions
return the various documentation strings from the module:

\begin{verbatim}
Spam.__doc__
Spam.SpamLight.__doc__
Spam.SpamLight.__init__.__doc__
Spam.Spam.__doc__
Spam.Spam.__init__.__doc__
Spam.open.__doc__
Spam.GoneOff.__doc__
\end{verbatim}

There are emerging conventions about the content and formatting of
documentation strings.

The first line should always be a short, concise summary of the
object's purpose.  For brevity, it should not explicitly state the
object's name or type, since these are available by other means
(except if the name happens to be a verb describing a function's
operation).  This line should begin with a capital letter and end with
a period.

If there are more lines in the documentation string, the second line
should be blank, visually separating the summary from the rest of the
description.  The following lines should be one or more paragraphs
describing the object's calling conventions, its side effects, etc.

Some people like to copy the Emacs convention of using UPPER CASE for
function parameters --- this often saves a few words or lines.

The Python parser does not strip indentation from multi-line string
literals, so tools that process documentation have to strip
indentation.  This is done using the following convention.  The first
non-blank line {\em after} the first line of the string determines the
amount of indentation for the entire documentation string.  (We can't
use the first line since it is generally adjacent to the string's
opening quotes so its indentation is not apparent in the string
literal.)  Whitespace ``equivalent'' to this indentation is then
stripped from the start of all lines of the string.  Lines that are
indented less should not occur, but if they occur, all their leading
whitespace should be stripped.  Equivalence of whitespace should be
tested after expansion of tabs (to 8 spaces, normally).

In this release, few of the built-in or standard functions and modules
have documentation strings.


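The indentation-stripping convention described above could be
implemented along the following lines.  This is only a sketch: the
function name \code{strip_doc_indentation} and its \code{tabsize}
parameter are invented here, it assumes a current Python interpreter
with string methods, and it additionally removes trailing whitespace
from every line, which the convention does not require.

```python
def strip_doc_indentation(doc, tabsize=8):
    """Remove the common indentation from a documentation string."""
    # Expand tabs first so that whitespace can be compared by length.
    lines = doc.expandtabs(tabsize).split("\n")

    # The first non-blank line after the first line sets the indentation.
    indent = 0
    for line in lines[1:]:
        stripped = line.lstrip()
        if stripped:
            indent = len(line) - len(stripped)
            break

    # The first line sits next to the opening quotes and is not indented.
    result = [lines[0].strip()]
    for line in lines[1:]:
        leading = len(line) - len(line.lstrip())
        # Lines indented less than expected lose all their leading
        # whitespace; all other lines lose exactly `indent` columns.
        result.append(line[min(indent, leading):].rstrip())
    return "\n".join(result)
```

Applied to the documentation string of \code{SpamLight.__init__} shown
earlier, the function would return the summary line, an empty line,
and the text ``Argument is the can size.'' without its indentation.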
\section{Customizing Import and Built-Ins}

In preparation for a ``restricted execution mode'' which will be
usable to run code received from an untrusted source (such as a WWW
server or client), the mechanism by which modules are imported has
been redesigned.  It is now possible to provide your own function
\code{__import__} which is called whenever an \code{import} statement
is executed.  There's a built-in function \code{__import__} which
provides the default implementation, but more interestingly, the
various steps it takes are available separately from the new built-in
module \code{imp}.  (See the section on \code{imp} in the Library
Reference Manual for more information on this module.)

When you do \code{dir()} in a fresh interactive interpreter you will
see another ``secret'' object that's present in every module:
\code{__builtins__}.  This is either a dictionary or a module
containing the set of built-in objects used by functions defined in
the current module.  Although normally all modules are initialized
with a reference to the same dictionary, it is now possible to use a
different set of built-ins on a per-module basis.  Together with the
fact that the \code{import} statement uses the \code{__import__}
function it finds in the importing module's dictionary of built-ins,
this forms the basis for a future restricted execution mode.


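As a rough sketch of the first point, the fragment below replaces
\code{__import__} by a function that records each requested module
name before handing the request to the default implementation.  The
names \code{imported_names}, \code{logging_import} and
\code{import_with_logging} are hypothetical.  The sketch assumes a
current Python interpreter, where the module of built-ins is called
\code{builtins}, and it changes the set of built-ins shared by all
modules rather than a per-module set.

```python
import builtins                      # assumption: current name of the module

imported_names = []                  # hypothetical log kept by the hook
default_import = builtins.__import__

def logging_import(name, *args, **kwargs):
    # Record the requested module name, then let the default
    # implementation do the actual work.
    imported_names.append(name)
    return default_import(name, *args, **kwargs)

def import_with_logging():
    # Install the hook only for the duration of one import statement.
    builtins.__import__ = logging_import
    try:
        import string
    finally:
        builtins.__import__ = default_import
    return string
```

After a call to \code{import_with_logging()}, the list
\code{imported_names} contains \code{'string'}, and the default
\code{__import__} function is in place again.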
\section{Python and the World-Wide Web}

There is a growing number of modules available for writing WWW tools.
The previous release already sported modules \code{gopherlib},
\code{ftplib}, \code{httplib} and \code{urllib} (which unifies the
other three) for accessing data through the commonest WWW protocols.
This release also provides \code{cgi}, to ease the writing of
server-side scripts that use the Common Gateway Interface protocol,
supported by most WWW servers.  The module \code{urlparse} provides
precise parsing of a URL string into its components (address scheme,
network location, path, parameters, query, and fragment identifier).

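The six components can be seen in the following sketch.  The URL is
made up, and the import line is an assumption: in current Python
interpreters the function \code{urlparse()} lives in the module
\code{urllib.parse} instead of a module named \code{urlparse}.

```python
from urllib.parse import urlparse    # assumption: current module location

url = "http://www.python.org/doc/tut;v=1?q=spam#top"   # made-up URL
parts = urlparse(url)

# The result unpacks into the six components named in the text.
scheme, location, path, parameters, query, fragment = parts
```

Here \code{scheme} is \code{'http'}, \code{location} is
\code{'www.python.org'}, \code{path} is \code{'/doc/tut'},
\code{parameters} is \code{'v=1'}, \code{query} is \code{'q=spam'} and
\code{fragment} is \code{'top'}.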
A rudimentary parser for HTML files is available in the module
\code{htmllib}.  It currently supports a subset of HTML 1.0 (if you
bring it up to date, I'd love to receive your fixes!).  Unfortunately
Python seems to be too slow for real-time parsing and formatting of
HTML such as required by interactive WWW browsers --- but it's good
enough to write a ``robot'' (an automated WWW browser that searches
the web for information).


\section{Miscellaneous}

\begin{itemize}

\item
The \code{socket} module now exports all the needed constants used for
socket operations, such as \code{SO_BROADCAST}.

\item
The functions \code{popen()} and \code{fdopen()} in the \code{os}
module now follow the pattern of the built-in function \code{open()}:
the default mode argument is \code{'r'} and the optional third
argument specifies the buffer size, where \code{0} means unbuffered,
\code{1} means line-buffered, and any larger number means the size of
the buffer in bytes.

\end{itemize}


\end{document}