Doc/tut/tut.tex

   1 \documentstyle[twoside,11pt,myformat]{report}
   2
   3 \title{Python Tutorial}
   4
   5 \author{
   6         Guido van Rossum \\
   7         Dept. CST, CWI, P.O. Box 94079 \\
   8         1090 GB Amsterdam, The Netherlands \\
   9         E-mail: {\tt guido@cwi.nl}
  10 }
  11
  12 \date{4 May 1994 \\ Release 1.0.2} % XXX update before release!
  13
  14 \begin{document}
  15
  16 \pagenumbering{roman}
  17
  18 \maketitle
  19
  20 \begin{abstract}
  21
  22 \noindent
  23 Python is a simple, yet powerful programming language that bridges the
  24 gap between C and shell programming, and is thus ideally suited for
  25 ``throw-away programming''
  26 and rapid prototyping.  Its syntax is put
  27 together from constructs borrowed from a variety of other languages;
  28 most prominent are influences from ABC, C, Modula-3 and Icon.
  29
  30 The Python interpreter is easily extended with new functions and data
  31 types implemented in C.  Python is also suitable as an extension
  32 language for highly customizable C applications such as editors or
  33 window managers.
  34
  35 Python is available for various operating systems, amongst which
  36 several flavors of {\UNIX}, Amoeba, the Apple Macintosh O.S.,
  37 and MS-DOS.
  38
  39 This tutorial introduces the reader informally to the basic concepts
  40 and features of the Python language and system.  It helps to have a
  41 Python interpreter handy for hands-on experience, but as the examples
  42 are self-contained, the tutorial can be read off-line as well.
  43
  44 For a description of standard objects and modules, see the {\em Python
  45 Library Reference} document.  The {\em Python Reference Manual} gives
  46 a more formal definition of the language.
  47
  48 \end{abstract}
  49
  50 \pagebreak
  51 {
  52 \parskip = 0mm
  53 \tableofcontents
  54 }
  55
  56 \pagebreak
  57
  58 \pagenumbering{arabic}
  59
  60
  61 \chapter{Whetting Your Appetite}
  62
  63 If you ever wrote a large shell script, you probably know this
  64 feeling: you'd love to add yet another feature, but it's already so
  65 slow, and so big, and so complicated; or the feature involves a system
  66 call or other function that is only accessible from C \ldots  Usually
  67 the problem at hand isn't serious enough to warrant rewriting the
  68 script in C; perhaps because the problem requires variable-length
  69 strings or other data types (like sorted lists of file names) that are
  70 easy in the shell but lots of work to implement in C; or perhaps just
  71 because you're not sufficiently familiar with C.
  72
  73 In such cases, Python may be just the language for you.  Python is
  74 simple to use, but it is a real programming language, offering much
  75 more structure and support for large programs than the shell has.  On
  76 the other hand, it also offers much more error checking than C, and,
  77 being a {\em very-high-level language}, it has high-level data types
  78 built in, such as flexible arrays and dictionaries that would cost you
  79 days to implement efficiently in C.  Because of its more general data
  80 types, Python is applicable to a much larger problem domain than {\em
  81 Awk} or even {\em Perl}, yet many things are at least as easy in
  82 Python as in those languages.
  83
  84 Python allows you to split up your program into modules that can be
  85 reused in other Python programs.  It comes with a large collection of
  86 standard modules that you can use as the basis of your programs --- or
  87 as examples to start learning to program in Python.  There are also
  88 built-in modules that provide things like file I/O, system calls,
  89 sockets, and even a generic interface to window systems (STDWIN).
  90
  91 Python is an interpreted language, which can save you considerable time
  92 during program development because no compilation and linking is
  93 necessary.  The interpreter can be used interactively, which makes it
  94 easy to experiment with features of the language, to write throw-away
  95 programs, or to test functions during bottom-up program development.
  96 It is also a handy desk calculator.
  97
  98 Python allows writing very compact and readable programs.  Programs
  99 written in Python are typically much shorter than equivalent C
 100 programs, for several reasons:
 101 \begin{itemize}
 102 \item
 103 the high-level data types allow you to express complex operations in a
 104 single statement;
 105 \item
 106 statement grouping is done by indentation instead of begin/end
 107 brackets;
 108 \item
 109 no variable or argument declarations are necessary.
 110 \end{itemize}
 111
 112 Python is {\em extensible}: if you know how to program in C it is easy
 113 to add a new built-in
 114 function or
 115 module to the interpreter, either to
 116 perform critical operations at maximum speed, or to link Python
 117 programs to libraries that may only be available in binary form (such
 118 as a vendor-specific graphics library).  Once you are really hooked,
 119 you can link the Python interpreter into an application written in C
 120 and use it as an extension or command language for that application.
 121
 122 By the way, the language is named after the BBC show ``Monty
 123 Python's Flying Circus'' and has nothing to do with nasty reptiles...
 124
 125 \section{Where From Here}
 126
 127 Now that you are all excited about Python, you'll want to examine it
 128 in some more detail.  Since the best way to learn a language is
 129 to use it, you are invited to do so here.
 130
 131 In the next chapter, the mechanics of using the interpreter are
 132 explained.  This is rather mundane information, but essential for
 133 trying out the examples shown later.
 134
 135 The rest of the tutorial introduces various features of the Python
 136 language and system through examples, beginning with simple
 137 expressions, statements and data types, through functions and modules,
 138 and finally touching upon advanced concepts like exceptions
 139 and user-defined classes.
 140
 141 When you're through with the tutorial (or just getting bored), you
 142 should read the Library Reference, which gives complete (though terse)
 143 reference material about built-in and standard types, functions and
 144 modules that can save you a lot of time when writing Python programs.
 145
 146
 147 \chapter{Using the Python Interpreter}
 148
 149 \section{Invoking the Interpreter}
 150
 151 The Python interpreter is usually installed as {\tt /usr/local/bin/python}
 152 on those machines where it is available; putting {\tt /usr/local/bin} in
 153 your {\UNIX} shell's search path makes it possible to start it by
 154 typing the command
 155
 156 \bcode\begin{verbatim}
 157 python
 158 \end{verbatim}\ecode
 159 %
 160 to the shell.  Since the choice of the directory where the interpreter
 161 lives is an installation option, other places are possible; check with
 162 your local Python guru or system administrator.  (E.g., {\tt
 163 /usr/local/python} is a popular alternative location.)
 164
 165 The interpreter operates somewhat like the {\UNIX} shell: when called
 166 with standard input connected to a tty device, it reads and executes
 167 commands interactively; when called with a file name argument or with
 168 a file as standard input, it reads and executes a {\em script} from
 169 that file.
 170
 171 A third way of starting the interpreter is
 172 ``{\tt python -c command [arg] ...}'', which
 173 executes the statement(s) in {\tt command}, analogous to the shell's
 174 {\tt -c} option.  Since Python statements often contain spaces or other
 175 characters that are special to the shell, it is best to quote {\tt
 176 command} in its entirety with double quotes.
 177
 178 Note that there is a difference between ``{\tt python file}'' and
 179 ``{\tt python $<$file}''.  In the latter case, input requests from the
 180 program, such as calls to {\tt input()} and {\tt raw\_input()}, are
 181 satisfied from {\em file}.  Since this file has already been read
 182 until the end by the parser before the program starts executing, the
 183 program will encounter EOF immediately.  In the former case (which is
 184 usually what you want) they are satisfied from whatever file or device
 185 is connected to standard input of the Python interpreter.
 186
 187 When a script file is used, it is sometimes useful to be able to run
 188 the script and enter interactive mode afterwards.  This can be done by
 189 passing {\tt -i} before the script.  (This does not work if the script
 190 is read from standard input, for the same reason as explained in the
 191 previous paragraph.)
 192
 193 \subsection{Argument Passing}
 194
 195 When known to the interpreter, the script name and additional
 196 arguments thereafter are passed to the script in the variable {\tt
 197 sys.argv}, which is a list of strings.  Its length is at least one;
 198 when no script and no arguments are given, {\tt sys.argv[0]} is an
 199 empty string.  When the script name is given as {\tt '-'} (meaning
 200 standard input), {\tt sys.argv[0]} is set to {\tt '-'}.  When {\tt -c
 201 command} is used, {\tt sys.argv[0]} is set to {\tt '-c'}.  Options
 202 found after {\tt -c command} are not consumed by the Python
 203 interpreter's option processing but left in {\tt sys.argv} for the
 204 command to handle.
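
As a minimal sketch of how a script might inspect {\tt sys.argv}, the
helper {\tt describe\_invocation()} below tells apart the special
values of {\tt sys.argv[0]} described above.  The function name and
the strings it returns are assumptions made up for this illustration;
they are not part of Python.

```python
import sys

def describe_invocation(argv):
    # argv is a list of strings shaped like sys.argv; argv[0] tells
    # how the interpreter was started.  The labels returned here are
    # hypothetical, chosen only for this example.
    if argv[0] == '':
        return 'no script given'
    elif argv[0] == '-':
        return 'script read from standard input'
    elif argv[0] == '-c':
        return 'command given with -c'
    else:
        return 'script file ' + argv[0]

# Everything after the first item is left for the script to interpret.
extra_arguments = sys.argv[1:]

# A sample argument list, as it might look after "python -c command -v".
sample = describe_invocation(['-c', '-v'])
```

Passing the list as a parameter, rather than reading {\tt sys.argv}
inside the function, makes the helper easy to try out with made-up
argument lists.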
 205
 206 \subsection{Interactive Mode}
 207
 208 When commands are read from a tty, the interpreter is said to be in
 209 {\em interactive\ mode}.  In this mode it prompts for the next command
 210 with the {\em primary\ prompt}, usually three greater-than signs ({\tt
 211 >>>}); for continuation lines it prompts with the {\em secondary\
 212 prompt}, by default three dots ({\tt ...}).  Typing an EOF (Control-D)
 213 at the primary prompt causes the interpreter to exit with a zero exit
 214 status.
 215
 216 The interpreter prints a welcome message stating its version number
 217 and a copyright notice before printing the first prompt, e.g.:
 218
 219 \bcode\begin{verbatim}
 220 python
 221 Python 1.0.2 (May  3 1994)
 222 Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam
 223 >>>
 224 \end{verbatim}\ecode
 225
 226 \section{The Interpreter and its Environment}
 227
 228 \subsection{Error Handling}
 229
 230 When an error occurs, the interpreter prints an error
 231 message and a stack trace.  In interactive mode, it then returns to
 232 the primary prompt; when input came from a file, it exits with a
 233 nonzero exit status after printing
 234 the stack trace.  (Exceptions handled by an {\tt except} clause in a
 235 {\tt try} statement are not errors in this context.)  Some errors are
 236 unconditionally fatal and cause an exit with a nonzero exit status; this
 237 applies to internal inconsistencies and some cases of running out of
 238 memory.  All error messages are written to the standard error stream;
 239 normal output from the executed commands is written to standard
 240 output.
 241
 242 Typing the interrupt character (usually Control-C or DEL) at the
 243 primary or secondary prompt cancels the input and returns to the
 244 primary prompt.%
 245 \footnote{
 246         A problem with the GNU Readline package may prevent this.
 247 }
 248 Typing an interrupt while a command is executing raises the {\tt
 249 KeyboardInterrupt} exception, which may be handled by a {\tt try}
 250 statement.
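
The following sketch shows one way a {\tt try} statement can handle
{\tt KeyboardInterrupt}.  All function names are hypothetical, and the
interrupt is simulated with a {\tt raise} statement so that the
example does not depend on anyone pressing a key.

```python
def run_guarded(task):
    # Call task() and report whether it ran to completion or was
    # interrupted.  The returned strings are illustrative only.
    try:
        task()
    except KeyboardInterrupt:
        return 'interrupted'
    return 'finished'

def quiet_task():
    # Stands in for a command that completes normally.
    return None

def interrupted_task():
    # Stands in for a command during which the interrupt key is typed.
    raise KeyboardInterrupt

results = [run_guarded(quiet_task), run_guarded(interrupted_task)]
```

Because the exception is handled by the {\tt except} clause, it does
not count as an error in the sense of this section and no stack trace
is printed.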
 251
 252 \subsection{The Module Search Path}
 253
 254 When a module named {\tt foo} is imported, the interpreter searches
 255 for a file named {\tt foo.py} in the list of directories specified by
 256 the environment variable {\tt PYTHONPATH}.  This variable has the same syntax as
 257 the {\UNIX} shell variable {\tt PATH}, i.e., a list of colon-separated
 258 directory names.  When {\tt PYTHONPATH} is not set, or when the file
 259 is not found there, the search continues in an installation-dependent
 260 default path, usually {\tt .:/usr/local/lib/python}.
 261
 262 Actually, modules are searched for in the list of directories given by the
 263 variable {\tt sys.path} which is initialized from {\tt PYTHONPATH} and
 264 the installation-dependent default.  This allows Python programs that
 265 know what they're doing to modify or replace the module search path.
 266 See the section on Standard Modules later.
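
As a hedged illustration of modifying the search path from within a
program, the sketch below puts one extra directory in front of
{\tt sys.path}.  The directory name is a placeholder assumed for this
example; substitute one that exists on your system.

```python
import sys

# Hypothetical directory holding private modules.
extra_dir = '/tmp/mymodules'

# sys.path is an ordinary list of strings, so the usual list
# operations apply.  Directories are searched in list order, so
# inserting at position 0 makes this directory the first one tried.
if extra_dir not in sys.path:
    sys.path.insert(0, extra_dir)
```

The change only affects the running interpreter; {\tt PYTHONPATH}
itself is left untouched.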
 267
 268 \subsection{``Compiled'' Python files}
 269
 270 As an important speed-up of the start-up time for short programs that
 271 use a lot of standard modules, if a file called {\tt foo.pyc} exists
 272 in the directory where {\tt foo.py} is found, this is assumed to
 273 contain an already-``compiled'' version of the module {\tt foo}.  The
 274 modification time of the version of {\tt foo.py} used to create {\tt
 275 foo.pyc} is recorded in {\tt foo.pyc}, and the file is ignored if
 276 these don't match.
 277
 278 Whenever {\tt foo.py} is successfully compiled, an attempt is made to
 279 write the compiled version to {\tt foo.pyc}.  It is not an error if
 280 this attempt fails; if for any reason the file is not written
 281 completely, the resulting {\tt foo.pyc} file will be recognized as
 282 invalid and thus ignored later.
 283
 284 \subsection{Executable Python scripts}
 285
 286 On BSD'ish {\UNIX} systems, Python scripts can be made directly
 287 executable, like shell scripts, by putting the line
 288
 289 \bcode\begin{verbatim}
 290 #! /usr/local/bin/python
 291 \end{verbatim}\ecode
 292 %
 293 (assuming that's the name of the interpreter) at the beginning of the
 294 script and giving the file an executable mode.  The {\tt \#!} must be
 295 the first two characters of the file.
 296
 297 \subsection{The Interactive Startup File}
 298
 299 When you use Python interactively, it is frequently handy to have some
 300 standard commands executed every time the interpreter is started.  You
 301 can do this by setting an environment variable named {\tt
 302 PYTHONSTARTUP} to the name of a file containing your start-up
 303 commands.  This is similar to the {\tt .profile} feature of the {\UNIX}
 304 shells.
 305
 306 This file is only read in interactive sessions, not when Python reads
 307 commands from a script, and not when {\tt /dev/tty} is given as the
 308 explicit source of commands (which otherwise behaves like an
 309 interactive session).  It is executed in the same name space where
 310 interactive commands are executed, so that objects that it defines or
 311 imports can be used without qualification in the interactive session.
 312 You can also change the prompts {\tt sys.ps1} and {\tt sys.ps2} in
 313 this file.
 314
 315 If you want to read an additional start-up file from the current
 316 directory, you can program this in the global start-up file, e.g.
 317 \verb\execfile('.pythonrc')\.  If you want to use the startup file
 318 in a script, you must write this explicitly in the script, e.g.
 319 \verb\import os;\ \verb\execfile(os.environ['PYTHONSTARTUP'])\.
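
For illustration, a start-up file might look like the sketch below.
The prompt strings and the choice of imported module are arbitrary
assumptions; any statements that are valid at the interactive prompt
could appear in such a file.

```python
# Example contents of the file named by PYTHONSTARTUP.
import sys
import os          # usable in the session without a separate import

sys.ps1 = 'py> '   # primary prompt
sys.ps2 = '  > '   # secondary prompt, shown on continuation lines
```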
 320
 321 \section{Interactive Input Editing and History Substitution}
 322
 323 Some versions of the Python interpreter support editing of the current
 324 input line and history substitution, similar to facilities found in
 325 the Korn shell and the GNU Bash shell.  This is implemented using the
 326 {\em GNU\ Readline} library, which supports Emacs-style and vi-style
 327 editing.  This library has its own documentation which I won't
 328 duplicate here; however, the basics are easily explained.
 329
 330 Perhaps the quickest check to see whether command line editing is
 331 supported is typing Control-P at the first Python prompt you get.  If
 332 it beeps, you have command line editing.  If nothing appears to
 333 happen, or if \verb/^P/ is echoed, you can skip the rest of this
 334 section.
 335
 336 \subsection{Line Editing}
 337
 338 If supported, input line editing is active whenever the interpreter
 339 prints a primary or secondary prompt.  The current line can be edited
 340 using the conventional Emacs control characters.  The most important
 341 of these are: C-A (Control-A) moves the cursor to the beginning of the
 342 line, C-E to the end, C-B moves it one position to the left, C-F to
 343 the right.  Backspace erases the character to the left of the cursor,
 344 C-D the character to its right.  C-K kills (erases) the rest of the
 345 line to the right of the cursor, C-Y yanks back the last killed
 346 string.  C-underscore undoes the last change you made; it can be
 347 repeated for cumulative effect.
 348
 349 \subsection{History Substitution}
 350
 351 History substitution works as follows.  All non-empty input lines
 352 issued are saved in a history buffer, and when a new prompt is given
 353 you are positioned on a new line at the bottom of this buffer.  C-P
 354 moves one line up (back) in the history buffer, C-N moves one down.
 355 Any line in the history buffer can be edited; an asterisk appears in
 356 front of the prompt to mark a line as modified.  Pressing the Return
 357 key passes the current line to the interpreter.  C-R starts an
 358 incremental reverse search; C-S starts a forward search.
 359
 360 \subsection{Key Bindings}
 361
 362 The key bindings and some other parameters of the Readline library can
 363 be customized by placing commands in an initialization file called
 364 {\tt \$HOME/.inputrc}.  Key bindings have the form
 365
 366 \bcode\begin{verbatim}
 367 key-name: function-name
 368 \end{verbatim}\ecode
 369 %
 370 or
 371
 372 \bcode\begin{verbatim}
 373 "string": function-name
 374 \end{verbatim}\ecode
 375 %
 376 and options can be set with
 377
 378 \bcode\begin{verbatim}
 379 set option-name value
 380 \end{verbatim}\ecode
 381 %
 382 For example:
 383
 384 \bcode\begin{verbatim}
 385 # I prefer vi-style editing:
 386 set editing-mode vi
 387 # Edit using a single line:
 388 set horizontal-scroll-mode On
 389 # Rebind some keys:
 390 Meta-h: backward-kill-word
 391 "\C-u": universal-argument
 392 "\C-x\C-r": re-read-init-file
 393 \end{verbatim}\ecode
 394 %
 395 Note that the default binding for TAB in Python is to insert a TAB
 396 instead of Readline's default filename completion function.  If you
 397 insist, you can override this by putting
 398
 399 \bcode\begin{verbatim}
 400 TAB: complete
 401 \end{verbatim}\ecode
 402 %
 403 in your {\tt \$HOME/.inputrc}.  (Of course, this makes it hard to type
 404 indented continuation lines...)
 405
 406 \subsection{Commentary}
 407
 408 This facility is an enormous step forward compared to previous
 409 versions of the interpreter; however, some wishes are left: It would
 410 be nice if the proper indentation were suggested on continuation lines
 411 (the parser knows if an indent token is required next).  The
 412 completion mechanism might use the interpreter's symbol table.  A
 413 command to check (or even suggest) matching parentheses, quotes etc.
 414 would also be useful.
 415
 416
 417 \chapter{An Informal Introduction to Python}
 418
 419 In the following examples, input and output are distinguished by the
 420 presence or absence of prompts ({\tt >>>} and {\tt ...}): to repeat
 421 the example, you must type everything after the prompt, when the
 422 prompt appears; lines that do not begin with a prompt are output from
 423 the interpreter.%
 424 \footnote{
 425         I'd prefer to use different fonts to distinguish input
 426         from output, but the amount of LaTeX hacking that would require
 427         is currently beyond my ability.
 428 }
 429 Note that a secondary prompt on a line by itself in an example means
 430 you must type a blank line; this is used to end a multi-line command.
 431
 432 \section{Using Python as a Calculator}
 433
 434 Let's try some simple Python commands.  Start the interpreter and wait
 435 for the primary prompt, {\tt >>>}.  (It shouldn't take long.)
 436
 437 \subsection{Numbers}
 438
 439 The interpreter acts as a simple calculator: you can type an
 440 expression at it and it will write the value.  Expression syntax is
 441 straightforward: the operators {\tt +}, {\tt -}, {\tt *} and {\tt /}
 442 work just like in most other languages (e.g., Pascal or C); parentheses
 443 can be used for grouping.  For example:
 444
 445 \bcode\begin{verbatim}
 446 >>> 2+2
 447 4
 448 >>> # This is a comment
 449 ... 2+2
 450 4
 451 >>> 2+2  # and a comment on the same line as code
 452 4
 453 >>> (50-5*6)/4
 454 5
 455 >>> # Integer division returns the floor:
 456 ... 7/3
 457 2
 458 >>> 7/-3
 459 -3
 460 >>>
 461 \end{verbatim}\ecode
 462 %
 463 Like in C, the equal sign ({\tt =}) is used to assign a value to a
 464 variable.  The value of an assignment is not written:
 465
 466 \bcode\begin{verbatim}
 467 >>> width = 20
 468 >>> height = 5*9
 469 >>> width * height
 470 900
 471 >>>
 472 \end{verbatim}\ecode
 473 %
 474 A value can be assigned to several variables simultaneously:
 475
 476 \bcode\begin{verbatim}
 477 >>> x = y = z = 0  # Zero x, y and z
 478 >>> x
 479 0
 480 >>> y
 481 0
 482 >>> z
 483 0
 484 >>>
 485 \end{verbatim}\ecode
 486 %
 487 There is full support for floating point; operators with mixed type
 488 operands convert the integer operand to floating point:
 489
 490 \bcode\begin{verbatim}
 491 >>> 4 * 2.5 / 3.3
 492 3.0303030303
 493 >>> 7.0 / 2
 494 3.5
 495 >>>
 496 \end{verbatim}\ecode
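
The flooring behaviour of integer division and the conversion of
mixed operands can also be checked with the built-in functions
{\tt divmod()} and {\tt float()}.  The sketch below is not interpreter
output.  It uses {\tt divmod()} rather than {\tt /} on the assumption
that the reader may be running a newer interpreter, where {\tt /}
applied to two integers no longer returns the floor.

```python
# divmod(a, b) returns the pair (quotient, remainder); the quotient
# is rounded towards minus infinity, like the integer division above.
q1, r1 = divmod(7, 3)     # q1 is 2, r1 is 1
q2, r2 = divmod(7, -3)    # q2 is -3, r2 is -2

# quotient * divisor + remainder gives back the dividend.
check1 = q1 * 3 + r1      # 7
check2 = q2 * -3 + r2     # 7

# In a mixed operation the integer operand is converted to floating
# point first, so both expressions below compute the same value.
mixed = 7.0 / 2
explicit = 7.0 / float(2)
```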
 497
 498 \subsection{Strings}
 499
 500 Besides numbers, Python can also manipulate strings, enclosed in
 501 single quotes or double quotes:
 502
 503 \bcode\begin{verbatim}
 504 >>> 'foo bar'
 505 'foo bar'
 506 >>> 'doesn\'t'
 507 "doesn't"
 508 >>> "doesn't"
 509 "doesn't"
 510 >>> '"Yes," he said.'
 511 '"Yes," he said.'
 512 >>> "\"Yes,\" he said."
 513 '"Yes," he said.'
 514 >>> '"Isn\'t," she said.'
 515 '"Isn\'t," she said.'
 516 >>>
 517 \end{verbatim}\ecode
 518 %
 519 Strings are written the same way as they are typed for input: inside
 520 quotes and with quotes and other funny characters escaped by backslashes,
 521 to show the precise value.  The string is enclosed in double quotes if
 522 the string contains a single quote and no double quotes, else it's
 523 enclosed in single quotes.  (The {\tt print} statement, described later,
 524 can be used to write strings without quotes or escapes.)
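
The quoting rule can be observed outside the interactive prompt as
well.  The sketch below assumes a current Python interpreter, where
the built-in {\tt repr()} returns the same representation that the
interpreter writes for a string value.

```python
samples = ["doesn't", '"Yes," he said.', '"Isn\'t," she said.']

shown = []
for s in samples:
    shown.append(repr(s))

# shown[0] is enclosed in double quotes: the string contains a single
# quote and no double quotes.
# shown[1] and shown[2] are enclosed in single quotes; in shown[2] the
# embedded single quote is escaped with a backslash.
```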
 525
 526 Strings can be concatenated (glued together) with the {\tt +}
 527 operator, and repeated with {\tt *}:
 528
 529 \bcode\begin{verbatim}
 530 >>> word = 'Help' + 'A'
 531 >>> word
 532 'HelpA'
 533 >>> '<' + word*5 + '>'
 534 '<HelpAHelpAHelpAHelpAHelpA>'
 535 >>>
 536 \end{verbatim}\ecode
 537 %
 538 Strings can be subscripted (indexed); like in C, the first character of
 539 a string has subscript (index) 0.
 540
 541 There is no separate character type; a character is simply a string of
 542 size one.  Like in Icon, substrings can be specified with the {\em
 543 slice} notation: two indices separated by a colon.
 544
 545 \bcode\begin{verbatim}
 546 >>> word[4]
 547 'A'
 548 >>> word[0:2]
 549 'He'
 550 >>> word[2:4]
 551 'lp'
 552 >>>
 553 \end{verbatim}\ecode
 554 %
 555 Slice indices have useful defaults; an omitted first index defaults to
 556 zero, an omitted second index defaults to the size of the string being
 557 sliced.
 558
 559 \bcode\begin{verbatim}
 560 >>> word[:2]    # The first two characters
 561 'He'
 562 >>> word[2:]    # All but the first two characters
 563 'lpA'
 564 >>>
 565 \end{verbatim}\ecode
 566 %
 567 Here's a useful invariant of slice operations: \verb\s[:i] + s[i:]\
 568 equals \verb\s\.
 569
 570 \bcode\begin{verbatim}
 571 >>> word[:2] + word[2:]
 572 'HelpA'
 573 >>> word[:3] + word[3:]
 574 'HelpA'
 575 >>>
 576 \end{verbatim}\ecode
 577 %
 578 Degenerate slice indices are handled gracefully: an index that is too
 579 large is replaced by the string size, an upper bound smaller than the
 580 lower bound returns an empty string.
 581
 582 \bcode\begin{verbatim}
 583 >>> word[1:100]
 584 'elpA'
 585 >>> word[10:]
 586 ''
 587 >>> word[2:1]
 588 ''
 589 >>>
 590 \end{verbatim}\ecode
 591 %
 592 Indices may be negative numbers, to start counting from the right.
 593 For example:
 594
 595 \bcode\begin{verbatim}
 596 >>> word[-1]     # The last character
 597 'A'
 598 >>> word[-2]     # The last-but-one character
 599 'p'
 600 >>> word[-2:]    # The last two characters
 601 'pA'
 602 >>> word[:-2]    # All but the last two characters
 603 'Hel'
 604 >>>
 605 \end{verbatim}\ecode
 606 %
 607 But note that -0 is really the same as 0, so it does not count from
 608 the right!
 609
 610 \bcode\begin{verbatim}
 611 >>> word[-0]     # (since -0 equals 0)
 612 'H'
 613 >>>
 614 \end{verbatim}\ecode
 615 %
 616 Out-of-range negative slice indices are truncated, but don't try this
 617 for single-element (non-slice) indices:
 618
 619 \bcode\begin{verbatim}
 620 >>> word[-100:]
 621 'HelpA'
 622 >>> word[-10]    # error
 623 Traceback (innermost last):
 624   File "<stdin>", line 1
 625 IndexError: string index out of range
 626 >>>
 627 \end{verbatim}\ecode
 628 %
 629 The best way to remember how slices work is to think of the indices as
 630 pointing {\em between} characters, with the left edge of the first
 631 character numbered 0.  Then the right edge of the last character of a
 632 string of {\tt n} characters has index {\tt n}, for example:
 633
 634 \bcode\begin{verbatim}
 635  +---+---+---+---+---+
 636  | H | e | l | p | A |
 637  +---+---+---+---+---+
 638  0   1   2   3   4   5
 639 -5  -4  -3  -2  -1
 640 \end{verbatim}\ecode
 641 %
 642 The first row of numbers gives the position of the indices 0...5 in
 643 the string; the second row gives the corresponding negative indices.
 644 The slice from \verb\i\ to \verb\j\ consists of all characters between
 645 the edges labeled \verb\i\ and \verb\j\, respectively.
 646
 647 For nonnegative indices, the length of a slice is the difference of
 648 the indices, if both are within bounds, e.g., the length of
 649 \verb\word[1:3]\ is 2.
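
The slice rules given so far can be checked mechanically.  In the
sketch below, the helper names {\tt split\_invariant\_holds()} and
{\tt slice\_length()} are hypothetical; the first tries the invariant
for every index in a range that deliberately extends beyond both ends
of the string.

```python
word = 'HelpA'

def split_invariant_holds(s):
    # Check s[:i] + s[i:] == s for indices from below -len(s) to
    # above len(s), so negative and out-of-range values are covered.
    for i in range(-len(s) - 3, len(s) + 4):
        if s[:i] + s[i:] != s:
            return 0
    return 1

def slice_length(s, i, j):
    # For nonnegative indices within bounds this equals j - i, or 0
    # when the upper bound is smaller than the lower bound.
    return len(s[i:j])

ok = split_invariant_holds(word)        # 1
two = slice_length(word, 1, 3)          # 2
empty = slice_length(word, 2, 1)        # 0
```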
 650
 651 The built-in function {\tt len()} returns the length of a string:
 652
 653 \bcode\begin{verbatim}
 654 >>> s = 'supercalifragilisticexpialidocious'
 655 >>> len(s)
 656 34
 657 >>>
 658 \end{verbatim}\ecode
 659
 660 \subsection{Lists}
 661
 662 Python knows a number of {\em compound} data types, used to group
 663 together other values.  The most versatile is the {\em list}, which
 664 can be written as a list of comma-separated values (items) between
 665 square brackets.  List items need not all have the same type.
 666
 667 \bcode\begin{verbatim}
 668 >>> a = ['foo', 'bar', 100, 1234]
 669 >>> a
 670 ['foo', 'bar', 100, 1234]
 671 >>>
 672 \end{verbatim}\ecode
 673 %
 674 Like string indices, list indices start at 0, and lists can be sliced,
 675 concatenated and so on:
 676
 677 \bcode\begin{verbatim}
 678 >>> a[0]
 679 'foo'
 680 >>> a[3]
 681 1234
 682 >>> a[-2]
 683 100
 684 >>> a[1:-1]
 685 ['bar', 100]
 686 >>> a[:2] + ['bletch', 2*2]
 687 ['foo', 'bar', 'bletch', 4]
 688 >>> 3*a[:3] + ['Boe!']
 689 ['foo', 'bar', 100, 'foo', 'bar', 100, 'foo', 'bar', 100, 'Boe!']
 690 >>>
 691 \end{verbatim}\ecode
 692 %
 693 Unlike strings, which are {\em immutable}, it is possible to change
 694 individual elements of a list:
 695
 696 \bcode\begin{verbatim}
 697 >>> a
 698 ['foo', 'bar', 100, 1234]
 699 >>> a[2] = a[2] + 23
 700 >>> a
 701 ['foo', 'bar', 123, 1234]
 702 >>>
 703 \end{verbatim}\ecode
 704 %
 705 Assignment to slices is also possible, and this can even change the size
 706 of the list:
 707
 708 \bcode\begin{verbatim}
 709 >>> # Replace some items:
 710 ... a[0:2] = [1, 12]
 711 >>> a
 712 [1, 12, 123, 1234]
 713 >>> # Remove some:
 714 ... a[0:2] = []
 715 >>> a
 716 [123, 1234]
 717 >>> # Insert some:
 718 ... a[1:1] = ['bletch', 'xyzzy']
 719 >>> a
 720 [123, 'bletch', 'xyzzy', 1234]
 721 >>> a[:0] = a     # Insert (a copy of) itself at the beginning
 722 >>> a
 723 [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
 724 >>>
 725 \end{verbatim}\ecode
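
The three uses of slice assignment shown above (replacing, removing
and inserting) can be wrapped in small helpers.  This is only a
sketch: the function names are hypothetical, and a separate list
{\tt items} is used so that the running example is not disturbed.
Each helper changes the list it is given in place.

```python
def replace_items(lst, i, j, new_items):
    lst[i:j] = new_items      # may grow or shrink the list

def remove_items(lst, i, j):
    lst[i:j] = []             # assigning an empty list deletes the slice

def insert_items(lst, i, new_items):
    lst[i:i] = new_items      # an empty slice marks the insertion point

items = ['foo', 'bar', 123, 1234]
replace_items(items, 0, 2, [1, 12])          # [1, 12, 123, 1234]
remove_items(items, 0, 2)                    # [123, 1234]
insert_items(items, 1, ['bletch', 'xyzzy'])  # [123, 'bletch', 'xyzzy', 1234]
```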
 726 %
 727 The built-in function {\tt len()} also applies to lists:
 728
 729 \bcode\begin{verbatim}
 730 >>> len(a)
 731 8
 732 >>>
 733 \end{verbatim}\ecode
 734 %
 735 It is possible to nest lists (create lists containing other lists),
 736 for example:
 737
 738 \bcode\begin{verbatim}
 739 >>> q = [2, 3]
 740 >>> p = [1, q, 4]
 741 >>> len(p)
 742 3
 743 >>> p[1]
 744 [2, 3]
 745 >>> p[1][0]
 746 2
 747 >>> p[1].append('xtra')     # See section 5.1
 748 >>> p
 749 [1, [2, 3, 'xtra'], 4]
 750 >>> q
 751 [2, 3, 'xtra']
 752 >>>
 753 \end{verbatim}\ecode
 754 %
 755 Note that in the last example, {\tt p[1]} and {\tt q} really refer to
 756 the same object!  We'll come back to {\em object semantics} later.
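
As a small preview, the sketch below contrasts two names for one list
with a separate copy of it.  It relies on the {\tt is} operator, which
tests whether two expressions refer to the same object, and on a full
slice to make the copy; the variable names are arbitrary.

```python
q = [2, 3]
p = [1, q, 4]

same_object = p[1] is q      # true: both expressions lead to one list

# A full slice builds a new list with the same items, so changes to
# the copy are not visible through q or p.
r = q[:]
r.append('only in r')

# A change made through q, by contrast, shows up in p as well.
q.append('xtra')
```

After these statements {\tt p} contains the item {\tt 'xtra'} in its
nested list, while {\tt r} does not.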
 757
 758 \section{First Steps Towards Programming}
 759
 760 Of course, we can use Python for more complicated tasks than adding
 761 two and two together.  For instance, we can write an initial
 762 subsequence of the {\em Fibonacci} series as follows:
 763
 764 \bcode\begin{verbatim}
 765 >>> # Fibonacci series:
 766 ... # the sum of two elements defines the next
 767 ... a, b = 0, 1
 768 >>> while b < 10:
 769 ...       print b
 770 ...       a, b = b, a+b
 771 ...
 772 1
 773 1
 774 2
 775 3
 776 5
 777 8
 778 >>>
 779 \end{verbatim}\ecode
 780 %
 781 This example introduces several new features.
 782
 783 \begin{itemize}
 784
 785 \item
 786 The first line contains a {\em multiple assignment}: the variables
 787 {\tt a} and {\tt b} simultaneously get the new values 0 and 1.  On the
 788 last line this is used again, demonstrating that the expressions on
 789 the right-hand side are all evaluated first before any of the
 790 assignments take place.
 791
 792 \item
 793 The {\tt while} loop executes as long as the condition (here: {\tt b <
 794 10}) remains true.  In Python, like in C, any non-zero integer value is
 795 true; zero is false.  The condition may also be a string or list value,
 796 in fact any sequence; anything with a non-zero length is true, empty
 797 sequences are false.  The test used in the example is a simple
 798 comparison.  The standard comparison operators are written the same as
 799 in C: {\tt <}, {\tt >}, {\tt ==}, {\tt <=}, {\tt >=} and {\tt !=}.
 800
 801 \item
 802 The {\em body} of the loop is {\em indented}: indentation is Python's
 803 way of grouping statements.  Python does not (yet!) provide an
 804 intelligent input line editing facility, so you have to type a tab or
 805 space(s) for each indented line.  In practice you will prepare more
 806 complicated input for Python with a text editor; most text editors have
 807 an auto-indent facility.  When a compound statement is entered
 808 interactively, it must be followed by a blank line to indicate
 809 completion (since the parser cannot guess when you have typed the last
 810 line).
 811
 812 \item
 813 The {\tt print} statement writes the value of the expression(s) it is
 814 given.  It differs from just writing the expression you want to write
 815 (as we did earlier in the calculator examples) in the way it handles
 816 multiple expressions and strings.  Strings are written without quotes,
 817 and a space is inserted between items, so you can format things nicely,
 818 like this:
 819
 820 \bcode\begin{verbatim}
 821 >>> i = 256*256
 822 >>> print 'The value of i is', i
 823 The value of i is 65536
 824 >>>
 825 \end{verbatim}\ecode
 826 %
 827 A trailing comma avoids the newline after the output:
 828
 829 \bcode\begin{verbatim}
 830 >>> a, b = 0, 1
 831 >>> while b < 1000:
 832 ...     print b,
 833 ...     a, b = b, a+b
 834 ...
 835 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
 836 >>>
 837 \end{verbatim}\ecode
 838 %
 839 Note that the interpreter inserts a newline before it prints the next
 840 prompt if the last line was not completed.
 841
 842 \end{itemize}
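
The points above can be gathered into one short sketch.  It collects
the series in a list instead of printing it, and it uses function
definitions, which are only introduced later in this tutorial; the
names {\tt fib\_below()} and {\tt is\_true()} are hypothetical.

```python
def fib_below(limit):
    # Return the Fibonacci numbers smaller than limit as a list.
    result = []
    a, b = 0, 1                # multiple assignment
    while b < limit:
        result.append(b)
        a, b = b, a+b          # right-hand side is evaluated first
    return result

def is_true(value):
    # Apply the truth rules for conditions to any value.
    if value:
        return 1
    return 0

small = fib_below(10)
truths = [is_true(0), is_true(7), is_true(''), is_true('x'),
          is_true([]), is_true([0])]
```

Here {\tt small} ends up holding the six numbers printed by the first
loop above, and {\tt truths} shows that zero and empty sequences are
false while a one-item list is true even if the item itself is zero.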
 843
 844
 845 \chapter{More Control Flow Tools}
 846
Besides the {\tt while} statement just introduced, Python provides the
usual control flow statements found in other languages, with some
twists.
 850
 851 \section{If Statements}
 852
Perhaps the best-known statement type is the {\tt if} statement.
For example:
 855
 856 \bcode\begin{verbatim}
 857 >>> if x < 0:
 858 ...      x = 0
 859 ...      print 'Negative changed to zero'
 860 ... elif x == 0:
 861 ...      print 'Zero'
 862 ... elif x == 1:
 863 ...      print 'Single'
 864 ... else:
 865 ...      print 'More'
 866 ...
 867 \end{verbatim}\ecode
 868 %
 869 There can be zero or more {\tt elif} parts, and the {\tt else} part is
 870 optional.  The keyword `{\tt elif}' is short for `{\tt else if}', and is
 871 useful to avoid excessive indentation.  An {\tt if...elif...elif...}
 872 sequence is a substitute for the {\em switch} or {\em case} statements
 873 found in other languages.
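
As a rough sketch of the last point, the chain below plays the role of
a {\em switch} statement.  It is wrapped in a function so that it can
be tried on several inputs; the name {\tt describe} is made up for this
illustration, and the sketch assumes a current Python 3 interpreter
rather than the release described in this tutorial.

```python
def describe(x):
    # Hypothetical helper: maps an integer to a label, mirroring
    # the if/elif/else chain shown above.
    if x < 0:
        return 'Negative'
    elif x == 0:
        return 'Zero'
    elif x == 1:
        return 'Single'
    else:
        return 'More'

# Try the chain on one value per branch.
labels = []
for n in (-5, 0, 1, 42):
    labels.append(describe(n))
```

Exactly one branch is taken for each value, so the order of the tests
matters only when the conditions overlap.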
 874
 875 \section{For Statements}
 876
The {\tt for} statement in Python differs a bit from what you may be
used to in C or Pascal.  Rather than always iterating over an
arithmetic progression of numbers (as in Pascal), or leaving the user
completely free in the iteration test and step (as in C), Python's {\tt
for} statement iterates over the items of any sequence (e.g., a list
or a string), in the order that they appear in the sequence.  For
example (no pun intended):
 884
 885 \bcode\begin{verbatim}
 886 >>> # Measure some strings:
 887 ... a = ['cat', 'window', 'defenestrate']
 888 >>> for x in a:
 889 ...     print x, len(x)
 890 ...
 891 cat 3
 892 window 6
 893 defenestrate 12
 894 >>>
 895 \end{verbatim}\ecode
 896 %
 897 It is not safe to modify the sequence being iterated over in the loop
 898 (this can only happen for mutable sequence types, i.e., lists).  If
 899 you need to modify the list you are iterating over, e.g., duplicate
 900 selected items, you must iterate over a copy.  The slice notation
 901 makes this particularly convenient:
 902
 903 \bcode\begin{verbatim}
 904 >>> for x in a[:]: # make a slice copy of the entire list
 905 ...    if len(x) > 6: a.insert(0, x)
 906 ...
 907 >>> a
 908 ['defenestrate', 'cat', 'window', 'defenestrate']
 909 >>>
 910 \end{verbatim}\ecode
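
The following sketch restates the slice-copy idiom as a script and
places one possible alternative next to it: collecting the selected
items first and combining the lists afterwards.  The variable names are
invented for this illustration, and Python 3 is assumed.

```python
words = ['cat', 'window', 'defenestrate']

# Iterate over a slice copy so that inserting into `words`
# does not disturb the loop.
for w in words[:]:
    if len(w) > 6:
        words.insert(0, w)

# An alternative that never mutates the list being iterated over:
# collect the selected items first, then combine the two lists.
original = ['cat', 'window', 'defenestrate']
long_words = []
for w in original:
    if len(w) > 6:
        long_words.append(w)
combined = long_words + original
```

Both approaches give the same list here; the second leaves the original
list untouched, which may or may not be what you want.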
 911
 912 \section{The {\tt range()} Function}
 913
 914 If you do need to iterate over a sequence of numbers, the built-in
 915 function {\tt range()} comes in handy.  It generates lists containing
 916 arithmetic progressions, e.g.:
 917
 918 \bcode\begin{verbatim}
 919 >>> range(10)
 920 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 921 >>>
 922 \end{verbatim}\ecode
 923 %
 924 The given end point is never part of the generated list; {\tt range(10)}
 925 generates a list of 10 values, exactly the legal indices for items of a
 926 sequence of length 10.  It is possible to let the range start at another
 927 number, or to specify a different increment (even negative):
 928
 929 \bcode\begin{verbatim}
 930 >>> range(5, 10)
 931 [5, 6, 7, 8, 9]
 932 >>> range(0, 10, 3)
 933 [0, 3, 6, 9]
 934 >>> range(-10, -100, -30)
 935 [-10, -40, -70]
 936 >>>
 937 \end{verbatim}\ecode
 938 %
 939 To iterate over the indices of a sequence, combine {\tt range()} and
 940 {\tt len()} as follows:
 941
 942 \bcode\begin{verbatim}
 943 >>> a = ['Mary', 'had', 'a', 'little', 'lamb']
 944 >>> for i in range(len(a)):
 945 ...     print i, a[i]
 946 ...
 947 0 Mary
 948 1 had
 949 2 a
 950 3 little
 951 4 lamb
 952 >>>
 953 \end{verbatim}\ecode
 954
 955 \section{Break and Continue Statements, and Else Clauses on Loops}
 956
 957 The {\tt break} statement, like in C, breaks out of the smallest
 958 enclosing {\tt for} or {\tt while} loop.
 959
 960 The {\tt continue} statement, also borrowed from C, continues with the
 961 next iteration of the loop.
 962
Loop statements may have an {\tt else} clause; it is executed when the
loop terminates through exhaustion of the list (with {\tt for}) or when
the condition becomes false (with {\tt while}), but not when the loop is
terminated by a {\tt break} statement.  This is exemplified by the
following loop, which searches for prime numbers:
 968
 969 \bcode\begin{verbatim}
 970 >>> for n in range(2, 10):
 971 ...     for x in range(2, n):
 972 ...         if n % x == 0:
 973 ...            print n, 'equals', x, '*', n/x
 974 ...            break
 975 ...     else:
 976 ...          print n, 'is a prime number'
 977 ...
 978 2 is a prime number
 979 3 is a prime number
 980 4 equals 2 * 2
 981 5 is a prime number
 982 6 equals 2 * 3
 983 7 is a prime number
 984 8 equals 2 * 4
 985 9 equals 3 * 3
 986 >>>
 987 \end{verbatim}\ecode
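
The same {\tt for}/{\tt else} pattern can be packaged as a function.
The sketch below is only an illustration: the name
{\tt smallest\_factor} is hypothetical, the argument is assumed to be
an integer of at least 2, and Python 3 is assumed.

```python
def smallest_factor(n):
    # Hypothetical helper: returns the smallest factor of n greater
    # than 1, or None when n is prime (n is assumed to be >= 2).
    for x in range(2, n):
        if n % x == 0:
            found = x
            break
    else:
        # Reached only when the loop ran to exhaustion without break.
        found = None
    return found
```

Note that for {\tt n} equal to 2 the range is empty, so the loop body
never runs and the {\tt else} clause is executed immediately.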
 988
 989 \section{Pass Statements}
 990
 991 The {\tt pass} statement does nothing.
 992 It can be used when a statement is required syntactically but the
 993 program requires no action.
 994 For example:
 995
 996 \bcode\begin{verbatim}
 997 >>> while 1:
 998 ...       pass # Busy-wait for keyboard interrupt
 999 ...
1000 \end{verbatim}\ecode
1001
1002 \section{Defining Functions}
1003
1004 We can create a function that writes the Fibonacci series to an
1005 arbitrary boundary:
1006
1007 \bcode\begin{verbatim}
1008 >>> def fib(n):    # write Fibonacci series up to n
1009 ...     a, b = 0, 1
1010 ...     while b <= n:
1011 ...           print b,
1012 ...           a, b = b, a+b
1013 ...
1014 >>> # Now call the function we just defined:
1015 ... fib(2000)
1016 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
1017 >>>
1018 \end{verbatim}\ecode
1019 %
The keyword {\tt def} introduces a function {\em definition}.  It must
be followed by the function name and the parenthesized list of formal
parameters.  The statements that form the body of the function start at
the next line, indented by a tab stop.
1024
1025 The {\em execution} of a function introduces a new symbol table used
1026 for the local variables of the function.  More precisely, all variable
1027 assignments in a function store the value in the local symbol table;
1028 whereas
1029 variable references first look in the local symbol table, then
1030 in the global symbol table, and then in the table of built-in names.
1031 Thus,
1032 global variables cannot be directly assigned to from within a
1033 function (unless named in a {\tt global} statement), although
1034 they may be referenced.
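
A small sketch may make these lookup rules concrete.  The three
functions and the variable {\tt counter} are invented for this
illustration, and Python 3 is assumed.

```python
counter = 0

def bump_local():
    # Assignment creates a new local name; the global is untouched.
    counter = 100
    return counter

def bump_global():
    # The global statement makes the assignment target the global name.
    global counter
    counter = counter + 1
    return counter

def read_global():
    # A reference falls through from the local to the global table.
    return counter

local_result = bump_local()    # value of the local variable
after_local = read_global()    # the global is still unchanged
bump_global()
after_global = read_global()   # the global has now been incremented
```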
1035
1036 The actual parameters (arguments) to a function call are introduced in
1037 the local symbol table of the called function when it is called; thus,
1038 arguments are passed using {\em call\ by\ value}.%
1039 \footnote{
1040          Actually, {\em call  by  object reference} would be a better
1041          description, since if a mutable object is passed, the caller
1042          will see any changes the callee makes to it (e.g., items
1043          inserted into a list).
1044 }
1045 When a function calls another function, a new local symbol table is
1046 created for that call.
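
The distinction drawn in the footnote can be sketched as follows: a
change made {\em to} a mutable argument is visible to the caller,
whereas rebinding the parameter name is not.  The function names are
hypothetical, and Python 3 is assumed.

```python
def add_item(items):
    # Mutates the list object that the caller passed in.
    items.append('new')

def rebind(items):
    # Rebinding the local name does not affect the caller's object.
    items = ['replaced']
    return items

shopping = ['eggs']
add_item(shopping)             # the caller sees the appended item
rebound = rebind(shopping)     # the caller's list is left alone
```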
1047
1048 A function definition introduces the function name in the
1049 current
1050 symbol table.  The value
1051 of the function name
1052 has a type that is recognized by the interpreter as a user-defined
1053 function.  This value can be assigned to another name which can then
1054 also be used as a function.  This serves as a general renaming
1055 mechanism:
1056
1057 \bcode\begin{verbatim}
1058 >>> fib
1059 <function object at 10042ed0>
1060 >>> f = fib
1061 >>> f(100)
1062 1 1 2 3 5 8 13 21 34 55 89
1063 >>>
1064 \end{verbatim}\ecode
1065 %
1066 You might object that {\tt fib} is not a function but a procedure.  In
1067 Python, like in C, procedures are just functions that don't return a
1068 value.  In fact, technically speaking, procedures do return a value,
1069 albeit a rather boring one.  This value is called {\tt None} (it's a
1070 built-in name).  Writing the value {\tt None} is normally suppressed by
1071 the interpreter if it would be the only value written.  You can see it
1072 if you really want to:
1073
1074 \bcode\begin{verbatim}
1075 >>> print fib(0)
1076 None
1077 >>>
1078 \end{verbatim}\ecode
1079 %
1080 It is simple to write a function that returns a list of the numbers of
1081 the Fibonacci series, instead of printing it:
1082
1083 \bcode\begin{verbatim}
1084 >>> def fib2(n): # return Fibonacci series up to n
1085 ...     result = []
1086 ...     a, b = 0, 1
1087 ...     while b <= n:
1088 ...           result.append(b)    # see below
1089 ...           a, b = b, a+b
1090 ...     return result
1091 ...
1092 >>> f100 = fib2(100)    # call it
1093 >>> f100                # write the result
1094 [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
1095 >>>
1096 \end{verbatim}\ecode
1097 %
1098 This example, as usual, demonstrates some new Python features:
1099
1100 \begin{itemize}
1101
1102 \item
1103 The {\tt return} statement returns with a value from a function.  {\tt
1104 return} without an expression argument is used to return from the middle
1105 of a procedure (falling off the end also returns from a procedure), in
1106 which case the {\tt None} value is returned.
1107
1108 \item
1109 The statement {\tt result.append(b)} calls a {\em method} of the list
1110 object {\tt result}.  A method is a function that `belongs' to an
1111 object and is named {\tt obj.methodname}, where {\tt obj} is some
1112 object (this may be an expression), and {\tt methodname} is the name
1113 of a method that is defined by the object's type.  Different types
1114 define different methods.  Methods of different types may have the
1115 same name without causing ambiguity.  (It is possible to define your
1116 own object types and methods, using {\em classes}, as discussed later
1117 in this tutorial.)
The method {\tt append} shown in the example is defined for
list objects; it adds a new element at the end of the list.  In this
example
it is equivalent to {\tt result = result + [b]}, but more efficient.
1122
1123 \end{itemize}
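
Both points can be tried out in a short sketch.  The function
{\tt first\_even} and the name {\tt alias} are invented for this
illustration, and Python 3 is assumed.  The second half shows why the
equivalence between {\tt append} and concatenation holds only when no
other name refers to the same list.

```python
def first_even(numbers):
    # Hypothetical helper: returns the first even number in the list.
    if not numbers:
        return            # bare return: the caller receives None
    for n in numbers:
        if n % 2 == 0:
            return n
    # Falling off the end also returns None.

result = []
result.append(1)          # modifies the list object in place
alias = result            # a second name for the same list
result = result + [2]     # builds a new list; alias is unchanged
```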
1124
1125
1126 \chapter{Odds and Ends}
1127
This chapter describes in more detail some things you've already
learned about, and adds some new things as well.
1130
1131 \section{More on Lists}
1132
The list data type has some more methods.  Here are all of the methods
of list objects:
1135
1136 \begin{description}
1137
1138 \item[{\tt insert(i, x)}]
1139 Insert an item at a given position.  The first argument is the index of
1140 the element before which to insert, so {\tt a.insert(0, x)} inserts at
1141 the front of the list, and {\tt a.insert(len(a), x)} is equivalent to
1142 {\tt a.append(x)}.
1143
1144 \item[{\tt append(x)}]
1145 Equivalent to {\tt a.insert(len(a), x)}.
1146
1147 \item[{\tt index(x)}]
1148 Return the index in the list of the first item whose value is {\tt x}.
1149 It is an error if there is no such item.
1150
1151 \item[{\tt remove(x)}]
1152 Remove the first item from the list whose value is {\tt x}.
1153 It is an error if there is no such item.
1154
1155 \item[{\tt sort()}]
1156 Sort the items of the list, in place.
1157
1158 \item[{\tt reverse()}]
1159 Reverse the elements of the list, in place.
1160
1161 \item[{\tt count(x)}]
1162 Return the number of times {\tt x} appears in the list.
1163
1164 \end{description}
1165
1166 An example that uses all list methods:
1167
1168 \bcode\begin{verbatim}
1169 >>> a = [66.6, 333, 333, 1, 1234.5]
1170 >>> print a.count(333), a.count(66.6), a.count('x')
1171 2 1 0
1172 >>> a.insert(2, -1)
1173 >>> a.append(333)
1174 >>> a
1175 [66.6, 333, -1, 333, 1, 1234.5, 333]
1176 >>> a.index(333)
1177 1
1178 >>> a.remove(333)
1179 >>> a
1180 [66.6, -1, 333, 1, 1234.5, 333]
1181 >>> a.reverse()
1182 >>> a
1183 [333, 1234.5, 1, 333, -1, 66.6]
1184 >>> a.sort()
1185 >>> a
1186 [-1, 1, 66.6, 333, 333, 1234.5]
1187 >>>
1188 \end{verbatim}\ecode
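
The descriptions of {\tt index} and {\tt remove} say that a missing
item is an error.  The sketch below shows two ways of coping with
that.  It assumes Python 3, where the error is reported as a
{\tt ValueError} exception; the wrapper {\tt safe\_index} is
hypothetical.

```python
a = [66.6, 333, 333, 1, 1234.5]

def safe_index(lst, value):
    # Hypothetical wrapper: returns -1 instead of raising an error.
    try:
        return lst.index(value)
    except ValueError:
        return -1

found = safe_index(a, 333)      # index of the first match
missing = safe_index(a, 'x')    # no such item in the list

# Guarding remove() with a membership test avoids the error.
if 'x' in a:
    a.remove('x')
```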
1189
1190 \section{The {\tt del} statement}
1191
1192 There is a way to remove an item from a list given its index instead
1193 of its value: the {\tt del} statement.  This can also be used to
1194 remove slices from a list (which we did earlier by assignment of an
1195 empty list to the slice).  For example:
1196
1197 \bcode\begin{verbatim}
1198 >>> a
1199 [-1, 1, 66.6, 333, 333, 1234.5]
1200 >>> del a[0]
1201 >>> a
1202 [1, 66.6, 333, 333, 1234.5]
1203 >>> del a[2:4]
1204 >>> a
1205 [1, 66.6, 1234.5]
1206 >>>
1207 \end{verbatim}\ecode
1208 %
1209 {\tt del} can also be used to delete entire variables:
1210
1211 \bcode\begin{verbatim}
1212 >>> del a
1213 >>>
1214 \end{verbatim}\ecode
1215 %
1216 Referencing the name {\tt a} hereafter is an error (at least until
1217 another value is assigned to it).  We'll find other uses for {\tt del}
1218 later.
1219
1220 \section{Tuples and Sequences}
1221
1222 We saw that lists and strings have many common properties, e.g.,
1223 indexing and slicing operations.  They are two examples of {\em
1224 sequence} data types.  Since Python is an evolving language, other
1225 sequence data types may be added.  There is also another standard
1226 sequence data type: the {\em tuple}.
1227
1228 A tuple consists of a number of values separated by commas, for
1229 instance:
1230
1231 \bcode\begin{verbatim}
1232 >>> t = 12345, 54321, 'hello!'
1233 >>> t[0]
1234 12345
1235 >>> t
1236 (12345, 54321, 'hello!')
1237 >>> # Tuples may be nested:
1238 ... u = t, (1, 2, 3, 4, 5)
1239 >>> u
1240 ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
1241 >>>
1242 \end{verbatim}\ecode
1243 %
As you see, on output tuples are always enclosed in parentheses, so
that nested tuples are interpreted correctly; they may be input with
or without surrounding parentheses, although often parentheses are
necessary anyway (if the tuple is part of a larger expression).
1248
1249 Tuples have many uses, e.g., (x, y) coordinate pairs, employee records
1250 from a database, etc.  Tuples, like strings, are immutable: it is not
1251 possible to assign to the individual items of a tuple (you can
1252 simulate much of the same effect with slicing and concatenation,
1253 though).
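
As a sketch of that last remark, the following builds a new tuple in
which one item is replaced, leaving the original untouched.  It
assumes Python 3, where the attempted item assignment is reported as a
{\tt TypeError} exception.

```python
t = (12345, 54321, 'hello!')

# Build a new tuple in which item 1 is replaced; t itself is unchanged.
# The trailing comma makes (99999,) a tuple with a single item.
t2 = t[:1] + (99999,) + t[2:]

try:
    t[1] = 99999          # direct item assignment is not allowed
    assigned = True
except TypeError:
    assigned = False
```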
1254
1255 A special problem is the construction of tuples containing 0 or 1
1256 items: the syntax has some extra quirks to accommodate these.  Empty
1257 tuples are constructed by an empty pair of parentheses; a tuple with
1258 one item is constructed by following a value with a comma
1259 (it is not sufficient to enclose a single value in parentheses).
1260 Ugly, but effective.  For example:
1261
1262 \bcode\begin{verbatim}
1263 >>> empty = ()
1264 >>> singleton = 'hello',    # <-- note trailing comma
1265 >>> len(empty)
1266 0
1267 >>> len(singleton)
1268 1
1269 >>> singleton
1270 ('hello',)
1271 >>>
1272 \end{verbatim}\ecode
1273 %
1274 The statement {\tt t = 12345, 54321, 'hello!'} is an example of {\em
1275 tuple packing}: the values {\tt 12345}, {\tt 54321} and {\tt 'hello!'}
1276 are packed together in a tuple.  The reverse operation is also
1277 possible, e.g.:
1278
1279 \bcode\begin{verbatim}
1280 >>> x, y, z = t
1281 >>>
1282 \end{verbatim}\ecode
1283 %
1284 This is called, appropriately enough, {\em tuple unpacking}.  Tuple
1285 unpacking requires that the list of variables on the left has the same
1286 number of elements as the length of the tuple.  Note that multiple
1287 assignment is really just a combination of tuple packing and tuple
1288 unpacking!
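
A brief sketch of packing, unpacking and their combination follows.
It assumes Python 3, where a length mismatch is reported as a
{\tt ValueError} exception.

```python
t = 12345, 54321, 'hello!'    # tuple packing
x, y, z = t                   # tuple unpacking

# Multiple assignment as packing followed by unpacking: the right-hand
# side is packed into a tuple before any name is bound, so the two
# values are exchanged without a temporary variable.
a, b = 1, 2
a, b = b, a

# The number of names on the left must match the length of the tuple.
try:
    p, q = t
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
```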
1289
1290 Occasionally, the corresponding operation on lists is useful: {\em list
1291 unpacking}.  This is supported by enclosing the list of variables in
1292 square brackets:
1293
1294 \bcode\begin{verbatim}
1295 >>> a = ['foo', 'bar', 100, 1234]
1296 >>> [a1, a2, a3, a4] = a
1297 >>>
1298 \end{verbatim}\ecode
1299
1300 \section{Dictionaries}
1301
1302 Another useful data type built into Python is the {\em dictionary}.
1303 Dictionaries are sometimes found in other languages as ``associative
1304 memories'' or ``associative arrays''.  Unlike sequences, which are
1305 indexed by a range of numbers, dictionaries are indexed by {\em keys},
1306 which are strings (the use of non-string values as keys
1307 is supported, but beyond the scope of this tutorial).
1308 It is best to think of a dictionary as an unordered set of
1309 {\em key:value} pairs, with the requirement that the keys are unique
1310 (within one dictionary).
1311 A pair of braces creates an empty dictionary: \verb/{}/.
1312 Placing a comma-separated list of key:value pairs within the
1313 braces adds initial key:value pairs to the dictionary; this is also the
1314 way dictionaries are written on output.
1315
1316 The main operations on a dictionary are storing a value with some key
1317 and extracting the value given the key.  It is also possible to delete
1318 a key:value pair
1319 with {\tt del}.
1320 If you store using a key that is already in use, the old value
1321 associated with that key is forgotten.  It is an error to extract a
1322 value using a non-existent key.
1323
The {\tt keys()} method of a dictionary object returns a list of all the
keys used in the dictionary, in arbitrary order (if you want it sorted,
just apply the {\tt sort()} method to the list of keys).  To check
whether a single key is in the dictionary, use the \verb/has_key()/
method of the dictionary.
1329
1330 Here is a small example using a dictionary:
1331
1332 \bcode\begin{verbatim}
1333 >>> tel = {'jack': 4098, 'sape': 4139}
1334 >>> tel['guido'] = 4127
1335 >>> tel
1336 {'sape': 4139, 'guido': 4127, 'jack': 4098}
1337 >>> tel['jack']
1338 4098
1339 >>> del tel['sape']
1340 >>> tel['irv'] = 4127
1341 >>> tel
1342 {'guido': 4127, 'irv': 4127, 'jack': 4098}
1343 >>> tel.keys()
1344 ['guido', 'irv', 'jack']
1345 >>> tel.has_key('guido')
1346 1
1347 >>>
1348 \end{verbatim}\ecode
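
For readers on a current interpreter, here is a sketch of the same
operations under Python 3.  This is an assumption about your
installation, not about the release described here: in Python 3 the
{\tt in} operator takes over the role of the key test, and the built-in
function {\tt sorted()} is one way to obtain the keys as a sorted list.

```python
tel = {'jack': 4098, 'sape': 4139}
tel['guido'] = 4127
del tel['sape']
tel['irv'] = 4127

# sorted() returns the keys as a new, sorted list.
names = sorted(tel.keys())

# The `in` operator tests whether a key is present.
has_guido = 'guido' in tel

# Extracting a value with a non-existent key is an error (KeyError) ...
try:
    tel['sape']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# ... unless a default is supplied through the get() method.
fallback = tel.get('sape', 0)
```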
1349
1350 \section{More on Conditions}
1351
1352 The conditions used in {\tt while} and {\tt if} statements above can
1353 contain other operators besides comparisons.
1354
1355 The comparison operators {\tt in} and {\tt not in} check whether a value
1356 occurs (does not occur) in a sequence.  The operators {\tt is} and {\tt
1357 is not} compare whether two objects are really the same object; this
1358 only matters for mutable objects like lists.  All comparison operators
1359 have the same priority, which is lower than that of all numerical
1360 operators.
1361
Comparisons can be chained: e.g., {\tt a < b == c} tests whether {\tt a}
is less than {\tt b} and moreover {\tt b} equals {\tt c}.
1364
1365 Comparisons may be combined by the Boolean operators {\tt and} and {\tt
1366 or}, and the outcome of a comparison (or of any other Boolean
1367 expression) may be negated with {\tt not}.  These all have lower
1368 priorities than comparison operators again; between them, {\tt not} has
1369 the highest priority, and {\tt or} the lowest, so that
1370 {\tt A and not B or C} is equivalent to {\tt (A and (not B)) or C}.  Of
1371 course, parentheses can be used to express the desired composition.
1372
The Boolean operators {\tt and} and {\tt or} are so-called {\em
shortcut} operators: their arguments are evaluated from left to right,
and evaluation stops as soon as the outcome is determined.  E.g., if
{\tt A} and {\tt C} are true but {\tt B} is false, {\tt A and B and C}
does not evaluate the expression {\tt C}.  In general, the return value
of a shortcut operator, when used as a general value and not as a
Boolean, is the last evaluated argument.
1380
1381 It is possible to assign the result of a comparison or other Boolean
1382 expression to a variable.  For example,
1383
1384 \bcode\begin{verbatim}
1385 >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
1386 >>> non_null = string1 or string2 or string3
1387 >>> non_null
1388 'Trondheim'
1389 >>>
1390 \end{verbatim}\ecode
1391 %
1392 Note that in Python, unlike C, assignment cannot occur inside expressions.
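
The shortcut behaviour can be observed directly by recording which
operands are actually evaluated.  In the sketch below the helper
{\tt check} and the list {\tt calls} are invented for this
illustration, and Python 3 is assumed.

```python
calls = []

def check(name, value):
    # Hypothetical helper: records that it was evaluated, then
    # returns the given value unchanged.
    calls.append(name)
    return value

# B is false, so C is never evaluated and the result is B's value.
outcome = check('A', 1) and check('B', 0) and check('C', 1)

# With `or`, evaluation stops at the first true operand.
first_true = '' or 'Trondheim' or 'Hammer Dance'
```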
1393
1394 \section{Comparing Sequences and Other Types}
1395
1396 Sequence objects may be compared to other objects with the same
1397 sequence type.  The comparison uses {\em lexicographical} ordering:
1398 first the first two items are compared, and if they differ this
1399 determines the outcome of the comparison; if they are equal, the next
1400 two items are compared, and so on, until either sequence is exhausted.
1401 If two items to be compared are themselves sequences of the same type,
1402 the lexicographical comparison is carried out recursively.  If all
1403 items of two sequences compare equal, the sequences are considered
equal.  If one sequence is an initial subsequence of the other, the
shorter sequence is the smaller one.  Lexicographical ordering for
1406 strings uses the ASCII ordering for individual characters.  Some
1407 examples of comparisons between sequences with the same types:
1408
1409 \bcode\begin{verbatim}
1410 (1, 2, 3)              < (1, 2, 4)
1411 [1, 2, 3]              < [1, 2, 4]
1412 'ABC' < 'C' < 'Pascal' < 'Python'
1413 (1, 2, 3, 4)           < (1, 2, 4)
1414 (1, 2)                 < (1, 2, -1)
(1, 2, 3)             == (1.0, 2.0, 3.0)
1416 (1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
1417 \end{verbatim}\ecode
1418 %
1419 Note that comparing objects of different types is legal.  The outcome
1420 is deterministic but arbitrary: the types are ordered by their name.
1421 Thus, a list is always smaller than a string, a string is always
1422 smaller than a tuple, etc.  Mixed numeric types are compared according
1423 to their numeric value, so 0 equals 0.0, etc.%
1424 \footnote{
1425         The rules for comparing objects of different types should
1426         not be relied upon; they may change in a future version of
1427         the language.
1428 }
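
The comparisons listed above can be checked mechanically.  The sketch
below assumes Python 3, and it also illustrates the caveat in the
footnote: in Python 3 the ordering of unrelated types has indeed
changed, and comparing a list with a string raises a {\tt TypeError}
exception.

```python
checks = [
    (1, 2, 3) < (1, 2, 4),
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',
    (1, 2, 3, 4) < (1, 2, 4),
    (1, 2) < (1, 2, -1),
    (1, 2, 3) == (1.0, 2.0, 3.0),
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),
]
all_true = all(checks)

# Python 3 does not order unrelated types; this raises TypeError.
try:
    [1, 2] < 'abc'
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```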
1429
1430
1431 \chapter{Modules}
1432
1433 If you quit from the Python interpreter and enter it again, the
1434 definitions you have made (functions and variables) are lost.
1435 Therefore, if you want to write a somewhat longer program, you are
1436 better off using a text editor to prepare the input for the interpreter
1437 and run it with that file as input instead.  This is known as creating a
1438 {\em script}.  As your program gets longer, you may want to split it
1439 into several files for easier maintenance.  You may also want to use a
1440 handy function that you've written in several programs without copying
1441 its definition into each program.
1442
1443 To support this, Python has a way to put definitions in a file and use
1444 them in a script or in an interactive instance of the interpreter.
1445 Such a file is called a {\em module}; definitions from a module can be
1446 {\em imported} into other modules or into the {\em main} module (the
1447 collection of variables that you have access to in a script
1448 executed at the top level
1449 and in calculator mode).
1450
A module is a file containing Python definitions and statements.  The
file name is the module name with the suffix {\tt .py} appended.  Within
a module, the module's name (as a string) is available as the value of
the global variable \verb/__name__/.  For instance, use your favorite text
editor to create a file called {\tt fibo.py} in the current directory
with the following contents:
1457
1458 \bcode\begin{verbatim}
1459 # Fibonacci numbers module
1460
1461 def fib(n):    # write Fibonacci series up to n
1462     a, b = 0, 1
1463     while b <= n:
1464           print b,
1465           a, b = b, a+b
1466
1467 def fib2(n): # return Fibonacci series up to n
1468     result = []
1469     a, b = 0, 1
1470     while b <= n:
1471           result.append(b)
1472           a, b = b, a+b
1473     return result
1474 \end{verbatim}\ecode
1475 %
1476 Now enter the Python interpreter and import this module with the
1477 following command:
1478
1479 \bcode\begin{verbatim}
1480 >>> import fibo
1481 >>>
1482 \end{verbatim}\ecode
1483 %
1484 This does not enter the names of the functions defined in
1485 {\tt fibo}
1486 directly in the current symbol table; it only enters the module name
1487 {\tt fibo}
1488 there.
1489 Using the module name you can access the functions:
1490
1491 \bcode\begin{verbatim}
1492 >>> fibo.fib(1000)
1493 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
1494 >>> fibo.fib2(100)
1495 [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
1496 >>> fibo.__name__
1497 'fibo'
1498 >>>
1499 \end{verbatim}\ecode
1500 %
1501 If you intend to use a function often you can assign it to a local name:
1502
1503 \bcode\begin{verbatim}
1504 >>> fib = fibo.fib
1505 >>> fib(500)
1506 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1507 >>>
1508 \end{verbatim}\ecode
1509
1510 \section{More on Modules}
1511
1512 A module can contain executable statements as well as function
1513 definitions.
1514 These statements are intended to initialize the module.
1515 They are executed only the
1516 {\em first}
1517 time the module is imported somewhere.%
1518 \footnote{
1519         In fact function definitions are also `statements' that are
1520         `executed'; the execution enters the function name in the
1521         module's global symbol table.
1522 }
1523
1524 Each module has its own private symbol table, which is used as the
1525 global symbol table by all functions defined in the module.
1526 Thus, the author of a module can use global variables in the module
1527 without worrying about accidental clashes with a user's global
1528 variables.
1529 On the other hand, if you know what you are doing you can touch a
1530 module's global variables with the same notation used to refer to its
1531 functions,
1532 {\tt modname.itemname}.
1533
1534 Modules can import other modules.
1535 It is customary but not required to place all
1536 {\tt import}
1537 statements at the beginning of a module (or script, for that matter).
1538 The imported module names are placed in the importing module's global
1539 symbol table.
1540
1541 There is a variant of the
1542 {\tt import}
1543 statement that imports names from a module directly into the importing
1544 module's symbol table.
1545 For example:
1546
1547 \bcode\begin{verbatim}
1548 >>> from fibo import fib, fib2
1549 >>> fib(500)
1550 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1551 >>>
1552 \end{verbatim}\ecode
1553 %
1554 This does not introduce the module name from which the imports are taken
1555 in the local symbol table (so in the example, {\tt fibo} is not
1556 defined).
1557
1558 There is even a variant to import all names that a module defines:
1559
1560 \bcode\begin{verbatim}
1561 >>> from fibo import *
1562 >>> fib(500)
1563 1 1 2 3 5 8 13 21 34 55 89 144 233 377
1564 >>>
1565 \end{verbatim}\ecode
1566 %
This imports all names except those beginning with an underscore
(\verb/_/).
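
The underscore rule can be verified with a throwaway module.  The
sketch below is only one way to set up such an experiment: it assumes
Python 3, writes a small module to a temporary directory, and runs the
import inside a separate dictionary so that the imported names can be
inspected.  The module name \verb/tut_demo_mod/ and its contents are
invented for this illustration.

```python
import os
import sys
import tempfile

# Source text of a hypothetical module with one private name.
source = (
    "visible = 1\n"
    "_hidden = 2\n"
    "def fib2(n):\n"
    "    result = []\n"
    "    a, b = 0, 1\n"
    "    while b <= n:\n"
    "        result.append(b)\n"
    "        a, b = b, a+b\n"
    "    return result\n"
)
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, 'tut_demo_mod.py'), 'w') as f:
    f.write(source)

sys.path.append(tmpdir)       # make the directory searchable

# Run the import in its own namespace so the result can be examined.
namespace = {}
exec('from tut_demo_mod import *', namespace)

# Ignore the double-underscore entries that exec() adds by itself.
imported = sorted(n for n in namespace if not n.startswith('__'))
```

The temporary directory is not cleaned up here; a longer-lived script
would want to remove it when done.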
1569
1570 \section{Standard Modules}
1571
1572 Python comes with a library of standard modules, described in a separate
1573 document (Python Library Reference).  Some modules are built into the
1574 interpreter; these provide access to operations that are not part of the
1575 core of the language but are nevertheless built in, either for
1576 efficiency or to provide access to operating system primitives such as
1577 system calls.  The set of such modules is a configuration option; e.g.,
1578 the {\tt amoeba} module is only provided on systems that somehow support
1579 Amoeba primitives.  One particular module deserves some attention: {\tt
1580 sys}, which is built into every Python interpreter.  The variables {\tt
1581 sys.ps1} and {\tt sys.ps2} define the strings used as primary and
1582 secondary prompts:
1583
1584 \bcode\begin{verbatim}
1585 >>> import sys
1586 >>> sys.ps1
1587 '>>> '
1588 >>> sys.ps2
1589 '... '
1590 >>> sys.ps1 = 'C> '
1591 C> print 'Yuck!'
1592 Yuck!
1593 C>
1594 \end{verbatim}\ecode
1595 %
1596 These two variables are only defined if the interpreter is in
1597 interactive mode.
1598
1599 The variable
1600 {\tt sys.path}
is a list of strings that determines the interpreter's search path for
modules.
1603 It is initialized to a default path taken from the environment variable
1604 {\tt PYTHONPATH},
1605 or from a built-in default if
1606 {\tt PYTHONPATH}
1607 is not set.
1608 You can modify it using standard list operations, e.g.:
1609
1610 \bcode\begin{verbatim}
1611 >>> import sys
1612 >>> sys.path.append('/ufs/guido/lib/python')
1613 >>>
1614 \end{verbatim}\ecode
1615
1616 \section{The {\tt dir()} function}
1617
The built-in function {\tt dir()} is used to find out which names a module
defines.  It returns a sorted list of strings:
1620
1621 \bcode\begin{verbatim}
1622 >>> import fibo, sys
1623 >>> dir(fibo)
1624 ['__name__', 'fib', 'fib2']
1625 >>> dir(sys)
1626 ['__name__', 'argv', 'builtin_module_names', 'copyright', 'exit',
1627 'maxint', 'modules', 'path', 'ps1', 'ps2', 'setprofile', 'settrace',
1628 'stderr', 'stdin', 'stdout', 'version']
1629 >>>
1630 \end{verbatim}\ecode
1631 %
1632 Without arguments, {\tt dir()} lists the names you have defined currently:
1633
1634 \bcode\begin{verbatim}
1635 >>> a = [1, 2, 3, 4, 5]
1636 >>> import fibo, sys
1637 >>> fib = fibo.fib
1638 >>> dir()
1639 ['__name__', 'a', 'fib', 'fibo', 'sys']
1640 >>>
1641 \end{verbatim}\ecode
1642 %
1643 Note that it lists all types of names: variables, modules, functions, etc.
1644
{\tt dir()} does not list the names of built-in functions and variables.
If you want a list of those, they are defined in the standard module
\verb/__builtin__/:
1648
1649 \bcode\begin{verbatim}
1650 >>> import __builtin__
1651 >>> dir(__builtin__)
1652 ['AccessError', 'AttributeError', 'ConflictError', 'EOFError', 'IOError',
1653 'ImportError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
1654 'MemoryError', 'NameError', 'None', 'OverflowError', 'RuntimeError',
1655 'SyntaxError', 'SystemError', 'SystemExit', 'TypeError', 'ValueError',
1656 'ZeroDivisionError', '__name__', 'abs', 'apply', 'chr', 'cmp', 'coerce',
1657 'compile', 'dir', 'divmod', 'eval', 'execfile', 'filter', 'float',
1658 'getattr', 'hasattr', 'hash', 'hex', 'id', 'input', 'int', 'len', 'long',
1659 'map', 'max', 'min', 'oct', 'open', 'ord', 'pow', 'range', 'raw_input',
1660 'reduce', 'reload', 'repr', 'round', 'setattr', 'str', 'type', 'xrange']
1661 >>>
1662 \end{verbatim}\ecode
1663
1664
1665 \chapter{Output Formatting}
1666
1667 So far we've encountered two ways of writing values: {\em expression
1668 statements} and the {\tt print} statement.  (A third way is using the
1669 {\tt write} method of file objects; the standard output file can be
1670 referenced as {\tt sys.stdout}.  See the Library Reference for more
1671 information on this.)
1672
1673 Often you'll want more control over the formatting of your output than
1674 simply printing space-separated values.  The key to nice formatting in
1675 Python is to do all the string handling yourself; using string slicing
1676 and concatenation operations you can create any lay-out you can imagine.
1677 The standard module {\tt string} contains some useful operations for
1678 padding strings to a given column width; these will be discussed shortly.
1679 Finally, the \code{\%} operator (modulo) with a string left argument
1680 interprets this string as a C sprintf format string to be applied to the
1681 right argument, and returns the string resulting from this formatting
1682 operation.
1683
1684 One question remains, of course: how do you convert values to strings?
1685 Luckily, Python has a way to convert any value to a string: just write
1686 the value between reverse quotes (\verb/``/).  Some examples:
1687
1688 \bcode\begin{verbatim}
1689 >>> x = 10 * 3.14
1690 >>> y = 200*200
1691 >>> s = 'The value of x is ' + `x` + ', and y is ' + `y` + '...'
1692 >>> print s
1693 The value of x is 31.4, and y is 40000...
1694 >>> # Reverse quotes work on other types besides numbers:
1695 ... p = [x, y]
1696 >>> ps = `p`
1697 >>> ps
1698 '[31.4, 40000]'
1699 >>> # Converting a string adds string quotes and backslashes:
1700 ... hello = 'hello, world\n'
1701 >>> hellos = `hello`
1702 >>> print hellos
1703 'hello, world\012'
1704 >>> # The argument of reverse quotes may be a tuple:
1705 ... `x, y, ('foo', 'bar')`
1706 "(31.4, 40000, ('foo', 'bar'))"
1707 >>>
1708 \end{verbatim}\ecode
1709 %
1710 Here are two ways to write a table of squares and cubes:
1711
1712 \bcode\begin{verbatim}
1713 >>> import string
1714 >>> for x in range(1, 11):
1715 ...     print string.rjust(`x`, 2), string.rjust(`x*x`, 3),
1716 ...     # Note trailing comma on previous line
1717 ...     print string.rjust(`x*x*x`, 4)
1718 ...
1719  1   1    1
1720  2   4    8
1721  3   9   27
1722  4  16   64
1723  5  25  125
1724  6  36  216
1725  7  49  343
1726  8  64  512
1727  9  81  729
1728 10 100 1000
1729 >>> for x in range(1,11):
1730 ...     print '%2d %3d %4d' % (x, x*x, x*x*x)
1731 ...
1732  1   1    1
1733  2   4    8
1734  3   9   27
1735  4  16   64
1736  5  25  125
1737  6  36  216
1738  7  49  343
1739  8  64  512
1740  9  81  729
1741 10 100 1000
1742 >>>
1743 \end{verbatim}\ecode
1744 %
1745 (Note that one space between each column was added by the way {\tt print}
1746 works: it always adds spaces between its arguments.)
1747
1748 This example demonstrates the function {\tt string.rjust()}, which
1749 right-justifies a string in a field of a given width by padding it with
1750 spaces on the left.  There are similar functions {\tt string.ljust()}
and {\tt string.center()}.  These functions do not write anything; they
just return a new string.  If the input string is too long, they don't
1753 truncate it, but return it unchanged; this will mess up your column
1754 lay-out but that's usually better than the alternative, which would be
1755 lying about a value.  (If you really want truncation you can always add
1756 a slice operation, as in {\tt string.ljust(x,~n)[0:n]}.)
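
The truncation idiom can be wrapped in a small helper.  This is only a
sketch: the name \verb\fit\ is invented here, and the code assumes a
newer Python in which string objects have an \verb\ljust\ method; with
the release described in this tutorial you would call
{\tt string.ljust()} instead.

\begin{verbatim}
def fit(text, width):
    # Pad short strings on the right, then slice so that long strings
    # are cut off; the result is always exactly `width` characters.
    return text.ljust(width)[0:width]

print(repr(fit('spam', 8)))
print(repr(fit('spam and eggs', 8)))
\end{verbatim}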
1757
There is another function, {\tt string.zfill()}, which pads a numeric
string on the left with zeros.  It understands plus and minus
signs:
1761
1762 \bcode\begin{verbatim}
1763 >>> string.zfill('12', 5)
1764 '00012'
1765 >>> string.zfill('-3.14', 7)
1766 '-003.14'
1767 >>> string.zfill('3.14159265359', 5)
1768 '3.14159265359'
1769 >>>
1770 \end{verbatim}\ecode
1771
1772
1773 \chapter{Errors and Exceptions}
1774
1775 Until now error messages haven't been more than mentioned, but if you
1776 have tried out the examples you have probably seen some.  There are
1777 (at least) two distinguishable kinds of errors: {\em syntax\ errors}
1778 and {\em exceptions}.
1779
1780 \section{Syntax Errors}
1781
1782 Syntax errors, also known as parsing errors, are perhaps the most common
1783 kind of complaint you get while you are still learning Python:
1784
1785 \bcode\begin{verbatim}
1786 >>> while 1 print 'Hello world'
1787   File "<stdin>", line 1
1788     while 1 print 'Hello world'
1789                 ^
1790 SyntaxError: invalid syntax
1791 >>>
1792 \end{verbatim}\ecode
1793 %
1794 The parser repeats the offending line and displays a little `arrow'
1795 pointing at the earliest point in the line where the error was detected.
1796 The error is caused by (or at least detected at) the token
1797 {\em preceding}
1798 the arrow: in the example, the error is detected at the keyword
1799 {\tt print}, since a colon ({\tt :}) is missing before it.
1800 File name and line number are printed so you know where to look in case
1801 the input came from a script.
1802
1803 \section{Exceptions}
1804
1805 Even if a statement or expression is syntactically correct, it may
1806 cause an error when an attempt is made to execute it.
1807 Errors detected during execution are called {\em exceptions} and are
1808 not unconditionally fatal: you will soon learn how to handle them in
1809 Python programs.  Most exceptions are not handled by programs,
1810 however, and result in error messages as shown here:
1811
1812 \bcode\small\begin{verbatim}
1813 >>> 10 * (1/0)
1814 Traceback (innermost last):
1815   File "<stdin>", line 1
1816 ZeroDivisionError: integer division or modulo
1817 >>> 4 + foo*3
1818 Traceback (innermost last):
1819   File "<stdin>", line 1
1820 NameError: foo
1821 >>> '2' + 2
1822 Traceback (innermost last):
1823   File "<stdin>", line 1
1824 TypeError: illegal argument type for built-in operation
1825 >>>
1826 \end{verbatim}\ecode
1827 %
1828 The last line of the error message indicates what happened.
1829 Exceptions come in different types, and the type is printed as part of
1830 the message: the types in the example are
1831 {\tt ZeroDivisionError},
1832 {\tt NameError}
1833 and
1834 {\tt TypeError}.
The string printed as the exception type is the name of the built-in
exception that occurred.  This is true for all built-in
1837 exceptions, but need not be true for user-defined exceptions (although
1838 it is a useful convention).
1839 Standard exception names are built-in identifiers (not reserved
1840 keywords).
1841
The rest of the line is a detail whose interpretation depends on the
exception type.
1844
The preceding part of the error message shows the context where the
exception happened, in the form of a stack backtrace.
In general it lists source lines; however,
it will not display lines read from standard input.
1849
1850 The Python library reference manual lists the built-in exceptions and
1851 their meanings.
1852
1853 \section{Handling Exceptions}
1854
1855 It is possible to write programs that handle selected exceptions.
1856 Look at the following example, which prints a table of inverses of
1857 some floating point numbers:
1858
1859 \bcode\begin{verbatim}
1860 >>> numbers = [0.3333, 2.5, 0, 10]
1861 >>> for x in numbers:
1862 ...     print x,
1863 ...     try:
1864 ...         print 1.0 / x
1865 ...     except ZeroDivisionError:
1866 ...         print '*** has no inverse ***'
1867 ...
1868 0.3333 3.00030003
1869 2.5 0.4
1870 0 *** has no inverse ***
1871 10 0.1
1872 >>>
1873 \end{verbatim}\ecode
1874 %
1875 The {\tt try} statement works as follows.
1876 \begin{itemize}
1877 \item
1878 First, the
1879 {\em try\ clause}
1880 (the statement(s) between the {\tt try} and {\tt except} keywords) is
1881 executed.
1882 \item
1883 If no exception occurs, the
1884 {\em except\ clause}
1885 is skipped and execution of the {\tt try} statement is finished.
1886 \item
If an exception occurs during execution of the try clause,
the rest of the clause is skipped.  Then, if
its type matches the exception named after the {\tt except} keyword,
the except clause is executed,
and execution continues after the {\tt try} statement.
1892 \item
1893 If an exception occurs which does not match the exception named in the
1894 except clause, it is passed on to outer try statements; if no handler is
1895 found, it is an
1896 {\em unhandled\ exception}
1897 and execution stops with a message as shown above.
1898 \end{itemize}
1899 A {\tt try} statement may have more than one except clause, to specify
1900 handlers for different exceptions.
1901 At most one handler will be executed.
1902 Handlers only handle exceptions that occur in the corresponding try
1903 clause, not in other handlers of the same {\tt try} statement.
1904 An except clause may name multiple exceptions as a parenthesized list,
1905 e.g.:
1906
1907 \bcode\begin{verbatim}
1908 ... except (RuntimeError, TypeError, NameError):
1909 ...     pass
1910 \end{verbatim}\ecode
1911 %
1912 The last except clause may omit the exception name(s), to serve as a
1913 wildcard.
1914 Use this with extreme caution, since it is easy to mask a real
1915 programming error in this way!
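
To make these rules concrete, here is a small sketch with two except
clauses, one of which names two exceptions.  The function name
\verb\describe\ and its messages are invented for the example, and the
behaviour of the built-in \verb\float()\ function is assumed to be that
of a newer Python 3 interpreter.

\begin{verbatim}
def describe(text):
    try:
        return 'inverse is %g' % (1.0 / float(text))
    except ZeroDivisionError:
        # Raised by the division when the number is zero.
        return 'zero has no inverse'
    except (TypeError, ValueError):
        # Raised by float() when the argument is not a number.
        return 'not a number'

for item in ['4', '0', 'spam', None]:
    print(describe(item))
\end{verbatim}

At most one of the handlers runs for each call, and no wildcard clause
is needed because the expected failures are named explicitly.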
1916
1917 When an exception occurs, it may have an associated value, also known as
the exception's
1919 {\em argument}.
1920 The presence and type of the argument depend on the exception type.
1921 For exception types which have an argument, the except clause may
1922 specify a variable after the exception name (or list) to receive the
1923 argument's value, as follows:
1924
1925 \bcode\begin{verbatim}
1926 >>> try:
1927 ...     foo()
1928 ... except NameError, x:
1929 ...     print 'name', x, 'undefined'
1930 ...
1931 name foo undefined
1932 >>>
1933 \end{verbatim}\ecode
1934 %
1935 If an exception has an argument, it is printed as the last part
1936 (`detail') of the message for unhandled exceptions.
1937
1938 Exception handlers don't just handle exceptions if they occur
1939 immediately in the try clause, but also if they occur inside functions
1940 that are called (even indirectly) in the try clause.
1941 For example:
1942
1943 \bcode\begin{verbatim}
1944 >>> def this_fails():
1945 ...     x = 1/0
1946 ...
1947 >>> try:
1948 ...     this_fails()
1949 ... except ZeroDivisionError, detail:
1950 ...     print 'Handling run-time error:', detail
1951 ...
1952 Handling run-time error: integer division or modulo
1953 >>>
1954 \end{verbatim}\ecode
1955
1956 \section{Raising Exceptions}
1957
1958 The {\tt raise} statement allows the programmer to force a specified
1959 exception to occur.
1960 For example:
1961
1962 \bcode\begin{verbatim}
1963 >>> raise NameError, 'HiThere'
1964 Traceback (innermost last):
1965   File "<stdin>", line 1
1966 NameError: HiThere
1967 >>>
1968 \end{verbatim}\ecode
1969 %
1970 The first argument to {\tt raise} names the exception to be raised.
1971 The optional second argument specifies the exception's argument.
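
A function can use {\tt raise} to reject an argument it cannot handle.
The sketch below is written for a newer Python 3 interpreter, which is
an assumption: there the exception's argument is given in parentheses
and a handler receives it with the word \verb\as\, where the release
described here uses a comma in both places.  The function name
\verb\inverse\ is invented for the example.

\begin{verbatim}
def inverse(x):
    if x == 0:
        # Newer spelling of:  raise ValueError, 'zero has no inverse'
        raise ValueError('zero has no inverse')
    return 1.0 / x

try:
    inverse(0)
except ValueError as detail:
    # Newer spelling of:  except ValueError, detail:
    message = 'Handling error: %s' % detail

print(message)
\end{verbatim}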
1972
1973 \section{User-defined Exceptions}
1974
1975 Programs may name their own exceptions by assigning a string to a
1976 variable.
1977 For example:
1978
1979 \bcode\begin{verbatim}
1980 >>> my_exc = 'my_exc'
1981 >>> try:
1982 ...     raise my_exc, 2*2
1983 ... except my_exc, val:
1984 ...     print 'My exception occurred, value:', val
1985 ...
1986 My exception occurred, value: 4
1987 >>> raise my_exc, 1
1988 Traceback (innermost last):
1989   File "<stdin>", line 1
1990 my_exc: 1
1991 >>>
1992 \end{verbatim}\ecode
1993 %
1994 Many standard modules use this to report errors that may occur in
1995 functions they define.
1996
1997 \section{Defining Clean-up Actions}
1998
1999 The {\tt try} statement has another optional clause which is intended to
2000 define clean-up actions that must be executed under all circumstances.
2001 For example:
2002
2003 \bcode\begin{verbatim}
2004 >>> try:
2005 ...     raise KeyboardInterrupt
2006 ... finally:
2007 ...     print 'Goodbye, world!'
2008 ...
2009 Goodbye, world!
2010 Traceback (innermost last):
2011   File "<stdin>", line 2
2012 KeyboardInterrupt
2013 >>>
2014 \end{verbatim}\ecode
2015 %
2016 A {\tt finally} clause is executed whether or not an exception has
2017 occurred in the {\tt try} clause.  When an exception has occurred, it
2018 is re-raised after the {\tt finally} clause is executed.  The
2019 {\tt finally} clause is also executed ``on the way out'' when the
2020 {\tt try} statement is left via a {\tt break} or {\tt return}
2021 statement.
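
The ``on the way out'' behaviour can be observed with a {\tt return}
inside a loop.  In this sketch the names \verb\first_positive\ and
\verb\log\ are invented; the list only records which passes through the
{\tt finally} clause took place.

\begin{verbatim}
log = []

def first_positive(numbers):
    for n in numbers:
        try:
            if n > 0:
                return n      # leaves the try statement early
        finally:
            log.append(n)     # still runs, even on return
    return None

result = first_positive([-1, 0, 5, 7])
print(result)
print(log)
\end{verbatim}

The value 5 is recorded although the function returns from inside the
try clause; 7 is never reached.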
2022
2023 A {\tt try} statement must either have one or more {\tt except}
2024 clauses or one {\tt finally} clause, but not both.
2025
2026
2027 \chapter{Classes}
2028
2029 Python's class mechanism adds classes to the language with a minimum
2030 of new syntax and semantics.  It is a mixture of the class mechanisms
2031 found in C++ and Modula-3.  As is true for modules, classes in Python
2032 do not put an absolute barrier between definition and user, but rather
2033 rely on the politeness of the user not to ``break into the
2034 definition.''  The most important features of classes are retained
2035 with full power, however: the class inheritance mechanism allows
2036 multiple base classes, a derived class can override any methods of its
base class(es), and a method can call the method of a base class with the
same name.  Objects can contain an arbitrary amount of private data.
2039
2040 In C++ terminology, all class members (including the data members) are
2041 {\em public}, and all member functions are {\em virtual}.  There are
2042 no special constructors or destructors.  As in Modula-3, there are no
2043 shorthands for referencing the object's members from its methods: the
2044 method function is declared with an explicit first argument
2045 representing the object, which is provided implicitly by the call.  As
2046 in Smalltalk, classes themselves are objects, albeit in the wider
2047 sense of the word: in Python, all data types are objects.  This
2048 provides semantics for importing and renaming.  But, just like in C++
2049 or Modula-3, built-in types cannot be used as base classes for
2050 extension by the user.  Also, like in C++ but unlike in Modula-3, most
2051 built-in operators with special syntax (arithmetic operators,
2052 subscripting etc.) can be redefined for class members.
2053
2054
2055 \section{A word about terminology}
2056
2057 Lacking universally accepted terminology to talk about classes, I'll
2058 make occasional use of Smalltalk and C++ terms.  (I'd use Modula-3
2059 terms, since its object-oriented semantics are closer to those of
Python than those of C++ are, but I expect that few readers have heard of it...)
2061
2062 I also have to warn you that there's a terminological pitfall for
2063 object-oriented readers: the word ``object'' in Python does not
2064 necessarily mean a class instance.  Like C++ and Modula-3, and unlike
2065 Smalltalk, not all types in Python are classes: the basic built-in
2066 types like integers and lists aren't, and even somewhat more exotic
2067 types like files aren't.  However, {\em all} Python types share a little
2068 bit of common semantics that is best described by using the word
2069 object.
2070
2071 Objects have individuality, and multiple names (in multiple scopes)
2072 can be bound to the same object.  This is known as aliasing in other
2073 languages.  This is usually not appreciated on a first glance at
2074 Python, and can be safely ignored when dealing with immutable basic
2075 types (numbers, strings, tuples).  However, aliasing has an
2076 (intended!) effect on the semantics of Python code involving mutable
2077 objects such as lists, dictionaries, and most types representing
2078 entities outside the program (files, windows, etc.).  This is usually
2079 used to the benefit of the program, since aliases behave like pointers
2080 in some respects.  For example, passing an object is cheap since only
2081 a pointer is passed by the implementation; and if a function modifies
2082 an object passed as an argument, the caller will see the change --- this
2083 obviates the need for two different argument passing mechanisms as in
2084 Pascal.
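
A short sketch of aliasing, with invented names, may help here.  The
first half shows a mutable list reached through two names and through
a function argument; the second half shows that rebinding a name to a
new number leaves the other name alone.

\begin{verbatim}
def add_spam(items):
    # Modifies the list object it was handed; no copy is made.
    items.append('spam')

a = ['eggs']
b = a              # b is a second name for the same list object
add_spam(b)
print(a)           # ['eggs', 'spam']: visible through both names

n = 1
m = n
m = m + 1          # rebinds m to a new object; n is untouched
print(n)           # 1
\end{verbatim}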
2085
2086
2087 \section{Python scopes and name spaces}
2088
2089 Before introducing classes, I first have to tell you something about
2090 Python's scope rules.  Class definitions play some neat tricks with
2091 name spaces, and you need to know how scopes and name spaces work to
2092 fully understand what's going on.  Incidentally, knowledge about this
2093 subject is useful for any advanced Python programmer.
2094
2095 Let's begin with some definitions.
2096
2097 A {\em name space} is a mapping from names to objects.  Most name
2098 spaces are currently implemented as Python dictionaries, but that's
2099 normally not noticeable in any way (except for performance), and it
2100 may change in the future.  Examples of name spaces are: the set of
2101 built-in names (functions such as \verb\abs()\, and built-in exception
2102 names); the global names in a module; and the local names in a
2103 function invocation.  In a sense the set of attributes of an object
also forms a name space.  The important thing to know about name
2105 spaces is that there is absolutely no relation between names in
2106 different name spaces; for instance, two different modules may both
2107 define a function ``maximize'' without confusion --- users of the
2108 modules must prefix it with the module name.
2109
2110 By the way, I use the word {\em attribute} for any name following a
2111 dot --- for example, in the expression \verb\z.real\, \verb\real\ is
2112 an attribute of the object \verb\z\.  Strictly speaking, references to
2113 names in modules are attribute references: in the expression
2114 \verb\modname.funcname\, \verb\modname\ is a module object and
2115 \verb\funcname\ is an attribute of it.  In this case there happens to
2116 be a straightforward mapping between the module's attributes and the
2117 global names defined in the module: they share the same name space!%
2118 \footnote{
2119         Except for one thing.  Module objects have a secret read-only
2120         attribute called {\tt __dict__} which returns the dictionary
2121         used to implement the module's name space; the name
2122         {\tt __dict__} is an attribute but not a global name.
2123         Obviously, using this violates the abstraction of name space
2124         implementation, and should be restricted to things like
2125         post-mortem debuggers...
2126 }
2127
2128 Attributes may be read-only or writable.  In the latter case,
2129 assignment to attributes is possible.  Module attributes are writable:
2130 you can write \verb\modname.the_answer = 42\.  Writable attributes may
also be deleted with the {\tt del} statement, e.g.
2132 \verb\del modname.the_answer\.
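
The following sketch tries this out without touching a real module.
It assumes a newer Python in which \verb\types.ModuleType\ creates an
empty module object; that object stands in for the result of
importing the hypothetical module \verb\modname\ used above.

\begin{verbatim}
import types

# Stand-in for "import modname": an empty module object.
modname = types.ModuleType('modname')

modname.the_answer = 42                   # module attributes are writable
print(modname.the_answer)
print('the_answer' in modname.__dict__)   # it lives in the name space

del modname.the_answer                    # ... and can be deleted again
print(hasattr(modname, 'the_answer'))
\end{verbatim}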
2133
2134 Name spaces are created at different moments and have different
2135 lifetimes.  The name space containing the built-in names is created
2136 when the Python interpreter starts up, and is never deleted.  The
2137 global name space for a module is created when the module definition
2138 is read in; normally, module name spaces also last until the
2139 interpreter quits.  The statements executed by the top-level
2140 invocation of the interpreter, either read from a script file or
2141 interactively, are considered part of a module called \verb\__main__\,
2142 so they have their own global name space.  (The built-in names
2143 actually also live in a module; this is called \verb\__builtin__\.)
2144
2145 The local name space for a function is created when the function is
2146 called, and deleted when the function returns or raises an exception
2147 that is not handled within the function.  (Actually, forgetting would
be a better way to describe what happens.)  Of course,
2149 recursive invocations each have their own local name space.
2150
2151 A {\em scope} is a textual region of a Python program where a name space
2152 is directly accessible.  ``Directly accessible'' here means that an
2153 unqualified reference to a name attempts to find the name in the name
2154 space.
2155
2156 Although scopes are determined statically, they are used dynamically.
2157 At any time during execution, exactly three nested scopes are in use
2158 (i.e., exactly three name spaces are directly accessible): the
2159 innermost scope, which is searched first, contains the local names,
2160 the middle scope, searched next, contains the current module's global
2161 names, and the outermost scope (searched last) is the name space
2162 containing built-in names.
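
The search order can be seen in a function that uses one name from
each of the three scopes.  The names \verb\limit\, \verb\check\ and
\verb\count\ are invented for this sketch.

\begin{verbatim}
limit = 10                   # a global name of this module

def check(values):
    count = len(values)      # count is local; len is a built-in name
    return count <= limit    # limit is found in the global scope

print(check([1, 2, 3]))
\end{verbatim}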
2163
2164 Usually, the local scope references the local names of the (textually)
current function.  Outside functions, the local scope references
2166 the same name space as the global scope: the module's name space.
2167 Class definitions place yet another name space in the local scope.
2168
2169 It is important to realize that scopes are determined textually: the
2170 global scope of a function defined in a module is that module's name
2171 space, no matter from where or by what alias the function is called.
2172 On the other hand, the actual search for names is done dynamically, at
run time; however, the language definition is evolving towards
2174 static name resolution, at ``compile'' time, so don't rely on dynamic
2175 name resolution!  (In fact, local variables are already determined
2176 statically.)
2177
2178 A special quirk of Python is that assignments always go into the
2179 innermost scope.  Assignments do not copy data --- they just
2180 bind names to objects.  The same is true for deletions: the statement
\verb\del x\ removes the binding of \verb\x\ from the name space referenced by the
2182 local scope.  In fact, all operations that introduce new names use the
2183 local scope: in particular, import statements and function definitions
2184 bind the module or function name in the local scope.  (The
2185 \verb\global\ statement can be used to indicate that particular
2186 variables live in the global scope.)
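
The effect of this rule, and of the \verb\global\ statement, is easy to
miss, so here is a minimal sketch with invented names.  It is meant to
be run as a script, so that \verb\counter\ is a global name of the
module.

\begin{verbatim}
counter = 0

def set_local():
    # Assignment binds a new local name; the global is not touched.
    counter = 100
    return counter

def bump_global():
    # The global statement redirects the assignment to the module scope.
    global counter
    counter = counter + 1

set_local()
print(counter)      # still 0
bump_global()
print(counter)      # now 1
\end{verbatim}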
2187
2188
2189 \section{A first look at classes}
2190
2191 Classes introduce a little bit of new syntax, three new object types,
2192 and some new semantics.
2193
2194
2195 \subsection{Class definition syntax}
2196
2197 The simplest form of class definition looks like this:
2198
2199 \begin{verbatim}
2200         class ClassName:
2201                 <statement-1>
2202                 .
2203                 .
2204                 .
2205                 <statement-N>
2206 \end{verbatim}
2207
2208 Class definitions, like function definitions (\verb\def\ statements)
2209 must be executed before they have any effect.  (You could conceivably
2210 place a class definition in a branch of an \verb\if\ statement, or
2211 inside a function.)
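
As a sketch of what this allows, the invented function
\verb\make_greeter\ below contains two class definitions, only one of
which is executed on any given call.

\begin{verbatim}
def make_greeter(polite):
    # The class statement runs only when this function is called,
    # and only the branch that is taken defines a class.
    if polite:
        class Greeter:
            greeting = 'good day'
    else:
        class Greeter:
            greeting = 'hi'
    return Greeter

print(make_greeter(1).greeting)
print(make_greeter(0).greeting)
\end{verbatim}

Each call executes a class definition afresh and so returns a new
class object.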
2212
2213 In practice, the statements inside a class definition will usually be
2214 function definitions, but other statements are allowed, and sometimes
2215 useful --- we'll come back to this later.  The function definitions
2216 inside a class normally have a peculiar form of argument list,
2217 dictated by the calling conventions for methods --- again, this is
2218 explained later.
2219
2220 When a class definition is entered, a new name space is created, and
2221 used as the local scope --- thus, all assignments to local variables
2222 go into this new name space.  In particular, function definitions bind
2223 the name of the new function here.
2224
2225 When a class definition is left normally (via the end), a {\em class
2226 object} is created.  This is basically a wrapper around the contents
2227 of the name space created by the class definition; we'll learn more
2228 about class objects in the next section.  The original local scope
(the one in effect just before the class definition was entered) is
reinstated, and the class object is bound here to the class name given in
the class definition header (\verb\ClassName\ in the example).
2232
2233
2234 \subsection{Class objects}
2235
2236 Class objects support two kinds of operations: attribute references
2237 and instantiation.
2238
2239 {\em Attribute references} use the standard syntax used for all
2240 attribute references in Python: \verb\obj.name\.  Valid attribute
2241 names are all the names that were in the class's name space when the
2242 class object was created.  So, if the class definition looked like
2243 this:
2244
2245 \begin{verbatim}
2246         class MyClass:
2247                 i = 12345
2248                 def f(x):
2249                         return 'hello world'
2250 \end{verbatim}
2251
2252 then \verb\MyClass.i\ and \verb\MyClass.f\ are valid attribute
2253 references, returning an integer and a function object, respectively.
2254 Class attributes can also be assigned to, so you can change the
2255 value of \verb\MyClass.i\ by assignment.
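
A minimal sketch of such an assignment, repeating the class so that the
lines can be tried on their own:

\begin{verbatim}
class MyClass:
    i = 12345
    def f(x):
        return 'hello world'

print(MyClass.i)
MyClass.i = MyClass.i + 1      # class attributes can be assigned to
print(MyClass.i)
\end{verbatim}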
2256
2257 Class {\em instantiation} uses function notation.  Just pretend that
2258 the class object is a parameterless function that returns a new
instance of the class.  For example (assuming the above class):
2260
2261 \begin{verbatim}
2262         x = MyClass()
2263 \end{verbatim}
2264
2265 creates a new {\em instance} of the class and assigns this object to
2266 the local variable \verb\x\.
2267
2268
2269 \subsection{Instance objects}
2270
2271 Now what can we do with instance objects?  The only operations
2272 understood by instance objects are attribute references.  There are
2273 two kinds of valid attribute names.
2274
2275 The first I'll call {\em data attributes}.  These correspond to
2276 ``instance variables'' in Smalltalk, and to ``data members'' in C++.
2277 Data attributes need not be declared; like local variables, they
2278 spring into existence when they are first assigned to.  For example,
if \verb\x\ is the instance of \verb\MyClass\ created above, the
2280 following piece of code will print the value 16, without leaving a
2281 trace:
2282
2283 \begin{verbatim}
2284         x.counter = 1
2285         while x.counter < 10:
2286                 x.counter = x.counter * 2
2287         print x.counter
2288         del x.counter
2289 \end{verbatim}
2290
The second kind of attribute reference understood by instance objects
is a {\em method}.  A method is a function that ``belongs to'' an
2293 object.  (In Python, the term method is not unique to class instances:
2294 other object types can have methods as well, e.g., list objects have
2295 methods called append, insert, remove, sort, and so on.  However,
2296 below, we'll use the term method exclusively to mean methods of class
2297 instance objects, unless explicitly stated otherwise.)
2298
2299 Valid method names of an instance object depend on its class.  By
2300 definition, all attributes of a class that are (user-defined) function
2301 objects define corresponding methods of its instances.  So in our
2302 example, \verb\x.f\ is a valid method reference, since
2303 \verb\MyClass.f\ is a function, but \verb\x.i\ is not, since
2304 \verb\MyClass.i\ is not.  But \verb\x.f\ is not the
2305 same thing as \verb\MyClass.f\ --- it is a {\em method object}, not a
2306 function object.
2307
2308
2309 \subsection{Method objects}
2310
2311 Usually, a method is called immediately, e.g.:
2312
2313 \begin{verbatim}
2314         x.f()
2315 \end{verbatim}
2316
2317 In our example, this will return the string \verb\'hello world'\.
2318 However, it is not necessary to call a method right away: \verb\x.f\
2319 is a method object, and can be stored away and called at a later
2320 moment, for example:
2321
2322 \begin{verbatim}
2323         xf = x.f
2324         while 1:
2325                 print xf()
2326 \end{verbatim}
2327
2328 will continue to print \verb\hello world\ until the end of time.
2329
2330 What exactly happens when a method is called?  You may have noticed
2331 that \verb\x.f()\ was called without an argument above, even though
2332 the function definition for \verb\f\ specified an argument.  What
2333 happened to the argument?  Surely Python raises an exception when a
2334 function that requires an argument is called without any --- even if
2335 the argument isn't actually used...
2336
2337 Actually, you may have guessed the answer: the special thing about
2338 methods is that the object is passed as the first argument of the
2339 function.  In our example, the call \verb\x.f()\ is exactly equivalent
2340 to \verb\MyClass.f(x)\.  In general, calling a method with a list of
2341 {\em n} arguments is equivalent to calling the corresponding function
2342 with an argument list that is created by inserting the method's object
2343 before the first argument.
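
The equivalence also holds for methods with arguments.  In this sketch
the class \verb\Point\ and its method \verb\move\ are invented; the
same function is called once as a method and once directly through the
class, with the object passed explicitly.

\begin{verbatim}
class Point:
    def move(self, dx, dy):
        self.x = self.x + dx
        self.y = self.y + dy
        return (self.x, self.y)

p = Point()
p.x, p.y = 0, 0
print(p.move(1, 2))           # (1, 2): method call with two arguments
print(Point.move(p, 1, 2))    # (2, 4): object inserted by hand
\end{verbatim}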
2344
2345 If you still don't understand how methods work, a look at the
2346 implementation can perhaps clarify matters.  When an instance
2347 attribute is referenced that isn't a data attribute, its class is
2348 searched.  If the name denotes a valid class attribute that is a
2349 function object, a method object is created by packing (pointers to)
2350 the instance object and the function object just found together in an
2351 abstract object: this is the method object.  When the method object is
2352 called with an argument list, it is unpacked again, a new argument
2353 list is constructed from the instance object and the original argument
2354 list, and the function object is called with this new argument list.
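
The packing and unpacking can be imitated in Python itself.  The class
\verb\MethodObject\ below is an invented illustration, not the
interpreter's actual implementation (which is written in C), and it
assumes a newer Python that supports the \verb\__call__\ method and
passing \verb\*args\ in a call.

\begin{verbatim}
class MethodObject:
    # Packs an instance and a function together, as described above.
    def __init__(self, instance, function):
        self.instance = instance
        self.function = function
    def __call__(self, *args):
        # Unpack again: put the instance in front of the arguments.
        return self.function(self.instance, *args)

class MyClass:
    def f(self):
        return 'hello world'

x = MyClass()
xf = MethodObject(x, MyClass.f)    # roughly what x.f evaluates to
print(xf())
print(xf() == x.f())
\end{verbatim}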
2355
2356
2357 \section{Random remarks}
2358
2359
2360 [These should perhaps be placed more carefully...]
2361
2362
2363 Data attributes override method attributes with the same name; to
2364 avoid accidental name conflicts, which may cause hard-to-find bugs in
2365 large programs, it is wise to use some kind of convention that
2366 minimizes the chance of conflicts, e.g., capitalize method names,
2367 prefix data attribute names with a small unique string (perhaps just
2368 an underscore), or use verbs for methods and nouns for data attributes.
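
The following sketch shows the kind of accident meant here; the class
\verb\Account\ is invented, and the behaviour shown is that of a newer
Python 3 interpreter.

\begin{verbatim}
class Account:
    def balance(self):
        return 0

a = Account()
print(a.balance())      # the method is found through the class
a.balance = 50          # a data attribute with the same name ...
print(a.balance)        # ... now hides the method on this instance

b = Account()
print(b.balance())      # other instances are not affected
\end{verbatim}

After the assignment, \verb\a.balance()\ fails because an integer
cannot be called.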
2369
2370
2371 Data attributes may be referenced by methods as well as by ordinary
2372 users (``clients'') of an object.  In other words, classes are not
2373 usable to implement pure abstract data types.  In fact, nothing in
2374 Python makes it possible to enforce data hiding --- it is all based
2375 upon convention.  (On the other hand, the Python implementation,
2376 written in C, can completely hide implementation details and control
2377 access to an object if necessary; this can be used by extensions to
2378 Python written in C.)
2379
2380
2381 Clients should use data attributes with care --- clients may mess up
2382 invariants maintained by the methods by stamping on their data
2383 attributes.  Note that clients may add data attributes of their own to
2384 an instance object without affecting the validity of the methods, as
2385 long as name conflicts are avoided --- again, a naming convention can
2386 save a lot of headaches here.
2387
2388
2389 There is no shorthand for referencing data attributes (or other
2390 methods!) from within methods.  I find that this actually increases
2391 the readability of methods: there is no chance of confusing local
2392 variables and instance variables when glancing through a method.
2393
2394
Conventionally, the first argument of a method is called
2396 \verb\self\.  This is nothing more than a convention: the name
2397 \verb\self\ has absolutely no special meaning to Python.  (Note,
2398 however, that by not following the convention your code may be less
2399 readable by other Python programmers, and it is also conceivable that
2400 a {\em class browser} program be written which relies upon such a
2401 convention.)
2402
2403
2404 Any function object that is a class attribute defines a method for
2405 instances of that class.  It is not necessary that the function
2406 definition is textually enclosed in the class definition: assigning a
2407 function object to a local variable in the class is also ok.  For
2408 example:
2409
2410 \begin{verbatim}
2411         # Function defined outside the class
2412         def f1(self, x, y):
2413                 return min(x, x+y)
2414
2415         class C:
2416                 f = f1
2417                 def g(self):
2418                         return 'hello world'
2419                 h = g
2420 \end{verbatim}
2421
2422 Now \verb\f\, \verb\g\ and \verb\h\ are all attributes of class
2423 \verb\C\ that refer to function objects, and consequently they are all
2424 methods of instances of \verb\C\ --- \verb\h\ being exactly equivalent
2425 to \verb\g\.  Note that this practice usually only serves to confuse
2426 the reader of a program.
2427
2428
2429 Methods may call other methods by using method attributes of the
2430 \verb\self\ argument, e.g.:
2431
2432 \begin{verbatim}
2433         class Bag:
2434                 def empty(self):
2435                         self.data = []
2436                 def add(self, x):
2437                         self.data.append(x)
2438                 def addtwice(self, x):
2439                         self.add(x)
2440                         self.add(x)
2441 \end{verbatim}
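
A possible session with this class, written as a script, could look as
follows.  The class is repeated so that the lines can be tried on their
own; the values added are of course arbitrary.

\begin{verbatim}
class Bag:
    def empty(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)

b = Bag()
b.empty()            # must come first: it creates self.data
b.add('spam')
b.addtwice('eggs')
print(b.data)
\end{verbatim}

Forgetting the call to \verb\empty\ makes the first \verb\add\ fail,
since the data attribute does not exist yet.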
2442
2443
2444 The instantiation operation (``calling'' a class object) creates an
2445 empty object.  Many classes like to create objects in a known initial
2446 state.  In early versions of Python, there was no special syntax to
2447 enforce this (see below), but a convention was widely used:
2448 add a method named \verb\init\ to the class,
2449 which initializes the instance (by assigning to some important data
2450 attributes) and returns the instance itself.  For example, class
2451 \verb\Bag\ above could have the following method:
2452
2453 \begin{verbatim}
2454                 def init(self):
2455                         self.empty()
2456                         return self
2457 \end{verbatim}
2458
2459 The client can then create and initialize an instance in one
2460 statement, as follows:
2461
2462 \begin{verbatim}
2463         x = Bag().init()
2464 \end{verbatim}
2465
2466 In later versions of Python, a special method named \verb\__init__\ may be
2467 defined instead:
2468
2469 \begin{verbatim}
2470                 def __init__(self):
2471                         self.empty()
2472 \end{verbatim}
2473
2474 When a class defines an \verb\__init__\ method, class instantiation
2475 automatically invokes \verb\__init__\ for the newly-created class
2476 instance.  So in the \verb\Bag\ example, a new and initialized instance
2477 can be obtained by:
2478
2479 \begin{verbatim}
2480         x = Bag()
2481 \end{verbatim}
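
Putting the pieces together, a complete \verb\Bag\ class using
\verb\__init__\ might look like the following sketch.  The sample items
and the variable name \verb\contents\ are assumptions made only for
this illustration.

\begin{verbatim}
class Bag:
    def __init__(self):
        self.empty()
    def empty(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)

x = Bag()            # __init__ is invoked automatically
x.add('spam')
x.addtwice('eggs')
contents = x.data    # one 'spam' followed by two 'eggs'
\end{verbatim}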
2482
2483 Of course, the \verb\__init__\ method may have arguments for greater
2484 flexibility.  In that case, arguments given to the class instantiation
2485 operator are passed on to \verb\__init__\.  For example,
2486
2487 \bcode\begin{verbatim}
2488 >>> class Complex:
2489 ...     def __init__(self, realpart, imagpart):
2490 ...         self.r = realpart
2491 ...         self.i = imagpart
2492 ...
2493 >>> x = Complex(3.0,-4.5)
2494 >>> x.r, x.i
2495 (3.0, -4.5)
2496 >>>
2497 \end{verbatim}\ecode
2498 %
2499 Methods may reference global names in the same way as ordinary
2500 functions.  The global scope associated with a method is the module
2501 containing the class definition.  (The class itself is never used as a
2502 global scope!)  While one rarely encounters a good reason for using
2503 global data in a method, there are many legitimate uses of the global
2504 scope: for one thing, functions and modules imported into the global
2505 scope can be used by methods, as well as functions and classes defined
2506 in it.  Usually, the class containing the method is itself defined in
2507 this global scope, and in the next section we'll find some good
2508 reasons why a method would want to reference its own class!
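
The following sketch illustrates a method using names from the global
scope of its module: an imported module and a function defined at the
top level.  The class \verb\Point\ and the function \verb\square\ are
hypothetical names introduced only for this example.

\begin{verbatim}
import math                  # imported into the module's global scope

def square(v):               # a function defined in the global scope
    return v * v

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def distance(self):
        # 'math' and 'square' are looked up in the global scope of
        # the module that contains the class definition
        return math.sqrt(square(self.x) + square(self.y))

d = Point(3.0, 4.0).distance()
\end{verbatim}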
2509
2510
2511 \section{Inheritance}
2512
2513 Of course, a language feature would not be worthy of the name ``class''
2514 without supporting inheritance.  The syntax for a derived class
2515 definition looks as follows:
2516
2517 \begin{verbatim}
2518         class DerivedClassName(BaseClassName):
2519                 <statement-1>
2520                 .
2521                 .
2522                 .
2523                 <statement-N>
2524 \end{verbatim}
2525
2526 The name \verb\BaseClassName\ must be defined in a scope containing
2527 the derived class definition.  Instead of a base class name, an
2528 expression is also allowed.  This is useful when the base class is
2529 defined in another module, e.g.,
2530
2531 \begin{verbatim}
2532         class DerivedClassName(modname.BaseClassName):
2533 \end{verbatim}
2534
Execution of a derived class definition proceeds the same as for a
base class.  When the class object is constructed, the base class is
remembered.  This is used for resolving attribute references: if a
requested attribute is not found in the class, it is searched for in
the base class.  This rule is applied recursively if the base class
itself is derived from some other class.
2541
There's nothing special about instantiation of derived classes:
\verb\DerivedClassName()\ creates a new instance of the class.  Method
references are resolved as follows: the corresponding class attribute
is searched for, descending the chain of base classes if necessary,
and the method reference is valid if this yields a function object.
2547
Derived classes may override methods of their base classes.  Because
methods have no special privileges when calling other methods of the
same object, a method of a base class that calls another method
defined in the same base class may in fact end up calling a method of
a derived class that overrides it.  (For C++ programmers: all methods
in Python are ``virtual functions''.)
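
A minimal sketch of this behavior follows.  The class names
\verb\Base\ and \verb\Derived\ and the method names \verb\describe\ and
\verb\name\ are made up for the illustration.

\begin{verbatim}
class Base:
    def describe(self):
        # name() is looked up on the instance, so an override
        # in a derived class is the one that gets called
        return 'I am ' + self.name()
    def name(self):
        return 'Base'

class Derived(Base):
    def name(self):          # overrides Base.name
        return 'Derived'

b = Base().describe()
d = Derived().describe()
\end{verbatim}

Here \verb\describe\ is defined only in \verb\Base\, yet for a
\verb\Derived\ instance it ends up calling the overriding
\verb\name\ method.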
2554
2555 An overriding method in a derived class may in fact want to extend
2556 rather than simply replace the base class method of the same name.
2557 There is a simple way to call the base class method directly: just
2558 call \verb\BaseClassName.methodname(self, arguments)\.  This is
2559 occasionally useful to clients as well.  (Note that this only works if
2560 the base class is defined or imported directly in the global scope.)
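
As an illustration, here is a sketch of an overriding method that
extends the base class method instead of replacing it.  The classes
\verb\Logger\ and \verb\PrefixLogger\ are hypothetical and exist only
for this example.

\begin{verbatim}
class Logger:
    def __init__(self):
        self.lines = []
    def log(self, text):
        self.lines.append(text)

class PrefixLogger(Logger):
    def log(self, text):
        # Extend rather than replace: do some extra work, then
        # call the base class method directly
        Logger.log(self, 'note: ' + text)

p = PrefixLogger()
p.log('started')
result = p.lines
\end{verbatim}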
2561
2562
2563 \subsection{Multiple inheritance}
2564
2565 Python supports a limited form of multiple inheritance as well.  A
2566 class definition with multiple base classes looks as follows:
2567
2568 \begin{verbatim}
2569         class DerivedClassName(Base1, Base2, Base3):
2570                 <statement-1>
2571                 .
2572                 .
2573                 .
2574                 <statement-N>
2575 \end{verbatim}
2576
The only rule necessary to explain the semantics is the resolution
rule used for class attribute references.  This is depth-first,
left-to-right.  Thus, if an attribute is not found in
\verb\DerivedClassName\, it is searched for in \verb\Base1\, then
(recursively) in the base classes of \verb\Base1\, and only if it is
not found there is it searched for in \verb\Base2\, and so on.
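
The rule can be seen at work in the following sketch.  All class names
are hypothetical, and the two direct base classes deliberately share no
common base class, so that only the depth-first, left-to-right order
matters.

\begin{verbatim}
class Root:
    def who(self):
        return 'Root'

class Base1(Root):           # does not define who() itself
    pass

class Base2:
    def who(self):
        return 'Base2'

class Derived(Base1, Base2):
    pass

# Search order: Derived, Base1, Root (found here), then Base2
answer = Derived().who()
\end{verbatim}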
2583
2584 (To some people breadth first---searching \verb\Base2\ and
2585 \verb\Base3\ before the base classes of \verb\Base1\---looks more
2586 natural.  However, this would require you to know whether a particular
2587 attribute of \verb\Base1\ is actually defined in \verb\Base1\ or in
2588 one of its base classes before you can figure out the consequences of
a name conflict with an attribute of \verb\Base2\.  The depth-first
rule makes no distinction between direct and inherited attributes of
\verb\Base1\.)
2592
2593 It is clear that indiscriminate use of multiple inheritance is a
2594 maintenance nightmare, given the reliance in Python on conventions to
2595 avoid accidental name conflicts.  A well-known problem with multiple
2596 inheritance is a class derived from two classes that happen to have a
2597 common base class.  While it is easy enough to figure out what happens
2598 in this case (the instance will have a single copy of ``instance
2599 variables'' or data attributes used by the common base class), it is
2600 not clear that these semantics are in any way useful.
2601
2602
2603 \section{Odds and ends}
2604
2605 Sometimes it is useful to have a data type similar to the Pascal
2606 ``record'' or C ``struct'', bundling together a couple of named data
2607 items.  An empty class definition will do nicely, e.g.:
2608
2609 \begin{verbatim}
2610         class Employee:
2611                 pass
2612
2613         john = Employee() # Create an empty employee record
2614
2615         # Fill the fields of the record
2616         john.name = 'John Doe'
2617         john.dept = 'computer lab'
2618         john.salary = 1000
2619 \end{verbatim}
2620
2621
2622 A piece of Python code that expects a particular abstract data type
2623 can often be passed a class that emulates the methods of that data
2624 type instead.  For instance, if you have a function that formats some
2625 data from a file object, you can define a class with methods
2626 \verb\read()\ and \verb\readline()\ that gets the data from a string
2627 buffer instead, and pass it as an argument.  (Unfortunately, this
2628 technique has its limitations: a class can't define operations that
2629 are accessed by special syntax such as sequence subscripting or
2630 arithmetic operators, and assigning such a ``pseudo-file'' to
2631 \verb\sys.stdin\ will not cause the interpreter to read further input
2632 from it.)
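
A rough sketch of such a ``pseudo-file'' class is given below.  The
names \verb\StringReader\ and \verb\count_lines\ are assumptions made
for this example; only the method names \verb\read()\ and
\verb\readline()\ come from the file object interface.

\begin{verbatim}
class StringReader:
    def __init__(self, text):
        self.text = text
        self.pos = 0
    def read(self):
        # Return everything that has not been consumed yet
        data = self.text[self.pos:]
        self.pos = len(self.text)
        return data
    def readline(self):
        # Return the next line including its newline,
        # or an empty string at the end of the buffer
        end = self.pos
        while end < len(self.text) and self.text[end] != '\n':
            end = end + 1
        if end < len(self.text):
            end = end + 1        # include the newline itself
        line = self.text[self.pos:end]
        self.pos = end
        return line

def count_lines(f):
    # Works with any object that offers a file-like readline()
    n = 0
    while f.readline():
        n = n + 1
    return n

n = count_lines(StringReader('one\ntwo\nthree'))
\end{verbatim}

The function \verb\count_lines\ never checks what kind of object it is
given; it relies only on the presence of a \verb\readline()\ method.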
2633
2634
Instance method objects have attributes, too: \verb\m.im_self\ is the
instance object to which the method \verb\m\ is bound, and
\verb\m.im_func\ is the function object corresponding to the method.
2638
2639
2640 \chapter{Recent Additions}
2641
2642 Python is an evolving language.  Since this tutorial was last
2643 thoroughly revised, several new features have been added to the
language.  While ideally I should revise the tutorial to incorporate
them in the mainline of the text, lack of time currently forces me
to take a more modest approach.  In this chapter I will briefly list the
2647 most important improvements to the language and how you can use them
2648 to your benefit.
2649
2650 \section{The Last Printed Expression}
2651
2652 In interactive mode, the last printed expression is assigned to the
variable \verb\_\.  This means that when you are using Python as a
2654 desk calculator, it is somewhat easier to continue calculations, for
2655 example:
2656
2657 \begin{verbatim}
2658         >>> tax = 17.5 / 100
2659         >>> price = 3.50
2660         >>> price * tax
2661         0.6125
2662         >>> price + _
2663         4.1125
2664         >>> round(_, 2)
2665         4.11
2666         >>>
2667 \end{verbatim}
2668
2669 \section{String Literals}
2670
2671 \subsection{Double Quotes}
2672
2673 Python can now also use double quotes to surround string literals,
2674 e.g. \verb\"this doesn't hurt a bit"\.
2675
2676 \subsection{Continuation Of String Literals}
2677
2678 String literals can span multiple lines by escaping newlines with
2679 backslashes, e.g.
2680
2681 \begin{verbatim}
2682         hello = "This is a rather long string containing\n\
2683         several lines of text just as you would do in C.\n\
2684             Note that whitespace at the beginning of the line is\
2685          significant.\n"
2686         print hello
2687 \end{verbatim}
2688
2689 which would print the following:
2690 \begin{verbatim}
2691         This is a rather long string containing
2692         several lines of text just as you would do in C.
2693             Note that whitespace at the beginning of the line is significant.
2694 \end{verbatim}
2695
\subsection{Triple-Quoted Strings}
2697
2698 In some cases, when you need to include really long strings (e.g.
2699 containing several paragraphs of informational text), it is annoying
2700 that you have to terminate each line with \verb@\n\@, especially if
2701 you would like to reformat the text occasionally with a powerful text
2702 editor like Emacs.  For such situations, ``triple-quoted'' strings can
2703 be used, e.g.
2704
2705 \begin{verbatim}
2706         hello = """
2707
2708             This string is bounded by triple double quotes (3 times ").
2709         Newlines in the string are retained, though \
2710         it is still possible\nto use all normal escape sequences.
2711
2712             Whitespace at the beginning of a line is
2713         significant.  If you need to include three opening quotes
2714         you have to escape at least one of them, e.g. \""".
2715
2716             This string ends in a newline.
2717         """
2718 \end{verbatim}
2719
2720 Note that there is no semantic difference between strings quoted with
2721 single quotes (\verb/'/) or double quotes (\verb\"\).
2722
2723 \subsection{String Literal Juxtaposition}
2724
2725 One final twist: you can juxtapose multiple string literals.  Two or
2726 more adjacent string literals (but not arbitrary expressions!)
2727 separated only by whitespace will be concatenated (without intervening
2728 whitespace) into a single string object at compile time.  This makes
2729 it possible to continue a long string on the next line without
2730 sacrificing indentation or performance, unlike the use of the string
2731 concatenation operator \verb\+\ or the continuation of the literal
2732 itself on the next line (since leading whitespace is significant
2733 inside all types of string literals).  Note that this feature, like
2734 all string features except triple-quoted strings, is borrowed from
2735 Standard C.
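
For instance, the following sketch builds one long string from several
adjacent literals.  It relies on the parenthesized continuation
described in the section on continuation lines without backslashes;
the variable names and the text itself are made up for the example.

\begin{verbatim}
message = ('This long string is split over several source lines, '
           'yet it is stored as one string object without any '
           'extra whitespace between the pieces.')

# Literals with different quote characters may be mixed
short = 'spam' "eggs"
\end{verbatim}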
2736
2737 \section{The Formatting Operator}
2738
2739 \subsection{Basic Usage}
2740
2741 The chapter on output formatting is really out of date: there is now
2742 an almost complete interface to C-style printf formats.  This is done
2743 by overloading the modulo operator (\verb\%\) for a left operand
2744 which is a string, e.g.
2745
2746 \begin{verbatim}
2747         >>> import math
2748         >>> print 'The value of PI is approximately %5.3f.' % math.pi
2749         The value of PI is approximately 3.142.
2750         >>>
2751 \end{verbatim}
2752
If there is more than one format in the string, you pass a tuple as
the right operand, e.g.
2755
2756 \begin{verbatim}
2757         >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
2758         >>> for name, phone in table.items():
2759         ...     print '%-10s ==> %10d' % (name, phone)
2760         ...
2761         Jack       ==>       4098
2762         Dcab       ==>    8637678
2763         Sjoerd     ==>       4127
2764         >>>
2765 \end{verbatim}
2766
2767 Most formats work exactly as in C and require that you pass the proper
2768 type (however, if you don't you get an exception, not a core dump).
2769 The \verb\%s\ format is more relaxed: if the corresponding argument is
2770 not a string object, it is converted to string using the \verb\str()\
2771 built-in function.  Using \verb\*\ to pass the width or precision in
2772 as a separate (integer) argument is supported.  The C formats
2773 \verb\%n\ and \verb\%p\ are not supported.
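
The following sketch illustrates the relaxed \verb\%s\ format and the
use of \verb\*\ for width and precision.  The variable names and the
sample values are chosen only for the illustration.

\begin{verbatim}
n = '%s items' % 42           # %s converts a non-string with str()
w = '[%*d]' % (6, 42)         # field width 6 taken from the tuple
p = '%.*f' % (2, 3.14159)     # precision 2 taken from the tuple
\end{verbatim}

With these values, \verb\w\ contains the number right-justified in a
field of six characters between the brackets, and \verb\p\ is the
string \verb\'3.14'\.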
2774
2775 \subsection{Referencing Variables By Name}
2776
2777 If you have a really long format string that you don't want to split
2778 up, it would be nice if you could reference the variables to be
2779 formatted by name instead of by position.  This can be done by using
2780 an extension of C formats using the form \verb\%(name)format\, e.g.
2781
2782 \begin{verbatim}
2783         >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
2784         >>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
2785         Jack: 4098; Sjoerd: 4127; Dcab: 8637678
2786         >>>
2787 \end{verbatim}
2788
2789 This is particularly useful in combination with the new built-in
2790 \verb\vars()\ function, which returns a dictionary containing all
2791 local variables.
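
A minimal sketch of this combination is shown below.  The function
\verb\describe\ and its argument names are hypothetical.

\begin{verbatim}
def describe(name, phone):
    # vars() returns the local variables, here 'name' and 'phone',
    # so the format string can refer to them by name
    return '%(name)s can be reached at %(phone)d' % vars()

line = describe('Jack', 4098)
\end{verbatim}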
2792
2793 \section{Optional Function Arguments}
2794
2795 It is now possible to define functions with a variable number of
2796 arguments.  There are two forms, which can be combined.
2797
2798 \subsection{Default Argument Values}
2799
2800 The most useful form is to specify a default value for one or more
arguments.  This creates a function that can be called with fewer
arguments than it is defined to accept, e.g.
2803
2804 \begin{verbatim}
2805         def ask_ok(prompt, retries = 4, complaint = 'Yes or no, please!'):
2806                 while 1:
2807                         ok = raw_input(prompt)
2808                         if ok in ('y', 'ye', 'yes'): return 1
2809                         if ok in ('n', 'no', 'nop', 'nope'): return 0
2810                         retries = retries - 1
2811                         if retries < 0: raise IOError, 'refusenik user'
2812                         print complaint
2813 \end{verbatim}
2814
2815 This function can be called either like this:
2816 \verb\ask_ok('Do you really want to quit?')\ or like this:
2817 \verb\ask_ok('OK to overwrite the file?', 2)\.
2818
2819 The default values are evaluated at the point of function definition
2820 in the {\em defining} scope, so that e.g.
2821
2822 \begin{verbatim}
2823         i = 5
2824         def f(arg = i): print arg
2825         i = 6
2826         f()
2827 \end{verbatim}
2828
2829 will print \verb\5\.
2830
2831 \subsection{Arbitrary Argument Lists}
2832
2833 It is also possible to specify that a function can be called with an
2834 arbitrary number of arguments.  These arguments will be wrapped up in
2835 a tuple.  Before the variable number of arguments, zero or more normal
2836 arguments may occur, e.g.
2837
2838 \begin{verbatim}
2839         def fprintf(file, format, *args):
2840                 file.write(format % args)
2841 \end{verbatim}
2842
2843 This feature may be combined with the previous, e.g.
2844
2845 \begin{verbatim}
2846         def but_is_it_useful(required, optional = None, *remains):
2847                 print "I don't know"
2848 \end{verbatim}
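
To see how the arguments are distributed when both forms are combined,
consider the following sketch.  The function \verb\show_args\ is a
made-up helper that simply returns what it received.

\begin{verbatim}
def show_args(required, optional = None, *remains):
    # 'remains' is always a tuple, possibly an empty one
    return (required, optional, remains)

a = show_args(1)             # (1, None, ())
b = show_args(1, 2)          # (1, 2, ())
c = show_args(1, 2, 3, 4)    # (1, 2, (3, 4))
\end{verbatim}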
2849
2850 \section{Lambda And Functional Programming Tools}
2851
2852 \subsection{Lambda Forms}
2853
By popular demand, a few features commonly found in functional
2855 programming languages and Lisp have been added to Python.  With the
2856 \verb\lambda\ keyword, small anonymous functions can be created.
2857 Here's a function that returns the sum of its two arguments:
2858 \verb\lambda a, b: a+b\.  Lambda forms can be used wherever function
2859 objects are required.  They are syntactically restricted to a single
2860 expression.  Semantically, they are just syntactic sugar for a normal
2861 function definition.  Like nested function definitions, lambda forms
2862 cannot reference variables from the containing scope, but this can be
2863 overcome through the judicious use of default argument values, e.g.
2864
\begin{verbatim}
        def make_incrementor(n):
                return lambda x, incr=n: x+incr
\end{verbatim}
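
As a usage sketch, the incrementor can be created and called as
follows.  The lambda form is written without parentheses around its
argument list, in the same way as \verb\lambda a, b: a+b\ above; the
names \verb\add_two\, \verb\add_ten\ and \verb\results\ are chosen
only for this example.

\begin{verbatim}
def make_incrementor(n):
    # The default value of incr is fixed when the lambda is created
    return lambda x, incr=n: x+incr

add_two = make_incrementor(2)
add_ten = make_incrementor(10)
results = (add_two(5), add_ten(5))    # (7, 15)
\end{verbatim}

Each call to \verb\make_incrementor\ yields a separate function object
that remembers its own increment through the default argument value.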
2869
2870 \subsection{Map, Reduce and Filter}
2871
Three new built-in functions on sequences are good candidates for
being passed lambda forms.
2874
2875 \subsubsection{Map.}
2876
2877 \verb\map(function, sequence)\ calls \verb\function(item)\ for each of
2878 the sequence's items and returns a list of the return values.  For
2879 example, to compute some cubes:
2880
2881 \begin{verbatim}
2882         >>> map(lambda x: x*x*x, range(1, 11))
2883         [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
2884         >>>
2885 \end{verbatim}
2886
2887 More than one sequence may be passed; the function must then have as
2888 many arguments as there are sequences and is called with the
2889 corresponding item from each sequence (or \verb\None\ if some sequence
2890 is shorter than another).  If \verb\None\ is passed for the function,
2891 a function returning its argument(s) is substituted.
2892
2893 Combining these two special cases, we see that
2894 \verb\map(None, list1, list2)\  is a convenient way of turning a pair
2895 of lists into a list of pairs.  For example:
2896
2897 \begin{verbatim}
2898         >>> seq = range(8)
2899         >>> map(None, seq, map(lambda x: x*x, seq))
2900         [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36), (7, 49)]
2901         >>>
2902 \end{verbatim}
2903
2904 \subsubsection{Filter.}
2905
2906 \verb\filter(function, sequence)\ returns a sequence (of the same
2907 type, if possible) consisting of those items from the sequence for
2908 which \verb\function(item)\ is true.  For example, to compute some
2909 primes:
2910
2911 \begin{verbatim}
2912         >>> filter(lambda x: x%2 != 0 and x%3 != 0, range(2, 25))
2913         [5, 7, 11, 13, 17, 19, 23]
2914         >>>
2915 \end{verbatim}
2916
2917 \subsubsection{Reduce.}
2918
2919 \verb\reduce(function, sequence)\ returns a single value constructed
2920 by calling the (binary) function on the first two items of the
2921 sequence, then on the result and the next item, and so on.  For
2922 example, to compute the sum of the numbers 1 through 10:
2923
2924 \begin{verbatim}
2925         >>> reduce(lambda x, y: x+y, range(1, 11))
2926         55
2927         >>>
2928 \end{verbatim}
2929
2930 If there's only one item in the sequence, its value is returned; if
2931 the sequence is empty, an exception is raised.
2932
2933 A third argument can be passed to indicate the starting value.  In this
2934 case the starting value is returned for an empty sequence, and the
2935 function is first applied to the starting value and the first sequence
2936 item, then to the result and the next item, and so on.  For example,
2937
2938 \begin{verbatim}
2939         >>> def sum(seq):
2940         ...     return reduce(lambda x, y: x+y, seq, 0)
2941         ...
2942         >>> sum(range(1, 11))
2943         55
2944         >>> sum([])
2945         0
2946         >>>
2947 \end{verbatim}
2948
2949 \section{Continuation Lines Without Backslashes}
2950
2951 While the general mechanism for continuation of a source line on the
2952 next physical line remains to place a backslash on the end of the
2953 line, expressions inside matched parentheses (or square brackets, or
2954 curly braces) can now also be continued without using a backslash.
2955 This is particularly useful for calls to functions with many
2956 arguments, and for initializations of large tables.
2957
2958 For example:
2959
2960 \begin{verbatim}
2961         month_names = ['Januari', 'Februari', 'Maart',
2962                        'April',   'Mei',      'Juni',
2963                        'Juli',    'Augustus', 'September',
2964                        'Oktober', 'November', 'December']
2965 \end{verbatim}
2966
2967 and
2968
2969 \begin{verbatim}
2970         CopyInternalHyperLinks(self.context.hyperlinks,
2971                                copy.context.hyperlinks,
2972                                uidremap)
2973 \end{verbatim}
2974
2975 \section{Regular Expressions}
2976
2977 While C's printf-style output formats, transformed into Python, are
2978 adequate for most output formatting jobs, C's scanf-style input
2979 formats are not very powerful.  Instead of scanf-style input, Python
2980 offers Emacs-style regular expressions as a powerful input and
2981 scanning mechanism.  Read the corresponding section in the Library
2982 Reference for a full description.
2983
2984 \section{Generalized Dictionaries}
2985
2986 The keys of dictionaries are no longer restricted to strings -- they
2987 can be numbers, tuples, or (certain) class instances.  (Lists and
2988 dictionaries are not acceptable as dictionary keys, in order to avoid
2989 problems when the object used as a key is modified.)
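
For example, tuples make convenient keys for a table indexed by two
values.  In the sketch below, the dictionary names and their contents
are invented for the illustration.

\begin{verbatim}
board = {}
board[(0, 0)] = 'X'          # tuple keys: (row, column)
board[(1, 2)] = 'O'

squares = {2: 4, 3: 9}       # integer keys

corner = board[(0, 0)]
nine = squares[3]
\end{verbatim}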
2990
2991 Dictionaries have two new methods: \verb\d.values()\ returns a list of
2992 the dictionary's values, and \verb\d.items()\ returns a list of the
2993 dictionary's (key, value) pairs.  Like \verb\d.keys()\, these
2994 operations are slow for large dictionaries.  Examples:
2995
2996 \begin{verbatim}
2997         >>> d = {100: 'honderd', 1000: 'duizend', 10: 'tien'}
2998         >>> d.keys()
2999         [100, 10, 1000]
3000         >>> d.values()
3001         ['honderd', 'tien', 'duizend']
3002         >>> d.items()
3003         [(100, 'honderd'), (10, 'tien'), (1000, 'duizend')]
3004         >>>
3005 \end{verbatim}
3006
3007 \section{Miscellaneous New Built-in Functions}
3008
3009 The function \verb\vars()\ returns a dictionary containing the current
3010 local variables.  With a module as argument, it returns that module's
3011 global variables.  The old function \verb\dir(x)\ returns
3012 \verb\vars(x).keys()\.
3013
3014 The function \verb\round(x)\ returns a floating point number rounded
3015 to the nearest integer (but still expressed as a floating point
3016 number).  E.g. \verb\round(3.4) == 3.0\ and \verb\round(3.5) == 4.0\.
3017 With a second argument it rounds to the specified number of digits,
3018 e.g. \verb\round(math.pi, 4) == 3.1416\ or even
3019 \verb\round(123.4, -2) == 100.0\.
3020
3021 The function \verb\hash(x)\ returns a hash value for an object.
3022 All object types acceptable as dictionary keys have a hash value (and
3023 it is this hash value that the dictionary implementation uses).
3024
The function \verb\id(x)\ returns a unique identifier for an object.
3026 For two objects x and y, \verb\id(x) == id(y)\ if and only if
3027 \verb\x is y\.  (In fact the object's address is used.)
3028
3029 The function \verb\hasattr(x, name)\ returns whether an object has an
3030 attribute with the given name (a string value).  The function
3031 \verb\getattr(x, name)\ returns the object's attribute with the given
3032 name.  The function \verb\setattr(x, name, value)\ assigns a value to
3033 an object's attribute with the given name.  These three functions are
3034 useful if the attribute names are not known beforehand.  Note that
3035 \verb\getattr(x, 'foo')\ is equivalent to \verb\x.foo\, and
3036 \verb\setattr(x, 'foo', y)\ is equivalent to \verb\x.foo = y\.  By
3037 definition, \verb\hasattr(x, name)\ returns true if and only if
3038 \verb\getattr(x, name)\ returns without raising an exception.
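
The following sketch shows the three functions applied to attribute
names held in strings.  The class \verb\Record\ and the attribute
names are hypothetical.

\begin{verbatim}
class Record:
    pass

r = Record()
for name in ['title', 'year']:    # names are only known as strings
    setattr(r, name, None)
setattr(r, 'year', 1994)          # same effect as r.year = 1994

has_title = hasattr(r, 'title')   # true: the attribute was set above
has_pages = hasattr(r, 'pages')   # false: it was never set
year = getattr(r, 'year')         # same value as r.year
\end{verbatim}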
3039
3040 \section{Else Clause For Try Statement}
3041
The \verb\try...except\ statement now has an optional \verb\else\
clause, which must follow all \verb\except\ clauses.  It is useful for
code that must be executed if the \verb\try\ clause does not
raise an exception.  For example:
3046
\begin{verbatim}
        import sys

        # Skip sys.argv[0], which holds the name of the script itself
        for arg in sys.argv[1:]:
                try:
                        f = open(arg, 'r')
                except IOError:
                        print 'cannot open', arg
                else:
                        print arg, 'has', len(f.readlines()), 'lines'
                        f.close()
\end{verbatim}
3057
3058 \end{document}