Doc/lib/libstdtypes.tex

   1 \section{Built-in Types \label{types}}
   2
   3 The following sections describe the standard types that are built into
   4 the interpreter.  Historically, Python's built-in types have differed
   5 from user-defined types because it was not possible to use the built-in
   6 types as the basis for object-oriented inheritance. With the 2.2
   7 release this situation has started to change, although the intended
   8 unification of user-defined and built-in types is as yet far from
   9 complete.
  10
The principal built-in types are numerics, sequences, mappings, files,
classes, instances and exceptions.
  13 \indexii{built-in}{types}
  14
  15 Some operations are supported by several object types; in particular,
  16 all objects can be compared, tested for truth value, and converted to
  17 a string (with the \code{`\textrm{\ldots}`} notation).  The latter
  18 conversion is implicitly used when an object is written by the
  19 \keyword{print}\stindex{print} statement.
  20
  21
\subsection{Truth Value Testing \label{truth}}
  23
  24 Any object can be tested for truth value, for use in an \keyword{if} or
\keyword{while} condition or as an operand of the Boolean operations below.
  26 The following values are considered false:
  27 \stindex{if}
  28 \stindex{while}
  29 \indexii{truth}{value}
  30 \indexii{Boolean}{operations}
  31 \index{false}
  32
  33 \begin{itemize}
  34
  35 \item   \code{None}
  36         \withsubitem{(Built-in object)}{\ttindex{None}}
  37
  38 \item   \code{False}
  39         \withsubitem{(Built-in object)}{\ttindex{False}}
  40
  41 \item   zero of any numeric type, for example, \code{0}, \code{0L},
  42         \code{0.0}, \code{0j}.
  43
  44 \item   any empty sequence, for example, \code{''}, \code{()}, \code{[]}.
  45
  46 \item   any empty mapping, for example, \code{\{\}}.
  47
  48 \item   instances of user-defined classes, if the class defines a
  49         \method{__nonzero__()} or \method{__len__()} method, when that
  50         method returns the integer zero or \class{bool} value
  51         \code{False}.\footnote{Additional
  52 information on these special methods may be found in the
  53 \citetitle[../ref/ref.html]{Python Reference Manual}.}
  54
  55 \end{itemize}
  56
  57 All other values are considered true --- so objects of many types are
  58 always true.
  59 \index{true}
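
As a minimal sketch of these rules, the following applies
\function{bool()} to one value from each category listed above.  The
class \class{EmptyBox} is hypothetical and exists only for
illustration; it relies on \method{__len__()} returning zero.

\begin{verbatim}
class EmptyBox:
    # Hypothetical class: __len__() returning zero makes instances false.
    def __len__(self):
        return 0

false_values = [None, False, 0, 0.0, 0j, '', (), [], {}, EmptyBox()]
all_false = [bool(v) for v in false_values]

# Anything else is true, even values that merely "look" empty or false.
true_values = [1, -1, 'False', [0], (None,), {0: 0}]
all_true = [bool(v) for v in true_values]
\end{verbatim}

Every entry of \code{all_false} is false and every entry of
\code{all_true} is true.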
  60
  61 Operations and built-in functions that have a Boolean result always
  62 return \code{0} or \code{False} for false and \code{1} or \code{True}
  63 for true, unless otherwise stated.  (Important exception: the Boolean
  64 operations \samp{or}\opindex{or} and \samp{and}\opindex{and} always
  65 return one of their operands.)
  66 \index{False}
  67 \index{True}
  68
  69 \subsection{Boolean Operations \label{boolean}}
  70
  71 These are the Boolean operations, ordered by ascending priority:
  72 \indexii{Boolean}{operations}
  73
  74 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
  75   \lineiii{\var{x} or \var{y}}
  76           {if \var{x} is false, then \var{y}, else \var{x}}{(1)}
  77   \lineiii{\var{x} and \var{y}}
  78           {if \var{x} is false, then \var{x}, else \var{y}}{(1)}
  79   \hline
  80   \lineiii{not \var{x}}
  81           {if \var{x} is false, then \code{True}, else \code{False}}{(2)}
  82 \end{tableiii}
  83 \opindex{and}
  84 \opindex{or}
  85 \opindex{not}
  86
  87 \noindent
  88 Notes:
  89
  90 \begin{description}
  91
  92 \item[(1)]
  93 These only evaluate their second argument if needed for their outcome.
  94
  95 \item[(2)]
  96 \samp{not} has a lower priority than non-Boolean operators, so
  97 \code{not \var{a} == \var{b}} is interpreted as \code{not (\var{a} ==
  98 \var{b})}, and \code{\var{a} == not \var{b}} is a syntax error.
  99
 100 \end{description}
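
The notes above can be illustrated with a short sketch.  The helper
\function{noisy()} is hypothetical; it records each argument it is
called with, so the list \code{calls} shows which operands were
actually evaluated.

\begin{verbatim}
calls = []

def noisy(value):
    # Hypothetical helper: remember that this operand was evaluated.
    calls.append(value)
    return value

r1 = noisy(0) and noisy(1)   # 0: the first operand is false, so it is returned
r2 = noisy(2) or noisy(3)    # 2: the first operand is true, so it is returned
evaluated = calls[:]         # [0, 2]: the second operands were never evaluated

r3 = not 1 == 2              # parsed as not (1 == 2)
\end{verbatim}

Note that \code{r1} and \code{r2} are the operands themselves, not
\code{True} or \code{False}; only \samp{not} produces a Boolean value here.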
 101
 102
 103 \subsection{Comparisons \label{comparisons}}
 104
 105 Comparison operations are supported by all objects.  They all have the
 106 same priority (which is higher than that of the Boolean operations).
 107 Comparisons can be chained arbitrarily; for example, \code{\var{x} <
 108 \var{y} <= \var{z}} is equivalent to \code{\var{x} < \var{y} and
 109 \var{y} <= \var{z}}, except that \var{y} is evaluated only once (but
 110 in both cases \var{z} is not evaluated at all when \code{\var{x} <
 111 \var{y}} is found to be false).
 112 \indexii{chaining}{comparisons}
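
A small sketch of the difference between a chained comparison and its
spelled-out form follows.  The helpers \function{middle()} and
\function{never()} are hypothetical and serve only to make the
evaluation order visible.

\begin{verbatim}
calls = []

def middle():
    # Hypothetical helper: record each evaluation of the middle operand.
    calls.append('y')
    return 5

def never():
    # Hypothetical helper: fails loudly if it is ever evaluated.
    raise RuntimeError('the last operand should not be evaluated')

chained = 1 < middle() <= 10                   # middle() runs once
chained_calls = len(calls)

del calls[:]
spelled_out = 1 < middle() and middle() <= 10  # middle() runs twice
spelled_out_calls = len(calls)

# 10 < 5 is false, so never() is not called and no exception is raised.
short_circuit = 10 < middle() <= never()
\end{verbatim}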
 113
 114 This table summarizes the comparison operations:
 115
 116 \begin{tableiii}{c|l|c}{code}{Operation}{Meaning}{Notes}
 117   \lineiii{<}{strictly less than}{}
 118   \lineiii{<=}{less than or equal}{}
 119   \lineiii{>}{strictly greater than}{}
 120   \lineiii{>=}{greater than or equal}{}
 121   \lineiii{==}{equal}{}
 122   \lineiii{!=}{not equal}{(1)}
 123   \lineiii{<>}{not equal}{(1)}
 124   \lineiii{is}{object identity}{}
 125   \lineiii{is not}{negated object identity}{}
 126 \end{tableiii}
 127 \indexii{operator}{comparison}
 128 \opindex{==} % XXX *All* others have funny characters < ! >
 129 \opindex{is}
 130 \opindex{is not}
 131
 132 \noindent
 133 Notes:
 134
 135 \begin{description}
 136
 137 \item[(1)]
 138 \code{<>} and \code{!=} are alternate spellings for the same operator.
 139 \code{!=} is the preferred spelling; \code{<>} is obsolescent.
 140
 141 \end{description}
 142
 143 Objects of different types, except different numeric types, never
 144 compare equal; such objects are ordered consistently but arbitrarily
 145 (so that sorting a heterogeneous array yields a consistent result).
 146 Furthermore, some types (for example, file objects) support only a
 147 degenerate notion of comparison where any two objects of that type are
 148 unequal.  Again, such objects are ordered arbitrarily but
 149 consistently. The \code{<}, \code{<=}, \code{>} and \code{>=}
 150 operators will raise a \exception{TypeError} exception when any operand
 151 is a complex number.
 152 \indexii{object}{numeric}
 153 \indexii{objects}{comparing}
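
For illustration, the sketch below checks three of the statements
above: numbers of different numeric types can compare equal, objects
of otherwise different types do not, and ordering a complex number
raises \exception{TypeError}.  The variable names are arbitrary.

\begin{verbatim}
mixed_numeric = (1 == 1.0)       # equal: both are numbers
different_types = (1 == '1')     # not equal: a number and a string

try:
    (1 + 2j) < (3 + 4j)
    complex_ordering = 'allowed'
except TypeError:
    complex_ordering = 'TypeError'
\end{verbatim}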
 154
 155 Instances of a class normally compare as non-equal unless the class
 156 \withsubitem{(instance method)}{\ttindex{__cmp__()}}
 157 defines the \method{__cmp__()} method.  Refer to the
 158 \citetitle[../ref/customization.html]{Python Reference Manual} for
 159 information on the use of this method to effect object comparisons.
 160
 161 \strong{Implementation note:} Objects of different types except
 162 numbers are ordered by their type names; objects of the same types
 163 that don't support proper comparison are ordered by their address.
 164
 165 Two more operations with the same syntactic priority,
 166 \samp{in}\opindex{in} and \samp{not in}\opindex{not in}, are supported
 167 only by sequence types (below).
 168
 169
 170 \subsection{Numeric Types \label{typesnumeric}}
 171
 172 There are four distinct numeric types: \dfn{plain integers},
 173 \dfn{long integers},
 174 \dfn{floating point numbers}, and \dfn{complex numbers}.
 175 In addition, Booleans are a subtype of plain integers.
 176 Plain integers (also just called \dfn{integers})
 177 are implemented using \ctype{long} in C, which gives them at least 32
 178 bits of precision.  Long integers have unlimited precision.  Floating
 179 point numbers are implemented using \ctype{double} in C.  All bets on
 180 their precision are off unless you happen to know the machine you are
 181 working with.
 182 \obindex{numeric}
 183 \obindex{Boolean}
 184 \obindex{integer}
 185 \obindex{long integer}
 186 \obindex{floating point}
 187 \obindex{complex number}
 188 \indexii{C}{language}
 189
 190 Complex numbers have a real and imaginary part, which are each
 191 implemented using \ctype{double} in C.  To extract these parts from
 192 a complex number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.
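
As a brief example, assuming nothing beyond the attributes just
described:

\begin{verbatim}
z = 3 + 4j
real_part = z.real      # 3.0
imag_part = z.imag      # 4.0

pure = 4j               # an imaginary literal has a zero real part
pure_real = pure.real   # 0.0
\end{verbatim}

Both parts are floating point numbers, even when the literal was
written with integers.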
 193
 194 Numbers are created by numeric literals or as the result of built-in
 195 functions and operators.  Unadorned integer literals (including hex
 196 and octal numbers) yield plain integers unless the value they denote
 197 is too large to be represented as a plain integer, in which case
 198 they yield a long integer.  Integer literals with an
 199 \character{L} or \character{l} suffix yield long integers
 200 (\character{L} is preferred because \samp{1l} looks too much like
 201 eleven!).  Numeric literals containing a decimal point or an exponent
 202 sign yield floating point numbers.  Appending \character{j} or
 203 \character{J} to a numeric literal yields a complex number with a
 204 zero real part. A complex numeric literal is the sum of a real and
 205 an imaginary part.
 206 \indexii{numeric}{literals}
 207 \indexii{integer}{literals}
 208 \indexiii{long}{integer}{literals}
 209 \indexii{floating point}{literals}
 210 \indexii{complex number}{literals}
 211 \indexii{hexadecimal}{literals}
 212 \indexii{octal}{literals}
 213
 214 Python fully supports mixed arithmetic: when a binary arithmetic
 215 operator has operands of different numeric types, the operand with the
 216 ``narrower'' type is widened to that of the other, where plain
 217 integer is narrower than long integer is narrower than floating point is
 218 narrower than complex.
 219 Comparisons between numbers of mixed type use the same rule.\footnote{
 220         As a consequence, the list \code{[1, 2]} is considered equal
 221         to \code{[1.0, 2.0]}, and similarly for tuples.
 222 } The constructors \function{int()}, \function{long()}, \function{float()},
 223 and \function{complex()} can be used
 224 to produce numbers of a specific type.
 225 \index{arithmetic}
 226 \bifuncindex{int}
 227 \bifuncindex{long}
 228 \bifuncindex{float}
 229 \bifuncindex{complex}
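
The widening rule can be seen in a short sketch.  It restricts itself
to plain integers, floating point numbers and complex numbers; the
variable names are arbitrary.

\begin{verbatim}
int_plus_float = 1 + 2.0         # the integer is widened: result is 3.0
float_plus_complex = 2.0 + 1j    # the float is widened: result is (2+1j)

lists_equal = [1, 2] == [1.0, 2.0]     # comparison uses the same rule
tuples_equal = (1, 2) == (1.0, 2.0)

as_int = int('42')               # 42
as_float = float(3)              # 3.0
as_complex = complex(1, 2)       # (1+2j)
\end{verbatim}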
 230
 231 All numeric types support the following operations, sorted by
 232 ascending priority (operations in the same box have the same
 233 priority; all numeric operations have a higher priority than
 234 comparison operations):
 235
 236 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
 237   \lineiii{\var{x} + \var{y}}{sum of \var{x} and \var{y}}{}
 238   \lineiii{\var{x} - \var{y}}{difference of \var{x} and \var{y}}{}
 239   \hline
 240   \lineiii{\var{x} * \var{y}}{product of \var{x} and \var{y}}{}
 241   \lineiii{\var{x} / \var{y}}{quotient of \var{x} and \var{y}}{(1)}
 242   \lineiii{\var{x} \%{} \var{y}}{remainder of \code{\var{x} / \var{y}}}{}
 243   \hline
 244   \lineiii{-\var{x}}{\var{x} negated}{}
 245   \lineiii{+\var{x}}{\var{x} unchanged}{}
 246   \hline
 247   \lineiii{abs(\var{x})}{absolute value or magnitude of \var{x}}{}
 248   \lineiii{int(\var{x})}{\var{x} converted to integer}{(2)}
 249   \lineiii{long(\var{x})}{\var{x} converted to long integer}{(2)}
 250   \lineiii{float(\var{x})}{\var{x} converted to floating point}{}
 251   \lineiii{complex(\var{re},\var{im})}{a complex number with real part \var{re}, imaginary part \var{im}.  \var{im} defaults to zero.}{}
 252   \lineiii{\var{c}.conjugate()}{conjugate of the complex number \var{c}}{}
 253   \lineiii{divmod(\var{x}, \var{y})}{the pair \code{(\var{x} / \var{y}, \var{x} \%{} \var{y})}}{(3)}
 254   \lineiii{pow(\var{x}, \var{y})}{\var{x} to the power \var{y}}{}
 255   \lineiii{\var{x} ** \var{y}}{\var{x} to the power \var{y}}{}
 256 \end{tableiii}
 257 \indexiii{operations on}{numeric}{types}
 258 \withsubitem{(complex number method)}{\ttindex{conjugate()}}
 259
 260 \noindent
 261 Notes:
 262 \begin{description}
 263
 264 \item[(1)]
 265 For (plain or long) integer division, the result is an integer.
 266 The result is always rounded towards minus infinity: 1/2 is 0,
 267 (-1)/2 is -1, 1/(-2) is -1, and (-1)/(-2) is 0.  Note that the result
 268 is a long integer if either operand is a long integer, regardless of
 269 the numeric value.
 270 \indexii{integer}{division}
 271 \indexiii{long}{integer}{division}
 272
 273 \item[(2)]
 274 Conversion from floating point to (long or plain) integer may round or
 275 truncate as in C; see functions \function{floor()} and
 276 \function{ceil()} in the \refmodule{math}\refbimodindex{math} module
 277 for well-defined conversions.
 278 \withsubitem{(in module math)}{\ttindex{floor()}\ttindex{ceil()}}
 279 \indexii{numeric}{conversions}
 280 \indexii{C}{language}
 281
 282 \item[(3)]
 283 See section \ref{built-in-funcs}, ``Built-in Functions,'' for a full
 284 description.
 285
 286 \end{description}
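
The following sketch exercises several rows of the table.  It uses
\function{divmod()} rather than \samp{/} to show the flooring
behaviour of integer division, since the quotient and remainder are
then visible together.  The value noted for \code{int(-3.7)} is what
the C implementation of Python produces; as note (2) says, use
\function{floor()} or \function{ceil()} when the direction of rounding
matters.

\begin{verbatim}
import math

quotient, remainder = divmod(-7, 2)   # (-4, 1): the quotient is floored
same_remainder = -7 % 2               # 1, matching divmod()
power_a = pow(2, 10)                  # 1024
power_b = 2 ** 10                     # 1024
magnitude = abs(-3.5)                 # 3.5

truncated = int(-3.7)                 # -3 here: truncated toward zero
floored = math.floor(-3.7)            # equal to -4: toward minus infinity
ceiled = math.ceil(-3.7)              # equal to -3: toward plus infinity
\end{verbatim}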
 287 % XXXJH exceptions: overflow (when? what operations?) zerodivision
 288
 289 \subsubsection{Bit-string Operations on Integer Types \label{bitstring-ops}}
 290 \nodename{Bit-string Operations}
 291
 292 Plain and long integer types support additional operations that make
 293 sense only for bit-strings.  Negative numbers are treated as their 2's
 294 complement value (for long integers, this assumes a sufficiently large
 295 number of bits that no overflow occurs during the operation).
 296
 297 The priorities of the binary bit-wise operations are all lower than
 298 the numeric operations and higher than the comparisons; the unary
 299 operation \samp{\~} has the same priority as the other unary numeric
 300 operations (\samp{+} and \samp{-}).
 301
 302 This table lists the bit-string operations sorted in ascending
 303 priority (operations in the same box have the same priority):
 304
 305 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
 306   \lineiii{\var{x} | \var{y}}{bitwise \dfn{or} of \var{x} and \var{y}}{}
 307   \lineiii{\var{x} \^{} \var{y}}{bitwise \dfn{exclusive or} of \var{x} and \var{y}}{}
 308   \lineiii{\var{x} \&{} \var{y}}{bitwise \dfn{and} of \var{x} and \var{y}}{}
 309   \lineiii{\var{x} << \var{n}}{\var{x} shifted left by \var{n} bits}{(1), (2)}
 310   \lineiii{\var{x} >> \var{n}}{\var{x} shifted right by \var{n} bits}{(1), (3)}
 311   \hline
 312   \lineiii{\~\var{x}}{the bits of \var{x} inverted}{}
 313 \end{tableiii}
 314 \indexiii{operations on}{integer}{types}
 315 \indexii{bit-string}{operations}
 316 \indexii{shifting}{operations}
 317 \indexii{masking}{operations}
 318
 319 \noindent
 320 Notes:
 321 \begin{description}
 322 \item[(1)] Negative shift counts are illegal and cause a
 323 \exception{ValueError} to be raised.
 324 \item[(2)] A left shift by \var{n} bits is equivalent to
 325 multiplication by \code{pow(2, \var{n})} without overflow check.
 326 \item[(3)] A right shift by \var{n} bits is equivalent to
 327 division by \code{pow(2, \var{n})} without overflow check.
 328 \end{description}
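
A short sketch of the operations in the table, with small operands so
that the bit patterns are easy to follow.  The variable names are
arbitrary.

\begin{verbatim}
x, y = 5, 3              # binary 101 and 011
either = x | y           # 7 (111)
differ = x ^ y           # 6 (110)
both = x & y             # 1 (001)
inverted = ~x            # -6, that is, -(x + 1)

left = 1 << 4            # 16, the same as 1 * pow(2, 4)
right = -17 >> 2         # -5, the floored quotient of -17 and pow(2, 2)

try:
    1 << -1              # negative shift counts are illegal
    negative_shift = 'allowed'
except ValueError:
    negative_shift = 'ValueError'
\end{verbatim}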
 329
 330
 331 \subsection{Iterator Types \label{typeiter}}
 332
 333 \versionadded{2.2}
 334 \index{iterator protocol}
 335 \index{protocol!iterator}
 336 \index{sequence!iteration}
 337 \index{container!iteration over}
 338
 339 Python supports a concept of iteration over containers.  This is
 340 implemented using two distinct methods; these are used to allow
 341 user-defined classes to support iteration.  Sequences, described below
 342 in more detail, always support the iteration methods.
 343
 344 One method needs to be defined for container objects to provide
 345 iteration support:
 346
 347 \begin{methoddesc}[container]{__iter__}{}
 348   Return an iterator object.  The object is required to support the
 349   iterator protocol described below.  If a container supports
 350   different types of iteration, additional methods can be provided to
 351   specifically request iterators for those iteration types.  (An
 352   example of an object supporting multiple forms of iteration would be
 353   a tree structure which supports both breadth-first and depth-first
 354   traversal.)  This method corresponds to the \member{tp_iter} slot of
 355   the type structure for Python objects in the Python/C API.
 356 \end{methoddesc}
 357
 358 The iterator objects themselves are required to support the following
 359 two methods, which together form the \dfn{iterator protocol}:
 360
 361 \begin{methoddesc}[iterator]{__iter__}{}
 362   Return the iterator object itself.  This is required to allow both
  containers and iterators to be used with the \keyword{for} statement
  and the \keyword{in} operator.  This method corresponds to the
 365   \member{tp_iter} slot of the type structure for Python objects in
 366   the Python/C API.
 367 \end{methoddesc}
 368
 369 \begin{methoddesc}[iterator]{next}{}
 370   Return the next item from the container.  If there are no further
 371   items, raise the \exception{StopIteration} exception.  This method
 372   corresponds to the \member{tp_iternext} slot of the type structure
 373   for Python objects in the Python/C API.
 374 \end{methoddesc}
 375
 376 Python defines several iterator objects to support iteration over
 377 general and specific sequence types, dictionaries, and other more
 378 specialized forms.  The specific types are not important beyond their
 379 implementation of the iterator protocol.
 380
 381 The intention of the protocol is that once an iterator's
 382 \method{next()} method raises \exception{StopIteration}, it will
 383 continue to do so on subsequent calls.  Implementations that
 384 do not obey this property are deemed broken.  (This constraint
 385 was added in Python 2.3; in Python 2.2, various iterators are
 386 broken according to this rule.)
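
As a minimal sketch of the protocol, the hypothetical classes
\class{Countdown} (a container) and \class{CountdownIterator} (its
iterator) below implement the methods described above.  The loop
calls the methods by hand to make each step explicit; this is roughly
what a \keyword{for} statement does on the caller's behalf.

\begin{verbatim}
class Countdown:
    # Hypothetical container: yields start, start - 1, ..., 1.
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)

class CountdownIterator:
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # An iterator returns itself.
        return self

    def next(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current = value - 1
        return value

iterator = Countdown(3).__iter__()
items = []
while 1:
    try:
        items.append(iterator.next())
    except StopIteration:
        break

# Once exhausted, the iterator keeps raising StopIteration.
try:
    iterator.next()
    still_exhausted = 0
except StopIteration:
    still_exhausted = 1
\end{verbatim}

Because \method{next()} only ever decreases \code{current}, the
iterator satisfies the requirement that \exception{StopIteration} is
raised again on every later call.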
 387
 388
 389 \subsection{Sequence Types \label{typesseq}}
 390
 391 There are six sequence types: strings, Unicode strings, lists,
 392 tuples, buffers, and xrange objects.
 393
 394 String literals are written in single or double quotes:
 395 \code{'xyzzy'}, \code{"frobozz"}.  See chapter 2 of the
 396 \citetitle[../ref/strings.html]{Python Reference Manual} for more about
 397 string literals.  Unicode strings are much like strings, but are
specified in the syntax using a preceding \character{u} character:
 399 \code{u'abc'}, \code{u"def"}.  Lists are constructed with square brackets,
 400 separating items with commas: \code{[a, b, c]}.  Tuples are
 401 constructed by the comma operator (not within square brackets), with
 402 or without enclosing parentheses, but an empty tuple must have the
 403 enclosing parentheses, e.g., \code{a, b, c} or \code{()}.  A single
 404 item tuple must have a trailing comma, e.g., \code{(d,)}.
 405 \obindex{sequence}
 406 \obindex{string}
 407 \obindex{Unicode}
 408 \obindex{tuple}
 409 \obindex{list}
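
The construction rules for tuples are easy to get wrong, so a short
sketch may help.  The names are arbitrary.

\begin{verbatim}
a, b, c = 1, 2, 3

bare = a, b, c          # a tuple: the commas make it, not the parentheses
wrapped = (a, b, c)     # the same tuple, with optional parentheses
empty = ()              # the empty tuple needs the parentheses
single = (a,)           # a one-item tuple needs the trailing comma
not_a_tuple = (a)       # just the integer 1 in parentheses

items = [a, b, c]       # a list
text = 'xyzzy'          # a string
utext = u'abc'          # a Unicode string
\end{verbatim}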
 410
 411 Buffer objects are not directly supported by Python syntax, but can be
created by calling the built-in function
 413 \function{buffer()}.\bifuncindex{buffer}  They don't support
 414 concatenation or repetition.
 415 \obindex{buffer}
 416
 417 Xrange objects are similar to buffers in that there is no specific
 418 syntax to create them, but they are created using the \function{xrange()}
 419 function.\bifuncindex{xrange}  They don't support slicing,
 420 concatenation or repetition, and using \code{in}, \code{not in},
 421 \function{min()} or \function{max()} on them is inefficient.
 422 \obindex{xrange}
 423
 424 Most sequence types support the following operations.  The \samp{in} and
 425 \samp{not in} operations have the same priorities as the comparison
 426 operations.  The \samp{+} and \samp{*} operations have the same
 427 priority as the corresponding numeric operations.\footnote{They must
 428 have since the parser can't tell the type of the operands.}
 429
 430 This table lists the sequence operations sorted in ascending priority
 431 (operations in the same box have the same priority).  In the table,
 432 \var{s} and \var{t} are sequences of the same type; \var{n}, \var{i}
 433 and \var{j} are integers:
 434
 435 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
 436   \lineiii{\var{x} in \var{s}}{\code{1} if an item of \var{s} is equal to \var{x}, else \code{0}}{(1)}
 437   \lineiii{\var{x} not in \var{s}}{\code{0} if an item of \var{s} is
 438 equal to \var{x}, else \code{1}}{(1)}
 439   \hline
 440   \lineiii{\var{s} + \var{t}}{the concatenation of \var{s} and \var{t}}{}
 441   \lineiii{\var{s} * \var{n}\textrm{,} \var{n} * \var{s}}{\var{n} shallow copies of \var{s} concatenated}{(2)}
 442   \hline
 443   \lineiii{\var{s}[\var{i}]}{\var{i}'th item of \var{s}, origin 0}{(3)}
 444   \lineiii{\var{s}[\var{i}:\var{j}]}{slice of \var{s} from \var{i} to \var{j}}{(3), (4)}
 445   \hline
 446   \lineiii{len(\var{s})}{length of \var{s}}{}
 447   \lineiii{min(\var{s})}{smallest item of \var{s}}{}
 448   \lineiii{max(\var{s})}{largest item of \var{s}}{}
 449 \end{tableiii}
 450 \indexiii{operations on}{sequence}{types}
 451 \bifuncindex{len}
 452 \bifuncindex{min}
 453 \bifuncindex{max}
 454 \indexii{concatenation}{operation}
 455 \indexii{repetition}{operation}
 456 \indexii{subscript}{operation}
 457 \indexii{slice}{operation}
 458 \opindex{in}
 459 \opindex{not in}
 460
 461 \noindent
 462 Notes:
 463
 464 \begin{description}
 465 \item[(1)] When \var{s} is a string or Unicode string object the
 466 \code{in} and \code{not in} operations act like a substring test.  In
 467 Python versions before 2.3, \var{x} had to be a string of length 1.
 468 In Python 2.3 and beyond, \var{x} may be a string of any length.
 469
 470 \item[(2)] Values of \var{n} less than \code{0} are treated as
 471   \code{0} (which yields an empty sequence of the same type as
 472   \var{s}).  Note also that the copies are shallow; nested structures
 473   are not copied.  This often haunts new Python programmers; consider:
 474
 475 \begin{verbatim}
 476 >>> lists = [[]] * 3
 477 >>> lists
 478 [[], [], []]
 479 >>> lists[0].append(3)
 480 >>> lists
 481 [[3], [3], [3]]
 482 \end{verbatim}
 483
 484   What has happened is that \code{lists} is a list containing three
 485   copies of the list \code{[[]]} (a one-element list containing an
 486   empty list), but the contained list is shared by each copy.  You can
 487   create a list of different lists this way:
 488
 489 \begin{verbatim}
 490 >>> lists = [[] for i in range(3)]
 491 >>> lists[0].append(3)
 492 >>> lists[1].append(5)
 493 >>> lists[2].append(7)
 494 >>> lists
 495 [[3], [5], [7]]
 496 \end{verbatim}
 497
\item[(3)] If \var{i} or \var{j} is negative, the index is relative to
  the end of the sequence: \code{len(\var{s}) + \var{i}} or
 500   \code{len(\var{s}) + \var{j}} is substituted.  But note that \code{-0} is
 501   still \code{0}.
 502
 503 \item[(4)] The slice of \var{s} from \var{i} to \var{j} is defined as
 504   the sequence of items with index \var{k} such that \code{\var{i} <=
 505   \var{k} < \var{j}}.  If \var{i} or \var{j} is greater than
 506   \code{len(\var{s})}, use \code{len(\var{s})}.  If \var{i} is omitted,
 507   use \code{0}.  If \var{j} is omitted, use \code{len(\var{s})}.  If
 508   \var{i} is greater than or equal to \var{j}, the slice is empty.
 509 \end{description}
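
Notes (2) to (4) can be checked with a short sketch on a string; the
same expressions work on lists and tuples.

\begin{verbatim}
s = 'python'

third = s[2]             # 't': indexing starts at 0
last = s[-1]             # 'n', the same as s[len(s) - 1]
tail = s[-3:]            # 'hon': j omitted, so len(s) is used
head = s[:2]             # 'py': i omitted, so 0 is used
clipped = s[2:100]       # 'thon': j is reduced to len(s)
empty = s[4:2]           # '': i is not less than j

found = 'y' in s         # true
nothing = [0] * -1       # []: a negative count is treated as 0
doubled = head + head    # 'pypy'
\end{verbatim}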
 510
 511
 512 \subsubsection{String Methods \label{string-methods}}
 513
 514 These are the string methods which both 8-bit strings and Unicode
 515 objects support:
 516
 517 \begin{methoddesc}[string]{capitalize}{}
 518 Return a copy of the string with only its first character capitalized.
 519 \end{methoddesc}
 520
 521 \begin{methoddesc}[string]{center}{width}
Return the string centered in a string of length \var{width}. Padding is
done using spaces.
 524 \end{methoddesc}
 525
 526 \begin{methoddesc}[string]{count}{sub\optional{, start\optional{, end}}}
Return the number of occurrences of substring \var{sub} in the slice
\code{\var{s}[\var{start}:\var{end}]}.  Optional arguments \var{start} and
\var{end} are interpreted as in slice notation.
 530 \end{methoddesc}
 531
 532 \begin{methoddesc}[string]{decode}{\optional{encoding\optional{, errors}}}
 533 Decodes the string using the codec registered for \var{encoding}.
 534 \var{encoding} defaults to the default string encoding.  \var{errors}
 535 may be given to set a different error handling scheme.  The default is
 536 \code{'strict'}, meaning that encoding errors raise
 537 \exception{ValueError}.  Other possible values are \code{'ignore'} and
\code{'replace'}.
 539 \versionadded{2.2}
 540 \end{methoddesc}
 541
\begin{methoddesc}[string]{encode}{\optional{encoding\optional{, errors}}}
 543 Return an encoded version of the string.  Default encoding is the current
 544 default string encoding.  \var{errors} may be given to set a different
 545 error handling scheme.  The default for \var{errors} is
 546 \code{'strict'}, meaning that encoding errors raise a
 547 \exception{ValueError}.  Other possible values are \code{'ignore'} and
 548 \code{'replace'}.
 549 \versionadded{2.0}
 550 \end{methoddesc}
 551
 552 \begin{methoddesc}[string]{endswith}{suffix\optional{, start\optional{, end}}}
 553 Return true if the string ends with the specified \var{suffix},
 554 otherwise return false.  With optional \var{start}, test beginning at
 555 that position.  With optional \var{end}, stop comparing at that position.
 556 \end{methoddesc}
 557
 558 \begin{methoddesc}[string]{expandtabs}{\optional{tabsize}}
 559 Return a copy of the string where all tab characters are expanded
 560 using spaces.  If \var{tabsize} is not given, a tab size of \code{8}
 561 characters is assumed.
 562 \end{methoddesc}
 563
 564 \begin{methoddesc}[string]{find}{sub\optional{, start\optional{, end}}}
 565 Return the lowest index in the string where substring \var{sub} is
 566 found, such that \var{sub} is contained in the range [\var{start},
 567 \var{end}).  Optional arguments \var{start} and \var{end} are
 568 interpreted as in slice notation.  Return \code{-1} if \var{sub} is
 569 not found.
 570 \end{methoddesc}
 571
 572 \begin{methoddesc}[string]{index}{sub\optional{, start\optional{, end}}}
 573 Like \method{find()}, but raise \exception{ValueError} when the
 574 substring is not found.
 575 \end{methoddesc}
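
The difference between \method{find()} and \method{index()} is only in
how failure is reported, as this small sketch shows.  The sample
string is arbitrary.

\begin{verbatim}
s = 'banana'

first = s.find('an')         # 1: lowest index where 'an' occurs
later = s.find('an', 2)      # 3: the search starts at index 2
missing = s.find('x')        # -1: not found
how_many = s.count('an')     # 2

try:
    s.index('x')             # same search, but failure is an exception
    outcome = 'found'
except ValueError:
    outcome = 'ValueError'
\end{verbatim}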
 576
 577 \begin{methoddesc}[string]{isalnum}{}
 578 Return true if all characters in the string are alphanumeric and there
 579 is at least one character, false otherwise.
 580 \end{methoddesc}
 581
 582 \begin{methoddesc}[string]{isalpha}{}
 583 Return true if all characters in the string are alphabetic and there
 584 is at least one character, false otherwise.
 585 \end{methoddesc}
 586
 587 \begin{methoddesc}[string]{isdigit}{}
Return true if all characters in the string are digits and there is at
least one character, false otherwise.
 589 \end{methoddesc}
 590
 591 \begin{methoddesc}[string]{islower}{}
 592 Return true if all cased characters in the string are lowercase and
 593 there is at least one cased character, false otherwise.
 594 \end{methoddesc}
 595
 596 \begin{methoddesc}[string]{isspace}{}
 597 Return true if there are only whitespace characters in the string and
 598 the string is not empty, false otherwise.
 599 \end{methoddesc}
 600
 601 \begin{methoddesc}[string]{istitle}{}
 602 Return true if the string is a titlecased string: uppercase
 603 characters may only follow uncased characters and lowercase characters
 604 only cased ones.  Return false otherwise.
 605 \end{methoddesc}
 606
 607 \begin{methoddesc}[string]{isupper}{}
 608 Return true if all cased characters in the string are uppercase and
 609 there is at least one cased character, false otherwise.
 610 \end{methoddesc}
 611
 612 \begin{methoddesc}[string]{join}{seq}
 613 Return a string which is the concatenation of the strings in the
 614 sequence \var{seq}.  The separator between elements is the string
 615 providing this method.
 616 \end{methoddesc}
 617
 618 \begin{methoddesc}[string]{ljust}{width}
 619 Return the string left justified in a string of length \var{width}.
 620 Padding is done using spaces.  The original string is returned if
 621 \var{width} is less than \code{len(\var{s})}.
 622 \end{methoddesc}
 623
 624 \begin{methoddesc}[string]{lower}{}
 625 Return a copy of the string converted to lowercase.
 626 \end{methoddesc}
 627
 628 \begin{methoddesc}[string]{lstrip}{\optional{chars}}
 629 Return a copy of the string with leading characters removed.  If
 630 \var{chars} is omitted or \code{None}, whitespace characters are
 631 removed.  If given and not \code{None}, \var{chars} must be a string;
 632 the characters in the string will be stripped from the beginning of
 633 the string this method is called on.
 634 \end{methoddesc}
 635
\begin{methoddesc}[string]{replace}{old, new\optional{, count}}
Return a copy of the string with all occurrences of substring
\var{old} replaced by \var{new}.  If the optional argument
\var{count} is given, only the first \var{count} occurrences are
replaced.
 641 \end{methoddesc}
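
For example, with an arbitrary sample string:

\begin{verbatim}
s = 'a-b-c-d'

every = s.replace('-', '+')          # 'a+b+c+d'
first_two = s.replace('-', '+', 2)   # 'a+b+c-d': optional third argument
original = s                         # 'a-b-c-d': a copy is returned
\end{verbatim}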
 642
\begin{methoddesc}[string]{rfind}{sub\optional{, start\optional{, end}}}
Return the highest index in the string where substring \var{sub} is
found, such that \var{sub} is contained within
\code{\var{s}[\var{start}:\var{end}]}.  Optional arguments \var{start}
and \var{end} are interpreted as in slice notation.  Return \code{-1}
on failure.
 648 \end{methoddesc}
 649
 650 \begin{methoddesc}[string]{rindex}{sub\optional{, start\optional{, end}}}
 651 Like \method{rfind()} but raises \exception{ValueError} when the
 652 substring \var{sub} is not found.
 653 \end{methoddesc}
 654
 655 \begin{methoddesc}[string]{rjust}{width}
 656 Return the string right justified in a string of length \var{width}.
 657 Padding is done using spaces.  The original string is returned if
 658 \var{width} is less than \code{len(\var{s})}.
 659 \end{methoddesc}
 660
 661 \begin{methoddesc}[string]{rstrip}{\optional{chars}}
 662 Return a copy of the string with trailing characters removed.  If
 663 \var{chars} is omitted or \code{None}, whitespace characters are
 664 removed.  If given and not \code{None}, \var{chars} must be a string;
 665 the characters in the string will be stripped from the end of the
 666 string this method is called on.
 667 \end{methoddesc}
 668
\begin{methoddesc}[string]{split}{\optional{sep\optional{, maxsplit}}}
 670 Return a list of the words in the string, using \var{sep} as the
 671 delimiter string.  If \var{maxsplit} is given, at most \var{maxsplit}
 672 splits are done.  If \var{sep} is not specified or \code{None}, any
 673 whitespace string is a separator.
 674 \end{methoddesc}
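
A short sketch of \method{split()}, together with \method{join()} as
its rough inverse.  The sample strings are arbitrary.

\begin{verbatim}
words = 'a,b,c'.split(',')          # ['a', 'b', 'c']
limited = 'a,b,c'.split(',', 1)     # ['a', 'b,c']: at most one split
on_space = '  a  b   c '.split()    # ['a', 'b', 'c']: runs of whitespace
rejoined = ', '.join(words)         # 'a, b, c'
\end{verbatim}

In \method{join()} the separator is the string the method is called
on, and the sequence of strings is the argument.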
 675
 676 \begin{methoddesc}[string]{splitlines}{\optional{keepends}}
 677 Return a list of the lines in the string, breaking at line
 678 boundaries.  Line breaks are not included in the resulting list unless
 679 \var{keepends} is given and true.
 680 \end{methoddesc}
 681
 682 \begin{methoddesc}[string]{startswith}{prefix\optional{,
 683                                        start\optional{, end}}}
Return true if the string starts with the specified \var{prefix},
otherwise return false.  With optional \var{start}, test beginning at
that position.  With optional \var{end}, stop comparing at that
position.
 688 \end{methoddesc}
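
The optional \var{start} and \var{end} arguments of
\method{startswith()} and \method{endswith()} behave like slice
bounds, as this sketch with an arbitrary file name shows.

\begin{verbatim}
name = 'report.txt'

is_text = name.endswith('.txt')            # true
is_report = name.startswith('report')      # true
from_offset = name.startswith('port', 2)   # true: testing begins at index 2
bounded = name.endswith('report', 0, 6)    # true: comparing stops at index 6
wrong = name.startswith('txt')             # false
\end{verbatim}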
 689
 690 \begin{methoddesc}[string]{strip}{\optional{chars}}
 691 Return a copy of the string with leading and trailing characters
 692 removed.  If \var{chars} is omitted or \code{None}, whitespace
 693 characters are removed.  If given and not \code{None}, \var{chars}
 694 must be a string; the characters in the string will be stripped from
both ends of the string this method is called on.
 696 \end{methoddesc}
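
A small sketch of the three stripping methods.  Note that \var{chars}
is treated as a set of characters to remove, not as a prefix or
suffix.

\begin{verbatim}
padded = '  hello  '

both = padded.strip()            # 'hello'
left = padded.lstrip()           # 'hello  '
right = padded.rstrip()          # '  hello'

# Any mix of 'x' and 'y' is removed from each end, in any order.
chars = 'xxhixyx'.strip('xy')    # 'hi'
\end{verbatim}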
 697
 698 \begin{methoddesc}[string]{swapcase}{}
 699 Return a copy of the string with uppercase characters converted to
 700 lowercase and vice versa.
 701 \end{methoddesc}
 702
 703 \begin{methoddesc}[string]{title}{}
 704 Return a titlecased version of the string: words start with uppercase
 705 characters, all remaining cased characters are lowercase.
 706 \end{methoddesc}
 707
 708 \begin{methoddesc}[string]{translate}{table\optional{, deletechars}}
 709 Return a copy of the string where all characters occurring in the
 710 optional argument \var{deletechars} are removed, and the remaining
 711 characters have been mapped through the given translation table, which
 712 must be a string of length 256.
 713 \end{methoddesc}
 714
 715 \begin{methoddesc}[string]{upper}{}
 716 Return a copy of the string converted to uppercase.
 717 \end{methoddesc}
 718
 719 \begin{methoddesc}[string]{zfill}{width}
Return the numeric string left-filled with zeros to form a string
of length \var{width}. The original string is returned if
\var{width} is less than \code{len(\var{s})}.
 723 \end{methoddesc}
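
For instance, assuming two arbitrary numeric strings:

```python
padded = '42'.zfill(5)        # '00042'
unchanged = '12345'.zfill(3)  # '12345': width is less than the length
```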
 724
 725
 726 \subsubsection{String Formatting Operations \label{typesseq-strings}}
 727
 728 \index{formatting, string (\%{})}
 729 \index{interpolation, string (\%{})}
 730 \index{string!formatting}
 731 \index{string!interpolation}
 732 \index{printf-style formatting}
 733 \index{sprintf-style formatting}
 734 \index{\protect\%{} formatting}
 735 \index{\protect\%{} interpolation}
 736
 737 String and Unicode objects have one unique built-in operation: the
 738 \code{\%} operator (modulo).  This is also known as the string
 739 \emph{formatting} or \emph{interpolation} operator.  Given
 740 \code{\var{format} \% \var{values}} (where \var{format} is a string or
 741 Unicode object), \code{\%} conversion specifications in \var{format}
 742 are replaced with zero or more elements of \var{values}.  The effect
is similar to using \cfunction{sprintf()} in the C language.  If
 744 \var{format} is a Unicode object, or if any of the objects being
 745 converted using the \code{\%s} conversion are Unicode objects, the
 746 result will also be a Unicode object.
 747
 748 If \var{format} requires a single argument, \var{values} may be a
 749 single non-tuple object. \footnote{To format only a tuple you
 750 should therefore provide a singleton tuple whose only element
 751 is the tuple to be formatted.}  Otherwise, \var{values} must be a tuple with
 752 exactly the number of items specified by the format string, or a
 753 single mapping object (for example, a dictionary).
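
The rules for \var{values} can be sketched as follows.  The names are illustrative, and the last case shows why a tuple that is itself the value to format has to be wrapped in a singleton tuple.

```python
# One conversion, one non-tuple value.
single = '%s' % 'spam'                  # 'spam'

# Two conversions need a tuple of exactly two items.
pair = '%s and %s' % ('spam', 'eggs')   # 'spam and eggs'

# To format a tuple itself, wrap it in a singleton tuple.
point = (1, 2)
wrapped = '%s' % (point,)               # '(1, 2)'

# Passing the tuple directly supplies two values for one conversion.
try:
    '%s' % point
    failed = False
except TypeError:
    failed = True
```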
 754
 755 A conversion specifier contains two or more characters and has the
 756 following components, which must occur in this order:
 757
 758 \begin{enumerate}
 759   \item  The \character{\%} character, which marks the start of the
 760          specifier.
 761   \item  Mapping key (optional), consisting of a parenthesised sequence
 762          of characters (for example, \code{(somename)}).
 763   \item  Conversion flags (optional), which affect the result of some
 764          conversion types.
 765   \item  Minimum field width (optional).  If specified as an
 766          \character{*} (asterisk), the actual width is read from the
 767          next element of the tuple in \var{values}, and the object to
 768          convert comes after the minimum field width and optional
 769          precision.
 770   \item  Precision (optional), given as a \character{.} (dot) followed
 771          by the precision.  If specified as \character{*} (an
         asterisk), the actual precision is read from the next element of
 773          the tuple in \var{values}, and the value to convert comes after
 774          the precision.
 775   \item  Length modifier (optional).
 776   \item  Conversion type.
 777 \end{enumerate}
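
The components listed above can be seen in a small sketch that writes the same specifier twice, once with a literal width and precision and once with \character{*} for both.  The numbers are arbitrary.

```python
# '%8.2f': minimum field width 8, precision 2, conversion type f.
fixed = '%8.2f' % 3.14159              # '    3.14'

# With '*', width and precision are taken from the tuple, in that
# order, and the value to convert comes after them.
starred = '%*.*f' % (8, 2, 3.14159)    # '    3.14'
```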
 778
 779 When the right argument is a dictionary (or other mapping type), then
 780 the formats in the string \emph{must} include a parenthesised mapping key into
 781 that dictionary inserted immediately after the \character{\%}
 782 character. The mapping key selects the value to be formatted from the
 783 mapping.  For example:
 784
 785 \begin{verbatim}
 786 >>> print '%(language)s has %(#)03d quote types.' % \
 787           {'language': "Python", "#": 2}
 788 Python has 002 quote types.
 789 \end{verbatim}
 790
 791 In this case no \code{*} specifiers may occur in a format (since they
 792 require a sequential parameter list).
 793
 794 The conversion flag characters are:
 795
 796 \begin{tableii}{c|l}{character}{Flag}{Meaning}
 797   \lineii{\#}{The value conversion will use the ``alternate form''
 798               (where defined below).}
 799   \lineii{0}{The conversion will be zero padded.}
  \lineii{-}{The converted value is left adjusted (overrides
             the \character{0} conversion if both are given).}
 802   \lineii{{~}}{(a space) A blank should be left before a positive number
 803              (or empty string) produced by a signed conversion.}
 804   \lineii{+}{A sign character (\character{+} or \character{-}) will
             precede the conversion (overrides a ``space'' flag).}
 806 \end{tableii}
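
The following sketch applies each flag from the table to a small integer.  The trailing \code{|} in two of the formats is only there to make the padding visible.

```python
alternate = '%#x' % 255      # '0xff'   alternate form for hexadecimal
zero_padded = '%05d' % 42    # '00042'  zero padded to width 5
left = '%-5d|' % 42          # '42   |' left adjusted in width 5
left_wins = '%-05d|' % 42    # '42   |' '-' takes precedence over '0'
blank = '% d' % 42           # ' 42'    blank before a positive number
signed = '%+d' % 42          # '+42'    explicit sign
plus_wins = '%+ d' % 42      # '+42'    '+' takes precedence over space
```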
 807
A length modifier (\code{h}, \code{l}, or \code{L}) may be
present, but is ignored as it is not necessary for Python.
 810
 811 The conversion types are:
 812
 813 \begin{tableii}{c|l}{character}{Conversion}{Meaning}
 814   \lineii{d}{Signed integer decimal.}
 815   \lineii{i}{Signed integer decimal.}
 816   \lineii{o}{Unsigned octal.}
 817   \lineii{u}{Unsigned decimal.}
  \lineii{x}{Unsigned hexadecimal (lowercase).}
  \lineii{X}{Unsigned hexadecimal (uppercase).}
 820   \lineii{e}{Floating point exponential format (lowercase).}
 821   \lineii{E}{Floating point exponential format (uppercase).}
 822   \lineii{f}{Floating point decimal format.}
 823   \lineii{F}{Floating point decimal format.}
  \lineii{g}{Same as \character{e} if exponent is less than -4 or
             not less than precision, \character{f} otherwise.}
  \lineii{G}{Same as \character{E} if exponent is less than -4 or
             not less than precision, \character{F} otherwise.}
 828   \lineii{c}{Single character (accepts integer or single character
 829              string).}
  \lineii{r}{String (converts any Python object using
             \function{repr()}).}
  \lineii{s}{String (converts any Python object using
             \function{str()}).}
 834   \lineii{\%}{No argument is converted, results in a \character{\%}
 835               character in the result.  (The complete specification is
 836               \code{\%\%}.)}
 837 \end{tableii}
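
A few of the conversion types in the table are exercised in the sketch below.  The values are arbitrary, and the \code{\%g} cases assume the default precision of 6.

```python
results = [
    '%d' % 42,          # '42'
    '%x' % 255,         # 'ff'
    '%X' % 255,         # 'FF'
    '%e' % 1234.5,      # '1.234500e+03'
    '%f' % 1234.5,      # '1234.500000'
    '%g' % 1234.5,      # '1234.5'       exponent 3: decimal format
    '%g' % 0.00001,     # '1e-05'        exponent -5 is less than -4
    '%g' % 1234567.0,   # '1.23457e+06'  exponent 6 is not less than 6
    '%c' % 65,          # 'A'
    '%r' % 'spam',      # "'spam'"
    '%s' % 'spam',      # 'spam'
    '%d%%' % 50,        # '50%'
]
```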
 838
 839 % XXX Examples?
 840
 841 (The \code{\%r} conversion was added in Python 2.0.)
 842
 843 Since Python strings have an explicit length, \code{\%s} conversions
 844 do not assume that \code{'\e0'} is the end of the string.
 845
 846 For safety reasons, floating point precisions are clipped to 50;
 847 \code{\%f} conversions for numbers whose absolute value is over 1e25
 848 are replaced by \code{\%g} conversions.\footnote{
 849   These numbers are fairly arbitrary.  They are intended to
 850   avoid printing endless strings of meaningless digits without hampering
 851   correct use and without having to know the exact precision of floating
 852   point values on a particular machine.
 853 }  All other errors raise exceptions.
 854
 855 Additional string operations are defined in standard modules
 856 \refmodule{string}\refstmodindex{string} and
 857 \refmodule{re}.\refstmodindex{re}
 858
 859
 860 \subsubsection{XRange Type \label{typesseq-xrange}}
 861
 862 The xrange\obindex{xrange} type is an immutable sequence which is
 863 commonly used for looping.  The advantage of the xrange type is that an
 864 xrange object will always take the same amount of memory, no matter the
 865 size of the range it represents.  There are no consistent performance
 866 advantages.
 867
 868 XRange objects have very little behavior: they only support indexing
 869 and the \function{len()} function.
 870
 871
 872 \subsubsection{Mutable Sequence Types \label{typesseq-mutable}}
 873
 874 List objects support additional operations that allow in-place
 875 modification of the object.
 876 Other mutable sequence types (when added to the language) should
 877 also support these operations.
 878 Strings and tuples are immutable sequence types: such objects cannot
 879 be modified once created.
 880 The following operations are defined on mutable sequence types (where
 881 \var{x} is an arbitrary object):
 882 \indexiii{mutable}{sequence}{types}
 883 \obindex{list}
 884
 885 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
 886   \lineiii{\var{s}[\var{i}] = \var{x}}
 887         {item \var{i} of \var{s} is replaced by \var{x}}{}
 888   \lineiii{\var{s}[\var{i}:\var{j}] = \var{t}}
 889         {slice of \var{s} from \var{i} to \var{j} is replaced by \var{t}}{}
 890   \lineiii{del \var{s}[\var{i}:\var{j}]}
 891         {same as \code{\var{s}[\var{i}:\var{j}] = []}}{}
 892   \lineiii{\var{s}.append(\var{x})}
 893         {same as \code{\var{s}[len(\var{s}):len(\var{s})] = [\var{x}]}}{(1)}
 894   \lineiii{\var{s}.extend(\var{x})}
 895         {same as \code{\var{s}[len(\var{s}):len(\var{s})] = \var{x}}}{(2)}
 896   \lineiii{\var{s}.count(\var{x})}
 897     {return number of \var{i}'s for which \code{\var{s}[\var{i}] == \var{x}}}{}
 898   \lineiii{\var{s}.index(\var{x})}
 899     {return smallest \var{i} such that \code{\var{s}[\var{i}] == \var{x}}}{(3)}
 900   \lineiii{\var{s}.insert(\var{i}, \var{x})}
 901         {same as \code{\var{s}[\var{i}:\var{i}] = [\var{x}]}
 902           if \code{\var{i} >= 0}}{(4)}
 903   \lineiii{\var{s}.pop(\optional{\var{i}})}
 904     {same as \code{\var{x} = \var{s}[\var{i}]; del \var{s}[\var{i}]; return \var{x}}}{(5)}
 905   \lineiii{\var{s}.remove(\var{x})}
 906         {same as \code{del \var{s}[\var{s}.index(\var{x})]}}{(3)}
 907   \lineiii{\var{s}.reverse()}
 908         {reverses the items of \var{s} in place}{(6)}
 909   \lineiii{\var{s}.sort(\optional{\var{cmpfunc}})}
 910         {sort the items of \var{s} in place}{(6), (7), (8)}
 911 \end{tableiii}
 912 \indexiv{operations on}{mutable}{sequence}{types}
 913 \indexiii{operations on}{sequence}{types}
 914 \indexiii{operations on}{list}{type}
 915 \indexii{subscript}{assignment}
 916 \indexii{slice}{assignment}
 917 \stindex{del}
 918 \withsubitem{(list method)}{
 919   \ttindex{append()}\ttindex{extend()}\ttindex{count()}\ttindex{index()}
 920   \ttindex{insert()}\ttindex{pop()}\ttindex{remove()}\ttindex{reverse()}
 921   \ttindex{sort()}}
 922 \noindent
 923 Notes:
 924 \begin{description}
 925 \item[(1)] The C implementation of Python historically accepted
  multiple parameters and implicitly joined them into a tuple;
  use of this misfeature has been deprecated since Python 1.4,
 928   and became an error with the introduction of Python 2.0.
 929
 930 \item[(2)] Raises an exception when \var{x} is not a list object.  The
 931   \method{extend()} method is experimental and not supported by
 932   mutable sequence types other than lists.
 933
 934 \item[(3)] Raises \exception{ValueError} when \var{x} is not found in
 935   \var{s}.
 936
 937 \item[(4)] When a negative index is passed as the first parameter to
 938   the \method{insert()} method, the new element is prepended to the
 939   sequence.
 940
 941 \item[(5)] The \method{pop()} method is only supported by the list and
 942   array types.  The optional argument \var{i} defaults to \code{-1},
 943   so that by default the last item is removed and returned.
 944
 945 \item[(6)] The \method{sort()} and \method{reverse()} methods modify the
 946   list in place for economy of space when sorting or reversing a large
 947   list.  To remind you that they operate by side effect, they don't return
 948   the sorted or reversed list.
 949
 950 \item[(7)] The \method{sort()} method takes an optional argument
 951   specifying a comparison function of two arguments (list items) which
 952   should return a negative, zero or positive number depending on whether
 953   the first argument is considered smaller than, equal to, or larger
 954   than the second argument.  Note that this slows the sorting process
 955   down considerably; e.g. to sort a list in reverse order it is much
 956   faster to call method \method{sort()} followed by
 957   \method{reverse()} than to use method
 958   \method{sort()} with a comparison function that reverses the
 959   ordering of the elements.
 960
 961 \item[(8)] Whether the \method{sort()} method is stable is not defined by
 962   the language (a sort is stable if it guarantees not to change the
 963   relative order of elements that compare equal).  In the C
 964   implementation of Python, sorts were stable only by accident through
 965   Python 2.2.  The C implementation of Python 2.3 introduced a stable
 966   \method{sort()} method, but code that intends to be portable across
 967   implementations and versions must not rely on stability.
 968 \end{description}
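
The operations in the table can be traced on a small list.  This is only a sketch with arbitrary values; the comments show the list after each step.

```python
s = [3, 1, 2]
s.append(4)              # [3, 1, 2, 4]
s.extend([5, 6])         # [3, 1, 2, 4, 5, 6]
s[1:3] = [10, 20, 30]    # [3, 10, 20, 30, 4, 5, 6]
del s[1:4]               # [3, 4, 5, 6]
last = s.pop()           # returns 6, leaving [3, 4, 5]
first = s.pop(0)         # returns 3, leaving [4, 5]
s.insert(0, 9)           # [9, 4, 5]
s.remove(4)              # [9, 5]
where = s.index(5)       # 1

# sort() and reverse() work in place and return None (note 6).
returned = s.sort()      # s is now [5, 9]
s.reverse()              # s is now [9, 5]

# remove() and index() raise ValueError for a missing item (note 3).
try:
    s.remove(42)
    missing_raised = False
except ValueError:
    missing_raised = True
```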
 969
 970
 971 \subsection{Mapping Types \label{typesmapping}}
 972 \obindex{mapping}
 973 \obindex{dictionary}
 974
A \dfn{mapping} object maps immutable values to
 976 arbitrary objects.  Mappings are mutable objects.  There is currently
 977 only one standard mapping type, the \dfn{dictionary}.  A dictionary's keys are
 978 almost arbitrary values.  Only values containing lists, dictionaries
 979 or other mutable types (that are compared by value rather than by
 980 object identity) may not be used as keys.
 981 Numeric types used for keys obey the normal rules for numeric
 982 comparison: if two numbers compare equal (e.g. \code{1} and
 983 \code{1.0}) then they can be used interchangeably to index the same
 984 dictionary entry.
 985
 986 Dictionaries are created by placing a comma-separated list of
 987 \code{\var{key}: \var{value}} pairs within braces, for example:
 988 \code{\{'jack': 4098, 'sjoerd': 4127\}} or
 989 \code{\{4098: 'jack', 4127: 'sjoerd'\}}.
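
The rules for keys described above can be sketched briefly.  The dictionary contents are made up for illustration.

```python
d = {1: 'one'}

# 1 and 1.0 compare equal, so they index the same entry.
same_entry = d[1.0]        # 'one'
d[1.0] = 'uno'             # replaces the entry; still a single key

# A list is mutable and compared by value, so it is rejected as a key.
try:
    d[[1, 2]] = 'list key'
    rejected = False
except TypeError:
    rejected = True

# A tuple of immutable values is acceptable.
d[(1, 2)] = 'tuple key'
```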
 990
 991 The following operations are defined on mappings (where \var{a} and
 992 \var{b} are mappings, \var{k} is a key, and \var{v} and \var{x} are
 993 arbitrary objects):
 994 \indexiii{operations on}{mapping}{types}
 995 \indexiii{operations on}{dictionary}{type}
 996 \stindex{del}
 997 \bifuncindex{len}
 998 \withsubitem{(dictionary method)}{
 999   \ttindex{clear()}
1000   \ttindex{copy()}
1001   \ttindex{has_key()}
1002   \ttindex{items()}
1003   \ttindex{keys()}
1004   \ttindex{update()}
1005   \ttindex{values()}
1006   \ttindex{get()}}
1007
1008 \begin{tableiii}{c|l|c}{code}{Operation}{Result}{Notes}
1009   \lineiii{len(\var{a})}{the number of items in \var{a}}{}
1010   \lineiii{\var{a}[\var{k}]}{the item of \var{a} with key \var{k}}{(1)}
1011   \lineiii{\var{a}[\var{k}] = \var{v}}
1012           {set \code{\var{a}[\var{k}]} to \var{v}}
1013           {}
1014   \lineiii{del \var{a}[\var{k}]}
1015           {remove \code{\var{a}[\var{k}]} from \var{a}}
1016           {(1)}
  \lineiii{\var{a}.clear()}{remove all items from \var{a}}{}
  \lineiii{\var{a}.copy()}{a (shallow) copy of \var{a}}{}
1019   \lineiii{\var{a}.has_key(\var{k})}
1020           {\code{1} if \var{a} has a key \var{k}, else \code{0}}
1021           {}
1022   \lineiii{\var{k} \code{in} \var{a}}
1023           {Equivalent to \var{a}.has_key(\var{k})}
1024           {(2)}
1025   \lineiii{\var{k} not in \var{a}}
1026           {Equivalent to \code{not} \var{a}.has_key(\var{k})}
1027           {(2)}
1028   \lineiii{\var{a}.items()}
1029           {a copy of \var{a}'s list of (\var{key}, \var{value}) pairs}
1030           {(3)}
1031   \lineiii{\var{a}.keys()}{a copy of \var{a}'s list of keys}{(3)}
1032   \lineiii{\var{a}.update(\var{b})}
1033           {\code{for k in \var{b}.keys(): \var{a}[k] = \var{b}[k]}}
1034           {}
1035   \lineiii{\var{a}.values()}{a copy of \var{a}'s list of values}{(3)}
1036   \lineiii{\var{a}.get(\var{k}\optional{, \var{x}})}
1037           {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
1038            else \var{x}}
1039           {(4)}
1040   \lineiii{\var{a}.setdefault(\var{k}\optional{, \var{x}})}
1041           {\code{\var{a}[\var{k}]} if \code{\var{k} in \var{a}},
1042            else \var{x} (also setting it)}
1043           {(5)}
1044   \lineiii{\var{a}.pop(\var{k})}
1045           {remove specified \var{key} and return corresponding \var{value}}
1046           {}
1047   \lineiii{\var{a}.popitem()}
1048           {remove and return an arbitrary (\var{key}, \var{value}) pair}
1049           {(6)}
1050   \lineiii{\var{a}.iteritems()}
1051           {return an iterator over (\var{key}, \var{value}) pairs}
1052           {(2), (3)}
1053   \lineiii{\var{a}.iterkeys()}
1054           {return an iterator over the mapping's keys}
1055           {(2), (3)}
1056   \lineiii{\var{a}.itervalues()}
1057           {return an iterator over the mapping's values}
1058           {(2), (3)}
1059 \end{tableiii}
1060
1061 \noindent
1062 Notes:
1063 \begin{description}
1064 \item[(1)] Raises a \exception{KeyError} exception if \var{k} is not
1065 in the map.
1066
1067 \item[(2)] \versionadded{2.2}
1068
\item[(3)] Keys and values are listed in an arbitrary order.  If
1070 \method{items()}, \method{keys()}, \method{values()},
1071 \method{iteritems()}, \method{iterkeys()}, and \method{itervalues()}
1072 are called with no intervening modifications to the dictionary, the
1073 lists will directly correspond.  This allows the creation of
1074 \code{(\var{value}, \var{key})} pairs using \function{zip()}:
1075 \samp{pairs = zip(\var{a}.values(), \var{a}.keys())}.  The same
1076 relationship holds for the \method{iterkeys()} and
1077 \method{itervalues()} methods: \samp{pairs = zip(\var{a}.itervalues(),
1078 \var{a}.iterkeys())} provides the same value for \code{pairs}.
1079 Another way to create the same list is \samp{pairs = [(v, k) for (k,
1080 v) in \var{a}.iteritems()]}.
1081
\item[(4)] Never raises an exception if \var{k} is not in the map;
instead it returns \var{x}.  \var{x} is optional; when \var{x} is not
provided and \var{k} is not in the map, \code{None} is returned.
1085
1086 \item[(5)] \function{setdefault()} is like \function{get()}, except
1087 that if \var{k} is missing, \var{x} is both returned and inserted into
1088 the dictionary as the value of \var{k}.
1089
1090 \item[(6)] \function{popitem()} is useful to destructively iterate
1091 over a dictionary, as often used in set algorithms.
1092 \end{description}
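
Several of the operations and notes above are sketched below on a small dictionary.  The keys and values are illustrative, and the lists are sorted only to make the results predictable.

```python
a = {'jack': 4098, 'sjoerd': 4127}

missing = a.get('guido')               # None; a is unchanged (note 4)
fallback = a.get('guido', 0)           # 0; a is still unchanged
created = a.setdefault('guido', 4127)  # 4127, and now stored (note 5)

a.update({'jack': 1, 'irv': 2})        # overwrite 'jack', add 'irv'
removed = a.pop('irv')                 # 2

# Indexing a missing key raises KeyError (note 1).
try:
    a['nobody']
    key_error = False
except KeyError:
    key_error = True

# values() and keys() correspond item by item (note 3).
pairs = list(zip(a.values(), a.keys()))
pairs.sort()

# popitem() supports destructive iteration (note 6).
seen = []
while a:
    seen.append(a.popitem())
seen.sort()
```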
1093
1094
1095 \subsection{File Objects
1096             \label{bltin-file-objects}}
1097
1098 File objects\obindex{file} are implemented using C's \code{stdio}
1099 package and can be created with the built-in constructor
1100 \function{file()}\bifuncindex{file} described in section
1101 \ref{built-in-funcs}, ``Built-in Functions.''\footnote{\function{file()}
1102 is new in Python 2.2.  The older built-in \function{open()} is an
1103 alias for \function{file()}.}
1104 File objects are also returned
1105 by some other built-in functions and methods, such as
1106 \function{os.popen()} and \function{os.fdopen()} and the
1107 \method{makefile()} method of socket objects.
1108 \refstmodindex{os}
1109 \refbimodindex{socket}
1110
1111 When a file operation fails for an I/O-related reason, the exception
1112 \exception{IOError} is raised.  This includes situations where the
1113 operation is not defined for some reason, like \method{seek()} on a tty
1114 device or writing a file opened for reading.
1115
1116 Files have the following methods:
1117
1118
1119 \begin{methoddesc}[file]{close}{}
1120   Close the file.  A closed file cannot be read or written any more.
1121   Any operation which requires that the file be open will raise a
1122   \exception{ValueError} after the file has been closed.  Calling
1123   \method{close()} more than once is allowed.
1124 \end{methoddesc}
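
A minimal sketch of this behavior follows.  It uses \function{open()} and a scratch file in a temporary directory; the file name is hypothetical.

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')   # hypothetical scratch file

f = open(path, 'w')
f.write('data\n')
f.close()
f.close()                # calling close() again is allowed

# Any operation that needs an open file now raises ValueError.
try:
    f.write('more')
    raised = False
except ValueError:
    raised = True

is_closed = f.closed     # true once close() has been called

os.remove(path)
os.rmdir(tmpdir)
```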
1125
1126 \begin{methoddesc}[file]{flush}{}
1127   Flush the internal buffer, like \code{stdio}'s
1128   \cfunction{fflush()}.  This may be a no-op on some file-like
1129   objects.
1130 \end{methoddesc}
1131
1132 \begin{methoddesc}[file]{fileno}{}
1133   \index{file descriptor}
1134   \index{descriptor, file}
1135   Return the integer ``file descriptor'' that is used by the
1136   underlying implementation to request I/O operations from the
1137   operating system.  This can be useful for other, lower level
1138   interfaces that use file descriptors, such as the
1139   \refmodule{fcntl}\refbimodindex{fcntl} module or
1140   \function{os.read()} and friends.  \note{File-like objects
1141   which do not have a real file descriptor should \emph{not} provide
1142   this method!}
1143 \end{methoddesc}
1144
1145 \begin{methoddesc}[file]{isatty}{}
1146   Return \code{True} if the file is connected to a tty(-like) device, else
1147   \code{False}.  \note{If a file-like object is not associated
1148   with a real file, this method should \emph{not} be implemented.}
1149 \end{methoddesc}
1150
1151 \begin{methoddesc}[file]{next}{}
1152 A file object is its own iterator, i.e. \code{iter(\var{f})} returns
1153 \var{f} (unless \var{f} is closed).  When a file is used as an
1154 iterator, typically in a \keyword{for} loop (for example,
1155 \code{for line in f: print line}), the \method{next()} method is
1156 called repeatedly.  This method returns the next input line, or raises
1157 \exception{StopIteration} when \EOF{} is hit.  In order to make a
1158 \keyword{for} loop the most efficient way of looping over the lines of
1159 a file (a very common operation), the \method{next()} method uses a
1160 hidden read-ahead buffer.  As a consequence of using a read-ahead
1161 buffer, combining \method{next()} with other file methods (like
\method{readline()}) does not work reliably.  However, using
1163 \method{seek()} to reposition the file to an absolute position will
1164 flush the read-ahead buffer.
1165 \versionadded{2.3}
1166 \end{methoddesc}
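
Iteration over a file can be sketched as follows, assuming a scratch file with three short lines.  The file name and contents are made up.

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'lines.txt')  # hypothetical scratch file

out = open(path, 'w')
out.write('one\ntwo\nthree\n')
out.close()

f = open(path)
own_iterator = iter(f) is f    # a file object is its own iterator

# The for loop fetches one line per iteration until EOF.
lines = []
for line in f:
    lines.append(line)
f.close()

os.remove(path)
os.rmdir(tmpdir)
```

Each item keeps its trailing newline, just as with \method{readline()}.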
1167
1168 \begin{methoddesc}[file]{read}{\optional{size}}
1169   Read at most \var{size} bytes from the file (less if the read hits
1170   \EOF{} before obtaining \var{size} bytes).  If the \var{size}
1171   argument is negative or omitted, read all data until \EOF{} is
1172   reached.  The bytes are returned as a string object.  An empty
1173   string is returned when \EOF{} is encountered immediately.  (For
1174   certain files, like ttys, it makes sense to continue reading after
1175   an \EOF{} is hit.)  Note that this method may call the underlying
1176   C function \cfunction{fread()} more than once in an effort to
1177   acquire as close to \var{size} bytes as possible.
1178 \end{methoddesc}
1179
1180 \begin{methoddesc}[file]{readline}{\optional{size}}
1181   Read one entire line from the file.  A trailing newline character is
1182   kept in the string\footnote{
1183         The advantage of leaving the newline on is that
1184         returning an empty string is then an unambiguous \EOF{}
1185         indication.  It is also possible (in cases where it might
1186         matter, for example, if you
1187         want to make an exact copy of a file while scanning its lines)
1188         to tell whether the last line of a file ended in a newline
1189         or not (yes this happens!).
1190   } (but may be absent when a file ends with an
1191   incomplete line).  If the \var{size} argument is present and
1192   non-negative, it is a maximum byte count (including the trailing
1193   newline) and an incomplete line may be returned.
1194   An empty string is returned \emph{only} when \EOF{} is encountered
1195   immediately.  \note{Unlike \code{stdio}'s \cfunction{fgets()}, the
1196   returned string contains null characters (\code{'\e 0'}) if they
1197   occurred in the input.}
1198 \end{methoddesc}
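
The sketch below reads a scratch file whose last line has no trailing newline, to show the \var{size} limit, the incomplete final line, and the empty string at \EOF.  File name and contents are illustrative.

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')   # hypothetical scratch file

out = open(path, 'w')
out.write('first\nsecond')                # no newline after 'second'
out.close()

f = open(path)
partial = f.readline(3)    # 'fir'     size limits the count
rest = f.readline()        # 'st\n'    remainder of the first line
incomplete = f.readline()  # 'second'  file ends without a newline
at_eof = f.readline()      # ''        empty string only at EOF
f.close()

os.remove(path)
os.rmdir(tmpdir)
```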
1199
1200 \begin{methoddesc}[file]{readlines}{\optional{sizehint}}
1201   Read until \EOF{} using \method{readline()} and return a list containing
1202   the lines thus read.  If the optional \var{sizehint} argument is
1203   present, instead of reading up to \EOF, whole lines totalling
1204   approximately \var{sizehint} bytes (possibly after rounding up to an
1205   internal buffer size) are read.  Objects implementing a file-like
1206   interface may choose to ignore \var{sizehint} if it cannot be
1207   implemented, or cannot be implemented efficiently.
1208 \end{methoddesc}
1209
1210 \begin{methoddesc}[file]{xreadlines}{}
1211   This method returns the same thing as \code{iter(f)}.
1212   \versionadded{2.1}
1213   \deprecated{2.3}{Use \code{for line in file} instead.}
1214 \end{methoddesc}
1215
1216 \begin{methoddesc}[file]{seek}{offset\optional{, whence}}
1217   Set the file's current position, like \code{stdio}'s \cfunction{fseek()}.
1218   The \var{whence} argument is optional and defaults to \code{0}
1219   (absolute file positioning); other values are \code{1} (seek
1220   relative to the current position) and \code{2} (seek relative to the
1221   file's end).  There is no return value.  Note that if the file is
1222   opened for appending (mode \code{'a'} or \code{'a+'}), any
1223   \method{seek()} operations will be undone at the next write.  If the
1224   file is only opened for writing in append mode (mode \code{'a'}),
1225   this method is essentially a no-op, but it remains useful for files
1226   opened in append mode with reading enabled (mode \code{'a+'}).
1227 \end{methoddesc}
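
A short sketch of absolute and end-relative positioning, together with \method{tell()}, on a scratch file holding eleven plain ASCII characters.  The file name is hypothetical.

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')   # hypothetical scratch file

out = open(path, 'w')
out.write('hello world')
out.close()

f = open(path)
f.seek(0, 2)           # whence 2: relative to the file's end
size = f.tell()        # 11
f.seek(6)              # whence defaults to 0: absolute position
tail = f.read()        # 'world'
f.seek(0)
head = f.read(5)       # 'hello'
position = f.tell()    # 5
f.close()

os.remove(path)
os.rmdir(tmpdir)
```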
1228
1229 \begin{methoddesc}[file]{tell}{}
1230   Return the file's current position, like \code{stdio}'s
1231   \cfunction{ftell()}.
1232 \end{methoddesc}
1233
1234 \begin{methoddesc}[file]{truncate}{\optional{size}}
1235   Truncate the file's size.  If the optional \var{size} argument is
1236   present, the file is truncated to (at most) that size.  The size
1237   defaults to the current position.  The current file position is
1238   not changed.  Note that if a specified size exceeds the file's
1239   current size, the result is platform-dependent:  possibilities
  include that the file may remain unchanged, increase to the specified
1241   size as if zero-filled, or increase to the specified size with
1242   undefined new content.
1243   Availability:  Windows, many \UNIX variants.
1244 \end{methoddesc}
1245
1246 \begin{methoddesc}[file]{write}{str}
1247   Write a string to the file.  There is no return value.  Due to
1248   buffering, the string may not actually show up in the file until
1249   the \method{flush()} or \method{close()} method is called.
1250 \end{methoddesc}
1251
1252 \begin{methoddesc}[file]{writelines}{sequence}
1253   Write a sequence of strings to the file.  The sequence can be any
1254   iterable object producing strings, typically a list of strings.
1255   There is no return value.
1256   (The name is intended to match \method{readlines()};
1257   \method{writelines()} does not add line separators.)
1258 \end{methoddesc}
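
As a sketch of the point about line separators, the following writes two sequences of strings and reads the result back.  The file name and strings are made up.

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')   # hypothetical scratch file

out = open(path, 'w')
out.writelines(['one\n', 'two\n'])
# Nothing is added between items: these three become a single line.
out.writelines(['no ', 'separator', '\n'])
out.close()

f = open(path)
content = f.read()
f.seek(0)
lines = f.readlines()
f.close()

os.remove(path)
os.rmdir(tmpdir)
```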
1259
1260
1261 Files support the iterator protocol.  Each iteration returns the same
1262 result as \code{\var{file}.readline()}, and iteration ends when the
1263 \method{readline()} method returns an empty string.
1264
1265
1266 File objects also offer a number of other interesting attributes.
1267 These are not required for file-like objects, but should be
1268 implemented if they make sense for the particular object.
1269
1270 \begin{memberdesc}[file]{closed}
1271 bool indicating the current state of the file object.  This is a
1272 read-only attribute; the \method{close()} method changes the value.
1273 It may not be available on all file-like objects.
1274 \end{memberdesc}
1275
1276 \begin{memberdesc}[file]{mode}
1277 The I/O mode for the file.  If the file was created using the
1278 \function{open()} built-in function, this will be the value of the
1279 \var{mode} parameter.  This is a read-only attribute and may not be
1280 present on all file-like objects.
1281 \end{memberdesc}
1282
1283 \begin{memberdesc}[file]{name}
1284 If the file object was created using \function{open()}, the name of
1285 the file.  Otherwise, some string that indicates the source of the
1286 file object, of the form \samp{<\mbox{\ldots}>}.  This is a read-only
1287 attribute and may not be present on all file-like objects.
1288 \end{memberdesc}
1289
1290 \begin{memberdesc}[file]{softspace}
1291 Boolean that indicates whether a space character needs to be printed
1292 before another value when using the \keyword{print} statement.
1293 Classes that are trying to simulate a file object should also have a
1294 writable \member{softspace} attribute, which should be initialized to
1295 zero.  This will be automatic for most classes implemented in Python
1296 (care may be needed for objects that override attribute access); types
1297 implemented in C will have to provide a writable
1298 \member{softspace} attribute.
1299 \note{This attribute is not used to control the
1300 \keyword{print} statement, but to allow the implementation of
1301 \keyword{print} to keep track of its internal state.}
1302 \end{memberdesc}
1303
1304
1305 \subsection{Other Built-in Types \label{typesother}}
1306
1307 The interpreter supports several other kinds of objects.
1308 Most of these support only one or two operations.
1309
1310
1311 \subsubsection{Modules \label{typesmodules}}
1312
1313 The only special operation on a module is attribute access:
1314 \code{\var{m}.\var{name}}, where \var{m} is a module and \var{name}
1315 accesses a name defined in \var{m}'s symbol table.  Module attributes
1316 can be assigned to.  (Note that the \keyword{import} statement is not,
1317 strictly speaking, an operation on a module object; \code{import
1318 \var{foo}} does not require a module object named \var{foo} to exist,
1319 rather it requires an (external) \emph{definition} for a module named
1320 \var{foo} somewhere.)
1321
1322 A special member of every module is \member{__dict__}.
1323 This is the dictionary containing the module's symbol table.
1324 Modifying this dictionary will actually change the module's symbol
1325 table, but direct assignment to the \member{__dict__} attribute is not
1326 possible (you can write \code{\var{m}.__dict__['a'] = 1}, which
1327 defines \code{\var{m}.a} to be \code{1}, but you can't write
\code{\var{m}.__dict__ = \{\}}).
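
The relationship between a module and its \member{__dict__} can be sketched with a freshly created module object.  The module name \code{demo} is hypothetical, and the exact exception type raised by the rejected assignment is treated here as an assumption, so both likely candidates are caught.

```python
import types

m = types.ModuleType('demo')     # 'demo' is a made-up module name

# Writing into the dictionary defines a module attribute...
m.__dict__['a'] = 1
via_attribute = m.a              # 1

# ...and assigning an attribute shows up in the dictionary.
m.b = 2
via_dict = m.__dict__['b']       # 2

# Rebinding __dict__ itself is refused.
try:
    m.__dict__ = {}
    replaced = True
except (TypeError, AttributeError):
    replaced = False
```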
1329
1330 Modules built into the interpreter are written like this:
1331 \code{<module 'sys' (built-in)>}.  If loaded from a file, they are
1332 written as \code{<module 'os' from
1333 '/usr/local/lib/python\shortversion/os.pyc'>}.
1334
1335
1336 \subsubsection{Classes and Class Instances \label{typesobjects}}
1337 \nodename{Classes and Instances}
1338
1339 See chapters 3 and 7 of the \citetitle[../ref/ref.html]{Python
1340 Reference Manual} for these.
1341
1342
1343 \subsubsection{Functions \label{typesfunctions}}
1344
1345 Function objects are created by function definitions.  The only
1346 operation on a function object is to call it:
1347 \code{\var{func}(\var{argument-list})}.
1348
1349 There are really two flavors of function objects: built-in functions
1350 and user-defined functions.  Both support the same operation (to call
1351 the function), but the implementation is different, hence the
1352 different object types.
1353
1354 The implementation adds two special read-only attributes:
1355 \code{\var{f}.func_code} is a function's \dfn{code
1356 object}\obindex{code} (see below) and \code{\var{f}.func_globals} is
1357 the dictionary used as the function's global namespace (this is the
1358 same as \code{\var{m}.__dict__} where \var{m} is the module in which
1359 the function \var{f} was defined).
1360
1361 Function objects also support getting and setting arbitrary
attributes, which can be used, for example, to attach metadata to functions.
1363 Regular attribute dot-notation is used to get and set such
1364 attributes. \emph{Note that the current implementation only supports
1365 function attributes on user-defined functions.  Function attributes on
1366 built-in functions may be supported in the future.}
1367
1368 Functions have another special attribute \code{\var{f}.__dict__}
1369 (a.k.a. \code{\var{f}.func_dict}) which contains the namespace used to
1370 support function attributes.  \code{__dict__} and \code{func_dict} can
1371 be accessed directly or set to a dictionary object.  A function's
1372 dictionary cannot be deleted.
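
Function attributes and the function's \code{__dict__} can be sketched as follows.  The function and attribute names are illustrative, and the exception type raised on deletion is treated as an assumption, so two candidates are caught.

```python
def greet():
    return 'hello'

# Arbitrary attributes are stored in the function's __dict__.
greet.author = 'someone'
stored = greet.__dict__['author']      # 'someone'

# Writing to the dictionary is visible through attribute access.
greet.__dict__['version'] = 1
version = greet.version                # 1

# __dict__ may be set to another dictionary...
greet.__dict__ = {'fresh': True}
has_old = hasattr(greet, 'author')     # False

# ...but it cannot be deleted.
try:
    del greet.__dict__
    deleted = True
except (TypeError, AttributeError):
    deleted = False
```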
1373
1374 \subsubsection{Methods \label{typesmethods}}
1375 \obindex{method}
1376
1377 Methods are functions that are called using the attribute notation.
1378 There are two flavors: built-in methods (such as \method{append()} on
1379 lists) and class instance methods.  Built-in methods are described
1380 with the types that support them.
1381
1382 The implementation adds two special read-only attributes to class
1383 instance methods: \code{\var{m}.im_self} is the object on which the
1384 method operates, and \code{\var{m}.im_func} is the function
1385 implementing the method.  Calling \code{\var{m}(\var{arg-1},
1386 \var{arg-2}, \textrm{\ldots}, \var{arg-n})} is completely equivalent to
1387 calling \code{\var{m}.im_func(\var{m}.im_self, \var{arg-1},
1388 \var{arg-2}, \textrm{\ldots}, \var{arg-n})}.
1389
1390 Class instance methods are either \emph{bound} or \emph{unbound},
1391 referring to whether the method was accessed through an instance or a
1392 class, respectively.  When a method is unbound, its \code{im_self}
1393 attribute is \code{None} and, if it is called, an explicit \code{self}
1394 object must be passed as the first argument.  In this case,
1395 \code{self} must be an instance of the unbound method's class (or a
1396 subclass of that class), otherwise a \code{TypeError} is raised.
1397
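The two ways of reaching a method can be sketched as follows.  The
class \class{Doubler} and its method \method{twice()} are hypothetical
names chosen for this example, which shows only the calling
convention, not the \code{im_self} attribute itself.

\begin{verbatim}
class Doubler:
    def twice(self, value):
        return value * 2

d = Doubler()

# Accessed through the instance: the instance is supplied implicitly.
bound_result = d.twice(21)

# Accessed through the class: the instance must be passed explicitly
# as the first argument.
unbound_result = Doubler.twice(d, 21)
\end{verbatim}
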
1398 Like function objects, method objects support getting
1399 arbitrary attributes.  However, since method attributes are actually
1400 stored on the underlying function object (\code{meth.im_func}),
1401 setting method attributes on either bound or unbound methods is
1402 disallowed.  Attempting to set a method attribute results in a
1403 \code{TypeError} being raised.  In order to set a method attribute,
1404 you need to explicitly set it on the underlying function object:
1405
1406 \begin{verbatim}
1407 class C:
1408     def method(self):
1409         pass
1410
1411 c = C()
1412 c.method.im_func.whoami = 'my name is c'
1413 \end{verbatim}
1414
1415 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1416 information.
1417
1418
1419 \subsubsection{Code Objects \label{bltin-code-objects}}
1420 \obindex{code}
1421
1422 Code objects are used by the implementation to represent
1423 ``pseudo-compiled'' executable Python code such as a function body.
1424 They differ from function objects because they don't contain a
1425 reference to their global execution environment.  Code objects are
1426 returned by the built-in \function{compile()} function and can be
1427 extracted from function objects through their \member{func_code}
1428 attribute.
1429 \bifuncindex{compile}
1430 \withsubitem{(function object attribute)}{\ttindex{func_code}}
1431
1432 A code object can be executed or evaluated by passing it (instead of a
1433 source string) to the \keyword{exec} statement or the built-in
1434 \function{eval()} function.
1435 \stindex{exec}
1436 \bifuncindex{eval}
1437
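A minimal sketch of evaluating a code object follows.  The expression,
the file name \code{'<example>'} and the variable name \code{x} are
arbitrary choices made for illustration.  Because a code object
carries no global environment of its own, the namespace is supplied
when the code is evaluated.

\begin{verbatim}
# Compile an expression once; 'eval' mode yields a code object
# that can be passed to eval().
code = compile('x * 7', '<example>', 'eval')

# The same code object can be evaluated in different namespaces.
first = eval(code, {'x': 6})
second = eval(code, {'x': 10})
\end{verbatim}
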
1438 See the \citetitle[../ref/ref.html]{Python Reference Manual} for more
1439 information.
1440
1441
1442 \subsubsection{Type Objects \label{bltin-type-objects}}
1443
1444 Type objects represent the various object types.  An object's type is
1445 accessed by the built-in function \function{type()}.  There are no special
1446 operations on types.  The standard module \module{types} defines names
1447 for all standard built-in types.
1448 \bifuncindex{type}
1449 \refstmodindex{types}
1450
1451 Types are written like this: \code{<type 'int'>}.
1452
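For illustration, the sketch below obtains type objects with
\function{type()} and compares them by identity.  The function
\function{f()} is a placeholder defined only for this example, and
\code{FunctionType} is one of the names provided by the
\module{types} module.

\begin{verbatim}
import types

def f():
    pass

# Objects of the same type share a single type object.
same = type(1) is type(2)

# Objects of different types have different type objects.
different = type(1) is not type('one')

# The types module provides names for the standard types.
is_function = type(f) is types.FunctionType
\end{verbatim}
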
1453
1454 \subsubsection{The Null Object \label{bltin-null-object}}
1455
1456 This object is returned by functions that don't explicitly return a
1457 value.  It supports no special operations.  There is exactly one null
1458 object, named \code{None} (a built-in name).
1459
1460 It is written as \code{None}.
1461
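A small sketch of the behaviour described above; the two function
names are invented for the example.

\begin{verbatim}
def no_return():
    pass

def bare_return():
    return

# Neither function returns an explicit value, so both calls
# yield the one null object.
first = no_return()
second = bare_return()
\end{verbatim}
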
1462
1463 \subsubsection{The Ellipsis Object \label{bltin-ellipsis-object}}
1464
1465 This object is used by extended slice notation (see the
1466 \citetitle[../ref/ref.html]{Python Reference Manual}).  It supports no
1467 special operations.  There is exactly one ellipsis object, named
1468 \constant{Ellipsis} (a built-in name).
1469
1470 It is written as \code{Ellipsis}.
1471
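To see the ellipsis object arrive through extended slice notation, a
class can simply hand back whatever index it receives.  The class
\class{Probe} below is a hypothetical helper written for this sketch
and is not part of the library.

\begin{verbatim}
class Probe:
    # Return the index object produced by the subscript unchanged.
    def __getitem__(self, index):
        return index

p = Probe()

# Inside a subscript, "..." is delivered as the Ellipsis object.
index = p[..., 0]
found = index[0] is Ellipsis
\end{verbatim}
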
1472 \subsubsection{Boolean Values}
1473
1474 Boolean values are the two constant objects \code{False} and
1475 \code{True}.  They are used to represent truth values (although other
1476 values can also be considered false or true).  In numeric contexts
1477 (for example when used as the argument to an arithmetic operator),
1478 they behave like the integers 0 and 1, respectively.  The built-in
1479 function \function{bool()} can be used to cast any value to a Boolean,
1480 if the value can be interpreted as a truth value (see
1481 section~\ref{truth}, ``Truth Value Testing,'' above).
1482
1483 They are written as \code{False} and \code{True}, respectively.
1484 \index{False}
1485 \index{True}
1486 \indexii{Boolean}{values}
1487
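The numeric behaviour and the \function{bool()} conversion can be
sketched as follows; the sample values are arbitrary.

\begin{verbatim}
# In numeric contexts False and True behave like 0 and 1.
total = True + True
choice = ['no', 'yes'][True]

# bool() converts a value to its truth value.
empty = bool([])
nonempty = bool('text')
zero = bool(0.0)
\end{verbatim}
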
1488
1489 \subsubsection{Internal Objects \label{typesinternal}}
1490
1491 See the \citetitle[../ref/ref.html]{Python Reference Manual} for this
1492 information.  It describes stack frame objects, traceback objects, and
1493 slice objects.
1494
1495
1496 \subsection{Special Attributes \label{specialattrs}}
1497
1498 The implementation adds a few special read-only attributes to several
1499 object types, where they are relevant:
1500
1501 \begin{memberdesc}[object]{__dict__}
1502 A dictionary or other mapping object used to store an
1503 object's (writable) attributes.
1504 \end{memberdesc}
1505
1506 \begin{memberdesc}[object]{__methods__}
1507 \deprecated{2.2}{Use the built-in function \function{dir()} to get a
1508 list of an object's attributes.  This attribute is no longer available.}
1509 \end{memberdesc}
1510
1511 \begin{memberdesc}[object]{__members__}
1512 \deprecated{2.2}{Use the built-in function \function{dir()} to get a
1513 list of an object's attributes.  This attribute is no longer available.}
1514 \end{memberdesc}
1515
1516 \begin{memberdesc}[instance]{__class__}
1517 The class to which a class instance belongs.
1518 \end{memberdesc}
1519
1520 \begin{memberdesc}[class]{__bases__}
1521 The tuple of base classes of a class object.  If there are no base
1522 classes, this will be an empty tuple.
1523 \end{memberdesc}
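
As a rough illustration of three of these attributes, the sketch below
inspects a small class hierarchy.  The classes \class{Base} and
\class{Derived} and the attribute \code{color} are hypothetical names
used only here.

\begin{verbatim}
class Base:
    pass

class Derived(Base):
    pass

obj = Derived()
obj.color = 'red'

instance_attrs = obj.__dict__   # the instance's writable attributes
its_class = obj.__class__       # the class the instance belongs to
bases = Derived.__bases__       # the tuple of base classes
\end{verbatim}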