Doc/ref/ref2.tex

   1 \chapter{Lexical analysis\label{lexical}}
   2
   3 A Python program is read by a \emph{parser}.  Input to the parser is a
   4 stream of \emph{tokens}, generated by the \emph{lexical analyzer}.  This
   5 chapter describes how the lexical analyzer breaks a file into tokens.
   6 \index{lexical analysis}
   7 \index{parser}
   8 \index{token}
   9
  10 Python uses the 7-bit \ASCII{} character set for program text and string
  11 literals. 8-bit characters may be used in string literals and comments
  12 but their interpretation is platform dependent; the proper way to
  13 insert 8-bit characters in string literals is by using octal or
  14 hexadecimal escape sequences.
  15
  16 The run-time character set depends on the I/O devices connected to the
  17 program but is generally a superset of \ASCII.
  18
  19 \strong{Future compatibility note:} It may be tempting to assume that the
  20 character set for 8-bit characters is ISO Latin-1 (an \ASCII{}
  21 superset that covers most western languages that use the Latin
  22 alphabet), but it is possible that in the future Unicode text editors
  23 will become common.  These generally use the UTF-8 encoding, which is
  24 also an \ASCII{} superset, but with very different use for the
  25 characters with ordinals 128--255.  While there is no consensus on this
  26 subject yet, it is unwise to assume either Latin-1 or UTF-8, even
  27 though the current implementation appears to favor Latin-1.  This
  28 applies both to the source character set and the run-time character
  29 set.
  30
  31
  32 \section{Line structure\label{line-structure}}
  33
  34 A Python program is divided into a number of \emph{logical lines}.
  35 \index{line structure}
  36
  37
  38 \subsection{Logical lines\label{logical}}
  39
  40 The end of
  41 a logical line is represented by the token NEWLINE.  Statements cannot
  42 cross logical line boundaries except where NEWLINE is allowed by the
  43 syntax (e.g., between statements in compound statements).
  44 A logical line is constructed from one or more \emph{physical lines}
  45 by following the explicit or implicit \emph{line joining} rules.
  46 \index{logical line}
  47 \index{physical line}
  48 \index{line joining}
  49 \index{NEWLINE token}
  50
  51
  52 \subsection{Physical lines\label{physical}}
  53
  54 A physical line ends in whatever the current platform's convention is
  55 for terminating lines.  On \UNIX, this is the \ASCII{} LF (linefeed)
  56 character.  On DOS/Windows, it is the \ASCII{} sequence CR LF (return
  57 followed by linefeed).  On Macintosh, it is the \ASCII{} CR (return)
  58 character.
  59
  60
  61 \subsection{Comments\label{comments}}
  62
  63 A comment starts with a hash character (\code{\#}) that is not part of
  64 a string literal, and ends at the end of the physical line.  A comment
  65 signifies the end of the logical line unless the implicit line joining
  66 rules are invoked.
  67 Comments are ignored by the syntax; they are not tokens.
  68 \index{comment}
  69 \index{hash character}
  70
  71
  72 \subsection{Explicit line joining\label{explicit-joining}}
  73
  74 Two or more physical lines may be joined into logical lines using
  75 backslash characters (\code{\e}), as follows: when a physical line ends
  76 in a backslash that is not part of a string literal or comment, it is
  77 joined with the following line, forming a single logical line, deleting the
  78 backslash and the following end-of-line character.  For example:
  79 \index{physical line}
  80 \index{line joining}
  81 \index{line continuation}
  82 \index{backslash character}
  83 %
  84 \begin{verbatim}
  85 if 1900 < year < 2100 and 1 <= month <= 12 \
  86    and 1 <= day <= 31 and 0 <= hour < 24 \
  87    and 0 <= minute < 60 and 0 <= second < 60:   # Looks like a valid date
  88         return 1
  89 \end{verbatim}
  90
  91 A line ending in a backslash cannot carry a comment.  A backslash does
  92 not continue a comment.  A backslash does not continue a token except
  93 for string literals (i.e., tokens other than string literals cannot be
  94 split across physical lines using a backslash).  A backslash is
  95 illegal elsewhere on a line outside a string literal.
  96
  97
  98 \subsection{Implicit line joining\label{implicit-joining}}
  99
 100 Expressions in parentheses, square brackets or curly braces can be
 101 split over more than one physical line without using backslashes.
 102 For example:
 103
 104 \begin{verbatim}
 105 month_names = ['Januari', 'Februari', 'Maart',      # These are the
 106                'April',   'Mei',      'Juni',       # Dutch names
 107                'Juli',    'Augustus', 'September',  # for the months
 108                'Oktober', 'November', 'December']   # of the year
 109 \end{verbatim}
 110
 111 Implicitly continued lines can carry comments.  The indentation of the
 112 continuation lines is not important.  Blank continuation lines are
 113 allowed.  There is no NEWLINE token between implicit continuation
 114 lines.  Implicitly continued lines can also occur within triple-quoted
 115 strings (see below); in that case they cannot carry comments.
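
Both joining rules can be observed with the standard \module{tokenize}
module, which is a Python reimplementation of the lexical analyzer and
may differ from it in details.  The sketch below assumes an interpreter
that provides \code{io.StringIO}; the helper \function{count_tokens()}
and the two sample strings are illustrative names, not part of any
library.  \module{tokenize} reports line ends inside a continued
logical line as NL rather than NEWLINE.

\begin{verbatim}
import io
import tokenize

def count_tokens(source):
    """Return (NEWLINE, NL, COMMENT) token counts for a source string."""
    counts = {tokenize.NEWLINE: 0, tokenize.NL: 0, tokenize.COMMENT: 0}
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok[0] in counts:          # tok[0] is the token type
            counts[tok[0]] += 1
    return (counts[tokenize.NEWLINE],
            counts[tokenize.NL],
            counts[tokenize.COMMENT])

# Explicit joining: two physical lines, one logical line.
explicit = "total = 1 + \\\n        2\n"

# Implicit joining inside brackets; both physical lines carry comments.
implicit = "names = ['a',   # first\n         'b']   # second\n"

explicit_counts = count_tokens(explicit)
implicit_counts = count_tokens(implicit)
\end{verbatim}

In both cases only one NEWLINE token is counted, so each sample forms a
single logical line.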
 116
 117
 118 \subsection{Blank lines \index{blank line}\label{blank-lines}}
 119
 120 A logical line that contains only spaces, tabs, formfeeds and possibly
 121 a comment is ignored (i.e., no NEWLINE token is generated).  During
 122 interactive input of statements, handling of a blank line may differ
 123 depending on the implementation of the read-eval-print loop.  In the
 124 standard implementation, an entirely blank logical line (i.e.\ one
 125 containing not even whitespace or a comment) terminates a multi-line
 126 statement.
 127
 128
 129 \subsection{Indentation\label{indentation}}
 130
 131 Leading whitespace (spaces and tabs) at the beginning of a logical
 132 line is used to compute the indentation level of the line, which in
 133 turn is used to determine the grouping of statements.
 134 \index{indentation}
 135 \index{whitespace}
 136 \index{leading whitespace}
 137 \index{space}
 138 \index{tab}
 139 \index{grouping}
 140 \index{statement grouping}
 141
 142 First, tabs are replaced (from left to right) by one to eight spaces
 143 such that the total number of characters up to and including the
 144 replacement is a multiple of
 145 eight (this is intended to be the same rule as used by \UNIX).  The
 146 total number of spaces preceding the first non-blank character then
 147 determines the line's indentation.  Indentation cannot be split over
 148 multiple physical lines using backslashes; the whitespace up to the
 149 first backslash determines the indentation.
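
The tab replacement rule can be sketched as a small function.  This is
an illustration of the rule as stated, not the implementation used by
the interpreter; the name \function{indentation()} and its
\var{tabsize} parameter are hypothetical, and formfeed handling is left
out.

\begin{verbatim}
def indentation(line, tabsize=8):
    """Return the indentation of a line, counting leading spaces and tabs."""
    column = 0
    for ch in line:
        if ch == ' ':
            column += 1
        elif ch == '\t':
            # A tab advances to the next multiple of tabsize,
            # i.e. it stands for one to eight spaces.
            column = (column // tabsize + 1) * tabsize
        else:
            break                     # first non-blank character
    return column
\end{verbatim}

For instance, two spaces followed by a tab give an indentation of 8,
while a tab followed by two spaces gives 10.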
 150
 151 \strong{Cross-platform compatibility note:} because of the nature of
 152 text editors on non-\UNIX{} platforms, it is unwise to use a mixture of
 153 spaces and tabs for the indentation in a single source file.
 154
 155 A formfeed character may be present at the start of the line; it will
 156 be ignored for the indentation calculations above.  Formfeed
 157 characters occurring elsewhere in the leading whitespace have an
 158 undefined effect (for instance, they may reset the space count to
 159 zero).
 160
 161 The indentation levels of consecutive lines are used to generate
 162 INDENT and DEDENT tokens, using a stack, as follows.
 163 \index{INDENT token}
 164 \index{DEDENT token}
 165
 166 Before the first line of the file is read, a single zero is pushed on
 167 the stack; this will never be popped off again.  The numbers pushed on
 168 the stack will always be strictly increasing from bottom to top.  At
 169 the beginning of each logical line, the line's indentation level is
 170 compared to the top of the stack.  If it is equal, nothing happens.
 171 If it is larger, it is pushed on the stack, and one INDENT token is
 172 generated.  If it is smaller, it \emph{must} be one of the numbers
 173 occurring on the stack; all numbers on the stack that are larger are
 174 popped off, and for each number popped off a DEDENT token is
 175 generated.  At the end of the file, a DEDENT token is generated for
 176 each number remaining on the stack that is larger than zero.
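
The stack procedure just described can be sketched in a few lines.
The function name \function{indent_tokens()} is hypothetical, and the
input is assumed to be the already computed indentation level of each
logical line; real token streams also contain the other tokens of each
line, which are omitted here.

\begin{verbatim}
def indent_tokens(levels):
    """Return the INDENT/DEDENT tokens for a list of indentation levels."""
    stack = [0]                       # the zero is never popped
    tokens = []
    for level in levels:
        if level > stack[-1]:
            stack.append(level)
            tokens.append('INDENT')
        elif level < stack[-1]:
            if level not in stack:
                raise ValueError('inconsistent dedent to %d' % level)
            while stack[-1] > level:  # one DEDENT per popped number
                stack.pop()
                tokens.append('DEDENT')
    while stack[-1] > 0:              # end of file
        stack.pop()
        tokens.append('DEDENT')
    return tokens

# Levels 0, 4, 8, then back to 0: two INDENTs followed by two DEDENTs.
simple = indent_tokens([0, 4, 8, 0])
\end{verbatim}

A level that is smaller than the top of the stack but does not occur on
it, as in \code{indent_tokens([0, 4, 8, 16, 12])}, raises an error in
this sketch.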
 177
 178 Here is an example of a correctly (though confusingly) indented piece
 179 of Python code:
 180
 181 \begin{verbatim}
 182 def perm(l):
 183         # Compute the list of all permutations of l
 184     if len(l) <= 1:
 185                   return [l]
 186     r = []
 187     for i in range(len(l)):
 188              s = l[:i] + l[i+1:]
 189              p = perm(s)
 190              for x in p:
 191               r.append(l[i:i+1] + x)
 192     return r
 193 \end{verbatim}
 194
 195 The following example shows various indentation errors:
 196
 197 \begin{verbatim}
 198  def perm(l):                       # error: first line indented
 199 for i in range(len(l)):             # error: not indented
 200     s = l[:i] + l[i+1:]
 201         p = perm(l[:i] + l[i+1:])   # error: unexpected indent
 202         for x in p:
 203                 r.append(l[i:i+1] + x)
 204             return r                # error: inconsistent dedent
 205 \end{verbatim}
 206
 207 (Actually, the first three errors are detected by the parser; only the
 208 last error is found by the lexical analyzer --- the indentation of
 209 \code{return r} does not match a level popped off the stack.)
 210
 211
 212 \subsection{Whitespace between tokens\label{whitespace}}
 213
 214 Except at the beginning of a logical line or in string literals, the
 215 whitespace characters space, tab and formfeed can be used
 216 interchangeably to separate tokens.  Whitespace is needed between two
 217 tokens only if their concatenation could otherwise be interpreted as a
 218 different token (e.g., \code{ab} is one token, but \code{a b} is two tokens).
 219
 220
 221 \section{Other tokens\label{other-tokens}}
 222
 223 Besides NEWLINE, INDENT and DEDENT, the following categories of tokens
 224 exist: \emph{identifiers}, \emph{keywords}, \emph{literals},
 225 \emph{operators}, and \emph{delimiters}.
 226 Whitespace characters (other than line terminators, discussed earlier)
 227 are not tokens, but serve to delimit tokens.
 228 Where
 229 ambiguity exists, a token comprises the longest possible string that
 230 forms a legal token, when read from left to right.
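
The longest-match rule can be illustrated with the standard
\module{tokenize} module, assuming an interpreter that provides
\code{io.StringIO}.  The helper \function{token_strings()} is a
hypothetical name used only for this sketch.

\begin{verbatim}
import io
import tokenize

def token_strings(source):
    """Return the text of the NAME, NUMBER and OP tokens in source."""
    wanted = (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
    readline = io.StringIO(source).readline
    return [tok[1] for tok in tokenize.generate_tokens(readline)
            if tok[0] in wanted]

# "<<=" is read as one token, not as "<<" followed by "=".
shift_assign = token_strings("a<<=b\n")

# Whitespace decides whether "ab" is one identifier or two.
one_name = token_strings("ab\n")
two_names = token_strings("a b\n")
\end{verbatim}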
 231
 232
 233 \section{Identifiers and keywords\label{identifiers}}
 234
 235 Identifiers (also referred to as \emph{names}) are described by the following
 236 lexical definitions:
 237 \index{identifier}
 238 \index{name}
 239
 240 \begin{productionlist}
 241   \production{identifier}
 242              {(\token{letter}|"_") (\token{letter} | \token{digit} | "_")*}
 243   \production{letter}
 244              {\token{lowercase} | \token{uppercase}}
 245   \production{lowercase}
 246              {"a"..."z"}
 247   \production{uppercase}
 248              {"A"..."Z"}
 249   \production{digit}
 250              {"0"..."9"}
 251 \end{productionlist}
 252
 253 Identifiers are unlimited in length.  Case is significant.
 254
 255
 256 \subsection{Keywords\label{keywords}}
 257
 258 The following identifiers are used as reserved words, or
 259 \emph{keywords} of the language, and cannot be used as ordinary
 260 identifiers.  They must be spelled exactly as written here:%
 261 \index{keyword}%
 262 \index{reserved word}
 263
 264 \begin{verbatim}
 265 and       del       for       is        raise
 266 assert    elif      from      lambda    return
 267 break     else      global    not       try
 268 class     except    if        or        while
 269 continue  exec      import    pass      yield
 270 def       finally   in        print
 271 \end{verbatim}
 272
 273 % When adding keywords, use reswords.py for reformatting
 274
 275 Note that although the identifier \code{as} can be used as part of the
 276 syntax of \keyword{import} statements, it is not currently a reserved
 277 word.
 278
 279 In some future version of Python, the identifiers \code{as} and
 280 \code{None} will both become keywords.
 281
 282
 283 \subsection{Reserved classes of identifiers\label{id-classes}}
 284
 285 Certain classes of identifiers (besides keywords) have special
 286 meanings.  These are:
 287
 288 \begin{tableiii}{l|l|l}{code}{Form}{Meaning}{Notes}
 289 \lineiii{_*}{Not imported by \samp{from \var{module} import *}}{(1)}
 290 \lineiii{__*__}{System-defined name}{}
 291 \lineiii{__*}{Class-private name mangling}{}
 292 \end{tableiii}
 293
 294 (XXX need section references here.)
 295
 296 Note:
 297
 298 \begin{description}
 299 \item[(1)] The special identifier \samp{_} is used in the interactive
 300 interpreter to store the result of the last evaluation; it is stored
 301 in the \module{__builtin__} module.  When not in interactive mode,
 302 \samp{_} has no special meaning and is not defined.
 303 \end{description}
 304
 305
 306 \section{Literals\label{literals}}
 307
 308 Literals are notations for constant values of some built-in types.
 309 \index{literal}
 310 \index{constant}
 311
 312
 313 \subsection{String literals\label{strings}}
 314
 315 String literals are described by the following lexical definitions:
 316 \index{string literal}
 317
 318 \index{ASCII@\ASCII}
 319 \begin{productionlist}
 320   \production{stringliteral}
 321              {[\token{stringprefix}](\token{shortstring} | \token{longstring})}
 322   \production{stringprefix}
 323              {"r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR"}
 324   \production{shortstring}
 325              {"'" \token{shortstringitem}* "'"
 326               | '"' \token{shortstringitem}* '"'}
 327   \production{longstring}
 328              {"'''" \token{longstringitem}* "'''"}
 329   \productioncont{| '"""' \token{longstringitem}* '"""'}
 330   \production{shortstringitem}
 331              {\token{shortstringchar} | \token{escapeseq}}
 332   \production{longstringitem}
 333              {\token{longstringchar} | \token{escapeseq}}
 334   \production{shortstringchar}
 335              {<any ASCII character except "\e" or newline or the quote>}
 336   \production{longstringchar}
 337              {<any ASCII character except "\e">}
 338   \production{escapeseq}
 339              {"\e" <any ASCII character>}
 340 \end{productionlist}
 341
 342 One syntactic restriction not indicated by these productions is that
 343 whitespace is not allowed between the \grammartoken{stringprefix} and
 344 the rest of the string literal.
 345
 346 \index{triple-quoted string}
 347 \index{Unicode Consortium}
 348 \index{string!Unicode}
 349 In plain English: String literals can be enclosed in matching single
 350 quotes (\code{'}) or double quotes (\code{"}).  They can also be
 351 enclosed in matching groups of three single or double quotes (these
 352 are generally referred to as \emph{triple-quoted strings}).  The
 353 backslash (\code{\e}) character is used to escape characters that
 354 otherwise have a special meaning, such as newline, backslash itself,
 355 or the quote character.  String literals may optionally be prefixed
 356 with a letter `r' or `R'; such strings are called \dfn{raw
 357 strings}\index{raw string} and use different rules for interpreting
 358 backslash escape sequences.  A prefix of `u' or `U' makes the string
 359 a Unicode string.  Unicode strings use the Unicode character set as
 360 defined by the Unicode Consortium and ISO~10646.  Some additional
 361 escape sequences, described below, are available in Unicode strings.
 362 The two prefix characters may be combined; in this case, `u' must
 363 appear before `r'.
 364
 365 In triple-quoted strings,
 366 unescaped newlines and quotes are allowed (and are retained), except
 367 that three unescaped quotes in a row terminate the string.  (A
 368 ``quote'' is the character used to open the string, i.e., either
 369 \code{'} or \code{"}.)
 370
 371 Unless an `r' or `R' prefix is present, escape sequences in strings
 372 are interpreted according to rules similar
 373 to those used by Standard C.  The recognized escape sequences are:
 374 \index{physical line}
 375 \index{escape sequence}
 376 \index{Standard C}
 377 \index{C}
 378
 379 \begin{tableii}{l|l}{code}{Escape Sequence}{Meaning}
 380 \lineii{\e\var{newline}} {Ignored}
 381 \lineii{\e\e}   {Backslash (\code{\e})}
 382 \lineii{\e'}    {Single quote (\code{'})}
 383 \lineii{\e"}    {Double quote (\code{"})}
 384 \lineii{\e a}   {\ASCII{} Bell (BEL)}
 385 \lineii{\e b}   {\ASCII{} Backspace (BS)}
 386 \lineii{\e f}   {\ASCII{} Formfeed (FF)}
 387 \lineii{\e n}   {\ASCII{} Linefeed (LF)}
 388 \lineii{\e N\{\var{name}\}}
 389        {Character named \var{name} in the Unicode database (Unicode only)}
 390 \lineii{\e r}   {\ASCII{} Carriage Return (CR)}
 391 \lineii{\e t}   {\ASCII{} Horizontal Tab (TAB)}
 392 \lineii{\e u\var{xxxx}}    {Character with 16-bit hex value \var{xxxx} (Unicode only)}
 393 \lineii{\e U\var{xxxxxxxx}}{Character with 32-bit hex value \var{xxxxxxxx} (Unicode only)}
 394 \lineii{\e v}   {\ASCII{} Vertical Tab (VT)}
 395 \lineii{\e\var{ooo}} {\ASCII{} character with octal value \var{ooo}}
 396 \lineii{\e x\var{hh}} {\ASCII{} character with hex value \var{hh}}
 397 \end{tableii}
 398 \index{ASCII@\ASCII}
 399
 400 As in Standard C, up to three octal digits are accepted.  However,
 401 exactly two hex digits are taken in hex escapes.
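
The digit-count rules can be checked directly.  In the sketch below the
variable names are illustrative only; each literal is chosen so that an
extra digit follows the escape.

\begin{verbatim}
octal_a = '\101'      # three octal digits: character 65, 'A'
octal_more = '\1011'  # only three digits are consumed; '1' follows
short_octal = '\0'    # fewer than three octal digits are accepted
hex_a = '\x41'        # two hex digits: character 65, 'A'
hex_more = '\x414'    # exactly two digits are consumed; '4' follows
\end{verbatim}

Here \code{octal_more} is the two-character string \code{'A1'} and
\code{hex_more} is the two-character string \code{'A4'}.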
 402
 403 Unlike Standard \index{unrecognized escape sequence}C,
 404 all unrecognized escape sequences are left in the string unchanged,
 405 i.e., \emph{the backslash is left in the string}.  (This behavior is
 406 useful when debugging: if an escape sequence is mistyped, the
 407 resulting output is more easily recognized as broken.)  It is also
 408 important to note that the escape sequences marked as ``(Unicode
 409 only)'' in the table above fall into the category of unrecognized
 410 escapes for non-Unicode string literals.
 411
 412 When an `r' or `R' prefix is present, a character following a
 413 backslash is included in the string without change, and \emph{all
 414 backslashes are left in the string}.  For example, the string literal
 415 \code{r"\e n"} consists of two characters: a backslash and a lowercase
 416 `n'.  String quotes can be escaped with a backslash, but the backslash
 417 remains in the string; for example, \code{r"\e""} is a valid string
 418 literal consisting of two characters: a backslash and a double quote;
 419 \code{r"\e"} is not a valid string literal (even a raw string cannot
 420 end in an odd number of backslashes).  Specifically, \emph{a raw
 421 string cannot end in a single backslash} (since the backslash would
 422 escape the following quote character).  Note also that a single
 423 backslash followed by a newline is interpreted as those two characters
 424 as part of the string, \emph{not} as a line continuation.
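
A short sketch of these raw string rules follows; the variable names
are illustrative only.

\begin{verbatim}
newline_escape = r"\n"   # two characters: backslash and 'n'
quote_escape = r"\""     # two characters: backslash and double quote

# A backslash at the end of a physical line stays in the raw string,
# together with the newline that follows it.
continued = r"a\
b"
\end{verbatim}

The last literal has four characters: \code{a}, a backslash, a newline
and \code{b}.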
 425
 426
 427 \subsection{String literal concatenation\label{string-catenation}}
 428
 429 Multiple adjacent string literals (delimited by whitespace), possibly
 430 using different quoting conventions, are allowed, and their meaning is
 431 the same as their concatenation.  Thus, \code{"hello" 'world'} is
 432 equivalent to \code{"helloworld"}.  This feature can be used to reduce
 433 the number of backslashes needed, to split long strings conveniently
 434 across long lines, or even to add comments to parts of strings, for
 435 example:
 436
 437 \begin{verbatim}
 438 re.compile("[A-Za-z_]"       # letter or underscore
 439            "[A-Za-z0-9_]*"   # letter, digit or underscore
 440           )
 441 \end{verbatim}
 442
 443 Note that this feature is defined at the syntactical level, but
 444 implemented at compile time.  The `+' operator must be used to
 445 concatenate string expressions at run time.  Also note that literal
 446 concatenation can use different quoting styles for each component
 447 (even mixing raw strings and triple quoted strings).
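
As a brief illustration (variable names are arbitrary), adjacent
literals written with different quoting styles combine into a single
string:

\begin{verbatim}
greeting = "hello" 'world'

# Parentheses allow the pieces to be spread over several lines.
pattern = ("[A-Za-z_]"       # letter or underscore
           "[A-Za-z0-9_]*")  # letter, digit or underscore

# A raw string followed by a triple-quoted string.
mixed = r"\d" '''+'''
\end{verbatim}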
 448
 449
 450 \subsection{Numeric literals\label{numbers}}
 451
 452 There are four types of numeric literals: plain integers, long
 453 integers, floating point numbers, and imaginary numbers.  There are no
 454 complex literals (complex numbers can be formed by adding a real
 455 number and an imaginary number).
 456 \index{number}
 457 \index{numeric literal}
 458 \index{integer literal}
 459 \index{plain integer literal}
 460 \index{long integer literal}
 461 \index{floating point literal}
 462 \index{hexadecimal literal}
 463 \index{octal literal}
 464 \index{decimal literal}
 465 \index{imaginary literal}
 466 \index{complex!literal}
 467
 468 Note that numeric literals do not include a sign; a phrase like
 469 \code{-1} is actually an expression composed of the unary operator
 470 `\code{-}' and the literal \code{1}.
 471
 472
 473 \subsection{Integer and long integer literals\label{integers}}
 474
 475 Integer and long integer literals are described by the following
 476 lexical definitions:
 477
 478 \begin{productionlist}
 479   \production{longinteger}
 480              {\token{integer} ("l" | "L")}
 481   \production{integer}
 482              {\token{decimalinteger} | \token{octinteger} | \token{hexinteger}}
 483   \production{decimalinteger}
 484              {\token{nonzerodigit} \token{digit}* | "0"}
 485   \production{octinteger}
 486              {"0" \token{octdigit}+}
 487   \production{hexinteger}
 488              {"0" ("x" | "X") \token{hexdigit}+}
 489   \production{nonzerodigit}
 490              {"1"..."9"}
 491   \production{octdigit}
 492              {"0"..."7"}
 493   \production{hexdigit}
 494              {\token{digit} | "a"..."f" | "A"..."F"}
 495 \end{productionlist}
 496
 497 Although both lower case `l' and upper case `L' are allowed as a suffix
 498 for long integers, it is strongly recommended to always use `L', since
 499 the letter `l' looks too much like the digit `1'.
 500
 501 Plain integer decimal literals must be at most 2147483647 (i.e., the
 502 largest positive integer, using 32-bit arithmetic).  Plain octal and
 503 hexadecimal literals may be as large as 4294967295, but values larger
 504 than 2147483647 are converted to a negative value by subtracting
 505 4294967296.  There is no limit for long integer literals apart from
 506 what can be stored in available memory.
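
The conversion rule for plain octal and hexadecimal literals is simple
arithmetic and can be modelled as below.  This is only a model of the
rule as stated above: the function name
\function{plain_int_value()} is hypothetical, and an interpreter with
unbounded integers will not apply the conversion to its own literals.

\begin{verbatim}
def plain_int_value(magnitude):
    """Model the 32-bit value of a plain octal or hex literal."""
    if magnitude < 0 or magnitude > 4294967295:
        raise OverflowError('does not fit in 32 bits')
    if magnitude > 2147483647:
        # Values above the largest positive integer wrap around.
        return magnitude - 4294967296
    return magnitude

wrapped = plain_int_value(0x80000000)     # -2147483648 under this rule
all_bits = plain_int_value(0xFFFFFFFF)    # -1 under this rule
\end{verbatim}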
 507
 508 Some examples of plain and long integer literals:
 509
 510 \begin{verbatim}
 511 7     2147483647                        0177    0x80000000
 512 3L    79228162514264337593543950336L    0377L   0x100000000L
 513 \end{verbatim}
 514
 515
 516 \subsection{Floating point literals\label{floating}}
 517
 518 Floating point literals are described by the following lexical
 519 definitions:
 520
 521 \begin{productionlist}
 522   \production{floatnumber}
 523              {\token{pointfloat} | \token{exponentfloat}}
 524   \production{pointfloat}
 525              {[\token{intpart}] \token{fraction} | \token{intpart} "."}
 526   \production{exponentfloat}
 527              {(\token{intpart} | \token{pointfloat})
 528               \token{exponent}}
 529   \production{intpart}
 530              {\token{digit}+}
 531   \production{fraction}
 532              {"." \token{digit}+}
 533   \production{exponent}
 534              {("e" | "E") ["+" | "-"] \token{digit}+}
 535 \end{productionlist}
 536
 537 Note that the integer and exponent parts of floating point numbers
 538 can look like octal integers, but are interpreted using radix 10.  For
 539 example, \samp{077e010} is legal, and denotes the same number
 540 as \samp{77e10}.
 541 The allowed range of floating point literals is
 542 implementation-dependent.
 543 Some examples of floating point literals:
 544
 545 \begin{verbatim}
 546 3.14    10.    .001    1e100    3.14e-10    0e0
 547 \end{verbatim}
 548
 549 Note that numeric literals do not include a sign; a phrase like
 550 \code{-1} is actually an expression composed of the operator
 551 \code{-} and the literal \code{1}.
 552
 553
 554 \subsection{Imaginary literals\label{imaginary}}
 555
 556 Imaginary literals are described by the following lexical definitions:
 557
 558 \begin{productionlist}
 559   \production{imagnumber}{(\token{floatnumber} | \token{intpart}) ("j" | "J")}
 560 \end{productionlist}
 561
 562 An imaginary literal yields a complex number with a real part of
 563 0.0.  Complex numbers are represented as a pair of floating point
 564 numbers and have the same restrictions on their range.  To create a
 565 complex number with a nonzero real part, add a floating point number
 566 to it, e.g., \code{(3+4j)}.  Some examples of imaginary literals:
 567
 568 \begin{verbatim}
 569 3.14j   10.j    10j     .001j   1e100j  3.14e-10j
 570 \end{verbatim}
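
A brief sketch of how imaginary literals combine with real numbers
(variable names are arbitrary):

\begin{verbatim}
pure = 4j            # imaginary literal: the real part is 0.0
z = 3 + 4j           # an expression: real number plus imaginary literal

real_of_pure = pure.real
parts_of_z = (z.real, z.imag)
\end{verbatim}

Both parts of \code{z} are floating point numbers, even though the
literals were written without a decimal point.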
 571
 572
 573 \section{Operators\label{operators}}
 574
 575 The following tokens are operators:
 576 \index{operators}
 577
 578 \begin{verbatim}
 579 +       -       *       **      /       //      %
 580 <<      >>      &       |       ^       ~
 581 <       >       <=      >=      ==      !=      <>
 582 \end{verbatim}
 583
 584 The comparison operators \code{<>} and \code{!=} are alternate
 585 spellings of the same operator.  \code{!=} is the preferred spelling;
 586 \code{<>} is obsolescent.
 587
 588
 589 \section{Delimiters\label{delimiters}}
 590
 591 The following tokens serve as delimiters in the grammar:
 592 \index{delimiters}
 593
 594 \begin{verbatim}
 595 (       )       [       ]       {       }
 596 ,       :       .       `       =       ;
 597 +=      -=      *=      /=      //=     %=
 598 &=      |=      ^=      >>=     <<=     **=
 599 \end{verbatim}
 600
 601 The period can also occur in floating-point and imaginary literals.  A
 602 sequence of three periods has a special meaning as an ellipsis in slices.
 603 The second half of the list, the augmented assignment operators, serve
 604 lexically as delimiters, but also perform an operation.
 605
 606 The following printing \ASCII{} characters have special meaning as part
 607 of other tokens or are otherwise significant to the lexical analyzer:
 608
 609 \begin{verbatim}
 610 '       "       #       \
 611 \end{verbatim}
 612
 613 The following printing \ASCII{} characters are not used in Python.  Their
 614 occurrence outside string literals and comments is an unconditional
 615 error:
 616 \index{ASCII@\ASCII}
 617
 618 \begin{verbatim}
 619 @       $       ?
 620 \end{verbatim}