Doc/lib/libdoctest.tex

   1 \section{\module{doctest} ---
   2          Test docstrings represent reality}
   3
   4 \declaremodule{standard}{doctest}
   5 \moduleauthor{Tim Peters}{tim_one@users.sourceforge.net}
   6 \sectionauthor{Tim Peters}{tim_one@users.sourceforge.net}
   7 \sectionauthor{Moshe Zadka}{moshez@debian.org}
   8
   9 \modulesynopsis{A framework for verifying examples in docstrings.}
  10
  11 The \module{doctest} module searches a module's docstrings for text that looks
  12 like an interactive Python session, then executes all such sessions to verify
  13 they still work exactly as shown.  Here's a complete but small example:
  14
  15 \begin{verbatim}
  16 """
  17 This is module example.
  18
  19 Example supplies one function, factorial.  For example,
  20
  21 >>> factorial(5)
  22 120
  23 """
  24
  25 def factorial(n):
  26     """Return the factorial of n, an exact integer >= 0.
  27
  28     If the result is small enough to fit in an int, return an int.
  29     Else return a long.
  30
  31     >>> [factorial(n) for n in range(6)]
  32     [1, 1, 2, 6, 24, 120]
  33     >>> [factorial(long(n)) for n in range(6)]
  34     [1, 1, 2, 6, 24, 120]
  35     >>> factorial(30)
  36     265252859812191058636308480000000L
  37     >>> factorial(30L)
  38     265252859812191058636308480000000L
  39     >>> factorial(-1)
  40     Traceback (most recent call last):
  41         ...
  42     ValueError: n must be >= 0
  43
  44     Factorials of floats are OK, but the float must be an exact integer:
  45     >>> factorial(30.1)
  46     Traceback (most recent call last):
  47         ...
  48     ValueError: n must be exact integer
  49     >>> factorial(30.0)
  50     265252859812191058636308480000000L
  51
  52     It must also not be ridiculously large:
  53     >>> factorial(1e100)
  54     Traceback (most recent call last):
  55         ...
  56     OverflowError: n too large
  57     """
  58
  59 \end{verbatim}
  60 % allow LaTeX to break here.
  61 \begin{verbatim}
  62
  63     import math
  64     if not n >= 0:
  65         raise ValueError("n must be >= 0")
  66     if math.floor(n) != n:
  67         raise ValueError("n must be exact integer")
  68     if n+1 == n:  # catch a value like 1e300
  69         raise OverflowError("n too large")
  70     result = 1
  71     factor = 2
  72     while factor <= n:
  73         try:
  74             result *= factor
  75         except OverflowError:
  76             result *= long(factor)
  77         factor += 1
  78     return result
  79
  80 def _test():
  81     import doctest, example
  82     return doctest.testmod(example)
  83
  84 if __name__ == "__main__":
  85     _test()
  86 \end{verbatim}
  87
  88 If you run \file{example.py} directly from the command line,
  89 \module{doctest} works its magic:
  90
  91 \begin{verbatim}
  92 $ python example.py
  93 $
  94 \end{verbatim}
  95
  96 There's no output!  That's normal, and it means all the examples
  97 worked.  Pass \programopt{-v} to the script, and \module{doctest}
  98 prints a detailed log of what it's trying, and prints a summary at the
  99 end:
 100
 101 \begin{verbatim}
 102 $ python example.py -v
 103 Running example.__doc__
 104 Trying: factorial(5)
 105 Expecting: 120
 106 ok
 107 0 of 1 examples failed in example.__doc__
 108 Running example.factorial.__doc__
 109 Trying: [factorial(n) for n in range(6)]
 110 Expecting: [1, 1, 2, 6, 24, 120]
 111 ok
 112 Trying: [factorial(long(n)) for n in range(6)]
 113 Expecting: [1, 1, 2, 6, 24, 120]
 114 ok
 115 Trying: factorial(30)
 116 Expecting: 265252859812191058636308480000000L
 117 ok
 118 \end{verbatim}
 119
 120 And so on, eventually ending with:
 121
 122 \begin{verbatim}
 123 Trying: factorial(1e100)
 124 Expecting:
 125 Traceback (most recent call last):
 126     ...
 127 OverflowError: n too large
 128 ok
 129 0 of 8 examples failed in example.factorial.__doc__
 130 2 items passed all tests:
 131    1 tests in example
 132    8 tests in example.factorial
 133 9 tests in 2 items.
 134 9 passed and 0 failed.
 135 Test passed.
 136 $
 137 \end{verbatim}
 138
 139 That's all you need to know to start making productive use of
 140 \module{doctest}!  Jump in.  The docstrings in \file{doctest.py} contain
 141 detailed information about all aspects of \module{doctest}, and we'll
 142 just cover the more important points here.
 143
 144 \subsection{Normal Usage}
 145
 146 In normal use, end each module \module{M} with:
 147
 148 \begin{verbatim}
 149 def _test():
 150     import doctest, M           # replace M with your module's name
 151     return doctest.testmod(M)   # ditto
 152
 153 if __name__ == "__main__":
 154     _test()
 155 \end{verbatim}
 156
 157 If you want to test the module as the main module, you don't need to
 158 pass M to \function{testmod()}; in this case, it will test the current
 159 module.
 160
 161 Then running the module as a script causes the examples in the docstrings
 162 to get executed and verified:
 163
 164 \begin{verbatim}
 165 python M.py
 166 \end{verbatim}
 167
 168 This won't display anything unless an example fails, in which case the
 169 failing example(s) and the cause(s) of the failure(s) are printed to stdout,
 170 and the final line of output is \code{'Test failed.'}.
 171
 172 Run it with the \programopt{-v} switch instead:
 173
 174 \begin{verbatim}
 175 python M.py -v
 176 \end{verbatim}
 177
 178 and a detailed report of all examples tried is printed to standard
 179 output, along with assorted summaries at the end.
 180
 181 You can force verbose mode by passing \code{verbose=1} to
 182 \function{testmod()}, or
 183 prohibit it by passing \code{verbose=0}.  In either of those cases,
 184 \code{sys.argv} is not examined by \function{testmod()}.
 185
 186 In any case, \function{testmod()} returns a 2-tuple of ints \code{(\var{f},
 187 \var{t})}, where \var{f} is the number of docstring examples that
 188 failed and \var{t} is the total number of docstring examples
 189 attempted.
 190
 191 \subsection{Which Docstrings Are Examined?}
 192
 193 See the docstrings in \file{doctest.py} for all the details.  They're
 194 unsurprising: the module docstring, and all function, class and method
 195 docstrings are searched.  Optionally, the tester can be directed to
 196 exclude docstrings attached to objects with private names.  Objects
 197 imported into the module are not searched.
 198
 199 In addition, if \code{M.__test__} exists and "is true", it must be a
 200 dict, and each entry maps a (string) name to a function object, class
 201 object, or string.  Function and class object docstrings found from
 202 \code{M.__test__} are searched even if the tester has been
 203 directed to skip over private names in the rest of the module.
 204 In output, a key \code{K} in \code{M.__test__} appears with name
 205
 206 \begin{verbatim}
 207 <name of M>.__test__.K
 208 \end{verbatim}
 209
 210 Any classes found are recursively searched similarly, to test docstrings in
 211 their contained methods and nested classes.  While private names reached
 212 from \module{M}'s globals can be optionally skipped, all names reached from
 213 \code{M.__test__} are searched.
 214
 215 \subsection{What's the Execution Context?}
 216
 217 By default, each time \function{testmod()} finds a docstring to test, it uses
 218 a \emph{copy} of \module{M}'s globals, so that running tests on a module
 219 doesn't change the module's real globals, and so that one test in
 220 \module{M} can't leave behind crumbs that accidentally allow another test
 221 to work.  This means examples can freely use any names defined at top-level
 222 in \module{M}, and names defined earlier in the docstring being run.
 223
 224 You can force use of your own dict as the execution context by passing
 225 \code{globs=your_dict} to \function{testmod()} instead.  Presumably this
 226 would be a copy of \code{M.__dict__} merged with the globals from other
 227 imported modules.
 228
 229 \subsection{What About Exceptions?}
 230
 231 No problem, as long as the only output generated by the example is the
 232 traceback itself.  For example:
 233
 234 \begin{verbatim}
 235 >>> [1, 2, 3].remove(42)
 236 Traceback (most recent call last):
 237   File "<stdin>", line 1, in ?
 238 ValueError: list.remove(x): x not in list
 239 >>>
 240 \end{verbatim}
 241
 242 Note that only the exception type and value are compared (specifically,
 243 only the last line in the traceback).  The various ``File'' lines in
 244 between can be left out (unless they add significantly to the documentation
 245 value of the example).
 246
 247 \subsection{Advanced Usage}
 248
 249 Several module level functions are available for controlling how doctests
 250 are run.
 251
 252 \begin{funcdesc}{debug}{module, name}
 253   Debug a single docstring containing doctests.
 254
 255   Provide the \var{module} (or dotted name of the module) containing the
 256   docstring to be debugged and the \var{name} (within the module) of the
 257   object with the docstring to be debugged.
 258
 259   The doctest examples are extracted (see function \function{testsource()}),
 260   and written to a temporary file.  The Python debugger, \refmodule{pdb},
 261   is then invoked on that file.
 262   \versionadded{2.3}
 263 \end{funcdesc}
 264
 265 \begin{funcdesc}{testmod}{}
 266   This function provides the most basic interface to the doctests.
 267   It creates a local instance of class \class{Tester}, runs appropriate
 268   methods of that class, and merges the results into the global \class{Tester}
 269   instance, \code{master}.
 270
 271   To get finer control than \function{testmod()} offers, create an instance
 272   of \class{Tester} with custom policies, or run methods of \code{master}
 273   directly.  See \code{Tester.__doc__} for details.
 274 \end{funcdesc}
 275
 276 \begin{funcdesc}{testsource}{module, name}
 277   Extract the doctest examples from a docstring.
 278
 279   Provide the \var{module} (or dotted name of the module) containing the
 280   tests to be extracted and the \var{name} (within the module) of the object
 281   with the docstring containing the tests to be extracted.
 282
 283   The doctest examples are returned as a string containing Python
 284   code.  The expected output blocks in the examples are converted
 285   to Python comments.
 286   \versionadded{2.3}
 287 \end{funcdesc}
 288
 289 \begin{funcdesc}{DocTestSuite}{\optional{module}}
 290   Convert doctest tests for a module to a
 291   \class{\refmodule{unittest}.TestSuite}.
 292
 293   The returned \class{TestSuite} is to be run by the unittest framework
 294   and runs each doctest in the module.  If any of the doctests fail,
 295   then the synthesized unit test fails, and a \exception{DocTestTestFailure}
 296   exception is raised showing the name of the file containing the test and a
 297   (sometimes approximate) line number.
 298
 299   The optional \var{module} argument provides the module to be tested.  It
 300   can be a module object or a (possibly dotted) module name.  If not
 301   specified, the module calling this function is used.
 302
 303   Example using one of the many ways that the \refmodule{unittest} module
 304   can use a \class{TestSuite}:
 305
 306   \begin{verbatim}
 307     import unittest
 308     import doctest
 309     import my_module_with_doctests
 310
 311     suite = doctest.DocTestSuite(my_module_with_doctests)
 312     runner = unittest.TextTestRunner()
 313     runner.run(suite)
 314   \end{verbatim}
 315
 316   \versionadded{2.3}
 317   \warning{This function does not currently search \code{M.__test__}
 318   and its search technique does not exactly match \function{testmod()} in
 319   every detail.  Future versions will bring the two into convergence.}
 320 \end{funcdesc}
 321
 322
 323 \subsection{How are Docstring Examples Recognized?}
 324
 325 In most cases a copy-and-paste of an interactive console session works
 326 fine---just make sure the leading whitespace is rigidly consistent
 327 (you can mix tabs and spaces if you're too lazy to do it right, but
 328 \module{doctest} is not in the business of guessing what you think a tab
 329 means).
 330
 331 \begin{verbatim}
 332 >>> # comments are ignored
 333 >>> x = 12
 334 >>> x
 335 12
 336 >>> if x == 13:
 337 ...     print "yes"
 338 ... else:
 339 ...     print "no"
 340 ...     print "NO"
 341 ...     print "NO!!!"
 342 ...
 343 no
 344 NO
 345 NO!!!
 346 >>>
 347 \end{verbatim}
 348
 349 Any expected output must immediately follow the final
 350 \code{'>\code{>}>~'} or \code{'...~'} line containing the code, and
 351 the expected output (if any) extends to the next \code{'>\code{>}>~'}
 352 or all-whitespace line.
 353
 354 The fine print:
 355
 356 \begin{itemize}
 357
 358 \item Expected output cannot contain an all-whitespace line, since such a
 359   line is taken to signal the end of expected output.
 360
 361 \item Output to stdout is captured, but not output to stderr (exception
 362   tracebacks are captured via a different means).
 363
 364 \item If you continue a line via backslashing in an interactive session, or
 365   for any other reason use a backslash, you need to double the backslash in
 366   the docstring version.  This is simply because you're in a string, and so
 367   the backslash must be escaped for it to survive intact.  Like:
 368
 369 \begin{verbatim}
 370 >>> if "yes" == \\
 371 ...     "y" +   \\
 372 ...     "es":
 373 ...     print 'yes'
 374 yes
 375 \end{verbatim}
 376
 377 \item The starting column doesn't matter:
 378
 379 \begin{verbatim}
 380   >>> assert "Easy!"
 381         >>> import math
 382             >>> math.floor(1.9)
 383             1.0
 384 \end{verbatim}
 385
 386 and as many leading whitespace characters are stripped from the
 387 expected output as appeared in the initial \code{'>\code{>}>~'} line
 388 that triggered it.
 389 \end{itemize}
 390
 391 \subsection{Warnings}
 392
 393 \begin{enumerate}
 394
 395 \item \module{doctest} is serious about requiring exact matches in expected
 396   output.  If even a single character doesn't match, the test fails.  This
 397   will probably surprise you a few times, as you learn exactly what Python
 398   does and doesn't guarantee about output.  For example, when printing a
 399   dict, Python doesn't guarantee that the key-value pairs will be printed
 400   in any particular order, so a test like
 401
 402 % Hey! What happened to Monty Python examples?
 403 % Tim: ask Guido -- it's his example!
 404 \begin{verbatim}
 405 >>> foo()
 406 {"Hermione": "hippogryph", "Harry": "broomstick"}
 407 >>>
 408 \end{verbatim}
 409
 410 is vulnerable!  One workaround is to do
 411
 412 \begin{verbatim}
 413 >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
 414 1
 415 >>>
 416 \end{verbatim}
 417
 418 instead.  Another is to do
 419
 420 \begin{verbatim}
 421 >>> d = foo().items()
 422 >>> d.sort()
 423 >>> d
 424 [('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
 425 \end{verbatim}
 426
 427 There are others, but you get the idea.
 428
 429 Another bad idea is to print things that embed an object address, like
 430
 431 \begin{verbatim}
 432 >>> id(1.0) # certain to fail some of the time
 433 7948648
 434 >>>
 435 \end{verbatim}
 436
 437 Floating-point numbers are also subject to small output variations across
 438 platforms, because Python defers to the platform C library for float
 439 formatting, and C libraries vary widely in quality here.
 440
 441 \begin{verbatim}
 442 >>> 1./7  # risky
 443 0.14285714285714285
 444 >>> print 1./7 # safer
 445 0.142857142857
 446 >>> print round(1./7, 6) # much safer
 447 0.142857
 448 \end{verbatim}
 449
 450 Numbers of the form \code{I/2.**J} are safe across all platforms, and I
 451 often contrive doctest examples to produce numbers of that form:
 452
 453 \begin{verbatim}
 454 >>> 3./4  # utterly safe
 455 0.75
 456 \end{verbatim}
 457
 458 Simple fractions are also easier for people to understand, and that makes
 459 for better documentation.
 460
 461 \item Be careful if you have code that must only execute once.
 462
 463 If you have module-level code that must only execute once, a more foolproof
 464 definition of \function{_test()} is
 465
 466 \begin{verbatim}
 467 def _test():
 468     import doctest, sys
 469     doctest.testmod()
 470 \end{verbatim}
 471
 472 \item WYSIWYG isn't always the case, starting in Python 2.3.  The
 473   string form of boolean results changed from \code{'0'} and
 474   \code{'1'} to \code{'False'} and \code{'True'} in Python 2.3.
 475   This makes it clumsy to write a doctest showing boolean results that
 476   passes under multiple versions of Python.  In Python 2.3, by default,
 477   and as a special case, if an expected output block consists solely
 478   of \code{'0'} and the actual output block consists solely of
 479   \code{'False'}, that's accepted as an exact match, and similarly for
 480   \code{'1'} versus \code{'True'}.  This behavior can be turned off by
 481   passing the new (in 2.3) module constant
 482   \constant{DONT_ACCEPT_TRUE_FOR_1} as the value of \function{testmod()}'s
 483   new (in 2.3) optional \var{optionflags} argument.  Some years after
 484   the integer spellings of booleans are history, this hack will
 485   probably be removed again.
 486
 487 \end{enumerate}
 488
 489
 490 \subsection{Soapbox}
 491
 492 The first word in ``doctest'' is ``doc,'' and that's why the author
 493 wrote \refmodule{doctest}: to keep documentation up to date.  It so
 494 happens that \refmodule{doctest} makes a pleasant unit testing
 495 environment, but that's not its primary purpose.
 496
 497 Choose docstring examples with care.  There's an art to this that
 498 needs to be learned---it may not be natural at first.  Examples should
 499 add genuine value to the documentation.  A good example can often be
 500 worth many words.  If possible, show just a few normal cases, show
 501 endcases, show interesting subtle cases, and show an example of each
 502 kind of exception that can be raised.  You're probably testing for
 503 endcases and subtle cases anyway in an interactive shell:
 504 \refmodule{doctest} wants to make it as easy as possible to capture
 505 those sessions, and will verify they continue to work as designed
 506 forever after.
 507
 508 If done with care, the examples will be invaluable for your users, and
 509 will pay back the time it takes to collect them many times over as the
 510 years go by and things change.  I'm still amazed at how often one of
 511 my \refmodule{doctest} examples stops working after a ``harmless''
 512 change.
 513
 514 For exhaustive testing, or testing boring cases that add no value to the
 515 docs, define a \code{__test__} dict instead.  That's what it's for.