Doc/lib/libdoctest.tex

   1 \section{\module{doctest} ---
   2          Test docstrings represent reality}
   3
   4 \declaremodule{standard}{doctest}
   5 \moduleauthor{Tim Peters}{tim_one@users.sourceforge.net}
   6 \sectionauthor{Tim Peters}{tim_one@users.sourceforge.net}
   7 \sectionauthor{Moshe Zadka}{moshez@debian.org}
   8
   9 \modulesynopsis{A framework for verifying examples in docstrings.}
  10
  11 The \module{doctest} module searches a module's docstrings for text that looks
  12 like an interactive Python session, then executes all such sessions to verify
  13 they still work exactly as shown.  Here's a complete but small example:
  14
  15 \begin{verbatim}
  16 """
  17 This is module example.
  18
  19 Example supplies one function, factorial.  For example,
  20
  21 >>> factorial(5)
  22 120
  23 """
  24
  25 def factorial(n):
  26     """Return the factorial of n, an exact integer >= 0.
  27
  28     If the result is small enough to fit in an int, return an int.
  29     Else return a long.
  30
  31     >>> [factorial(n) for n in range(6)]
  32     [1, 1, 2, 6, 24, 120]
  33     >>> [factorial(long(n)) for n in range(6)]
  34     [1, 1, 2, 6, 24, 120]
  35     >>> factorial(30)
  36     265252859812191058636308480000000L
  37     >>> factorial(30L)
  38     265252859812191058636308480000000L
  39     >>> factorial(-1)
  40     Traceback (most recent call last):
  41         ...
  42     ValueError: n must be >= 0
  43
  44     Factorials of floats are OK, but the float must be an exact integer:
  45     >>> factorial(30.1)
  46     Traceback (most recent call last):
  47         ...
  48     ValueError: n must be exact integer
  49     >>> factorial(30.0)
  50     265252859812191058636308480000000L
  51
  52     It must also not be ridiculously large:
  53     >>> factorial(1e100)
  54     Traceback (most recent call last):
  55         ...
  56     OverflowError: n too large
  57     """
  58
  59 \end{verbatim}
  60 % allow LaTeX to break here.
  61 \begin{verbatim}
  62
  63     import math
  64     if not n >= 0:
  65         raise ValueError("n must be >= 0")
  66     if math.floor(n) != n:
  67         raise ValueError("n must be exact integer")
  68     if n+1 == n:  # e.g., 1e300
  69         raise OverflowError("n too large")
  70     result = 1
  71     factor = 2
  72     while factor <= n:
  73         try:
  74             result *= factor
  75         except OverflowError:
  76             result *= long(factor)
  77         factor += 1
  78     return result
  79
  80 def _test():
  81     import doctest, example
  82     return doctest.testmod(example)
  83
  84 if __name__ == "__main__":
  85     _test()
  86 \end{verbatim}
  87
  88 If you run \file{example.py} directly from the command line, doctest works
  89 its magic:
  90
  91 \begin{verbatim}
  92 $ python example.py
  93 $
  94 \end{verbatim}
  95
  96 There's no output!  That's normal, and it means all the examples worked.
  97 Pass \programopt{-v} to the script, and doctest prints a detailed log
  98 of what it's trying, and prints a summary at the end:
  99
 100 \begin{verbatim}
 101 $ python example.py -v
 102 Running example.__doc__
 103 Trying: factorial(5)
 104 Expecting: 120
 105 ok
 106 0 of 1 examples failed in example.__doc__
 107 Running example.factorial.__doc__
 108 Trying: [factorial(n) for n in range(6)]
 109 Expecting: [1, 1, 2, 6, 24, 120]
 110 ok
 111 Trying: [factorial(long(n)) for n in range(6)]
 112 Expecting: [1, 1, 2, 6, 24, 120]
 113 ok
 114 Trying: factorial(30)
 115 Expecting: 265252859812191058636308480000000L
 116 ok
 117 \end{verbatim}
 118
 119 And so on, eventually ending with:
 120
 121 \begin{verbatim}
 122 Trying: factorial(1e100)
 123 Expecting:
 124 Traceback (most recent call last):
 125     ...
 126 OverflowError: n too large
 127 ok
 128 0 of 8 examples failed in example.factorial.__doc__
 129 2 items passed all tests:
 130    1 tests in example
 131    8 tests in example.factorial
 132 9 tests in 2 items.
 133 9 passed and 0 failed.
 134 Test passed.
 135 $
 136 \end{verbatim}
 137
 138 That's all you need to know to start making productive use of doctest!  Jump
 139 in.  The docstrings in doctest.py contain detailed information about all
 140 aspects of doctest, and we'll just cover the more important points here.
 141
 142 \subsection{Normal Usage}
 143
 144 In normal use, end each module \module{M} with:
 145
 146 \begin{verbatim}
 147 def _test():
 148     import doctest, M           # replace M with your module's name
 149     return doctest.testmod(M)   # ditto
 150
 151 if __name__ == "__main__":
 152     _test()
 153 \end{verbatim}
 154
 155 Then running the module as a script causes the examples in the docstrings
 156 to get executed and verified:
 157
 158 \begin{verbatim}
 159 python M.py
 160 \end{verbatim}
 161
 162 This won't display anything unless an example fails, in which case the
 163 failing example(s) and the cause(s) of the failure(s) are printed to stdout,
 164 and the final line of output is \code{'Test failed.'}.
 165
 166 Run it with the \programopt{-v} switch instead:
 167
 168 \begin{verbatim}
 169 python M.py -v
 170 \end{verbatim}
 171
 172 and a detailed report of all examples tried is printed to \code{stdout},
 173 along with assorted summaries at the end.
 174
 175 You can force verbose mode by passing \code{verbose=1} to testmod, or
 176 prohibit it by passing \code{verbose=0}.  In either of those cases,
 177 \code{sys.argv} is not examined by testmod.
 178
 179 In any case, testmod returns a 2-tuple of ints \code{(\var{f},
 180 \var{t})}, where \var{f} is the number of docstring examples that
 181 failed and \var{t} is the total number of docstring examples
 182 attempted.
 183
 184 \subsection{Which Docstrings Are Examined?}
 185
 186 See \file{docstring.py} for all the details.  They're unsurprising:  the
 187 module docstring, and all function, class and method docstrings are
 188 searched, with the exception of docstrings attached to objects with private
 189 names.
 190
 191 In addition, if \code{M.__test__} exists and "is true", it must be a
 192 dict, and each entry maps a (string) name to a function object, class
 193 object, or string.  Function and class object docstrings found from
 194 \code{M.__test__} are searched even if the name is private, and
 195 strings are searched directly as if they were docstrings.  In output,
 196 a key \code{K} in \code{M.__test__} appears with name
 197
 198 \begin{verbatim}
 199       <name of M>.__test__.K
 200 \end{verbatim}
 201
 202 Any classes found are recursively searched similarly, to test docstrings in
 203 their contained methods and nested classes.  While private names reached
 204 from \module{M}'s globals are skipped, all names reached from
 205 \code{M.__test__} are searched.
 206
 207 \subsection{What's the Execution Context?}
 208
 209 By default, each time testmod finds a docstring to test, it uses a
 210 {\em copy} of \module{M}'s globals, so that running tests on a module
 211 doesn't change the module's real globals, and so that one test in
 212 \module{M} can't leave behind crumbs that accidentally allow another test
 213 to work.  This means examples can freely use any names defined at top-level
 214 in \module{M}, and names defined earlier in the docstring being run.  It
 215 also means that sloppy imports (see below) can cause examples in external
 216 docstrings to use globals inappropriate for them.
 217
 218 You can force use of your own dict as the execution context by passing
 219 \code{globs=your_dict} to \function{testmod()} instead.  Presumably this
 220 would be a copy of \code{M.__dict__} merged with the globals from other
 221 imported modules.
 222
 223 \subsection{What About Exceptions?}
 224
 225 No problem, as long as the only output generated by the example is the
 226 traceback itself.  For example:
 227
 228 \begin{verbatim}
 229 >>> [1, 2, 3].remove(42)
 230 Traceback (most recent call last):
 231   File "<stdin>", line 1, in ?
 232 ValueError: list.remove(x): x not in list
 233 >>>
 234 \end{verbatim}
 235
 236 Note that only the exception type and value are compared (specifically,
 237 only the last line in the traceback).  The various ``File'' lines in
 238 between can be left out (unless they add significantly to the documentation
 239 value of the example).
 240
 241 \subsection{Advanced Usage}
 242
 243 \function{testmod()} actually creates a local instance of class
 244 \class{Tester}, runs appropriate methods of that class, and merges
 245 the results into global \class{Tester} instance \code{master}.
 246
 247 You can create your own instances of \class{Tester}, and so build your
 248 own policies, or even run methods of \code{master} directly.  See
 249 \code{Tester.__doc__} for details.
 250
 251
 252 \subsection{How are Docstring Examples Recognized?}
 253
 254 In most cases a copy-and-paste of an interactive console session works fine
 255 --- just make sure the leading whitespace is rigidly consistent (you can mix
 256 tabs and spaces if you're too lazy to do it right, but doctest is not in
 257 the business of guessing what you think a tab means).
 258
 259 \begin{verbatim}
 260 >>> # comments are ignored
 261 >>> x = 12
 262 >>> x
 263 12
 264 >>> if x == 13:
 265 ...     print "yes"
 266 ... else:
 267 ...     print "no"
 268 ...     print "NO"
 269 ...     print "NO!!!"
 270 ...
 271 no
 272 NO
 273 NO!!!
 274 >>>
 275 \end{verbatim}
 276
 277 Any expected output must immediately follow the final
 278 \code{'>\code{>}>~'} or \code{'...~'} line containing the code, and
 279 the expected output (if any) extends to the next \code{'>\code{>}>~'}
 280 or all-whitespace line.
 281
 282 The fine print:
 283
 284 \begin{itemize}
 285
 286 \item Expected output cannot contain an all-whitespace line, since such a
 287   line is taken to signal the end of expected output.
 288
 289 \item Output to stdout is captured, but not output to stderr (exception
 290   tracebacks are captured via a different means).
 291
 292 \item If you continue a line via backslashing in an interactive session, or
 293   for any other reason use a backslash, you need to double the backslash in
 294   the docstring version.  This is simply because you're in a string, and so
 295   the backslash must be escaped for it to survive intact.  Like:
 296
 297 \begin{verbatim}
 298 >>> if "yes" == \\
 299 ...     "y" +   \\
 300 ...     "es":
 301 ...     print 'yes'
 302 yes
 303 \end{verbatim}
 304
 305 \item The starting column doesn't matter:
 306
 307 \begin{verbatim}
 308   >>> assert "Easy!"
 309         >>> import math
 310             >>> math.floor(1.9)
 311             1.0
 312 \end{verbatim}
 313
 314 and as many leading whitespace characters are stripped from the
 315 expected output as appeared in the initial \code{'>\code{>}>~'} line
 316 that triggered it.
 317 \end{itemize}
 318
 319 \subsection{Warnings}
 320
 321 \begin{enumerate}
 322
 323 \item Sloppy imports can cause trouble; e.g., if you do
 324
 325 \begin{verbatim}
 326 from XYZ import XYZclass
 327 \end{verbatim}
 328
 329 then \class{XYZclass} is a name in \code{M.__dict__} too, and doctest
 330 has no way to know that \class{XYZclass} wasn't \emph{defined} in
 331 \module{M}.  So it may try to execute the examples in
 332 \class{XYZclass}'s docstring, and those in turn may require a
 333 different set of globals to work correctly.  I prefer to do
 334 ``\code{import *}''-friendly imports, a la
 335
 336 \begin{verbatim}
 337 from XYZ import XYZclass as _XYZclass
 338 \end{verbatim}
 339
 340 and then the leading underscore makes \class{_XYZclass} a private name so
 341 testmod skips it by default.  Other approaches are described in
 342 \file{doctest.py}.
 343
 344 \item \module{doctest} is serious about requiring exact matches in expected
 345   output.  If even a single character doesn't match, the test fails.  This
 346   will probably surprise you a few times, as you learn exactly what Python
 347   does and doesn't guarantee about output.  For example, when printing a
 348   dict, Python doesn't guarantee that the key-value pairs will be printed
 349   in any particular order, so a test like
 350
 351 % Hey! What happened to Monty Python examples?
 352 % Tim: ask Guido -- it's his example!
 353 \begin{verbatim}
 354 >>> foo()
 355 {"Hermione": "hippogryph", "Harry": "broomstick"}
 356 >>>
 357 \end{verbatim}
 358
 359 is vulnerable!  One workaround is to do
 360
 361 \begin{verbatim}
 362 >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
 363 1
 364 >>>
 365 \end{verbatim}
 366
 367 instead.  Another is to do
 368
 369 \begin{verbatim}
 370 >>> d = foo().items()
 371 >>> d.sort()
 372 >>> d
 373 [('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
 374 \end{verbatim}
 375
 376 There are others, but you get the idea.
 377
 378 Another bad idea is to print things that embed an object address, like
 379
 380 \begin{verbatim}
 381 >>> id(1.0) # certain to fail some of the time
 382 7948648
 383 >>>
 384 \end{verbatim}
 385
 386 Floating-point numbers are also subject to small output variations across
 387 platforms, because Python defers to the platform C library for float
 388 formatting, and C libraries vary widely in quality here.
 389
 390 \begin{verbatim}
 391 >>> 1./7  # risky
 392 0.14285714285714285
 393 >>> print 1./7 # safer
 394 0.142857142857
 395 >>> print round(1./7, 6) # much safer
 396 0.142857
 397 \end{verbatim}
 398
 399 Numbers of the form \code{I/2.**J} are safe across all platforms, and I
 400 often contrive doctest examples to produce numbers of that form:
 401
 402 \begin{verbatim}
 403 >>> 3./4  # utterly safe
 404 0.75
 405 \end{verbatim}
 406
 407 Simple fractions are also easier for people to understand, and that makes
 408 for better documentation.
 409 \end{enumerate}
 410
 411
 412 \subsection{Soapbox}
 413
 414 The first word in doctest is "doc", and that's why the author wrote
 415 doctest:  to keep documentation up to date.  It so happens that doctest
 416 makes a pleasant unit testing environment, but that's not its primary
 417 purpose.
 418
 419 Choose docstring examples with care.  There's an art to this that needs to
 420 be learned --- it may not be natural at first.  Examples should add genuine
 421 value to the documentation.  A good example can often be worth many words.
 422 If possible, show just a few normal cases, show endcases, show interesting
 423 subtle cases, and show an example of each kind of exception that can be
 424 raised.  You're probably testing for endcases and subtle cases anyway in an
 425 interactive shell:  doctest wants to make it as easy as possible to capture
 426 those sessions, and will verify they continue to work as designed forever
 427 after.
 428
 429 If done with care, the examples will be invaluable for your users, and will
 430 pay back the time it takes to collect them many times over as the years go
 431 by and "things change".  I'm still amazed at how often one of my doctest
 432 examples stops working after a "harmless" change.
 433
 434 For exhaustive testing, or testing boring cases that add no value to the
 435 docs, define a \code{__test__} dict instead.  That's what it's for.