\documentclass{howto}
% $Id$

% TODO:
% Go through and get the contributor's name for all the various changes

\title{What's New in Python 2.3}
\release{0.02}
\author{A.M. Kuchling}
\authoraddress{\email{akuchlin@mems-exchange.org}}

\begin{document}
\maketitle
\tableofcontents

% Timeout sockets:
% Executive summary: after sock.settimeout(T), all methods of sock will
% block for at most T floating seconds and fail if they can't complete
% within that time. sock.settimeout(None) restores full blocking mode.

% Optik (or whatever it gets called)

% getopt.gnu_getopt

% Docstrings now optional (with --without-doc-strings)

% New dependency argument to distutils.Extension

%\section{Introduction \label{intro}}

{\large This article is a draft, and is currently up to date for some
random version of the CVS tree around May 26 2002. Please send any
additions, comments or errata to the author.}

This article explains the new features in Python 2.3. The tentative
release date of Python 2.3 is currently scheduled for August 30 2002.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.3,
such as the
\citetitle[http://www.python.org/doc/2.3/lib/lib.html]{Python Library
Reference} and the
\citetitle[http://www.python.org/doc/2.3/ref/ref.html]{Python
Reference Manual}. If you want to understand the complete
implementation and design rationale for a change, refer to the PEP for
a particular new feature.

%======================================================================
\section{PEP 255: Simple Generators\label{section-generators}}

In Python 2.2, generators were added as an optional feature, to be
enabled by a \code{from __future__ import generators} directive. In
2.3 generators no longer need to be specially enabled, and are now
always present; this means that \keyword{yield} is now always a
keyword. The rest of this section is a copy of the description of
generators from the ``What's New in Python 2.2'' document; if you read
it when 2.2 came out, you can skip the rest of this section.

You're doubtless familiar with how function calls work in Python or C.
When you call a function, it gets a private namespace where its local
variables are created. When the function reaches a \keyword{return}
statement, the local variables are destroyed and the resulting value
is returned to the caller. A later call to the same function will get
a fresh new set of local variables. But, what if the local variables
weren't thrown away on exiting a function? What if you could later
resume the function where it left off? This is what generators
provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

\begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i
\end{verbatim}

A new keyword, \keyword{yield}, was introduced for generators. Any
function containing a \keyword{yield} statement is a generator
function; this is detected by Python's bytecode compiler which
compiles the function specially as a result.

When you call a generator function, it doesn't return a single value;
instead it returns a generator object that supports the iterator
protocol. On executing the \keyword{yield} statement, the generator
outputs the value of \code{i}, similar to a \keyword{return}
statement. The big difference between \keyword{yield} and a
\keyword{return} statement is that on reaching a \keyword{yield} the
generator's state of execution is suspended and local variables are
preserved. On the next call to the generator's \code{.next()} method,
the function will resume executing immediately after the
\keyword{yield} statement. (For complicated reasons, the
\keyword{yield} statement isn't allowed inside the \keyword{try} block
of a \code{try...finally} statement; read \pep{255} for a full
explanation of the interaction between \keyword{yield} and
exceptions.)

Here's a sample usage of the \function{generate_ints} generator:

\begin{verbatim}
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in generate_ints
StopIteration
\end{verbatim}

You could equally write \code{for i in generate_ints(5)}, or
\code{a,b,c = generate_ints(3)}.

Inside a generator function, the \keyword{return} statement can only
be used without a value, and signals the end of the procession of
values; afterwards the generator cannot return any further values.
\keyword{return} with a value, such as \code{return 5}, is a syntax
error inside a generator function. The end of the generator's results
can also be indicated by raising \exception{StopIteration} manually,
or by just letting the flow of execution fall off the bottom of the
function.

You could achieve the effect of generators manually by writing your
own class and storing all the local variables of the generator as
instance variables. For example, returning a list of integers could
be done by setting \code{self.count} to 0, and having the
\method{next()} method increment \code{self.count} and return it.
However, for a moderately complicated generator, writing a
corresponding class would be much messier.
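
As a rough sketch of the hand-written alternative described above, the
class below mimics \function{generate_ints()} by keeping the loop
counter in \code{self.count}. The class name \class{IntCounter} is
hypothetical, and the \code{__next__} alias is an assumption added only
so that the sketch also runs on interpreters newer than 2.3.

```python
class IntCounter:
    """Hand-written stand-in for generate_ints(N); the name is hypothetical."""

    def __init__(self, N):
        self.N = N
        self.count = 0              # the generator's local state lives here

    def __iter__(self):
        return self                 # the object is its own iterator

    def next(self):                 # iterator method as spelled in Python 2.3
        if self.count >= self.N:
            raise StopIteration
        value = self.count
        self.count += 1
        return value

    __next__ = next                 # assumption: alias for later Python versions


def generate_ints(N):
    # The generator version from the text, for comparison.
    for i in range(N):
        yield i


print(list(IntCounter(3)))          # [0, 1, 2]
print(list(generate_ints(3)))       # [0, 1, 2]
```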
\file{Lib/test/test_generators.py} contains a number of more
interesting examples. The simplest one implements an in-order
traversal of a tree using generators recursively.

\begin{verbatim}
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
\end{verbatim}

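
To try the traversal out, something like the following would do. The
\class{Tree} class here is a hypothetical minimal node type written for
this sketch, not the one used in the test suite; it only needs
\member{left}, \member{right} and \member{label} attributes.

```python
class Tree:
    """Minimal binary tree node (hypothetical, for illustration only)."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def inorder(t):
    # Same recursive generator as in the text; None ends the recursion.
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x


#       B
#      / \
#     A   C
root = Tree("B", Tree("A"), Tree("C"))
labels = list(inorder(root))
print(labels)                       # ['A', 'B', 'C']
```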
Two other examples in \file{Lib/test/test_generators.py} produce
solutions for the N-Queens problem (placing $N$ queens on an
$N \times N$ chessboard so that no queen threatens another) and the
Knight's Tour (a route that takes a knight to every square of an
$N \times N$ chessboard without visiting any square twice).

The idea of generators comes from other programming languages,
especially Icon (\url{http://www.cs.arizona.edu/icon/}), where
generators are central. In Icon, every
expression and function call behaves like a generator. One example
from ``An Overview of the Icon Programming Language'' at
\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
what this looks like:

\begin{verbatim}
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
\end{verbatim}

In Icon the \function{find()} function returns the indexes at which the
substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
\code{i} is first assigned a value of 3, but 3 is less than 5, so the
comparison fails, and Icon retries it with the second value of 23. 23
is greater than 5, so the comparison now succeeds, and the code prints
the value 23 to the screen.

Python doesn't go nearly as far as Icon in adopting generators as a
central concept. Generators are considered a new part of the core
Python language, but learning or using them isn't compulsory; if they
don't solve any problems that you have, feel free to ignore them.
One novel feature of Python's interface as compared to
Icon's is that a generator's state is represented as a concrete object
(the iterator) that can be passed around to other functions or stored
in a data structure.

\begin{seealso}

\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
and Tim Peters, with other fixes from the Python Labs crew.}

\end{seealso}

%======================================================================
\section{PEP 278: Universal Newline Support}

The three major operating systems used today are Microsoft Windows,
Apple's Macintosh OS, and the various \UNIX\ derivatives. A minor
irritation is that these three platforms all use different characters
to mark the ends of lines in text files. \UNIX\ uses character 10,
the ASCII linefeed, while MacOS uses character 13, the ASCII carriage
return, and Windows uses a two-character sequence of a carriage return
plus a newline.

Python's file objects can now support end of line conventions other
than the one followed by the platform on which Python is running.
Opening a file with the mode \samp{U} or \samp{rU} will open a file
for reading in universal newline mode. All three line ending
conventions will be translated to a \samp{\e n} in the strings
returned by the various file methods such as \method{read()} and
\method{readline()}.

Universal newline support is also used when importing modules and when
executing a file with the \function{execfile()} function. This means
that Python modules can be shared between all three operating systems
without needing to convert the line-endings.

This feature can be disabled at compile-time by specifying
\longprogramopt{without-universal-newlines} when running Python's
\file{configure} script.

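
The translation performed in universal newline mode can be illustrated
with a file that mixes all three conventions. This is only a sketch:
it is written for a current interpreter, where text mode with
\code{newline=None} applies the same translation, and the file contents
are made up. Under Python 2.3 the corresponding call would be
\code{open(path, "rU")}.

```python
import os
import tempfile

# A made-up file that mixes Unix, Mac and Windows line endings.
raw = b"unix\nmac\rwindows\r\nlast"
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(raw)

# newline=None turns on universal newline translation when reading.
# In Python 2.3 this would be spelled open(path, "rU").
with open(path, "r", newline=None) as f:
    lines = f.readlines()
os.remove(path)

print(lines)    # ['unix\n', 'mac\n', 'windows\n', 'last']
```

Every line ending comes back as a single \samp{\e n}, whichever
convention was used in the file.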
\begin{seealso}

\seepep{278}{Universal Newline Support}{Written
and implemented by Jack Jansen.}

\end{seealso}

%======================================================================
\section{PEP 279: The \function{enumerate()} Built-in Function}

A new built-in function, \function{enumerate()}, will make
certain loops a bit clearer. \code{enumerate(thing)}, where
\var{thing} is either an iterator or a sequence, returns an iterator
that will return \code{(0, \var{thing[0]})}, \code{(1,
\var{thing[1]})}, \code{(2, \var{thing[2]})}, and so forth. Fairly
often you'll see code to change every element of a list that looks
like this:

\begin{verbatim}
for i in range(len(L)):
    item = L[i]
    # ... compute some result based on item ...
    L[i] = result
\end{verbatim}

This can be rewritten using \function{enumerate()} as:

\begin{verbatim}
for i, item in enumerate(L):
    # ... compute some result based on item ...
    L[i] = result
\end{verbatim}

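
For a concrete version of the loop above, the following sketch squares
every element of a list in place. The list contents and the choice of
squaring as the ``computed result'' are arbitrary.

```python
L = [1, 2, 3, 4]
for i, item in enumerate(L):
    result = item * item            # stand-in for "compute some result"
    L[i] = result
print(L)                            # [1, 4, 9, 16]

# enumerate() works on any iterable, not just lists.
pairs = list(enumerate("abc"))
print(pairs)                        # [(0, 'a'), (1, 'b'), (2, 'c')]
```

Replacing elements by index while looping is safe here because the
length of the list never changes.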

\begin{seealso}

\seepep{279}{The enumerate() built-in function}{Written
by Raymond D. Hettinger.}

\end{seealso}

%======================================================================
\section{PEP 285: The \class{bool} Type\label{section-bool}}

A Boolean type was added to Python 2.3. Two new constants were added
to the \module{__builtin__} module, \constant{True} and
\constant{False}. The type object for this new type is named
\class{bool}; the constructor for it takes any Python value and
converts it to \constant{True} or \constant{False}.

\begin{verbatim}
>>> bool(1)
True
>>> bool(0)
False
>>> bool([])
False
>>> bool( (1,) )
True
\end{verbatim}

Most of the standard library modules and built-in functions have been
changed to return Booleans.

\begin{verbatim}
>>> obj = []
>>> hasattr(obj, 'append')
True
>>> isinstance(obj, list)
True
>>> isinstance(obj, tuple)
False
\end{verbatim}

Python's Booleans were added with the primary goal of making code
clearer. For example, if you're reading a function and encounter the
statement \code{return 1}, you might wonder whether the \samp{1}
represents a truth value, or whether it's an index, or whether it's a
coefficient that multiplies some other quantity. If the statement is
\code{return True}, however, the meaning of the return value is quite
clearly a truth value.

Python's Booleans were not added for the sake of strict type-checking.
A very strict language such as Pascal would also prevent you
performing arithmetic with Booleans, and would require that the
expression in an \keyword{if} statement always evaluate to a Boolean.
Python is not this strict, and it never will be. (\pep{285}
explicitly says so.) So you can still use any expression in an
\keyword{if}, even ones that evaluate to a list or tuple or some
random object, and the Boolean type is a subclass of the
\class{int} class, so arithmetic using a Boolean still works.

\begin{verbatim}
>>> True + 1
2
>>> False + 1
1
>>> False * 75
0
>>> True * 75
75
\end{verbatim}

To sum up \constant{True} and \constant{False} in a sentence: they're
alternative ways to spell the integer values 1 and 0, with the single
difference that \function{str()} and \function{repr()} return the
strings \samp{True} and \samp{False} instead of \samp{1} and \samp{0}.

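
The summary above can be checked directly. The short sketch below
(variable names are arbitrary) shows that Booleans compare equal to 1
and 0, are instances of \class{int}, and differ only in how they are
displayed.

```python
# Booleans are integers in every respect except their string form.
print(isinstance(True, int))        # True
print(True == 1, False == 0)        # True True
print(str(True), repr(False))       # True False

# Because they are integers, they can be summed to count matches.
flags = [True, False, True, True]
count = sum(flags)
print(count)                        # 3

# Any object can still be used where a truth value is expected.
empty_is_false = not []
print(empty_is_false)               # True
```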
\begin{seealso}

\seepep{285}{Adding a bool type}{Written and implemented by GvR.}

\end{seealso}

\section{Extended Slices\label{extended-slices}}

Ever since Python 1.4 the slice syntax has supported a third
``stride'' argument, but the built-in sequence types have not supported
this feature (it was initially included at the behest of the
developers of the Numerical Python package). This changes with Python
2.3.

% XXX examples, etc.
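
A minimal sketch of what the stride argument allows, assuming that
built-in lists, tuples and strings all accept it as the paragraph above
implies. The sample values are arbitrary.

```python
L = list(range(10))

evens = L[::2]                  # every second element
print(evens)                    # [0, 2, 4, 6, 8]

backwards = L[::-1]             # a negative stride walks from the end
print(backwards)                # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# Strings and tuples take the same syntax.
s = "abcdef"[::-2]
print(s)                        # fdb
t = (1, 2, 3, 4)[1::2]
print(t)                        # (2, 4)

# For lists, a strided slice can also be deleted.
del L[::3]
print(L)                        # [1, 2, 4, 5, 7, 8]
```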

%======================================================================
%\section{Other Language Changes}

%Here are the changes that Python 2.3 makes to the core language.

%\begin{itemize}
%\item The \keyword{yield} statement is now always a keyword, as
%described in section~\ref{section-generators}.

%\item Two new constants, \constant{True} and \constant{False} were
%added along with the built-in \class{bool} type, as described in
%section~\ref{section-bool}.

%\item
%\end{itemize}


%\begin{PendingDeprecationWarning}
A new warning, \exception{PendingDeprecationWarning}, was added to
provide direction on features which are in the process of being
deprecated. The warning will not be printed by default. To see the
pending deprecations, use
\programopt{-Walways::PendingDeprecationWarning::} on the command line
or call \function{warnings.filterwarnings()}.
%\end{PendingDeprecationWarning}
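
As a sketch of the programmatic route, the code below turns the warning
on with \function{warnings.filterwarnings()} and then triggers it. The
function \function{old_api()} is hypothetical, and
\code{warnings.catch_warnings} is used only to capture the result for
display; it is an assumption of this sketch and is not available in
Python 2.3.

```python
import warnings


def old_api():
    """Hypothetical function that is on its way to being deprecated."""
    warnings.warn("old_api() will be deprecated", PendingDeprecationWarning)
    return 42


# catch_warnings is a later addition to the module, used here only so the
# sketch can inspect what was issued instead of printing to stderr.
with warnings.catch_warnings(record=True) as caught:
    # Equivalent in spirit to -Walways::PendingDeprecationWarning::
    warnings.filterwarnings("always", category=PendingDeprecationWarning)
    value = old_api()

print(value)                            # 42
print(caught[0].category.__name__)      # PendingDeprecationWarning
```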

%======================================================================
\section{Specialized Object Allocator (pymalloc)\label{section-pymalloc}}

An experimental feature added to Python 2.1 was a specialized object
allocator called pymalloc, written by Vladimir Marangozov. Pymalloc
was intended to be faster than the system \function{malloc()} and have
less memory overhead for typical allocation patterns of Python
programs. The allocator uses C's \function{malloc()} function to get
large pools of memory, and then fulfills smaller memory requests from
these pools.

In 2.1 and 2.2, pymalloc was an experimental feature and wasn't
enabled by default; you had to explicitly turn it on by providing the
\longprogramopt{with-pymalloc} option to the \program{configure}
script. In 2.3, pymalloc has had further enhancements and is now
enabled by default; you'll have to supply
\longprogramopt{without-pymalloc} to disable it.

This change is transparent to code written in Python; however,
pymalloc may expose bugs in C extensions. Authors of C extension
modules should test their code with the object allocator enabled,
because some incorrect code may cause core dumps at runtime. There
are a bunch of memory allocation functions in Python's C API that have
previously been just aliases for the C library's \function{malloc()}
and \function{free()}, meaning that if you accidentally called
mismatched functions, the error wouldn't be noticeable. When the
object allocator is enabled, these functions aren't aliases of
\function{malloc()} and \function{free()} any more, and calling the
wrong function to free memory may get you a core dump. For example,
if memory was allocated using \function{PyObject_Malloc()}, it has to
be freed using \function{PyObject_Free()}, not \function{free()}. A
few modules included with Python fell afoul of this and had to be
fixed; doubtless there are more third-party modules that will have the
same problem.

As part of this change, the confusing multiple interfaces for
allocating memory have been consolidated down into two API families.
Memory allocated with one family must not be manipulated with
functions from the other family.

There is another family of functions specifically for allocating
Python \emph{objects} (as opposed to memory).

\begin{itemize}
  \item To allocate and free an undistinguished chunk of memory use
  the ``raw memory'' family: \cfunction{PyMem_Malloc()},
  \cfunction{PyMem_Realloc()}, and \cfunction{PyMem_Free()}.

  \item The ``object memory'' family is the interface to the pymalloc
  facility described above and is biased towards a large number of
  ``small'' allocations: \cfunction{PyObject_Malloc()},
  \cfunction{PyObject_Realloc()}, and \cfunction{PyObject_Free()}.

  \item To allocate and free Python objects, use the ``object'' family:
  \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()}, and
  \cfunction{PyObject_Del()}.
\end{itemize}

Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides
debugging features to catch memory overwrites and doubled frees in
both extension modules and in the interpreter itself. To enable this
support, turn on the Python interpreter's debugging code by running
\program{configure} with \longprogramopt{with-pydebug}.

To aid extension writers, a header file \file{Misc/pymemcompat.h} is
distributed with the source to Python 2.3 that allows Python
extensions to use the 2.3 interfaces to memory allocation and compile
against any version of Python since 1.5.2. (The idea is that you take
the file from Python's source distribution and bundle it with the
source of your extension.)

\begin{seealso}

\seeurl{http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/obmalloc.c}
{For the full details of the pymalloc implementation, see
the comments at the top of the file \file{Objects/obmalloc.c} in the
Python source code. The above link points to the file within the
SourceForge CVS browser.}

\end{seealso}

%======================================================================
\section{New and Improved Modules}

As usual, Python's standard modules had a number of enhancements and
bug fixes. Here's a partial list; consult the \file{Misc/NEWS} file
in the source tree, or the CVS logs, for a more complete list.

\begin{itemize}

\item One minor but far-reaching change is that the names of extension
types defined by the modules included with Python now contain the
module and a \samp{.} in front of the type name. For example, in
Python 2.2, if you created a socket and printed its
\member{__class__}, you'd get this output:

\begin{verbatim}
>>> s = socket.socket()
>>> s.__class__
<type 'socket'>
\end{verbatim}

In 2.3, you get this:
\begin{verbatim}
>>> s.__class__
<type '_socket.socket'>
\end{verbatim}

\item The \method{strip()}, \method{lstrip()}, and \method{rstrip()}
string methods now have an optional argument for specifying the
characters to strip. The default is still to remove all whitespace
characters:

\begin{verbatim}
>>> ' abc '.strip()
'abc'
>>> '><><abc<><><>'.strip('<>')
'abc'
>>> '><><abc<><><>\n'.strip('<>')
'abc<><><>\n'
>>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
u'\u4001abc'
\end{verbatim}

\item The \method{startswith()} and \method{endswith()}
string methods now accept negative numbers for the
start and end parameters.

\item Another new string method is \method{zfill()}, originally a
function in the \module{string} module. \method{zfill()} pads a
numeric string with zeros on the left until it's the specified width.
Note that the \code{\%} operator is still more flexible and powerful
than \method{zfill()}.

\begin{verbatim}
>>> '45'.zfill(4)
'0045'
>>> '12345'.zfill(4)
'12345'
>>> 'goofy'.zfill(6)
'0goofy'
\end{verbatim}

\item Dictionaries have a new method, \method{pop(\var{key})}, that
returns the value corresponding to \var{key} and removes that
key/value pair from the dictionary. \method{pop()} will raise a
\exception{KeyError} if the requested key isn't present in the
dictionary:

\begin{verbatim}
>>> d = {1:2}
>>> d
{1: 2}
>>> d.pop(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 4
>>> d.pop(1)
2
>>> d.pop(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: pop(): dictionary is empty
>>> d
{}
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item Two new functions in the \module{math} module,
\function{degrees(\var{rads})} and \function{radians(\var{degs})},
convert between radians and degrees. Other functions in the
\module{math} module such as
\function{math.sin()} and \function{math.cos()} have always required
input values measured in radians. (Contributed by Raymond Hettinger.)

\item Three new functions, \function{getpgid()}, \function{killpg()},
and \function{mknod()}, were added to the \module{posix} module that
underlies the \module{os} module.

\item Two new binary packagers were added to the Distutils.
\code{bdist_pkgtool} builds \file{.pkg} files to use with Solaris
\program{pkgtool}, and \code{bdist_sdux} builds \program{swinstall}
packages for use on HP-UX. (Contributed by Mark Alexander.)

\item The \module{array} module now supports arrays of Unicode
characters using the \samp{u} format character. Arrays also
now support using the \code{+=} assignment operator to add another array's
contents, and the \code{*=} assignment operator to repeat an array.
(Contributed by Jason Orendorff.)

\item The \module{grp} module now returns enhanced tuples:

\begin{verbatim}
>>> import grp
>>> g = grp.getgrnam('amk')
>>> g.gr_name, g.gr_gid
('amk', 500)
\end{verbatim}

\item The \module{readline} module also gained a number of new
functions: \function{get_history_item()},
\function{get_current_history_length()}, and \function{redisplay()}.

\item Support for more advanced POSIX signal handling was added
to the \module{signal} module by adding the \function{sigpending()},
\function{sigprocmask()} and \function{sigsuspend()} functions, where
supported by the platform. These functions make it possible to avoid
some previously unavoidable race conditions.

\end{itemize}
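
Several of the items above come without an example. The following
sketch exercises three of them: negative positions for
\method{startswith()} and \method{endswith()}, the new \module{math}
conversions, and the in-place array operators. The sample values are
arbitrary, and the code is written so that it also runs on current
interpreters.

```python
import math
from array import array

# Negative start/end positions count from the end of the string.
s = "whatsnew23.tex"
print(s.startswith(".tex", -4))         # True: looks at s[-4:]
print(s.endswith("23", 0, -4))          # True: looks at s[0:-4]

# degrees() and radians() convert in opposite directions.
half_turn = math.degrees(math.pi)       # approximately 180.0
back_again = math.radians(half_turn)    # approximately math.pi

# += appends another array's contents, *= repeats the array in place.
a = array("i", [1, 2])
a += array("i", [3])
a *= 2
print(a.tolist())                       # [1, 2, 3, 1, 2, 3]
```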


% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process, and to the C API, include:

\begin{itemize}

\item Python can now optionally be built as a shared library
(\file{libpython2.3.so}) by supplying \longprogramopt{enable-shared}
when running Python's \file{configure} script. (Contributed by Ondrej
Palkovsky.)

\item The \cfunction{PyArg_NoArgs()} macro is now deprecated, and code
that uses it should be changed to use
\code{PyArg_ParseTuple(args, "")} instead.

\item A new function, \cfunction{PyObject_DelItemString(\var{mapping},
char *\var{key})}, was added as shorthand for
\code{PyObject_DelItem(\var{mapping}, PyString_FromString(\var{key}))}.

\item The source code for the Expat XML parser is now included with
the Python source, so the \module{pyexpat} module is no longer
dependent on having a system library containing Expat.

\item File objects now manage their internal string buffer
differently by increasing it exponentially when needed.
This results in the benchmark tests in \file{Lib/test/test_bufio.py}
speeding up from 57 seconds to 1.7 seconds, according to one
measurement.

\item It's now possible to define class and static methods for a C
extension type by setting either the \constant{METH_CLASS} or
\constant{METH_STATIC} flags in a method's \ctype{PyMethodDef}
structure.

\end{itemize}

\subsection{Port-Specific Changes}

Support for a port to IBM's OS/2 using the EMX runtime environment was
merged into the main Python source tree. EMX is a POSIX emulation
layer over the OS/2 system APIs. The Python port for EMX tries to
support all the POSIX-like capability exposed by the EMX runtime, and
mostly succeeds; \function{fork()} and \function{fcntl()} are
restricted by the limitations of the underlying emulation layer. The
standard OS/2 port, which uses IBM's Visual Age compiler, also gained
support for case-sensitive import semantics as part of the integration
of the EMX port into CVS. (Contributed by Andrew MacIntyre.)

On MacOS, most toolbox modules have been weaklinked to improve
backward compatibility. This means that modules will no longer fail
to load if a single routine is missing on the current OS version.
Instead, calling the missing routine will raise an exception.
(Contributed by Jack Jansen.)

The RPM spec files, found in the \file{Misc/RPM/} directory in the
Python source distribution, were updated for 2.3. (Contributed by
Sean Reifschneider.)


%======================================================================
\section{Other Changes and Fixes}

Finally, there are various miscellaneous fixes:

\begin{itemize}

\item The tools used to build the documentation now work under Cygwin
as well as \UNIX.

\end{itemize}


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Michael Chermside, Scott David Daniels, Fred~L. Drake, Jr.,
Detlef Lannert, Andrew MacIntyre.

\end{document}