\documentstyle[twoside,11pt,myformat]{report}

% XXX PM Modulator

\title{Extending and Embedding the Python Interpreter}

\input{boilerplate}

% Tell \index to actually write the .idx file
\makeindex

\begin{document}

\pagenumbering{roman}

\maketitle

\input{copyright}

\begin{abstract}

\noindent
Python is an interpreted, object-oriented programming language.  This
document describes how to write modules in C or \Cpp{} to extend the
Python interpreter with new modules.  Those modules can define new
functions but also new object types and their methods.  The document
also describes how to embed the Python interpreter in another
application, for use as an extension language.  Finally, it shows how
to compile and link extension modules so that they can be loaded
dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.

This document assumes basic knowledge about Python.  For an informal
introduction to the language, see the Python Tutorial.  The Python
Reference Manual gives a more formal definition of the language.  The
Python Library Reference documents the existing object types,
functions and modules (both built-in and written in Python) that give
the language its wide application range.

\end{abstract}

\pagebreak

{
\parskip = 0mm
\tableofcontents
}

\pagebreak

\pagenumbering{arabic}

\chapter{Extending Python with C or \Cpp{} code}


\section{Introduction}

It is quite easy to add new built-in modules to Python, if you know
how to program in C.  Such \dfn{extension modules} can do two things
that can't be done directly in Python: they can implement new built-in
object types, and they can call C library functions and system calls.

To support extensions, the Python API (Application Programmers
Interface) defines a set of functions, macros and variables that
provide access to most aspects of the Python run-time system.  The
Python API is incorporated in a C source file by including the header
\code{"Python.h"}.

The compilation of an extension module depends on its intended use as
well as on your system setup; details are given in a later section.


\section{A Simple Example}

Let's create an extension module called \samp{spam} (the favorite food
of Monty Python fans...) and let's say we want to create a Python
interface to the C library function \code{system()}.\footnote{An
interface for this function already exists in the standard module
\code{os} --- it was chosen as a simple and straightforward example.}
This function takes a null-terminated character string as argument and
returns an integer.  We want this function to be callable from Python
as follows:

\begin{verbatim}
    >>> import spam
    >>> status = spam.system("ls -l")
\end{verbatim}

Begin by creating a file \file{spammodule.c}.  (In general, if a
module is called \samp{spam}, the C file containing its implementation
is called \file{spammodule.c}; if the module name is very long, like
\samp{spammify}, the file name can be just \file{spammify.c}.)

The first line of our file can be:

\begin{verbatim}
    #include "Python.h"
\end{verbatim}

which pulls in the Python API (you can add a comment describing the
purpose of the module and a copyright notice if you like).

All user-visible symbols defined by \code{"Python.h"} have a prefix of
\samp{Py} or \samp{PY}, except those defined in standard header files.
For convenience, and since they are used extensively by the Python
interpreter, \code{"Python.h"} includes a few standard header files:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>}, and
\code{<stdlib.h>}.  If the latter header file does not exist on your
system, it declares the functions \code{malloc()}, \code{free()} and
\code{realloc()} directly.

The next thing we add to our module file is the C function that will
be called when the Python expression \samp{spam.system(\var{string})}
is evaluated (we'll see shortly how it ends up being called):

\begin{verbatim}
    static PyObject *
    spam_system(self, args)
        PyObject *self;
        PyObject *args;
    {
        char *command;
        int sts;
        if (!PyArg_ParseTuple(args, "s", &command))
            return NULL;
        sts = system(command);
        return Py_BuildValue("i", sts);
    }
\end{verbatim}

There is a straightforward translation from the argument list in
Python (e.g.\ the single expression \code{"ls -l"}) to the arguments
passed to the C function.  The C function always has two arguments,
conventionally named \var{self} and \var{args}.

The \var{self} argument is only used when the C function implements a
built-in method.  This will be discussed later.  In the example,
\var{self} will always be a \code{NULL} pointer, since we are defining
a function, not a method.  (This is done so that the interpreter
doesn't have to understand two different types of C functions.)

The \var{args} argument will be a pointer to a Python tuple object
containing the arguments.  Each item of the tuple corresponds to an
argument in the call's argument list.  The arguments are Python
objects; in order to do anything with them in our C function we have
to convert them to C values.  The function \code{PyArg_ParseTuple()}
in the Python API checks the argument types and converts them to C
values.  It uses a template string to determine the required types of
the arguments as well as the types of the C variables into which to
store the converted values.  More about this later.

\code{PyArg_ParseTuple()} returns true (nonzero) if all arguments have
the right type and their components have been stored in the variables
whose addresses are passed.  It returns false (zero) if an invalid
argument list was passed.  In the latter case it also raises an
appropriate exception, so the calling function can return
\code{NULL} immediately (as we saw in the example).


\section{Intermezzo: Errors and Exceptions}

An important convention throughout the Python interpreter is the
following: when a function fails, it should set an exception condition
and return an error value (usually a \code{NULL} pointer).  Exceptions
are stored in a static global variable inside the interpreter; if this
variable is \code{NULL} no exception has occurred.  A second global
variable stores the ``associated value'' of the exception (the second
argument to \code{raise}).  A third variable contains the stack
traceback in case the error originated in Python code.  These three
variables are the C equivalents of the Python variables
\code{sys.exc_type}, \code{sys.exc_value} and \code{sys.exc_traceback}
(see the section on module \code{sys} in the Library Reference
Manual).  It is important to know about them to understand how errors
are passed around.

The Python API defines a number of functions to set various types of
exceptions.

The most common one is \code{PyErr_SetString()}.  Its arguments are an
exception object and a C string.  The exception object is usually a
predefined object like \code{PyExc_ZeroDivisionError}.  The C string
indicates the cause of the error and is converted to a Python string
object and stored as the ``associated value'' of the exception.

Another useful function is \code{PyErr_SetFromErrno()}, which only
takes an exception argument and constructs the associated value by
inspection of the (\UNIX{}) global variable \code{errno}.  The most
general function is \code{PyErr_SetObject()}, which takes two object
arguments, the exception and its associated value.  You don't need to
\code{Py_INCREF()} the objects passed to any of these functions.

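
The three functions just described might be used as in the following
minimal sketch.  The helper names, the messages and the choice of
exceptions are invented for illustration, and the helpers use
ANSI-style prototypes rather than the old-style definitions shown
elsewhere in this document:

\begin{verbatim}
    #include "Python.h"
    #include <stdio.h>

    /* Hypothetical helper: report a fixed error message. */
    static PyObject *
    fail_with_message(void)
    {
        PyErr_SetString(PyExc_ZeroDivisionError, "spam: division by zero");
        return NULL;
    }

    /* Hypothetical helper: open a file for reading; on failure the
       associated value is constructed from errno. */
    static FILE *
    open_or_fail(const char *path)
    {
        FILE *fp = fopen(path, "r");
        if (fp == NULL)
            PyErr_SetFromErrno(PyExc_IOError);
        return fp;
    }

    /* Hypothetical helper: use an arbitrary object as the associated
       value.  The caller keeps its own reference to value. */
    static PyObject *
    fail_with_object(PyObject *value)
    {
        PyErr_SetObject(PyExc_ValueError, value);
        return NULL;
    }
\end{verbatim}

In each case the helper both sets the exception and returns an error
value, so its caller only has to test the return value.
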
You can test non-destructively whether an exception has been set with
\code{PyErr_Occurred()}.  This returns the current exception object,
or \code{NULL} if no exception has occurred.  You normally don't need
to call \code{PyErr_Occurred()} to see whether an error occurred in a
function call, since you should be able to tell from the return value.

When a function \var{f} that calls another function \var{g} detects
that the latter fails, \var{f} should itself return an error value
(e.g.\ \code{NULL} or \code{-1}).  It should \emph{not} call one of the
\code{PyErr_*()} functions --- one has already been called by \var{g}.
\var{f}'s caller is then supposed to also return an error indication
to \emph{its} caller, again \emph{without} calling \code{PyErr_*()},
and so on --- the most detailed cause of the error was already
reported by the function that first detected it.  Once the error
reaches the Python interpreter's main loop, this aborts the currently
executing Python code and tries to find an exception handler specified
by the Python programmer.

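
A minimal sketch of this convention follows.  Both function names are
hypothetical: \code{check_positive()} plays the role of \var{g} and
\code{make_positive()} the role of \var{f}.  ANSI-style prototypes are
used here for brevity:

\begin{verbatim}
    #include "Python.h"

    /* g: the function that first detects the error sets the exception. */
    static int
    check_positive(long value)
    {
        if (value <= 0) {
            PyErr_SetString(PyExc_ValueError, "value must be positive");
            return -1;
        }
        return 0;
    }

    /* f: on failure of g, return an error value without calling
       any PyErr_*() function. */
    static PyObject *
    make_positive(long value)
    {
        if (check_positive(value) < 0)
            return NULL;        /* exception already set by check_positive() */
        return Py_BuildValue("l", value);
    }
\end{verbatim}
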
(There are situations where a module can actually give a more detailed
error message by calling another \code{PyErr_*()} function, and in
such cases it is fine to do so.  As a general rule, however, this is
not necessary, and can cause information about the cause of the error
to be lost: most operations can fail for a variety of reasons.)

To ignore an exception set by a function call that failed, the exception
condition must be cleared explicitly by calling \code{PyErr_Clear()}.
The only time C code should call \code{PyErr_Clear()} is if it doesn't
want to pass the error on to the interpreter but wants to handle it
completely by itself (e.g.\ by trying something else or pretending
nothing happened).

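
As an illustration of ``trying something else'', the hypothetical
helper below first tries to interpret an argument tuple as a single
integer and, if that fails, clears the exception and tries a single
string instead.  The helper name and its return codes are assumptions
made for this sketch:

\begin{verbatim}
    #include "Python.h"

    /* Returns 1 for an integer argument, 2 for a string argument,
       and 0 (with an exception set) for anything else. */
    static int
    classify_arg(PyObject *args)
    {
        long number;
        char *text;

        if (PyArg_ParseTuple(args, "l", &number))
            return 1;
        PyErr_Clear();          /* handle the first failure ourselves */
        if (PyArg_ParseTuple(args, "s", &text))
            return 2;
        return 0;               /* the second exception is passed on */
    }
\end{verbatim}

Only the first failure is cleared; if the second attempt fails as
well, its exception is left in place for the caller to report.
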
Note that a failing \code{malloc()} call must be turned into an
exception --- the direct caller of \code{malloc()} (or
\code{realloc()}) must call \code{PyErr_NoMemory()} and return a
failure indicator itself.  All the object-creating functions
(\code{PyInt_FromLong()} etc.) already do this, so this note matters
only if you call \code{malloc()} directly.

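
A sketch of a direct \code{malloc()} caller that follows this rule is
shown below.  The helper \code{double_string()} is hypothetical; it
builds a Python string containing its argument twice:

\begin{verbatim}
    #include "Python.h"
    #include <stdlib.h>
    #include <string.h>

    static PyObject *
    double_string(const char *text)
    {
        size_t len = strlen(text);
        char *buf = malloc(2 * len + 1);
        PyObject *result;

        if (buf == NULL)
            return PyErr_NoMemory();    /* sets the exception, returns NULL */
        memcpy(buf, text, len);
        memcpy(buf + len, text, len + 1);   /* second copy plus null byte */
        result = Py_BuildValue("s", buf);   /* copies buf into a new object */
        free(buf);
        return result;      /* if NULL, Py_BuildValue() set the exception */
    }
\end{verbatim}
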
Also note that, with the important exception of
\code{PyArg_ParseTuple()} and friends, functions that return an
integer status usually return a positive value or zero for success and
\code{-1} for failure, like \UNIX{} system calls.

Finally, be careful to clean up garbage (by making \code{Py_XDECREF()}
or \code{Py_DECREF()} calls for objects you have already created) when
you return an error indicator!

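
The cleanup rule might look as follows in practice.  The helper
\code{make_pair()} is a hypothetical example that builds a tuple from
two separately created objects, purely to show the error paths:

\begin{verbatim}
    #include "Python.h"

    static PyObject *
    make_pair(const char *text, long number)
    {
        PyObject *first, *second, *pair;

        first = Py_BuildValue("s", text);
        if (first == NULL)
            return NULL;            /* nothing to clean up yet */
        second = Py_BuildValue("l", number);
        if (second == NULL) {
            Py_DECREF(first);       /* don't leak the object created above */
            return NULL;
        }
        pair = Py_BuildValue("(OO)", first, second);
        Py_DECREF(first);           /* the "O" unit took its own references */
        Py_DECREF(second);
        return pair;                /* NULL if building the tuple failed */
    }
\end{verbatim}
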
The choice of which exception to raise is entirely yours.  There are
predeclared C objects corresponding to all built-in Python exceptions,
e.g.\ \code{PyExc_ZeroDivisionError}, which you can use directly.  Of
course, you should choose exceptions wisely --- don't use
\code{PyExc_TypeError} to mean that a file couldn't be opened (that
should probably be \code{PyExc_IOError}).  If something's wrong with
the argument list, the \code{PyArg_ParseTuple()} function usually
raises \code{PyExc_TypeError}.  If you have an argument whose value
must be in a particular range or must satisfy other conditions,
\code{PyExc_ValueError} is appropriate.

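
For instance, a function that accepts only a percentage could leave
type errors to \code{PyArg_ParseTuple()} and raise
\code{PyExc_ValueError} itself for out-of-range values.  The function
\code{spam_percent()} below is a hypothetical sketch, written with an
ANSI-style prototype:

\begin{verbatim}
    #include "Python.h"

    static PyObject *
    spam_percent(PyObject *self, PyObject *args)
    {
        int value;

        if (!PyArg_ParseTuple(args, "i", &value))
            return NULL;            /* wrong type: exception already set */
        if (value < 0 || value > 100) {
            PyErr_SetString(PyExc_ValueError,
                            "percent must be in the range 0..100");
            return NULL;
        }
        return Py_BuildValue("d", value / 100.0);
    }
\end{verbatim}
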
You can also define a new exception that is unique to your module.
For this, you usually declare a static object variable at the
beginning of your file, e.g.

\begin{verbatim}
    static PyObject *SpamError;
\end{verbatim}

and initialize it in your module's initialization function
(\code{initspam()}) with a string object, e.g.\ (leaving out the error
checking for now):

\begin{verbatim}
    void
    initspam()
    {
        PyObject *m, *d;
        m = Py_InitModule("spam", SpamMethods);
        d = PyModule_GetDict(m);
        SpamError = PyString_FromString("spam.error");
        PyDict_SetItemString(d, "error", SpamError);
    }
\end{verbatim}

Note that the Python name for the exception object is
\code{spam.error}.  It is conventional for module and exception names
to be spelled in lower case.  It is also conventional that the
\emph{value} of the exception object is the same as its name, e.g.\
the string \code{"spam.error"}.


\section{Back to the Example}

Going back to our example function, you should now be able to
understand this statement:

\begin{verbatim}
    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
\end{verbatim}

It returns \code{NULL} (the error indicator for functions returning
object pointers) if an error is detected in the argument list, relying
on the exception set by \code{PyArg_ParseTuple()}.  Otherwise the
string value of the argument has been copied to the local variable
\code{command}.  This is a pointer assignment and you are not supposed
to modify the string to which it points (so in Standard C, the variable
\code{command} should properly be declared as \samp{const char
*command}).

The next statement is a call to the \UNIX{} function \code{system()},
passing it the string we just got from \code{PyArg_ParseTuple()}:

\begin{verbatim}
    sts = system(command);
\end{verbatim}

Our \code{spam.system()} function must return the value of \code{sts}
as a Python object.  This is done using the function
\code{Py_BuildValue()}, which is something like the inverse of
\code{PyArg_ParseTuple()}: it takes a format string and an arbitrary
number of C values, and returns a new Python object.  More info on
\code{Py_BuildValue()} is given later.

\begin{verbatim}
    return Py_BuildValue("i", sts);
\end{verbatim}

In this case, it will return an integer object.  (Yes, even integers
are objects on the heap in Python!)

If you have a C function that returns no useful value (a function
returning \code{void}), the corresponding Python function must return
\code{None}.  You need this idiom to do so:

\begin{verbatim}
    Py_INCREF(Py_None);
    return Py_None;
\end{verbatim}

\code{Py_None} is the C name for the special Python object
\code{None}.  It is a genuine Python object (not a \code{NULL}
pointer, which means ``error'' in most contexts, as we have seen).


\section{The Module's Method Table and Initialization Function}

I promised to show how \code{spam_system()} is called from Python
programs.  First, we need to list its name and address in a ``method
table'':

\begin{verbatim}
    static PyMethodDef SpamMethods[] = {
        {"system",  spam_system, 1},
        {NULL,      NULL}        /* Sentinel */
    };
\end{verbatim}

Note the third entry (\samp{1}).  This is a flag telling the
interpreter the calling convention to be used for the C function.  It
should normally be \samp{1}; a value of \samp{0} means that an
obsolete variant of \code{PyArg_ParseTuple()} is used.

The method table must be passed to the interpreter in the module's
initialization function (which should be the only non-\code{static}
item defined in the module file):

\begin{verbatim}
    void
    initspam()
    {
        (void) Py_InitModule("spam", SpamMethods);
    }
\end{verbatim}

When the Python program imports module \code{spam} for the first time,
\code{initspam()} is called.  It calls \code{Py_InitModule()}, which
creates a ``module object'' (which is inserted in the dictionary
\code{sys.modules} under the key \code{"spam"}), and inserts built-in
function objects into the newly created module based upon the table
(an array of \code{PyMethodDef} structures) that was passed as its
second argument.  \code{Py_InitModule()} returns a pointer to the
module object that it creates (which is unused here).  It aborts with
a fatal error if the module could not be initialized satisfactorily,
so the caller doesn't need to check for errors.


\section{Compilation and Linkage}

There are two more things to do before you can use your new extension:
compiling and linking it with the Python system.  If you use dynamic
loading, the details depend on the style of dynamic loading your
system uses; see the chapter on Dynamic Loading for more info about
this.

If you can't use dynamic loading, or if you want to make your module a
permanent part of the Python interpreter, you will have to change the
configuration setup and rebuild the interpreter.  Luckily, this is
very simple: just place your file (\file{spammodule.c} for example) in
the \file{Modules} directory, add a line to the file
\file{Modules/Setup} describing your file:

\begin{verbatim}
    spam spammodule.o
\end{verbatim}

and rebuild the interpreter by running \code{make} in the toplevel
directory.  You can also run \code{make} in the \file{Modules}
subdirectory, but then you must first rebuild the \file{Makefile}
there by running \code{make Makefile}.  (This is necessary each time
you change the \file{Setup} file.)

If your module requires additional libraries to link with, these can
be listed on the line in the \file{Setup} file as well, for instance:

\begin{verbatim}
    spam spammodule.o -lX11
\end{verbatim}


\section{Calling Python Functions From C}

So far we have concentrated on making C functions callable from
Python.  The reverse is also useful: calling Python functions from C.
This is especially the case for libraries that support so-called
``callback'' functions.  If a C interface makes use of callbacks, the
equivalent Python interface often needs to provide a callback
mechanism to the Python programmer; the implementation will require
calling the Python callback functions from a C callback.  Other uses
are also imaginable.

Fortunately, the Python interpreter is easily called recursively, and
there is a standard interface to call a Python function.  (I won't
dwell on how to call the Python parser with a particular string as
input --- if you're interested, have a look at the implementation of
the \samp{-c} command line option in \file{Python/pythonmain.c}.)

Calling a Python function is easy.  First, the Python program must
somehow pass you the Python function object.  You should provide a
function (or some other interface) to do this.  When this function is
called, save a pointer to the Python function object (be careful to
\code{Py_INCREF()} it!) in a global variable --- or wherever you see fit.
For example, the following function might be part of a module
definition:

\begin{verbatim}
    static PyObject *my_callback = NULL;

    static PyObject *
    my_set_callback(dummy, arg)
        PyObject *dummy, *arg;
    {
        Py_XDECREF(my_callback); /* Dispose of previous callback */
        Py_XINCREF(arg); /* Add a reference to new callback */
        my_callback = arg; /* Remember new callback */
        /* Boilerplate to return "None" */
        Py_INCREF(Py_None);
        return Py_None;
    }
\end{verbatim}

The macros \code{Py_XINCREF()} and \code{Py_XDECREF()} increment/decrement
the reference count of an object and are safe in the presence of
\code{NULL} pointers.  They are described further in the section on
Reference Counts below.

Later, when it is time to call the function, you call the C function
\code{PyEval_CallObject()}.  This function has two arguments, both
pointers to arbitrary Python objects: the Python function, and the
argument list.  The argument list must always be a tuple object, whose
length is the number of arguments.  To call the Python function with
no arguments, pass an empty tuple; to call it with one argument, pass
a singleton tuple.  \code{Py_BuildValue()} returns a tuple when its
format string consists of zero or more format codes between
parentheses.  For example:

\begin{verbatim}
    int arg;
    PyObject *arglist;
    PyObject *result;

    arg = 123;

    /* Time to call the callback */
    arglist = Py_BuildValue("(i)", arg);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
\end{verbatim}

\code{PyEval_CallObject()} returns a Python object pointer: this is
the return value of the Python function.  \code{PyEval_CallObject()} is
``reference-count-neutral'' with respect to its arguments.  In the
example a new tuple was created to serve as the argument list, which
is \code{Py_DECREF()}-ed immediately after the call.

The return value of \code{PyEval_CallObject()} is ``new'': either it
is a brand new object, or it is an existing object whose reference
count has been incremented.  So, unless you want to save it in a
global variable, you should somehow \code{Py_DECREF()} the result,
even (especially!) if you are not interested in its value.

Before you do this, however, it is important to check that the return
value isn't \code{NULL}.  If it is, the Python function terminated by raising
an exception.  If the C code that called \code{PyEval_CallObject()} is
called from Python, it should now return an error indication to its
Python caller, so the interpreter can print a stack trace, or the
calling Python code can handle the exception.  If this is not possible
or desirable, the exception should be cleared by calling
\code{PyErr_Clear()}.  For example:

\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    ...use result...
    Py_DECREF(result);
\end{verbatim}

Depending on the desired interface to the Python callback function,
you may also have to provide an argument list to \code{PyEval_CallObject()}.
In some cases the argument list is also provided by the Python
program, through the same interface that specified the callback
function.  It can then be saved and used in the same manner as the
function object.  In other cases, you may have to construct a new
tuple to pass as the argument list.  The simplest way to do this is to
call \code{Py_BuildValue()}.  For example, if you want to pass an integral
event code, you might use the following code:

\begin{verbatim}
    PyObject *arglist;
    PyObject *result;

    arglist = Py_BuildValue("(l)", eventcode);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    Py_DECREF(result);
\end{verbatim}

Note the placement of \code{Py_DECREF(arglist)} immediately after the call,
before the error check!  Also note that, strictly speaking, this code is
not complete: \code{Py_BuildValue()} may run out of memory, and this should
be checked.


\section{Format Strings for {\tt PyArg_ParseTuple()}}

The \code{PyArg_ParseTuple()} function is declared as follows:

\begin{verbatim}
    int PyArg_ParseTuple(PyObject *arg, char *format, ...);
\end{verbatim}

The \var{arg} argument must be a tuple object containing an argument
list passed from Python to a C function.  The \var{format} argument
must be a format string, whose syntax is explained below.  The
remaining arguments must be addresses of variables whose type is
determined by the format string.  For the conversion to succeed, the
\var{arg} object must match the format and the format must be
exhausted.

Note that while \code{PyArg_ParseTuple()} checks that the Python
arguments have the required types, it cannot check the validity of the
addresses of C variables passed to the call: if you make mistakes
there, your code will probably crash or at least overwrite random bits
in memory.  So be careful!

A format string consists of zero or more ``format units''.  A format
unit describes one Python object; it is usually a single character or
a parenthesized sequence of format units.  With a few exceptions, a
format unit that is not a parenthesized sequence normally corresponds
to a single address argument to \code{PyArg_ParseTuple()}.  In the
following description, the quoted form is the format unit; the entry
in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed.  (Use the \samp{\&}
operator to pass a variable's address.)

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a Python string to a C pointer to a character string.  You
must not provide storage for the string itself; a pointer to an
existing string is stored into the character pointer variable whose
address you pass.  The C string is null-terminated.  The Python string
must not contain embedded null bytes; if it does, a \code{TypeError}
exception is raised.

\item[\samp{s\#} (string) {[char *, int]}]
This variant on \code{'s'} stores into two C variables, the first one
a pointer to a character string, the second one its length.  In this
case the Python string may contain embedded null bytes.

\item[\samp{z} (string or \code{None}) {[char *]}]
Like \samp{s}, but the Python object may also be \code{None}, in which
case the C pointer is set to \code{NULL}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
This is to \code{'s\#'} as \code{'z'} is to \code{'s'}.

\item[\samp{b} (integer) {[char]}]
Convert a Python integer to a tiny int, stored in a C \code{char}.

\item[\samp{h} (integer) {[short int]}]
Convert a Python integer to a C \code{short int}.

\item[\samp{i} (integer) {[int]}]
Convert a Python integer to a plain C \code{int}.

\item[\samp{l} (integer) {[long int]}]
Convert a Python integer to a C \code{long int}.

\item[\samp{c} (string of length 1) {[char]}]
Convert a Python character, represented as a string of length 1, to a
C \code{char}.

\item[\samp{f} (float) {[float]}]
Convert a Python floating point number to a C \code{float}.

\item[\samp{d} (float) {[double]}]
Convert a Python floating point number to a C \code{double}.

\item[\samp{O} (object) {[PyObject *]}]
Store a Python object (without any conversion) in a C object pointer.
The C program thus receives the actual object that was passed.  The
object's reference count is not increased.  The pointer stored is not
\code{NULL}.

\item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
Store a Python object in a C object pointer.  This is similar to
\samp{O}, but takes two C arguments: the first is the address of a
Python type object, the second is the address of the C variable (of
type \code{PyObject *}) into which the object pointer is stored.
If the Python object does not have the required type, a
\code{TypeError} exception is raised.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert a Python object to a C variable through a \var{converter}
function.  This takes two arguments: the first is a function, the
second is the address of a C variable (of arbitrary type), converted
to \code{void *}.  The \var{converter} function in turn is called as
follows:

\code{\var{status} = \var{converter}(\var{object}, \var{address});}

where \var{object} is the Python object to be converted and
\var{address} is the \code{void *} argument that was passed to
\code{PyArg_ParseTuple()}.  The returned \var{status} should be
\code{1} for a successful conversion and \code{0} if the conversion
has failed.  When the conversion fails, the \var{converter} function
should raise an exception.

\item[\samp{S} (string) {[PyStringObject *]}]
Like \samp{O} but requires that the Python object is a string object.
A \code{TypeError} exception is raised if it is not.  The C variable
may also be declared as \code{PyObject *}.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
The object must be a Python tuple whose length is the number of format
units in \var{items}.  The C arguments must correspond to the
individual format units in \var{items}.  Format units for tuples may
be nested.

\end{description}

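
The \samp{O\&} format unit is the least obvious of these, so here is a
minimal sketch of a converter.  The \code{Point} type and both
function names are hypothetical; the converter fills a C structure
from a Python tuple of two integers and relies on
\code{PyArg_ParseTuple()} both for the status value and for raising
the exception:

\begin{verbatim}
    #include "Python.h"
    #include <stdlib.h>

    typedef struct { int x, y; } Point;     /* hypothetical C type */

    /* Converter for "O&": returns 1 on success, 0 on failure. */
    static int
    convert_point(PyObject *object, void *address)
    {
        Point *p = (Point *) address;

        return PyArg_ParseTuple(object, "ii", &p->x, &p->y);
    }

    /* Hypothetical function taking one point argument. */
    static PyObject *
    spam_manhattan(PyObject *self, PyObject *args)
    {
        Point p;

        if (!PyArg_ParseTuple(args, "O&", convert_point, &p))
            return NULL;
        return Py_BuildValue("i", abs(p.x) + abs(p.y));
    }
\end{verbatim}

The same argument could be parsed directly with the format
\samp{(ii)}; a converter becomes worthwhile when the check or the
target type is more involved than the built-in format units allow.
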
It is possible to pass Python long integers where integers are
requested; however, no proper range checking is done: the most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in C --- your mileage may vary).

A few other characters have a meaning in a format string.  These may
not occur inside nested parentheses.  They are:

\begin{description}

\item[\samp{|}]
Indicates that the remaining arguments in the Python argument list are
optional.  The C variables corresponding to optional arguments should
be initialized to their default value --- when an optional argument is
not specified, \code{PyArg_ParseTuple()} does not touch the contents
of the corresponding C variable(s).

\item[\samp{:}]
The list of format units ends here; the string after the colon is used
as the function name in error messages (the ``associated value'' of
the exceptions that \code{PyArg_ParseTuple()} raises).

\item[\samp{;}]
The list of format units ends here; the string after the semicolon is
used as the error message \emph{instead} of the default error message.
Clearly, \samp{:} and \samp{;} mutually exclude each other.

\end{description}

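
As a small sketch combining \samp{|} and \samp{:}, the hypothetical
function below takes a string and an optional integer.  The function
name \code{spam_order()} and the name \samp{order} given after the
colon are invented for this example:

\begin{verbatim}
    #include "Python.h"

    static PyObject *
    spam_order(PyObject *self, PyObject *args)
    {
        char *item;
        int count = 1;      /* default, kept if the argument is omitted */

        /* "order" is the function name used in error messages */
        if (!PyArg_ParseTuple(args, "s|i:order", &item, &count))
            return NULL;
        return Py_BuildValue("(si)", item, count);
    }
\end{verbatim}

Because \code{count} is initialized before the call, it still holds
its default value when only one argument is passed.
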
Some example calls:

\begin{verbatim}
    int ok;
    int i, j;
    long k, l;
    char *s;
    int size;

    ok = PyArg_ParseTuple(args, ""); /* No arguments */
        /* Python call: f() */

    ok = PyArg_ParseTuple(args, "s", &s); /* A string */
        /* Possible Python call: f('whoops!') */

    ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
        /* Possible Python call: f(1, 2, 'three') */

    ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
        /* A pair of ints and a string, whose size is also returned */
        /* Possible Python call: f((1, 2), 'three') */

    {
        char *file;
        char *mode = "r";
        int bufsize = 0;
        ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
        /* A string, and optionally another string and an integer */
        /* Possible Python calls:
           f('spam')
           f('spam', 'w')
           f('spam', 'wb', 100000) */
    }

    {
        int left, top, right, bottom, h, v;
        ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                 &left, &top, &right, &bottom, &h, &v);
        /* A rectangle and a point */
        /* Possible Python call:
           f(((0, 0), (400, 300)), (10, 10)) */
    }
\end{verbatim}


\section{The {\tt Py_BuildValue()} Function}

This function is the counterpart to \code{PyArg_ParseTuple()}.  It is
declared as follows:

\begin{verbatim}
    PyObject *Py_BuildValue(char *format, ...);
\end{verbatim}

It recognizes a set of format units similar to the ones recognized by
\code{PyArg_ParseTuple()}, but the arguments (which are input to the
function, not output) must not be pointers, just values.  It returns a
new Python object, suitable for returning from a C function called
from Python.

One difference with \code{PyArg_ParseTuple()}: while the latter
requires its first argument to be a tuple (since Python argument lists
are always represented as tuples internally), \code{Py_BuildValue()} does
not always build a tuple.  It builds a tuple only if its format string
contains two or more format units.  If the format string is empty, it
returns \code{None}; if it contains exactly one format unit, it
returns whatever object is described by that format unit.  To force it
to return a tuple of size 0 or 1, parenthesize the format string.

In the following description, the quoted form is the format unit; the
entry in (round) parentheses is the Python object type that the format
unit will return; and the entry in [square] brackets is the type of
the C value(s) to be passed.

The characters space, tab, colon and comma are ignored in format
strings (but not within format units such as \samp{s\#}).  This can be
used to make long format strings a tad more readable.

765 \begin{description}
767 \item[\samp{s} (string) {[char *]}]
768 Convert a null-terminated C string to a Python object. If the C
769 string pointer is \code{NULL}, \code{None} is returned.
771 \item[\samp{s\#} (string) {[char *, int]}]
772 Convert a C string and its length to a Python object. If the C string
773 pointer is \code{NULL}, the length is ignored and \code{None} is
774 returned.
776 \item[\samp{z} (string or \code{None}) {[char *]}]
777 Same as \samp{s}.
779 \item[\samp{z\#} (string or \code{None}) {[char *, int]}]
780 Same as \samp{s\#}.
782 \item[\samp{i} (integer) {[int]}]
783 Convert a plain C \code{int} to a Python integer object.
785 \item[\samp{b} (integer) {[char]}]
786 Same as \samp{i}.
788 \item[\samp{h} (integer) {[short int]}]
789 Same as \samp{i}.
791 \item[\samp{l} (integer) {[long int]}]
792 Convert a C \code{long int} to a Python integer object.
794 \item[\samp{c} (string of length 1) {[char]}]
795 Convert a C \code{int} representing a character to a Python string of
796 length 1.
798 \item[\samp{d} (float) {[double]}]
799 Convert a C \code{double} to a Python floating point number.
801 \item[\samp{f} (float) {[float]}]
802 Same as \samp{d}.
804 \item[\samp{O} (object) {[PyObject *]}]
Pass a Python object untouched (except for its reference count, which
is incremented by one). If the object passed in is a \code{NULL}
pointer, it is assumed that the call producing the argument found an
error and set an exception. Therefore, \code{Py_BuildValue()} will
return \code{NULL} without raising a new exception. If no exception
has been raised yet, \code{PyExc_SystemError} is set.
813 \item[\samp{S} (object) {[PyObject *]}]
814 Same as \samp{O}.
816 \item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
817 Convert \var{anything} to a Python object through a \var{converter}
818 function. The function is called with \var{anything} (which should be
819 compatible with \code{void *}) as its argument and should return a
820 ``new'' Python object, or \code{NULL} if an error occurred.
822 \item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
823 Convert a sequence of C values to a Python tuple with the same number
824 of items.
826 \item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
827 Convert a sequence of C values to a Python list with the same number
828 of items.
830 \item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
831 Convert a sequence of C values to a Python dictionary. Each pair of
832 consecutive C values adds one item to the dictionary, serving as key
833 and value, respectively.
835 \end{description}
837 If there is an error in the format string, the
838 \code{PyExc_SystemError} exception is raised and \code{NULL} returned.
840 Examples (to the left the call, to the right the resulting Python value):
\begin{verbatim}
    Py_BuildValue("")                        None
    Py_BuildValue("i", 123)                  123
    Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
    Py_BuildValue("s", "hello")              'hello'
    Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
    Py_BuildValue("s#", "hello", 4)          'hell'
    Py_BuildValue("()")                      ()
    Py_BuildValue("(i)", 123)                (123,)
    Py_BuildValue("(ii)", 123, 456)          (123, 456)
    Py_BuildValue("(i,i)", 123, 456)         (123, 456)
    Py_BuildValue("[i,i]", 123, 456)         [123, 456]
    Py_BuildValue("{s:i,s:i}",
                  "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
    Py_BuildValue("((ii)(ii)) (ii)",
                  1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
\end{verbatim}
861 \section{Reference Counts}
863 \subsection{Introduction}
865 In languages like C or \Cpp{}, the programmer is responsible for
866 dynamic allocation and deallocation of memory on the heap. In C, this
867 is done using the functions \code{malloc()} and \code{free()}. In
868 \Cpp{}, the operators \code{new} and \code{delete} are used with
869 essentially the same meaning; they are actually implemented using
870 \code{malloc()} and \code{free()}, so we'll restrict the following
871 discussion to the latter.
873 Every block of memory allocated with \code{malloc()} should eventually
874 be returned to the pool of available memory by exactly one call to
875 \code{free()}. It is important to call \code{free()} at the right
876 time. If a block's address is forgotten but \code{free()} is not
877 called for it, the memory it occupies cannot be reused until the
878 program terminates. This is called a \dfn{memory leak}. On the other
879 hand, if a program calls \code{free()} for a block and then continues
880 to use the block, it creates a conflict with re-use of the block
through another \code{malloc()} call. This is called \dfn{using freed
memory}. It has the same bad consequences as referencing uninitialized
data: core dumps, wrong results, mysterious crashes.
885 Common causes of memory leaks are unusual paths through the code. For
886 instance, a function may allocate a block of memory, do some
887 calculation, and then free the block again. Now a change in the
888 requirements for the function may add a test to the calculation that
889 detects an error condition and can return prematurely from the
890 function. It's easy to forget to free the allocated memory block when
891 taking this premature exit, especially when it is added later to the
892 code. Such leaks, once introduced, often go undetected for a long
893 time: the error exit is taken only in a small fraction of all calls,
894 and most modern machines have plenty of virtual memory, so the leak
895 only becomes apparent in a long-running process that uses the leaking
896 function frequently. Therefore, it's important to prevent leaks from
happening by having a coding convention or strategy that minimizes
this kind of error.
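
One such convention can be sketched in plain C. This is only an
illustration under stated assumptions: the names \code{live_blocks},
\code{tracked_malloc()}, \code{tracked_free()} and \code{scaled_sum()}
are invented here and are not part of Python or of the C library. The
function sends every exit taken after the allocation, including the
error exit added later, through a single cleanup label, so the scratch
block is freed exactly once on each path.

\begin{verbatim}
#include <stddef.h>
#include <stdlib.h>

/* Number of blocks currently allocated (for illustration only). */
int live_blocks = 0;

void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

void tracked_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

/* Store the sum of 2*values[i] in *result.
   Return 0 on success, -1 on error (n == 0, out of memory,
   or a negative input value). */
int scaled_sum(const int *values, size_t n, long *result)
{
    int status = -1;        /* assume failure until the work is done */
    long total = 0;
    size_t i;
    long *scratch;

    if (n == 0)
        return -1;          /* nothing allocated yet */
    scratch = tracked_malloc(n * sizeof(long));
    if (scratch == NULL)
        return -1;          /* still nothing to free */

    for (i = 0; i < n; i++) {
        if (values[i] < 0)
            goto done;      /* the error exit that was added later */
        scratch[i] = 2L * values[i];
    }
    for (i = 0; i < n; i++)
        total += scratch[i];
    *result = total;
    status = 0;

done:
    tracked_free(scratch);  /* runs on success and on error alike */
    return status;
}
\end{verbatim}

Calling \code{scaled_sum()} with an array that contains a negative
value takes the error exit, yet \code{live_blocks} is back at zero
afterwards, just as it is after a successful call.
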
900 Since Python makes heavy use of \code{malloc()} and \code{free()}, it
901 needs a strategy to avoid memory leaks as well as the use of freed
902 memory. The chosen method is called \dfn{reference counting}. The
903 principle is simple: every object contains a counter, which is
904 incremented when a reference to the object is stored somewhere, and
905 which is decremented when a reference to it is deleted. When the
906 counter reaches zero, the last reference to the object has been
907 deleted and the object is freed.
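
The principle can be sketched in a few lines of plain C. This is a toy
model and not Python's actual implementation: the type
\code{toy_object}, the counter \code{toy_live_objects} and the
functions \code{toy_new()}, \code{toy_incref()} and
\code{toy_decref()} are hypothetical names invented for this
illustration.

\begin{verbatim}
#include <stdlib.h>

/* Toy object: a reference count plus a payload. */
typedef struct {
    int refcount;
    long value;
} toy_object;

/* Number of objects not yet freed; lets a caller observe
   when deallocation happens. */
int toy_live_objects = 0;

/* Create an object; the caller holds the single initial reference. */
toy_object *toy_new(long value)
{
    toy_object *op = malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    op->refcount = 1;
    op->value = value;
    toy_live_objects++;
    return op;
}

/* A reference to the object is being stored somewhere. */
void toy_incref(toy_object *op)
{
    op->refcount++;
}

/* A reference is being deleted; the last one frees the object. */
void toy_decref(toy_object *op)
{
    if (--op->refcount == 0) {
        toy_live_objects--;
        free(op);
    }
}
\end{verbatim}

In this model an object created with \code{toy_new()} and passed once
to \code{toy_incref()} survives the first call to \code{toy_decref()}
and is freed by the second.
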
909 An alternative strategy is called \dfn{automatic garbage collection}.
910 (Sometimes, reference counting is also referred to as a garbage
911 collection strategy, hence my use of ``automatic'' to distinguish the
912 two.) The big advantage of automatic garbage collection is that the
913 user doesn't need to call \code{free()} explicitly. (Another claimed
914 advantage is an improvement in speed or memory usage --- this is no
915 hard fact however.) The disadvantage is that for C, there is no
916 truly portable automatic garbage collector, while reference counting
917 can be implemented portably (as long as the functions \code{malloc()}
918 and \code{free()} are available --- which the C Standard guarantees).
919 Maybe some day a sufficiently portable automatic garbage collector
920 will be available for C. Until then, we'll have to live with
921 reference counts.
923 \subsection{Reference Counting in Python}
925 There are two macros, \code{Py_INCREF(x)} and \code{Py_DECREF(x)},
926 which handle the incrementing and decrementing of the reference count.
927 \code{Py_DECREF()} also frees the object when the count reaches zero.
928 For flexibility, it doesn't call \code{free()} directly --- rather, it
929 makes a call through a function pointer in the object's \dfn{type
930 object}. For this purpose (and others), every object also contains a
931 pointer to its type object.
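
The indirection through the type object can be modelled as follows.
This is a simplified sketch, not the real object layout: all names
(\code{toy_object}, \code{toy_type}, \code{toy_string},
\code{TOY_INCREF()}, \code{TOY_DECREF()} and so on) are hypothetical.
It shows why the indirection is useful: an object that owns a second
block of memory needs a deallocation routine of its own.

\begin{verbatim}
#include <stdlib.h>
#include <string.h>

struct toy_type;

/* Every object starts with a reference count and a type pointer. */
typedef struct toy_object {
    int refcount;
    struct toy_type *type;
} toy_object;

/* The type object knows how to free an instance of the type. */
typedef struct toy_type {
    const char *name;
    void (*dealloc)(toy_object *);
} toy_type;

/* A concrete type whose instances own a separate buffer. */
typedef struct {
    toy_object head;
    char *text;
} toy_string;

int toy_string_deallocs = 0;    /* counts calls, for illustration */

void toy_string_dealloc(toy_object *op)
{
    toy_string *sp = (toy_string *)op;
    toy_string_deallocs++;
    free(sp->text);             /* release the owned buffer first */
    free(sp);
}

toy_type toy_string_type = { "toy_string", toy_string_dealloc };

toy_object *toy_string_new(const char *text)
{
    toy_string *sp = malloc(sizeof(toy_string));
    if (sp == NULL)
        return NULL;
    sp->text = malloc(strlen(text) + 1);
    if (sp->text == NULL) {
        free(sp);
        return NULL;
    }
    strcpy(sp->text, text);
    sp->head.refcount = 1;
    sp->head.type = &toy_string_type;
    return &sp->head;
}

#define TOY_INCREF(op) ((op)->refcount++)

/* Free through the type object, never by calling free() directly. */
#define TOY_DECREF(op) \
    do { \
        if (--(op)->refcount == 0) \
            (op)->type->dealloc(op); \
    } while (0)
\end{verbatim}
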
933 The big question now remains: when to use \code{Py_INCREF(x)} and
934 \code{Py_DECREF(x)}? Let's first introduce some terms. Nobody
935 ``owns'' an object; however, you can \dfn{own a reference} to an
936 object. An object's reference count is now defined as the number of
937 owned references to it. The owner of a reference is responsible for
938 calling \code{Py_DECREF()} when the reference is no longer needed.
939 Ownership of a reference can be transferred. There are three ways to
940 dispose of an owned reference: pass it on, store it, or call
941 \code{Py_DECREF()}. Forgetting to dispose of an owned reference creates
942 a memory leak.
944 It is also possible to \dfn{borrow}\footnote{The metaphor of
945 ``borrowing'' a reference is not completely correct: the owner still
946 has a copy of the reference.} a reference to an object. The borrower
947 of a reference should not call \code{Py_DECREF()}. The borrower must
948 not hold on to the object longer than the owner from which it was
949 borrowed. Using a borrowed reference after the owner has disposed of
950 it risks using freed memory and should be avoided
951 completely.\footnote{Checking that the reference count is at least 1
952 \strong{does not work} --- the reference count itself could be in
953 freed memory and may thus be reused for another object!}
955 The advantage of borrowing over owning a reference is that you don't
956 need to take care of disposing of the reference on all possible paths
957 through the code --- in other words, with a borrowed reference you
don't run the risk of leaking when a premature exit is taken. The
disadvantage of borrowing over owning is that there are some subtle
situations where, in seemingly correct code, a borrowed reference can
be used after the owner from which it was borrowed has in fact
disposed of it.
A borrowed reference can be changed into an owned reference by calling
\code{Py_INCREF()}. This does not affect the status of the owner from
which the reference was borrowed: it creates a new owned reference
with full owner responsibilities (i.e., the new owner must dispose of
its reference properly, just as the previous owner must dispose of its
own).
970 \subsection{Ownership Rules}
972 Whenever an object reference is passed into or out of a function, it
973 is part of the function's interface specification whether ownership is
974 transferred with the reference or not.
976 Most functions that return a reference to an object pass on ownership
with the reference. In particular, all functions whose purpose is
to create a new object, e.g.\ \code{PyInt_FromLong()} and
\code{Py_BuildValue()}, pass ownership to the receiver. Even if in
980 fact, in some cases, you don't receive a reference to a brand new
981 object, you still receive ownership of the reference. For instance,
982 \code{PyInt_FromLong()} maintains a cache of popular values and can
983 return a reference to a cached item.
985 Many functions that extract objects from other objects also transfer
986 ownership with the reference, for instance
\code{PyObject_GetAttrString()}. The picture is less clear here,
however, since a few common routines are exceptions:
989 \code{PyTuple_GetItem()}, \code{PyList_GetItem()} and
990 \code{PyDict_GetItem()} (and \code{PyDict_GetItemString()}) all return
991 references that you borrow from the tuple, list or dictionary.
993 The function \code{PyImport_AddModule()} also returns a borrowed
994 reference, even though it may actually create the object it returns:
995 this is possible because an owned reference to the object is stored in
996 \code{sys.modules}.
998 When you pass an object reference into another function, in general,
999 the function borrows the reference from you --- if it needs to store
1000 it, it will use \code{Py_INCREF()} to become an independent owner.
1001 There are exactly two important exceptions to this rule:
1002 \code{PyTuple_SetItem()} and \code{PyList_SetItem()}. These functions
1003 take over ownership of the item passed to them --- even if they fail!
1004 (Note that \code{PyDict_SetItem()} and friends don't take over
1005 ownership --- they are ``normal''.)
1007 When a C function is called from Python, it borrows references to its
1008 arguments from the caller. The caller owns a reference to the object,
1009 so the borrowed reference's lifetime is guaranteed until the function
returns. Only when such a borrowed reference must be stored or passed
on must it be turned into an owned reference by calling
\code{Py_INCREF()}.
1014 The object reference returned from a C function that is called from
Python must be an owned reference: ownership is transferred from the
1016 function to its caller.
1018 \subsection{Thin Ice}
1020 There are a few situations where seemingly harmless use of a borrowed
1021 reference can lead to problems. These all have to do with implicit
1022 invocations of the interpreter, which can cause the owner of a
1023 reference to dispose of it.
1025 The first and most important case to know about is using
1026 \code{Py_DECREF()} on an unrelated object while borrowing a reference
1027 to a list item. For instance:
\begin{verbatim}
void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}
1037 This function first borrows a reference to \code{list[0]}, then
1038 replaces \code{list[1]} with the value \code{0}, and finally prints
1039 the borrowed reference. Looks harmless, right? But it's not!
1041 Let's follow the control flow into \code{PyList_SetItem()}. The list
1042 owns references to all its items, so when item 1 is replaced, it has
1043 to dispose of the original item 1. Now let's suppose the original
1044 item 1 was an instance of a user-defined class, and let's further
1045 suppose that the class defined a \code{__del__()} method. If this
1046 class instance has a reference count of 1, disposing of it will call
1047 its \code{__del__()} method.
1049 Since it is written in Python, the \code{__del__()} method can execute
1050 arbitrary Python code. Could it perhaps do something to invalidate
1051 the reference to \code{item} in \code{bug()}? You bet! Assuming that
1052 the list passed into \code{bug()} is accessible to the
1053 \code{__del__()} method, it could execute a statement to the effect of
1054 \code{del list[0]}, and assuming this was the last reference to that
1055 object, it would free the memory associated with it, thereby
1056 invalidating \code{item}.
1058 The solution, once you know the source of the problem, is easy:
1059 temporarily increment the reference count. The correct version of the
1060 function reads:
\begin{verbatim}
void no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_INCREF(item);
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);
}
\end{verbatim}
1072 This is a true story. An older version of Python contained variants
1073 of this bug and someone spent a considerable amount of time in a C
1074 debugger to figure out why his \code{__del__()} methods would fail...
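
The scenario can be replayed without the interpreter in a toy model.
Everything in the sketch below is hypothetical and simplified: a
two-slot \code{toy_list} stands in for a Python list, a
\code{finalizer} function pointer stands in for a \code{__del__()}
method, and clearing slot 0 stands in for \code{del list[0]}. The
function \code{toy_no_bug()} follows the same steps as
\code{no_bug()} above and stays safe because it owns a reference while
the finalizer runs.

\begin{verbatim}
#include <stdlib.h>

typedef struct toy_object toy_object;
struct toy_object {
    int refcount;
    long value;
    void (*finalizer)(toy_object *);    /* stands in for __del__ */
};

typedef struct {
    toy_object *items[2];   /* the list owns a reference to each item */
} toy_list;

int toy_live = 0;               /* objects not yet freed */
toy_list *toy_victim = NULL;    /* list the finalizer tampers with */

toy_object *toy_new(long value, void (*finalizer)(toy_object *))
{
    toy_object *op = malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    op->refcount = 1;
    op->value = value;
    op->finalizer = finalizer;
    toy_live++;
    return op;
}

void toy_incref(toy_object *op) { op->refcount++; }

void toy_decref(toy_object *op)
{
    if (--op->refcount == 0) {
        if (op->finalizer != NULL)
            op->finalizer(op);  /* may run arbitrary code */
        toy_live--;
        free(op);
    }
}

/* Return a borrowed reference, in the manner of PyList_GetItem(). */
toy_object *toy_list_get(toy_list *lp, int i) { return lp->items[i]; }

/* Take over ownership of item and dispose of the old item,
   in the manner of PyList_SetItem(). */
void toy_list_set(toy_list *lp, int i, toy_object *item)
{
    toy_object *old = lp->items[i];
    lp->items[i] = item;
    if (old != NULL)
        toy_decref(old);
}

/* Finalizer that does the equivalent of "del list[0]". */
void toy_evil_finalizer(toy_object *self)
{
    (void)self;
    if (toy_victim != NULL)
        toy_list_set(toy_victim, 0, NULL);
}

/* Replace item 1, then read the value of the former item 0. */
long toy_no_bug(toy_list *lp)
{
    long value;
    toy_object *item = toy_list_get(lp, 0);     /* borrowed */
    toy_incref(item);                           /* now owned */
    toy_list_set(lp, 1, toy_new(0L, NULL));     /* runs the finalizer */
    value = item->value;        /* safe: we still own a reference */
    toy_decref(item);
    return value;
}
\end{verbatim}

Without the call to \code{toy_incref()}, the finalizer would drop the
last reference to item 0 and the read of \code{item->value} would
touch freed memory.
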
1076 The second case of problems with a borrowed reference is a variant
1077 involving threads. Normally, multiple threads in the Python
1078 interpreter can't get in each other's way, because there is a global
1079 lock protecting Python's entire object space. However, it is possible
1080 to temporarily release this lock using the macro
1081 \code{Py_BEGIN_ALLOW_THREADS}, and to re-acquire it using
1082 \code{Py_END_ALLOW_THREADS}. This is common around blocking I/O
1083 calls, to let other threads use the CPU while waiting for the I/O to
1084 complete. Obviously, the following function has the same problem as
1085 the previous one:
\begin{verbatim}
void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}
1097 \subsection{NULL Pointers}
1099 In general, functions that take object references as arguments don't
1100 expect you to pass them \code{NULL} pointers, and will dump core (or
1101 cause later core dumps) if you do so. Functions that return object
1102 references generally return \code{NULL} only to indicate that an
1103 exception occurred. The reason for not testing for \code{NULL}
1104 arguments is that functions often pass the objects they receive on to
other functions; if each function were to test for \code{NULL},
1106 there would be a lot of redundant tests and the code would run slower.
1108 It is better to test for \code{NULL} only at the ``source'', i.e.\
1109 when a pointer that may be \code{NULL} is received, e.g.\ from
1110 \code{malloc()} or from a function that may raise an exception.
1112 The macros \code{Py_INCREF()} and \code{Py_DECREF()}
1113 don't check for \code{NULL} pointers --- however, their variants
1114 \code{Py_XINCREF()} and \code{Py_XDECREF()} do.
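
The difference matters most in cleanup code, where some pointers may
never have been assigned. The following toy model illustrates the
idiom; it is a sketch under stated assumptions, not Python's own code.
The names \code{toy_object}, \code{toy_new()}, \code{TOY_DECREF()},
\code{TOY_XDECREF()} and \code{toy_sum()} are invented, and
\code{toy_new()} returns \code{NULL} for a negative value merely to
simulate a failing call.

\begin{verbatim}
#include <stdlib.h>

typedef struct {
    int refcount;
    long value;
} toy_object;

int toy_live = 0;       /* objects not yet freed */

/* Create an object; a negative value simulates a failure. */
toy_object *toy_new(long value)
{
    toy_object *op;
    if (value < 0)
        return NULL;
    op = malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    op->refcount = 1;
    op->value = value;
    toy_live++;
    return op;
}

/* Modelled on Py_DECREF(): the argument must not be NULL. */
#define TOY_DECREF(op) \
    do { \
        if (--(op)->refcount == 0) { \
            toy_live--; \
            free(op); \
        } \
    } while (0)

/* Modelled on Py_XDECREF(): a NULL argument is ignored. */
#define TOY_XDECREF(op) \
    do { \
        if ((op) != NULL) \
            TOY_DECREF(op); \
    } while (0)

/* Store the sum of two freshly created objects in *result.
   Return 0 on success, -1 if either object cannot be created. */
int toy_sum(long a, long b, long *result)
{
    int status = -1;
    toy_object *x = NULL, *y = NULL;

    x = toy_new(a);         /* test for NULL at the source */
    if (x == NULL)
        goto done;
    y = toy_new(b);
    if (y == NULL)
        goto done;
    *result = x->value + y->value;
    status = 0;

done:
    TOY_XDECREF(x);         /* either pointer may still be NULL */
    TOY_XDECREF(y);
    return status;
}
\end{verbatim}
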
1116 The macros for checking for a particular object type
1117 (\code{Py\var{type}_Check()}) don't check for \code{NULL} pointers ---
1118 again, there is much code that calls several of these in a row to test
1119 an object against various different expected types, and this would
1120 generate redundant tests. There are no variants with \code{NULL}
1121 checking.
1123 The C function calling mechanism guarantees that the argument list
1124 passed to C functions (\code{args} in the examples) is never
1125 \code{NULL} --- in fact it guarantees that it is always a tuple.%
1126 \footnote{These guarantees don't hold when you use the ``old'' style
1127 calling convention --- this is still found in much existing code.}
1129 It is a severe error to ever let a \code{NULL} pointer ``escape'' to
1130 the Python user.
1133 \section{Writing Extensions in \Cpp{}}
1135 It is possible to write extension modules in \Cpp{}. Some restrictions
1136 apply. If the main program (the Python interpreter) is compiled and
1137 linked by the C compiler, global or static objects with constructors
1138 cannot be used. This is not a problem if the main program is linked
1139 by the \Cpp{} compiler. All functions that will be called directly or
1140 indirectly (i.e. via function pointers) by the Python interpreter will
1141 have to be declared using \code{extern "C"}; this applies to all
1142 ``methods'' as well as to the module's initialization function.
1143 It is unnecessary to enclose the Python header files in
1144 \code{extern "C" \{...\}} --- they use this form already if the symbol
\samp{__cplusplus} is defined (all recent \Cpp{} compilers define this
1146 symbol).
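
The guard form mentioned above looks roughly like this. The sketch
assumes a hypothetical function \code{spam_add()}; it is not taken
from the Python headers. The same text compiles as C and as \Cpp{},
and under \Cpp{} the function gets C linkage.

\begin{verbatim}
/* Declarations that must be visible with C linkage. */
#ifdef __cplusplus
extern "C" {
#endif

int spam_add(int a, int b);

#ifdef __cplusplus
}
#endif

/* The definition inherits the linkage of the declaration above. */
int spam_add(int a, int b)
{
    return a + b;
}
\end{verbatim}
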
1148 \chapter{Embedding Python in another application}
1150 Embedding Python is similar to extending it, but not quite. The
1151 difference is that when you extend Python, the main program of the
1152 application is still the Python interpreter, while if you embed
1153 Python, the main program may have nothing to do with Python ---
1154 instead, some parts of the application occasionally call the Python
1155 interpreter to run some Python code.
1157 So if you are embedding Python, you are providing your own main
1158 program. One of the things this main program has to do is initialize
1159 the Python interpreter. At the very least, you have to call the
1160 function \code{Py_Initialize()}. There are optional calls to pass command
1161 line arguments to Python. Then later you can call the interpreter
1162 from any part of the application.
1164 There are several different ways to call the interpreter: you can pass
1165 a string containing Python statements to \code{PyRun_SimpleString()},
1166 or you can pass a stdio file pointer and a file name (for
1167 identification in error messages only) to \code{PyRun_SimpleFile()}. You
1168 can also call the lower-level operations described in the previous
1169 chapters to construct and use Python objects.
1171 A simple demo of embedding Python can be found in the directory
1172 \file{Demo/embed}.
1175 \section{Embedding Python in \Cpp{}}
1177 It is also possible to embed Python in a \Cpp{} program; precisely how this
1178 is done will depend on the details of the \Cpp{} system used; in general you
1179 will need to write the main program in \Cpp{}, and use the \Cpp{} compiler
1180 to compile and link your program. There is no need to recompile Python
1181 itself using \Cpp{}.
1184 \chapter{Dynamic Loading}
1186 On most modern systems it is possible to configure Python to support
1187 dynamic loading of extension modules implemented in C. When shared
1188 libraries are used dynamic loading is configured automatically;
1189 otherwise you have to select it as a build option (see below). Once
1190 configured, dynamic loading is trivial to use: when a Python program
1191 executes \code{import spam}, the search for modules tries to find a
1192 file \file{spammodule.o} (\file{spammodule.so} when using shared
1193 libraries) in the module search path, and if one is found, it is
1194 loaded into the executing binary and executed. Once loaded, the
1195 module acts just like a built-in extension module.
1197 The advantages of dynamic loading are twofold: the ``core'' Python
1198 binary gets smaller, and users can extend Python with their own
1199 modules implemented in C without having to build and maintain their
1200 own copy of the Python interpreter. There are also disadvantages:
1201 dynamic loading isn't available on all systems (this just means that
1202 on some systems you have to use static loading), and dynamically
1203 loading a module that was compiled for a different version of Python
1204 (e.g. with a different representation of objects) may dump core.
1207 \section{Configuring and Building the Interpreter for Dynamic Loading}
1209 There are three styles of dynamic loading: one using shared libraries,
1210 one using SGI IRIX 4 dynamic loading, and one using GNU dynamic
1211 loading.
1213 \subsection{Shared Libraries}
1215 The following systems support dynamic loading using shared libraries:
1216 SunOS 4; Solaris 2; SGI IRIX 5 (but not SGI IRIX 4!); and probably all
1217 systems derived from SVR4, or at least those SVR4 derivatives that
1218 support shared libraries (are there any that don't?).
You don't need to do anything to configure dynamic loading on these
systems: the \file{configure} script detects the presence of the
\file{<dlfcn.h>} header file and automatically configures dynamic
loading.
1225 \subsection{SGI IRIX 4 Dynamic Loading}
1227 Only SGI IRIX 4 supports dynamic loading of modules using SGI dynamic
1228 loading. (SGI IRIX 5 might also support it but it is inferior to
1229 using shared libraries so there is no reason to; a small test didn't
1230 work right away so I gave up trying to support it.)
1232 Before you build Python, you first need to fetch and build the \code{dl}
1233 package written by Jack Jansen. This is available by anonymous ftp
1234 from host \file{ftp.cwi.nl}, directory \file{pub/dynload}, file
1235 \file{dl-1.6.tar.Z}. (The version number may change.) Follow the
1236 instructions in the package's \file{README} file to build it.
1238 Once you have built \code{dl}, you can configure Python to use it. To
1239 this end, you run the \file{configure} script with the option
1240 \code{--with-dl=\var{directory}} where \var{directory} is the absolute
1241 pathname of the \code{dl} directory.
Now build and install Python as you normally would (see the
\file{README} file in the toplevel Python directory).
1246 \subsection{GNU Dynamic Loading}
1248 GNU dynamic loading supports (according to its \file{README} file) the
1249 following hardware and software combinations: VAX (Ultrix), Sun 3
1250 (SunOS 3.4 and 4.0), Sparc (SunOS 4.0), Sequent Symmetry (Dynix), and
1251 Atari ST. There is no reason to use it on a Sparc; I haven't seen a
1252 Sun 3 for years so I don't know if these have shared libraries or not.
1254 You need to fetch and build two packages. One is GNU DLD 3.2.3,
1255 available by anonymous ftp from host \file{ftp.cwi.nl}, directory
1256 \file{pub/dynload}, file \file{dld-3.2.3.tar.Z}. (As far as I know,
1257 no further development on GNU DLD is being done.) The other is an
1258 emulation of Jack Jansen's \code{dl} package that I wrote on top of
1259 GNU DLD 3.2.3. This is available from the same host and directory,
file \file{dl-dld-1.1.tar.Z}. (The version number may change, but I
doubt it will.) Follow the instructions in each package's
\file{README} file to configure and build them.
1264 Now configure Python. Run the \file{configure} script with the option
1265 \code{--with-dl-dld=\var{dl-directory},\var{dld-directory}} where
1266 \var{dl-directory} is the absolute pathname of the directory where you
1267 have built the \file{dl-dld} package, and \var{dld-directory} is that
1268 of the GNU DLD package. The Python interpreter you build hereafter
1269 will support GNU dynamic loading.
1272 \section{Building a Dynamically Loadable Module}
1274 Since there are three styles of dynamic loading, there are also three
1275 groups of instructions for building a dynamically loadable module.
1276 Instructions common for all three styles are given first. Assuming
1277 your module is called \code{spam}, the source filename must be
1278 \file{spammodule.c}, so the object name is \file{spammodule.o}. The
1279 module must be written as a normal Python extension module (as
1280 described earlier).
1282 Note that in all cases you will have to create your own Makefile that
1283 compiles your module file(s). This Makefile will have to pass two
1284 \samp{-I} arguments to the C compiler which will make it find the
1285 Python header files. If the Make variable \var{PYTHONTOP} points to
1286 the toplevel Python directory, your \var{CFLAGS} Make variable should
1287 contain the options \samp{-I\$(PYTHONTOP) -I\$(PYTHONTOP)/Include}.
1288 (Most header files are in the \file{Include} subdirectory, but the
1289 \file{config.h} header lives in the toplevel directory.) You must
1290 also add \samp{-DHAVE_CONFIG_H} to the definition of \var{CFLAGS} to
1291 direct the Python headers to include \file{config.h}.
1294 \subsection{Shared Libraries}
1296 You must link the \samp{.o} file to produce a shared library. This is
1297 done using a special invocation of the \UNIX{} loader/linker, {\em
1298 ld}(1). Unfortunately the invocation differs slightly per system.
On SunOS 4, use
\begin{verbatim}
    ld spammodule.o -o spammodule.so
\end{verbatim}

On Solaris 2, use
\begin{verbatim}
    ld -G spammodule.o -o spammodule.so
\end{verbatim}

On SGI IRIX 5, use
\begin{verbatim}
    ld -shared spammodule.o -o spammodule.so
\end{verbatim}
1315 On other systems, consult the manual page for \code{ld}(1) to find what
1316 flags, if any, must be used.
1318 If your extension module uses system libraries that haven't already
1319 been linked with Python (e.g. a windowing system), these must be
1320 passed to the \code{ld} command as \samp{-l} options after the
1321 \samp{.o} file.
1323 The resulting file \file{spammodule.so} must be copied into a directory
1324 along the Python module search path.
1327 \subsection{SGI IRIX 4 Dynamic Loading}
1329 {\bf IMPORTANT:} You must compile your extension module with the
additional C flag \samp{-G0} (or \samp{-G 0}). This instructs the
1331 assembler to generate position-independent code.
1333 You don't need to link the resulting \file{spammodule.o} file; just
1334 copy it into a directory along the Python module search path.
1336 The first time your extension is loaded, it takes some extra time and
1337 a few messages may be printed. This creates a file
1338 \file{spammodule.ld} which is an image that can be loaded quickly into
1339 the Python interpreter process. When a new Python interpreter is
1340 installed, the \code{dl} package detects this and rebuilds
1341 \file{spammodule.ld}. The file \file{spammodule.ld} is placed in the
1342 directory where \file{spammodule.o} was found, unless this directory is
1343 unwritable; in that case it is placed in a temporary
1344 directory.\footnote{Check the manual page of the \code{dl} package for
1345 details.}
If your extension module uses additional system libraries, you must
1348 create a file \file{spammodule.libs} in the same directory as the
1349 \file{spammodule.o}. This file should contain one or more lines with
1350 whitespace-separated options that will be passed to the linker ---
1351 normally only \samp{-l} options or absolute pathnames of libraries
1352 (\samp{.a} files) should be used.
1355 \subsection{GNU Dynamic Loading}
1357 Just copy \file{spammodule.o} into a directory along the Python module
1358 search path.
If your extension module uses additional system libraries, you must
1361 create a file \file{spammodule.libs} in the same directory as the
1362 \file{spammodule.o}. This file should contain one or more lines with
1363 whitespace-separated absolute pathnames of libraries (\samp{.a}
1364 files). No \samp{-l} options can be used.
1367 \input{ext.ind}
1369 \end{document}