Instructions for hacking on Xapian
==================================

.. contents:: Table of contents

This file aims to help developers get started with working on
Xapian. The documentation contains a section covering various internal
aspects of the library - this can also be found on the Xapian website.

Extra options to give to configure
==================================

Note: Non-developer configure options are described in INSTALL

You will probably want to use some of these if you're going to be developing
Xapian:

--enable-assertions
    This enables compiling of assertion code which will throw
    Xapian::AssertionError if the code detects a violation of a
    precondition or postcondition, or if another consistency check fails.

--enable-assertions=partial
    This option enables a subset of the assertions enabled by
    "--enable-assertions", but not the most expensive. The intention is
    that it should be suitable for use in a real-world system for tracking
    down problems without imposing too much of an overhead (but note that
    we haven't yet performed timings to measure the overhead...)

--enable-log
    This enables compiling code into the library which generates verbose
    debugging messages. See "Debugging Messages", below.

In 1.2.0 and earlier, there was also an option which used the debug logging
macros to report to stderr how long each method takes to execute. This
feature was removed in 1.2.1 - you are likely to get better results using
dedicated profiling tools. For more information see:
https://trac.xapian.org/wiki/ProfilingXapian

--enable-maintainer-mode
    This tells configure to enable make dependencies for regenerating build
    system files (such as configure, Makefile.in, and Makefile) and other
    generated files (such as the stemmers and query parser) when required.
    These are disabled by default as some make programs try to rebuild them
    when it's not appropriate (e.g. BSD make doesn't handle VPATH except
    for implicit rules). For this reason, we recommend GNU make if you
    enable maintainer mode. You'll also need a non-cross-compiling C
    compiler for compiling the Lemon parser generator and the Snowball
    stemming algorithm compiler. The configure script will attempt to
    locate one, but you can override this autodetection by passing
    CC_FOR_BUILD on the command line like so::

      ./configure CC_FOR_BUILD=/opt/bin/gcc

--enable-documentation
    This tells configure to enable make dependencies for regenerating
    documentation files. By default it uses the same setting as
    --enable-maintainer-mode.

Debugging Messages
==================

If you configure with --enable-log, lots of places in the code generate
debugging messages to tell us what they're up to - this information can be
very useful for debugging both the Xapian library and code which uses it. But
the quantity of information generated is potentially vast so there's a
mechanism to allow you to select where to store the log and which types of
message you're interested in by setting environment variables. You can:

* set XAPIAN_DEBUG_LOG to be the path to a file that you would like debugging
  output to be appended to, or to the special value ``-`` to indicate that you
  would like debugging output to be sent to stderr. Unless XAPIAN_DEBUG_LOG
  is set, no debug logging will be performed. Occurrences of %p in
  XAPIAN_DEBUG_LOG will be replaced with the current process-id.

* set XAPIAN_DEBUG_FLAGS to a string of capital letters indicating the types
  of debugging message you would like to display (the default is to log calls
  to API functions and methods). These letters are shown in the first column
  of the log output, and are also listed in ``common/debuglog.h``. If the
  first character is ``-``, then the letters indicate those categories of
  message *not* to be shown instead. As a consequence of this, setting
  ``XAPIAN_DEBUG_FLAGS=-`` will give you all debugging messages.

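
The rules for these two variables can be sketched in C++. This is only an
illustration under stated assumptions: ``expand_log_path()`` and
``category_selected()`` are hypothetical helpers written for this document,
not functions from the Xapian sources, and the default used when
XAPIAN_DEBUG_FLAGS is unset is not modelled.

```cpp
#include <string>

// Hypothetical helper: replace each "%p" in a log path with the process ID.
std::string expand_log_path(const std::string& pattern, long pid) {
    std::string result;
    for (std::string::size_type i = 0; i < pattern.size(); ++i) {
        if (pattern[i] == '%' && i + 1 < pattern.size() &&
            pattern[i + 1] == 'p') {
            result += std::to_string(pid);
            ++i; // Skip the 'p' as well.
        } else {
            result += pattern[i];
        }
    }
    return result;
}

// Hypothetical helper: is `category` (a capital letter) selected by a
// XAPIAN_DEBUG_FLAGS-style string?
bool category_selected(const std::string& flags, char category) {
    if (!flags.empty() && flags[0] == '-') {
        // Inverted form: the letters after '-' are the excluded categories.
        return flags.find(category, 1) == std::string::npos;
    }
    // Plain form: only the listed categories are selected.
    return flags.find(category) != std::string::npos;
}
```

With this sketch, ``category_selected("-", 'A')`` is true for any category
letter because nothing is excluded, which is why ``XAPIAN_DEBUG_FLAGS=-``
selects every message.
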
These environment variables only have any effect if you ran configure with the
--enable-log option.

Each log message has the form::

  <message type> <pid> [<this>] <message>

For example::

  A 16747 [0x57ad1e0] void Xapian::Query::Internal::validate_query()

Each nested call adds another space before the ``[`` so you can easily see
which function call and return messages correspond.

Debugging memory allocations
============================

The testsuite can make use of valgrind 3.3.0 or newer to check for memory
leaks, reads from uninitialised memory, and some other bugs during tests.

Valgrind doesn't support every platform, but Xapian contains very little
platform-specific code (and most of what there is is Microsoft Windows
specific) so even just testing with valgrind on one platform gives good
coverage.

If you have a new enough version of valgrind installed, it's automatically
detected by configure and used when running the testsuite. The testsuite runs
more slowly under valgrind, so if you wish to disable this auto-detection you
can run configure with::

  ./configure VALGRIND=

Or you can disable use of valgrind during a particular run of "make check"
with::

  make check VALGRIND=

Or disable it while running a test directly (under sh or bash)::

  VALGRIND= ./runtest ./apitest

Running test programs
=====================

To run all tests, use ``make check``. You can also run just the subset of
tests which exercise the inmemory, remote progserver, remote TCP,
multi-database, brass, chert, or flint backends using ``make check-inmemory``,
``make check-remoteprog``, ``make check-remotetcp``, ``make check-multi``,
``make check-brass``, ``make check-chert``, or ``make check-flint``
respectively.

Also, ``make check-remote`` will run the tests on both variants of the remote
backend, and ``make check-none`` will run those tests which don't use any
backend. These are handy shortcuts when doing development work on a particular
backend.

The runtest script (in the tests subdirectory) takes care of the details of
running the test programs (including setting up the environment so they work
when srcdir != builddir and handling libtool dynamically linked binaries). To
run a test program by hand (rather than via make) just use::

  ./runtest ./apitest

You can specify options and arguments. Individual test programs optionally
take one or more test names as arguments, and you can also pass ``-v`` to get
more verbose output from failing tests, e.g.::

  ./runtest ./apitest -v deldoc1

If the number of the test is omitted, all tests with that basename are run,
so to run deldoc1, deldoc2, etc::

  ./runtest ./apitest deldoc

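
The name matching rule just described can be sketched as follows.
``test_name_matches()`` is a hypothetical helper written for this
illustration, and it assumes that a test's basename is its name with any
trailing digits removed; the harness's own code may differ in detail.

```cpp
#include <string>

// Hypothetical helper: should the test called `test` run when `wanted` was
// given on the command line?
bool test_name_matches(const std::string& test, const std::string& wanted) {
    // An exact name such as "deldoc1" selects just that test.
    if (test == wanted) return true;
    // Otherwise compare against the basename (assumed here to be the test
    // name with its trailing digits stripped).
    std::string::size_type last = test.find_last_not_of("0123456789");
    if (last == std::string::npos) return false; // Name is all digits.
    return test.substr(0, last + 1) == wanted;
}
```

Under this assumption ``deldoc`` selects deldoc1 and deldoc2, while
``deldoc1`` selects only deldoc1 (and not a test called deldoc12).
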
You can also use runtest to run a test program under gdb (or most other tools)::

  ./runtest gdb ./apitest -v deldoc1
  ./runtest valgrind ./apitest -v deldoc1

Some test programs take special arguments - for example, you can restrict
apitest to the flint backend using ``-bflint``.

There are a few environment variables which the testsuite harness checks for
which you might find useful:

XAPIAN_TESTSUITE_SIG_DFL:
    By default, the testsuite harness catches signals and handles them
    gracefully - the current test is failed, and the testsuite moves on to the
    next test. If you want to suppress this (some debugging tools may work
    better if the signal is not caught) set the environment variable
    XAPIAN_TESTSUITE_SIG_DFL to any value to prevent the testsuite harness
    from installing its own signal handling.

XAPIAN_TESTSUITE_OUTPUT:
    By default, the testsuite harness uses ANSI escape sequences to give
    colour output if stdout is a tty. You can disable this feature by setting
    XAPIAN_TESTSUITE_OUTPUT=plain (alternatively, piping the output (e.g.
    through ``cat`` or ``more``) will have the same effect). Auto-detection
    can be explicitly specified with XAPIAN_TESTSUITE_OUTPUT=auto (or empty).
    Any other value forces the use of colour. Colour output is always disabled
    on Microsoft Windows, so XAPIAN_TESTSUITE_OUTPUT has no effect there.

XAPIAN_TESTSUITE_LD_PRELOAD:
    The runtest script will add this to LD_PRELOAD if it is set, allowing you
    to easily load LD_PRELOAD libraries when running the testsuite. The
    original intended use was to allow use of libeatmydata
    (https://www.flamingspork.com/projects/libeatmydata/) which makes fsync
    and related calls no-ops, but configure now checks for the eatmydata
    wrapper script and this is used automatically. However, there may be
    other LD_PRELOAD libraries which are useful, so we've left the machinery
    in place.

Speeding up the testsuite with eatmydata
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The testsuite does a lot of small database operations, and the calls to fsync,
fdatasync, etc which Xapian makes by default can slow down testsuite runs
substantially. There's a handy LD_PRELOAD library called eatmydata
(http://www.flamingspork.com/projects/libeatmydata/), which can help here, by
turning fsync and related calls into no-ops.

You need a version of eatmydata with the eatmydata wrapper script (version 37
or newer), and then configure should auto-detect it and it'll get used when
running the testsuite (via runtest). If you wish to disable this
auto-detection for some reason, you can run configure with::

  ./configure EATMYDATA=

Or you can disable use of eatmydata during a particular run of "make check"
with::

  make check EATMYDATA=

Or disable it while running a test directly (under sh or bash)::

  EATMYDATA= ./runtest ./apitest

Using various debugging, profiling, and leak-finding tools
==========================================================

If you're using GCC 3.4 or newer, you can turn on debugging iterators, etc in
the GNU C++ STL by defining _GLIBCXX_DEBUG::

  ./configure CPPFLAGS=-D_GLIBCXX_DEBUG

For documentation of this option, see:
http://gcc.gnu.org/onlinedocs/libstdc++/debug.html

Note: all C++ code must be compiled with this defined or you'll get problems -
Xapian 0.9.7 and later add a suitable check to xapian/version.h to prevent you
from mixing code compiled with and without it.

To use valgrind (http://www.valgrind.org/), no special build options are
required, but make sure you compile with debugging information (on by default
for GCC). The valgrind documentation also recommends disabling optimisation
(with optimisation, line numbers in error messages can be confusing due to
code inlining, etc)::

  ./configure CXXFLAGS='-O0 -g'

To use gdb (http://www.gnu.org/software/gdb/), no special build options are
required, but make sure you compile with debugging information (on by default
for GCC). You'll probably find debugging easier if you compile without
optimisation (with optimisation, line numbers in error messages can be
confusing due to code inlining, etc, and the values of some variables can't be
printed because they've been eliminated from the code completely)::

  ./configure CXXFLAGS='-O0 -g'

To enable profiling for gprof::

  ./configure CXXFLAGS=-pg LDFLAGS=-pg

To use Purify (a proprietary tool)::

  ./configure CXXLD='purify c++' --disable-shared

To use Insure (another proprietary tool)::

  ./configure CXX=insure

To use lcov (at least version 1.10) to generate a test coverage report (see
`lcov.xapian.org <http://lcov.xapian.org/>`_ for reports) there are two make
targets:

* coverage-reconfigure: reruns configure in the source tree. See
  Makefile.am for details of the configure options used and why they
  are used.

* coverage-check: runs "make check" and generates an HTML report in a
  directory called "lcov".

If you have runes for using other tools, please add them above, or send them
to us.

If you want to try unreleased Xapian code, you can fetch it from our git
repository. For convenience, we also provide bootstrapped tarballs (much like
the source code download for any release version) which get built every 20
minutes if there have been any changes checked in. These tarballs need to
pass "make distcheck" to be automatically uploaded, so using them will help
to ensure that you don't pick a "bad" version. The snapshots are available
from the "Bleeding Edge" page of the Xapian website.

Building from git
=================

When building from a git checkout, we *strongly* recommend that you use
the ``bootstrap`` script in the top level directory to set up the tree ready
for building. This script will check which directories you have checked out,
so you can bootstrap a partial tree.

You will need the following tools installed to build from git:

* GNU m4 (for autoconf)
* perl 5 (for automake; also for various maintainer scripts)
* python >= 2.3 (for generating the Python bindings)
* GNU make (or another make which supports VPATH for explicit rules)
* GNU flex (for building doxygen)
* GNU bison (for building doxygen)
* Tcl (to generate unicode/unicode-data.cc)

For a recent version of Debian or Ubuntu, this command should ensure you have
all the necessary tools and libraries::

  apt-get install build-essential m4 perl python zlib1g-dev uuid-dev wget flex bison tcl

If you want to build Omega, you'll also need::

  apt-get install libpcre3-dev libmagic-dev

On Fedora, the uuid library can be installed by doing::

  yum install libuuid-devel

If building from git, you'll also need git-svn installed.

If you're doing much development work, you'll probably also want the following
tools:

* valgrind for better testsuite error finding
* ccache for faster rebuilds
* eatmydata for faster testsuite runs

The repository does not contain any automatically generated files
(such as configure, Makefile.in, Snowball-generated stemmers, Lemon-generated
parsers, SWIG-generated code, etc) because experience shows it's best to keep
these out of version control. To avoid requiring you to install the correct
versions of the tools required, we either include the source to these tools in
the repo directly (in the case of Snowball and Lemon), or pull them in to a
checkout using svn:externals or git-svn (SWIG), or the bootstrap script will
lazily download, build, and install them within the source tree (autoconf,
automake, libtool, and doxygen). To download the source tarballs for these,
bootstrap will use wget or curl if installed. If not, it will give an error
telling you the URL to download from by hand and where to copy the file to.

Bootstrap will then run autoreconf on each of the checked-out subdirectories,
and generate a top-level configure script. This configure script allows you to
configure xapian-core and any other modules you've checked out with a single
simple command, such that the other modules link against the uninstalled
xapian-core (which is very handy for development work and a bit fiddly to set
up by hand). It automatically passes --enable-maintainer-mode to the
subprojects so that the autotools will be rerun if configure.ac, Makefile.am,
etc are modified.

The bootstrap script doesn't care what the current directory is. The top-level
configure script generated by it supports building in a separate directory to
the sources: simply create the directory you want to build in, and then run the
configure script from inside that directory. For example, to build in a
directory called "build" (starting in the top level source directory)::

  mkdir build
  cd build
  ../configure

When running bootstrap, if you need to add any extra macro directories to the
path searched by aclocal (which is part of automake), you can do this by
specifying these in the ACLOCAL_FLAGS environment variable, e.g.::

  ACLOCAL_FLAGS=-I/extra/macro/directory ./bootstrap

If you wish to prevent bootstrap from downloading and building the autotools,
pass the --without-autotools option. You can force it to delete the downloaded
and installed versions by passing --clean.

If you are tracking development in git, there will sometimes be changes
to the build system sources which require regeneration of the generated
makefiles and associated machinery. We aim to make the build system
automatically regenerate the necessary files, but in the event that a build
fails after an update, it may be worth re-running the bootstrap script to
regenerate the build system from scratch, before looking for the cause of the
problem.

Tools required to build documentation
-------------------------------------

If you want to be able to build distribution tarballs (with "make dist") then
you'll also need some further tools. If you don't want to have to install all
these tools, then pass --disable-documentation to configure to disable these
rules (the default state of this follows the setting of
--enable-maintainer-mode, so in a non-maintainer-mode tree, you can pass
--enable-documentation to enable these rules). Without the documentation,
"make dist" will fail (to prevent accidentally distributing tarballs without
documentation), but you can configure and build.

The documentation tools are:

* doxygen (v1.5.9 is used for 1.1.x snapshots and releases; 1.4.6 produced
  incomplete documentation for Xapian::Query).
* dot (part of Graphviz. Doxygen's DOT_MULTI_TARGETS option apparently needs ">1.8.10")
* gs (part of Ghostscript)
* pdflatex (in texlive-latex-base on Debian/Ubuntu)
* epstopdf (in texlive-extra-utils on Debian/Ubuntu)
* makeindex (in texlive-binaries on Debian/Ubuntu, or texlive-base-bin for older releases)
* help2man
* rst2html or rst2html.py (in python-docutils on Debian/Ubuntu)
* pngcrush (optional - used to reduce the size of PNG files in the HTML
  documentation)

Note that pdflatex, epstopdf, gs, and makeindex must all currently be on your
path (as specified by the environment variable PATH), since doxygen will look
for them there.

For a recent version of Debian or Ubuntu, this command should install all the
required documentation tools::

  apt-get install doxygen graphviz ghostscript texlive-latex-base texlive-extra-utils texlive-binaries texlive-fonts-extra texlive-fonts-recommended texlive-latex-extra texlive-latex-recommended help2man python-docutils pngcrush

* autoconf 2.68 is used to generate snapshots and releases.

  autoconf 2.64 is a hard minimum requirement.

  autoconf 2.60 is required for docdir support and AC_TYPE_SSIZE_T.

  autoconf 2.62 generates faster configure scripts and warns about unrecognised
  options passed to configure.

  autoconf 2.63 fixes a regression in AC_C_BIGENDIAN introduced in 2.62
  (Omega uses this macro).

  autoconf 2.64 generates smaller configure scripts by using shell functions.

* automake 1.11.6 is used to generate snapshots and releases.

  automake 1.10.2 is a hard minimum requirement.

  automake 1.10 requires autoconf 2.60 and Perl 5.6.

  automake 1.10.1 fixes "make clean" to remove .libs from bin/ and examples/.

  automake 1.10.2 fixes a few bugs (not investigated if these affect Xapian)

* libtool 2.4.6 is used to generate snapshots and releases.

  libtool 2.2.8 is the current hard minimum requirement.

  libtool 2.2 is required for us to be able to override link_all_deplibs_CXX
  and sys_lib_dlsearch_path_spec in configure. It also fixes some
  long-standing issues and is significantly faster.

Please tell us if you find that newer versions of any of these tools work or
fail to work.

There is a good GNU autotools tutorial at
<http://www.lrde.epita.fr/~adl/autotools.html>.

Building from git on Windows with MSVC
--------------------------------------

The Windows build process is maintained in the xapian-maintainer-tools
directory in the Xapian git repository. See the win32msvc/README file in that
directory for details of how to build from git.

* STL: We decided early on to embrace the C++ STL. Some older compilers
  don't include full support for this. Often we can work around this, and we
  should do so where the effort is low compared to the gain (so a compiler
  version which is widely used is more worth supporting than one which is
  hardly used by anyone).

  There is now plenty of choice of compilers which provide good conformance to
  ISO C++, so if working around problems for some compiler proves too hard we
  should just document the issue and users will either have to upgrade to a
  more compliant compiler, or use another STL implementation such as STLport
  (http://stlport.sourceforge.net/).

* C++ features we currently assume:

  * We assume <sstream> is available. GCC < 2.95.3 didn't have it but GCC
    2.95.3 includes a backported version. We aren't aware of any other
    compilers still in use which lack it.

  * Non-".h" versions of standard ISO C++ headers (e.g. ``#include <list>``
    rather than ``#include <list.h>``). We aren't aware of any compiler still
    in use which lacks these, and GCC 4.3 no longer has the old versions. If
    there are any, we could add a directory full of forwarding headers to work
    around this.

  * Standard header ``<limits>`` (for ``numeric_limits<>``) - for GCC, this was

  * Standard header ``<streambuf>`` (GCC < 3.0 only has ``<streambuf.h>``).

  * Working auto_ptr in header ``<memory>`` (some old version of some compiler
    had a buggy implementation - the details are lost to history, but it may
    have been GCC 2.95, or perhaps EGCS).

* RTTI (dynamic_cast<>, typeid, etc): Needing to use RTTI features in the
  library most likely indicates a design flaw, and you should avoid use
  of these features. Where necessary, you can use a technique similar to
  Database::as_networkdatabase() to replace dynamic_cast<>.

* Exceptions: In hindsight, throwing exceptions in the library seems to have
  been a poor design decision. GCC on Solaris can't cope with exceptions in
  shared libraries (though it appears this may have been fixed in more recent
  versions), and we've also had test failures on other platforms which only
  occur with shared libraries - possibly with a similar cause. Exceptions can
  also be a pain to handle elegantly in the bindings. We intend to investigate
  modifying the library to return error codes internally, and then offering the
  user the choice of exception throwing or error code returning API methods
  (with the exception being thrown by an inlined wrapper in the externally
  visible header files). With this in mind, please don't complicate the
  internal handling of exceptions...

* "using namespace std;" and "using std::XXX;" - it's OK to use these in
  applications, library code, and internal library headers. But in externally
  visible headers (such as anything included by "#include <xapian.h>") you MUST
  use explicit "std::" qualifiers - it's not acceptable to pull anything from
  namespace std into the namespace of an application which uses Xapian.

* Use C++ style casts (static_cast<>, reinterpret_cast<>, and const_cast<>)
  or constructor-syntax (e.g. ``double(value)``) in preference to C style
  casts. The syntax of the C++ casts is ugly, but they do make the intent much
  clearer which is definitely a good thing.

* std::pair<> with an STL class as one (or both) of the members can produce
  very long symbols (over 4KB!) after name mangling - long enough to overflow
  the size limits of some vendor compilers or toolchains (so this can affect
  GCC if it is using the system ld or as). Even where the compiler works, the
  symbol bloat in an unstripped build is probably best avoided, so it's
  preferable to use a simple two member struct instead. The code is probably
  more readable anyway, and easier to extend if more members are needed later.

* We try to avoid putting the full definition of virtual methods in header
  files. This is because current compilers can't (as far as we know) inline
  virtual methods, so putting the definition in the header file simply slows
  down compilation (and, because method definitions often require further
  header files to be included, this can result in many more files needing
  recompilation after a change to a header file than is really necessary).
  Just put the declaration in the header file, and put the definition in a .cc
  file with the same basename.

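
As an illustration of the RTTI point in the list above, here is a minimal
sketch of replacing ``dynamic_cast<>`` with a virtual method. The class names
(``Backend``, ``NetworkBackend``, ``LocalBackend``) and ``timeout_seconds()``
are hypothetical and are not the library's own classes; only the idea of an
``as_...()`` method comes from the text.

```cpp
#include <cstddef>

class NetworkBackend; // Hypothetical subclass, defined below.

// Hypothetical base class standing in for an internal database class.
class Backend {
  public:
    virtual ~Backend() { }
    // Returns NULL unless this object really is a NetworkBackend.
    virtual NetworkBackend* as_network_backend() { return NULL; }
};

class NetworkBackend : public Backend {
  public:
    // The one subclass which overrides the method returns itself.
    NetworkBackend* as_network_backend() { return this; }
    int timeout_seconds() const { return 10; } // Illustrative value only.
};

class LocalBackend : public Backend { };

// Code which would otherwise need dynamic_cast<NetworkBackend*>(&db).
int timeout_or_minus_one(Backend& db) {
    if (NetworkBackend* net = db.as_network_backend())
        return net->timeout_seconds();
    return -1;
}
```

The cost is one extra virtual method in the base class for each subclass you
need to recognise, so this only suits cases where such checks are rare.
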
Include ordering for source files
---------------------------------

To help us move towards a consistent ordering of #include lines in source
files, please use the following policy when ordering them:

* #include <config.h> should be first, and use <> not "" (as recommended by the
  autoconf manual).

* The header corresponding to the source file should be next. This means that
  compilation of the library ensures that each header with a corresponding
  source file is "self supporting" (i.e. it implicitly or explicitly includes
  all of the headers it requires).

* External xapian-core headers, alphabetically. When included from other
  external headers, use <> to reduce problems with finding headers in the
  user's source tree by mistake. In sources and internal headers, use "" (?) -
  practically this makes no difference as we have -I for srcdir and builddir,
  but <> suggests installed header files so "" seems more natural.

* Internal headers, alphabetically (using "").

* "Safe" versions of library headers (include these first to avoid issues if
  other library headers include the ones we want to wrap). Use "" and order
  alphabetically.

* Library headers, alphabetically.

* Standard C++ headers, alphabetically. Use the modern (no .h suffix) names.

C++ Portability Issues
======================

The "C++ FAQ Lite" covers many frequently asked C++ questions:
http://www.parashift.com/c++-faq-lite/

Header Portability Issues
-------------------------

Don't directly '#include <fcntl.h>' - instead '#include "safefcntl.h"'.

The main reason for this is that when using certain compilers on certain
versions of Solaris, fcntl.h does '#define open open64'. Sadly this breaks C++
code which has methods called open (as we do). There's a cunning workaround
for this problem in common/safefcntl.h.

Also, safefcntl.h ensures that O_BINARY is defined (to 0 if not required) so
calls to open() and creat() can specify O_BINARY unconditionally for the
benefit of platforms which discriminate between text and binary files.

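
The O_BINARY part of this can be sketched as below. This is an assumption
about the general shape of such a fallback, not a copy of
common/safefcntl.h, and ``can_open_for_reading()`` is a hypothetical helper.
It includes <fcntl.h> directly only so that the example is self-contained
(code in the Xapian tree would include "safefcntl.h" instead), and it doesn't
attempt to show the workaround for the Solaris ``open64`` macro.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Platforms which don't distinguish text and binary files have no O_BINARY,
// so define it to 0 there; OR-ing it into the flags is then harmless.
#ifndef O_BINARY
# define O_BINARY 0
#endif

// Hypothetical helper: can `path` be opened for reading in binary mode?
bool can_open_for_reading(const char* path) {
    int fd = open(path, O_RDONLY | O_BINARY);
    if (fd < 0) return false;
    close(fd);
    return true;
}
```

Because the flag is always defined, callers never need an ``#ifdef`` around
their calls to open().
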
Don't directly '#include <windows.h>' - instead '#include "safewindows.h"'
which reduces the bloat of header files included and prevents some of the
more egregious namespace pollution. It also defines any constants we need
which might be missing in older versions of the mingw headers.

Don't directly '#include <winsock2.h>' - instead '#include "safewinsock2.h"'.
This ensures that safewindows.h is included before <winsock2.h> to avoid
winsock2.h including windows.h without our namespace pollution reducing
workarounds.

Don't directly '#include <errno.h>' - instead '#include "safeerrno.h"' which
works around a problem with Compaq's C++ compiler.

Don't directly '#include <sys/select.h>' - instead '#include "safesysselect.h"'
which supports older UNIX platforms which predate POSIX 1003.1-2001 and works
around a problem on Solaris.

Don't directly '#include <sys/stat.h>' - instead '#include "safesysstat.h"'
which under MSVC enables stat to work on files > 2GB, defines the missing
POSIX macros S_ISDIR and S_ISREG, pulls in <direct.h> for mkdir() (which is
provided by sys/stat.h under UNIX) and provides a compatibility wrapper for
mkdir() which takes 2 arguments (so code using mkdir can always just pass
two arguments).

To get ``WEXITSTATUS`` or ``WIFEXITED`` defined, '#include "safesyswait.h"'.
Note that this won't provide ``waitpid()``, etc on Microsoft Windows, since
these functions are only really useful when ``fork()`` is available.

Don't directly '#include <unistd.h>' - instead '#include "safeunistd.h"' (MSVC
doesn't even HAVE unistd.h!)

The various "safe" headers are maintained in xapian-core/common, but also used
by Omega. Omega pulls in a copy using the svn:externals property which is
set on xapian-applications/omega. Because of how this feature of SVN works,
we pull in a read-only copy via HTTP access to the main repository, so you
have to update it in xapian-core, and if you have ssh write access to the
repo but no HTTP access, this will fail.

The imported URL has to be absolute, which isn't too branch friendly. To avoid
problems from this, we specify a particular revision to import, but this does
mean we need to monitor changes to xapian-core and decide when to update omega.
The release checklist includes a reminder to check this.

Warning-Free Compilation
------------------------

Compiling without warnings on every platform is our goal, though it's not
always possible to achieve. For example, some GCC 3.x compilers produce the
occasional bogus warning (e.g. warning that a variable may be used
uninitialised, despite it being initialised at the point of declaration!)

You should consider configuring with::

  ./configure CXXFLAGS=-Werror

when doing development work on Xapian. This promotes warnings to errors,
which should ensure you at least don't introduce new warnings for the compiler
you're using.

If you configure with --enable-maintainer-mode, and are using GCC 4.1 or newer,
this is done for you automatically. This is intended to be an aid rather than
a form of automated punishment - it's all too easy to miss a new warning as
once a file is compiled, you don't see it unless you modify that file or one of
its dependencies.

With Intel's C++ compiler, --enable-maintainer-mode also enables -Werror.
If you know the equivalent of -Werror for other compilers, please add a note
here, or tell us so that we can add a note.

Miscellaneous Portability Issues
--------------------------------

Make sure that the last line of any source file ends with a linefeed character
since it's undefined behaviour if it doesn't (most compilers accept it, though
at least GCC gives a warning).

Branch Prediction Hints
=======================

For compilers which support ``__builtin_expect()`` (GCC >= 3.0 and some others)
you can provide manual hints to assist branch prediction. We've wrapped these
in macros which evaluate to just their argument for compilers which don't
support ``__builtin_expect()``.

Within the xapian-core library code, you can mark the expressions in ``if`` and
``while`` statements as ``rare`` (if the condition is rarely true) or ``usual``
(if the condition is usually true).

For example::

  if (rare(something_unusual())) deal_with_it();

  while (usual(!end_condition())) keep_going();

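
For reference, such macros could be defined along the following lines. This
is a sketch which assumes that testing ``__GNUC__`` is a good enough check for
``__builtin_expect()``; it isn't the definition used in xapian-core, and
``sum_until_negative()`` is a hypothetical function which just shows the hints
in use.

```cpp
#if defined __GNUC__
// "!!" normalises the condition to 0 or 1 before it is compared with the
// expected value.
# define rare(COND) __builtin_expect(!!(COND), 0)
# define usual(COND) __builtin_expect(!!(COND), 1)
#else
// Other compilers: the macros evaluate to just their argument.
# define rare(COND) (COND)
# define usual(COND) (COND)
#endif

// Hypothetical example: add up values, stopping at the first negative one,
// which we assume is an unusual event.
long sum_until_negative(const int* values, int count) {
    long total = 0;
    int i = 0;
    while (usual(i < count)) {
        if (rare(values[i] < 0)) break;
        total += values[i];
        ++i;
    }
    return total;
}
```

The hints only influence how the compiler lays out the generated code; the
function returns the same result whether or not they are present.
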
It's easy to make incorrect assumptions about where hotspots are and which
branches are usually taken or not, so except for really obvious cases (such
as ``if (!consistency_check()) throw_exception();``) you should benchmark
that new ``rare`` and ``usual`` hints help rather than hinder before committing
them to the repository. It's also likely to be a waste of effort to add them
outside of areas of code which are executed very frequently.

Don't expect miracles - the first 15 uses added saved approximately 1%.

If you know how to implement the ``rare`` and ``usual`` macros for other
compilers, please let us know.

732 Especially for a library, compile-time options aren't a good solution for
733 how to integrate a new feature. An increasingly large number of users install
734 pre-built binary packages rather than building from source, and unless the
735 package is capable of being split into modules, the packager has to choose a
736 set of compile-time options to use. And they'll tend to choose either the
737 standard ones, or perhaps a broader set to try to keep everyone happy. For a
738 library, similar issues occur when installing from source as well - the
739 sysadmin must choose the options which will keep all users happy.
741 Another problem with compile-time options is that it's hard to ensure that
742 a change doesn't break compilation under some combination of options without
743 actually building and running the test-suite on all combinations. The fewer
compile-time options, the more likely the code will compile with every
combination.
747 So please think carefully before adding more compile-time options. They're
748 probably OK for experimental features (but should go away once a feature is no
749 longer experimental). Options to instrument a build for special purposes
750 (debug, profiling, etc) are also acceptable. Disabling whole features probably
751 isn't (e.g. the --disable-backend-XXX options we already have are dubious,
752 though being able to disable the remote backend can be useful when trying to
753 get Xapian going on a platform).
758 We don't want to force those building Xapian from the source distribution to
759 have to use GNU make. Requiring GNU make for "make dist" isn't such a problem
760 but it's probably better to use portable constructs everywhere to avoid
761 problems when people move or copy code between targets. If you do make use
762 of non-portable constructs where it's OK, add a comment noting the special
763 circumstances which justify doing so.
765 Here's an incomplete list of things to avoid:
767 * Don't use "$(RM)" - it's defined by GNU make, but using it actually harms
768 portability as other makes don't define it. Use plain "rm" instead.
770 * Don't use "%" pattern rules - these are GNU make specific. Use an
implicit rule (e.g. ".c.o:") if you can. Otherwise, write out each version
explicitly.
774 * Don't use "$<" except in implicit rules. This is an annoying restriction,
775 as using "$<" makes it much easier to make VPATH builds work. But it's only
portable in implicit rules. Tips for rewriting - if it's a source file,
write it as ``$(srcdir)/foo.ext``.
781 If it's a generated object file or similar, just write the name as is. The
782 tricky case is a generated file which isn't in SVN but is shipped in the
783 distribution tarball, as such a file could be in either the source or build
784 tree. Use this trick to make sure it's found whichever directory it's in::
786 `test -f foo.ext || echo '$(srcdir)/'`foo.ext
* Don't use "exit 1" to make a rule fail. Use "false" instead. BSD make
doesn't like "exit 1" in a rule.
791 * Don't use make conditionals. Automake offers conditionals which may be
792 of use, and these are implemented to work with any make. See the automake
793 manual for details, and a few caveats.
795 * The list of portable utilities is:
797 cat cmp cp diff echo egrep expr false grep install-info
798 ln ls mkdir mv pwd rm rmdir sed sleep sort tar test touch true
800 Note that versions of these (GNU versions in particular) support switches
801 which aren't portable - notably, "test -r" isn't portable; neither is
802 "cp -a". And note that "mkdir -p" isn't portable - the semantics vary.
803 The autoconf manual has some useful information about writing portable
804 shell code (most of it not specific to autoconf)::
806 http://www.gnu.org/software/autoconf/manual/autoconf.html#Portable-Shell
808 * Don't use "include" - it's not present in BSD make (at least some versions
809 have ".include" instead, but that doesn't really seem to help...) Automake
provides a configure-time include, which may provide a replacement for some
uses of "include".
813 * It appears that BSD make only supports VPATH for implicit rules (e.g.
814 ".c.o:") - there's certainly a restriction there which is not present in GNU
815 make. We used to try to work around this, but now we use AM_MAINTAINER_MODE
816 to disable rules which are only needed by those developing Xapian (these were
817 the rules which caused problems). And we recommend those developing Xapian
818 use GNU make to avoid problems.
820 * Rules with multiple targets can cause problems for parallel builds. These
821 rules are really just a shorthand for multiple rules with the same
822 prerequisites and commands, and it is fine to use them in this way. However,
823 a common temptation is to use them when a single invocation of a command
824 generates multiple output files, by adding each of the output files as a
target. E.g., if a SWIG language module generates xapian_wrap.cc and
xapian_wrap.h, it is tempting to add a single rule something like::

    # This rule has a problem
    xapian_wrap.cc xapian_wrap.h: xapian.i
    	SWIG_commands
832 This can result in SWIG_commands being run twice, in parallel. If
833 SWIG_commands generates any temporary files, the two invocations can
834 interfere causing one of them to fail.
836 Instead of this rule, one solution is to pick one of the output files as a
primary target, and add a dependency for the second output file on the first
output file::

    # This rule also has a problem
    xapian_wrap.h: xapian_wrap.cc
    xapian_wrap.cc: xapian.i
    	SWIG_commands
845 This ensures that make knows that only one invocation of SWIG_commands is
846 necessary, but could result in problems if the invocation of SWIG_commands
847 failed after creating xapian_wrap.cc, but before creating xapian_wrap.h.
848 Instead, we recommend creating an intermediate target::
    # This rule works in most cases
    xapian_wrap.cc xapian_wrap.h: xapian_wrap.stamp
    xapian_wrap.stamp: xapian.i
    	SWIG_commands
    	touch $@
856 Because the intermediate target is only touched after the commands have
857 executed successfully, subsequent builds will always retry the commands if an
858 error occurs. Note that the intermediate target cannot be a "phony" target
859 because this would result in the commands being re-run for every build.
861 However, this rule still has a problem - if the xapian_wrap.cc and
862 xapian_wrap.h files are removed, but the xapian_wrap.stamp file is not, the
863 .cc and .h files will not be regenerated. There is no simple solution to
864 this, but the following is a recipe taken from the automake manual which
865 works. For details of *why* it works, see the section in the automake manual
866 titled "Multiple Outputs"::
    # This rule works even if some of the output files were removed
    xapian_wrap.cc xapian_wrap.h: xapian_wrap.stamp
    ## Recover from the removal of $@. A full explanation of these rules is in
    ## the automake manual under the heading "Multiple Outputs".
    	@if test -f $@; then :; else \
    	  trap 'rm -rf xapian_wrap.lock xapian_wrap.stamp' 1 2 13 15; \
    	  if mkdir xapian_wrap.lock 2>/dev/null; then \
    	    rm -f xapian_wrap.stamp; \
    	    $(MAKE) $(AM_MAKEFLAGS) xapian_wrap.stamp; \
    	    rmdir xapian_wrap.lock; \
    	  else \
    	    while test -d xapian_wrap.lock; do sleep 1; done; \
    	    test -f xapian_wrap.stamp; exit $$?; \
    	  fi; \
    	fi
    xapian_wrap.stamp: xapian.i
    	SWIG_commands
    	touch $@
887 * This is actually a robustness point, not portability per se. Rules which
888 generate files should be careful not to leave a partial file in place if
889 there's an error as it will have a timestamp which leads make to believe it's
up-to-date. So this is bad::

    $(PERL) script.pl > foo.cc

This is better::

    $(PERL) script.pl > foo.tmp
    mv foo.tmp foo.cc
901 Alternatively, pass the output filename to the script and make sure you
902 delete the output on error or a signal (although this approach can leave
903 a partial file in place if the power fails). All used Makefile.am-s and
904 scripts have been checked (and fixed if required) as of 2003-07-10 (didn't
905 check xapian-bindings).
907 * Another robustness point - if you add a non-file target to a makefile, you
908 should also list it in ".PHONY". Otherwise your target won't get remade
reliably if someone creates a file with the same name in their tree. For
example::
912 .PHONY: hello goodbye
920 And lastly a style point - using "@" to suppress echoing of commands being
921 executed removes choice from the user - they may want to see what commands
are being executed. And if they don't want to, many versions of make support
"make -s" to suppress the echoing of commands.
925 Using @echo on a message sent to stdout or stderr is acceptable (since it
926 avoids showing the message twice). Otherwise don't use "@" - it makes it
927 harder to track down problems in the makefiles.
932 Scripts generally should *not* have an extension indicating the language they
933 are currently implemented in (e.g. ``runtest`` rather than ``runtest.sh`` or
934 ``runtest.pl``). The problem with such an extension is that if we decide
935 to reimplement the script in a different language, we either have to rename
936 the script (which is annoying as people will be used to the name, and may
937 have embedded it in their own scripts), or we have a script with a confusing
938 name (e.g. a Python script with extension ``.pl``).
940 The above reasoning doesn't apply to scripts which have to be in a particular
941 language for some reason, though for consistency they probably shouldn't get
942 an extension either, unless there's a good reason to have one.
947 Use Assert to perform internal consistency checks, and to check for invalid
948 arguments to functions and methods (e.g. passing a NULL pointer when this isn't
949 permitted). It should *NOT* be used to check for error conditions such as
950 file read errors, memory allocation failing, etc (since we want to perform such
951 checks in non-debug builds too).
953 File format errors should also not be tested with Assert - we want to catch
954 a corrupted database or a malformed input file in a non-debug build too.
956 There are several variants of Assert:
958 - Assert(P) -- asserts that expression P is true.
960 - AssertRel(a,rel,b) -- asserts that (a rel b) is true - rel can be a boolean
961 relational operator, i.e. one of ``==``, ``!=``, ``>``, ``>=``, ``<``,
962 ``<=``. The message given if the assertion fails reports the values of
a and b, so ``AssertRel(a,<,b);`` is more helpful than ``Assert(a < b);``.
965 - AssertEq(a,b) -- shorthand for AssertRel(a,==,b).
967 - AssertEqDouble(a,b) -- asserts a and b differ by less than DBL_EPSILON
969 - AssertParanoid(P) -- a particularly expensive assertion. If you want a build
with Asserts enabled, but without a great performance overhead, then
pass --enable-assertions=partial to configure and AssertParanoids
972 won't be checked, but Asserts will. You can also use AssertRelParanoid
973 and AssertEqParanoid.
975 - CompileTimeAssert(P) -- if P is a constant expression, CompileTimeAssert
can be used to assert it is non-zero at compile-time - if P evaluates
to zero, then the compilation will fail with an error. CompileTimeAssert
can only be used inside a function body. There should be no runtime
overhead for using CompileTimeAssert(), so CompileTimeAssert() is always
enabled, regardless of whether --enable-assertions is passed to configure
or not.
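To illustrate why a relational assertion is more helpful than a plain one, here is a minimal sketch of a macro in the spirit of AssertRel. The name ``SKETCH_ASSERT_REL`` and the ``check_less()`` helper are hypothetical, and ``std::logic_error`` merely stands in for Xapian::AssertionError; the real macros may well be implemented differently:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Sketch only: reports both the expression text and the operand values.
// Note that A and B are evaluated again when building the failure message.
#define SKETCH_ASSERT_REL(A, REL, B) \
    do { \
        if (!((A) REL (B))) { \
            std::ostringstream msg; \
            msg << "Assertion failed: " #A " " #REL " " #B \
                << " (" << (A) << " " #REL " " << (B) << ")"; \
            throw std::logic_error(msg.str()); \
        } \
    } while (0)

// Hypothetical helper: returns the failure message, or an empty string
// if the assertion held.
std::string check_less(int a, int b) {
    try {
        SKETCH_ASSERT_REL(a, <, b);
    } catch (const std::logic_error& e) {
        return e.what();
    }
    return std::string();
}
```

With this sketch, ``check_less(3, 2)`` yields a message containing both ``a < b`` and the actual values ``3 < 2``, whereas a plain boolean assertion could only report the expression text.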
983 Marking Features as Deprecated
984 ==============================
986 In the API headers, a feature (a class, method, function, enum, typedef, etc)
987 can be marked as deprecated by using the XAPIAN_DEPRECATED() or
XAPIAN_DEPRECATED_CLASS macros. Note that you can't deprecate a preprocessor
macro.
991 For compilers with a suitable mechanism (currently GCC 3.1 or later, and
992 MSVC 7.0 or later) this causes compile-time warning messages to be emitted for
993 any use of the deprecated feature. For compilers without support, the macro
994 just expands to its argument.
996 You must add this line to any API header which uses XAPIAN_DEPRECATED() or
997 XAPIAN_DEPRECATED_CLASS::
999 #include <xapian/deprecated.h>
1001 When marking a feature as deprecated, document the deprecation in
1002 docs/deprecation.rst. When actually removing deprecated features, please tidy
1003 up by removing the inclusion of <xapian/deprecated.h> from any file which no
1004 longer marks any features as deprecated.
1006 The XAPIAN_DEPRECATED() macro should wrap the whole declaration except for the
1007 semicolon and any "definition" part, for example::
1009 XAPIAN_DEPRECATED(int old_function(double arg));
1013 XAPIAN_DEPRECATED(int old_method());
1015 XAPIAN_DEPRECATED(int old_const_method() const);
1017 XAPIAN_DEPRECATED(static int old_static_method());
1019 XAPIAN_DEPRECATED(static const int OLD_CONSTANT) = 42;
1022 Mark a class as deprecated by inserting ``XAPIAN_DEPRECATED_CLASS`` after the
1023 class keyword like so::
1025 class XAPIAN_DEPRECATED_CLASS Foo {
1032 To avoid compilation errors with older GCC versions (noted with GCC 3.3.5),
1033 you can't mark a method which is defined inline in a class with
1034 ``XAPIAN_DEPRECATED()`` (this works with recent GCC versions though)::
1038 // This fails to compile with GCC 3.3.5, so don't do this!
1039 XAPIAN_DEPRECATED(int old_inline_method()) { return 42; }
1042 Instead rewrite like so::
1046 XAPIAN_DEPRECATED(int old_inline_method());
1049 inline int Foo::old_inline_method() { return 42; }
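For readers curious how such a macro can work, the following is a minimal sketch assuming the GCC ``__deprecated__`` attribute and MSVC's ``__declspec(deprecated)``. The name ``SKETCH_DEPRECATED`` is hypothetical, and the real definitions in ``<xapian/deprecated.h>`` may differ:

```cpp
#include <cassert>

// Sketch only: wraps a whole declaration, as described above.
#if defined(__GNUC__)
# define SKETCH_DEPRECATED(DECL) DECL __attribute__((__deprecated__))
#elif defined(_MSC_VER)
# define SKETCH_DEPRECATED(DECL) __declspec(deprecated) DECL
#else
// No compiler support: the macro just expands to its argument.
# define SKETCH_DEPRECATED(DECL) DECL
#endif

class Foo {
  public:
    // Declaration only, wrapped in the macro...
    SKETCH_DEPRECATED(int old_inline_method());

    int new_method() { return 42; }
};

// ...with the inline definition supplied separately, outside the class.
inline int Foo::old_inline_method() { return new_method(); }
```

Defining the deprecated method doesn't trigger a warning; only code which calls ``old_inline_method()`` does, so callers are nudged towards ``new_method()`` without the build breaking.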
1054 If you have a patch to fix a problem in Xapian, or to add a new feature,
1055 please send it to us for inclusion. Any major changes should be discussed
1056 on the xapian-devel mailing list first:
1057 <https://xapian.org/lists>
Also, please read the following section on licensing of patches before
submitting a patch.
1062 We find patches in unified diff format easiest to read. If you're using
1063 git, then "git diff" is good (or "git format-patch" for a patch series). If
you're using an SVN checkout just use "svn diff" to generate the diff. If
1065 you're working from a tarball, you can unpack a second clean copy of the files
1066 and compare the two versions with "diff -pruN" (-p reports the function name
1067 for each chunk, -r acts recursively, -u does a unified diff, and -N shows
1068 new files in the diff). Alternatively "ptardiff" (which comes with perl, at
least on Debian and Ubuntu) can diff against the original tarball, unpacking
it on the fly.
1072 Please set the width of a tab character in your editor to 8 spaces, and use
1073 Unix line endings (i.e. LF, not CR+LF). Failing to do so will make it much
1074 harder for us to merge in your changes.
1076 We don't currently have a formal coding standards document, but please try
1077 to follow the style of the existing code. In particular:
1079 * Indent C++ code by 4 spaces for a new indentation level, and set your editor
1080 to tab-fill indentation (with a tab being 8 spaces wide).
1082 As an exception, "public", "protected" and "private" declarations in classes
1083 and structs should be indented by 2 spaces, and the following code should be
1084 indented by 2 more spaces::
1091 The rationale for this exception is that class definitions in header files
1092 often have fairly long lines, so losing an indent level to the access
1093 specifier tends to make class definitions less readable.
1095 The default access for a class is always "private", so there's no need
1096 to specify that explicitly - in other words, write this::
1099 int internal_method();
1102 int external_method();
1109 int internal_method();
1112 int external_method();
1115 If a class only contains public methods and data, consider declaring it as a
1116 "struct" (the only difference in C++ is that the default access for a
1117 struct is "public").
1119 * Put a space before the "(" after control flow constructs like "for", "if",
1120 "while", etc. Don't put a space before the "(" in function calls. So
1121 write "if (strlen(p) > 10)" not "if(strlen (p) > 10)".
* When "if", "else", "for", "while", "do", "switch", "case", "default", "try",
1124 or "catch" is followed by a block enclosed in braces, the opening brace
1125 should be on the same line, like so::
1134 The rationale for this is that it conserves vertical space (allowing more
1135 code to fit on screen) without reducing readability.
* If you have an empty loop body, use ``{ }`` rather than ``;`` as the former
1138 stands out more clearly to the reader (but also consider if the code might be
1139 clearer written a different way).
* Prefer "++i;" to "i++;", "i += 1;", or "i = i + 1;". For simple integer
1142 variables these should generate equivalent (if not identical) code, but if i
1143 is an iterator object then the pre-increment form can be more efficient in
1144 some cases with some compilers. It's simpler and more consistent to always
1145 use the pre-increment form (unless you make use of the old value which the
1146 post-increment form returns). For the same reasons, prefer "--i;" to "i--;",
1147 "i -= 1;", or "i = i - 1;".
1149 * Prefer "container.empty()" to "container.size() == 0" (and
1150 "!container.empty()" to "container.size() != 0" or "container.size() > 0").
1151 Finding the size of a container may not be a constant time operation for
1152 all containers (e.g. std::list may not be, and indeed isn't for GCC - see
1153 https://gcc.gnu.org/onlinedocs/libstdc++/manual/containers.html#sequences.list.size).
1154 Also the "empty()" form makes the intent of the test more explicit.
1156 * Prefer not to use "else" when the control flow is diverted elsewhere at the
1157 end of the "if" block (e.g. by "return", "continue", "break", "throw"). This
1158 eliminates a level of indentation from the code in the "else" block, and
1159 typically makes the control flow logic clearer. For example::
* For standard ISO C headers, prefer the C++ form (e.g.
1182 "#include <cstdlib>" rather than "#include <stdlib.h>") unless there's a good
1183 reason (e.g. portability) to do otherwise. Be sure to document such
1184 exceptions to avoid another developer changing them to the standard form.
1185 Global exceptions: <signal.h> (lots of POSIX stuff which e.g. Sun's compiler
1186 doesn't provide in <csignal>).
1188 * For standard ISO C++ headers, *always* use the ISO C++ form '#include <list>'
1189 (pre-ISO compilers used '#include <list.h>', but GCC has generated a warning
1190 for this form for years, and GCC 4.3 dropped support entirely).
1192 * Some guidelines for efficient use of std::string:
1194 + When passing an empty string to a method expecting ``const std::string &``
1195 prefer ``std::string()`` to ``""`` or ``std::string("")`` as the first form
1196 is more likely to directly use a special "empty string representation" (it
1197 does with GCC at least).
1199 + To make a string object empty, ``s.resize(0)`` (if you want to keep the
current reserved space) or ``s = string()`` (if you don't) seem the best
options.
1203 + Use ``std::string::assign()`` rather than building a temporary string
1204 object and assigning that. For example, ``foo = std::string(ptr, len);``
1205 is better written as ``foo.assign(ptr, len);``.
1207 + It's generally better to build up strings using ``+=`` rather than
1208 combining series of components with ``+``. So ``foo = a + " and " + c`` is
1209 better written as ``foo = a; foo += " and "; foo += c;``. It's possible
1210 for compilers to handle the former without a lot of temporary string
1211 objects by returning a proxy object to allow the concatenation to happen
1212 lazily, but not all compilers do this, and it's likely to still have some
1213 overhead. Note that GCC 4.1 seems to produce larger code in some cases for
1214 the latter approach, but it's a definite win with GCC 4.4.
1216 * ``std::string(1, '\0')`` seems to be slightly more efficient than
1217 ``std::string("", 1)`` for constructing a std::string containing a single
1218 ASCII nul character.
1220 * Prefer ``new SomeClass`` to ``new SomeClass()``, since the latter tends to
lead one to write ``SomeClass foo();`` which is a function prototype, and not
1222 equivalent to the variable definition ``SomeClass foo``. However, note that
1223 ``new SomePODType()`` is *not* the same as ``new SomePODType`` (if
1224 SomePODType is a POD (Plain Old Data) type) - the former will zero-initialise
1225 scalar members of SomePODType.
1227 * When catching an exception which is an object, do it by const reference, so
1232 } catch (const ErrorClass &e) {
1236 Catching by value is bad because it "slices" the object if an object of a
1237 derived type is thrown. Even if derived types aren't a worry, it also causes
1238 the copy constructor to be called needlessly.
1240 See also: http://www.parashift.com/c++-faq-lite/exceptions.html#faq-17.7
1242 A const reference is preferable to a non-const reference as it stops the
1243 object being inadvertently modified. In the rare cases when you want to
1244 modify the caught object, a non-const reference is OK.
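The slicing problem mentioned in the last guideline can be demonstrated with a small sketch. The ``BaseError`` and ``DerivedError`` classes here are hypothetical and exist only for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical exception hierarchy, for illustration only.
struct BaseError {
    virtual ~BaseError() {}
    virtual std::string name() const { return "BaseError"; }
};

struct DerivedError : BaseError {
    std::string name() const { return "DerivedError"; }
};

// Catching by value copies only the BaseError part of the thrown object
// (some compilers warn about this, which is rather the point).
std::string catch_by_value() {
    std::string caught;
    try {
        throw DerivedError();
    } catch (BaseError e) {
        caught = e.name();
    }
    return caught;
}

// Catching by const reference preserves the dynamic type.
std::string catch_by_const_reference() {
    std::string caught;
    try {
        throw DerivedError();
    } catch (const BaseError& e) {
        caught = e.name();
    }
    return caught;
}
```

Here ``catch_by_value()`` reports ``BaseError`` because the derived part has been sliced off, while ``catch_by_const_reference()`` reports ``DerivedError``.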
1246 We will do our best to give credit where credit is due - if we have used
1247 patches from you, or received helpful reports or advice, we will add your name
1248 to the AUTHORS file (unless you specifically request us not to). If you see we
1249 have forgotten to do this, please draw it to our attention so that we can
1250 address the omission.
1252 Licensing of patches
1253 ====================
1255 If you want a patch to be considered for inclusion in the Xapian sources, you
1256 must own the copyright on this patch. Employers often claim copyright on code
1257 written by their employees (even if the code is written in their spare time),
1258 so please check with your employer if this applies. Be aware that even if you
are a student your university may try to claim some rights on code which you
write.
1262 Patches which are submitted to Xapian will only be included if the copyright
1263 holder(s) dual-license them under each of the following licences:
1265 - GPL version 2 and all later versions (see the file "COPYING" for details).
1268 Copyright (c) <year> <copyright holders>
1270 Permission is hereby granted, free of charge, to any person obtaining a copy
1271 of this software and associated documentation files (the "Software"), to
1272 deal in the Software without restriction, including without limitation the
1273 rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
1274 sell copies of the Software, and to permit persons to whom the Software is
1275 furnished to do so, subject to the following conditions:
1277 The above copyright notice and this permission notice shall be included in
1278 all copies or substantial portions of the Software.
1280 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
1281 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
1282 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
1283 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
1284 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
IN THE SOFTWARE.
1288 The current distribution of Xapian contains many files which are only licensed
1289 under the GPL, but we are working towards being able to distribute Xapian under
1290 a more permissive license, and are not willing to accept patches which we will
1291 have to rewrite before this can happen.
1293 Developers with SVN access:
1294 ===========================
1296 People who are more seriously involved with the project are likely to
1297 have write access to the SVN repository. This section gives the conventions
1298 for those developers, but most of these also apply if you're generating a
1299 patch you'd like us to include.
1301 1) Make sure that the documentation is updated
1302 ----------------------------------------------
1304 * API classes, methods, functions, and types must be documented by
1305 documentation comments alongside the declaration in ``include/xapian/*.h``.
1306 These are collated by doxygen - see doxygen's documentation for details
1307 of the supported syntax. We've decided to prefer to use @ rather than \
1308 to introduce doxygen commands (the choice is essentially arbitrary, though
1309 \ introduces C/C++ escape sequences so @ is likely to make for easier to
1310 read mark up for C/C++ coders). A *lot* of existing comments use \
1311 currently but please use @ for new comments.
1313 * The documentation comments don't give users a good overview, so we also
1314 need documentation which gives a good overview of how to achieve particular
tasks. In particular, major new functionality should have its own "topic"
1316 document, or extend an existing topic document if more appropriate.
1318 * Internal classes, etc should also be documented by documentation comments
1319 where they are declared.
1321 2) Make sure the tests are right
1322 --------------------------------
1324 * If you're adding a feature, also add feature tests for it. These both
1325 ensure that the feature isn't broken to start with and detect if later
1326 changes stop it working as intended.
1327 * If you've fixed a bug, make sure there's a regression test which
1328 fails on the existing code and succeeds after your changes.
1329 * Make sure all existing tests continue to pass.
1331 If you don't know how to write tests using the Xapian test rig, then
1332 ask. It's reasonably simple once you've done it once. There is a brief
1333 introduction to the Xapian test system in ``docs/tests.html``.
1335 3) Make sure the attributions are right
1336 ---------------------------------------
1338 * If necessary, modify the copyright statement at the top of any
1339 files you've altered. If there is no copyright statement, you may
1340 add one (there are a couple of Makefile.am's and similar that don't
1341 have copyright statements; anything that small doesn't really need
1342 one anyway, so it's a judgement call). If you've added files which
1343 you've written from scratch, they should include the GPL boilerplate
1344 with your name only.
1345 * If you're not in there, add yourself to the AUTHORS file.
1352 + If there's a trac ticket or other reference for the bug, mention it in the
1353 commit message - it's a great help to future developers trying to work out
1354 why a change was made.
1356 5) Consider backporting
1357 -----------------------
1359 * If there's an active release branch, check if the bug is present in that
1360 branch, and if the fix is appropriate to backport - if the fix breaks ABI
1361 compatibility or is very invasive, you need to fix it in a different way
1362 for the release branch, or decide not to backport the fix.
1367 * If there's a related trac ticket, update it (if the issue is completely
1368 addressed by the changes you've made, then close it).
1370 * Update the release notes for the most recent release with a copy of the
1371 patch. If the commit from git applies cleanly, you can just link to
1372 it. If it fails to apply, please attach an adjusted patch which does.
1373 If there are conflicts in test cases which aren't easy to resolve, it is
1374 acceptable to just drop those changes from the patch if we can still be
1375 confident that the issue is actually fixed by the patch.
1380 We use reference counted pointers for most API classes. These are implemented
1381 using Xapian::Internal::RefCntPtr, the implementation of which is exposed for
efficiency, and because it's unlikely we'll need to change it frequently, if at
all.
1385 For the reference counted classes, the API class (e.g. Xapian::Enquire) is
1386 really just a wrapper around a reference counted pointer. This points to an
1387 internal class (e.g. Xapian::Enquire::Internal). The reference counted
1388 pointer is a member variable of the API class called internal. Conceptually
1389 this member is private, though it typically isn't declared as private (this
1390 is to avoid littering the external headers with friend declarations for
1393 There are a few exceptions to the reference counted structure, such as
1394 MSetIterator and ESetIterator which have an exposed implementation. Tests show
1395 this makes a substantial difference to speed (it's ~20% faster) in typical
1396 cases of iterator use.
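The wrapper-around-a-reference-counted-pointer structure described above can be sketched roughly as follows. All the names here (``RefCounted``, ``RefPtr``, ``Widget``) are hypothetical stand-ins rather than Xapian's actual classes, and the real Xapian::Internal::RefCntPtr may differ in its details:

```cpp
#include <cassert>

// Hypothetical base class holding an intrusive reference count.
class RefCounted {
  public:
    unsigned ref_count;
    RefCounted() : ref_count(0) {}
    virtual ~RefCounted() {}
};

// Hypothetical reference counted pointer.
template<class T>
class RefPtr {
    T* ptr;

    void release() {
        if (ptr && --ptr->ref_count == 0) delete ptr;
        ptr = 0;
    }

  public:
    explicit RefPtr(T* p) : ptr(p) { if (ptr) ++ptr->ref_count; }
    RefPtr(const RefPtr& o) : ptr(o.ptr) { if (ptr) ++ptr->ref_count; }
    RefPtr& operator=(const RefPtr& o) {
        // Take a counted copy of the incoming pointer before releasing our
        // own, so that self-assignment is safe.
        T* incoming = o.ptr;
        if (incoming) ++incoming->ref_count;
        release();
        ptr = incoming;
        return *this;
    }
    ~RefPtr() { release(); }
    T* operator->() const { return ptr; }
};

// API class: really just a wrapper around a reference counted pointer.
class Widget {
  public:
    class Internal;
    // Conceptually private, but not declared as such.
    RefPtr<Internal> internal;

    Widget();
    void set_value(int v);
    int get_value() const;
};

// Internal class: holds the actual state.
class Widget::Internal : public RefCounted {
  public:
    int value;
    Internal() : value(0) {}
};

Widget::Widget() : internal(new Internal) {}
void Widget::set_value(int v) { internal->value = v; }
int Widget::get_value() const { return internal->value; }
```

Copying a ``Widget`` in this sketch is cheap, since only the pointer is copied and the count incremented, and both copies then refer to the same internal state.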
1398 The postfix operator++ for iterators should be implemented inline in terms
1399 of the prefix form as described by Joe Buck on the gcc mailing list
1400 - excerpt from http://article.gmane.org/gmane.comp.gcc.devel:50201 ::
1402 class some_iterator {
1405 some_iterator& operator++();
1407 some_iterator operator++(int) {
1408 some_iterator tmp = *this;
1414 The compiler is allowed to assume that the copy constructor only does
1415 a copy, and to optimize away unneeded copy operations. The result
1416 in this case should be that, for some_iterator above, using the
1417 postfix operator without using the result should give code equivalent
1418 to using the prefix operator.
1420 Now, for [GCC 3.4], you'll find that the dead uses of tmp are only
1421 completely optimized away if tmp has only one data member that can fit in a
1422 register. [GCC 4.0 will do] better, and you should find that this style
comes very close to eliminating any penalty from "incorrect" use of the
postfix form.
1426 Xapian's PostingIterator, TermIterator, PositionIterator, and ValueIterator all
1427 have only one data member which fits in a register.
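A complete, minimal example of this pattern might look like the following sketch. ``CountingIterator`` is a hypothetical class used purely for illustration; like the Xapian iterators mentioned above, it has a single data member which fits in a register:

```cpp
#include <cassert>

// Hypothetical iterator, for illustration only.
class CountingIterator {
    int pos;

  public:
    explicit CountingIterator(int start) : pos(start) {}

    int operator*() const { return pos; }

    // The prefix form does the real work.
    CountingIterator& operator++() {
        ++pos;
        return *this;
    }

    // The postfix form is inline and implemented in terms of the prefix form.
    CountingIterator operator++(int) {
        CountingIterator tmp = *this;
        ++*this;
        return tmp;
    }
};
```

Keeping the postfix form inline gives the compiler the chance to discard the unused copy when the caller ignores the return value.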
1429 Handy tips for aiding development
1430 =================================
If you find you are repeatedly changing the API headers (in include/)
1433 during development, then you may become annoyed that the docs/ subdirectory
1434 will rebuild the doxygen documentation every time you run "make" since this
1435 takes a while. You can disable this temporarily (if you're using GNU make),
1436 by creating a file "docs/GNUmakefile" containing these two lines::
1439 @echo "Skipping 'make $@' in docs"
1441 Note that the whitespace at the start of the second line needs to be a
1442 single "tab" character!
1444 Don't forget to remove (or rename) this and check the documentation builds
1445 before committing or generating a patch though!
1447 How to make a release
1448 =====================
1450 This is a (hopefully complete) list of the jobs which need doing:
1452 * Email Fabrice Colin and Tim Brody so they can check RPM packaging.
* Check if ``config/config.guess`` and ``config/config.sub`` need updating to
1455 more recent versions from http://git.savannah.gnu.org/gitweb/?p=config.git
1457 * Check the revision currently specified in the svn:externals property of
1458 xapian-applications/omega. Unless there's a good reason, we should release
1459 xapian-core and omega with synchronised versions of the shared files.
1461 * Make sure that any new/changed/removed API methods in xapian-core have been
1462 wrapped/updated/removed in xapian-bindings.
1464 * Update the lists of deprecated/removed API methods in docs/deprecation.rst
1466 * Update the NEWS files using information from the ChangeLog files
1468 * Update the version in configure.ac for each module (xapian-core, omega, and
1469 xapian-bindings), and the library version info in xapian-core's configure.ac
1471 * Make sure the submitters of fixed bugs are mentioned in the "thanks" list in
1472 xapian-core/AUTHORS. Check the list for the appropriate milestone::
1474 https://trac.xapian.org/query?col=id&col=summary&col=reporter&milestone=1.0.14
1476 * On atreus, svn tag the source trees for the new revision - use the
svn-tag-release script, running it with the new version number, for example::
1479 xapian-maintainer-tools/svn-tag-release 1.0.14
1481 This script also generates tarballs for the new release and copies them
1482 across to the website.
1484 * Add the new version to the list of versions in trac:
1485 https://trac.xapian.org/admin/ticket/versions
1487 * Add a new milestone for the version after this one:
1488 https://trac.xapian.org/admin/ticket/milestones
1490 * Mark the current milestone as completed. In order to do so, any unfixed bugs
1491 with this milestone will need to be moved to another milestone (most likely
1492 the milestone you just added).
1496 Create a new page http://wiki.xapian.org/ReleaseNotes/X.Y.Z and link it into
1497 http://wiki.xapian.org/ReleaseNotes in place of the old current release link,
1498 which should be moved to the archived section.
1500 Also update the roadmap at http://wiki.xapian.org/RoadMap by recording the
1501 date of this release and adding an entry for the next release with an
1502 estimated release date.

* Update the website: ``generate`` in the CVS module www.xapian.org contains
  the latest version and the date it was released, so update both.

* Run ``/home/olly/tmp/xapian-website-update/update_website.sh``

* Announce the new version on xapian-discuss

* Have a nice cup of tea!
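
The per-version URLs used in the checklist above follow a simple pattern, so
they can be generated rather than edited by hand.  The following is only a
minimal sketch: the function names ``milestone_query_url`` and
``release_notes_url`` are hypothetical (they are not part of
xapian-maintainer-tools), and the URL patterns are copied from the steps
above:

.. code-block:: sh

    # Hypothetical helpers for illustration; not part of
    # xapian-maintainer-tools.

    # Print the Trac query listing id, summary and reporter for every
    # ticket in the milestone for version $1 (e.g. 1.0.14).
    milestone_query_url() {
        printf 'https://trac.xapian.org/query?col=id&col=summary&col=reporter&milestone=%s\n' "$1"
    }

    # Print the wiki page to create for the release notes of version $1.
    release_notes_url() {
        printf 'http://wiki.xapian.org/ReleaseNotes/%s\n' "$1"
    }

For example, ``milestone_query_url 1.0.14`` prints the query URL shown in the
"thanks" list step.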

How to make Debian packages for a new release
=============================================

Debian control files are stored in separate git repositories:

* http://anonscm.debian.org/cgit/collab-maint/xapian-bindings.git
* http://anonscm.debian.org/cgit/collab-maint/xapian-core.git
* http://anonscm.debian.org/cgit/collab-maint/xapian-omega.git

To package a new upstream release, these should be updated as follows:

* If there are any patch files in "debian/patches", check if these have been
  incorporated into the new release, and if so remove them and update
  "debian/patches/series".

* Update the debian/changelog file, being sure to keep it in the
  standard Debian format (the easiest way is to use the dch utility
  like so: ``dch -v 1.2.19-1``).  The new version number should be the
  version number of the release followed by "-1" (i.e., a Debian
  revision of 1).  The changelog message should indicate that
  there is a new upstream release, and should mention any significant
  changes in the new release.

* Tag using: ``git tag -s -m 1.2.19-1 1.2.19-1``

* FIXME: Document how to make source packages, or update
  ``make-source-packages``.

* FIXME: Document how to build binary packages, or update ``build-packages``.

* Test the packages.

* Run ``debsign build/*_amd64.changes`` to GPG sign the packages.

* Run ``dput build/*_amd64.changes`` to upload them to Debian.

* For the Ubuntu backports::

    ./backport-source-packages xapian-core 1.2.19-1 ubuntu
    ./backport-source-packages xapian-omega 1.2.19-1 ubuntu
    ./backport-source-packages xapian-bindings 1.2.19-1 ubuntu

  And once libsearch-xapian-perl is uploaded to Debian unstable::

    ./backport-source-packages libsearch-xapian-perl 1.2.19.0-1 ubuntu

  Sign the backported source packages::

    debsign build/*99*_source.changes

  Upload xapian-core first::

    dput xapian-backports build/xapian-core*99*_source.changes

  Wait for that to have a chance to build, and then::

    dput xapian-backports build/xapian-[bo]*99*_source.changes
    dput xapian-backports build/libsearch-xapian-perl*_source.changes
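
The version strings used above are related: the Debian version is the
upstream version followed by ``-1``, and that same string is passed to each
``backport-source-packages`` call.  The following dry-run sketch shows the
relationship.  It assumes hypothetical helper names (``debian_version`` and
``print_backport_commands``), and it only prints the commands for the three
main packages without running them.  libsearch-xapian-perl is left out
because its version (1.2.19.0-1 above) is not derived here:

.. code-block:: sh

    # Hypothetical helpers for illustration; they only print, never execute.

    # Print the Debian version for upstream release $1: the upstream
    # version followed by "-1".  Reject empty input and input that
    # already carries a Debian revision.
    debian_version() {
        case "$1" in
            ''|*-*) echo "usage: debian_version X.Y.Z" >&2; return 1 ;;
        esac
        printf '%s-1\n' "$1"
    }

    # Print the backport commands for the three main packages.
    print_backport_commands() {
        version=$(debian_version "$1") || return 1
        for pkg in xapian-core xapian-omega xapian-bindings; do
            printf './backport-source-packages %s %s ubuntu\n' "$pkg" "$version"
        done
    }

Running ``print_backport_commands 1.2.19`` prints the three
``backport-source-packages`` lines listed above, which can be reviewed before
being pasted into a shell.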