1 Instructions for hacking on Xapian
2 ==================================
4 .. contents:: Table of contents
6 This file aims to help developers get started with working on
7 Xapian. The documentation contains a section covering various internal
8 aspects of the library - this can also be found on the Xapian website
11 Extra options to give to configure
12 ==================================
14 Note: Non-developer configure options are described in INSTALL
16 You will probably want to use some of these if you're going to be developing
20 This enables compiling of assertion code which will throw
21 Xapian::AssertionError if the code detects a violation of
22 preconditions or postconditions, or if another consistency check fails.
24 --enable-assertions=partial
25 This option enables a subset of the assertions enabled by
26 "--enable-assertions", omitting the most expensive ones. The intention is
27 that it should be suitable for use in a real-world system for tracking
28 down problems without imposing too much of an overhead (but note that
29 we haven't yet performed timings to measure the overhead...)
32 This enables compiling code into the library which generates verbose
33 debugging messages. See "Debugging Messages", below.
36 In 1.2.0 and earlier, this used to use the debug logging macros to
37 report to stderr how long each method takes to execute. This feature
38 was removed in 1.2.1 - you are likely to get better results using
39 dedicated profiling tools - for more information see:
40 https://trac.xapian.org/wiki/ProfilingXapian
42 --enable-maintainer-mode
43 This tells configure to enable make dependencies for regenerating build
44 system files (such as configure, Makefile.in, and Makefile) and other
45 generated files (such as the stemmers and query parser) when required.
46 These are disabled by default as some make programs try to rebuild them
47 when it's not appropriate (e.g. BSD make doesn't handle VPATH except
48 for implicit rules). For this reason, we recommend GNU make if you
49 enable maintainer mode. You'll also need a non-cross-compiling C
50 compiler for compiling the Lemon parser generator and the Snowball
51 stemming algorithm compiler. The configure script will attempt to
52 locate one, but you can override this autodetection by passing
53 CC_FOR_BUILD on the command line like so::
55 ./configure CC_FOR_BUILD=/opt/bin/gcc
57 --enable-documentation
58 This tells configure to enable make dependencies for regenerating
59 documentation files. By default it uses the same setting as
60 --enable-maintainer-mode.
65 If you configure with --enable-log, lots of places in the code generate
66 debugging messages to tell us what they're up to - this information can be
67 very useful for debugging both the Xapian library and code which uses it. But
68 the quantity of information generated is potentially vast so there's a
69 mechanism to allow you to select where to store the log and which types of
70 message you're interested in by setting environment variables. You can:
72 * set XAPIAN_DEBUG_LOG to be the path to a file that you would like debugging
73 output to be appended to, or to the special value ``-`` to indicate that you
74 would like debugging output to be sent to stderr. Unless XAPIAN_DEBUG_LOG
75 is set, no debug logging will be performed. Occurrences of ``%p`` in
76 XAPIAN_DEBUG_LOG will be replaced with the current process-id.
78 If you're debugging a crash and want to avoid losing the most recent log
79 messages then include ``%!`` in XAPIAN_DEBUG_LOG (which is replaced with
80 the empty string). This will cause the log file to be opened with
81 ``O_DSYNC`` or ``O_SYNC`` or similar if running on a platform that supports
82 a suitable mechanism. In 1.4.10 and earlier this was on by default (and
83 ``%!`` has no special meaning) but it can incur a significant performance
84 overhead and in most cases isn't necessary.
86 * set XAPIAN_DEBUG_FLAGS to a string of capital letters indicating the types
87 of debugging message you would like to display (the default is to log calls
88 to API functions and methods). These letters are shown in the first column
89 of the log output, and are also listed in ``common/debuglog.h``. If the
90 first character is ``-``, then the letters indicate those categories of
91 message *not* to be shown instead. As a consequence of this, setting
92 ``XAPIAN_DEBUG_FLAGS=-`` will give you all debugging messages.
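The selection rules for XAPIAN_DEBUG_FLAGS described above can be sketched in
C++ as follows. This is only an illustration of the documented behaviour, not
the library's implementation: ``category_enabled()`` is a hypothetical helper,
the use of ``A`` as the letter for API calls is an assumption (see
``common/debuglog.h`` for the real list), and treating an empty value as "show
nothing" is a guess.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Xapian): decide whether a message in
// category `cat` would be shown for a given XAPIAN_DEBUG_FLAGS value.
// Pass nullptr for `flags` when the variable is unset.
inline bool category_enabled(const char* flags, char cat) {
    // Categories are documented as capital letters.
    assert(cat >= 'A' && cat <= 'Z');
    if (!flags) {
        // Assumption: 'A' is the letter for API calls, the documented default.
        return cat == 'A';
    }
    std::string s(flags);
    if (!s.empty() && s[0] == '-') {
        // Leading '-': the remaining letters are the categories to hide.
        return s.find(cat, 1) == std::string::npos;
    }
    // Otherwise the letters are the categories to show.
    return s.find(cat) != std::string::npos;
}
```

With this sketch, ``category_enabled("-", 'D')`` is true because nothing is
excluded, while ``category_enabled("-AD", 'D')`` is false.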
94 These environment variables only have any effect if you ran configure with the
99 <message type> <pid> [<this>] <message>
103 A 16747 [0x57ad1e0] void Xapian::Query::Internal::validate_query()
105 Each nested call adds another space before the ``[`` so you can easily see
106 which function call and return messages correspond.
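As a rough illustration of the layout described above, the following C++
sketch recovers the nesting depth of a log line by counting the spaces before
the ``[``. ``nesting_depth()`` is a hypothetical helper, and it assumes that a
top-level call has exactly one space before the ``[``.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: nesting depth of one debug log line, or -1 if there
// is no '[' preceded by at least one space to measure from.
inline int nesting_depth(const std::string& line) {
    std::string::size_type bracket = line.find('[');
    if (bracket == std::string::npos || bracket == 0) return -1;
    // Count the run of spaces immediately before '['.
    std::string::size_type i = bracket;
    int spaces = 0;
    while (i > 0 && line[i - 1] == ' ') {
        --i;
        ++spaces;
    }
    // Assumption: a top-level call has exactly one space before '['.
    return spaces - 1;
}
```

A tool that pairs up call and return messages could group lines by the value
this returns.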
108 Debugging memory allocations
109 ============================
111 The testsuite can make use of valgrind 3.3.0 or newer to check for memory
112 leaks, reads from uninitialised memory, and some other bugs during tests.
114 Valgrind doesn't support every platform, but Xapian contains very little
115 platform specific code (and most of what there is is Microsoft Windows
116 specific) so even just testing with valgrind on one platform gives good
119 If you have a new enough version of valgrind installed, it's automatically
120 detected by configure and used when running the testsuite. The testsuite runs
121 more slowly under valgrind, so if you wish to disable this auto-detection you
122 can run configure with::
124 ./configure VALGRIND=
126 Or you can disable use of valgrind during a particular run of "make check"
131 Or disable it while running a test directly (under sh or bash)::
133 VALGRIND= ./runtest ./apitest
135 Current versions of valgrind result in false positives on current versions
136 of macOS, so on this platform configure only enables use of valgrind if
137 it's specified explicitly, for example if valgrind is on your ``PATH``
140 ./configure VALGRIND=valgrind
142 Running test programs
143 =====================
145 To run all tests, use ``make check``. You can also run just the subset of
146 tests which exercise the inmemory, remote progserver, remote TCP,
147 multi-database, glass, or chert backends using ``make check-inmemory``,
148 ``make check-remoteprog``, ``make check-remotetcp``, ``make check-multi``,
149 ``make check-glass``, or ``make check-chert``
152 Also, ``make check-remote`` will run the tests on both variants of the remote
153 backend, and ``make check-none`` will run those tests which don't use any
154 backend. These are handy shortcuts when doing development work on a particular
157 The runtest script (in the tests subdirectory) takes care of the details of
158 running the test programs (including setting up the environment so they work
159 when srcdir != builddir and handling libtool dynamically linked binaries). To
160 run a test program by hand (rather than via make) just use:
164 You can specify options and arguments. Individual test programs optionally
165 take one or more test names as arguments, and you can also pass ``-v`` to get
166 more verbose output from failing tests, e.g.::
168 ./runtest ./apitest -v deldoc1
170 If the number of the test is omitted, all tests with that basename are run,
171 so to run deldoc1, deldoc2, etc::
173 ./runtest ./apitest deldoc
175 You can also use runtest to run a test program under gdb (or most other tools)::
177 ./runtest gdb ./apitest -v deldoc1
178 ./runtest valgrind ./apitest -v deldoc1
180 Some test programs take special arguments - for example, you can restrict
181 apitest to the glass backend using ``-bglass``.
183 There are a few environment variables which the testsuite harness checks for
184 which you might find useful:
186 XAPIAN_TESTSUITE_SIG_DFL:
187 By default, the testsuite harness catches signals and handles them
188 gracefully - the current test is failed, and the testsuite moves on to the
189 next test. If you want to suppress this (some debugging tools may work
190 better if the signal is not caught) set the environment variable
191 XAPIAN_TESTSUITE_SIG_DFL to any value to prevent the testsuite harness
192 from installing its own signal handling.
194 XAPIAN_TESTSUITE_OUTPUT:
195 By default, the testsuite harness uses ANSI escape sequences to give
196 colour output if stdout is a tty. You can disable this feature by setting
197 XAPIAN_TESTSUITE_OUTPUT=plain (alternatively, piping the output (e.g.
198 through ``cat`` or ``more``) will have the same effect). Auto-detection
199 can be explicitly specified with XAPIAN_TESTSUITE_OUTPUT=auto (or empty).
200 Any other value forces the use of colour. Colour output is always disabled
201 on Microsoft Windows, so XAPIAN_TESTSUITE_OUTPUT has no effect there.
203 XAPIAN_TESTSUITE_LD_PRELOAD:
204 The runtest script will add this to LD_PRELOAD if it is set, allowing you
205 to easily load LD_PRELOAD libraries when running the testsuite. The
206 original intended use was to allow use of libeatmydata
207 (https://www.flamingspork.com/projects/libeatmydata/) which makes fsync
208 and related calls no-ops, but configure now checks for the eatmydata
209 wrapper script and this is used automatically. However, there may be
210 other LD_PRELOAD libraries which are useful, so we've left the machinery
213 Speeding up the testsuite with eatmydata
214 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
216 The testsuite does a lot of small database operations, and the calls to fsync,
217 fdatasync, etc which Xapian makes by default can slow down testsuite runs
218 substantially. There's a handy LD_PRELOAD library called eatmydata
219 (https://www.flamingspork.com/projects/libeatmydata/), which can help here, by
220 turning fsync and related calls into no-ops.
222 You need a version of eatmydata with the eatmydata wrapper script (version 37
223 or newer), and then configure should auto-detect it and it'll get used when
224 running the testsuite (via runtest). If you wish to disable this
225 auto-detection for some reason, you can run configure with::
227 ./configure EATMYDATA=
229 Or you can disable use of eatmydata during a particular run of "make check"
232 make check EATMYDATA=
234 Or disable it while running a test directly (under sh or bash)::
236 EATMYDATA= ./runtest ./apitest
238 Using various debugging, profiling, and leak-finding tools
239 ==========================================================
241 GCC's libstdc++ supports a debug mode, which checks for various misuses of
242 the STL - to enable this, define _GLIBCXX_DEBUG when building Xapian::
244 ./configure CPPFLAGS=-D_GLIBCXX_DEBUG
246 For documentation of this option, see:
247 https://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode.html
249 Note: all C++ code must be compiled with this defined or you'll get problems.
250 Xapian's API headers include a check that the same setting is used when
251 building code using Xapian as was used to build Xapian.
253 To use valgrind (http://www.valgrind.org/), no special build options are
254 required, but make sure you compile with debugging information (on by default
255 for GCC) and the valgrind documentation recommends disabling optimisation (with
256 optimisation, line numbers in error messages can be confusing due to code
259 ./configure CXXFLAGS='-O0 -g'
261 To use gdb (https://www.gnu.org/software/gdb/), no special build options are
262 required, but make sure you compile with debugging information (on by default
263 for GCC). You'll probably find debugging easier if you compile without
264 optimisation (with optimisation, line numbers in error messages can be
265 confusing due to code inlining, etc, and the values of some variables can't be
266 printed because they've been eliminated from the code completely):
268 ./configure CXXFLAGS='-O0 -g'
270 To enable profiling for gprof::
272 ./configure CXXFLAGS=-pg LDFLAGS=-pg
274 To use Purify (a proprietary tool)::
276 ./configure CXXLD='purify c++' --disable-shared
278 To use Insure (another proprietary tool)::
280 ./configure CXX=insure
282 To use lcov (at least version 1.10) to generate a test coverage report (see
283 `lcov.xapian.org <http://lcov.xapian.org/>`_ for reports) there are three make
284 targets (all in the `xapian-core` directory):
286 * `make coverage-reconfigure`: reruns configure in the source tree. See
287 Makefile.am for details of the configure options used and why they
288 are needed. If you're using ccache, make sure it's at least version
289 3.0, and ideally at least 3.2.2.
291 * `make coverage-reconfigure-maintainer-mode`: does the same thing, except
292 the tree is configured in "maintainer mode", which is what you want if
293 generating coverage reports while working on the code.
295 * `make coverage-check`: runs `make check` and generates an HTML report in a
296 directory called `lcov`.
298 + You can specify extra arguments to pass to the ``genhtml`` tool using
299 `GENHTML_ARGS`, so for example if you plan to serve the generated HTML
300 coverage report from a webserver, you might use:
301 `make coverage-check GENHTML_ARGS=--html-gzip`
303 You ideally want lcov 1.11 or later, since 1.11 includes patches to reduce
304 memory usage significantly - lcov 1.10 would run out of memory in a 1GB VM.
306 If you have runes for using other tools, please add them above, or send them
312 If you want to try unreleased Xapian code, you can fetch it from our git
313 repository. For convenience, we also provide bootstrapped tarballs (much like
314 the source code download for any release version) which get built every 20
315 minutes if there have been any changes checked in. These tarballs need to
316 pass "make distcheck" to be automatically uploaded, so using them will help
317 to ensure that you don't pick a "bad" version. The snapshots are available
318 from the "Bleeding Edge" page of the Xapian website.
323 When building from a git checkout, we *strongly* recommend that you use
324 the ``bootstrap`` script in the top level directory to set up the tree ready
325 for building. This script will check which directories you have checked out,
326 so you can bootstrap a partial tree. You can also ``touch .nobootstrap`` in
327 a subdirectory to tell bootstrap to ignore it.
329 You will need the following tools installed to build from git:
331 * GNU m4 >= 1.4.6 (for autoconf)
332 * perl >= 5.6 (for automake; also for various maintainer scripts)
333 * python >= 2.3 (for generating the Python bindings)
334 * GNU make (or another make which supports VPATH for explicit rules)
335 * GNU bison (for building SWIG, used for generating the bindings)
336 * Tcl (to generate unicode/unicode-data.cc)
338 For a recent version of Debian or Ubuntu, this command should ensure you have
339 all the necessary tools and libraries::
341 apt-get install build-essential m4 perl python zlib1g-dev uuid-dev wget bison tcl
343 If you want to build Omega, you'll also need::
345 apt-get install libpcre2-dev libmagic-dev
347 On Fedora, the uuid library can be installed by doing::
349 yum install libuuid-devel
351 On macOS, if you're using macports you'll want the following:
353 * file (magic.h in configure)
355 If you're using homebrew you'll want the following::
357 brew install libmagic pcre2
359 If you're doing much development work, you'll probably also want the following
362 * valgrind for better testsuite error finding
363 * ccache for faster rebuilds
364 * eatmydata for faster testsuite runs
366 The repository does not contain any automatically generated files
367 (such as configure, Makefile.in, Snowball-generated stemmers, Lemon-generated
368 parsers, SWIG-generated code, etc) because experience shows it's best to keep
369 these out of version control. To avoid requiring you to install the correct
370 versions of the tools required, we either include the source to these tools in
371 the repo directly (in the case of Snowball and Lemon), or the bootstrap script
372 will download them as tarballs (autoconf, automake, libtool) or
373 from git (SWIG), build them, and install them within the source tree.
375 To download source tarballs, bootstrap will use wget, curl or lwp-request if
376 installed. If not, it will give an error telling you the URL to download from
377 by hand and where to copy the file to.
379 Bootstrap will then run autoreconf on each of the checked-out subdirectories,
380 and generate a top-level configure script. This configure script allows you to
381 configure xapian-core and any other modules you've checked out with a single
382 simple command, such that the other modules link against the uninstalled
383 xapian-core (which is very handy for development work and a bit fiddly to set
384 up by hand). It automatically passes --enable-maintainer-mode to the
385 subprojects so that the autotools will be rerun if configure.ac, Makefile.am,
388 The bootstrap script doesn't care what the current directory is. The top-level
389 configure script generated by it supports building in a separate directory to
390 the sources: simply create the directory you want to build in, and then run the
391 configure script from inside that directory. For example, to build in a
392 directory called "build" (starting in the top level source directory)::
399 When running bootstrap, if you need to add any extra macro directories to the
400 path searched by aclocal (which is part of automake), you can do this by
401 specifying these in the ACLOCAL_FLAGS environment variable, e.g.::
403 ACLOCAL_FLAGS=-I/extra/macro/directory ./bootstrap
405 If you wish to prevent bootstrap from downloading and building the autotools
406 pass the --without-autotools option. You can force it to delete the downloaded
407 and installed versions by passing --clean.
409 If you are tracking development in git, there will sometimes be changes
410 to the build system sources which require regeneration of the generated
411 makefiles and associated machinery. We aim to make the build system
412 automatically regenerate the necessary files, but in the event that a build
413 fails after an update, it may be worth re-running the bootstrap script to
414 regenerate the build system from scratch, before looking for the cause of the
417 Tools required to build documentation
418 -------------------------------------
420 If you want to be able to build distribution tarballs (with "make dist") then
421 you'll also need some further tools. If you don't want to have to install all
422 these tools, then pass --disable-documentation to configure to disable these
423 rules (the default state of this follows the setting of
424 --enable-maintainer-mode, so in a non-maintainer-mode tree, you can pass
425 --enable-documentation to enable these rules). Without the documentation,
426 "make dist" will fail (to prevent accidentally distributing tarballs without
427 documentation), but you can configure and build.
429 The documentation tools are:
431 * doxygen (v1.8.8 is used for 1.3.x snapshots and releases; 1.7.6.1 fails to
432 process trunk after PL2Weight was added).
433 * dot (part of Graphviz. Doxygen's DOT_MULTI_TARGETS option apparently needs
436 * rst2html or rst2html.py (in python-docutils on Debian/Ubuntu)
437 * pngcrush (optional - used to reduce the size of PNG files in the HTML
439 * sphinx-doc (in python-sphinx and python3-sphinx on Debian/Ubuntu, or as
440 sphinx via pip install)
442 For a recent version of Debian or Ubuntu, this command should install all the
443 required documentation tools::
445 apt-get install doxygen graphviz help2man python-docutils pngcrush python-sphinx python3-sphinx
447 Documentation builds on macOS
448 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
450 On macOS, if you're using homebrew, you'll want the following::
452 brew install doxygen help2man graphviz pngcrush
454 (Ensure you're up to date with brew, as earlier packaging of graphviz
455 didn't properly install dot.)
457 You also need sphinx and docutils, which are python packages; you can
458 install them via pip::
460 pip install sphinx docutils
462 You may find it easier to use homebrew to install python first, so
463 these packages are separate from the system python::
467 If you install both python (v2) and python3 (v3) via homebrew, you
468 will be able to build bindings for both; you'll then need to install
476 As of 1.3.2, we no longer build PDF versions of the API docs by default, but
477 you can build them yourself with::
479 make -C docs apidoc.pdf
481 Additional tools are needed for these:
483 * gs (part of Ghostscript)
484 * pdflatex (in texlive-latex-base on Debian/Ubuntu)
485 * epstopdf (in texlive-extra-utils on Debian/Ubuntu)
486 * makeindex (in texlive-binaries on Debian/Ubuntu, or texlive-base-bin for older releases)
488 Note that pdflatex, epstopdf, gs, and makeindex must all currently be on your
489 path (as specified by the environment variable PATH), since doxygen will look
492 For a recent version of Debian or Ubuntu, this command should install these
495 apt-get install ghostscript texlive-latex-base texlive-extra-utils texlive-binaries texlive-fonts-extra texlive-fonts-recommended texlive-latex-extra texlive-latex-recommended
497 On macOS, if you're using macports you'll want the following:
499 * texlive (pdflatex during build)
500 * texlive-basic (for makeindex in configure)
501 * texlive-latex-extra (latex style)
503 Alternatively, you can install MacTeX from https://www.tug.org/mactex/ instead
504 of texlive, texlive-basic and texlive-latex-extra.
506 The homebrew texlive package only supports 32 bit systems, so even if you're
507 using homebrew, you'll probably want to install MacTeX from
508 https://www.tug.org/mactex/ instead.
513 * autoconf 2.69 is used to generate snapshots and releases.
515 autoconf 2.64 is a hard minimum requirement.
517 autoconf 2.60 is required for docdir support and AC_TYPE_SSIZE_T.
519 autoconf 2.62 generates faster configure scripts and warns about unrecognised
520 options passed to configure.
522 autoconf 2.63 fixes a regression in AC_C_BIGENDIAN introduced in 2.62
523 (Omega uses this macro).
525 autoconf 2.64 generates smaller configure scripts by using shell functions.
527 * automake 1.15.1 is used to generate snapshots and releases.
529 automake 1.12.2 is a hard minimum requirement. This version fixes a
530 security issue (CVE-2012-3386) in the generated `make distcheck` rules.
532 automake 1.12 is needed to support using LOG_COMPILER to specify a testsuite
533 driver (used by xapian-bindings).
535 * libtool 2.4.6 is used to generate snapshots and releases.
537 libtool 2.2.8 is the current hard minimum requirement.
539 libtool 2.2 is required for us to be able to override link_all_deplibs_CXX
540 and sys_lib_dlsearch_path_spec in configure. It also fixes some
541 long-standing issues and is significantly faster.
543 Please tell us if you find that newer versions of any of these tools work or
546 There is a good GNU autotools tutorial at
547 <https://www.lrde.epita.fr/~adl/autotools.html>.
549 Building from git on Windows with MSVC
550 --------------------------------------
552 Building using MSVC is now supported by the autotools build system. You need
553 to install a set of Unix-like tools first - we recommend MSYS2:
554 https://www.msys2.org/
556 For details of how to specify MSVC to ``configure`` see the "INSTALL" document.
558 When building from git, by default you'll need some extra tools to generate
559 Unicode tables (Tcl) and build documentation (doxygen, help2man, sphinx-doc).
560 We don't currently have detailed advice on how to do this (if you can provide
561 some then please send a patch).
563 You can avoid needing Tcl by copying ``xapian-core/unicode/unicode-data.cc``
564 from another platform or a release which uses the same Unicode version. You
565 can avoid needing most of the documentation tools by running configure with
566 the ``--disable-documentation`` option.
571 * As of Xapian 1.3.3, a compiler with decent support for C++11 is required to
572 build Xapian. We currently aim to allow users to use a non-C++11 compiler
573 to build code which uses Xapian.
575 There are now several compilers with good C++11 support, but there are a
576 few shortfalls in commonly deployed versions of most of them. Often we can
577 work around this, and we should do so where the effort is low compared to the
578 gain (so a compiler version which is widely used is more worth supporting
579 than one which is hardly used by anyone).
581 However, we shouldn't have to jump through hoops to cater for compilers where
582 their authors aren't putting in the effort to keep up with the language
585 Please avoid the following C++11 features for the time being:
587 * ``std::to_string()`` - this is completely missing on current versions of
588 mingw and cygwin - in the library, you can ``#include "str.h"`` and then
589 use the ``str()`` function instead for most cases. This is also usually
590 faster than ``std::to_string()``.
592 * C++ features we currently assume:
594 * We assume <sstream> is available. GCC < 2.95.3 didn't have it but GCC
595 2.95.3 includes a backported version. We aren't aware of any other
596 compilers still in use which lack it.
598 * Non-".h" versions of standard ISO C++ headers (e.g. ``#include <list>``
599 rather than ``#include <list.h>``). We aren't aware of any compiler still
600 in use which lacks these, and GCC 4.3 no longer has the old versions. If
601 there are any, we could add a directory full of forwarding headers to work
604 * Standard header ``<limits>`` (for ``numeric_limits<>``) - for GCC, this was
607 * Standard header ``<streambuf>`` (GCC < 3.0 only has ``<streambuf.h>``).
609 * RTTI (dynamic_cast<>, typeid, etc): Needing to use RTTI features in the
610 library most likely indicates a design flaw, and you should avoid use
611 of these features. Where necessary, you can use a technique similar to
612 Database::as_networkdatabase() to replace dynamic_cast<>.
614 * Exceptions: In hindsight, throwing exceptions in the library seems to have
615 been a poor design decision. GCC on Solaris can't cope with exceptions in
616 shared libraries (though it appears this may have been fixed in more recent
617 versions), and we've also had test failures on other platforms which only
618 occur with shared libraries - possibly with a similar cause. Exceptions can
619 also be a pain to handle elegantly in the bindings. We intend to investigate
620 modifying the library to return error codes internally, and then offering the
621 user the choice of exception throwing or error code returning API methods
622 (with the exception being thrown by an inlined wrapper in the externally
623 visible header files). With this in mind, please don't complicate the
624 internal handling of exceptions...
626 * "using namespace std;" and "using std::XXX;" - it's OK to use these in
627 applications, library code, and internal library headers. But in externally
628 visible headers (such as anything included by "#include <xapian.h>") you MUST
629 use explicit "std::" qualifiers - it's not acceptable to pull anything from
630 namespace std into the namespace of an application which uses Xapian.
632 * Use C++ style casts (static_cast<>, reinterpret_cast<>, and const_cast<>)
633 or constructor-syntax (e.g. ``double(value)``) in preference to C style
634 casts. The syntax of the C++ casts is ugly, but they do make the
635 intent much clearer which is definitely a good thing, and they avoid issues
636 such as casting away const when you only meant to cast the type of a pointer.
638 * std::pair<> with an STL class as one (or both) of the members can produce
639 very long symbols (over 4KB!) after name mangling - long enough to overflow
640 the size limits of some vendor compilers or toolchains (so this can affect
641 GCC if it is using the system ld or as). Even where the compiler works, the
642 symbol bloat in an unstripped build is probably best avoided, so it's
643 preferable to use a simple two member struct instead. The code is probably
644 more readable anyway, and easier to extend if more members are needed later.
646 * We try to avoid putting the full definition of virtual methods in header
647 files. This is because current compilers can't (as far as we know) inline
648 virtual methods, so putting the definition in the header file simply slows
649 down compilation (and, because method definitions often require further
650 header files to be included, this can result in many more files needing
651 recompilation after a change to a header file than is really necessary).
652 Just put the declaration in the header file, and put the definition in a .cc
653 file with the same basename.
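Several of the conventions above can be seen together in the following C++
sketch. It is only an illustration: every class and function name in it is
hypothetical, the ``as_remote()`` method merely follows the general idea of
Database::as_networkdatabase() rather than reproducing it, and the virtual
methods are defined inline purely to keep the sketch in a single file (in the
library their definitions would go in a .cc file).

```cpp
#include <cassert>
#include <string>
#include <vector>

// All names below are hypothetical and exist only for this illustration.

class RemoteBackend;

class Backend {
  public:
    virtual ~Backend() { }
    // Overridden by RemoteBackend; used instead of dynamic_cast<>.
    virtual RemoteBackend* as_remote() { return nullptr; }
};

class RemoteBackend : public Backend {
  public:
    RemoteBackend* as_remote() override { return this; }
    int port() const { return 8080; }
};

// A two-member struct rather than std::pair<std::string, unsigned>: the
// mangled symbol names stay short and the members are self-describing.
struct TermCount {
    std::string term;
    unsigned count;
};

// Explicit std:: qualifiers throughout, as required in externally visible
// headers.
inline double mean_count(const std::vector<TermCount>& items) {
    if (items.empty()) return 0.0;
    unsigned long total = 0;
    for (const TermCount& item : items) total += item.count;
    // Constructor syntax and static_cast<> instead of C style casts.
    return double(total) / static_cast<double>(items.size());
}
```

Code holding a ``Backend*`` can call ``as_remote()`` and test the result
against ``nullptr`` where it might otherwise have reached for
``dynamic_cast<>``.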
655 Include ordering for source files
656 ---------------------------------
658 To help us move towards a consistent ordering of #include lines in source
659 files, please use the following policy when ordering them:
661 * #include <config.h> should be first, and use <> not "" (as recommended by the
662 autoconf manual). Always include config.h from C/C++ source files, but don't
663 include it from header files - the autoconf manual recommends that it should
664 be included first, so including it from headers is either redundant, or may
665 hide a missing config.h include in the source file the header was included
666 from (better to get an error in this case).
668 * The header corresponding to the source file should be next. This means that
669 compilation of the library ensures that each header with a corresponding
670 source file is "self supporting" (i.e. it implicitly or explicitly includes
671 all of the headers it requires).
673 * External xapian-core headers, alphabetically. When included from other
674 external headers, use <> to reduce problems with finding headers in the
675 user's source tree by mistake. In sources and internal headers, use ""
676 (practically this makes no difference as we have -I for srcdir and builddir,
677 but <> suggests installed header files so "" seems more natural).
679 * Internal headers, alphabetically (using "").
681 * "Safe" versions of library headers (include these first to avoid issues if
682 other library headers include the ones we want to wrap). Use "" and order
685 * Library headers, alphabetically.
687 * Standard C++ headers, alphabetically. Use the modern (no .h suffix) names.
689 C++ Portability Issues
690 ======================
695 The "C++ Super-FAQ" covers many frequently asked C++ questions:
696 https://isocpp.org/faq
698 Header Portability Issues
699 -------------------------
704 Don't directly '#include <fcntl.h>' - instead '#include "safefcntl.h"'.
706 The main reason for this is that when using certain compilers on certain
707 versions of Solaris, fcntl.h does '#define open open64'. Sadly this breaks C++
708 code which has methods called open (as we do). There's a cunning workaround
709 for this problem in common/safefcntl.h.
711 Also, safefcntl.h ensures that O_BINARY is defined (to 0 if not required) so
712 calls to open() and creat() can specify O_BINARY unconditionally for the
713 benefit of platforms which discriminate between text and binary files.
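The O_BINARY guarantee can be illustrated with a stand-alone C++ sketch. Inside
Xapian you would simply include "safefcntl.h"; the sketch below includes
<fcntl.h> directly only so that it compiles on its own, assumes a POSIX
platform, and uses a hypothetical helper called ``open_readonly()``. It is not
a copy of what common/safefcntl.h actually contains.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Fallback in the spirit of what safefcntl.h is described as providing:
// platforms with no text/binary distinction get a flag which does nothing.
#ifndef O_BINARY
# define O_BINARY 0
#endif

// Hypothetical helper: open a file read-only, always passing O_BINARY so
// the same call works on platforms which do distinguish text from binary.
inline int open_readonly(const char* path) {
    return open(path, O_RDONLY | O_BINARY);
}
```

Because the flag is 0 where it isn't needed, callers never have to wrap it in
their own ``#ifdef``.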
718 Don't directly '#include <windows.h>' - instead '#include "safewindows.h"'
719 which reduces the bloat of header files included and prevents some of the
720 more egregious namespace pollution. It also defines any constants we need
721 which might be missing in older versions of the mingw headers.
726 Don't directly '#include <winsock2.h>' - instead '#include "safewinsock2.h"'.
727 This ensures that safewindows.h is included before <winsock2.h> to avoid
728 winsock2.h including windows.h without our namespace pollution reducing
734 Don't directly '#include <sys/select.h>' - instead '#include "safesysselect.h"'
735 which supports older UNIX platforms which predate POSIX 1003.1-2001 and works
736 around a problem on Solaris.
741 Don't directly '#include <sys/socket.h>' - instead '#include "safesyssocket.h"'
742 which supports older UNIX platforms which predate POSIX 1003.1-2001 and works
748 Don't directly '#include <sys/stat.h>' - instead '#include "safesysstat.h"'
749 which under MSVC enables stat to work on files > 2GB, defines the missing
750 POSIX macros S_ISDIR and S_ISREG, pulls in <direct.h> for mkdir() (which is
751 provided by sys/stat.h under UNIX) and provides a compatibility wrapper for
752 mkdir() which takes 2 arguments (so code using mkdir can always just pass
758 To get `WEXITSTATUS` or `WIFEXITED` defined, '#include "safesyswait.h"'.
759 Note that this won't provide `waitpid()`, etc on Microsoft Windows, since
760 these functions are only really useful to use when `fork()` is available.
765 Don't directly '#include <unistd.h>' - instead '#include "safeunistd.h"'
766 - MSVC doesn't even HAVE unistd.h!
768 The various "safe" headers are maintained in xapian-core/common, but also used
769 by Omega. Currently bootstrap sorts out setting up a copy of this subdirectory
770 via a secondary git checkout.
772 Warning-Free Compilation
773 ------------------------
775 Compiling without warnings on every platform is our goal, though it's not
776 always possible to achieve. For example, some GCC 3.x compilers produce the
777 occasional bogus warning (e.g. warning that a variable may be used
778 uninitialised, despite it being initialised at the point of declaration!)
You should consider configuring with:
782 ./configure CXXFLAGS=-Werror
784 when doing development work on Xapian. This promotes warnings to errors,
785 which should ensure you at least don't introduce new warnings for the compiler
788 If you configure with --enable-maintainer-mode, and are using GCC 4.1 or newer,
789 this is done for you automatically. This is intended to be an aid rather than
790 a form of automated punishment - it's all too easy to miss a new warning as
791 once a file is compiled, you don't see it unless you modify that file or one of
794 With Intel's C++ compiler, --enable-maintainer-mode also enables -Werror.
795 If you know the equivalent of -Werror for other compilers, please add a note
796 here, or tell us so that we can add a note.
798 Miscellaneous Portability Issues
799 --------------------------------
801 Make sure that the last line of any source file ends with a linefeed character
802 since it's undefined behaviour if it doesn't (most compilers accept it, though
803 at least GCC gives a warning).
805 Branch Prediction Hints
806 =======================
808 For compilers which support ``__builtin_expect()`` (GCC >= 3.0 and some others)
809 you can provide manual hints to assist branch prediction. We've wrapped these
810 in macros which evaluate to just their argument for compilers which don't
support ``__builtin_expect()``.
813 Within the xapian-core library code, you can mark the expressions in ``if`` and
814 ``while`` statements as ``rare`` (if the condition is rarely true) or ``usual``
815 (if the condition is usually true).
819 if (rare(something_unusual())) deal_with_it();
  while (usual(!end_condition())) keep_going();
823 It's easy to make incorrect assumptions about where hotspots are and which
824 branches are usually taken or not, so except for really obvious cases (such
825 as ``if (!consistency_check()) throw_exception();``) you should benchmark
826 that new ``rare`` and ``usual`` hints help rather than hinder before committing
827 them to the repository. It's also likely to be a waste of effort to add them
828 outside of areas of code which are executed very frequently.
830 Don't expect miracles - the first 15 uses added saved approximately 1%.
832 If you know how to implement the ``rare`` and ``usual`` macros for other
833 compilers, please let us know.
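As a rough illustration of the wrapping described above, the following sketch shows one way such macros could be defined and used. The definitions here are an assumption made for this example rather than a copy of the ones in the tree, and ``count_negative()`` is a hypothetical function:

```cpp
#include <cstddef>
#include <vector>

// Assumed definitions for illustration only. With GCC-compatible compilers
// the hint is passed to __builtin_expect(); elsewhere the macros evaluate to
// just their argument.
#if defined(__GNUC__)
# define rare(COND) __builtin_expect(!!(COND), 0)
# define usual(COND) __builtin_expect(!!(COND), 1)
#else
# define rare(COND) (COND)
# define usual(COND) (COND)
#endif

// Hypothetical hot loop: negative values are assumed to be uncommon, and the
// loop condition is assumed to be true on most evaluations.
std::size_t count_negative(const std::vector<int>& values) {
    std::size_t n = 0;
    for (std::size_t i = 0; usual(i != values.size()); ++i) {
        if (rare(values[i] < 0)) ++n;
    }
    return n;
}
```

The hints never change the result of the code, only how the compiler lays it out, which is why benchmarking is the only way to tell whether a particular hint helps.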
838 Especially for a library, compile-time options aren't a good solution for
839 how to integrate a new feature. An increasingly large number of users install
840 pre-built binary packages rather than building from source, and unless the
841 package is capable of being split into modules, the packager has to choose a
842 set of compile-time options to use. And they'll tend to choose either the
843 standard ones, or perhaps a broader set to try to keep everyone happy. For a
844 library, similar issues occur when installing from source as well - the
845 sysadmin must choose the options which will keep all users happy.
847 Another problem with compile-time options is that it's hard to ensure that
848 a change doesn't break compilation under some combination of options without
849 actually building and running the test-suite on all combinations. The fewer
850 compile-time options, the more likely the code will compile with every
853 So please think carefully before adding more compile-time options. They're
854 probably OK for experimental features (but should go away once a feature is no
855 longer experimental). Options to instrument a build for special purposes
856 (debug, profiling, etc) are also acceptable. Disabling whole features probably
857 isn't (e.g. the --disable-backend-XXX options we already have are dubious,
858 though being able to disable the remote backend can be useful when trying to
859 get Xapian going on a platform).
864 We don't want to force those building Xapian from the source distribution to
865 have to use GNU make. Requiring GNU make for "make dist" isn't such a problem
866 but it's probably better to use portable constructs everywhere to avoid
867 problems when people move or copy code between targets. If you do make use
868 of non-portable constructs where it's OK, add a comment noting the special
869 circumstances which justify doing so.
871 Here's an incomplete list of things to avoid:
873 * Don't use "$(RM)" - it's defined by GNU make, but using it actually harms
874 portability as other makes don't define it. Use plain "rm" instead.
876 * Don't use "%" pattern rules - these are GNU make specific. Use an
877 implicit rule (e.g. ".c.o:") if you can. Otherwise, write out each version
880 * Don't use "$<" except in implicit rules. This is an annoying restriction,
881 as using "$<" makes it much easier to make VPATH builds work. But it's only
882 portable in implicit rules. Tips for rewriting - if it's a source file,
887 If it's a generated object file or similar, just write the name as is. The
888 tricky case is a generated file which isn't in git but is shipped in the
889 distribution tarball, as such a file could be in either the source or build
890 tree. Use this trick to make sure it's found whichever directory it's in::
892 `test -f foo.ext || echo '$(srcdir)/'`foo.ext
894 * Don't use "exit 0" to make a rule fail. Use "false" instead. BSD make
895 doesn't like "exit 0" in a rule.
897 * Don't use make conditionals. Automake offers conditionals which may be
898 of use, and these are implemented to work with any make. See the automake
899 manual for details, and a few caveats.
901 * The list of portable utilities is:
903 cat cmp cp diff echo egrep expr false grep install-info
904 ln ls mkdir mv pwd rm rmdir sed sleep sort tar test touch true
906 Note that versions of these (GNU versions in particular) support switches
907 which aren't portable - notably, "test -r" isn't portable; neither is
908 "cp -a". And note that "mkdir -p" isn't portable - the semantics vary.
909 The autoconf manual has some useful information about writing portable
910 shell code (most of it not specific to autoconf)::
912 https://www.gnu.org/software/autoconf/manual/autoconf.html#Portable-Shell
914 * Don't use "include" - it's not present in BSD make (at least some versions
915 have ".include" instead, but that doesn't really seem to help...) Automake
916 provides a configure-time include, which may provide a replacement for some
919 * It appears that BSD make only supports VPATH for implicit rules (e.g.
920 ".c.o:") - there's certainly a restriction there which is not present in GNU
921 make. We used to try to work around this, but now we use AM_MAINTAINER_MODE
922 to disable rules which are only needed by those developing Xapian (these were
923 the rules which caused problems). And we recommend those developing Xapian
924 use GNU make to avoid problems.
926 * Rules with multiple targets can cause problems for parallel builds. These
927 rules are really just a shorthand for multiple rules with the same
928 prerequisites and commands, and it is fine to use them in this way. However,
929 a common temptation is to use them when a single invocation of a command
930 generates multiple output files, by adding each of the output files as a
931 target. Eg, if a swig language module generates xapian_wrap.cc and
932 xapian_wrap.h, it is tempting to add a single rule something like::
934 # This rule has a problem
935 xapian_wrap.cc xapian_wrap.h: xapian.i
938 This can result in SWIG_commands being run twice, in parallel. If
939 SWIG_commands generates any temporary files, the two invocations can
940 interfere causing one of them to fail.
942 Instead of this rule, one solution is to pick one of the output files as a
943 primary target, and add a dependency for the second output file on the first
946 # This rule also has a problem
947 xapian_wrap.h: xapian_wrap.cc
948 xapian_wrap.cc: xapian.i
951 This ensures that make knows that only one invocation of SWIG_commands is
952 necessary, but could result in problems if the invocation of SWIG_commands
953 failed after creating xapian_wrap.cc, but before creating xapian_wrap.h.
954 Instead, we recommend creating an intermediate target::
956 # This rule works in most cases
957 xapian_wrap.cc xapian_wrap.h: xapian_wrap.stamp
958 xapian_wrap.stamp: xapian.i
962 Because the intermediate target is only touched after the commands have
963 executed successfully, subsequent builds will always retry the commands if an
964 error occurs. Note that the intermediate target cannot be a "phony" target
965 because this would result in the commands being re-run for every build.
967 However, this rule still has a problem - if the xapian_wrap.cc and
968 xapian_wrap.h files are removed, but the xapian_wrap.stamp file is not, the
969 .cc and .h files will not be regenerated. There is no simple solution to
970 this, but the following is a recipe taken from the automake manual which
971 works. For details of *why* it works, see the section in the automake manual
972 titled "Multiple Outputs"::
974 # This rule works even if some of the output files were removed
975 xapian_wrap.cc xapian_wrap.h: xapian_wrap.stamp
976 ## Recover from the removal of $@. A full explanation of these rules is in
977 ## the automake manual under the heading "Multiple Outputs".
978 @if test -f $@; then :; else \
979 trap 'rm -rf xapian_wrap.lock xapian_wrap.stamp' 1 2 13 15; \
980 if mkdir xapian_wrap.lock 2>/dev/null; then \
981 rm -f xapian_wrap.stamp; \
982 $(MAKE) $(AM_MAKEFLAGS) xapian_wrap.stamp; \
983 rmdir xapian_wrap.lock; \
985 while test -d xapian_wrap.lock; do sleep 1; done; \
986 test -f xapian_wrap.stamp; exit $$?; \
989 xapian_wrap.stamp: xapian.i
993 * This is actually a robustness point, not portability per se. Rules which
994 generate files should be careful not to leave a partial file in place if
995 there's an error as it will have a timestamp which leads make to believe it's
996 up-to-date. So this is bad:
999 $PERL script.pl > foo.cc
1004 $PERL script.pl > foo.tmp
1007 Alternatively, pass the output filename to the script and make sure you
1008 delete the output on error or a signal (although this approach can leave
  a partial file in place if the power fails). All Makefile.am files and
  scripts in use were checked (and fixed if required) as of 2003-07-10, except
  for those in xapian-bindings.
1013 * Another robustness point - if you add a non-file target to a makefile, you
1014 should also list it in ".PHONY". Otherwise your target won't get remade
1015 reliably if someone creates a file with the same name in their tree. For
1018 .PHONY: hello goodbye
1026 And lastly a style point - using "@" to suppress echoing of commands being
1027 executed removes choice from the user - they may want to see what commands
1028 are being executed. And if they don't want to, many versions of make support
the use of "make -s" to suppress the echoing of commands.
1031 Using @echo on a message sent to stdout or stderr is acceptable (since it
1032 avoids showing the message twice). Otherwise don't use "@" - it makes it
1033 harder to track down problems in the makefiles.
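The robustness point above about not leaving a partial output file in place applies equally to a generator program which is passed the output filename. Here is a minimal sketch of that approach in C++, assuming a hypothetical helper ``write_file_atomically()`` (nothing in the tree is known to have this name): write to a temporary name, and only rename it to the real name once the write has succeeded.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: write contents to path via a temporary file so that a
// failure part way through never leaves a truncated file under the real name.
bool write_file_atomically(const std::string& path,
                           const std::string& contents) {
    const std::string tmp = path + ".tmp";
    {
        std::ofstream out(tmp.c_str(), std::ios::binary | std::ios::trunc);
        out << contents;
        out.close();
        if (!out) {
            // Opening, writing or closing failed: remove the partial file.
            std::remove(tmp.c_str());
            return false;
        }
    }
    // On POSIX platforms rename() replaces any existing file; some other
    // platforms refuse to overwrite, in which case we clean up and fail.
    if (std::rename(tmp.c_str(), path.c_str()) != 0) {
        std::remove(tmp.c_str());
        return false;
    }
    return true;
}
```

As with the shell version, this doesn't protect against a signal or power failure between the write and the rename, but in that case only the temporary file is left behind, so make won't mistake it for an up-to-date target.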
1038 Scripts generally should *not* have an extension indicating the language they
1039 are currently implemented in (e.g. ``runtest`` rather than ``runtest.sh`` or
1040 ``runtest.pl``). The problem with such an extension is that if we decide
1041 to reimplement the script in a different language, we either have to rename
1042 the script (which is annoying as people will be used to the name, and may
1043 have embedded it in their own scripts), or we have a script with a confusing
1044 name (e.g. a Python script with extension ``.pl``).
1046 The above reasoning doesn't apply to scripts which have to be in a particular
1047 language for some reason, though for consistency they probably shouldn't get
1048 an extension either, unless there's a good reason to have one.
1053 Use Assert to perform internal consistency checks, and to check for invalid
1054 arguments to functions and methods (e.g. passing a NULL pointer when this isn't
1055 permitted). It should *NOT* be used to check for error conditions such as
1056 file read errors, memory allocation failing, etc (since we want to perform such
1057 checks in non-debug builds too).
1059 File format errors should also not be tested with Assert - we want to catch
1060 a corrupted database or a malformed input file in a non-debug build too.
1062 There are several variants of Assert:
1064 - Assert(P) -- asserts that expression P is true.
1066 - AssertRel(a,rel,b) -- asserts that (a rel b) is true - rel can be a boolean
1067 relational operator, i.e. one of ``==``, ``!=``, ``>``, ``>=``, ``<``,
1068 ``<=``. The message given if the assertion fails reports the values of
1069 a and b, so ``AssertRel(a,<,b);`` is more helpful than ``Assert(a < b);``
1071 - AssertEq(a,b) -- shorthand for AssertRel(a,==,b).
1073 - AssertEqDouble(a,b) -- asserts a and b differ by less than DBL_EPSILON
- AssertParanoid(P) -- a particularly expensive assertion. If you want a build
  with Asserts enabled, but without a great performance overhead, then pass
  --enable-assertions=partial to configure; AssertParanoids won't be checked,
  but Asserts will. You can also use AssertRelParanoid and AssertEqParanoid.
1081 - CompileTimeAssert(P) -- this has now been removed, since we require C++11
1082 support from the compiler, and C++11 added ``static_assert``.
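To show why reporting both operands makes AssertRel more helpful than a plain Assert, here is a minimal sketch of a macro in the same spirit. ``SKETCH_ASSERT_REL`` and ``check_less()`` are hypothetical names invented for this example; this is not how Xapian's macros are implemented, and it throws ``std::logic_error`` rather than Xapian::AssertionError to stay self-contained:

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for AssertRel. Note that A and B are evaluated more
// than once on failure, so they should be free of side effects.
#define SKETCH_ASSERT_REL(A, REL, B) \
    do { \
        if (!((A) REL (B))) { \
            std::ostringstream msg_; \
            msg_ << #A " " #REL " " #B " failed: values were " \
                 << (A) << " and " << (B); \
            throw std::logic_error(msg_.str()); \
        } \
    } while (0)

// Compile-time checks use static_assert directly (available since C++11).
static_assert(sizeof(char) == 1, "char is one byte by definition");

// Hypothetical helper: returns the failure message, or an empty string if
// the assertion holds.
std::string check_less(int a, int b) {
    try {
        SKETCH_ASSERT_REL(a, <, b);
    } catch (const std::logic_error& e) {
        return e.what();
    }
    return std::string();
}
```

With this sketch, ``check_less(5, 3)`` reports the expression together with the values 5 and 3, whereas a plain boolean assertion could only report that ``a < b`` was false.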
1084 Marking Features as Deprecated
1085 ==============================
1087 In the API headers, a feature (a class, method, function, enum, typedef, etc)
1088 can be marked as deprecated by using the XAPIAN_DEPRECATED() or
1089 XAPIAN_DEPRECATED_CLASS macros. Note that you can't deprecate a preprocessor
1092 For compilers with a suitable mechanism (such as GCC, clang and MSVC) this
1093 causes compile-time warning messages to be emitted for any use of the
1094 deprecated feature. For compilers without support, the macro just expands to
1097 Sometimes a deprecated feature will also be removed from the library itself
1098 (particularly something like a typedef), but if the feature is still used
1099 inside the library (for example, so we can define class methods), then use
1100 XAPIAN_DEPRECATED_EX() or XAPIAN_DEPRECATED_CLASS_EX instead, which will only
1101 issue a warning in user code (this relies on user code including xapian.h
and library code including individual headers).
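For readers unfamiliar with how such a macro can work, the following is a minimal sketch of the general technique. ``SKETCH_DEPRECATED``, ``old_function()`` and ``new_function()`` are hypothetical names, and the definitions are an assumption for illustration rather than a copy of <xapian/deprecated.h>:

```cpp
// Sketch of a deprecation macro: wrap a declaration and attach the compiler's
// own "deprecated" marker where one is available.
#if defined(__GNUC__)
# define SKETCH_DEPRECATED(D) D __attribute__((__deprecated__))
#elif defined(_MSC_VER)
# define SKETCH_DEPRECATED(D) __declspec(deprecated) D
#else
// No known mechanism: expand to just the declaration.
# define SKETCH_DEPRECATED(D) D
#endif

// The macro wraps the whole declaration except for the semicolon.
SKETCH_DEPRECATED(int old_function(double arg));

// Hypothetical replacement which callers should migrate to.
int new_function(double arg) {
    return static_cast<int>(arg) * 2;
}

// Defining the deprecated function doesn't itself trigger a warning; only
// code which calls it does.
int old_function(double arg) {
    return new_function(arg);
}
```

Any caller of ``old_function()`` then gets a compile-time warning on compilers which support the marker, and the code still builds unchanged on those which don't.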
1104 You must add this line to any API header which uses XAPIAN_DEPRECATED() or
1105 XAPIAN_DEPRECATED_CLASS::
1107 #include <xapian/deprecated.h>
1109 When marking a feature as deprecated, document the deprecation in
1110 docs/deprecation.rst. When actually removing deprecated features, please tidy
1111 up by removing the inclusion of <xapian/deprecated.h> from any file which no
1112 longer marks any features as deprecated.
1114 The XAPIAN_DEPRECATED() macro should wrap the whole declaration except for the
1115 semicolon and any "definition" part, for example::
1117 XAPIAN_DEPRECATED(int old_function(double arg));
1121 XAPIAN_DEPRECATED(int old_method());
1123 XAPIAN_DEPRECATED(int old_const_method() const);
1125 XAPIAN_DEPRECATED(virtual int old_virt_method()) = 0;
1127 XAPIAN_DEPRECATED(static int old_static_method());
1129 XAPIAN_DEPRECATED(static const int OLD_CONSTANT) = 42;
1132 Mark a class as deprecated by inserting ``XAPIAN_DEPRECATED_CLASS`` after the
1133 class keyword like so::
1135 class XAPIAN_DEPRECATED_CLASS Foo {
1142 With recent versions of GCC (4.4.7 allows this, 3.3.5 doesn't), you can
1143 simply mark a method defined inline in a class with ``XAPIAN_DEPRECATED()``
1148 // This failed to compile with GCC 3.3.5.
1149 XAPIAN_DEPRECATED(int old_inline_method()) { return 42; }
1152 Xapian 1.3.x and later require at least GCC 4.7, so you can now just use the
1158 If you have a patch to fix a problem in Xapian, or to add a new feature,
1159 please send it to us for inclusion. Any major changes should be discussed
1160 on the xapian-devel mailing list first:
1161 <https://xapian.org/lists>
1163 Also, please read the following section on licensing of patches before
1166 We find patches in unified diff format easiest to read. If you're using
1167 git, then "git diff" is good (or "git format-patch" for a patch series). If
1168 you're working from a tarball, you can unpack a second clean copy of the files
1169 and compare the two versions with "diff -pruN" (-p reports the function name
1170 for each chunk, -r acts recursively, -u does a unified diff, and -N shows
1171 new files in the diff). Alternatively "ptardiff" (which comes with perl, at
1172 least on Debian and Ubuntu) can diff against the original tarball, unpacking
1175 Please set the width of a tab character in your editor to 8 spaces, and use
1176 Unix line endings (i.e. LF, not CR+LF). Failing to do so will make it much
1177 harder for us to merge in your changes.
1179 We don't currently have a formal coding standards document, but please try
1180 to follow the style of the existing code. In particular:
1182 * Indent C++ code by 4 spaces for a new indentation level, and set your editor
1183 to tab-fill indentation (with a tab being 8 spaces wide).
1185 As an exception, "public", "protected" and "private" declarations in classes
1186 and structs should be indented by 2 spaces, and the following code should be
1187 indented by 2 more spaces::
1194 The rationale for this exception is that class definitions in header files
1195 often have fairly long lines, so losing an indent level to the access
1196 specifier tends to make class definitions less readable.
1198 The default access for a class is always "private", so there's no need
1199 to specify that explicitly - in other words, write this::
1202 int internal_method();
1205 int external_method();
1212 int internal_method();
1215 int external_method();
1218 If a class only contains public methods and data, consider declaring it as a
1219 "struct" (the only difference in C++ is that the default access for a
1220 struct is "public").
1222 * Put a space before the "(" after control flow constructs like "for", "if",
1223 "while", etc. Don't put a space before the "(" in function calls. So
1224 write "if (strlen(p) > 10)" not "if(strlen (p) > 10)".
* When "if", "else", "for", "while", "do", "switch", "case", "default", "try",
1227 or "catch" is followed by a block enclosed in braces, the opening brace
1228 should be on the same line, like so::
1237 The rationale for this is that it conserves vertical space (allowing more
1238 code to fit on screen) without reducing readability.
1240 * If you have an empty loop body, use `{ }` rather than `;` as the former
1241 stands out more clearly to the reader (but also consider if the code might be
1242 clearer written a different way).
1244 * Prefer "++i;" to "i++;", "i += 1;", or "i = i + 1". For simple integer
1245 variables these should generate equivalent (if not identical) code, but if i
1246 is an iterator object then the pre-increment form can be more efficient in
1247 some cases with some compilers. It's simpler and more consistent to always
1248 use the pre-increment form (unless you make use of the old value which the
1249 post-increment form returns). For the same reasons, prefer "--i;" to "i--;",
1250 "i -= 1;", or "i = i - 1;".
1252 * Prefer "container.empty()" to "container.size() == 0" (and
1253 "!container.empty()" to "container.size() != 0" or "container.size() > 0").
1254 Some containers (e.g. std::forward_list) support "empty()" but not "size()".
1255 Pre-C++11 finding the size of a container wasn't necessarily a constant time
1256 operation for some containers (e.g. std::list with GCC) - that's no longer
1257 the case for any STL containers since C++11, but it could still be true for
1258 non-STL containers. Also the "empty()" form is a little more concise and
1259 makes the intent of the test more explicit.
1261 * Prefer not to use "else" when the control flow is diverted elsewhere at the
1262 end of the "if" block (e.g. by "return", "continue", "break", "throw"). This
1263 eliminates a level of indentation from the code in the "else" block, and
1264 typically makes the control flow logic clearer. For example::
* For standard ISO C headers, prefer the C++ form (e.g.
1287 "#include <cstdlib>" rather than "#include <stdlib.h>") unless there's a good
1288 reason (e.g. portability) to do otherwise. Be sure to document such
1289 exceptions to avoid another developer changing them to the standard form.
1290 Global exceptions: <signal.h> (lots of POSIX stuff which e.g. Sun's compiler
1291 doesn't provide in <csignal>).
1293 * For standard ISO C++ headers, *always* use the ISO C++ form '#include <list>'
1294 (pre-ISO compilers used '#include <list.h>', but GCC has generated a warning
1295 for this form for years, and GCC 4.3 dropped support entirely).
1297 * Some guidelines for efficient use of std::string:
1299 + When passing an empty string to a method expecting ``const std::string &``
1300 prefer ``std::string()`` to ``""`` or ``std::string("")`` as the first form
1301 is more likely to directly use a special "empty string representation" (it
1302 does with GCC at least).
1304 + To make a string object empty, ``s.resize(0)`` (if you want to keep the
1305 current reserved space) or ``s = string()`` (if you don't) seem the best
1308 + Use ``std::string::assign()`` rather than building a temporary string
1309 object and assigning that. For example, ``foo = std::string(ptr, len);``
1310 is better written as ``foo.assign(ptr, len);``.
1312 + It's generally better to build up strings using ``+=`` rather than
1313 combining series of components with ``+``. So ``foo = a + " and " + c`` is
1314 better written as ``foo = a; foo += " and "; foo += c;``. It's possible
1315 for compilers to handle the former without a lot of temporary string
1316 objects by returning a proxy object to allow the concatenation to happen
1317 lazily, but not all compilers do this, and it's likely to still have some
1318 overhead. Note that GCC 4.1 seems to produce larger code in some cases for
1319 the latter approach, but it's a definite win with GCC 4.4.
1321 * ``std::string(1, '\0')`` seems to be slightly more efficient than
1322 ``std::string("", 1)`` for constructing a std::string containing a single
1323 ASCII nul character.
1325 * Prefer ``new SomeClass`` to ``new SomeClass()``, since the latter tends to
  lead one to write ``SomeClass foo();`` which is a function prototype, and not
1327 equivalent to the variable definition ``SomeClass foo``. However, note that
1328 ``new SomePODType()`` is *not* the same as ``new SomePODType`` (if
1329 SomePODType is a POD (Plain Old Data) type) - the former will zero-initialise
1330 scalar members of SomePODType.
1332 * When catching an exception which is an object, do it by const reference, so
1337 } catch (const ErrorClass &e) {
1341 Catching by value is bad because it "slices" the object if an object of a
1342 derived type is thrown. Even if derived types aren't a worry, it also causes
1343 the copy constructor to be called needlessly.
1345 See also: https://isocpp.org/wiki/faq/exceptions#what-to-catch
1347 A const reference is preferable to a non-const reference as it stops the
1348 object being inadvertently modified. In the rare cases when you want to
1349 modify the caught object, a non-const reference is OK.
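Several of the style points in the list above can be seen together in a short sketch. All the function and type names here are hypothetical and exist only for illustration:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct SomePODType {
    int count;
};

// Build up a string with += rather than a chain of + operators.
std::string join_with_and(const std::string& a, const std::string& c) {
    std::string foo(a);
    foo += " and ";
    foo += c;
    return foo;
}

// Use assign() rather than building a temporary string and assigning that.
std::string from_buffer(const char* ptr, std::size_t len) {
    std::string foo;
    foo.assign(ptr, len);
    return foo;
}

// Catch by const reference so a derived exception object isn't sliced.
// There's no "else" after the try block since the catch returns.
std::string describe_failure(bool fail) {
    try {
        if (fail) throw std::out_of_range("index too large");
    } catch (const std::exception& e) {
        return e.what();
    }
    return std::string();
}

// Prefer empty() to size() == 0, and pre-increment for iterators.
int sum_all(const std::vector<int>& values) {
    if (values.empty()) return 0;
    int total = 0;
    std::vector<int>::const_iterator i;
    for (i = values.begin(); i != values.end(); ++i) {
        total += *i;
    }
    return total;
}

// "new SomePODType()" value-initialises, so count is zero here; with plain
// "new SomePODType" it would be left uninitialised.
int value_initialised_count() {
    SomePODType* p = new SomePODType();
    int result = p->count;
    delete p;
    return result;
}
```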
1351 We will do our best to give credit where credit is due - if we have used
1352 patches from you, or received helpful reports or advice, we will add your name
1353 to the AUTHORS file (unless you specifically request us not to). If you see we
1354 have forgotten to do this, please draw it to our attention so that we can
1355 address the omission.
1357 Licensing of patches
1358 ====================
1360 If you want a patch to be considered for inclusion in the Xapian sources, you
1361 must own the copyright on this patch. Employers often claim copyright on code
1362 written by their employees (even if the code is written in their spare time),
1363 so please check with your employer if this applies. Be aware that even if you
are a student your university may try to claim some rights on code which you
1367 Patches which are submitted to Xapian will only be included if the copyright
1368 holder(s) dual-license them under each of the following licences:
1370 - GPL version 2 and all later versions (see the file "COPYING" for details).
1373 Copyright (c) <year> <copyright holders>
1375 Permission is hereby granted, free of charge, to any person obtaining a copy
1376 of this software and associated documentation files (the "Software"), to
1377 deal in the Software without restriction, including without limitation the
1378 rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
1379 sell copies of the Software, and to permit persons to whom the Software is
1380 furnished to do so, subject to the following conditions:
1382 The above copyright notice and this permission notice shall be included in
1383 all copies or substantial portions of the Software.
1385 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
1386 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
1387 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
1388 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
1389 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
1390 FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
1393 The current distribution of Xapian contains many files which are only licensed
1394 under the GPL, but we are working towards being able to distribute Xapian under
1395 a more permissive license, and are not willing to accept patches which we will
1396 have to rewrite before this can happen.
1398 Tips for Submitting a Good Patch
1399 ================================
1401 1) Make sure that the documentation is updated
1402 ----------------------------------------------
1404 * API classes, methods, functions, and types must be documented by
1405 documentation comments alongside the declaration in ``include/xapian/*.h``.
1406 These are collated by doxygen - see doxygen's documentation for details
  of the supported syntax. We prefer to use @ rather than \
1408 to introduce doxygen commands (the choice is essentially arbitrary, but
1409 \ introduces C/C++ escape sequences so @ is likely to make for easier to
1410 read mark up for C/C++ coders).
1412 * The documentation comments don't give users a good overview, so we also
1413 need documentation which gives a good overview of how to achieve particular
  tasks. In particular, major new functionality should have its own "topic"
1415 document, or extend an existing topic document if more appropriate.
1417 * Internal classes, etc should also be documented by documentation comments
1418 where they are declared.
1420 2) Make sure the tests are right
1421 --------------------------------
1423 * If you're adding a feature, also add feature tests for it. These both
1424 ensure that the feature isn't broken to start with and detect if later
1425 changes stop it working as intended.
1427 * If you've fixed a bug, make sure there's a regression test which
1428 fails on the existing code and succeeds after your changes.
1430 * If you're adding a new testcase to demonstrate an existing bug, and not
  checking a fix in at the same time, mark the testcase as a known failure by
  calling ``XFAIL("explanatory message")`` at the start of your testcase (if
  necessary this can be conditional on backend or other factors - the backend
  case has explicit support via ``XFAIL_FOR_BACKEND("backend", "message")``).
1436 This will mean that this testcase failing will be reported as "XFAIL" which
1437 won't cause the test run to fail. If such a testcase in fact passes, that
1438 gets reported as "XPASS" and *will* cause the test run to fail. A testcase
1439 should not be flagged as "XFAIL" for a long time, but it can be useful to be
1440 able to add such testcases during development. It also allows a patch
1441 series which fixes a bug to first demonstrate the bug via a new testcase
1442 marked as "XFAIL", then fix the bug and remove the "XFAIL" - this makes it
1443 clear that the regression test actually failed before the fix.
1445 Note that failures which are due to valgrind errors or leaked fds are not
1446 affected by this macro - such errors are inherently not suitable for "XFAIL"
1447 as they go away when the testsuite is run without valgrind or on a platform
1448 where our fd leak detector code isn't supported.
1450 * Make sure all existing tests continue to pass.
1452 If you don't know how to write tests using the Xapian test rig, then
1453 ask. It's reasonably simple once you've done it once. There is a brief
1454 introduction to the Xapian test system in ``docs/tests.html``.
1456 3) Make sure the attributions are right
1457 ---------------------------------------
1459 * If necessary, modify the copyright statement at the top of any
1460 files you've altered. If there is no copyright statement, you may
1461 add one (there are a couple of Makefile.am's and similar that don't
1462 have copyright statements; anything that small doesn't really need
1463 one anyway, so it's a judgement call). If you've added files which
1464 you've written from scratch, they should include the GPL boilerplate
1465 with your name only.
1467 * If you're not in there, add yourself to the AUTHORS file.
1474 + If there's a trac ticket or other reference for the bug, mention it in the
1475 commit message - it's a great help to future developers trying to work out
1476 why a change was made.
1478 5) Consider backporting
1479 -----------------------
1481 * If there's an active release branch, check if the bug is present in that
1482 branch, and if the fix is appropriate to backport - if the fix breaks ABI
1483 compatibility or is very invasive, you need to fix it in a different way
1484 for the release branch, or decide not to backport the fix.
1489 * If there's a related trac ticket, update it (if the issue is completely
1490 addressed by the changes you've made, then close it).
1492 * Update the release notes for the most recent release with a copy of the
1493 patch. If the commit from git applies cleanly, you can just link to
1494 it. If it fails to apply, please attach an adjusted patch which does.
1495 If there are conflicts in test cases which aren't easy to resolve, it is
1496 acceptable to just drop those changes from the patch if we can still be
1497 confident that the issue is actually fixed by the patch.
1502 We use reference counted pointers for most API classes. These are implemented
1503 using Xapian::Internal::intrusive_ptr, the implementation of which is exposed
1504 for efficiency, and because it's unlikely we'll need to change it frequently,
1507 For the reference counted classes, the API class (e.g. Xapian::Enquire) is
1508 really just a wrapper around a reference counted pointer. This points to an
1509 internal class (e.g. Xapian::Enquire::Internal). The reference counted
1510 pointer is a member variable of the API class called internal. Conceptually
1511 this member is private, though it typically isn't declared as private (this
1512 is to avoid littering the external headers with friend declarations for
1515 There are a few exceptions to the reference counted structure, such as
1516 MSetIterator and ESetIterator which have an exposed implementation. Tests show
1517 this makes a substantial difference to speed (it's ~20% faster) in typical
1518 cases of iterator use.
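The wrapper structure described above for the reference counted classes can be sketched roughly as follows. This is only an illustration, not Xapian's actual code: ``ref_counted_base``, ``simple_intrusive_ptr`` and ``Widget`` are hypothetical names invented for this example, and the real Xapian::Internal::intrusive_ptr differs in its details.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class: the reference count lives inside the object.
struct ref_counted_base {
    unsigned refs = 0;
};

// Hypothetical, much simplified intrusive pointer (not thread-safe).
template<class T>
class simple_intrusive_ptr {
    T* px;

  public:
    simple_intrusive_ptr(T* p = nullptr) : px(p) { if (px) ++px->refs; }
    simple_intrusive_ptr(const simple_intrusive_ptr& o) : px(o.px) {
        if (px) ++px->refs;
    }
    // Copy-and-swap: the by-value parameter releases the old target.
    simple_intrusive_ptr& operator=(simple_intrusive_ptr o) {
        std::swap(px, o.px);
        return *this;
    }
    ~simple_intrusive_ptr() { if (px && --px->refs == 0) delete px; }

    T* get() const { return px; }
    T* operator->() const { return px; }
};

// Hypothetical API class: just a wrapper around the counted pointer.
class Widget {
  public:
    class Internal;

    // Conceptually private, but not declared as such.
    simple_intrusive_ptr<Internal> internal;

    Widget();
    void set_value(int v);
    int get_value() const;
};

// The internal class holds the real state.
class Widget::Internal : public ref_counted_base {
  public:
    int value = 0;
};

Widget::Widget() : internal(new Internal) {}
void Widget::set_value(int v) { internal->value = v; }
int Widget::get_value() const { return internal->value; }

// Copies of the API object share one Internal object.
inline void check_widget_sharing() {
    Widget a;
    a.set_value(42);
    Widget b = a;
    assert(b.get_value() == 42);
    assert(a.internal.get() == b.internal.get());
}
```

In this sketch, copying the API object only copies the pointer and bumps the count, so copies are cheap and all refer to the same internal state.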
1520 The postfix operator++ for iterators should be implemented inline in terms
1521 of the prefix form as described by Joe Buck on the gcc mailing list
1522 - excerpt from https://article.gmane.org/gmane.comp.gcc.devel/50201 ::
1524 class some_iterator {
1527 some_iterator& operator++();
1529 some_iterator operator++(int) {
1530 some_iterator tmp = *this;
1536 The compiler is allowed to assume that the copy constructor only does
1537 a copy, and to optimize away unneeded copy operations. The result
1538 in this case should be that, for some_iterator above, using the
1539 postfix operator without using the result should give code equivalent
1540 to using the prefix operator.
1542 Now, for [GCC 3.4], you'll find that the dead uses of tmp are only
1543 completely optimized away if tmp has only one data member that can fit in a
1544 register. [GCC 4.0 will do] better, and you should find that this style
1545 comes very close to eliminating any penalty from "incorrect" use of the
1548 Xapian's PostingIterator, TermIterator, PositionIterator, and ValueIterator all
1549 have only one data member which fits in a register.
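As a self-contained illustration of the pattern described above, here is a minimal sketch of an iterator whose postfix operator++ is implemented inline in terms of the prefix form. ``counting_iterator`` is a hypothetical class invented for this example (it is not part of Xapian); like the Xapian iterators mentioned above, it has a single data member which fits in a register.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical iterator with a single, register-sized data member.
class counting_iterator {
    std::size_t pos;

  public:
    explicit counting_iterator(std::size_t pos_ = 0) : pos(pos_) {}

    std::size_t operator*() const { return pos; }

    // Prefix form: does the real work.
    counting_iterator& operator++() {
        ++pos;
        return *this;
    }

    // Postfix form: defined inline in terms of the prefix form.
    counting_iterator operator++(int) {
        counting_iterator tmp = *this;
        operator++();
        return tmp;
    }
};

// Small check that both forms behave as expected.
inline void check_counting_iterator() {
    counting_iterator it(5);
    counting_iterator old = it++;  // Postfix returns the previous value.
    assert(*old == 5);
    assert(*it == 6);
    assert(*++it == 7);            // Prefix returns the updated iterator.
}
```

When the result of ``it++`` is discarded, the copy into ``tmp`` is dead code, which is what gives the compiler the opportunity to produce the same code as for ``++it``.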
1551 Handy tips for aiding development
1552 =================================
If you find you are repeatedly changing the API headers (in include/)
1555 during development, then you may become annoyed that the docs/ subdirectory
1556 will rebuild the doxygen documentation every time you run "make" since this
1557 takes a while. You can disable this temporarily (if you're using GNU make),
1558 by creating a file "docs/GNUmakefile" containing these two lines::
1561 @echo "Skipping 'make $@' in docs"
1563 Note that the whitespace at the start of the second line needs to be a
1564 single "tab" character!
1566 Don't forget to remove (or rename) this and check the documentation builds
1567 before committing or generating a patch though!
If you are using an editor or other tool capable of running syntax checks as you
work, you can use the `make` target 'check-syntax'. For 'emacs' users this
1571 works well with 'flymake'. Usage from a shell::
1573 make check-syntax check_sources=api/omdatabase.cc
1576 How to make a release
1577 =====================
1579 See https://github.com/xapian/xapian-developer-guide/blob/master/releases/index.rst
1580 where the documentation for this now lives.