.. Copyright (C) 2007,2008,2011 Olly Betts

.. contents:: Table of contents

Xapian provides support for storing a synonym dictionary, or thesaurus. The
Xapian::QueryParser class can use this to expand terms in user query strings,
either automatically or when the user requests it with an explicit synonym
operator (``~``).

Note that Xapian doesn't offer automated generation of the synonym dictionary.

The model for the synonym dictionary is that a term or group of consecutive
terms can have one or more synonym terms. A group of consecutive terms is
specified in the dictionary by simply joining them with a single space between
them.

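The dictionary model described above can be pictured with a small in-memory
stand-in. This is only an illustrative sketch: ``SynonymDict``, ``make_key()``
and the free function ``add_synonym()`` are hypothetical names used for this
example and say nothing about how Xapian stores synonyms on disk.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a synonym dictionary: each key is a single term,
// or several consecutive terms joined by single spaces.
typedef std::map<std::string, std::set<std::string> > SynonymDict;

// Build the dictionary key for a group of consecutive terms by joining them
// with exactly one space between each pair.
std::string make_key(const std::vector<std::string>& terms) {
    std::string key;
    for (std::size_t i = 0; i < terms.size(); ++i) {
        if (i > 0) key += ' ';
        key += terms[i];
    }
    return key;
}

// Record `synonym` as a synonym of the term or term group `terms`.
void add_synonym(SynonymDict& dict, const std::vector<std::string>& terms,
                 const std::string& synonym) {
    dict[make_key(terms)].insert(synonym);
}
```

With this sketch, the group ``stock`` followed by ``market`` is stored under
the single key ``stock market``, alongside ordinary one-term keys.
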
QueryParser Integration
=======================

In order for any of the synonym features of the QueryParser to work, you must
call ``QueryParser::set_database()`` to specify the database to use.

If ``FLAG_SYNONYM`` is passed to ``QueryParser::parse_query()`` then the
QueryParser will recognise ``~`` in front of a term as indicating a request for
synonym expansion. If ``FLAG_LOVEHATE`` is also specified, you can use ``+``
and ``-`` before the ``~`` to indicate that you love or hate the synonym
expansion.

A synonym-expanded term becomes the term itself OR-ed with any listed synonyms,
so ``~truck`` might expand to ``truck OR lorry OR van``. A group of terms is
handled in much the same way.

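The OR-expansion just described can be sketched with a toy function that
renders the expanded query as text. ``SynonymDict`` and ``expand_term()`` are
hypothetical names for this illustration; the real QueryParser builds a query
object rather than a string.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a synonym dictionary (term -> listed synonyms).
typedef std::map<std::string, std::vector<std::string> > SynonymDict;

// Render the expansion of `term`: the term itself, OR-ed with each listed
// synonym. A term with no entry is returned unchanged.
std::string expand_term(const SynonymDict& dict, const std::string& term) {
    std::string result = term;
    SynonymDict::const_iterator it = dict.find(term);
    if (it != dict.end()) {
        for (const std::string& synonym : it->second) {
            result += " OR " + synonym;
        }
    }
    return result;
}
```

Given a dictionary in which ``truck`` lists ``lorry`` and ``van``, calling
``expand_term(dict, "truck")`` yields ``truck OR lorry OR van``, while a term
without synonyms is left as it is.
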
If a term to be synonym-expanded will be stemmed by the QueryParser, then
synonyms will be checked for the unstemmed form first, and then for the stemmed
form, so you can provide different synonyms for particular unstemmed forms.

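The lookup order for stemmed terms can be modelled as below. This is a sketch
under two assumptions that the text does not spell out: the stemmed form is
consulted only when the unstemmed form has no entry, and stemmed forms are
stored under the bare stem. ``toy_stem()`` is a deliberately crude placeholder
(it strips one trailing ``s``) and is not a real stemming algorithm.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a synonym dictionary (term -> listed synonyms).
typedef std::map<std::string, std::vector<std::string> > SynonymDict;

// Placeholder stemmer for illustration only: strips one trailing "s".
std::string toy_stem(const std::string& word) {
    if (word.size() > 1 && word[word.size() - 1] == 's')
        return word.substr(0, word.size() - 1);
    return word;
}

// Look up synonyms for `word`, trying the unstemmed form first and falling
// back to the stemmed form only if the unstemmed form has no entry
// (an assumption of this sketch).
std::vector<std::string> lookup_synonyms(const SynonymDict& dict,
                                         const std::string& word) {
    SynonymDict::const_iterator it = dict.find(word);
    if (it == dict.end()) it = dict.find(toy_stem(word));
    if (it == dict.end()) return std::vector<std::string>();
    return it->second;
}
```

In this model an entry for ``news`` takes precedence over an entry for
``new``, even though the toy stemmer maps ``news`` to ``new``, which is how a
particular unstemmed form can be given its own synonyms.
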
If ``FLAG_AUTO_SYNONYMS`` is passed to ``QueryParser::parse_query()`` then the
QueryParser will automatically expand any term which has synonyms, unless the
term is in a phrase or similar.

If ``FLAG_AUTO_MULTIWORD_SYNONYMS`` is passed to ``QueryParser::parse_query()``
then the QueryParser will look at groups of terms separated only by whitespace
and try to expand them as term groups. This is done in a "greedy" fashion, so
the first term which can start a group is expanded first, and the longest group
starting with that term is expanded. After expansion, the QueryParser will
look for further possible expansions starting with the term after the last
term in the expanded group.

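The greedy grouping described above can be sketched as a left-to-right scan.
This is an illustrative model rather than the QueryParser's actual code:
``SynonymDict`` and ``expand_groups()`` are hypothetical names, only groups of
two or more terms are considered, and each expansion is rendered as text.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a synonym dictionary; multi-term keys are the
// terms joined by single spaces.
typedef std::map<std::string, std::vector<std::string> > SynonymDict;

// Scan `terms` left to right. At each position, find the longest group of
// two or more terms that has a dictionary entry; expand it and resume after
// its last term. Terms that start no group are passed through unchanged.
std::vector<std::string> expand_groups(const SynonymDict& dict,
                                       const std::vector<std::string>& terms) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < terms.size()) {
        std::string key = terms[i];
        SynonymDict::const_iterator best = dict.end();
        std::size_t best_len = 0;
        // Grow the candidate group one term at a time, remembering the
        // longest one found in the dictionary.
        for (std::size_t j = i + 1; j < terms.size(); ++j) {
            key += ' ';
            key += terms[j];
            SynonymDict::const_iterator it = dict.find(key);
            if (it != dict.end()) {
                best = it;
                best_len = j - i + 1;
            }
        }
        if (best == dict.end()) {
            out.push_back(terms[i]);
            ++i;
            continue;
        }
        std::string expanded = best->first;
        for (const std::string& synonym : best->second) {
            expanded += " OR " + synonym;
        }
        out.push_back(expanded);
        i += best_len;  // continue with the term after the expanded group
    }
    return out;
}
```

If the dictionary has entries for both ``new york`` and ``new york city``,
the input ``visit new york city hotels`` expands the longer group. If it has
entries for ``a b`` and ``b c``, the input ``a b c`` expands only ``a b``,
because the scan resumes at ``c``.
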
Explicit multi-word synonyms
----------------------------

There ought to be a way to explicitly request expansion of multi-term synonyms,
probably with the syntax ``~"stock market"``. This hasn't been implemented
yet.

Currently synonyms are supported by glass and chert databases. They work
with a single database or multiple databases (use ``Database::add_database()``
as usual). We've no plans to support them for the InMemory backend, but we do
intend to support them for the remote backend in the future.