1 This ChangeLog file is no longer maintained - see the git repo history for
2 more recent changes: https://xapian.org/bleeding
4 Wed Sep 30 19:40:15 GMT 2015 Olly Betts <olly@survex.com>
6 * query.cc: Avoid creating temporary string objects when appending a
7 substring of another string.
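
A minimal sketch of the idea in the entry above, assuming a hypothetical helper append_range() (this name is illustrative and is not taken from query.cc). The three-argument form of std::string::append() copies the characters straight into the destination, whereas appending the result of substr() first builds a temporary string:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from query.cc): append up to `len` characters of
// `src`, starting at offset `pos`, onto the end of `out`.
//
// The slower spelling would be `out += src.substr(pos, len);`, which
// constructs a temporary std::string, copies it into `out`, and then
// destroys it.  append(str, pos, len) avoids that temporary.
void append_range(std::string& out, const std::string& src,
                  std::size_t pos, std::size_t len)
{
    // Throws std::out_of_range if pos > src.size(), just as substr() would.
    out.append(src, pos, len);
}
```

For example, appending offset 6, length 5 of "hello world" to "ab" gives "abworld".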
9 Wed Sep 30 19:36:07 GMT 2015 Olly Betts <olly@survex.com>
11 * query.cc: Use += to build up strings (which should be O(n)), rather
12 than str = str + str2 (which is likely to be O(n*n)).
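
The difference described above can be sketched as follows. The function name join_append() is hypothetical and only serves to illustrate the pattern:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: concatenate a list of strings by appending in place.
std::string join_append(const std::vector<std::string>& parts)
{
    std::string result;
    for (const std::string& part : parts) {
        // Appends to the existing buffer, so the total work is (amortised)
        // proportional to the length of the final string.
        result += part;
        // By contrast, `result = result + part;` creates a new string
        // holding a copy of everything built so far on every iteration,
        // so the total work tends towards O(n*n).
    }
    return result;
}
```

Both spellings produce the same string; only the amount of copying differs.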
14 Thu Sep 24 03:50:58 GMT 2015 Olly Betts <olly@survex.com>
16 * query.cc: Fix $jsonarray not to prepend ']' to the first array
19 Thu Sep 17 00:28:30 GMT 2015 Olly Betts <olly@survex.com>
21 * Makefile.am: Fix "make check" compilation failure on platforms
24 Thu Sep 17 00:26:16 GMT 2015 Olly Betts <olly@survex.com>
26 * Makefile.am: atomparsetest and htmlparsetest need datetime.cc.
28 Wed Sep 16 23:43:16 GMT 2015 Olly Betts <olly@survex.com>
30 * myhtmlparse.cc: Remove unused header.
32 Wed Sep 16 23:42:29 GMT 2015 Olly Betts <olly@survex.com>
34 * Makefile.am,datetime.cc,datetime.h,metaxmlparse.cc,myhtmlparse.cc:
35 Factor out parse_datetime() function.
37 Wed Sep 16 20:22:03 GMT 2015 Olly Betts <olly@survex.com>
39 * diritor.cc: Avoid magic_descriptor() in libmagic < 5.15 as it
40 closes the fd passed to it. The magic_descriptor() code path isn't
41 actually used currently, so this issue doesn't cause omindex to
44 Tue Sep 15 22:52:24 GMT 2015 Olly Betts <olly@survex.com>
46 * index_file.cc,index_file.h,omindex.cc: Pass in Document object and
47 string for record so extra data can be added before the file is
50 Mon Sep 14 03:15:50 GMT 2015 Olly Betts <olly@survex.com>
52 * index_file.cc,index_file.h,omindex.cc: Move setting of default
53 command filters into index_file.cc.
55 Mon Sep 14 00:54:36 GMT 2015 Olly Betts <olly@survex.com>
57 * Makefile.am,mime.cc,mime.h,omindex.cc: Factor out the code to find
58 a MIME type for a file.
60 Sat Sep 12 05:22:16 GMT 2015 Olly Betts <olly@survex.com>
62 * docs/Makefile.am: Remove bogus '.' from rm command.
64 Fri Sep 11 04:42:53 GMT 2015 Olly Betts <olly@survex.com>
66 * omindex.cc: Tweak order of constant definitions.
68 Fri Sep 11 04:42:27 GMT 2015 Olly Betts <olly@survex.com>
70 * index_file.cc,index_file.h: Factor out index_add_document() function.
72 Thu Sep 10 20:43:47 GMT 2015 Olly Betts <olly@survex.com>
74 * Makefile.am,index_file.cc,index_file.h,omindex.cc: Refactor to start
75 to split out the code to index a file.
77 Thu Sep 10 07:30:25 GMT 2015 Olly Betts <olly@survex.com>
79 * configure.ac,docs/Makefile.am: Fix generation of overview.rst when
82 Thu Sep 10 05:56:12 GMT 2015 Olly Betts <olly@survex.com>
84 * omindex.cc: Use SAMPLE_SIZE in help text rather than literal 512.
85 Add '--title-size' option.
87 Wed Sep 09 04:28:43 GMT 2015 Olly Betts <olly@survex.com>
89 * omindex.cc: Fix comment typo.
91 Tue Sep 08 04:31:51 GMT 2015 Olly Betts <olly@survex.com>
93 * docs/omegascript.rst,query.cc: Fix documentation of $last to say
94 it's the MSet index *one beyond* the end of the current page.
95 Reported by Andrew Chilton.
97 Wed Sep 02 07:10:31 GMT 2015 Olly Betts <olly@survex.com>
99 * docs/overview.rst: SVG extraction is built-in too.
101 Wed Sep 02 02:28:45 GMT 2015 Olly Betts <olly@survex.com>
103 * docs/.gitignore,docs/Makefile.am,docs/overview.rst,gen-mimemap:
104 Generate the list of recognised mime types and of ignored extensions.
106 Tue Sep 01 07:22:38 GMT 2015 Olly Betts <olly@survex.com>
108 * .gitignore: Update.
110 Tue Sep 01 07:14:46 GMT 2015 Olly Betts <olly@survex.com>
112 * .gitignore,Makefile.am,gen-mimemap,mimemap.tokens,omindex.cc: Factor
113 out the default extension to MIME content-type mapping into a file,
114 and generate a static lookup table for use with keyword().
116 Sat Aug 15 17:24:07 GMT 2015 Olly Betts <olly@survex.com>
118 * docs/cgiparams.rst: Document behaviour if xDB is not set.
120 Sat Aug 15 17:19:51 GMT 2015 Olly Betts <olly@survex.com>
122 * docs/cgiparams.rst,query.cc: If xFILTERS is not set, don't force the
123 first page as that's unhelpful if someone fails to set it in their
126 Mon Jul 06 09:54:50 GMT 2015 Olly Betts <olly@survex.com>
128 * configure.ac: Don't provide our own implementation of sleep() under
129 __WIN32__ if there's already one - mingw provides one, and in some
130 situations it seems to clash with ours. Reported to xapian-discuss
133 Thu Jun 11 05:03:52 GMT 2015 Olly Betts <olly@survex.com>
135 * Makefile.am,docs/overview.rst,failed.h,omindex.cc: Track files which
136 couldn't be indexed in the user metadata and skip them by default on
137 subsequent runs to avoid the costs of repeatedly running a filter on
138 a file it can't handle.
140 Mon Jun 01 13:13:26 GMT 2015 Olly Betts <olly@survex.com>
142 * NEWS,configure.ac: Update for 1.3.3.
144 Wed May 27 02:13:24 GMT 2015 Olly Betts <olly@survex.com>
148 Fri May 22 13:05:39 GMT 2015 Olly Betts <olly@survex.com>
150 * docs/termprefixes.rst,omindex.cc,templates/query: Index the filename
151 terms with an 'F' prefix, rather than treating them as more body
152 text. (Fixes #633, reported by Emmanuel Garette)
154 Fri May 22 03:43:15 GMT 2015 Olly Betts <olly@survex.com>
158 Fri May 15 03:08:15 GMT 2015 Olly Betts <olly@survex.com>
160 * Makefile.am,configure.ac: Use -no-install or -no-fast-install when
161 linking test programs which never get installed, which means libtool
162 can often avoid creating a shell script wrapper.
164 Wed May 13 15:02:36 GMT 2015 Olly Betts <olly@survex.com>
166 * omindex.cc: Message tweak.
168 Wed May 13 15:00:49 GMT 2015 Olly Betts <olly@survex.com>
170 * omindex.cc: Handle text/x-perl and application/x-dvi via the
171 commands map instead of hardcoded cases.
173 Wed May 13 14:49:52 GMT 2015 Olly Betts <olly@survex.com>
175 * docs/overview.rst,omindex.cc: Allow --filter to specify the
176 character set of the output the filter produces.
178 Wed May 13 01:34:09 GMT 2015 Olly Betts <olly@survex.com>
180 * outlookmsg2html.in: Fix handling of message/rfc822 subparts.
182 Tue May 12 14:07:27 GMT 2015 Olly Betts <olly@survex.com>
184 * docs/overview.rst,omindex.cc: Allow --filter to handle commands
185 which produce output in a temporary file rather than on stdout.
187 Tue May 12 13:37:01 GMT 2015 Olly Betts <olly@survex.com>
189 * omindex.cc: Handle application/vnd.ms-excel via commands map instead
192 Tue May 12 12:56:40 GMT 2015 Olly Betts <olly@survex.com>
194 * docs/overview.rst,omindex.cc: Add support for %f in command passed
195 to --filter to allow specifying commands where the input file is
196 not the final argument. Fixed #570, reported by "catkin".
198 Tue May 12 11:17:20 GMT 2015 Olly Betts <olly@survex.com>
200 * diritor.h,omindex.cc,values.h: Add -track-ctime option to allow
201 omindex to pick up changes to file ownership and permissions.
203 Tue May 12 07:38:45 GMT 2015 Olly Betts <olly@survex.com>
205 * configure.ac: Fix typo.
207 Mon May 11 07:05:24 GMT 2015 Olly Betts <olly@survex.com>
209 * cdb_hash.cc,md5.cc: Remove 'register' as it's deprecated, and
210 likely to just be ignored by any modern compiler anyway.
212 Tue May 05 12:25:20 GMT 2015 Olly Betts <olly@survex.com>
214 * docs/encodings.rst: $prettyurl undoes %-encoding of UTF-8 in 1.2.21
217 Tue May 05 01:40:17 GMT 2015 Olly Betts <olly@survex.com>
219 * omega.cc: Drop compilation date and time from output - they prevent
220 reproducible builds and the version number is sufficient
223 Fri May 01 09:36:27 GMT 2015 Olly Betts <olly@survex.com>
225 * configure.ac,md5.h,values.h: Now we require C++11, just include
226 <cstdint> for uint32_t.
228 Fri May 01 09:02:34 GMT 2015 Olly Betts <olly@survex.com>
230 * configure.ac: For Sun's C++ compiler, -std=c++11 enables C++11
231 support, and is incompatible with -library=stlport, so remove code
232 to enable that later option.
234 Fri May 01 08:35:47 GMT 2015 Olly Betts <olly@survex.com>
236 * commonhelp.cc,omega.cc,omindex-list.cc,omindex.cc,scriptindex.cc:
237 Add spaces between literal strings and macros which expand to
238 literal strings for C++11 compatibility.
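
A small illustration of why the spaces are needed. EXAMPLE_VERSION is a hypothetical macro standing in for any macro which expands to a string literal:

```cpp
#include <string>

// Hypothetical macro which expands to a string literal.
#define EXAMPLE_VERSION "1.3.3"

// C++11 added user-defined literals, so an identifier written immediately
// after a closing quote, as in "omega "EXAMPLE_VERSION, is lexed as a
// literal suffix rather than being expanded as a macro.  Putting a space
// between them keeps them as separate tokens, and adjacent string literals
// are still concatenated by the compiler as before.
std::string banner()
{
    return "omega " EXAMPLE_VERSION "\n";
}
```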
240 Fri May 01 06:27:13 GMT 2015 Olly Betts <olly@survex.com>
242 * configure.ac,m4/ax_cxx_compile_stdcxx_11.m4: Sync with xapian-core,
245 Fri May 01 01:38:08 GMT 2015 Olly Betts <olly@survex.com>
247 * INSTALL: IRIX is past EOL so drop information about IRIX make.
249 Thu Apr 30 05:08:13 GMT 2015 Olly Betts <olly@survex.com>
251 * Makefile.am: Add common/stringutils.cc to urlenctest_SOURCES, needed
252 now urldecode.h uses C_isxdigit().
254 Thu Apr 30 02:56:08 GMT 2015 Olly Betts <olly@survex.com>
256 * configfile.cc,htmlparse.cc,myhtmlparse.cc,omega.cc,omindex.cc,
257 query.cc,scriptindex.cc,urldecode.h: Consistently use C_isupper(),
258 C_toupper(), etc as these versions aren't affected by the locale
259 setting, and also allow signed char values (so we don't need to
260 cast the argument to unsigned char).
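
To illustrate what "not affected by the locale" and "allow signed char values" mean in practice, here is a sketch using hypothetical stand-ins ascii_isupper() and ascii_toupper(). These are NOT the actual implementations of C_isupper() and C_toupper(), and they assume an ASCII-compatible execution character set:

```cpp
// Hypothetical stand-ins, for illustration only.  They classify plain ASCII
// characters by range, so the result never depends on setlocale(), and a
// negative `char` value (a byte >= 0x80 where char is signed) is simply
// "not a letter" rather than undefined behaviour as it would be if passed
// to std::isupper() without a cast to unsigned char.
inline bool ascii_isupper(char ch)
{
    return ch >= 'A' && ch <= 'Z';
}

inline char ascii_toupper(char ch)
{
    // Assumes 'a'..'z' and 'A'..'Z' are contiguous, as in ASCII.
    return (ch >= 'a' && ch <= 'z') ? static_cast<char>(ch - 'a' + 'A') : ch;
}
```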
262 Wed Apr 15 01:12:33 GMT 2015 Olly Betts <olly@survex.com>
264 * docs/cgiparams.rst,docs/omegascript.rst,omega.cc,query.cc: Fix
265 handling of multiple P.<prefix> fields - previously only the first
266 seen was used. These fields are also now taken into account when
267 deciding if the query has changed. $query now returns an
268 OmegaScript list with one entry for each CGI parameter passed.
270 Wed Apr 15 01:11:08 GMT 2015 Olly Betts <olly@survex.com>
272 * templates/query: Fix setting of prefix map for P - in 1.3.2,
273 this would fail to also search in the subject. Now it also
274 searches in the subject and topic.
276 Wed Apr 15 01:09:29 GMT 2015 Olly Betts <olly@survex.com>
278 * templates/query: When listing matching terms, don't make the commas
281 Wed Mar 11 12:18:31 GMT 2015 Olly Betts <olly@survex.com>
283 * docs/encodings.rst: Note that one should ensure that Omega gets sent
284 form submissions encoded in UTF-8.
286 Wed Mar 11 11:28:31 GMT 2015 Olly Betts <olly@survex.com>
288 * docs/encodings.rst: Discuss encodings of filenames (see #550).
290 Wed Mar 11 11:07:29 GMT 2015 Olly Betts <olly@survex.com>
292 * urldecode.h,urlenctest.cc: $prettyurl now decodes valid UTF-8
293 sequences. Fixes #550 and #644, reported by catkin and terencz.
295 Mon Mar 09 12:31:44 GMT 2015 Olly Betts <olly@survex.com>
297 * docs/: Add a document about character encoding, as suggested by
298 James Aylett in #550.
300 Mon Mar 09 11:32:32 GMT 2015 Olly Betts <olly@survex.com>
302 * docs/overview.rst: Document 'E' prefixed boolean terms for filtering
303 by extension (see #668, reported by bramvdh).
305 Mon Mar 09 11:29:15 GMT 2015 Olly Betts <olly@survex.com>
307 * docs/overview.rst: Whitespace cleanup.
309 Mon Mar 09 10:16:05 GMT 2015 Olly Betts <olly@survex.com>
311 * templates/xml: Add XML declaration.
313 Mon Mar 09 10:14:51 GMT 2015 Olly Betts <olly@survex.com>
315 * templates/query: Eliminate blank line before <html>.
317 Mon Mar 09 10:14:03 GMT 2015 Olly Betts <olly@survex.com>
319 * templates/godmode: Return charset utf-8 in the content-type.
321 Mon Mar 09 06:58:13 GMT 2015 Olly Betts <olly@survex.com>
323 * urldecode.h,urlenctest.cc: Improve decoding done by $prettyurl - we
324 now leave the query and fragment parts of the URL alone and don't
325 decode an escaped "/" (omindex doesn't create URLs with any of
326 these, so we only risk breaking other URLs which have them), and we
327 decode some additional ASCII characters in the path part:
328 []@!$&'()*+.;= (addresses #550 in part)
330 Sun Feb 15 11:04:05 GMT 2015 Olly Betts <olly@tartarus.org>
332 * expand.cc: Suppress bogus uninitialised variable warning with -Os
335 Tue Jan 27 04:37:12 GMT 2015 Olly Betts <olly@survex.com>
337 * docs/overview.rst,omindex.cc: Interpret a command of "false" in
338 "--filter" as meaning to ignore files with that MIME type.
340 Tue Jan 27 04:13:26 GMT 2015 Olly Betts <olly@survex.com>
342 * docs/overview.rst,omindex.cc: Add support for specifying a MIME
343 subtype of '*' in --filter arguments.
345 Thu Jan 22 01:44:01 GMT 2015 Olly Betts <olly@survex.com>
347 * omindex.cc: Ignore extensions .msi and .msp, which are Microsoft
348 installer files, but which libmagic sometimes incorrectly identifies
349 as application/msword.
351 Tue Jan 06 21:15:14 GMT 2015 Olly Betts <olly@survex.com>
353 * configure.ac: Use pkg-config in preference to determine flags needed
354 to compile and link with PCRE, as this will just work when
355 cross-compiling (at least under MXE).
357 Sun Dec 21 21:54:48 GMT 2014 Olly Betts <olly@survex.com>
359 * query.cc: Handle [=0 as [=1.
361 Fri Dec 19 13:09:11 GMT 2014 Olly Betts <olly@survex.com>
363 * configure.ac,diritor.cc: Avoid doing link tests with libmagic in
364 configure as they fail on mingw due to not automatically picking up
365 libraries which libmagic itself depends on.
367 Fri Dec 19 03:21:13 GMT 2014 Olly Betts <olly@survex.com>
369 * docs/cgiparams.rst: Improve wording of docs for SORT parameter.
371 Tue Dec 16 04:06:06 GMT 2014 Olly Betts <olly@survex.com>
373 * configure.ac: Fix typo: 'libmagic-devl' -> 'libmagic-devel'
375 Tue Dec 16 03:53:25 GMT 2014 Olly Betts <olly@survex.com>
377 * configure.ac: Define MINGW_HAS_SECURE_API under mingw to get
378 _putenv_s() declared in stdlib.h.
380 Thu Dec 11 11:33:41 GMT 2014 Olly Betts <olly@survex.com>
382 * Makefile.am: Add timegm.cc to scriptindex_SOURCES to fix build on
383 platforms which don't provide timegm().
385 Wed Dec 03 04:17:18 GMT 2014 Olly Betts <olly@survex.com>
387 * templates/xml: Update handling of DATE1, DATE2 and DAYSMINUS which
388 were renamed in 0.6.x and the compatibility aliases removed in
391 Wed Dec 03 04:15:51 GMT 2014 Olly Betts <olly@survex.com>
393 * docs/omegascript.rst: Update documentation references to DATE1,
394 DATE2, and DAYSMINUS which were renamed in 0.6.x and the
395 compatibility aliases removed in 1.0.0.
397 Wed Dec 03 02:40:50 GMT 2014 Olly Betts <olly@survex.com>
399 * omindex.cc: Make sample_size a global variable rather than passing
400 it around everywhere.
402 Wed Dec 03 02:29:37 GMT 2014 Olly Betts <olly@survex.com>
404 * omindex.cc: Remove unused '#include <fstream>'.
406 Wed Dec 03 02:18:51 GMT 2014 Olly Betts <olly@survex.com>
408 * diritor.h: Fix get_mtime() to return time_t not off_t. In practice,
409 this probably wouldn't have caused issues until at least 2038.
411 Fri Nov 28 17:12:37 GMT 2014 James Aylett <james@tartarus.org>
413 * Makefile.am: link omindex-list with our (GNU) getopt.
415 Fri Nov 28 11:38:56 GMT 2014 Olly Betts <olly@survex.com>
417 * configure.ac: Move AC_CANONICAL_HOST before first use of $host_os.
418 In practice this wasn't a problem, as LT_INIT implicitly calls
419 AC_CANONICAL_HOST before this point anyway.
421 Wed Nov 26 03:55:13 GMT 2014 Olly Betts <olly@survex.com>
423 * configure.ac: Enable automake option 'subdir-objects' to avoid
424 warning from newer automake.
426 Mon Nov 24 20:00:52 GMT 2014 Olly Betts <olly@survex.com>
428 * NEWS,configure.ac: Update for 1.3.2.
430 Sun Nov 23 23:57:34 GMT 2014 Olly Betts <olly@survex.com>
432 * docs/termprefixes.rst: Update for renaming of 'brass' backend to
435 Sun Nov 09 22:48:43 GMT 2014 Olly Betts <olly@survex.com>
439 Tue Oct 28 02:34:34 GMT 2014 Olly Betts <olly@survex.com>
441 * docs/overview.rst: Document built-in list of stopwords.
443 Fri Oct 24 23:07:24 GMT 2014 Gaurav Arora <gauravarora.daiict@gmail.com>
445 * docs/omegascript.rst,weight.cc: Add support for $set{weighting,lm}.
447 Mon Oct 20 05:09:04 GMT 2014 Olly Betts <olly@survex.com>
449 * docs/overview.rst: Note that pdftotext is part of poppler as well as
450 xpdf. (Noted by Paul Wise)
452 Mon Oct 20 00:55:51 GMT 2014 Olly Betts <olly@survex.com>
454 * .gitignore: Update to ignore new generated files.
456 Thu Jul 24 21:27:26 GMT 2014 Olly Betts <olly@survex.com>
460 Sun Jun 22 13:32:10 GMT 2014 Olly Betts <olly@survex.com>
464 Fri Jun 20 14:37:04 GMT 2014 Olly Betts <olly@survex.com>
466 * Makefile.am: Don't compile in unixperm.cc - it isn't currently used,
467 and it fails to build with mingw. (fixes #635)
469 Mon Jun 09 01:23:48 GMT 2014 Olly Betts <olly@survex.com>
471 * myhtmlparse.cc: LibreOffice can export a timestamp of "0;0" - treat
472 this as invalid rather than as "year 0".
474 Fri Jun 06 05:50:21 GMT 2014 Olly Betts <olly@survex.com>
476 * myhtmlparse.cc: Add handling for longer form of timestamps in LO
479 Tue Jun 03 02:29:51 GMT 2014 Olly Betts <olly@survex.com>
481 * diritor.cc: Fix "applications/msword" to "application/msword" in the
482 fallback code for CDF files.
484 Tue Jun 03 01:49:10 GMT 2014 Olly Betts <olly@survex.com>
486 * diritor.cc: In fallback for CDF files, compare the extension
487 *without* leading dot.
489 Fri May 30 05:38:01 GMT 2014 Olly Betts <olly@survex.com>
491 * omindex.cc,urlencode.cc,urlencode.h: URL encode starting URL
494 Thu May 29 03:38:26 GMT 2014 Olly Betts <olly@survex.com>
496 * docs/omegascript.rst: Put ``...`` around Xapian C++ class names.
498 Thu May 29 03:36:55 GMT 2014 Olly Betts <olly@survex.com>
500 * docs/omegascript.rst,query.cc: Add optional LENGTH parameter to
503 Thu May 29 02:41:08 GMT 2014 Olly Betts <olly@survex.com>
505 * diritor.cc: libmagic can return a second string starting "Composite
506 Document File V2 Document" for the mime-type, so just look for that
507 prefix. And newer libmagic returns "application/CDFV2-corrupt" in
508 these cases, so handle that too.
510 Wed May 28 05:22:12 GMT 2014 Olly Betts <olly@survex.com>
512 * omindex-list.cc: Remove debug output.
514 Wed May 28 05:15:50 GMT 2014 Olly Betts <olly@survex.com>
516 * Makefile.am,omindex-list.cc: New tool to list URLs of all the
517 documents in a database (or list of databases) indexed by omindex.
519 Tue May 27 04:12:07 GMT 2014 Olly Betts <olly@survex.com>
521 * docs/omegascript.rst: Document $snippet.
523 Tue May 27 04:07:43 GMT 2014 Mihai Bivol <mm.bivol@gmail.com>
525 * query.cc,templates/query: Add Omega Snipper integration.
527 Fri May 23 12:05:52 GMT 2014 Olly Betts <olly@survex.com>
529 * date.cc,scriptindex.cc: Pass std::string by const reference.
531 Fri May 23 09:05:25 GMT 2014 Olly Betts <olly@survex.com>
533 * query.cc: Remove unused inline function.
535 Tue May 20 23:27:37 GMT 2014 Olly Betts <olly@survex.com>
537 * omindex.cc: Update comment about unrtf --nopict to note the unrtf
538 version where this was fixed to work again.
540 Tue May 20 23:26:19 GMT 2014 Olly Betts <olly@survex.com>
542 * omindex.cc: Report the size limit in the message when we skip a file
545 Mon Apr 14 10:42:06 GMT 2014 Olly Betts <olly@survex.com>
547 * omindex.cc: Filtering via text/html now handles HTML documents which
548 specify a charset. "application/vnd.ms-outlook" is now handled by
549 filtering via text/html rather than as a hard-coded special case.
551 Thu Apr 10 03:47:18 GMT 2014 Olly Betts <olly@survex.com>
553 * docs/overview.rst,omindex.cc: Add support for indexing Microsoft
554 Publisher files using pub2xhtml.
556 Thu Apr 10 03:29:50 GMT 2014 Olly Betts <olly@survex.com>
558 * diritor.cc: Work around libmagic returning a MIME content-type of
559 "Composite Document File V2 Document, No summary info".
561 Tue Mar 25 08:56:13 GMT 2014 Olly Betts <olly@survex.com>
563 * expand.cc: Fix mis-indentation of two lines.
565 Tue Mar 25 08:54:28 GMT 2014 Olly Betts <olly@survex.com>
567 * expand.cc: Fix warning when built with GCC 4.7.2 using -Os.
569 Thu Mar 13 01:41:49 GMT 2014 Olly Betts <olly@survex.com>
571 * omindex.cc: Restrict the length of what we consider to be an
572 extension, currently to 7 characters or whatever the longest
573 extension in the mime_map is if it is longer.
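
One way such a length restriction might look, as a sketch only: short_extension() and its max_len parameter are hypothetical names, and the caller is assumed to pass a leaf filename plus whichever limit applies (7, or the longest extension in the map if that is longer).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from omindex.cc): return the extension of a leaf
// filename without its leading dot, or an empty string if there is no
// extension or it is longer than `max_len` characters.
std::string short_extension(const std::string& leafname, std::size_t max_len)
{
    std::size_t dot = leafname.rfind('.');
    if (dot == std::string::npos)
        return std::string();
    std::size_t len = leafname.size() - dot - 1;
    if (len == 0 || len > max_len)
        return std::string();
    return leafname.substr(dot + 1);
}
```

With a limit of 7, "report.pdf" yields "pdf", while a name such as "notes.a-very-long-suffix" is treated as having no extension.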
575 Mon Mar 10 06:23:22 GMT 2014 Olly Betts <olly@survex.com>
577 * expand.cc,expand.h,omega.cc,query.cc: Fix $set{expand,trad <K>} to
578 work when built against an older xapian-core.
580 Mon Mar 10 04:22:05 GMT 2014 Olly Betts <olly@survex.com>
582 * expand.cc: Only use new query expansion API if built against
583 xapian-core >= 1.3.2.
585 Thu Mar 06 10:25:03 GMT 2014 Olly Betts <olly@survex.com>
587 * expand.cc: Throw an error if $opt{expansion} is invalid.
589 Thu Mar 06 09:58:24 GMT 2014 Aarsh Shah <aarshkshah1992@gmail.com>
591 * Makefile.am,docs/omegascript.rst,expand.cc,expand.h,omega.cc,
592 query.cc: Add support for setting the query expansion scheme to use.
594 Tue Feb 18 05:04:48 GMT 2014 Olly Betts <olly@survex.com>
596 * omindex.cc,tmpdir.h: Add get_tmpfile() helper function.
598 Tue Feb 18 02:09:13 GMT 2014 Olly Betts <olly@survex.com>
600 * omindex.cc: Avoid '//' in temporary filename (cosmetic only).
602 Sat Feb 15 00:58:53 GMT 2014 Olly Betts <olly@survex.com>
606 Wed Feb 12 05:24:23 GMT 2014 Olly Betts <olly@survex.com>
608 * docs/overview.rst,omindex.cc: Don't assume .doc is
609 application/msword but let libmagic decide, as .doc may actually be
610 RTF, and also it's sometimes used for plain-text files.
612 Tue Dec 31 20:20:05 GMT 2013 Olly Betts <olly@survex.com>
614 * timegm.cc: Fix typo.
616 Sun Dec 29 21:26:03 GMT 2013 Olly Betts <olly@survex.com>
618 * Makefile.am: Ship common/safewinsock2.h, needed under mingw.
620 Sun Dec 29 06:22:50 GMT 2013 Olly Betts <olly@survex.com>
622 * Makefile.am,configure.ac,metaxmlparse.cc,myhtmlparse.cc,timegm.cc,
623 timegm.h: Factor out our portable timegm() replacement. Fixes
624 incorrect timezone handling for created timestamps in OpenDocument
625 documents on platforms without timegm(). Should also fix a build
628 Thu Dec 26 01:15:27 GMT 2013 Olly Betts <olly@survex.com>
630 * Makefile.am,portability/mkdtemp.cc,portability/mkdtemp.h,tmpdir.cc:
631 Add header with prototype of mkdtemp() to avoid "no previous
632 declaration" warning on platforms which don't have mkdtemp() as
635 Mon Dec 23 02:25:13 GMT 2013 Olly Betts <olly@survex.com>
639 Fri Dec 20 07:07:36 GMT 2013 Olly Betts <olly@survex.com>
641 * docs/overview.rst: Fix minor typo.
643 Fri Dec 20 04:58:44 GMT 2013 Olly Betts <olly@survex.com>
645 * docs/overview.rst: Add Abiword as an example use of --filter, based
646 on patch from Frank J Bruzzaniti (fixes #383). Update unoconv
647 example to talk about LibreOffice instead of OpenOffice.
649 Thu Dec 19 10:21:28 GMT 2013 Olly Betts <olly@survex.com>
651 * NEWS: Update from 1.2.16 and ChangeLog.
653 Mon Dec 02 22:56:18 GMT 2013 Olly Betts <olly@survex.com>
655 * configure.ac: Define __MSVCRT_VERSION__ to 0x0601 on mingw so we get
656 __ftime64() defined in the headers.
658 Fri Nov 29 03:51:24 GMT 2013 Olly Betts <olly@survex.com>
660 * configure.ac: Sync GCC checks with xapian-core.
662 Sun Oct 13 23:20:58 GMT 2013 Olly Betts <olly@survex.com>
664 * omindex.cc: Group-readable files which are owner-readable but not
665 world-readable should still get a "readable by owner" term added.
666 Reported by Emmanuel Garette.
668 Fri Sep 27 00:27:47 GMT 2013 Olly Betts <olly@survex.com>
670 * diritor.cc: Handle ENOENT, ENOTDIR and EACCES from readdir().
672 Thu Sep 26 05:41:10 GMT 2013 Olly Betts <olly@survex.com>
674 * diritor.cc: If we've already opened the file (as we often will have
675 if using a modern libmagic with magic_descriptor() available), then
676 use fstat() on that fd rather than stat()/lstat() on the pathname.
678 Thu Sep 26 05:32:46 GMT 2013 Olly Betts <olly@survex.com>
680 * diritor.cc: If we get EACCES trying to read a directory or stat
681 a file, don't handle it by committing changes and exiting - instead
682 skip the file (like we used to before r17461).
684 Tue Sep 24 10:01:17 GMT 2013 Olly Betts <olly@survex.com>
686 * .gitignore,configure.ac,xapian-omega.spec.in: Compress source
687 tarballs with xz instead of gzip.
689 Mon Sep 23 06:26:03 GMT 2013 Olly Betts <olly@survex.com>
691 * diritor.cc,diritor.h,omindex.cc: If we get ENOTDIR trying to index a
692 file, skip it quietly (unless in verbose mode) as we already do if
693 we get ENOENT, since ENOTDIR is what we get if the file and the
694 directory it was in got removed between us getting the filename and
697 Mon Sep 16 11:21:50 GMT 2013 Olly Betts <olly@survex.com>
699 * runfilter.h: Remove trailing space added in recent commit.
701 Mon Sep 16 02:04:06 GMT 2013 Olly Betts <olly@survex.com>
703 * diritor.h,runfilter.cc,runfilter.h: Pass error message string and
704 errno value in ReadError exceptions.
706 Thu Sep 12 22:02:59 GMT 2013 Olly Betts <olly@survex.com>
708 * weight.cc: Use "" not <> to include local header weight.h.
710 Thu Sep 12 01:18:33 GMT 2013 Olly Betts <olly@survex.com>
712 * diritor.cc,diritor.h,omindex.cc: Commit changes and exit, rather
713 than skipping the current file on most unexpected errors reading
714 directories or initialising libmagic - otherwise we can end up
715 deleting a lot of database entries on errors like EHOSTDOWN.
717 Tue Sep 03 01:03:33 GMT 2013 Olly Betts <olly@survex.com>
719 * myhtmlparse.cc,myhtmlparse.h,omindex.cc: Add support for indexing
720 'created' time from HTML documents.
722 Tue Sep 03 00:23:34 GMT 2013 Olly Betts <olly@survex.com>
724 * omindex.cc: Factor out mimetype_from_ext() function.
726 Mon Sep 02 23:32:01 GMT 2013 Olly Betts <olly@survex.com>
728 * omindex.cc: Fix to sleep for sleep_before_opendir from now, not
731 Mon Sep 02 22:45:44 GMT 2013 Olly Betts <olly@survex.com>
733 * omindex.cc: Make OPT_OPENDIR_SLEEP 256 not -1, as getopt() returns
734 -1 when there are no more options.
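
The clash described above can be avoided as in this sketch, which assumes getopt_long() from <getopt.h> is available. The function parse_opendir_sleep() and the option table are hypothetical and much simpler than the real omindex option handling:

```cpp
#include <getopt.h>
#include <cstdlib>

// Codes for long-only options start above the range of any `char`, so they
// can't clash with a short option letter, nor with the -1 which
// getopt_long() returns once all the options have been processed.
enum { OPT_OPENDIR_SLEEP = 256 };

// Hypothetical parser: returns the number of seconds given to
// --opendir-sleep, or 0 if the option wasn't specified.
unsigned parse_opendir_sleep(int argc, char** argv)
{
    static const struct option longopts[] = {
        { "opendir-sleep", required_argument, nullptr, OPT_OPENDIR_SLEEP },
        { nullptr, 0, nullptr, 0 }
    };
    unsigned secs = 0;
    int c;
    // If the option code were -1, this loop would stop as soon as
    // --opendir-sleep was seen, silently ignoring it and anything after it.
    while ((c = getopt_long(argc, argv, "", longopts, nullptr)) != -1) {
        if (c == OPT_OPENDIR_SLEEP)
            secs = static_cast<unsigned>(std::strtoul(optarg, nullptr, 10));
    }
    return secs;
}
```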
736 Mon Sep 02 05:00:16 GMT 2013 Olly Betts <olly@survex.com>
738 * omindex.cc,docs/overview.rst: Ignore 'adm', 'cur', and 'ico' by
741 Mon Sep 02 04:50:59 GMT 2013 Olly Betts <olly@survex.com>
743 * datematchdecider.h: Fix filename in comment at top of file.
745 Thu Aug 29 23:39:14 GMT 2013 Olly Betts <olly@survex.com>
747 * diritor.h: Mark DirectoryIterator ctor as 'explicit'.
749 Thu Aug 29 23:37:29 GMT 2013 Olly Betts <olly@survex.com>
751 * Makefile.am,omindex.cc: Add omindex --opendir-sleep=SECS option to
752 allow working around problems with indexing files on Microsoft DFS
755 Thu Aug 22 22:04:11 GMT 2013 Olly Betts <olly@survex.com>
757 * xlsxparse.cc: Handle pre-defined numfmtid codes for dates.
759 Wed Jul 17 06:15:27 GMT 2013 Olly Betts <olly@survex.com>
761 * omindex.cc,xlsxparse.cc,xlsxparse.h: Fix detection of cells with a
762 date format to work with xlsx files other than my first example.
764 Wed Jul 17 03:48:41 GMT 2013 Olly Betts <olly@survex.com>
766 * omindex.cc,xlsxparse.cc,xlsxparse.h: Decode dates for xlsx files.
768 Mon Jul 15 12:03:32 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
770 * docs/omegascript.rst,weight.cc: Add support for $set{weighting,dph}.
772 Sun Jul 14 07:06:58 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
774 * docs/omegascript.rst,weight.cc: Add support for $set{weighting,pl2}.
776 Thu Jul 11 06:12:50 GMT 2013 Olly Betts <olly@survex.com>
778 * Makefile.am,jsonescape.cc,jsonesctest.cc: Make the JSON escape code
779 force the text to be valid UTF-8.
781 Wed Jul 10 13:01:36 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
783 * docs/omegascript.rst,weight.cc: Add support for $set{weighting,dlh}.
785 Tue Jul 09 06:29:18 GMT 2013 Olly Betts <olly@survex.com>
787 * omindex.cc,xmlparse.h: Quick fix for infinite recursion from the
788 HtmlParser refactoring work.
790 Sun Jul 07 11:57:11 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
792 * weight.cc: Add support for $set{weighting,bb2}.
793 * docs/omegascript.rst: Update list of weighting schemes understood by
796 Fri Jul 05 03:15:12 GMT 2013 Olly Betts <olly@survex.com>
798 * weight.cc: Add conditional test so we can build against older
799 xapian-core without the new weighting schemes.
801 Wed Jul 03 14:02:17 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
803 * weight.cc: Add support for $set{weighting,ineb2}.
805 Wed Jul 03 13:37:48 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
807 * weight.cc: Add support for $set{weighting,ifb2}.
809 Wed Jul 03 11:58:22 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
811 * weight.cc: Add support for $set{weighting,inl2}.
813 Wed Jun 26 13:03:24 GMT 2013 Olly Betts <olly@survex.com>
815 * configure.ac: Enable -Woverloaded-virtual warning.
817 Wed Jun 26 13:00:33 GMT 2013 Olly Betts <olly@survex.com>
819 * atomparsetest.cc,htmlparse.cc,htmlparse.h,myhtmlparse.cc,omindex.cc,
820 xmlparse.h: Rename virtual parse_html() method to just parse(), as
821 it's not HTML that's being parsed in most cases.
823 Wed Jun 26 07:15:30 GMT 2013 Olly Betts <olly@survex.com>
825 * Makefile.am,msxmlparse.cc,msxmlparse.h,omindex.cc,xmlparse.cc: Split
826 out the code specific to handling MS XML out of XmlParser into an
827 MSXmlParser subclass.
829 Wed Jun 26 04:53:39 GMT 2013 Olly Betts <olly@survex.com>
831 * configure.ac: Sync compiler warning flag machinery against
832 xapian-core. The changes are special handling for clang, passing
833 -fshow-column where supported, and handling for new warning flags
836 Tue Jun 18 03:09:06 GMT 2013 Olly Betts <olly@survex.com>
838 * omindex.cc: Fix off-by-one when finding documents to delete which
839 would sometimes cause omindex to fail to delete documents from the
840 database when they weren't refound during an index update.
842 Mon Jun 17 02:13:21 GMT 2013 Olly Betts <olly@survex.com>
844 * omindex.cc: Report strerror(errno) if we can't read a file.
846 Mon Jun 17 02:12:12 GMT 2013 Olly Betts <olly@survex.com>
848 * omindex.cc: Factor out code to mark a document as seen into a new
849 mark_as_seen() function.
851 Mon Jun 17 00:43:28 GMT 2013 Olly Betts <olly@survex.com>
853 * configure.ac,metaxmlparse.cc,metaxmlparse.h,omindex.cc: Add support
854 for indexing 'topic' and 'created date' meta-data for OpenDocument
857 Sun Jun 16 11:48:28 GMT 2013 Olly Betts <olly@survex.com>
859 * weight.cc: Rewrite the xapian-core version test to use a macro so
862 Sat Jun 15 00:43:16 GMT 2013 Olly Betts <olly@survex.com>
864 * docs/termprefixes.rst,myhtmlparse.cc,myhtmlparse.h,omindex.cc,
865 templates/query: Index "topic" for HTML and PDF documents.
867 Fri May 24 09:22:43 GMT 2013 Olly Betts <olly@survex.com>
869 * weight.cc: Check parameters to $set{weighting,bm25 ...} and
870 $set{weighting,trad ...} converted OK. Based on patch from
873 Wed May 08 11:10:20 GMT 2013 Olly Betts <olly@survex.com>
875 * Makefile.am,README,docs/Makefile.am: SVN -> git.
877 Thu May 02 12:21:36 GMT 2013 Olly Betts <olly@survex.com>
879 * NEWS,configure.ac: Update for 1.3.1.
881 Wed Apr 17 03:10:40 GMT 2013 Olly Betts <olly@survex.com>
883 * NEWS: Update from 1.2 branch and ChangeLog.
885 Tue Apr 16 05:24:54 GMT 2013 Olly Betts <olly@survex.com>
887 * weight.cc: If built against older xapian-core, don't try to use
890 Mon Apr 15 06:21:21 GMT 2013 Aarsh Shah <aarshkshah1992@gmail.com>
892 * docs/omegascript.rst,weight.cc: Add support for
893 $set{weighting,tfidf}.
895 Thu Apr 04 06:42:01 GMT 2013 Olly Betts <olly@survex.com>
897 * Makefile.am,configure.ac: Remove support for 'configure
898 --enable-quiet', 'make QUIET=' and 'make QUIET=y' - automake now
899 supports 'configure --enable-silent-rules', 'make V=1' and 'make
900 V=0' which are broadly equivalent and more standard.
902 Wed Mar 27 09:20:28 GMT 2013 Olly Betts <olly@survex.com>
904 * Makefile.am: Don't link utf8convert.cc code into omega CGI.
906 Sun Mar 24 23:33:56 GMT 2013 Olly Betts <olly@survex.com>
908 * htmlparsetest.cc: Fix table parsing test to have a <table> tag in!
910 Sun Mar 17 22:00:58 GMT 2013 Olly Betts <olly@survex.com>
912 * docs/overview.rst,omindex.cc: Use rst2html to index .rst and
915 Sun Mar 17 21:34:06 GMT 2013 Olly Betts <olly@survex.com>
917 * htmlparsetest.cc: Update testcase to match previous change.
919 Sun Mar 17 21:13:49 GMT 2013 Olly Betts <olly@survex.com>
921 * myhtmlparse.cc: Make paragraph break in sample \r not \n, as \n
922 is used by Omega to indicate the end of a field in the document
925 Fri Mar 15 03:48:50 GMT 2013 Olly Betts <olly@survex.com>
927 * md5wrap.cc,md5wrap.h: Add md5_block() to checksum a block of memory.
929 Thu Mar 07 00:40:20 GMT 2013 Olly Betts <olly@survex.com>
931 * jsonescape.cc: Fix C++11 compatibility issue highlighted by GCC
934 Wed Mar 06 03:35:43 GMT 2013 Olly Betts <olly@survex.com>
936 * gen-myhtmltags,myhtmlparse.cc: Distinguish page breaks from other
937 whitespace in samples.
939 Tue Mar 05 19:37:55 GMT 2013 Olly Betts <olly@survex.com>
941 * INSTALL,configure.ac: Provide hints as to what package to install
944 Mon Mar 04 03:56:22 GMT 2013 Olly Betts <olly@survex.com>
946 * configure.ac,omindex.cc,runfilter.cc,runfilter.h: If omindex
947 receives a SIGHUP, SIGINT, SIGQUIT or SIGTERM, then kill any active
948 external filter child process before handling the signal as we
951 Sun Mar 03 23:52:37 GMT 2013 Olly Betts <olly@survex.com>
953 * configure.ac,runfilter.cc: If setpgid() is available, put each
954 external filter in its own process group so we can easily kill it
955 along with any processes which it starts.
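A minimal sketch of the process group approach described in the entry above, assuming a POSIX system with /bin/sh. The names start_filter() and kill_filter() are hypothetical and are not taken from runfilter.cc, and the use of SIGKILL is an assumption.

```cpp
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs "sh -c command" in its own process group.
// Returns the child pid (which is also the process group id), or -1.
pid_t start_filter(const char* command) {
    pid_t child = fork();
    if (child == 0) {
        // In the child: become leader of a new process group.
        setpgid(0, 0);
        execl("/bin/sh", "sh", "-c", command, (char*)0);
        _exit(127);
    }
    if (child > 0) {
        // Also set it from the parent so the group exists whichever
        // process runs first.  This fails harmlessly if the child has
        // already called exec.
        setpgid(child, child);
    }
    return child;
}

// Hypothetical helper: signals every process in the filter's group, then
// reaps the direct child.  Returns the wait status of the child.
int kill_filter(pid_t child) {
    // A negative pid addresses the whole process group.
    kill(-child, SIGKILL);
    int status = 0;
    waitpid(child, &status, 0);
    return status;
}
```

Signalling the group rather than the single pid means helper processes started by the filter's shell are killed too.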
957 Mon Feb 25 19:03:06 GMT 2013 Olly Betts <olly@survex.com>
959 * docs/overview.rst: Update to add com to the list of ignored
962 Thu Feb 21 22:09:44 GMT 2013 Olly Betts <olly@survex.com>
964 * omindex.cc: Ignore .com files by default.
966 Tue Feb 19 04:19:50 GMT 2013 Olly Betts <olly@survex.com>
968 * NEWS: Update from ChangeLog.
970 Tue Feb 19 03:13:08 GMT 2013 Olly Betts <olly@survex.com>
972 * htmlparsetest.cc,myhtmlparse.cc,myhtmlparse.h: Sample from HTML now
973 contains \n where a line or paragraph break would appear, and \t
976 Tue Feb 19 03:02:58 GMT 2013 Olly Betts <olly@survex.com>
978 * gen-myhtmltags,htmlparsetest.cc,myhtmlparse.cc,myhtmlparse.tokens:
979 Generate a lookup table for where we should insert a space in place
980 of an HTML tag rather than using a switch statement for that.
982 Tue Feb 12 20:43:14 GMT 2013 Olly Betts <olly@survex.com>
984 * Makefile.am: Remove my-html-tok.h in "make clean" not "make
985 distclean", since it's built by "make" and that's what the automake
986 manual recommends for files built by "make".
988 Tue Feb 12 20:41:53 GMT 2013 Olly Betts <olly@survex.com>
990 * Makefile.am: Clean up my-html-tok.h in "make distclean".
992 Tue Feb 12 20:03:51 GMT 2013 Olly Betts <olly@survex.com>
994 * Makefile.am: Ship common/keyword.h.
996 Tue Feb 12 19:27:26 GMT 2013 Olly Betts <olly@survex.com>
998 * Makefile.am: Ship common/Tokeniseise.pm.
1000 Tue Feb 12 00:44:23 GMT 2013 Olly Betts <olly@survex.com>
1002 * Makefile.am: Fix to work in VPATH build.
1004 Mon Feb 11 22:38:16 GMT 2013 Olly Betts <olly@survex.com>
1006 * xlsxparse.cc: Correct "max" -> "min" when reserving space for shared
1007 strings. This only means we now reserve a more appropriate amount
1008 of space to start with.
1010 Fri Feb 01 04:11:50 GMT 2013 Olly Betts <olly@survex.com>
1012 * Makefile.am: Ship new file myhtmlparse.tokens.
1014 Thu Jan 31 23:42:23 GMT 2013 Olly Betts <olly@survex.com>
1016 * myhtmlparse.cc,myhtmlparse.tokens: Add <APPLET>, <OBJECT>, and
1017 <TR> to the tags handled.
1019 Thu Jan 31 06:24:13 GMT 2013 Olly Betts <olly@survex.com>
1021 * Makefile.am,gen-myhtmltags,myhtmlparse.cc,myhtmlparse.tokens: Use
1022 a generated compact and efficient table to convert HTML tag names
1023 to enum codes, which we can then use a C switch statement to
1024 dispatch. The table first checks the token length, and then does a
1025 binary chop on tokens of the same length. This is both faster and
1026 smaller than the approach we were using, with the benefit that the
1027 table is auto-generated.
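A minimal sketch of the lookup scheme described in the entry above: check the token length first, then binary chop among the names of that length. The tag set, the enum names and lookup_tag() are hypothetical and do not reflect the actual output of gen-myhtmltags. The input is assumed to be lower case already.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical tag codes; a real table would cover many more tags.
enum TagCode {
    TAG_UNKNOWN, TAG_BR, TAG_LI, TAG_TD, TAG_TR,
    TAG_DIV, TAG_PRE, TAG_BODY, TAG_HEAD
};

struct TagEntry {
    const char* name;
    TagCode code;
};

// One bucket per token length, each sorted by name.
static const TagEntry len2[] = {
    {"br", TAG_BR}, {"li", TAG_LI}, {"td", TAG_TD}, {"tr", TAG_TR}
};
static const TagEntry len3[] = {{"div", TAG_DIV}, {"pre", TAG_PRE}};
static const TagEntry len4[] = {{"body", TAG_BODY}, {"head", TAG_HEAD}};

struct Bucket {
    const TagEntry* begin;
    std::size_t count;
};

// Indexed by token length; lengths with no tags have an empty bucket.
static const Bucket buckets[] = {
    {nullptr, 0}, {nullptr, 0}, {len2, 4}, {len3, 2}, {len4, 2}
};

TagCode lookup_tag(const std::string& tag) {
    const std::size_t len = tag.size();
    // Step 1: reject lengths which no known tag has.
    if (len >= sizeof(buckets) / sizeof(buckets[0])) return TAG_UNKNOWN;
    const Bucket& b = buckets[len];
    const TagEntry* end = b.begin + b.count;
    // Step 2: binary chop among the names of exactly this length.
    const TagEntry* it = std::lower_bound(
        b.begin, end, tag,
        [len](const TagEntry& e, const std::string& t) {
            return std::memcmp(e.name, t.data(), len) < 0;
        });
    if (it != end && std::memcmp(it->name, tag.data(), len) == 0)
        return it->code;
    return TAG_UNKNOWN;
}
```

The returned code can then be dispatched with an ordinary switch statement.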
1029 Thu Jan 31 06:21:22 GMT 2013 Olly Betts <olly@survex.com>
1031 * NEWS: Update from ChangeLog.
1033 Mon Jan 28 23:41:21 GMT 2013 Olly Betts <olly@survex.com>
1035 * utf8convert.cc,utf8converttest.cc: Always use our built-in
1036 conversion code for the character sets it can handle, and only use
1037 iconv as a fall-back. This gives us more consistent results, and
1038 in particular means we now handle BOMs better (at least with GNU
1041 Mon Jan 28 23:16:08 GMT 2013 Olly Betts <olly@survex.com>
1043 * utf8convert.cc,utf8converttest.cc: A lot of data labelled as
1044 "iso-8859-1" is actually "windows-1252". The two only differ
1045 in characters which are control characters in iso-8859-1, so
1046 assume the latter when we see the former.
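A minimal sketch of the idea in the entry above: bytes 0x80 to 0x9F are control characters in iso-8859-1 but mostly printable in windows-1252, so decoding them as windows-1252 recovers characters such as curly quotes and the euro sign. The function names are hypothetical and this is not the code in utf8convert.cc. Mapping the five bytes which windows-1252 leaves undefined to the matching C1 control code is an assumption.

```cpp
#include <string>

// Code points for windows-1252 bytes 0x80..0x9F.  The five undefined
// bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) map to the same-valued C1 control.
static const unsigned cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

// Appends a code point below U+10000 to out as UTF-8.
static void append_utf8(std::string& out, unsigned cp) {
    if (cp < 0x80) {
        out += char(cp);
    } else if (cp < 0x800) {
        out += char(0xC0 | (cp >> 6));
        out += char(0x80 | (cp & 0x3F));
    } else {
        out += char(0xE0 | (cp >> 12));
        out += char(0x80 | ((cp >> 6) & 0x3F));
        out += char(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: decodes data labelled iso-8859-1 as windows-1252.
std::string latin1_as_cp1252_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char ch : in) {
        unsigned cp = ch;
        // Only 0x80..0x9F differ between the two character sets.
        if (ch >= 0x80 && ch <= 0x9F) cp = cp1252_high[ch - 0x80];
        append_utf8(out, cp);
    }
    return out;
}
```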
1048 Mon Jan 28 23:13:24 GMT 2013 Olly Betts <olly@survex.com>
1050 * omindex.cc: Use charset name "iso-8859-1" in lower case
1053 Thu Jan 24 08:44:24 GMT 2013 Olly Betts <olly@survex.com>
1055 * Makefile.am: Ship jsonescape.h.
1057 Thu Jan 24 07:21:17 GMT 2013 Olly Betts <olly@survex.com>
1059 * jsonescape.cc: Add missing header includes.
1061 Thu Jan 24 03:08:39 GMT 2013 Olly Betts <olly@survex.com>
1063 * docs/omegascript.rst: Document $json and $jsonarray.
1065 Thu Jan 24 02:55:53 GMT 2013 Olly Betts <olly@survex.com>
1067 * Makefile.am,jsonescape.cc,jsonescape.h,jsonesctest.cc,query.cc:
1068 Add new $json and $jsonarray OmegaScript commands.
1070 Wed Jan 09 11:53:19 GMT 2013 Olly Betts <olly@survex.com>
1072 * NEWS: Update from ChangeLog and 1.2 branch.
1074 Fri Jan 04 02:21:05 GMT 2013 Olly Betts <olly@survex.com>
1076 * Makefile.am,docs/omegascript.rst,omindex.cc,query.cc,sample.cc,
1077 sample.h: Add $truncate command to break a string after a word.
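A rough sketch of breaking a string after a word, as the entry above describes. The function truncate_at_word() is hypothetical; the exact rules which $truncate follows (what counts as a word break, and what happens when there is no break) are not given here, so the choices below are assumptions.

```cpp
#include <string>

// Hypothetical helper: returns at most max_len bytes of s, preferring to
// cut at a space so that no word is split.
std::string truncate_at_word(const std::string& s,
                             std::string::size_type max_len) {
    if (s.size() <= max_len) return s;
    std::string::size_type cut = max_len;
    if (s[cut] != ' ') {
        // The cut would land inside a word, so back up to the space
        // before that word.  Assumption: if there is no space at all,
        // fall back to a hard cut at max_len.
        std::string::size_type sp = s.rfind(' ', cut);
        if (sp != std::string::npos) cut = sp;
    }
    // Drop any spaces left at the end of the result.
    while (cut > 0 && s[cut - 1] == ' ') --cut;
    return s.substr(0, cut);
}
```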
1079 Thu Jan 03 04:44:22 GMT 2013 Olly Betts <olly@survex.com>
1081 * commonhelp.cc: Tweak wording about default to match other options
1084 Thu Jan 03 04:07:57 GMT 2013 Olly Betts <olly@survex.com>
1086 * omindex.cc: Note default size limit on files to index is unlimited.
1087 Update --help to reflect that --sample-size now accepts the same
1088 formats as --max-size.
1090 Thu Jan 03 03:52:34 GMT 2013 Olly Betts <olly@survex.com>
1092 * omindex.cc: When generating a sample for a CSV file, limit the
1093 reserved size to the CSV file size as sample_size could be set
1094 really high by the user.
1096 Thu Jan 03 03:46:46 GMT 2013 Olly Betts <olly@survex.com>
1098 * NEWS: Update from ChangeLog.
1100 Tue Dec 18 04:49:41 GMT 2012 Olly Betts <olly@survex.com>
1102 * omindex.cc: Fix typo in previous change (2>dev/null should be
1105 Tue Dec 18 03:48:36 GMT 2012 Olly Betts <olly@survex.com>
1107 * docs/overview.rst,omindex.cc: Extend --filter to allow commands which
1108 produce HTML on stdout to be specified with it.
1110 Sun Dec 16 21:25:28 GMT 2012 Olly Betts <olly@survex.com>
1112 * diritor.cc: If libmagic hits ENOENT trying to classify a file, throw
1113 FileNotFound so we quietly skip the file in non-verbose mode.
1115 Sun Dec 16 21:23:25 GMT 2012 Olly Betts <olly@survex.com>
1117 * diritor.cc: MAGIC_MIME_TYPE was added in 4.22, so note that in the
1118 comment about its conditional use.
1120 Fri Dec 14 21:22:12 GMT 2012 Olly Betts <olly@survex.com>
1122 * Makefile.am: In automake, INCLUDES is now deprecated in favour of
1123 AM_CPPFLAGS so update to use the latter.
1125 Fri Dec 14 04:45:12 GMT 2012 Olly Betts <olly@survex.com>
1127 * omindex.cc: If md5_file() fails with ENOENT, assume the file was
1128 removed during indexing and only report this with --verbose as we
1129 do in other such cases.
1131 Fri Dec 14 04:42:34 GMT 2012 Olly Betts <olly@survex.com>
1133 * md5wrap.cc: If we get a read error while calculating the md5 checksum
1134 of a file, fail rather than returning the checksum of the file up to
1137 Fri Dec 14 04:35:52 GMT 2012 Olly Betts <olly@survex.com>
1139 * omindex.cc: Calculate the md5 from the loaded file contents when
1140 indexing SVG and Atom files. Use a const ref to avoid a string
1141 copy of the file contents for HTML and uncompressed ABI word.
1143 Fri Dec 14 04:24:17 GMT 2012 Olly Betts <olly@survex.com>
1145 * configure.ac,diritor.cc,diritor.h,loadfile.cc,loadfile.h,omindex.cc:
1146 If we open a file to index it, keep the fd around and use it with
1147 libmagic, provided magic_descriptor() is available.
1149 Tue Dec 11 03:35:29 GMT 2012 Olly Betts <olly@survex.com>
1151 * diritor.cc,diritor.h,omindex.cc: If we get ENOENT for a file or
1152 directory we're trying to index, assume it has been removed between
1153 us reading the directory entry for it and trying to open it, and
1154 only report the failure to index under --verbose.
1156 Wed Nov 21 04:06:54 GMT 2012 Olly Betts <olly@survex.com>
1158 * omindex.cc: Fix omindex not to segfault when -F option without a ':'
1161 Tue Sep 25 23:57:12 GMT 2012 Olly Betts <olly@survex.com>
1163 * Makefile.am,omindex.cc: Replace shell_protect() with
1164 append_filename_argument() from common/append_filename_arg.h.
1165 Extracting text using external filters now works for filenames
1166 containing a newline character.
1168 Thu Aug 09 20:07:59 GMT 2012 Dan Colish <dcolish@gmail.com>
1170 * Makefile.am,configure.ac: Allow users to configure with
1171 MAGIC_PREFIX for non-standard installs of libmagic
1173 Wed Jul 18 10:51:39 GMT 2012 Olly Betts <olly@survex.com>
1175 * urldecode.h: Fix to decode escaped character at the end of the
1177 * urlenctest.cc: Add regression testcase.
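A minimal sketch of URL decoding in which the bounds check lets a %XX escape be decoded even when it is the last thing in the string. The function url_decode() is hypothetical and is not the code in urldecode.h. Treating '+' as a space (as in form data) and passing malformed escapes through unchanged are assumptions.

```cpp
#include <cstddef>
#include <string>

// Returns the value of a hex digit, or -1 if ch is not one.
static int hex_value(char ch) {
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

std::string url_decode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        char ch = in[i];
        // An escape needs three bytes: '%' plus two hex digits.  Using
        // ">= 3" here (not "> 3") is what allows an escape at the very
        // end of the input to be decoded.
        if (ch == '%' && in.size() - i >= 3) {
            int hi = hex_value(in[i + 1]);
            int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += char(hi * 16 + lo);
                i += 2;
                continue;
            }
        }
        // Assumption: '+' stands for a space, as in form-encoded data.
        out += (ch == '+') ? ' ' : ch;
    }
    return out;
}
```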
1179 Sun Jul 01 10:53:35 GMT 2012 Olly Betts <olly@survex.com>
1181 * NEWS: Update from ChangeLog and 1.2 branch.
1183 Tue Jun 26 11:59:17 GMT 2012 Olly Betts <olly@survex.com>
1185 * configure.ac: Set link_all_deplibs_CXX=no on solaris, like we
1186 already do for xapian-core.
1188 Thu Jun 21 13:44:06 GMT 2012 Olly Betts <olly@survex.com>
1190 * xlsxparse.cc: Check for "uniquecount" parameter, not "uniqueCount" as
1191 we normalise parameter names to lower case.
1193 Thu Jun 21 13:42:35 GMT 2012 Olly Betts <olly@survex.com>
1195 * omindex.cc: unzip extracts files in the order they are in the
1196 archive, not the order they are on the command line, so call unzip
1197 twice when the order of extraction matters.
1199 Tue Jun 19 00:58:04 GMT 2012 Olly Betts <olly@survex.com>
1201 * Makefile.am,omindex.cc,opendocparse.cc,opendocparse.h,xmlparse.cc:
1202 Improve handling of headers and footers on OpenDocument documents.
1204 Mon Jun 18 07:02:20 GMT 2012 Olly Betts <olly@survex.com>
1206 * omindex.cc: Properly fix the "trim trailing formfeeds" code.
1208 Mon Jun 18 06:16:39 GMT 2012 Olly Betts <olly@survex.com>
1210 * omindex.cc: Tweak previous change.
1212 Mon Jun 18 05:49:07 GMT 2012 Olly Betts <olly@survex.com>
1214 * omindex.cc: Fix the "trim trailing formfeeds" code not to remove one
1217 Mon Jun 18 05:47:33 GMT 2012 Olly Betts <olly@survex.com>
1219 * omindex.cc,xlsxparse.cc,xlsxparse.h: Rework .xlsx parsing to
1220 substitute the shared strings into the positions they are used
1221 in, so that the sample actually matches what appears in the
1224 Mon Jun 18 04:49:12 GMT 2012 Olly Betts <olly@survex.com>
1226 * xlsxparse.cc,xlsxparse.h: Subclass XlsxParser directly from
1229 Mon Jun 18 03:16:43 GMT 2012 Olly Betts <olly@survex.com>
1231 * Makefile.am,omindex.cc,xlsxparse.cc,xlsxparse.h: Index calculated
1232 numbers from .xlsx files.
1234 Wed Jun 13 04:53:02 GMT 2012 Olly Betts <olly@survex.com>
1236 * omindex.cc: pdftotext outputs a formfeed between each page, which
1237 messes up our "empty body" check, so trim any trailing formfeeds
1240 Sat Jun 09 06:04:44 GMT 2012 Olly Betts <olly@survex.com>
1242 * Cherry pick changes from Mihai Bivol's GSoC snippets branch:
1243 * omindex.cc: Add option for the document sample size.
1244 * omindex.cc: Add short option for sample-size
1245 * omindex.cc: Make sample-size consistent with max-size
1247 Sat Jun 02 12:23:21 GMT 2012 Olly Betts <olly@survex.com>
1249 * INSTALL,Makefile.am,cgiparam.cc,configfile.cc,configure.ac,
1250 htmlparse.cc,omindex.cc,query.cc: Change `...' quoting in prose to
1253 Thu May 17 12:53:07 GMT 2012 Olly Betts <olly@survex.com>
1255 * htmlparsetest.cc,myhtmlparse.cc,myhtmlparse.h: Change parsing of
1256 multiple <body> tags and text outside of <body> to match the
1257 behaviour of modern web browsers. (ticket#599)
1259 Tue May 15 12:46:15 GMT 2012 Olly Betts <olly@survex.com>
1261 * configure.ac: Set link_all_deplibs_CXX=no on freebsd and openbsd,
1262 like we already do for xapian-core.
1264 Tue May 15 11:29:53 GMT 2012 Olly Betts <olly@survex.com>
1266 * NEWS: Update from ChangeLog and 1.2.10.
1268 Tue May 08 11:39:28 GMT 2012 Olly Betts <olly@survex.com>
1270 * runfilter.cc: Add cast to rlim_t, required for C++11 compatibility
1271 according to new error from GCC 4.7 (reported by Gaurav Arora).
1273 Tue May 08 11:32:48 GMT 2012 Olly Betts <olly@survex.com>
1275 * tmpdir.cc: Add safeunistd.h for rmdir, required by GCC 4.7 (reported
1278 Sat Apr 14 00:14:58 GMT 2012 Olly Betts <olly@survex.com>
1280 * atomparse.cc: For type="html", use the charset of the XML rather
1283 Fri Apr 13 23:36:48 GMT 2012 Olly Betts <olly@survex.com>
1285 * Makefile.am,atomparse.cc,atomparse.h,overview.rst,omindex.cc: Add
1286 support for atom feed files, patch from Mihai Bivol in ticket#595.
1287 * Makefile.am,atomparsetest.cc: Add tests for AtomParser.
1289 Thu Apr 05 14:09:28 GMT 2012 Olly Betts <olly@survex.com>
1291 * htmlparse.cc,htmlparsetest.cc: Add support for CDATA to HTML parser.
1293 Fri Mar 30 22:35:08 GMT 2012 Olly Betts <olly@survex.com>
1295 * NEWS: Fix "an warning" to "a warning" in old entry.
1297 Mon Mar 26 08:44:51 GMT 2012 Olly Betts <olly@survex.com>
1299 * omindex.cc: Add --max-size option, based on patch from ndaley in
1302 Wed Mar 14 02:27:59 GMT 2012 Olly Betts <olly@survex.com>
1304 * NEWS: Update for 1.3.0.
1306 Tue Mar 13 10:44:11 GMT 2012 Olly Betts <olly@survex.com>
1308 * NEWS: Update from 1.2.9 and ChangeLog.
1310 Mon Mar 12 10:55:57 GMT 2012 Olly Betts <olly@survex.com>
1312 * omindex.cc: If the document with the highest existing docid was
1313 updated, we'd previously report it as "added", but now we correctly
1314 report it as "updated".
1316 Mon Mar 12 10:50:55 GMT 2012 Olly Betts <olly@survex.com>
1318 * omindex.cc: Catch and report std::exception.
1320 Mon Feb 20 02:45:12 GMT 2012 Olly Betts <olly@survex.com>
1322 * docs/overview.rst,omindex.cc: More extensions to ignore by default:
1325 Sun Feb 19 22:20:49 GMT 2012 Olly Betts <olly@survex.com>
1327 * docs/overview.rst: Wrap over-long line.
1329 Thu Feb 16 06:52:24 GMT 2012 Olly Betts <olly@survex.com>
1331 * docs/overview.rst,omindex.cc: Add more extensions to the default
1332 ignore list: bin dat db jar lnk pyc pyo sqlite sqlite3 sqlite-journal
1335 Fri Jan 27 03:36:10 GMT 2012 Olly Betts <olly@survex.com>
1337 * docs/overview.rst,htmlparse.cc,htmlparsetest.cc: Add support for
1338 ignoring sections bracketed by <!--UdmComment--> and
1339 <!--/UdmComment--> like we already do for <!--htdig_noindex-->.
1340 Patch from Raphael Geissert.
1342 Fri Dec 23 05:44:08 GMT 2011 Olly Betts <olly@survex.com>
1344 * docs/overview.rst: Document that libmagic is used to determine
1345 the MIME type if the extension isn't known. Partly addresses
1348 Fri Dec 23 01:29:17 GMT 2011 Olly Betts <olly@survex.com>
1350 * docs/overview.rst: We now limit time as well as CPU and memory for
1353 Thu Dec 22 10:55:44 GMT 2011 Olly Betts <olly@survex.com>
1355 * query.cc: Drop special handling for R-prefixed terms in $prettyterm
1356 - we stopped generating these in Xapian 1.0.
1358 Thu Dec 22 03:50:30 GMT 2011 Olly Betts <olly@survex.com>
1360 * INSTALL,configure.ac,diritor.cc,diritor.h: Make libmagic a required
1363 Wed Dec 21 10:02:03 GMT 2011 Olly Betts <olly@survex.com>
1365 * query.cc: Change Xapian::weight to double.
1367 Wed Dec 21 05:25:40 GMT 2011 Olly Betts <olly@survex.com>
1369 * docs/cgiparams.rst,omega.cc,query.cc: Make DEFAULTOP default to AND
1370 rather than OR, since that matches what pretty much every search
1371 engine does these days. Closes ticket#512.
1373 Tue Dec 13 11:21:54 GMT 2011 Olly Betts <olly@survex.com>
1375 * NEWS: Update from 1.2.8 and ChangeLog.
1377 Fri Dec 09 14:08:04 GMT 2011 Olly Betts <olly@survex.com>
1379 * docs/omegascript.rst,query.cc,templates/emptydocs,templates/godmode,
1380 templates/query,urldecode.h,urlenctest.cc: Add new $prettyurl{}
1381 command which undoes RFC3986 URL escaping which doesn't affect
1382 semantics in practice. Partly addresses ticket#550.
1384 Thu Dec 08 08:19:26 GMT 2011 Olly Betts <olly@survex.com>
1386 * omindex.cc: Improve --help output (and man page which is generated
1387 from it). Closes bug#572.
1389 Thu Dec 08 04:51:12 GMT 2011 Olly Betts <olly@survex.com>
1391 * Makefile.am: Ship new header urldecode.h.
1393 Thu Dec 08 03:34:02 GMT 2011 Olly Betts <olly@survex.com>
1395 * Makefile.am,cgiparam.cc,urldecode.h,urlenctest.cc: Add new
1396 implementation of URL decoding - the old one didn't handle
1397 various corner cases well, and had two cut and pasted variants
1398 for handling input from a C string (GET) or from stdin (POST).
1399 Also add a new unit test program to test URL encoding and decoding.
1402 Tue Dec 06 13:30:45 GMT 2011 Olly Betts <olly@survex.com>
1404 * NEWS: Update from ChangeLog and to reflect backporting activity.
1406 Mon Dec 05 03:19:21 GMT 2011 Olly Betts <olly@survex.com>
1408 * scriptindex.cc: If no rules are found in the index script, report an
1409 error and give up - this is inevitably the result of a mistake, and
1410 adding empty documents to the database isn't helpful.
1412 Sat Oct 29 14:49:40 GMT 2011 Olly Betts <olly@survex.com>
1414 * docs/omegascript.rst: Add note to discourage use of percentage
1416 * templates/query: Don't show the percentage score in the default
1419 Fri Oct 14 12:36:43 GMT 2011 Olly Betts <olly@survex.com>
1421 * configure.ac,runfilter.cc: If we don't get any data from a filter
1422 for 5 minutes, give up - it has probably ended up blocked
1425 Mon Sep 26 01:22:08 GMT 2011 Olly Betts <olly@survex.com>
1427 * templates/query: HTML escape topterms.
1429 Mon Sep 26 00:52:42 GMT 2011 Olly Betts <olly@survex.com>
1431 * templates/godmode: HTML escape the contents of document values.
1433 Fri Sep 23 04:09:12 GMT 2011 Olly Betts <olly@survex.com>
1435 * Makefile.am,omindex.cc,tmpdir.cc,tmpdir.h: Factor out tmpdir handling
1436 into a separate source file.
1438 Fri Sep 23 01:49:38 GMT 2011 Olly Betts <olly@survex.com>
1440 * omindex.cc: Factor out index_mimetype() function as a step towards
1441 allowing indexing files within other files (like zip files and email
1444 Fri Sep 23 00:54:40 GMT 2011 Olly Betts <olly@survex.com>
1446 * omindex.cc: Use string::const_iterator where we don't modify the
1449 Thu Sep 01 12:28:36 GMT 2011 Olly Betts <olly@survex.com>
1451 * xapian-omega.spec.in: Package outlookmsg2html helper.
1453 Fri Aug 12 23:25:45 GMT 2011 Olly Betts <olly@survex.com>
1455 * NEWS: Update from 1.2.7 and ChangeLog.
1457 Fri Aug 12 23:17:09 GMT 2011 Olly Betts <olly@survex.com>
1459 * scriptindex.cc: MyHtmlParser::parse_html() no longer throws bool to
1460 stop parsing early, so we no longer need to catch it.
1462 Wed Aug 03 23:25:18 GMT 2011 Olly Betts <olly@survex.com>
1464 * configure.ac: Sync changes from xapian-core: Don't pass -Wshadow for
1465 GCC < 4.1; don't pass -Wstrict-null-sentinel for GCC 4.0.x; only
1466 enable symbol visibility on platforms where it is supported; remove
1467 now superfluous check for GCC >= 3. Also, add FIXME for enabling
1468 -Woverloaded-virtual.
1470 Wed Aug 03 06:27:06 GMT 2011 Olly Betts <olly@survex.com>
1472 * omindex.cc: Index title with an 'S' prefix rather than no prefix.
1473 * templates/query: Set up prefixes for 'author', 'title', and map
1474 no prefix so that terms from the title are still matched by default.
1476 Wed Aug 03 06:11:30 GMT 2011 Olly Betts <olly@survex.com>
1478 * docs/omegascript.rst,query.cc: Allow mapping a query string prefix to
1479 more than one term prefix (which xapian-core has supported since
1482 Fri Jul 29 01:47:44 GMT 2011 Olly Betts <olly@survex.com>
1484 * docs/omegascript.rst,query.cc: Add support for per-prefix stemmers.
1486 Thu Jul 28 13:23:26 GMT 2011 Olly Betts <olly@survex.com>
1488 * docs/omegascript.rst,omega.cc,omega.h,query.cc,query.h: Add support
1489 for search inputs for multiple probabilistic prefixes.
1491 Wed Jul 27 02:35:39 GMT 2011 Olly Betts <olly@survex.com>
1493 * scriptindex.cc: Add link to
1494 http://xapian.org/docs/omega/scriptindex.html to --help output (and
1495 so also to the man page which is generated from this).
1497 Tue Jul 26 05:54:52 GMT 2011 Olly Betts <olly@survex.com>
1499 * query.cc: Rearrange logic for discarding the RSet and forcing the
1502 Tue Jul 26 05:27:08 GMT 2011 Olly Betts <olly@survex.com>
1504 * query.cc: Remove support for OLDP CGI parameter which was superseded
1505 by xP approximately a decade ago, and isn't even documented.
1507 Mon Jul 04 06:20:03 GMT 2011 Olly Betts <olly@survex.com>
1509 * omega.cc,utils.cc,utils.h: Factor out trim() function.
1511 Mon Jul 04 06:14:05 GMT 2011 Olly Betts <olly@survex.com>
1513 * omega.cc: Avoid creating a temporary string object just to trim
1514 leading and/or trailing whitespace.
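A minimal sketch of trimming a string in place, so that no temporary string object is created. This trim() is illustrative only and is not the function from utils.cc. The set of whitespace characters is an assumption.

```cpp
#include <string>

// Removes leading and trailing whitespace from s without building a
// temporary copy.
void trim(std::string& s) {
    // Assumption: these four characters count as whitespace.
    const char* ws = " \t\r\n";
    std::string::size_type last = s.find_last_not_of(ws);
    if (last == std::string::npos) {
        // The string is empty or entirely whitespace.
        s.clear();
        return;
    }
    // Remove all trailing whitespace, not just all but one character.
    s.erase(last + 1);
    // There is at least one non-whitespace character, so this search
    // cannot return npos.
    s.erase(0, s.find_first_not_of(ws));
}
```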
1516 Mon Jul 04 06:08:47 GMT 2011 Olly Betts <olly@survex.com>
1518 * omega.cc: If P had trailing spaces, we would remove all but one -
1519 fixed to remove all of them!
1521 Wed Jun 22 15:32:12 GMT 2011 Olly Betts <olly@survex.com>
1523 * INSTALL: Pull in a few updates from the latest version of the
1524 automake document which this file was originally based on.
1525 Add in the missing copyright and licensing information.
1527 Thu Jun 16 15:42:31 GMT 2011 Olly Betts <olly@survex.com>
1529 * query.cc: Drop legacy support for handling '.' separated terms in
1530 OLDP - that changed in Omega 0.9.7, which is approaching 5 years
1533 Thu Jun 16 15:38:40 GMT 2011 Olly Betts <olly@survex.com>
1535 * query.cc: Improve $version output from "Xapian - xapian-omega 1.2.6"
1536 to "xapian-omega 1.2.6".
1537 * docs/omegascript.rst: Update example to match (and use less ancient
1540 Thu Jun 16 15:36:12 GMT 2011 Olly Betts <olly@survex.com>
1542 * dbi2omega: Remove uninteresting reference to 0.9.4.
1544 Mon Jun 13 14:25:45 GMT 2011 Olly Betts <olly@survex.com>
1546 * hashterm.cc: Avoid unnecessary temporary string object.
1548 Mon Jun 13 14:01:20 GMT 2011 Olly Betts <olly@survex.com>
1550 * hashterm.cc: Fix comment typo.
1552 Mon Jun 13 13:49:14 GMT 2011 Olly Betts <olly@survex.com>
1554 * xapian-omega.spec.in: We're ABI compatible within a release series
1555 so make dependency on xapian-core-libs >= rather than =.
1557 Mon Jun 13 12:30:29 GMT 2011 Olly Betts <olly@survex.com>
1559 * scriptindex.cc: Avoid unnecessary temporary string object.
1561 Mon Jun 13 12:24:32 GMT 2011 Olly Betts <olly@survex.com>
1563 * scriptindex.cc: Remove error warning that index=nopos was replaced
1564 with indexnopos - this was removed in 1.1.0 so there's been enough
1567 Mon Jun 13 09:56:29 GMT 2011 Olly Betts <olly@survex.com>
1569 * configure.ac: Update version to 1.3.0.
1571 Mon Jun 13 09:42:50 GMT 2011 Olly Betts <olly@survex.com>
1573 * docs/termprefixes.rst: Update reference to flint.
1575 Mon Jun 13 08:00:16 GMT 2011 Olly Betts <olly@survex.com>
1577 * docs/termprefixes.rst: Expand to document mapping a user prefix to
1578 multiple term prefixes.
1580 Mon Jun 13 03:23:47 GMT 2011 Olly Betts <olly@survex.com>
1582 * docs/overview.rst: Improve documentation of htdig_noindex.
1584 Sun Jun 12 11:52:29 GMT 2011 Olly Betts <olly@survex.com>
1586 * NEWS: Final update for 1.2.6.
1588 Fri Jun 10 12:02:32 GMT 2011 Olly Betts <olly@survex.com>
1590 * NEWS,configure.ac: Update in preparation for 1.2.6.
1592 Fri Jun 10 03:28:33 GMT 2011 Olly Betts <olly@survex.com>
1594 * templates/inc/anyallexactradio: Remove unused duplicate of
1597 Fri Jun 10 03:21:25 GMT 2011 Olly Betts <olly@survex.com>
1599 * configure.ac,omindex-config.cc,omindex-config.html: Strip out partly
1600 written and long untouched omindex-config utility.
1602 Thu Jun 09 14:20:46 GMT 2011 Olly Betts <olly@survex.com>
1604 * weight.cc: Fix a compiler warning (I failed to note the compiler
1607 Sun May 29 13:00:26 GMT 2011 Olly Betts <olly@survex.com>
1609 * templates/query: Make search query input type=search.
1611 Sun May 29 12:24:43 GMT 2011 Olly Betts <olly@survex.com>
1613 * templates/query: Autofocus the search query input (using HTML
1614 autofocus attribute with JavaScript fallback for older browsers).
1617 Wed May 25 14:33:18 GMT 2011 Olly Betts <olly@survex.com>
1619 * docs/omegascript.rst: Correct the documentation of the colours used by
1622 Fri May 13 05:50:35 GMT 2011 Olly Betts <olly@survex.com>
1624 * docs/overview.rst: Add using unoconv as a more complex example of
1625 using --filter (ticket#324).
1627 Wed Apr 20 07:00:56 GMT 2011 Olly Betts <olly@survex.com>
1629 * NEWS: Fix typo; clarify wording.
1631 Mon Apr 04 13:58:06 GMT 2011 Olly Betts <olly@survex.com>
1633 * NEWS: Update release date.
1635 Mon Apr 04 13:53:34 GMT 2011 Olly Betts <olly@survex.com>
1637 * templates/xml: Fix syntax error from recent edit.
1639 Sun Apr 03 10:54:04 GMT 2011 Olly Betts <olly@survex.com>
1641 * NEWS,configure.ac: Update for 1.2.5.
1643 Sat Apr 02 14:15:32 GMT 2011 Olly Betts <olly@survex.com>
1645 * templates/query: Use $add{$field{modtime}} to ensure it is numeric.
1647 Sat Apr 02 14:14:06 GMT 2011 Olly Betts <olly@survex.com>
1649 * templates/godmode: More missing escaping.
1651 Sat Apr 02 14:07:45 GMT 2011 Olly Betts <olly@survex.com>
1653 * templates/xml: Remove double escaping.
1655 Sat Apr 02 13:58:44 GMT 2011 Olly Betts <olly@survex.com>
1657 * templates/query: More escaping fixes.
1659 Sat Apr 02 13:55:03 GMT 2011 Olly Betts <olly@survex.com>
1661 * templates/emptydocs,templates/opensearch,templates/xml: More missing
1664 Sat Apr 02 12:34:42 GMT 2011 Olly Betts <olly@survex.com>
1666 * templates/query: Add missing escaping.
1668 Sat Apr 02 11:48:43 GMT 2011 Olly Betts <olly@survex.com>
1670 * templates/godmode: Add missing escaping.
1672 Sat Apr 02 10:34:58 GMT 2011 Olly Betts <olly@survex.com>
1674 * templates/xml: Remove support for undocumented HILITECLASS CGI
1675 variable. There's no evidence I can find using Google code search
1676 or web search that this has been used anywhere, and it's problematic
1679 Sat Mar 26 14:51:36 GMT 2011 Olly Betts <olly@survex.com>
1681 * INSTALL: Copy new Multi-Arch section from xapian-core/INSTALL.
1682 Replace VPATH section with better equivalent from
1683 xapian-core/INSTALL.
1685 Wed Mar 23 15:21:41 GMT 2011 Olly Betts <olly@survex.com>
1687 * htmlparse.cc,htmlparse.h,htmlparsetest.cc,metaxmlparse.cc,
1688 metaxmlparse.h,myhtmlparse.cc,myhtmlparse.h,omindex.cc,svgparse.cc,
1689 svgparse.h,xmlparse.cc,xmlparse.h,xpsxmlparse.cc,xpsxmlparse.h:
1690 Instead of throwing a bool to abandon parsing, change methods to
1691 return bool to signify if they want to continue parsing or not.
1692 This is a bit faster (~0.23% for indexing a lot of HTML files).
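A minimal sketch of the pattern described in the entry above: a callback returns bool to say whether parsing should continue, rather than throwing to abandon it. The class and function names are hypothetical and are much simpler than the real parser classes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback interface for a tag-based parser.
class TagHandler {
  public:
    virtual ~TagHandler() {}
    // Return false to ask the parser to stop, true to carry on.
    virtual bool opening_tag(const std::string& tag) = 0;
};

// Feeds tags to the handler and stops as soon as a callback returns
// false.  Returns the number of tags delivered.
std::size_t parse_tags(const std::vector<std::string>& tags,
                       TagHandler& handler) {
    std::size_t seen = 0;
    for (const std::string& tag : tags) {
        ++seen;
        if (!handler.opening_tag(tag)) break;
    }
    return seen;
}

// Example handler which stops parsing once it sees a body tag.
class StopAtBody : public TagHandler {
  public:
    bool opening_tag(const std::string& tag) override {
        return tag != "body";
    }
};
```

The stop request travels back through an ordinary return value, so the common "keep going" path involves no exception handling at all.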
1694 Mon Mar 21 05:48:08 GMT 2011 Olly Betts <olly@survex.com>
1696 * myhtmlparse.cc,myhtmlparse.h,omindex.cc: Add --ignore-exclusions
1697 option, which will index HTML files despite meta robots tags, etc -
1698 omindex is often used in environments where such exclusions aren't
1701 Fri Mar 18 10:24:58 GMT 2011 Olly Betts <olly@survex.com>
1703 * omindex.cc: Just report the mimetype as unknown instead of saying
1704 "unknown Office 2007 MIME subtype".
1706 Fri Mar 18 05:53:21 GMT 2011 Olly Betts <olly@survex.com>
1708 * diritor.h: Avoid using S_IRUSR, etc under __WIN32__.
1710 Fri Mar 18 03:00:16 GMT 2011 Olly Betts <olly@survex.com>
1712 * docs/overview.rst,omindex.cc: Ignore *.css and *.js by default too.
1714 Thu Mar 17 23:34:07 GMT 2011 Olly Betts <olly@survex.com>
1716 * omindex.cc: For skip messages which are only to be shown in verbose
1717 mode, call skip with new SKIP_VERBOSE_ONLY flag. Pass new
1718 SKIP_SHOW_FILENAME flag for skip messages shown before we say what
1719 file we are indexing so we know to show the filename even in verbose
1722 Thu Mar 17 03:47:54 GMT 2011 Olly Betts <olly@survex.com>
1724 * omindex.cc: Restore handling of exceptions from
1725 DirectoryIterator::get_type(), and handle exceptions from
1726 DirectoryIterator::next() which ended up at the top level
1727 before (though they probably never happen, at least on Linux).
1729 Wed Mar 16 06:19:01 GMT 2011 Olly Betts <olly@survex.com>
1731 * omindex.cc: Push all the code associated with indexing a file into
1734 Wed Mar 16 02:55:53 GMT 2011 Olly Betts <olly@survex.com>
1736 * omindex.cc: Push try block around index_file() call into the
1739 Wed Mar 16 02:51:52 GMT 2011 Olly Betts <olly@survex.com>
1741 * omindex.cc: Factor out handling for skipping files, and improve
1742 these messages by consistently reporting the filename.
1744 Tue Mar 15 12:47:12 GMT 2011 Olly Betts <olly@survex.com>
1746 * docs/Makefile.am,docs/index.rst: Add index page which links to all
1747 the other documentation pages.
1749 Tue Mar 15 12:20:30 GMT 2011 Olly Betts <olly@survex.com>
1751 * omindex.cc: Add --empty-docs option to allow documents we extract
1752 no body text from to be indexed (existing behaviour), skipped, or
1753 reported and then indexed.
1755 Fri Mar 04 14:13:47 GMT 2011 Olly Betts <olly@survex.com>
1757 * docs/omegascript.rst: Minor improvements.
1759 Wed Mar 02 11:17:42 GMT 2011 Olly Betts <olly@survex.com>
1763 Wed Mar 02 06:14:41 GMT 2011 Olly Betts <olly@survex.com>
1765 * docs/termprefixes.rst: New standard prefix E for filename extension.
1766 * omindex.cc: Index file extension as E-prefixed term.
1768 Mon Feb 28 13:45:32 GMT 2011 Olly Betts <olly@survex.com>
1770 * omindex.cc: Tell xls2csv not to quote fields and to put spaces
1771 not commas between them. Fixes indexing of numeric fields, and
1772 means we don't need to use our CSV parser to get a sample.
1774 Mon Feb 28 12:10:53 GMT 2011 Olly Betts <olly@survex.com>
1776 * xmlparse.cc: Add whitespace between chunks of text extracted from
1777 Microsoft Office 2007 formats.
1779 Wed Feb 23 12:34:28 GMT 2011 Olly Betts <olly@survex.com>
1781 * templates/xml: Try $field{caption} (which is what omindex sets)
1782 before $field{title} when getting a value for the hit tag's title
1783 attribute - this is consistent with how the query template gets the
1784 title. Add new type attribute which gives $field{type}.
1786 Thu Feb 17 05:19:28 GMT 2011 Olly Betts <olly@survex.com>
1788 * templates/xml: Add DBSize attribute to <result> element.
1790 Wed Feb 16 03:19:57 GMT 2011 Olly Betts <olly@survex.com>
1792 * Makefile.am,omindex.cc,query.cc,urlencode.cc,urlencode.h: Update
1793 URL encoding to follow RFC3986.
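A minimal sketch of percent-encoding based on the RFC 3986 "unreserved" set. The function url_encode_path() is hypothetical and is not the code in urlencode.cc. Leaving '/' unescaped, so that a file path keeps its structure, is an assumption.

```cpp
#include <string>

// RFC 3986 "unreserved" characters never need escaping.
static bool is_unreserved(unsigned char ch) {
    return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') ||
           (ch >= '0' && ch <= '9') ||
           ch == '-' || ch == '.' || ch == '_' || ch == '~';
}

// Hypothetical helper: percent-encodes everything in path except
// unreserved characters and '/'.
std::string url_encode_path(const std::string& path) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(path.size());
    for (unsigned char ch : path) {
        if (is_unreserved(ch) || ch == '/') {
            out += char(ch);
        } else {
            // Reserved characters such as '#' and '?' end up escaped, so
            // they can't be mistaken for a fragment or query delimiter.
            out += '%';
            out += hex[ch >> 4];
            out += hex[ch & 0x0F];
        }
    }
    return out;
}
```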
1795 Tue Feb 15 03:20:40 GMT 2011 Olly Betts <olly@survex.com>
1797 * omindex.cc: Encode reserved characters in URLs - now links to
1798 files with names containing '#' and '?' will work.
1800 Sun Jan 23 13:27:48 GMT 2011 Olly Betts <olly@survex.com>
1802 * docs/overview.rst,omindex.cc: Later Microsoft Works versions produce
1803 .xlr spreadsheet files, which are apparently XL files with a
1804 different extension, so handle them as XL files.
1806 Thu Jan 20 11:07:46 GMT 2011 Olly Betts <olly@survex.com>
1808 * docs/omegascript.rst,omega.cc,query.cc,templates/query: Allow
1809 QueryParser flags to be set from OmegaScript (ticket#418).
1811 Sat Jan 15 11:14:32 GMT 2011 Olly Betts <olly@survex.com>
1813 * NEWS: Update from ChangeLog, 1.0.22 and 1.0.23.
1815 Wed Jan 12 02:21:59 GMT 2011 Olly Betts <olly@survex.com>
1817 * query.cc: Fix double Content-Type header in some error reporting
1818 situations (regression introduced in 1.2.4).
1820 Mon Jan 10 10:00:00 GMT 2011 Olly Betts <olly@survex.com>
1822 * omindex.cc,pkglibbindir.cc,pkglibbindir.h: Fix typo in function name
1823 (get_pkglibdindir() -> get_pkglibbindir()).
1825 Mon Jan 10 09:50:38 GMT 2011 Olly Betts <olly@survex.com>
1827 * diritor.cc,diritor.h: Don't define or try to set euid member of
1828 DirectoryIterator on platforms where we aren't going to use it.
1830 Mon Jan 10 09:15:24 GMT 2011 Olly Betts <olly@survex.com>
1832 * diritor.h: Stub out get_owner() and get_group() for __WIN32__.
1834 Fri Dec 24 10:35:29 GMT 2010 Olly Betts <olly@survex.com>
1836 * NEWS: Update from ChangeLog.
1838 Thu Dec 23 01:53:06 GMT 2010 Olly Betts <olly@survex.com>
1840 * diritor.cc: Fix to work with older libmagic which doesn't have
1841 MAGIC_MIME_TYPE (e.g. on Ubuntu hardy).
1843 Sun Dec 19 12:39:23 GMT 2010 Olly Betts <olly@survex.com>
1845 * NEWS,configure.ac: 1.2.4.
1847 Sun Dec 19 12:37:58 GMT 2010 Olly Betts <olly@survex.com>
1849 * query.cc: Disable permission filtering based on $REMOTE_USER as that
1850 will break some existing installations if users upgrade, which we
1851 don't want. Probably this should be specifiable from OmegaScript
1852 but it's not worth delaying 1.2.4 while we sort this out.
1854 Sun Dec 19 02:46:17 GMT 2010 Olly Betts <olly@survex.com>
1856 * docs/overview.rst,omindex.cc: Change the new name for
1857 "--preserve-unupdated" from "--preserve-removed" to "--no-delete".
1859 Sun Dec 19 02:32:29 GMT 2010 Olly Betts <olly@survex.com>
1861 * query.cc: Fix comment typo.
1863 Fri Dec 17 12:45:47 GMT 2010 Olly Betts <olly@survex.com>
1865 * commonhelp.cc,commonhelp.h,omindex.cc,scriptindex.cc: Swap the
1866 meanings of -v and -V in omindex for consistency with scriptindex
1867 and typical short options for --verbose and --version in other
1868 packages. For backward compatibility, "omindex -v" is handled
1869 specially and still reports the version.
1871 Fri Dec 17 08:31:29 GMT 2010 Olly Betts <olly@survex.com>
1873 * utf8convert.cc: Fix built in converter to handle space in charset
1874 names, which fixes failing utf8converttest when iconv isn't
1877 Fri Dec 17 05:36:36 GMT 2010 Olly Betts <olly@survex.com>
1879 * utf8convert.cc: Rework the fixing up of charset names which iconv()
1880 doesn't understand a little.
1882 Thu Dec 16 06:35:46 GMT 2010 Olly Betts <olly@survex.com>
1884 * loadfile.cc: If fstat() fails, preserve the errno value rather than
1885 letting close() clobber it.
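
A sketch of the usual way to do this, assuming a POSIX system. The helper name close_preserving_errno() is hypothetical and is not taken from loadfile.cc:

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper: close fd without disturbing the errno value set by an
// earlier failing call such as fstat().
void close_preserving_errno(int fd)
{
    int saved_errno = errno;
    (void)close(fd);      // may itself fail and overwrite errno
    errno = saved_errno;  // report the original failure to the caller
}
```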
1887 Thu Dec 16 06:31:30 GMT 2010 Olly Betts <olly@survex.com>
1889 * loadfile.cc: Fix file descriptor leak if load_file() is called on
1890 something which isn't a file (found by cppcheck run on the Debian
1891 archive). This case probably couldn't occur in omindex, but could if
1892 you used the LOADFILE action in scriptindex.
1894 Thu Dec 09 10:58:48 GMT 2010 Olly Betts <olly@survex.com>
1896 * docs/omegascript.rst: Replace $simplecommand with $query - a concrete
1897 example is more useful. Improve mark-up.
1898 * docs/termprefixes.rst: Remove mention of pre-0.9.7 use of W prefix.
1900 Thu Nov 18 12:25:50 GMT 2010 Olly Betts <olly@survex.com>
1902 * omega.cc: Fix reversed condition in recent exception reporting fix.
1904 Wed Nov 17 03:46:24 GMT 2010 Olly Betts <olly@survex.com>
1906 * diritor.cc: Add missing magic_cookie argument to calls to
1909 Sat Nov 13 12:17:51 GMT 2010 Olly Betts <olly@survex.com>
1911 * omindex.cc: Build up document data with += for efficiency.
1913 Sat Nov 13 12:08:09 GMT 2010 Olly Betts <olly@survex.com>
1915 * omindex.cc: Index author with A prefix.
1917 Sat Nov 13 12:00:50 GMT 2010 Olly Betts <olly@survex.com>
1919 * omindex.cc: A file extension can't contain a '/'.
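
The check can be sketched like this. file_extension() is a hypothetical name, and the real code in omindex.cc may be structured differently:

```cpp
#include <string>

// Hypothetical helper: return the extension of the leafname (without the
// dot), or an empty string if there is none.
std::string file_extension(const std::string& path)
{
    std::string::size_type dot = path.rfind('.');
    if (dot == std::string::npos) return std::string();
    // A '/' after the last '.' means the dot belongs to a directory name.
    if (path.find('/', dot + 1) != std::string::npos) return std::string();
    return path.substr(dot + 1);
}
```

Without the '/' check, a path like "backup.d/README" would be treated as having the extension "d/README".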
1921 Sat Nov 13 11:50:31 GMT 2010 Olly Betts <olly@survex.com>
1923 * omindex.cc: Index the leafname of the file (without any extension) as
1924 if it contained additional keywords.
1926 Sat Nov 13 11:32:09 GMT 2010 Olly Betts <olly@survex.com>
1928 * omindex.cc: If a filter command isn't installed, flag this in the
1929 commands map so we don't try running this command again for any
1930 file with the same mimetype (previously we'd rerun it for a different
1931 extension which gave the same mimetype).
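
A rough sketch of the idea, not the omindex.cc implementation. CommandMap, try_filter() and the run callback are hypothetical. Treating exit status 127 as "command not found", and an empty command as the "not installed" flag, are assumptions made for the example:

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical sketch: commands maps a MIME type to its filter command; an
// empty command marks a filter which was found to be missing.
typedef std::map<std::string, std::string> CommandMap;

// run is a stand-in for whatever actually executes the filter and returns
// its exit status; 127 is assumed to mean "command not found".
bool try_filter(CommandMap& commands, const std::string& mimetype,
                const std::function<int(const std::string&)>& run)
{
    CommandMap::iterator i = commands.find(mimetype);
    if (i == commands.end() || i->second.empty()) return false;
    int status = run(i->second);
    if (status == 127) {
        i->second.clear();  // flag as not installed; skipped from now on
        return false;
    }
    return status == 0;
}
```

Because the flag is keyed on the MIME type rather than the extension, a second extension which maps to the same type does not cause the missing command to be run again.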
1933 Fri Nov 12 09:11:35 GMT 2010 Olly Betts <olly@survex.com>
1935 * Makefile.am,configure.ac: Add -no-undefined to AM_LDFLAGS on
1936 platforms which need it to dynamically link such as cygwin (the
1937 need for this was noted in ticket#282).
1939 Fri Nov 12 03:35:56 GMT 2010 Olly Betts <olly@survex.com>
1941 * omindex.cc: Report MIME type if it's unknown to us. Remove debug
1942 output line. Update comments.
1944 Fri Nov 12 03:32:27 GMT 2010 Olly Betts <olly@survex.com>
1946 * diritor.cc: Report errors from libmagic.
1948 Fri Nov 12 02:58:20 GMT 2010 Olly Betts <olly@survex.com>
1950 * diritor.cc,diritor.h: Fix to compile when libmagic is detected.
1952 Fri Nov 12 01:40:24 GMT 2010 Olly Betts <olly@survex.com>
1954 * diritor.cc: Add missing class qualifier to method definition.
1956 Fri Nov 12 01:25:11 GMT 2010 Olly Betts <olly@survex.com>
1958 * INSTALL: Mention libmagic in install instructions.
1960 Fri Nov 12 01:16:21 GMT 2010 Olly Betts <olly@survex.com>
1962 * Makefile.am,configure.ac,diritor.cc,diritor.h,omindex.cc: Optionally
1963 use libmagic to detect MIME types for files for which we have no
1964 extension mapping, which allows us to handle files with a misleading
1965 extension, and files with no extension. (ticket#114)
1967 Thu Nov 11 23:23:07 GMT 2010 Olly Betts <olly@survex.com>
1969 * omindex.cc: Refactor slightly to handle the unknown extension case
1970 up front, so we lose an indentation level for the known extension
1973 Thu Nov 11 12:25:03 GMT 2010 Olly Betts <olly@survex.com>
1975 * omindex.cc: Add new --filter option to allow the user to specify
1976 new filters without patching omindex.cc.
1977 * docs/overview.rst: Document --filter.
1979 Thu Nov 11 02:51:55 GMT 2010 Olly Betts <olly@survex.com>
1981 * omindex.cc: Factor out handling for external filter programs which
1982 simply return UTF-8 text on stdout.
1984 Mon Nov 08 10:58:46 GMT 2010 Olly Betts <olly@survex.com>
1986 * omindex.cc,svgparse.cc,svgparse.h: Extract author for SVG files.
1988 Mon Nov 08 10:40:09 GMT 2010 Olly Betts <olly@survex.com>
1990 * omindex.cc: Extract metadata from Microsoft Office 2007 file formats.
1992 Mon Nov 08 10:21:13 GMT 2010 Olly Betts <olly@survex.com>
1994 * myhtmlparse.cc,myhtmlparse.h,omindex.cc: Extract author from HTML
1997 Mon Nov 08 09:46:03 GMT 2010 Olly Betts <olly@survex.com>
1999 * omindex.cc: Escape wildcard patterns being passed to unzip - in the
2000 unlikely event that one of these matched files in or under the
2001 current directory, we might fail to extract all the files we wanted
2004 Mon Nov 08 05:03:41 GMT 2010 Olly Betts <olly@survex.com>
2006 * metaxmlparse.cc,metaxmlparse.h,omindex.cc: Extract author from
2007 OpenDocument documents.
2009 Mon Nov 08 03:18:26 GMT 2010 Olly Betts <olly@survex.com>
2011 * omindex.cc: Extract author from PDF metadata.
2013 Mon Nov 08 03:15:17 GMT 2010 Olly Betts <olly@survex.com>
2015 * metaxmlparse.h: Initialise field member variable.
2017 Mon Nov 08 00:28:07 GMT 2010 Olly Betts <olly@survex.com>
2019 * omindex.cc: Index text in headers and footers for .odt and .docx
2022 Thu Nov 04 11:55:58 GMT 2010 Olly Betts <olly@survex.com>
2024 * omega.cc,omega.h,query.cc: If we catch an error early on, make sure
2025 that if it's appropriate, we write out a "Content-Type:" HTTP header
2026 and end the headers.
2028 Thu Nov 04 11:39:10 GMT 2010 Olly Betts <olly@survex.com>
2030 * utf8converttest.cc: Add back in testcases for charset names with
2033 Thu Nov 04 09:01:43 GMT 2010 Olly Betts <olly@survex.com>
2035 * utils.cc: Fix misuse of BUFSIZE which should be sizeof(buf) (issue
2036 reported by compilation with CPPFLAGS=-D_GLIBCXX_DEBUG).
2038 Thu Nov 04 09:01:08 GMT 2010 Richard Boulton <richard@tartarus.org>
2040 * utf8convert.cc,utf8converttest.cc: If iconv can't handle a
2041 charset, check if it's of the form (UTF|UCS)[_ ]?.* and if so,
2042 convert to the official hyphenated form. Should fix failure of
2043 utf8converttest on OSX, where it fails due to iconv not
2046 Tue Nov 02 09:48:19 GMT 2010 Olly Betts <olly@survex.com>
2048 * diritor.cc,diritor.h,loadfile.cc,loadfile.h,md5wrap.cc,md5wrap.h,
2049 omindex.cc,scriptindex.cc: Use O_NOATIME if available and either the
2050 file is owned by the current euid, or the current euid is 0 (i.e.
2051 we're running as root). Fixes ticket#222.
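
A sketch of the ownership test described above, assuming a POSIX system where O_NOATIME may or may not be defined. open_for_indexing() is a hypothetical name, and the retry without the flag is an assumption rather than something this entry states:

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical sketch: open path read-only, asking the kernel not to update
// the access time when that is permitted.
int open_for_indexing(const char* path)
{
#ifdef O_NOATIME
    struct stat st;
    if (stat(path, &st) == 0) {
        uid_t euid = geteuid();
        // O_NOATIME is only allowed for the file's owner or for root.
        if (euid == 0 || st.st_uid == euid) {
            int fd = open(path, O_RDONLY | O_NOATIME);
            if (fd >= 0) return fd;
            // Otherwise fall through and retry without the flag.
        }
    }
#endif
    return open(path, O_RDONLY);
}
```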
2053 Fri Oct 29 14:26:25 GMT 2010 Olly Betts <olly@survex.com>
2055 * omindex.cc: Use the CSV parser to generate a nicer sample for files
2056 of type application/vnd.ms-excel.
2058 Fri Oct 29 09:26:52 GMT 2010 Olly Betts <olly@survex.com>
2060 * Makefile.am: Put $(PCRE_LIBS) in libtransform_la_LIBADD rather than
2061 omega_LDADD (more correct, but probably doesn't actually make any
2064 Thu Oct 28 14:46:11 GMT 2010 Olly Betts <olly@survex.com>
2066 * omindex.cc: Disable more output unless --verbose is specified. Don't
2067 flush the "Indexing" partial message until we get to the potentially
2068 time consuming actions.
2070 Thu Oct 28 13:54:44 GMT 2010 Olly Betts <olly@survex.com>
2072 * docs/overview.rst: Improve mark-up, and tweak wording in a few
2075 Thu Oct 28 13:46:36 GMT 2010 Olly Betts <olly@survex.com>
2077 * docs/overview.rst: Update docs for --duplicates and
2080 Thu Oct 28 13:27:01 GMT 2010 Olly Betts <olly@survex.com>
2082 * omindex.cc: Deprecate "--preserve-nonduplicates" in favour of new
2083 long option "--preserve-removed" which does the same thing, but has
2084 a (hopefully) clearer name. Rename the variable it controls from
2085 preserve_unupdated to delete_removed_documents (with the opposite
2088 Thu Oct 28 12:08:59 GMT 2010 Olly Betts <olly@survex.com>
2090 * configfile.cc: Only append '/' to directory values if they don't
2091 already have a trailing '/'.
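
In outline, and with the hypothetical helper name with_trailing_slash() (configfile.cc may do this inline):

```cpp
#include <string>

// Hypothetical helper: append '/' to a directory value unless it already
// ends with one. An empty value also gains a '/' here, which is an
// assumption made for this sketch.
std::string with_trailing_slash(std::string dir)
{
    if (dir.empty() || dir[dir.size() - 1] != '/') dir += '/';
    return dir;
}
```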
2093 Thu Oct 28 11:49:54 GMT 2010 Olly Betts <olly@survex.com>
2095 * runfilter.cc: Make the memory limit for filter processes the size
2096 of physical memory, not 7/8 of this value, which is a little less
2097 arbitrary (ticket#424).
2099 Thu Oct 28 11:47:38 GMT 2010 Olly Betts <olly@survex.com>
2101 * omindex.cc: Under --duplicates=ignore, fix so that old documents which
2102 aren't seen get deleted, which wasn't implemented before (to suppress
2103 this deletion, pass -p as well).
2105 Thu Oct 28 10:38:21 GMT 2010 Olly Betts <olly@survex.com>
2107 * omindex.cc: Track how many documents in the index we haven't seen
2108 in this index run - if this is 0, we don't need to check for docs
2109 to delete at all; otherwise we can at least use it to know when we
2110 have found them all. Use a PostingIterator over all documents to
2111 avoid having to catch exceptions from delete_document() for gaps
2114 Thu Oct 28 04:52:36 GMT 2010 Olly Betts <olly@survex.com>
2116 * omindex.cc: Add quotes around directory name in "Entering directory"
2117 message. Add directory name to "skipping directory" error message.
2119 Thu Oct 28 04:50:37 GMT 2010 Olly Betts <olly@survex.com>
2121 * omindex.cc: Document --verbose in --help. Actually recognise -V.
2123 Thu Oct 28 04:01:31 GMT 2010 Olly Betts <olly@survex.com>
2125 * omindex.cc: Move the directory iteration loop out of the try/catch
2126 block for starting the iteration, which means it's indented by a
2129 Thu Oct 28 03:47:30 GMT 2010 Olly Betts <olly@survex.com>
2131 * omindex.cc: Add --verbose option, and disable the less interesting
2132 output unless it is specified.
2134 Thu Oct 28 03:34:44 GMT 2010 Olly Betts <olly@survex.com>
2136 * omindex.cc: Eliminate the message "Caught unknown exception in
2137 index_directory, rethrowing" as it isn't actually informative.
2139 Thu Oct 28 01:43:44 GMT 2010 Olly Betts <olly@survex.com>
2141 * omindex.cc: Variable dbpath doesn't need to be global.
2143 Thu Oct 28 01:28:10 GMT 2010 Olly Betts <olly@survex.com>
2145 * omindex.cc: The Host and Path terms are the same for every document
2146 in a single invocation of omindex, so calculate them just once up
2149 Thu Oct 28 01:13:36 GMT 2010 Olly Betts <olly@survex.com>
2151 * omindex.cc: Eliminate the leading slash on filenames in output, so
2152 they are now relative filenames on the system. This also simplifies
2153 path building internally.
2155 Wed Oct 27 09:51:51 GMT 2010 Olly Betts <olly@survex.com>
2157 * omindex.cc: Use rpm's --qf option to produce output which is simpler
2160 Wed Oct 27 09:32:22 GMT 2010 Olly Betts <olly@survex.com>
2162 * docs/overview.rst,omindex.cc: Add support for indexing RPM packages
2165 Wed Oct 27 06:07:59 GMT 2010 Olly Betts <olly@survex.com>
2167 * docs/overview.rst,omindex.cc: Add support for indexing Debian package
2168 files (ticket #493).
2170 Wed Oct 27 05:37:02 GMT 2010 Olly Betts <olly@survex.com>
2172 * docs/overview.rst,omindex.cc: Quietly ignore files with mimetype set
2173 to "ignore". The initial list of extensions set to ignore is:
2174 .a .dll .dylib .exe .lib .o .obj .so
2176 Wed Oct 27 02:25:01 GMT 2010 Olly Betts <olly@survex.com>
2178 * omindex.cc: Report get_description() for Xapian exceptions, which
2179 provides additional information beyond get_msg().
2181 Wed Oct 27 01:56:08 GMT 2010 Olly Betts <olly@survex.com>
2183 * omindex.cc,query.cc,values.h: Add file size as a value, and set up a
2184 NumberValueRangeProcessor so size: works in the query (has to be in
2187 Wed Oct 27 01:31:25 GMT 2010 Olly Betts <olly@survex.com>
2189 * scriptindex.cc: Report get_description() for Xapian exceptions, which
2190 provides additional information beyond get_msg().
2192 Tue Oct 26 12:00:58 GMT 2010 Olly Betts <olly@survex.com>
2194 * docs/overview.rst: Document the new emptydocs template.
2196 Tue Oct 26 11:51:31 GMT 2010 Olly Betts <olly@survex.com>
2198 * docs/omegascript.rst,query.cc: Add new $emptydocs command which
2199 returns a list of documents with doclength zero.
2200 * query.cc: Extend $field to take an optional DOCID argument, rather
2201 than always using the context from $hitlist.
2202 * templates/emptydocs: New template which lists documents with
2205 Thu Oct 21 12:05:23 GMT 2010 Olly Betts <olly@survex.com>
2207 * configure.ac,unixperm.cc: Fix to build on platforms where
2208 getgrouplist() exists but takes int* not gid_t* (e.g. Mac OS X).
2210 Wed Oct 20 10:30:13 GMT 2010 Olly Betts <olly@survex.com>
2212 * omindex.cc,scriptindex.cc: Add boolean terms with add_boolean_term()
2213 so they get wdf of 0 and don't contribute to document length.
2215 Sat Oct 16 06:13:23 GMT 2010 Olly Betts <olly@survex.com>
2217 * configure.ac: Probe for any options needed to enable large file
2218 support. Handling files >= 2GB isn't especially useful, but more
2219 importantly this is needed to allow omindex to index files on filing
2220 systems with 64 bit inodes on some platforms (e.g. 32-bit Linux).
2222 Mon Oct 11 11:11:07 GMT 2010 Olly Betts <olly@survex.com>
2224 * Makefile.am: Drop special case to remove man pages on "make clean"
2227 Wed Sep 29 04:14:21 GMT 2010 Olly Betts <olly@survex.com>
2229 * Makefile.am,configure.ac,query.cc,unixperm.cc,unixperm.h: Pull out
2230 permission checks into a separate file and check Unix user and group
2231 permissions based on environment variable REMOTE_USER, if set.
2233 Tue Sep 28 08:06:00 GMT 2010 Olly Betts <olly@survex.com>
2235 * Makefile.am: Ship common/realtime.h.
2237 Tue Sep 28 06:32:10 GMT 2010 Olly Betts <olly@survex.com>
2239 * query.cc: Apply permission filters if USER and/or GROUP are set.
2241 Tue Sep 28 06:14:50 GMT 2010 Olly Betts <olly@survex.com>
2243 * ./: Update svn:externals to latest common from xapian-core.
2244 * query.cc: Use RealTime::now() to time running the query. Include
2245 more enquire set-up in the time.
2247 Tue Sep 28 05:26:07 GMT 2010 Olly Betts <olly@survex.com>
2249 * omindex.cc: Index file owner and read permissions, to allow finding
2250 documents with a particular owner, and so searches can be restricted
2251 to documents a user is able to read.
2252 * docs/termprefixes.rst: Document term prefixes used by the above.
2254 Tue Sep 28 05:20:01 GMT 2010 Olly Betts <olly@survex.com>
2256 * diritor.h: Rename get_other_read() to is_other_readable() for
2259 Tue Sep 28 04:16:55 GMT 2010 Olly Betts <olly@survex.com>
2261 * diritor.cc,diritor.h: Rearrange so that the setting of statbuf_valid
2262 gets inlined so the compiler should be able to optimise out
2263 subsequent calls to call_stat().
2265 Tue Sep 28 04:10:28 GMT 2010 Olly Betts <olly@survex.com>
2267 * diritor.h: Add methods to read the owner and group, and to check
2268 who can read the file.
2270 Tue Sep 28 01:39:15 GMT 2010 Olly Betts <olly@survex.com>
2274 Tue Sep 28 01:33:44 GMT 2010 Olly Betts <olly@survex.com>
2276 * NEWS: Fix whitespace oddities.
2278 Tue Sep 28 01:31:46 GMT 2010 Olly Betts <olly@survex.com>
2280 * NEWS: Update from ChangeLog.
2282 Tue Sep 28 01:27:41 GMT 2010 Olly Betts <olly@survex.com>
2284 * omindex.cc: Improve --help for --mime-type option.
2286 Mon Sep 20 06:50:45 GMT 2010 Olly Betts <olly@survex.com>
2288 * omindex.cc,svgparse.cc,svgparse.h: Extract any document title and
2289 keywords from SVG files.
2291 Mon Sep 20 06:49:44 GMT 2010 Olly Betts <olly@survex.com>
2293 * htmlparse.cc: Call closing_tag() for XML empty tag syntax (like
2296 Mon Sep 20 05:30:54 GMT 2010 Olly Betts <olly@survex.com>
2298 * Makefile.am,docs/overview.rst,omindex.cc,svgparse.cc,svgparse.h: Add
2299 support for indexing SVG files.
2301 Tue Sep 07 04:39:59 GMT 2010 Olly Betts <olly@survex.com>
2303 * outlookmsg2html.in: If the required perl modules aren't available,
2304 exit with status 127 which omindex interprets as "filter not
2305 installed" and won't try further .msg files.
2307 Tue Sep 07 02:24:36 GMT 2010 Olly Betts <olly@survex.com>
2309 * Makefile.am,configure.ac,docs/overview.rst,omindex.cc,
2310 outlookmsg2html.in,pkglibbindir.cc,pkglibbindir.h: Add support for
2311 indexing .msg files from Microsoft Outlook. (ticket#334)
2313 Tue Aug 31 06:32:15 GMT 2010 Olly Betts <olly@survex.com>
2315 * omindex.cc: Fix handling of quoting in CSV files to match what's
2318 Tue Aug 31 05:41:13 GMT 2010 Olly Betts <olly@survex.com>
2320 * docs/overview.rst,omindex.cc: The V in CSV is Values not Variable.
2322 Mon Aug 30 14:56:36 GMT 2010 Olly Betts <olly@survex.com>
2324 * docs/overview.rst,omindex.cc: Add support for indexing .csv files.
2326 Sat Aug 28 11:46:22 GMT 2010 Olly Betts <olly@survex.com>
2328 * cdb_find.cc,cdb_init.cc,cgiparam.cc,date.cc,md5.cc,query.cc,utils.cc,
2329 values.h: Fix to compile with Sun C++.
2331 Sat Aug 28 11:36:25 GMT 2010 Olly Betts <olly@survex.com>
2333 * omega.cc: An ESet can't contain empty terms, so there's no need to
2336 Tue Aug 24 05:58:28 GMT 2010 Olly Betts <olly@survex.com>
2338 * NEWS,configure.ac: Update for 1.2.3.
2340 Mon Aug 23 15:08:11 GMT 2010 Olly Betts <olly@survex.com>
2342 * xapian-omega.spec.in: Don't run autoreconf - it's no longer required.
2344 Tue Aug 03 14:11:35 GMT 2010 Olly Betts <olly@survex.com>
2346 * docs/termprefixes.rst: Update "flint and quartz" to "flint and chert"
2347 as quartz is no longer supported. Give exact term length limit for
2350 Sun Jun 27 05:00:39 GMT 2010 Olly Betts <olly@survex.com>
2352 * NEWS,configure.ac: Update for 1.2.2.
2354 Sat Jun 26 15:59:59 GMT 2010 Olly Betts <olly@survex.com>
2356 * NEWS.SKELETON: Add blank line to the end.
2358 Sat Jun 26 15:59:05 GMT 2010 Olly Betts <olly@survex.com>
2360 * NEWS.SKELETON: Add template NEWS entry.
2362 Tue Jun 22 13:55:11 GMT 2010 Olly Betts <olly@survex.com>
2364 * NEWS: Sync with 1.0.21.
2365 * NEWS,configure.ac: Update for 1.2.1.
2367 Sun Jun 13 11:55:40 GMT 2010 Olly Betts <olly@survex.com>
2369 * freemem.cc: Merge in __WIN32__ implementation from perftest in
2372 Fri May 14 01:39:43 GMT 2010 Olly Betts <olly@survex.com>
2374 * freemem.cc: Use "safeunistd.h" instead of <unistd.h>.
2376 Wed Apr 28 13:38:33 GMT 2010 Olly Betts <olly@survex.com>
2378 * NEWS: Sync with 1.0.20.
2380 Wed Apr 28 06:44:56 GMT 2010 Olly Betts <olly@survex.com>
2382 * configure.ac: Tell libtool not to link in deplibs on platforms where
2383 we know they aren't needed.
2384 * configure.ac: On Linux, extract the library search path from ldconfig
2385 which gives us the default entries reliably.
2386 * NEWS,configure.ac: 1.2.0.
2388 Thu Apr 15 04:32:06 GMT 2010 Olly Betts <olly@survex.com>
2390 * NEWS,configure.ac: Update for 1.1.5.
2392 Mon Feb 15 14:00:26 GMT 2010 Olly Betts <olly@survex.com>
2394 * configure.ac: Update for 1.1.4.
2396 Mon Feb 15 13:51:44 GMT 2010 Olly Betts <olly@survex.com>
2398 * NEWS: Add missing notes for 1.1.2 and 1.1.1 including changes from
2399 1.0.14 and 1.0.13 respectively.
2401 Mon Feb 15 13:28:12 GMT 2010 Olly Betts <olly@survex.com>
2403 * NEWS: Update from ChangeLog and 1.0.18.
2405 Mon Feb 08 00:48:44 GMT 2010 Olly Betts <olly@survex.com>
2407 * Makefile.am: Need to ship common/omassert.h.
2409 Sun Feb 07 23:03:45 GMT 2010 Olly Betts <olly@survex.com>
2411 * Makefile.am: Need to ship common/str.h.
2413 Sun Feb 07 21:40:03 GMT 2010 Olly Betts <olly@survex.com>
2415 * Makefile.am,omega.cc,omindex.cc,query.cc,utils.cc,utils.h: Use the
2416 optimised str() routine instead of int_to_string() and
2419 Fri Feb 05 23:29:12 GMT 2010 Olly Betts <olly@survex.com>
2421 * omindex.cc: Increase the wdf boost for the document title from 2 to
2422 5, since 2 isn't really enough.
2424 Thu Feb 04 03:20:02 GMT 2010 Olly Betts <olly@survex.com>
2426 * Makefile.am,configure.ac,runfilter.cc: Use safesyswait.h.
2427 * runfilter.cc: Reformat header to @file doxygen comment. Put
2428 '#include "runfilter.h"' right after <config.h>.
2430 Wed Dec 10 00:15:10 GMT 2009 Olly Betts <olly@survex.com>
2432 * NEWS: Update from ChangeLog.
2434 Wed Dec 09 00:26:19 GMT 2009 Olly Betts <olly@survex.com>
2436 * myhtmlparse.cc: Add missing "using namespace std;".
2438 Wed Dec 09 00:20:38 GMT 2009 Olly Betts <olly@survex.com>
2440 * htmlparse.cc: Make the default charset "utf-8" not "UTF-8" as we
2441 lower case explicitly specified character sets to compare to see
2442 if we need to reparse, so this avoids a reparse when UTF-8 is
2443 explicitly specified as well as the default.
2445 Tue Dec 08 23:56:46 GMT 2009 Olly Betts <olly@survex.com>
2447 * scriptindex.cc: Don't bomb out if indexing is disallowed or we hit
2448 </body> for a document which had an overridden character set.
2451 Wed Nov 18 10:48:47 GMT 2009 Olly Betts <olly@survex.com>
2453 * NEWS,configure.ac: Update for 1.1.3.
2455 Wed Nov 18 02:37:34 GMT 2009 Olly Betts <olly@survex.com>
2457 * NEWS: Update from 1.0.17 and ChangeLog.
2459 Mon Nov 16 09:08:12 GMT 2009 Olly Betts <olly@survex.com>
2461 * utf8converttest.cc: Charset "8859_1" isn't understood by Solaris
2462 libiconv, and isn't likely to be specified on a page, so just
2463 test it for our built-in convertor and GNU libc.
2465 Wed Nov 11 04:52:25 GMT 2009 Olly Betts <olly@survex.com>
2467 * configure.ac: Also check for socketpair with -lxnet if it isn't found
2468 without, which enables resource limits on Solaris, and possibly some
2469 other platforms. Fixes ticket#412.
2471 Wed Nov 04 01:51:41 GMT 2009 Olly Betts <olly@survex.com>
2473 * freemem.cc: On Linux, _SC_AVPHYS_PAGES excludes pages used by the OS
2474 VM cache, so will often return a really low value, so instead use
2475 _SC_PHYS_PAGES. Reported by Rune Kock in Debian bug#548987. Also
2476 explains ticket#358.
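
A sketch of querying total physical memory this way. physical_memory_bytes() is a hypothetical name, and returning 0 when the value is unavailable is an assumption of the example, not necessarily what freemem.cc does:

```cpp
#include <unistd.h>

// Hypothetical sketch: total physical memory in bytes, or 0 if unknown.
// _SC_PHYS_PAGES is a common extension (e.g. in glibc) rather than POSIX.
unsigned long long physical_memory_bytes()
{
#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages > 0 && page_size > 0)
        return static_cast<unsigned long long>(pages) * page_size;
#endif
    return 0;
}
```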
2478 Wed Nov 04 00:54:38 GMT 2009 Olly Betts <olly@survex.com>
2480 * common/: Sync with latest version from xapian-core to pick up getopt
2481 fix for Mac OS X 10.6.
2483 Mon Nov 02 09:32:22 GMT 2009 Olly Betts <olly@survex.com>
2485 * omindex.cc: Use delete[] (not delete) for array allocated by new[].
2487 Mon Nov 02 07:08:13 GMT 2009 Olly Betts <olly@survex.com>
2489 * runfilter.cc: Fix likely crash if read() is interrupted by a signal.
2490 Identified by Coverity's Scan.
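
This entry doesn't say how the crash was fixed. A common pattern for reading robustly in the presence of signals looks roughly like this (read_all() is a hypothetical name, not the function in runfilter.cc):

```cpp
#include <cerrno>
#include <string>
#include <unistd.h>

// Hypothetical sketch: read everything from fd into out, retrying when
// read() is interrupted by a signal. Returns false on any other error.
bool read_all(int fd, std::string& out)
{
    char buf[4096];
    while (true) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            out.append(buf, static_cast<size_t>(n));
        } else if (n == 0) {
            return true;   // end of file
        } else if (errno != EINTR) {
            return false;  // genuine error
        }
        // n < 0 with EINTR: no data was transferred, so just retry.
    }
}
```

The important point is that a return value of -1 is never used as a byte count.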
2492 Mon Nov 02 06:47:01 GMT 2009 Olly Betts <olly@survex.com>
2494 * scriptindex.cc: Extend exception handling to the whole of main.
2495 Xapian::Stem("english") can't actually throw, but that's not obvious
2496 to static analysis tools, and it is more robust to wrap the whole of
2497 main, and reduces indentation.
2499 Mon Nov 02 06:32:41 GMT 2009 Olly Betts <olly@survex.com>
2501 * omindex.cc,scriptindex.cc: Tighten up the type of the error we catch
2502 to detect an unknown stemming language.
2504 Thu Sep 17 12:13:10 GMT 2009 Olly Betts <olly@survex.com>
2506 * NEWS: Update from ChangeLog.
2508 Thu Sep 10 13:33:06 GMT 2009 Olly Betts <olly@survex.com>
2510 * configure.ac: Default to looking for xapian-config-1.1.
2512 Thu Sep 10 06:46:55 GMT 2009 Olly Betts <olly@survex.com>
2514 * NEWS: Sync changes from 1.0.15 and 1.0.16.
2516 Wed Sep 09 13:32:25 GMT 2009 Olly Betts <olly@survex.com>
2518 * omega.cc,query.cc,query.h: Fix cross-site scripting vulnerability in
2519 reporting of exceptions (CVE-2009-2947).
2521 Fri Aug 28 15:30:07 GMT 2009 Richard Boulton <richard@lemurconsulting.com>
2523 * configure.ac: Check for PERL if in maintainer mode, not just when
2524 building documentation, because making the omegascript vim syntax
2527 Wed Aug 26 14:17:06 GMT 2009 Olly Betts <olly@survex.com>
2529 * templates/query: www.xapian.org -> xapian.org.
2531 Tue Aug 25 11:15:38 GMT 2009 Olly Betts <olly@survex.com>
2533 * gen-omegascript-vim: Fix swapped arguments to perl mkdir function.
2535 Tue Aug 25 10:39:29 GMT 2009 Olly Betts <olly@survex.com>
2537 * gen-omegascript-vim: Add GPL licence boilerplate.
2539 Tue Aug 25 10:29:07 GMT 2009 Olly Betts <olly@survex.com>
2541 * gen-omegascript-vim: Need to create "extra" for a VPATH build.
2543 Tue Aug 25 08:39:00 GMT 2009 Olly Betts <olly@survex.com>
2545 * Makefile.am: Fix for VPATH build.
2547 Tue Aug 25 06:38:08 GMT 2009 Olly Betts <olly@survex.com>
2549 * Makefile.am,extra/omegascript.vim,extra/omegascript.vim.in,
2550 gen-omegascript-vim: The list of OmegaScript commands in the vim
2551 mode was rather out of date, and a few commands were misclassified.
2552 Fix both problems and avoid future recurrences by automatically
2553 generating those lists from the command list in query.cc.
2555 Sat Aug 15 11:31:56 GMT 2009 Olly Betts <olly@survex.com>
2557 * NEWS: Update from ChangeLog.
2559 Wed Aug 05 03:50:54 GMT 2009 Olly Betts <olly@survex.com>
2561 * omindex.cc: Implement correct handling of paths when calling
2562 external filter programs on Microsoft Windows.
2564 Thu Jul 23 12:07:24 GMT 2009 Olly Betts <olly@survex.com>
2566 * omindex.cc: Remove pointless fallback code.
2568 Thu Jul 23 12:06:37 GMT 2009 Olly Betts <olly@survex.com>
2570 * templates/inc/toptermsjs: Use double-quotes rather than single quotes
2571 for parameter values on the <script> tag.
2573 Thu Jul 23 11:29:43 GMT 2009 Olly Betts <olly@survex.com>
2575 * docs/omegascript.rst: Document that $date uses UTC. (ticket#314)
2577 Thu Jul 23 11:26:15 GMT 2009 Olly Betts <olly@survex.com>
2579 * templates/query: If JavaScript is available, convert the
2580 $field{modtime} to a string on the client-side so that the timezone
2581 is correct. If JavaScript isn't available, fall back to the existing
2582 behaviour of using UTC. (ticket#314)
2584 Thu Jul 23 04:12:02 GMT 2009 Olly Betts <olly@survex.com>
2586 * NEWS,configure.ac: Update for 1.1.2.
2588 Wed Jul 22 04:33:29 GMT 2009 Olly Betts <olly@survex.com>
2590 * NEWS: Update from ChangeLog and sync with 1.0.13 and 1.0.14.
2592 Tue Jul 07 15:05:09 GMT 2009 Olly Betts <olly@survex.com>
2594 * omindex.cc: Consistently use endl not "\n" at the end of messages so
2595 that output is flushed.
2597 Tue Jul 07 07:29:21 GMT 2009 Olly Betts <olly@survex.com>
2599 * cdb_init.cc,cdb_int.h,cgiparam.cc,configfile.cc,date.cc,
2600 datematchdecider.cc,datematchdecider.h,freemem.cc,htmlparse.cc,
2601 htmlparsetest.cc,md5.cc,md5test.cc,myhtmlparse.cc,omega.cc,
2602 omindex.cc,query.cc,runfilter.cc,scriptindex.cc,strcasecmp.h,
2603 utf8converttest.cc,utils.cc: Update to use C++ forms for ISO C
2604 standard headers (ticket#330).
2606 Mon Jul 06 01:54:35 GMT 2009 Olly Betts <olly@survex.com>
2608 * loadfile.cc: Avoid infinite loop if the file has been truncated
2609 since we read the length, or on Cygwin with the automatic end of
2610 line translation turned on.
2612 Sun Jul 05 13:00:57 GMT 2009 Olly Betts <olly@survex.com>
2614 * htmlparse.cc,htmlparse.h: Make HtmlParser::get_parameter() const
2617 Sun Jul 05 12:59:45 GMT 2009 Olly Betts <olly@survex.com>
2619 * cdb_init.cc: Prefer static_cast<> to C-style cast.
2621 Sat Jun 20 03:31:22 GMT 2009 Olly Betts <olly@survex.com>
2623 * docs/overview.rst: www.xapian.org -> xapian.org
2625 Thu Jun 11 09:45:45 GMT 2009 Olly Betts <olly@survex.com>
2627 * omindex.cc: Extract pptx notesSlides and comments, if present. If
2628 they aren't, unzip returns exit code 11, which we must ignore
2631 Thu Jun 11 07:38:57 GMT 2009 Olly Betts <olly@survex.com>
2633 * docs/overview.rst,omindex.cc: Handle the "macroenabled" versions of
2634 MS Office 2007 files too (ticket#290).
2636 Wed Jun 10 01:13:14 GMT 2009 Olly Betts <olly@survex.com>
2638 * configure.ac: Update for 1.1.1.
2640 Tue Jun 09 14:35:40 GMT 2009 Olly Betts <olly@survex.com>
2642 * NEWS: Update for 1.1.1.
2644 Mon May 25 13:38:46 GMT 2009 Olly Betts <olly@survex.com>
2646 * query.cc: If SERVER_PROTOCOL in the environment is set to INCLUDED,
2647 then our output is being included in another page (e.g. using SSI)
2648 so suppress the output of any HTTP headers.
2650 Mon May 25 13:02:22 GMT 2009 Olly Betts <olly@survex.com>
2652 * templates/query: Remove extra "}" introduced when adding spelling
2655 Mon May 25 12:57:45 GMT 2009 Olly Betts <olly@survex.com>
2657 * cgiparam.cc,commonhelp.cc: Include the corresponding header.
2659 Mon May 25 12:56:55 GMT 2009 Olly Betts <olly@survex.com>
2661 * cgiparam.h: Add explicit inclusions of <map> and <string> and qualify
2662 multimap and string with std::.
2664 Sat May 23 12:21:33 GMT 2009 Olly Betts <olly@survex.com>
2666 * configure.ac: Sync warning flags used with GCC with xapian-core
2667 apart from -Woverloaded-virtual which fires for
2668 MyHtmlParser::parse_html(). That probably should be tidied up at
2669 some point, but not right now.
2671 Wed May 20 11:24:46 GMT 2009 Olly Betts <olly@survex.com>
2673 * omindex.cc: The MD5 checksum of a text file with a BOM was being
2674 incorrectly calculated from the contents converted to UTF-8
2675 since 1.0.7. Noticed by Srijon Biswas.
2677 Tue May 05 12:13:17 GMT 2009 Olly Betts <olly@survex.com>
2679 * omindex.cc: We can now use numeric_limits<> since we no longer
2680 support GCC 2.95, so use it and fix a warning on platforms with
2683 Thu Apr 30 14:09:50 GMT 2009 Olly Betts <olly@survex.com>
2685 * Makefile.am,docs/omegascript.rst,query.cc,weight.cc,weight.h: Add
2686 $opt{weighting} to allow the weighting scheme and parameters to be
2687 specified (ticket#298).
2689 Tue Apr 28 07:38:54 GMT 2009 Olly Betts <olly@survex.com>
2691 * omindex.cc: Check the last modification time of files before
2692 reindexing (ticket#342).
2694 Tue Apr 28 05:17:04 GMT 2009 Olly Betts <olly@survex.com>
2696 * omindex.cc: Drop the copyright info from the output of --version as
2697 it's perennially out of date and we don't report it for any other
2700 Tue Apr 28 05:03:29 GMT 2009 Olly Betts <olly@survex.com>
2702 * omindex.cc: If the filter for a filetype isn't installed, don't erase
2703 the entry from the mime_map, but instead set it to the empty string
2704 and then use this to report why we subsequently skip files with the
2705 same extension, rather than slightly misleadingly reporting "Unknown
2708 Mon Apr 27 16:34:29 GMT 2009 Olly Betts <olly@survex.com>
2710 * templates/query: Offer any spelling correction QueryParser gives.
2712 Mon Apr 27 13:36:19 GMT 2009 Olly Betts <olly@survex.com>
2714 * omindex.cc: Add "--spelling" option to index spelling correction
2717 Sun Apr 26 16:28:36 GMT 2009 Olly Betts <olly@survex.com>
2719 * omindex.cc: Make -s work as a short-form for --stemmer (as
2720 documented by "omindex --help" and "man omindex").
2722 Sun Apr 26 15:33:32 GMT 2009 Olly Betts <olly@survex.com>
2724 * docs/omegascript.rst,query.cc: Add $suggestion and $opt{spelling} to
2725 provide access to spelling correction (ticket#296).
2727 Sun Apr 26 15:08:40 GMT 2009 Olly Betts <olly@survex.com>
2729 * docs/scriptindex.rst,scriptindex.cc: Add new "spell" action for
2730 scriptindex (ticket#296).
2732 Thu Apr 23 07:40:41 GMT 2009 Olly Betts <olly@survex.com>
2734 * docs/scriptindex.rst,scriptindex.cc: Add new "valuenumeric" action
2735 to index a value using Xapian::sortable_serialise() to allow numeric
2736 sorting (ticket#260).
2738 Thu Apr 23 07:09:18 GMT 2009 Olly Betts <olly@survex.com>
2740 * Makefile.am,configure.ac,docs/Makefile.am: Fix things up so that in
2741 a bootstrapped SVN tree, automatic regeneration of
2742 autotools-generated files uses the in-tree versions of the autotools.
2744 Wed Apr 22 13:52:28 GMT 2009 Olly Betts <olly@survex.com>
2746 * NEWS: Update for 1.1.0.
2748 Mon Apr 20 14:20:51 GMT 2009 Olly Betts <olly@survex.com>
2750 * NEWS: Sync changes from 1.0.12.
2752 Mon Apr 20 14:15:41 GMT 2009 Olly Betts <olly@survex.com>
2754 * NEWS: Update from ChangeLog and clean up for release.
2756 Thu Apr 16 10:02:44 GMT 2009 Olly Betts <olly@survex.com>
2758 * transform.cc: Fix off-by-one error - the return value of pcre_exec()
2759 is one more than the number of groupings.
2761 Thu Apr 16 09:23:29 GMT 2009 Olly Betts <olly@survex.com>
2763 * Makefile.am: Need to ship new file transform.h.
2765 Thu Apr 16 08:20:01 GMT 2009 Olly Betts <olly@survex.com>
2767 * Makefile.am,docs/omegascript.rst,query.cc,transform.cc,transform.h:
2768 Factor out the implementation of $transform into a separate source
2769 file and compile only that file with $(PCRE_CFLAGS) to avoid
2770 problems reported by James Aylett with Mac OS X on #xapian-devel.
2771 Fix expansion of \1 to \9 to work correctly and document these
2772 and \\. Fix handling of unescaped \ at the end of the pattern, and
2773 leave unrecognised \<x> sequences unchanged.
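A minimal sketch of the replacement-string expansion this entry describes, not the code in transform.cc. The helper name expand_replacement() and the captures vector are hypothetical. Treating a trailing backslash as a literal, and expanding a group that did not match to nothing, are assumptions.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: expand \1..\9 and \\ in a replacement template.
// captures[n] holds the text matched by group n (captures[0] is the whole
// match and is not referenced by this sketch).
std::string expand_replacement(const std::string& tmpl,
                               const std::vector<std::string>& captures)
{
    std::string out;
    for (std::string::size_type i = 0; i < tmpl.size(); ++i) {
        char ch = tmpl[i];
        if (ch != '\\') {
            out += ch;
            continue;
        }
        if (i + 1 == tmpl.size()) {
            // Unescaped backslash at the very end: keep it as a literal
            // (assumption about the intended behaviour).
            out += '\\';
            break;
        }
        char next = tmpl[++i];
        if (next >= '1' && next <= '9') {
            std::string::size_type n = next - '0';
            // A group which doesn't exist expands to nothing (assumption).
            if (n < captures.size()) out += captures[n];
        } else if (next == '\\') {
            out += '\\';
        } else {
            // Unrecognised \<x> sequence: leave it unchanged.
            out += '\\';
            out += next;
        }
    }
    return out;
}
```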
2775 Thu Apr 16 04:38:20 GMT 2009 Olly Betts <olly@survex.com>
2777 * configure.ac: Remove duplicate "AC_SUBST(AM_CXXFLAGS)".
2779 Thu Apr 16 04:29:28 GMT 2009 Olly Betts <olly@survex.com>
2781 * configure.ac: Avoid implicitly casting a string literal to char* in
2782 the test for iconv by adding the same explicit cast we use in the
2783 code in utf8convert.cc. Currently the implicit cast is "only" a
2784 warning under GCC, but the user could pass -Werror explicitly in
2785 CXXFLAGS, and this could be promoted to an error in future GCC
2786 versions, and may already be so for some other compilers.
2788 Thu Apr 16 03:56:16 GMT 2009 Olly Betts <olly@survex.com>
2790 * configure.ac: Back out previous fix - -Werror has nothing to do with
2791 the issue James reported.
2793 Tue Apr 14 15:34:36 GMT 2009 Richard Boulton <richard@lemurconsulting.com>
2795 * configure.ac: Test for compiler flags before checking for
2796 libraries, and use the compiler flags found when checking for
2797 things. In particular, this should fix the test for the type
2798 used by iconv() on Mac OS X (where it was previously returning "char
2799 *", and the test was giving a warning about converting this to
2800 "const char *", but not failing). Requires a change to the iconv
2801 test to avoid it failing on Linux with GCC due to an unrelated
2802 warning in the test code.
2804 Sat Apr 04 15:15:18 GMT 2009 Olly Betts <olly@survex.com>
2806 * NEWS: Update from ChangeLog.
2808 Wed Mar 25 12:35:42 GMT 2009 Olly Betts <olly@survex.com>
2810 * Makefile.am,configure.ac: Actually use all those warning flags we
2811 carefully determine!
2813 Wed Mar 25 12:03:37 GMT 2009 Olly Betts <olly@survex.com>
2815 * Makefile.am,configure.ac: Only put XAPIAN_CXXFLAGS in CXXFLAGS for
2816 the duration of configure (we need it as it may include options to
2817 put the compiler into ISO C++ mode). Set AM_CXXFLAGS to
2818 XAPIAN_CXXFLAGS in Makefile.am. This means that the user can safely
2819 override CXXFLAGS at make-time: "make CXXFLAGS=-Os"
2821 Wed Mar 25 10:56:29 GMT 2009 Olly Betts <olly@survex.com>
2823 * query.cc: Cope with write() not writing all the data or being
2824 interrupted by a signal when writing log entries.
2826 Wed Mar 25 10:48:14 GMT 2009 Olly Betts <olly@survex.com>
2828 * configure.ac: Move AC_PROG_CXX and AC_LANG_CPLUSPLUS earlier so that
2829 CXXFLAGS is set before we add XAPIAN_CXXFLAGS to it. With libtool
2830 1.5.x this wasn't an issue, as AC_PROG_CXX was implicitly run early
2831 on. With libtool 2.2.x it is an issue, as AC_PROG_CXX doesn't touch
2832 CXXFLAGS if it is already set, so we don't get "-O2 -g" set for GCC.
2834 Wed Mar 18 06:13:16 GMT 2009 Olly Betts <olly@survex.com>
2836 * scriptindex.cc: Mark "index=nopos" error for removal in 1.3.0
2837 not 1.2.0. Tweak code that produces it to use more literal strings.
2839 Wed Mar 18 06:12:06 GMT 2009 Olly Betts <olly@survex.com>
2841 * docs/scriptindex.rst: The deprecated "index=nopos" is now removed
2842 and gives an error explaining what to use instead, so remove the
2843 documentation saying it is deprecated and what to do.
2845 Mon Mar 16 14:07:58 GMT 2009 Olly Betts <olly@survex.com>
2847 * NEWS: Sync with 1.0.11.
2849 Sat Feb 28 08:31:15 GMT 2009 Olly Betts <olly@survex.com>
2851 * omindex.cc,scriptindex.cc: Use commit() rather than flush().
2853 Sat Feb 28 08:28:26 GMT 2009 Olly Betts <olly@survex.com>
2855 * scriptindex.cc: Don't call reopen() on a WritableDatabase - it
2856 doesn't do anything!
2858 Thu Feb 26 06:38:05 GMT 2009 Olly Betts <olly@survex.com>
2860 * NEWS: Update from ChangeLog.
2862 Thu Feb 26 06:18:05 GMT 2009 Olly Betts <olly@survex.com>
2864 * omindex.cc: Mark "-l" as requiring an argument so that it actually
2865 works - previously it would always result in a segmentation fault.
2867 Thu Feb 26 00:17:56 GMT 2009 Olly Betts <olly@survex.com>
2869 * docs/cgiparams.rst: Note the technique of using a stub database file
2870 to allow a default of searching over multiple databases.
2872 Wed Feb 25 12:39:08 GMT 2009 Olly Betts <olly@survex.com>
2874 * configure.ac: Update g++ version check to match recent change to
2875 xapian-core. Also turn on _FORTIFY_SOURCE and make the rare()
2876 and usual() branch prediction hint macros available.
2878 Mon Feb 23 06:05:25 GMT 2009 Olly Betts <olly@survex.com>
2880 * Makefile.am,docs/overview.rst,omindex.cc,xpsxmlparse.cc,
2881 xpsxmlparse.h: Add support for XPS files (bug#290).
2883 Fri Feb 20 03:25:14 GMT 2009 Olly Betts <olly@survex.com>
2885 * query.cc: Wrap a long comment.
2887 Thu Feb 19 10:34:36 GMT 2009 Olly Betts <olly@survex.com>
2889 * omega.cc,query.cc: Prefer str.resize(0) to str = "".
2891 Thu Feb 19 06:23:34 GMT 2009 Olly Betts <olly@survex.com>
2893 * docs/overview.rst,omindex.cc: Add support for MS Office 2007
2896 Thu Feb 19 04:46:26 GMT 2009 Olly Betts <olly@survex.com>
2898 * metaxmlparse.cc,metaxmlparse.h,xmlparse.cc,xmlparse.h: XmlParser and
2899 MetaXmlParser were overriding opening_tag with the wrong signature so
2900 their implementations weren't ever being used.
2902 Fri Jan 09 04:19:32 GMT 2009 Olly Betts <olly@survex.com>
2904 * runfilter.cc: Fix to compile when RLIMIT_AS isn't available (as on
2905 NetBSD and OpenBSD). In this situation, instead use RLIMIT_VMEM or
2906 RLIMIT_DATA if either is available.
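A rough sketch of the fallback order this entry describes, assuming a POSIX <sys/resource.h>. The helper names memory_limit_resource() and limit_memory() are hypothetical, and the real runfilter.cc may structure this differently.

```cpp
#include <sys/resource.h>

// Hypothetical helper: return the resource ID to use for a memory cap,
// or -1 if the platform offers none of the three.
int memory_limit_resource()
{
#if defined RLIMIT_AS
    return RLIMIT_AS;
#elif defined RLIMIT_VMEM
    return RLIMIT_VMEM;
#elif defined RLIMIT_DATA
    return RLIMIT_DATA;
#else
    return -1;
#endif
}

// Hypothetical helper: set the soft limit to bytes, clamped to the
// hard limit.  Returns false if no suitable limit exists or a system
// call fails.
bool limit_memory(rlim_t bytes)
{
    int resource = memory_limit_resource();
    if (resource < 0) return false;
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0) return false;
    // An unprivileged process can't set the soft limit above the hard one.
    if (lim.rlim_max != RLIM_INFINITY && bytes > lim.rlim_max)
        bytes = lim.rlim_max;
    lim.rlim_cur = bytes;
    return setrlimit(resource, &lim) == 0;
}
```

In a filter runner this would typically be called in the child process between fork() and exec, so that the limit applies only to the filter program.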
2908 Wed Dec 10 01:06:03 GMT 2008 Olly Betts <olly@survex.com>
2910 * query.cc: Fix poor grammar in comment.
2912 Sat Nov 01 01:49:07 GMT 2008 Olly Betts <olly@survex.com>
2914 * NEWS: Sync with 1.0.9.
2916 Fri Oct 31 18:34:49 GMT 2008 Olly Betts <olly@survex.com>
2918 * configure.ac: Sync warning flag handling changes from xapian-core.
2920 Thu Oct 23 17:08:22 GMT 2008 Olly Betts <olly@survex.com>
2922 * docs/overview.rst: Document HTML parsing a bit, including robots
2923 meta and htdig_noindex.
2925 Sat Oct 18 08:00:24 GMT 2008 Olly Betts <olly@survex.com>
2927 * omega.cc: Catch std::exception and report what its what() method
2930 Thu Oct 09 10:16:05 GMT 2008 Olly Betts <olly@survex.com>
2932 * configure.ac: Update autoconf requirement to 2.63, libtool to 2.2.6.
2934 Wed Oct 01 04:48:37 GMT 2008 Olly Betts <olly@survex.com>
2936 * scriptindex.cc: Separate Action constructor cases to avoid
2937 pointlessly calling atoi() on an empty string.
2939 Wed Oct 01 03:15:29 GMT 2008 Olly Betts <olly@survex.com>
2941 * omega.cc,omega.h: Remove undocumented and non-functional support for
2942 numeric sorting via: SORT=#<slot>
2944 Thu Sep 04 04:26:22 GMT 2008 Olly Betts <olly@survex.com>
2946 * configure.ac: Set version to 1.1.0.
2948 Thu Sep 04 04:21:12 GMT 2008 Olly Betts <olly@survex.com>
2950 * NEWS: Sync with 1.0.8 and update from ChangeLog.
2952 Wed Sep 03 12:26:58 GMT 2008 Olly Betts <olly@survex.com>
2954 * htmlparse.cc,htmlparse.h,htmlparsetest.cc,myhtmlparse.cc,
2955 myhtmlparse.h,omindex.cc,scriptindex.cc,xmlparse.h: If the character
2956 encoding is specified using <meta http-equiv=...> in an HTML
2957 document then reparse the document if it isn't the encoding we're
2958 already using so that any preceding <title> is converted correctly
2961 Convert text from meta tag parameters to UTF-8 (bug#293).
2963 Handle <meta charset="..."> (new in HTML 5).
2965 Fix bug in parameter parsing which was probably just a small
2966 performance penalty in real world cases, but could perhaps result in
2967 parsing bogus extra parameters in carefully contrived situations.
2969 Tue Aug 05 09:24:33 GMT 2008 Olly Betts <olly@survex.com>
2971 * docs/: Fix a few typos and improve wording in a few places.
2973 Tue Aug 05 09:19:56 GMT 2008 Olly Betts <olly@survex.com>
2975 * omindex.cc: Tweak to use string::assign() instead of assigning the
2976 result of string::substr().
2978 Tue Jul 29 23:48:31 GMT 2008 Olly Betts <olly@survex.com>
2980 * runfilter.cc: Add missing <signal.h>, noted on FreeBSD by Henrik
2983 Mon Jul 21 12:27:48 GMT 2008 Olly Betts <olly@survex.com>
2985 * commonhelp.cc: Use PACKAGE_BUGREPORT instead of hardcoding the bug
2986 report URL. Remove reference to "bugzilla" as we now use trac
2989 Mon Jul 21 11:58:25 GMT 2008 Olly Betts <olly@survex.com>
2991 * configure.ac: Put the bug report URL as the third parameter to
2992 AC_INIT. Add proper m4 quoting in a few places (nowhere that
2993 should actually change behaviour). Add hard autotools version
2994 requirements to match xapian-core, and remove the version
2995 justification since HACKING now covers that. Drop docdir workaround
2996 for autoconf < 2.60.
2998 Wed Jul 09 10:44:37 GMT 2008 Olly Betts <olly@survex.com>
3000 * configure.ac: The workaround to avoid probe code for F77, GCJ, and
3001 RC being added to configure is no longer required now that we're
3002 using libtool 2.2 so remove it.
3004 Wed Jul 09 10:13:18 GMT 2008 Olly Betts <olly@survex.com>
3006 * Makefile.am,configure.ac: Use AC_CONFIG_MACRO_DIR and
3007 ACLOCAL_AMFLAGS as libtoolize 2.2.4 recommends.
3009 Fri Jul 04 08:29:47 GMT 2008 Olly Betts <olly@survex.com>
3011 * NEWS: Synchronise with 1.0 branch.
3013 Fri Jul 04 08:15:03 GMT 2008 Olly Betts <olly@survex.com>
3015 * utf8convert.cc,utf8converttest.cc: UTF-16 with no BOM is meant to be
3016 assumed to be big-endian. GNU libiconv doesn't handle some examples
3017 as expected, so disable them when using iconv() for now.
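A small sketch of the byte-order rule noted above: honour a leading BOM if there is one, otherwise assume big-endian. The names Utf16Order and detect_utf16_order() are hypothetical and this is not the code in utf8convert.cc.

```cpp
#include <string>

enum class Utf16Order { BigEndian, LittleEndian };

// Hypothetical helper: decide the byte order of UTF-16 data.
// Sets bom_len to the number of leading bytes to skip (2 if a BOM was
// found, otherwise 0).
Utf16Order detect_utf16_order(const std::string& data,
                              std::string::size_type& bom_len)
{
    bom_len = 0;
    if (data.size() >= 2) {
        unsigned char b0 = static_cast<unsigned char>(data[0]);
        unsigned char b1 = static_cast<unsigned char>(data[1]);
        if (b0 == 0xFE && b1 == 0xFF) {
            bom_len = 2;
            return Utf16Order::BigEndian;
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            bom_len = 2;
            return Utf16Order::LittleEndian;
        }
    }
    // No BOM: assume big-endian.
    return Utf16Order::BigEndian;
}
```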
3019 Fri Jul 04 06:39:20 GMT 2008 Olly Betts <olly@survex.com>
3021 * omindex.cc: Handle UCS-2 and UTF-16 text files with a byte-order
3022 mark (BOM). Ignore any UTF-8 "byte-order" mark.
3023 * utf8convert.cc: Handle UCS-2/UTF-16 and explicit BE and LE forms in
3025 * Makefile.am,utf8converttest.cc: Add unit tests of convert_to_utf8().
3027 Fri Jun 27 04:43:18 GMT 2008 Olly Betts <olly@survex.com>
3029 * query.cc: Overhaul the $highlight colour combinations since some
3030 were rather unreadable. Reported by Joey Hess in Debian bug
3033 Sun Jun 01 15:12:02 GMT 2008 Olly Betts <olly@survex.com>
3035 * configure.ac: Update version to 1.0.7 to match 1.0 branch.
3037 Sun May 25 14:56:41 GMT 2008 Olly Betts <olly@survex.com>
3039 * NEWS: Synchronise with 1.0 branch, and update from ChangeLog.
3041 Sat May 17 11:42:26 GMT 2008 Olly Betts <olly@survex.com>
3043 * docs/omegascript.rst,docs/scriptindex.rst: Tweak mark-up so
3044 generated HTML gets a non-empty title.
3046 Sat May 10 11:14:20 GMT 2008 Olly Betts <olly@survex.com>
3048 * Makefile.am: omega_CPPFLAGS overrides AM_CPPFLAGS, so we need to
3049 explicitly include AM_CPPFLAGS in omega_CPPFLAGS to get
3050 CONFIGFILE_SYSTEM defined when building omega.
3052 Fri May 09 19:27:21 GMT 2008 Olly Betts <olly@survex.com>
3054 * Makefile.am: Fix handling of any -I options needed for PCRE.
3056 Sun May 04 19:12:08 GMT 2008 Olly Betts <olly@survex.com>
3058 * omindex.cc: Fix comment error regarding catdvi options.
3060 Sat May 03 14:02:02 GMT 2008 Olly Betts <olly@survex.com>
3062 * xapian-omega.spec.in: Remove "www." from xapian.org and
3063 oligarchy.co.uk URLs.
3065 Sat May 03 13:55:35 GMT 2008 Olly Betts <olly@survex.com>
3067 * cgiparam.cc,htdig2omega,mbox2omega,omindex-config.cc: Update FSF
3070 Sat May 03 13:54:25 GMT 2008 Olly Betts <olly@survex.com>
3072 * gnu_getopt.h: Remove old copy of file which is no longer used - we
3073 now share a copy with xapian-core via common/.
3075 Sat May 03 10:42:27 GMT 2008 Olly Betts <olly@survex.com>
3077 * configure.ac: Fix header checks to pre-include <sys/types.h> which
3078 Mac OS X needs for some other headers to work.
3080 Sat May 03 10:41:18 GMT 2008 Olly Betts <olly@survex.com>
3082 * configure.ac: Improve code which prevents probing for f77, etc.
3084 Fri May 02 17:52:44 GMT 2008 Olly Betts <olly@survex.com>
3086 * configure.ac: Fix to fail if --with-iconv is specified and libiconv
3087 isn't, and we aren't using fink on Mac OS X.
3089 Fri May 02 15:55:24 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
3091 * configure.ac: If iconv isn't found, set with_iconv to "no", to
3092 prevent USE_ICONV being set. Was previously only doing this if
3093 fink on OS X was found.
3095 Fri May 02 14:14:07 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
3097 * query.cc: Cast size to unsigned before division to avoid a
3098 warning about signed overflow.
3100 Fri May 02 14:08:39 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
3102 * configure.ac: Synchronise code for working out warning flags used
3103 for builds with that used for xapian-core. Copes with different
3104 formats of version number output by "gcc --version" which should
3105 help to improve output.
3107 Tue Apr 15 23:44:10 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
3109 * query.cc: Catch only the specific error which indicates a need to
3110 repeat a get_termfreq() call on the database instead of the mset.
3112 Sun Apr 13 11:19:49 GMT 2008 Richard Boulton <richard@lemurconsulting.com>
3114 * freemem.h: Specify units of get_free_physical_memory().
3116 Sun Apr 06 09:05:58 GMT 2008 Olly Betts <olly@survex.com>
3118 * freemem.cc: Fix latent compilation error on FreeBSD, pointed out by
3121 Mon Mar 31 02:00:48 GMT 2008 Olly Betts <olly@survex.com>
3123 * configure.ac: Update version to 1.0.6 to match latest release.
3125 Wed Mar 12 07:04:56 GMT 2008 Olly Betts <olly@survex.com>
3127 * scriptindex.cc: Make deprecated "index=nopos" an error.
3129 Mon Mar 10 03:37:30 GMT 2008 Olly Betts <olly@survex.com>
3131 * Makefile.am,diritor.cc,diritor.h,omindex.cc: Check for readdir()
3134 Thu Mar 06 23:43:11 GMT 2008 Olly Betts <olly@survex.com>
3136 * common/: Update to latest revisions.
3137 * Makefile.am,diritor.h: Use safedirent.h not dirent.h and build
3138 msvc_dirent.cc as part of omindex.
3140 Wed Mar 05 23:16:23 GMT 2008 Olly Betts <olly@survex.com>
3142 * NEWS: Update to HEAD with un-backported changes kept separate.
3144 Wed Mar 05 19:05:12 GMT 2008 Olly Betts <olly@survex.com>
3146 * NEWS: Update to 1.0 branch point.
3148 Sat Feb 02 22:46:40 GMT 2008 Olly Betts <olly@survex.com>
3150 * query.cc: Add (C) notice for Thomas Viehmann.
3152 Sat Feb 02 22:46:14 GMT 2008 Olly Betts <olly@survex.com>
3154 * omindex.cc: Back out random change committed by accident.
3156 Sat Feb 02 21:23:07 GMT 2008 Olly Betts <olly@survex.com>
3158 * omindex.cc,query.cc: New OmegaScript commands $addfilter, $lower,
3160 * docs/omegascript.rst: Document. Improve formatting.
3162 Fri Feb 01 01:45:26 GMT 2008 Olly Betts <olly@survex.com>
3164 * INSTALL: PCRE required.
3165 * docs/omegascript.rst: $transform{} now enabled. Fixes bug#231.
3167 Fri Feb 01 01:35:58 GMT 2008 Olly Betts <olly@survex.com>
3169 * Makefile.am,configure.ac,query.cc: Add PCRE as a requirement and
3170 add $transform{} command (which has been in the code for ages but
3173 Sat Jan 19 02:01:02 GMT 2008 Olly Betts <olly@survex.com>
3175 * omindex.cc: Add support for DjVu files.
3176 * docs/overview.rst: Document.
3178 Sat Jan 12 03:37:28 GMT 2008 Olly Betts <olly@survex.com>
3180 * freemem.cc: Check "defined HAVE_SYSMP" rather than just "HAVE_SYSMP".
3181 This doesn't change behaviour, but fixes a compile warning on
3182 platforms other than Linux and IRIX.
3184 Fri Dec 21 02:13:49 GMT 2007 Olly Betts <olly@survex.com>
3186 * NEWS: Bump release date.
3188 Thu Dec 20 21:40:34 GMT 2007 Olly Betts <olly@survex.com>
3190 * NEWS: Another update for 1.0.5.
3192 Thu Dec 20 20:08:58 GMT 2007 Olly Betts <olly@survex.com>
3194 * Makefile.am,scriptindex.cc: Fix scriptindex to insert a ':' between
3195 prefix and term using the same criteria which the QueryParser does.
3196 * scriptindex.cc,docs/scriptindex.rst: Action BOOLEAN now ignores an
3197 empty input rather than adding the prefix as a term. Action UNIQUE
3198 now issues a warning for empty input but otherwise ignores it.
3200 Thu Dec 20 17:44:57 GMT 2007 Olly Betts <olly@survex.com>
3202 * common/: Update to r9894 to pick up stringutils.cc.
3204 Wed Dec 19 03:44:50 GMT 2007 Olly Betts <olly@survex.com>
3206 * NEWS,configure.ac: Update for 1.0.5.
3208 Tue Dec 18 00:58:07 GMT 2007 Olly Betts <olly@survex.com>
3212 Thu Dec 13 01:38:43 GMT 2007 Olly Betts <olly@survex.com>
3214 * omindex.cc: Avoid rereading uncompressed AbiWord documents in order
3215 to calculate their MD5 checksums.
3217 Thu Dec 13 01:34:53 GMT 2007 Olly Betts <olly@survex.com>
3219 * omindex.cc: Improve comment wording.
3221 Thu Dec 13 00:59:35 GMT 2007 Olly Betts <olly@survex.com>
3223 * docs/overview.rst: Document that omindex limits resources that
3224 filter programs can use. Also add a note welcoming suggestions
3225 for additional reliable filter programs.
3227 Wed Dec 12 23:49:27 GMT 2007 Olly Betts <olly@survex.com>
3229 * Makefile.am,freemem.cc,freemem.h,runfilter.cc: Limit filter programs
3230 to 7/8 of free physical memory on platforms where we know how to
3231 determine this (currently at least Linux, FreeBSD, IRIX, HP-UX;
3232 probably Solaris and a few others too). Fixes bug#111.
3234 Wed Dec 12 18:20:34 GMT 2007 Olly Betts <olly@survex.com>
3236 * docs/termprefixes.rst: Note the version where we stopped generating
3237 terms with a 'W' prefix (0.9.7).
3239 Wed Dec 12 18:17:28 GMT 2007 Olly Betts <olly@survex.com>
3241 * docs/overview.rst: omindex hasn't generated "W"-prefix terms since
3242 0.9.7, so remove the documentation saying it does!
3244 Wed Dec 12 18:16:52 GMT 2007 Olly Betts <olly@survex.com>
3246 * docs/overview.rst: Update to mention how upper case in extensions is
3249 Wed Dec 12 17:49:12 GMT 2007 Olly Betts <olly@survex.com>
3251 * omindex.cc: If an extension isn't found in the mime_map and contains
3252 uppercase ASCII characters, see if the lower cased extension is in
3255 Wed Dec 12 02:09:02 GMT 2007 Olly Betts <olly@survex.com>
3257 * NEWS: Updated from ChangeLog in preparation for 1.0.5.
3259 Mon Dec 10 23:27:40 GMT 2007 Olly Betts <olly@survex.com>
3261 * omindex.cc: '-f' is documented by --help as a short option for
3262 '--follow', but wasn't previously actually recognised.
3264 Tue Nov 20 13:08:19 GMT 2007 Olly Betts <olly@survex.com>
3266 * htmlparse.cc: Add "using namespace std;" to ensure that
3267 std::strchr(), etc are imported into the global namespace.
3269 Tue Nov 20 01:01:13 GMT 2007 Richard Boulton <richard@lemurconsulting.com>
3271 * commonhelp.cc,diritor.cc,htmlparse.cc,omega.cc,scriptindex.cc:
3272 Add #include of cstring, to fix errors from gcc-4.3 snapshot.
3273 Tidy include ordering in htmlparse.cc.
3275 Tue Nov 06 12:17:10 GMT 2007 Olly Betts <olly@survex.com>
3277 * docs/Makefile.am: No need to set SUFFIXES manually for suffixes used
3280 Mon Nov 05 19:32:41 GMT 2007 Olly Betts <olly@survex.com>
3282 * configure.ac: Probe for rst2html.
3284 Mon Nov 05 07:24:31 GMT 2007 Olly Betts <olly@survex.com>
3286 * Makefile.am,README,configure.ac,docs/,query.cc: Replace .txt docs
3287 with Jenny's RST-ified versions.
3289 Tue Oct 30 04:54:58 GMT 2007 Olly Betts <olly@survex.com>
3291 * NEWS,configure.ac: Update for 1.0.4.
3293 Sat Oct 27 05:32:06 BST 2007 Olly Betts <olly@survex.com>
3297 Sat Oct 27 05:30:28 BST 2007 Olly Betts <olly@survex.com>
3299 * query.cc: On balance, it's more helpful to users to moan about a
3300 template which tries to set the same user prefix as both boolean
3301 and probabilistic, even if previous releases didn't.
3303 Thu Oct 25 20:38:15 BST 2007 Olly Betts <olly@survex.com>
3305 * common/: Update to latest version.
3306 * query.cc: Remove STRINGIZE macro definition as this is now
3307 defined by stringutils.h.
3309 Fri Oct 19 16:17:47 BST 2007 Olly Betts <olly@survex.com>
3311 * query.cc: Fix for reverted add_prefix() API.
3313 Sun Sep 30 22:12:46 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3315 * query.cc: Use the new form of add_prefix() to avoid deprecation
3316 warnings at compile time. Carefully avoid calling
3317 add_prefix(f,p,PREFIX_FILTER) for a prefix which has already been
3318 set with add_prefix(f,p,PREFIX_INLINE), because this would cause
3319 an error (and we wish to avoid changing semantics of omegascript
3320 to avoid breaking existing scripts).
3323 Fri Sep 28 15:48:50 BST 2007 Olly Betts <olly@survex.com>
3325 * NEWS: Final (?) update for 1.0.3.
3327 Fri Sep 28 15:46:11 BST 2007 Olly Betts <olly@survex.com>
3329 * mbox2omega: Expand --help output.
3330 * docs/scriptindex.txt: Refer to mbox2omega as an example of how to
3333 Fri Sep 28 03:18:25 BST 2007 Olly Betts <olly@survex.com>
3337 Fri Sep 28 03:15:11 BST 2007 Olly Betts <olly@survex.com>
3339 * configure.ac: Update for 1.0.3. Use ustar format for tarball since
3340 we have to for xapian-core anyway.
3342 Fri Sep 28 02:42:28 BST 2007 Olly Betts <olly@survex.com>
3344 * ./: Update common SVN rev in svn:externals so the files are in
3345 sync with xapian-core.
3347 Wed Sep 19 16:09:36 BST 2007 Olly Betts <olly@survex.com>
3349 * NEWS: Update from ChangeLog entries since 1.0.2.
3351 Sat Sep 08 19:24:48 BST 2007 Olly Betts <olly@survex.com>
3353 * configure.ac,runfilter.cc: Impose a 5 minute CPU time limit on
3354 filter programs to prevent problems if a filter program goes into
3355 an infinite loop on a malformed input. Partly addresses bug#111.
3357 Fri Sep 07 21:22:43 BST 2007 Olly Betts <olly@survex.com>
3359 * omindex.cc: Fix comment typos.
3361 Fri Sep 07 20:56:50 BST 2007 Olly Betts <olly@survex.com>
3363 * docs/overview.txt,omindex.cc: Add support for indexing TeX DVI
3366 Thu Sep 06 20:59:57 BST 2007 Olly Betts <olly@survex.com>
3368 * query.cc: Fix bug in decimal fraction in $size for files >= 1M in
3371 Thu Sep 06 20:13:44 BST 2007 Olly Betts <olly@survex.com>
3373 * templates/query: Set HTML charset to utf-8 since that's what
3374 databases now are by default. Tidy up some HTML gremlins.
3375 Restyle to use CSS to draw a "score bar" instead of using
3376 images. Rework the layout of each hit. Add popup hints on
3377 mouse-over for various items.
3379 Thu Sep 06 18:12:07 BST 2007 Olly Betts <olly@survex.com>
3381 * scriptindex.cc: Fix line number tracking in dump files.
3383 Thu Sep 06 18:06:28 BST 2007 Olly Betts <olly@survex.com>
3385 * docs/omegascript.txt,query.cc: Add $muldiv{A,B,C} which calculates
3388 Thu Sep 06 03:36:36 BST 2007 Olly Betts <olly@survex.com>
3390 * runfilter.cc: Fix file description.
3392 Thu Sep 06 00:54:58 BST 2007 Olly Betts <olly@survex.com>
3394 * Makefile.am,omindex.cc,runfilter.cc,runfilter.h: Factor out the
3395 stdout_to_string() function into its own source file.
3397 Thu Sep 06 00:45:14 BST 2007 Olly Betts <olly@survex.com>
3399 * cgiparam.h,commonhelp.h,date.h,hashterm.h,htmlparse.h,loadfile.h,
3400 md5wrap.h,metaxmlparse.h,myhtmlparse.h,namedentities.h,omega.h,
3401 sample.h,utf8convert.h,utf8truncate.h,xmlparse.h: Add missing header
3402 guards and standardise existing header guards to use the form
3403 OMEGA_INCLUDED_FOO_H.
3405 Thu Sep 06 00:24:54 BST 2007 Olly Betts <olly@survex.com>
3407 * myhtmlparse.cc: Add '#include <config.h>'.
3408 * omega.h: Don't '#include <config.h>'.
3410 Mon Sep 03 19:16:37 BST 2007 Olly Betts <olly@survex.com>
3412 * docs/overview.txt,omindex.cc: Add support for indexing AbiWord
3415 Thu Jul 05 00:37:35 BST 2007 Olly Betts <olly@survex.com>
3417 * NEWS: Final (?) update for 1.0.2.
3419 Thu Jul 05 00:33:14 BST 2007 Olly Betts <olly@survex.com>
3421 * omindex.cc: Report files we aren't indexing because their extensions
3424 Wed Jul 04 21:22:02 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3426 * NEWS: Update with release date for release 1.0.2.
3428 Wed Jul 04 20:43:22 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3430 * configure.ac: Bump version to 1.0.2.
3432 Wed Jul 04 17:34:15 BST 2007 Olly Betts <olly@survex.com>
3436 Wed Jul 04 17:31:38 BST 2007 Olly Betts <olly@survex.com>
3438 * Makefile.am,omindex.cc,query.cc: Use stringutils.h from common.
3439 * ./: Update common SVN rev in svn:externals to get the latest
3441 * cgiparam.cc: Use string::resize() rather than assigning from a
3442 substring of the string.
3444 Mon Jul 02 16:42:01 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3446 * htmlparsetest.cc,md5test.cc: Add #include <stdlib.h>, to get a
3447 definition for exit(). Fixes compilation with gcc-snapshot.
3449 Thu Jun 28 18:05:18 BST 2007 Olly Betts <olly@survex.com>
3451 * omindex.cc: If --url isn't passed, default to "/", but print a
3452 warning noting that this default has been used (at least for now).
3454 Thu Jun 28 18:04:53 BST 2007 Olly Betts <olly@survex.com>
3456 * docs/scriptindex.txt: Fix typo.
3458 Wed Jun 27 15:44:30 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3460 * NEWS: Remove the items which aren't really interesting to users.
3462 Wed Jun 27 14:26:26 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3464 * common/: Update svn:externals property to use latest version.
3468 Sat Jun 23 13:11:15 BST 2007 Olly Betts <olly@survex.com>
3470 * diritor.h: Delete random extra blank line.
3472 Sat Jun 23 13:08:35 BST 2007 Olly Betts <olly@survex.com>
3474 * omega.cc,query.cc: Use Xapian::BAD_VALUENO.
3476 Sat Jun 16 11:06:08 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3478 * Makefile.am: Pass value of XAPIAN_CONFIG to distcheck, to ensure
3479 that it works with uninstalled copies of Xapian.
3481 Mon Jun 11 03:34:53 BST 2007 Olly Betts <olly@survex.com>
3483 * NEWS: Minor wording improvement.
3485 Mon Jun 11 03:33:37 BST 2007 Olly Betts <olly@survex.com>
3487 * NEWS: Probably the final update for 1.0.1.
3489 Sun Jun 10 22:00:23 BST 2007 Olly Betts <olly@survex.com>
3491 * configure.ac: Drop automake requirement to 1.8.3 to allow RPM spec
3492 file to work on SLES 9.
3494 Sun Jun 10 21:49:45 BST 2007 Olly Betts <olly@survex.com>
3496 * configure.ac: Bump version to 1.0.1.
3498 Sun Jun 10 02:16:54 BST 2007 Olly Betts <olly@survex.com>
3502 Sat Jun 09 15:20:25 BST 2007 Olly Betts <olly@survex.com>
3504 * Makefile.am,diritor.cc,diritor.h,omindex.cc: Under Linux (at least)
3505 struct dirent can tell us the type of a directory entry for some
3506 filing systems, so make use of this to avoid calling stat() (or
3507 lstat()) unnecessarily - when indexing /usr/share/doc on my Linux
3508 box, this saves about 14000 explicit calls to stat (leaving about
3511 Thu Jun 07 01:40:43 BST 2007 Olly Betts <olly@survex.com>
3515 Wed Jun 06 15:45:33 BST 2007 Olly Betts <olly@survex.com>
3517 * docs/scriptindex.txt: Document that you can delete a document by
3518 giving a new document which only contains the unique term.
3520 Mon Jun 04 16:40:18 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3522 * Makefile.am: Only add manpages to dist_man_MANS if we're not in
3523 maintainer mode with documentation generation turned off.
3525 Thu May 31 20:02:16 BST 2007 Olly Betts <olly@survex.com>
3529 Thu May 31 19:16:37 BST 2007 Olly Betts <olly@survex.com>
3531 * configure.ac: Relax automake requirement to 1.9.2 to allow RPM
3534 Wed May 30 14:42:40 BST 2007 Olly Betts <olly@survex.com>
3536 * NEWS: Update for changes since 1.0.0. Removed unused subheading
3539 Wed May 30 10:24:57 BST 2007 Olly Betts <olly@survex.com>
3541 * query.cc: Fix handling of query parsing errors (broken by changes in
3544 Tue May 29 01:19:21 BST 2007 Olly Betts <olly@survex.com>
3546 * docs/overview.txt: We no longer use pstotext for PostScript, but
3547 instead use ps2pdf followed by pdftotext, so update the docs to
3550 Fri May 18 03:36:28 BST 2007 Olly Betts <olly@survex.com>
3552 * htmlparsetest.cc,myhtmlparse.cc: Fix bug in HTML parser - if the
3553 text between tags consisted entirely of whitespace it would just be
3554 ignored which could run words together. Add regression test, plus
3555 another test for other whitespace handling.
3557 Thu May 17 22:27:47 BST 2007 Olly Betts <olly@survex.com>
3559 * NEWS: Final update before release.
3561 Thu May 17 20:48:25 BST 2007 Olly Betts <olly@survex.com>
3565 Thu May 17 20:46:43 BST 2007 Olly Betts <olly@survex.com>
3567 * docs/termprefixes.txt: Update to include 'Z' prefix and mention
3568 that 'R' and 'W' aren't used by Xapian now.
3570 Thu May 17 19:11:04 BST 2007 Olly Betts <olly@survex.com>
3572 * configure.ac: Bump version to 1.0.0.
3574 Thu May 17 18:11:19 BST 2007 Olly Betts <olly@survex.com>
3576 * common/: Update to latest xapian-core revision to pull in 2 argument
3577 mkdir() wrapper for Mingw.
3579 Thu May 17 03:29:44 BST 2007 Olly Betts <olly@survex.com>
3581 * Makefile.am,configure.ac: Add support for --disable-documentation
3582 like xapian-core now has.
3583 * configure.ac: Only enable -Werror on --enable-maintainer-mode for
3584 GCC 4 or newer, in line with change in xapian-core.
3586 Thu May 17 03:22:10 BST 2007 Olly Betts <olly@survex.com>
3588 * NEWS: Update for 1.0.0.
3590 Wed May 16 03:09:44 BST 2007 Olly Betts <olly@survex.com>
3594 Tue May 15 18:50:47 BST 2007 Olly Betts <olly@survex.com>
3596 * configure.ac: Add AC_TYPE_PID_T.
3598 Tue May 15 04:22:40 BST 2007 Olly Betts <olly@survex.com>
3600 * omindex.cc: Remove FIXME comment which has already been addressed.
3602 Mon May 14 04:38:49 BST 2007 Olly Betts <olly@survex.com>
3604 * docs/omegascript.txt: Update docs for $prettyterm{TERM}.
3606 Mon May 14 04:31:01 BST 2007 Olly Betts <olly@survex.com>
3608 * omega.cc,omega.h,query.cc,query.h: Rejig how $topterms and other
3609 cases handle terms to fit with the new term generation scheme.
3610 Add 'you' and 'your' as stopwords.
3612 Thu May 10 04:48:43 BST 2007 Olly Betts <olly@survex.com>
3614 * ./: Update svn:externals to pull in r8538 of xapian-core's common
3616 * Makefile.am: Add common/safe.cc to scriptindex_SOURCES.
3618 Thu May 10 01:09:14 BST 2007 Olly Betts <olly@survex.com>
3620 * templates/,Makefile.am: The 'query' template no longer uses
3621 $topterms by default - to get them, use the new 'topterms' template.
3622 Also the template fragments which aren't intended for direct use
3623 have been moved to templates/inc/.
3624 * docs/overview.txt: Document what each of the OmegaScript templates
3626 * docs/quickstart.txt: Assorted minor improvements.
3627 * xapian-omega.spec.in: Update to install templates/inc too.
3629 Wed May 09 23:43:57 BST 2007 Olly Betts <olly@survex.com>
3631 * docs/omegascript.txt,query.cc: Instead of appending a dot to
3632 indicate a stemmed term, wrap the term in double quotes.
3634 Sun May 06 21:41:21 BST 2007 Olly Betts <olly@survex.com>
3636 * omindex.cc,scriptindex.cc: Remove commented-out code for generating
3637 "W" prefix terms for date searching. We've never made use of them
3638 in Omega, and we'll be moving to using DateMatchDecider by default
3641 Sun May 06 16:00:47 BST 2007 Olly Betts <olly@survex.com>
3643 * configure.ac: Set version to mythical 0.9.99.
3645 Sun May 06 15:52:08 BST 2007 Olly Betts <olly@survex.com>
3647 * Makefile.am,configure.ac,omega.spec.in,xapian-omega.spec.in:
3648 Update RPM spec file to reflect tarball name change from omega
3649 to xapian-omega (patch from Fabrice Colin). Also rename omega.spec
3650 to xapian-omega.spec (rpmbuild looks for any .spec file, but it's
3651 more consistent to keep the names in step).
3653 Fri May 04 19:52:44 BST 2007 Olly Betts <olly@survex.com>
3655 * omindex.cc,scriptindex.cc: Use new TermGenerator convenience methods
3656 which take std::string instead of Utf8Iterator.
3658 Fri May 04 13:32:11 BST 2007 Olly Betts <olly@survex.com>
3660 * Makefile.am,configure.ac,makemanpage.in: Use makemanpage to generate
3663 Fri May 04 13:30:36 BST 2007 Olly Betts <olly@survex.com>
3665 * commonhelp.cc: Add missing full stop in description of --stemmer.
3667 Fri May 04 04:10:23 BST 2007 Olly Betts <olly@survex.com>
3669 * query.cc: Explicitly include stdlib.h since we use atoi().
3671 Thu May 03 15:16:31 BST 2007 Olly Betts <olly@survex.com>
3673 * Makefile.am,indextext.cc,indextext.h,omindex.cc,scriptindex.cc:
3674 Update to use new TermGenerator class.
3676 Thu May 03 04:03:35 BST 2007 Olly Betts <olly@survex.com>
3678 * ./: Update svn:externals to pull rev8430 of xapian-core's common
3680 * scriptindex.cc: Remove sleep() wrapper.
3682 Wed May 02 03:26:38 BST 2007 Olly Betts <olly@survex.com>
3684 * docs/omegascript.txt,query.cc: Removed $freqs as it has been
3685 deprecated for ages.
3687 Wed May 02 03:19:18 BST 2007 Olly Betts <olly@survex.com>
3689 * docs/scriptindex.txt: Explicitly note that index=nopos is deprecated
3690 (scriptindex already emits a warning).
3692 Wed May 02 03:17:03 BST 2007 Olly Betts <olly@survex.com>
3694 * docs/cgiparams.txt: FMT isn't limited to just `a-z' - the
3695 actual restriction is that it may not contain `..'.
3697 Wed May 02 03:02:53 BST 2007 Olly Betts <olly@survex.com>
3699 * scriptindex.cc: Remove -q and -u options - they no longer do
3700 anything and are only accepted for compatibility with really old
3701 versions (0.6.1 and earlier and 0.7.5 and earlier respectively).
3703 Wed Apr 25 21:47:48 BST 2007 Olly Betts <olly@survex.com>
3705 * Makefile.am: omega doesn't need indextext.cc.
3707 Wed Apr 25 21:46:25 BST 2007 Olly Betts <olly@survex.com>
3709 * query.cc: Remove unused `#include "indextext.h"'.
3711 Wed Apr 25 02:37:15 BST 2007 Olly Betts <olly@survex.com>
3713 * Makefile.am,configure.ac: Add support like xapian-core has for
3714 `configure --enable-quiet', `make QUIET=' and `make QUIET=y'.
3716 Mon Apr 23 15:42:24 BST 2007 Olly Betts <olly@survex.com>
3718 * date.cc,datematchdecider.cc,utils.cc: Fix compilation with GCC 4.3
3721 Mon Apr 23 15:38:00 BST 2007 Olly Betts <olly@survex.com>
3723 * portability/mkdtemp.cc: config.h should always be included first and
3724 with angle brackets. Use safeerrno.h not errno.h. No special
3725 headers are required here for __CYGWIN__, and safesysstat.h provides
3726 a two argument wrapper for mkdir, so we don't need any
3727 __WIN32__-specific magic either.
3729 Mon Apr 23 12:14:01 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3731 * portability/mkdtemp.cc: Patch from Charlie Hull to fix windows
3733 * scriptindex.cc: #include <time.h> in scriptindex.cc for
3736 Sat Apr 21 23:31:02 BST 2007 Olly Betts <olly@survex.com>
3738 * strcasecmp.h: New header containing magic to provide strcasecmp()
3740 * query.cc,utf8convert.cc: Use strcasecmp.h.
3741 * Makefile.am,cdb_init.cc,cdb_int.h,configfile.cc,getopt.cc,
3742 loadfile.cc,md5wrap.cc,omega.cc,omindex-config.cc,omindex.cc,
3743 query.cc,scriptindex.cc,utf8convert.cc: Add xapian-core's common/
3744 subdirectory as an svn:external so we can (a) share copies of
3745 gnu_getopt.h and getopt.cc and (b) make use of the "safeunistd.h"
3748 Sat Apr 21 23:06:49 BST 2007 Olly Betts <olly@survex.com>
3750 * metaxmlparse.cc,metaxmlparse.h: Fix summary comments at the top of
3753 Sat Apr 21 20:42:03 BST 2007 Olly Betts <olly@survex.com>
3755 * omindex.cc: xapian.h no longer pulls in time.h, which exposes that
3756 we weren't explicitly including it here!
3758 Sat Apr 21 20:27:43 BST 2007 Olly Betts <olly@survex.com>
3760 * configure.ac: We require automake 1.9.5 for xapian-core, so require
3761 it here too for consistency. Turn on automake -Wportability option.
3763 Sat Apr 21 20:24:17 BST 2007 Olly Betts <olly@survex.com>
3765 * configure.ac: Probe for ssize_t and mode_t and define replacements
3766 if we don't find them.
3768 Fri Apr 20 14:38:57 BST 2007 Olly Betts <olly@survex.com>
3770 * datematchdecider.h,omega.h,datematchdecider.cc: Update return
3771 types of MatchDecider and ExpandDecider subclasses.
3773 Wed Apr 18 23:44:36 BST 2007 Olly Betts <olly@survex.com>
3775 * utf8convert.cc: Fix to compile when USE_ICONV isn't defined (to_utf8
3776 is now in the Xapian::Unicode namespace).
3778 Wed Apr 18 23:15:26 BST 2007 Olly Betts <olly@survex.com>
3780 * docs/cgiparams.txt,query.cc: Remove "bias_weight" and
3781 "bias_halflife" CGI parameters since they rely on
3782 Enquire::set_bias() which has been removed.
3784 Tue Apr 17 21:45:40 BST 2007 Richard Boulton <richard@lemurconsulting.com>
3786 * Makefile.am: Link htmlparsetest with Xapian library to get access
3789 Tue Apr 17 02:22:42 BST 2007 Olly Betts <olly@survex.com>
3791 * htmlparse.cc: nonascii_to_utf8 is now in the public API.
3793 Tue Apr 17 00:55:17 BST 2007 Olly Betts <olly@survex.com>
3795 * Makefile.am,htmlparse.cc,indextext.cc,indextext.h,query.cc,sample.cc,
3796 scriptindex.cc,tclUniData.cc,tclUniData.h,utf8convert.cc,utf8itor.cc,
3797 utf8itor.h,utf8test.cc: Use the new Unicode API routines in the core
3798 Xapian library instead of local copies.
3800 Thu Apr 12 17:04:07 BST 2007 Olly Betts <olly@survex.com>
3802 * Makefile.am: omega and scriptindex both need tclUniData.cc.
3804 Sat Mar 31 19:58:29 BST 2007 Olly Betts <olly@survex.com>
3806 * query.cc: $filesize{0} is now "0 bytes", $filesize{1} is now "1
3807 byte", $filesize{SIZE} where SIZE is negative is now "". Fix
3808 "comparison of signed and unsigned" warning. Use "%c" to generate
3809 the fractional part.
3810 * docs/omegascript.txt: Document that $filesize{SIZE} is "" when SIZE
3813 Sat Mar 31 18:25:55 BST 2007 Olly Betts <olly@survex.com>
3815 * query.cc: Ensure that the result of snprintf is zero terminated
3816 since MSVC's snprintf is broken (by design it seems).
3817 * query.cc,docs/omegascript.txt: $filesize enhanced to return a
3818 decimal point for K, M, and G (e.g. "2.1K" and "4.0M" rather than
3821 Fri Mar 30 19:57:00 BST 2007 Olly Betts <olly@survex.com>
3823 * portability/mkdtemp.cc: Fixes for mingw.
3825 Fri Mar 30 02:22:59 BST 2007 Olly Betts <olly@survex.com>
3827 * Makefile.am,scriptindex.cc,utf8truncate.cc,utf8truncate.h: The
3828 "truncate" action now knows not to chop off a multibyte utf-8
3831 Fri Mar 30 02:19:05 BST 2007 Olly Betts <olly@survex.com>
3833 * Makefile.am,omindex.cc,sample.cc,sample.h: New sample generating
3834 function which normalises all runs of whitespace to a single space,
3835 and fixes invalid utf-8 in the sample. This means we can now index
3836 an iso-8859-1 text file and mostly get the same results as if it
3839 Thu Mar 29 23:12:20 BST 2007 Olly Betts <olly@survex.com>
3841 * scriptindex.cc: Fix optimisation of "load truncate=N" to actually
3844 Thu Mar 29 18:54:11 BST 2007 Olly Betts <olly@survex.com>
3846 * configure.ac: Probe for mkdtemp.
3847 * Makefile.am: Add portability/mkdtemp.cc to omindex_SOURCES if
3848 configure didn't detect it.
3849 * omindex.cc: Prototype mkdtemp if configure didn't detect it.
3851 Thu Mar 29 18:47:50 BST 2007 Olly Betts <olly@survex.com>
3853 * portability/mkdtemp.cc: Fix to compile as C++. Replace isdigit()
3854 with a simple range test to avoid locale related quirks.
3856 Thu Mar 29 18:28:25 BST 2007 Olly Betts <olly@survex.com>
3858 * portability/mkdtemp.cc: Add portable implementation of mkdtemp for
3859 use on platforms which don't supply it.
3861 Thu Mar 29 17:22:18 BST 2007 Olly Betts <olly@survex.com>
3863 * omindex.cc: Index PostScript by converting to PDF with ps2pdf and
3864 then indexing that. This allows us to index PostScript files
3865 containing Unicode characters outside of iso-8859-1, and also
3866 means we now get metadata from PostScript files.
3868 Thu Mar 29 03:14:55 BST 2007 Olly Betts <olly@survex.com>
3870 * omega.spec.in: Update to handle documentation being installed in
3871 $prefix/share/doc/xapian-omega.
3873 Tue Mar 27 21:42:19 BST 2007 Olly Betts <olly@survex.com>
3875 * configure.ac: datarootdir is new in 2.60 too, so use datadir when
3876 setting docdir for 2.59.
3878 Mon Mar 26 15:47:53 BST 2007 Olly Betts <olly@survex.com>
3880 * configure.ac: Add code to ensure that docdir is set for autoconf
3881 2.59 (starting from 2.60, it is defined as standard).
3882 * Makefile.am: Use docdir for installing docs. This means that the
3883 documentation now goes in $prefix/share/doc/xapian-omega rather
3884 than $prefix/share/doc/omega, which is better really.
3886 Sat Mar 24 17:21:32 GMT 2007 Olly Betts <olly@survex.com>
3888 * query.cc: Prefer static char[] to static char * (gives better
3891 Sat Mar 24 17:19:18 GMT 2007 Olly Betts <olly@survex.com>
3893 * omega.cc: Prefer static char[] to static char * (gives better
3896 Sat Mar 24 17:16:49 GMT 2007 Olly Betts <olly@survex.com>
3898 * configfile.cc: Prefer static char[] to static char * (gives better
3901 Thu Mar 22 01:11:52 GMT 2007 Olly Betts <olly@survex.com>
3903 * configure.ac: Eliminate libtool probe code for f77, gcj, and rc
3904 which speeds up configure and knocks 29% off its size.
3906 Tue Mar 06 01:56:00 GMT 2007 Olly Betts <olly@survex.com>
3908 * configure.ac: Bump version number to 0.9.10 so that snapshots don't
3909 look older than releases.
3911 Sun Mar 04 14:42:18 GMT 2007 Olly Betts <olly@survex.com>
3913 * TODO: Remove entries which have already been done!
3915 Sat Mar 03 02:24:42 GMT 2007 Olly Betts <olly@survex.com>
3917 * utf8test.cc: Add single utf-8 sequence decoding tests.
3919 Fri Mar 02 00:18:09 GMT 2007 Olly Betts <olly@survex.com>
3921 * configure.ac: Perform a link test for posix_fadvise to fix
3922 misdetection on HP-UX.
3924 Thu Mar 01 21:48:57 GMT 2007 Olly Betts <olly@survex.com>
3926 * utf8itor.h: Add cast to suppress warning from aCC.
3928 Thu Mar 01 21:00:56 GMT 2007 Olly Betts <olly@survex.com>
3930 * configure.ac: Check we can link with libiconv, not just compile.
3931 Some of the HP-UX hosts in the HP testdrive seem to have headers
3932 but no matching library.
3934 Thu Mar 01 18:02:37 GMT 2007 Olly Betts <olly@survex.com>
3936 * myhtmlparse.cc: Remove unused function. Move "#include <string.h>"
3939 Thu Feb 22 15:45:25 GMT 2007 Olly Betts <olly@survex.com>
3941 * configure.ac: xapian-config --cxxflags now includes -ptused for
3942 SGI's C++ compiler, so we don't need to probe for it here.
3944 Wed Feb 21 15:17:07 GMT 2007 Olly Betts <olly@survex.com>
3946 * docs/termprefixes.txt: Expand section on boolean prefixes, showing
3947 how to generate them using scriptindex, and how to allow them to be
3948 selected in an HTML form.
3950 Mon Feb 19 12:51:24 GMT 2007 Olly Betts <olly@survex.com>
3952 * configure.ac: Previous fix doesn't work. Just drop -O2 instead -
3953 users of SGI's CC can specify "./configure CXXFLAGS=-O2" if they
3956 Sun Feb 18 21:44:09 GMT 2007 Olly Betts <olly@survex.com>
3958 * configure.ac: For SGI's CC, -g overrides -g3 if it comes afterwards,
3959 so we need to modify CXXFLAGS rather than just setting AM_CXXFLAGS.
3961 Sat Feb 17 19:25:04 GMT 2007 Olly Betts <olly@survex.com>
3963 * docs/overview.txt,omindex.cc: Add support for indexing MS Works
3964 documents using wps2text (part of libwps).
3966 Sat Feb 17 19:06:03 GMT 2007 Olly Betts <olly@survex.com>
3968 * omindex.cc: Don't index empty files.
3970 Fri Feb 16 21:14:35 GMT 2007 Olly Betts <olly@survex.com>
3972 * NEWS: Add note that Omega < 0.8.0 NEWS entries are in the
3973 xapian-core NEWS file.
3975 Fri Feb 16 20:34:10 GMT 2007 Olly Betts <olly@survex.com>
3977 * indextext.cc: Now I've fixed the bug in UTF-8 decoding, the check
3978 for zero length terms is no longer required.
3980 Fri Feb 16 19:34:48 GMT 2007 Olly Betts <olly@survex.com>
3982 * tclUniData.h,utf8itor.h: The tcl unicode routines only have tables
3983 for characters in the BMP. For other characters, assume they're
3984 word characters, but can't be forced to lowercase.
3986 Fri Feb 16 19:19:11 GMT 2007 Olly Betts <olly@survex.com>
3988 * utf8itor.cc: Fix bug in decoding of 4 byte utf-8 sequences
3989 - the returned value was 0x400000 too large! Fixes bug#106.
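For illustration, here is a minimal sketch of decoding a 4 byte UTF-8
sequence. It is not the code from utf8itor.cc; the function name
decode_utf8_4byte is hypothetical, and the caller is assumed to have
already checked that the sequence is well-formed. The point it shows is
that only the low 3 bits of the lead byte carry data, so keeping any
extra lead-byte bits would make the result too large.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the Omega implementation.
// Assumes p points at a well-formed 4 byte sequence (lead byte 0xF0..0xF7).
inline std::uint32_t decode_utf8_4byte(const unsigned char* p) {
    // Precondition: the lead byte has the bit pattern 11110xxx.
    assert((p[0] & 0xf8) == 0xf0);
    // Lead byte contributes 3 bits, each continuation byte 6 bits.
    return (static_cast<std::uint32_t>(p[0] & 0x07) << 18) |
           (static_cast<std::uint32_t>(p[1] & 0x3f) << 12) |
           (static_cast<std::uint32_t>(p[2] & 0x3f) << 6) |
           static_cast<std::uint32_t>(p[3] & 0x3f);
}
```

With this sketch, the bytes F0 A8 A8 8F decode to 0x28a0f.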
3991 Thu Feb 15 19:42:36 GMT 2007 Olly Betts <olly@survex.com>
3993 * indextext.cc,query.cc: Keep embedded apostrophes in terms rather
3994 than relying on generating a phrase search for them.
3996 Thu Feb 15 05:38:12 GMT 2007 Olly Betts <olly@survex.com>
3998 * Makefile.am,datematchdecider.cc,datematchdecider.h,
3999 docs/cgiparams.txt,query.cc: Add an alternative implementation
4000 of date range filtering which uses a MatchDecider. This allows
4001 everything that the existing implementation does, plus you can
4002 support sorting on a choice of dates (e.g. first published or
4003 last updated), and filtering works to a resolution of a minute
4004 rather than a day. Since omindex now adds the last modified
4005 date as value 0, this will work with omindex.
4007 Thu Feb 15 04:38:32 GMT 2007 Olly Betts <olly@survex.com>
4009 * configure.ac: SGI's CC needs -g3 instead of -g if we want to use
4012 Sat Feb 10 20:53:14 GMT 2007 Olly Betts <olly@survex.com>
4014 * md5.cc: Fix reversed preprocessor conditional so that we generate
4015 correct MD5 checksums on big endian platforms.
4017 Sat Feb 10 20:19:23 GMT 2007 Olly Betts <olly@survex.com>
4019 * md5.cc: No need to byte swap when we've just zero filled!
4021 Sat Feb 10 18:54:33 GMT 2007 Olly Betts <olly@survex.com>
4023 * indextext.cc,query.cc: Prefer Xapian::Stem::operator() to
4024 Xapian::Stem::stem_word().
4026 Fri Feb 09 05:53:29 GMT 2007 Olly Betts <olly@survex.com>
4028 * docs/omegascript.txt: Rewrite introductory paragraph. Note that
4029 whitespace is significant, and add explicit warning to $setmap.
4031 Mon Jan 1 01:56:56 GMT 2007 Richard Boulton <richard@lemurconsulting.com>
4033 * indextext.cc: Fix parsing of text containing certain unicode
4034 characters. Such text could have resulted in zero length terms
4035 being added to documents. (The minimal example I found causing
4036 this problem was a document containing only the unicode character
4037 0x28a0f, which is a CJK Unified Ideograph).
4039 Addresses bug #106, though may not be a complete fix - see the
4042 Sun Dec 31 17:22:56 GMT 2006 Richard Boulton <richard@lemurconsulting.com>
4044 * scriptindex.cc: Update short option list for scriptindex to match
4045 documented usage (-h, -V and -s were not working).
4047 Thu Dec 21 14:57:28 GMT 2006 Olly Betts <olly@survex.com>
4049 * query.cc: Remove support for xB, xDATE1, xDATE2, xDAYSMINUS,
4050 and xDEFAULTOP which were deprecated in favour of xFILTER in
4051 0.7.5 (over 3 years ago).
4053 Thu Dec 21 14:52:38 GMT 2006 Olly Betts <olly@survex.com>
4055 * docs/cgiparams.txt: Remove documentation of the removed deprecated
4058 Thu Dec 21 14:39:04 GMT 2006 Olly Betts <olly@survex.com>
4060 * omega.cc,query.cc: Remove deprecated aliases for CGI parameters
4061 (deprecated in 0.6.3 or 0.6.5, more than 3.5 years ago):
4062 RAW_SEARCH (now RAWSEARCH), DATE1 (now START), DATE2 (now END),
4063 DAYSMINUS (now SPAN but with slightly different semantics),
4064 and MIN_HITS (now MINHITS).
4066 Thu Dec 21 01:04:00 GMT 2006 Olly Betts <olly@survex.com>
4068 * utf8convert.cc: Fix headers included for iconv and not-iconv.
4070 Wed Dec 20 23:53:41 GMT 2006 Olly Betts <olly@survex.com>
4072 * configure.ac,utf8convert.cc: If iconv isn't found by configure, fall
4073 back on simple conversion routines which handle iso-8859-1.
4074 Configuring --without-iconv forces these routines to be used.
4075 Configuring --with-iconv forces configure to fail if it can't find
4078 Tue Dec 19 20:35:04 GMT 2006 Olly Betts <olly@survex.com>
4080 * utf8itor.h: Need <string.h> for strlen.
4082 Tue Dec 19 19:53:52 GMT 2006 Olly Betts <olly@survex.com>
4084 * Makefile.am,configure.ac: Add "-liconv" if it's needed. If we're on
4085 OS X, also check for libiconv installed with fink.
4087 Fri Dec 15 05:43:40 GMT 2006 Olly Betts <olly@survex.com>
4089 * values.h: Add include guard.
4091 Sun Dec 10 04:33:26 GMT 2006 Olly Betts <olly@survex.com>
4093 * query.cc: Fix $substr{} with negative start to actually work. Fix
4094 $substr{} to never cause a C++ exception.
4095 * docs/omegascript.txt,query.cc: Enhance $substr{} to accept a
4096 negative length (meaning to count back from the end of the string).
4098 Sun Dec 10 03:05:09 GMT 2006 Olly Betts <olly@survex.com>
4100 * commonhelp.cc: "--help" now says that the default stemming language
4103 Thu Nov 16 23:06:25 GMT 2006 Olly Betts <olly@survex.com>
4105 * docs/omegascript.txt,query.cc,utils.cc,utils.h: Add $weight command
4106 to OmegaScript which returns the raw document weight - mostly useful
4107 for debugging purposes.
4109 Thu Nov 16 04:02:10 GMT 2006 Olly Betts <olly@survex.com>
4111 * omega.spec.in: Remove "." from the end of the Summary.
4113 Thu Nov 16 03:03:25 GMT 2006 Olly Betts <olly@survex.com>
4115 * configure.ac: As of xapian-core 0.8.0, XO_LIB_XAPIAN doesn't need to
4116 be called with arguments if you want a hard requirement on xapian,
4117 so remove the arguments.
4119 Thu Nov 16 02:07:31 GMT 2006 Olly Betts <olly@survex.com>
4121 * configure.ac: Change the project name to "xapian-omega" since that's
4122 what the RPMs and Debian packages call it (there's a Rogue-like game
4125 Thu Nov 16 02:01:55 GMT 2006 Olly Betts <olly@survex.com>
4127 * omega.cc: Fix backwards setting of sort_after. Fix generation of
4128 sort setup flags for filters.
4130 Thu Nov 16 01:21:32 GMT 2006 Olly Betts <olly@survex.com>
4132 * docs/cgiparams.txt,omega.cc,omega.h,query.cc: Implement new CGI
4133 parameters for finer control of sorting and ranking - SORTAFTER
4135 * omega.cc: Set up the filters variable so we know to revert to
4136 page 1 if the sorting options are changed.
4138 Tue Nov 14 15:27:09 GMT 2006 Olly Betts <olly@survex.com>
4140 * md5test.cc: Need <stdio.h> for sprintf.
4142 Tue Nov 14 03:19:13 GMT 2006 Olly Betts <olly@survex.com>
4144 * configure.ac: Note a couple of platforms which take the different
4147 Tue Nov 14 03:16:37 GMT 2006 Olly Betts <olly@survex.com>
4149 * configure.ac,utf8convert.cc: The input pointer to iconv can be
4150 either "char **" or "const char **" so probe at configure time.
4152 Mon Nov 13 20:22:50 GMT 2006 Olly Betts <olly@survex.com>
4154 * utf8convert.cc: Need <algorithm> for swap().
4156 Mon Nov 13 02:27:51 GMT 2006 Olly Betts <olly@survex.com>
4158 * Makefile.am,md5test.cc: Add tests for md5 code.
4160 Mon Nov 13 02:06:51 GMT 2006 Olly Betts <olly@survex.com>
4162 * Merge in utf8 branch:
4164 Fri Sep 15 06:03:50 BST 2006 Olly Betts <olly@survex.com>
4166 * utf8convert.cc: Compilation fix for Sun C++.
4168 Thu Sep 14 23:55:20 BST 2006 Olly Betts <olly@survex.com>
4170 * Makefile.am,htmlparse.cc,htmlparse.h,indextext.cc,
4171 indextext.h,makesymboltabh.pl,myhtmlparse.cc,myhtmlparse.h,
4172 namedentities.h,omindex.cc,query.cc,scriptindex.cc,
4173 symboltab.h,tclUniData.cc,tclUniData.h,utf8convert.cc,
4174 utf8convert.h,utf8itor.cc,utf8itor.h,utf8test.cc: Convert
4177 Thu Nov 09 00:20:19 GMT 2006 Olly Betts <olly@survex.com>
4179 * NEWS,configure.ac: Update for 0.9.9.
4181 Wed Nov 08 22:45:10 GMT 2006 Olly Betts <olly@survex.com>
4183 * omega.spec.in: Run "autoreconf --force" to avoid rpath on x86_64
4186 Sun Nov 05 17:08:48 GMT 2006 Olly Betts <olly@survex.com>
4188 * scriptindex.cc: The "date" action was modifying the value it
4189 operated on, which it isn't meant to do - fixed.
4191 Sun Nov 05 02:25:48 GMT 2006 Olly Betts <olly@survex.com>
4193 * query.cc: Report an error if $setmap is called with an even number
4196 Thu Nov 02 16:08:27 GMT 2006 Olly Betts <olly@survex.com>
4198 * NEWS,configure.ac: Update for 0.9.8.
4200 Thu Nov 02 15:43:31 GMT 2006 Olly Betts <olly@survex.com>
4202 * configure.ac: Update comment about "-ptused".
4204 Wed Nov 01 16:23:13 GMT 2006 Olly Betts <olly@survex.com>
4206 * cdb_init.cc: Fix warning in mingw build.
4208 Wed Nov 01 13:43:54 GMT 2006 Olly Betts <olly@survex.com>
4210 * cdb_init.cc,query.cc: Fix warnings.
4212 Wed Nov 01 04:00:20 GMT 2006 Olly Betts <olly@survex.com>
4214 * md5.cc,md5.h: Fix warnings about changing alignment requirements
4215 when casting pointers.
4217 Tue Oct 31 02:47:23 GMT 2006 Olly Betts <olly@survex.com>
4219 * cdb_init.cc,configure.ac,getopt.cc,omega.cc,query.cc,scriptindex.cc:
4220 Enable more warnings for GCC (and fix them in the code). Enable
4221 appropriate warnings for Intel's C++ compiler.
4223 Tue Oct 31 00:02:19 GMT 2006 Olly Betts <olly@survex.com>
4225 * htmlparsetest.cc,omindex.cc: Fix GCC warnings.
4227 Mon Oct 30 23:57:09 GMT 2006 Olly Betts <olly@survex.com>
4229 * query.cc: $substr where the start is negative and longer than the
4230 string (e.g. $substr{abcd,-5,1}) should now work as intended.
4232 Mon Oct 30 21:02:18 GMT 2006 Olly Betts <olly@survex.com>
4234 * scriptindex.cc: Fix GCC warnings uncovered by actually substituting
4237 Mon Oct 30 21:01:26 GMT 2006 Olly Betts <olly@survex.com>
4239 * configure.ac: Actually substitute AM_CXXFLAGS in the Makefile.
4240 * configure.ac: Fix AM_CXXFLAGS for IRIX.
4242 Sat Oct 28 12:31:31 BST 2006 Olly Betts <olly@survex.com>
4244 * myhtmlparse.cc: Add missing "#include <ctype.h>".
4246 Sat Oct 28 02:23:09 BST 2006 Olly Betts <olly@survex.com>
4248 * htmlparse.cc,indextext.cc,indextext.h,myhtmlparse.cc,omega.cc,
4249 omega.h,omindex.cc,query.cc,scriptindex.cc: Ensure that we always
4250 pass an unsigned char value to isupper(), toupper(), etc as they
4251 are undefined on other values (glibc makes them work for signed
4252 char values too, but this is an extension).
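As a rough sketch of the pattern this entry describes (not the actual
Omega code; the helper name to_upper_ascii is hypothetical), each char
is cast to unsigned char before it is handed to toupper(), so a char
with a negative value is never passed in:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, for illustration only.
inline std::string to_upper_ascii(const std::string& s) {
    std::string result;
    result.reserve(s.size());
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        // Cast to unsigned char first: toupper() is only defined for
        // values representable as unsigned char (and EOF).
        unsigned char ch = static_cast<unsigned char>(s[i]);
        result += static_cast<char>(std::toupper(ch));
    }
    return result;
}
```

The same cast applies to isupper(), isalnum() and the other <ctype.h>
functions.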
4254 Fri Oct 27 00:36:34 BST 2006 Olly Betts <olly@survex.com>
4256 * configure.ac,md5.h,values.h: HAVE_STDINT_H is already defined
4257 by autoconf based on trying the C compiler with AC_CHECK_HEADERS
4258 so define HAVE_WORKING_STDINT_H instead.
4260 Wed Oct 25 01:36:43 BST 2006 Olly Betts <olly@survex.com>
4262 * configure.ac: Need a more sophisticated test for the stdint.h
4265 Tue Oct 24 02:12:13 BST 2006 Olly Betts <olly@survex.com>
4267 * metaxmlparse.cc,omega.h: Fix warnings from SGI's C++ compiler.
4269 Tue Oct 24 02:11:11 BST 2006 Olly Betts <olly@survex.com>
4271 * htmlparse.cc,query.cc,scriptindex.cc: Remove unused static
4274 Tue Oct 24 01:51:05 BST 2006 Olly Betts <olly@survex.com>
4276 * configure.ac: Pass magic options to SGI's C++ compiler to allow
4277 linking of templates to work.
4279 Tue Oct 24 00:46:06 BST 2006 Olly Betts <olly@survex.com>
4281 * configure.ac: IRIX doesn't allow stdint.h to be included from C++
4282 code, so we need a smarter configure test than AC_CHECK_HEADERS.
4284 Sun Oct 22 03:30:11 BST 2006 Olly Betts <olly@survex.com>
4286 * configure.ac: Tell AC_CHECK_HEADERS to suppress its backward
4287 compatibility mode, so it only checks headers with the compiler.
4288 This speeds up configure a little, and is what we do elsewhere.
4290 Tue Oct 10 17:21:13 BST 2006 Olly Betts <olly@survex.com>
4292 * NEWS: Update for actual 0.9.7 release.
4294 Mon Oct 09 18:26:14 BST 2006 Olly Betts <olly@survex.com>
4296 * docs/termprefixes.txt: "$setmap{title,S}" should be
4297 "$setmap{prefix,title,S}".
4299 Sun Oct 08 21:43:16 BST 2006 Olly Betts <olly@survex.com>
4301 * NEWS,configure.ac: Update for 0.9.7.
4303 Fri Sep 15 16:56:49 BST 2006 Olly Betts <olly@survex.com>
4305 * cgiparam.cc: Compilation fix for Sun C++.
4307 Fri Sep 15 06:00:50 BST 2006 Olly Betts <olly@survex.com>
4309 * configure.ac,query.cc: Compilation fix for Sun C++.
4311 Thu Sep 14 15:41:33 BST 2006 Olly Betts <olly@survex.com>
4313 * htmlparse.cc: Include <stdlib.h> so atoi() is prototyped.
4315 Wed Sep 13 16:37:32 BST 2006 Olly Betts <olly@survex.com>
4317 * configure.ac,md5.h,values.h: Use stdint.h if we have it.
4319 Tue Sep 12 11:57:16 BST 2006 Olly Betts <olly@survex.com>
4321 * myhtmlparse.cc: Need "#include <string.h>" for strchr.
4323 Mon Sep 11 20:24:27 BST 2006 Olly Betts <olly@survex.com>
4325 * values.h: Only want our own ntohl for MS Windows.
4327 Mon Sep 11 16:36:54 BST 2006 Olly Betts <olly@survex.com>
4329 * omega.cc,query.cc: Now xapian-config will switch Sun's C++ compiler
4330 into ANSI C++ compliant mode, so clean out all our special cased
4333 Mon Sep 11 14:23:44 BST 2006 Olly Betts <olly@survex.com>
4335 * md5.h,values.h: Apply previous fix for DJGPP too.
4337 Sun Sep 10 19:04:17 BST 2006 Olly Betts <olly@survex.com>
4339 * md5.h,values.h: Using htonl from winsock.h requires us to link
4340 with the winsock DLL, which is overkill so just add a simple
4341 implementation for htonl - we know MS Windows is little-endian.
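A simple replacement of this kind might look like the sketch below. It
is an assumption about the general shape, not the code from md5.h or
values.h, and the name htonl_little_endian is hypothetical. It is only
correct on a host known to be little-endian, where converting to
network byte order means reversing the four bytes.

```cpp
#include <cstdint>

// Hypothetical stand-in for htonl() on a little-endian host only.
inline std::uint32_t htonl_little_endian(std::uint32_t v) {
    // Reverse the byte order: host (little-endian) to network (big-endian).
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8) |
           ((v & 0xff000000u) >> 24);
}
```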
4343 Sat Sep 09 21:48:22 BST 2006 Olly Betts <olly@survex.com>
4345 * md5.h,values.h: Sigh, winsock.h uses u_long instead of uint32_t
4346 in the htonl prototype.
4348 Sat Sep 09 19:19:15 BST 2006 Olly Betts <olly@survex.com>
4350 * omindex.cc: Fix typo in previous commit.
4352 Sat Sep 09 17:11:40 BST 2006 Olly Betts <olly@survex.com>
4354 * configure.ac,omindex.cc: Mingw doesn't have sys/wait.h or
4357 Sat Sep 09 16:44:29 BST 2006 Olly Betts <olly@survex.com>
4359 * md5.h,values.h: On MS Windows, we need to #include <winsock.h>.
4361 Fri Sep 08 08:01:15 BST 2006 Olly Betts <olly@survex.com>
4363 * query.cc: Sun C++'s std::count() isn't very "std" -- it has the
4366 Fri Sep 08 03:39:14 BST 2006 Olly Betts <olly@survex.com>
4368 * md5.h,values.h: OpenBSD needs arpa/inet.h to be included before
4371 Wed Sep 06 21:31:33 BST 2006 Olly Betts <olly@survex.com>
4373 * md5wrap.cc: #include <unistd.h>
4375 Wed Sep 06 18:03:23 BST 2006 Olly Betts <olly@survex.com>
4377 * Makefile.am: Ship values.h.
4379 Wed Sep 06 03:52:27 BST 2006 Olly Betts <olly@survex.com>
4381 * configfile.cc: Changed my mind - don't allow comments on the end of
4383 * docs/overview.txt: Document that omega.conf can have comments and
4386 Wed Sep 06 03:46:16 BST 2006 Olly Betts <olly@survex.com>
4388 * configfile.cc,omega.conf: Fix code which reads omega.conf to be line
4389 based as documented rather than the wacky whitespace based scheme
4390 that was actually implemented. Allow "#" comments and blank lines
4393 Wed Sep 06 01:26:17 BST 2006 Olly Betts <olly@survex.com>
4395 * omindex.cc: If popen() fails, treat it as a read error.
4397 Wed Sep 06 00:49:47 BST 2006 Olly Betts <olly@survex.com>
4399 * omindex.cc: Fix escaping of filenames to cast characters to
4400 "unsigned char" so that isalnum() works correctly everywhere.
4401 Not a security hole as dangerous characters were still being
4404 Tue Sep 05 06:49:30 BST 2006 Olly Betts <olly@survex.com>
4406 * Makefile.am: Run htmlparsetest on "make check".
4408 Tue Sep 05 06:46:18 BST 2006 Olly Betts <olly@survex.com>
4410 * Makefile.am,htmlparse.cc,htmlparse.h,metaxmlparse.cc,metaxmlparse.h,
4411 myhtmlparse.h,omindex.cc,xmlparse.cc,xmlparse.h: Parse the XML from
4412 OpenDocument and OpenOffice using new subclasses of HtmlParser.
4413 Only extract meta.xml once.
4415 Tue Sep 05 06:45:02 BST 2006 Olly Betts <olly@survex.com>
4417 * Makefile.am,htmlparsetest.cc: Add htmlparsetest which tests the
4420 Tue Sep 05 04:36:46 BST 2006 Olly Betts <olly@survex.com>
4422 * omindex.cc: Note UTF-8 runes for pdfinfo and pdftotext.
4424 Tue Sep 05 04:29:21 BST 2006 Olly Betts <olly@survex.com>
4426 * omindex.cc: Only run pdfinfo once and pull out the
4427 fields we want using string operations, instead of
4428 running it twice filtered through sed.
4430 Tue Sep 05 03:53:00 BST 2006 Olly Betts <olly@survex.com>
4432 * htmlparse.cc,htmlparse.h: Don't get confused by "a<b" in
4433 Javascript in a <script> tag. Fixes bug#91.
4435 Sat Sep 02 04:29:12 BST 2006 Olly Betts <olly@survex.com>
4437 * omindex.cc: Call pclose() not fclose() on a FILE* obtained from
4438 popen(). If a filter program isn't installed, then don't try it
4439 again for the same extension (not perfect but an improvement -
4440 previously we indexed an empty document!)
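The popen()/pclose() pairing can be sketched as follows. This is not
the omindex code; read_command_output is a hypothetical helper, and it
assumes a POSIX system. Treating any non-zero status from pclose() as a
failure is a simplification chosen for this example.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical helper: run cmd and append its stdout to out.
// Returns false if the command could not be started or did not exit cleanly.
inline bool read_command_output(const std::string& cmd, std::string& out) {
    FILE* fh = popen(cmd.c_str(), "r");
    if (fh == NULL) return false;
    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof(buf), fh)) > 0) {
        out.append(buf, n);
    }
    // A stream from popen() must be closed with pclose(), not fclose();
    // pclose() also waits for the child and returns its wait status.
    int status = pclose(fh);
    return status == 0;
}
```

A caller could use the false return to remember that a filter failed
and skip it for later files.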
4442 Sat Sep 02 02:07:30 BST 2006 Olly Betts <olly@survex.com>
4444 * Makefile.am,configure.ac,docs/omegascript.txt,md5.cc,md5.h,
4445 md5wrap.cc,md5wrap.h,omindex.cc,query.cc,values.h: Generate
4446 an MD5 checksum of each file indexed and store it in value #1
4447 to allow duplicates to be collapsed. Add $pack and $unpack
4448 OmegaScript commands to allow big endian binary values to
4449 be encoded and decoded. Add the file last modified time
4452 Fri Sep 01 04:37:09 BST 2006 Olly Betts <olly@survex.com>
4454 * omindex.cc: Tweak comment and whitespace.
4456 Fri Sep 01 04:19:39 BST 2006 Olly Betts <olly@survex.com>
4458 * README: Update reference to "CVS" to say "SVN".
4460 Thu Aug 31 20:22:33 BST 2006 Olly Betts <olly@survex.com>
4462 * loadfile.cc: #include <algorithm> for std::min().
4464 Thu Aug 31 02:35:36 BST 2006 Olly Betts <olly@survex.com>
4466 * loadfile.cc: More missing #include-s.
4468 Thu Aug 31 01:53:31 BST 2006 Olly Betts <olly@survex.com>
4470 * loadfile.cc: Add #include <unistd.h>.
4472 Wed Aug 30 23:21:49 BST 2006 Olly Betts <olly@survex.com>
4474 * Makefile.am: Include loadfile.h in the tarball.
4476 Mon Aug 28 18:09:28 BST 2006 Olly Betts <olly@survex.com>
4478 * omindex.cc: Don't generate 'W' terms since omega doesn't use them.
4480 Mon Aug 28 03:06:46 BST 2006 Olly Betts <olly@survex.com>
4482 * query.cc,templates/query: Use '\t' to separate terms in xP since
4483 filter terms might contain '.'. Fixes bug#87.
4485 Sun Aug 27 01:36:40 BST 2006 Olly Betts <olly@survex.com>
4487 * indextext.cc: Don't generate terms with more than 3 trailing
4488 symbols ('-', '+', or '#').
4490 Sun Aug 27 01:11:45 BST 2006 Olly Betts <olly@survex.com>
4492 * omindex.cc: Added "size" field to document data; don't add "modtime"
4493 field if the timestamp is (time_t)-1.
4495 Sun Aug 27 00:36:12 BST 2006 Olly Betts <olly@survex.com>
4497 * omindex.cc,templates/query,utils.cc,utils.h: Store the file's last
4498 modified time in the document data as "modtime" so it shows up in
4499 search results (and tweak the query template so the display of this
4500 information looks nicer).
4502 Fri Aug 25 22:55:23 BST 2006 Olly Betts <olly@survex.com>
4504 * docs/overview.txt,omindex.cc: Run xls2csv on MS Excel files; run
4505 catppt on MS Powerpoint files; also index MS Word templates (.dot).
4507 Thu Aug 24 21:40:10 BST 2006 Olly Betts <olly@survex.com>
4509 * htmlparse.cc: Support htdig's "ignore this bit" comments.
4511 Thu Aug 24 12:55:26 BST 2006 Olly Betts <olly@survex.com>
4513 * query.cc: Fix $highlight{} to work with capitalised words (it used
4514 to work but regressed in 0.8.2).
4516 Thu Aug 24 12:38:50 BST 2006 Olly Betts <olly@survex.com>
4518 * Makefile.am,omindex.cc,query.cc: Use the new routines in loadfile.cc
4519 to replace code to do the same thing in omindex and omega.
4521 Thu Aug 24 12:37:16 BST 2006 Olly Betts <olly@survex.com>
4523 * scriptindex.cc: Fix handling of check whether a record has content
4524 in the case where the same field is processed more than once.
4526 Thu Aug 24 12:35:32 BST 2006 Olly Betts <olly@survex.com>
4528 * Makefile.am,docs/scriptindex.txt,loadfile.cc,loadfile.h,
4529 scriptindex.cc: Add new "load" action to allow the contents of an
4530 external file to be loaded.
4532 Thu Aug 24 12:05:23 BST 2006 Olly Betts <olly@survex.com>
4534 * configure.ac: Check for strftime.
4536 Sun Jul 09 01:40:09 BST 2006 Olly Betts <olly@survex.com>
4538 * docs/omegascript.txt: Note that (by design) an omegascript template
4539 can't contain an infinite loop.
4541 Sun May 21 11:42:54 BST 2006 Olly Betts <olly@survex.com>
4543 * Makefile.am: Make use of the dist_ prefix to avoid having to list
4544 files in EXTRA_DIST as well as in *_SCRIPTS, *_DATA, and man_MANS.
4545 * Makefile.am: Prefer $(sysconfdir) to @sysconfdir@ since the former
4546 can be overridden on the "make" command line.
4548 Sat May 20 06:16:27 BST 2006 Olly Betts <olly@survex.com>
4550 * Makefile.am,configure.ac: Specify required automake version in
4551 the call to AM_INIT_AUTOMAKE in configure.ac.
4553 Thu May 18 14:12:13 BST 2006 Olly Betts <olly@survex.com>
4555 * docs/overview.txt,docs/quickstart.txt: Use the default path to the
4556 database directories in examples. Tweak the formatting in a few
4557 places. Give a path to the omega CGI binary in the example showing
4558 how to run it from the command line.
4560 Wed May 17 15:28:01 BST 2006 Olly Betts <olly@survex.com>
4562 * omega.spec.in: Fix so that the documentation gets packaged.
4564 Tue May 16 06:56:26 BST 2006 Olly Betts <olly@survex.com>
4566 * configure.ac: Remove unused variable from snprintf testing code.
4568 Mon May 15 02:18:01 BST 2006 Olly Betts <olly@survex.com>
4570 * NEWS,configure.ac: Updated for 0.9.6.
4572 Sat May 13 20:43:08 BST 2006 Olly Betts <olly@survex.com>
4574 * configure.ac: Update snprintf detection to match xapian-core.
4576 Fri May 12 20:12:40 BST 2006 Olly Betts <olly@survex.com>
4578 * docs/omegascript.txt: Clarified description of $now.
4580 Thu Apr 27 23:45:26 BST 2006 Olly Betts <olly@survex.com>
4582 * docs/omegascript.txt,query.cc: Added new OmegaScript commands
4583 $filterterms and $substr.
4585 Thu Apr 27 18:37:50 BST 2006 Olly Betts <olly@survex.com>
4587 * scriptindex.cc: Use const reference instead of just a reference.
4589 Sun Apr 23 18:32:20 BST 2006 Olly Betts <olly@survex.com>
4591 * scriptindex.cc: Fix "index" and "indexnopos" without a prefix to
4592 set the weight correctly (bug introduced in 0.9.5).
4594 Wed Apr 19 13:37:15 BST 2006 Fabrice Colin
4596 * omega.spec.in: Create and package /var/lib/omega/cdb and
4599 Tue Apr 11 19:29:34 BST 2006 Olly Betts <olly@survex.com>
4601 * configure.ac,htmlparse.cc,query.cc,scriptindex.cc: Disable MSVC
4602 warning 4800 (on int to bool conversions) in config.h and then we
4603 can remove the "fixes" elsewhere.
4605 Mon Apr 10 16:26:08 BST 2006 Olly Betts <olly@survex.com>
4607 * date.cc,hashterm.cc,htmlparse.cc,omega.cc,omindex.cc,query.cc,
4608 scriptindex.cc: Fix MSVC7 warnings.
4610 Sat Apr 08 20:04:33 BST 2006 Olly Betts <olly@survex.com>
4612 * NEWS,configure.ac: Updated for 0.9.5.
4614 Fri Apr 07 16:45:36 BST 2006 Olly Betts <olly@survex.com>
4616 * omindex.cc,query.cc: Tweak for MSVC compilation.
4618 Fri Apr 07 03:23:22 BST 2006 Olly Betts <olly@survex.com>
4620 * omega.spec.in: Man pages may be gzipped.
4622 Thu Apr 06 14:28:08 BST 2006 Olly Betts <olly@survex.com>
4624 * README: Add pointer to documentation.
4626 Thu Apr 06 03:32:21 BST 2006 Olly Betts <olly@survex.com>
4628 * omega.spec.in: Include man pages in RPM.
4630 Thu Apr 06 03:06:56 BST 2006 Olly Betts <olly@survex.com>
4632 * Makefile.am,commonhelp.cc,commonhelp.h,configure.ac,omindex.cc,
4633 scriptindex.cc: Add man pages for omindex and scriptindex.
4635 Thu Apr 06 02:56:09 BST 2006 Olly Betts <olly@survex.com>
4637 * mbox2omega.script: Use new "hash" command.
4639 Wed Apr 05 19:29:14 BST 2006 Olly Betts <olly@survex.com>
4641 * Makefile.am,docs/scriptindex.txt,hashterm.cc,hashterm.h,
4642 omindex.cc,scriptindex.cc: Add new "hash" command to allow hashed
4643 terms to be generated from long URLs like omindex does.
4644 * htdig2omega.script: Use new "hash" command.
4645 * scriptindex.cc: Fix "useless weight" warning to not incorrectly
4646 fire when "index" or "indexnopos" has no parameter.
4648 Wed Apr 05 15:03:28 BST 2006 Olly Betts <olly@survex.com>
4650 * scriptindex.cc: Check if we successfully opened the index script
4651 and give an error if not.
4653 Fri Mar 10 05:21:13 GMT 2006 Olly Betts <olly@survex.com>
4655 * dbi2omega: Check DBIDRIVER environment variable to allow a driver
4656 other than mysql to be specified without modifying the script.
4658 Wed Mar 01 02:28:57 GMT 2006 Olly Betts <olly@survex.com>
4660 * scriptindex.cc: Don't repeat the "note" part of warnings; Warn if
4661 "unique=<prefix>" is used without a corresponding "boolean=<prefix>";
4662 Warn that "index=nopos" is deprecated and should be replaced by
4665 Tue Feb 28 23:46:57 GMT 2006 Olly Betts <olly@survex.com>
4667 * scriptindex.cc: Report a useless weight action, even if it's
4668 followed by another non-useless action (e.g. field); convert weight
4669 actions into a numeric parameter on index and indexnopos Action
4670 objects; add explanatory text "(note that actions are executed from
4671 left to right)" when reporting useless actions.
4673 Sun Feb 26 00:25:10 GMT 2006 Olly Betts <olly@survex.com>
4675 * query.cc: Fix $opt[fieldnames] handling. Previously it would try
4676 to kick in if you didn't set fieldnames but set any alphabetically
4679 Tue Feb 21 00:18:25 GMT 2006 Olly Betts <olly@survex.com>
4681 * configure.ac,NEWS: Updated for 0.9.4.
4683 Sun Feb 19 23:20:49 GMT 2006 Olly Betts <olly@survex.com>
4685 * COPYING: Updated FSF address.
4687 Thu Feb 16 00:10:22 GMT 2006 Olly Betts <olly@survex.com>
4689 * NEWS,configure.ac: Updated for 0.9.3.
4691 Wed Feb 08 13:01:15 GMT 2006 Olly Betts <olly@survex.com>
4693 * templates/query: Make the page title shorter so there's more chance
4694 it will fit on icon bars, etc.
4696 Wed Feb 08 10:08:24 GMT 2006 Olly Betts <olly@survex.com>
4698 * docs/overview.txt: Add pointer to documentation of the supported
4701 Mon Feb 06 15:19:17 GMT 2006 Olly Betts <olly@survex.com>
4703 * docs/termprefixes.txt: Fix typo.
4705 Sat Jan 14 22:40:43 GMT 2006 Olly Betts <olly@survex.com>
4707 * configure.ac: Copy over fixed snprintf checks from xapian-core.
4709 Fri Jan 13 03:21:15 GMT 2006 Olly Betts <olly@survex.com>
4711 * configure.ac: The configure test for snprintf uses memcmp, so
4712 we need to "#include <string.h>" for it to work reliably.
4714 Mon Jan 09 04:23:54 GMT 2006 Olly Betts <olly@survex.com>
4716 * date.cc,query.cc: Add "#include <stdarg.h>" where we use
4719 Mon Jan 09 04:17:54 GMT 2006 Olly Betts <olly@survex.com>
4721 * cdb_init.cc: Fix more compilation issues with cdb no-mmap code.
4723 Mon Jan 09 03:42:18 GMT 2006 Olly Betts <olly@survex.com>
4725 * omega.cc,utils.cc,utils.h: Replace remaining use of split with
4726 a direct walk of the string.
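
A minimal sketch of what walking a delimited string directly can look like, as opposed to first splitting it into a vector<string>. This is not the omega code: the helper name parse_ids and the caller-supplied separator are assumptions made for illustration only.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not from omega): collect the numbers found in a
// separator-delimited string by walking it in place, so no intermediate
// vector<string> of tokens is ever built.
std::vector<unsigned long> parse_ids(const std::string& s, char sep) {
    std::vector<unsigned long> ids;
    std::string::size_type i = 0;
    while (i < s.size()) {
        std::string::size_type j = s.find(sep, i);
        if (j == std::string::npos) j = s.size();
        if (j > i) {
            // strtoul stops at the first non-digit, i.e. at the separator
            // or at the terminating NUL, so no substring copy is needed.
            ids.push_back(std::strtoul(s.c_str() + i, nullptr, 10));
        }
        i = j + 1;  // skip past the separator; empty tokens are ignored
    }
    return ids;
}
```

The walk touches each character a bounded number of times and allocates only for the result, which is where the saving over a split-then-iterate approach would come from.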
4728 Mon Jan 09 03:19:49 GMT 2006 Olly Betts <olly@survex.com>
4730 * query.cc: Don't split strings of docids in R parameters into a
4731 vector<string> - just walk the string directly. The code is
4732 as simple, and much more efficient if a lot of documents are
4735 Mon Jan 09 02:46:34 GMT 2006 Olly Betts <olly@survex.com>
4737 * Makefile.am,date.cc,omindex.cc,query.cc,scriptindex.cc,utils.cc,
4738 utils.h: Use snprintf where available.
4740 Sun Jan 08 22:41:47 GMT 2006 Olly Betts <olly@survex.com>
4742 * cdb_init.cc: Fixed malloc-based version to compile.
4744 Sun Jan 08 21:05:46 GMT 2006 Olly Betts <olly@survex.com>
4746 * cdb_find.cc,cdb_hash.cc,cdb_unpack.cc: #include <config.h>.
4747 * configure.ac: Test for mmap.
4748 * cdb_init.cc: If mmap isn't found, and this isn't WIN32 fall back on
4749 the very crude approach of loading the whole file into a malloc-ed
4750 block. For a small cdb file, that'll give acceptable performance
4753 Fri Jan 06 21:29:37 GMT 2006 Olly Betts <olly@survex.com>
4755 * symboltab.h: Fix A after \xbf being interpreted as an overlong
4758 Fri Jan 06 21:26:57 GMT 2006 Olly Betts <olly@survex.com>
4760 * query.cc: Fix printf type mismatch on 64 bit platforms.
4762 Fri Jan 06 21:00:34 GMT 2006 Olly Betts <olly@survex.com>
4764 * docs/omegascript.txt,query.cc: Added $find{LIST,STRING}.
4766 Fri Jan 06 20:52:31 GMT 2006 Olly Betts <olly@survex.com>
4768 * symboltab.h: Write top-bit set characters using \xXX notation to
4769 avoid warnings from Intel's C++ compiler.
4771 Fri Jan 06 18:15:42 GMT 2006 Olly Betts <olly@survex.com>
4773 * query.cc: Removed unused variable.
4775 Fri Jan 06 18:14:33 GMT 2006 Olly Betts <olly@survex.com>
4777 * query.cc: Cast time_t to unsigned long to avoid problems on 64bit
4780 Fri Jan 06 18:12:38 GMT 2006 Olly Betts <olly@survex.com>
4782 * docs/omegascript.txt: Note in the $cgi description that it returns
4783 an arbitrary value if there's more than one, and pointing to
4786 Thu Jan 05 05:54:58 GMT 2006 Olly Betts <olly@survex.com>
4788 * cdb_init.cc: Fix mingw compilation.
4790 Thu Jan 05 03:24:07 GMT 2006 Olly Betts <olly@survex.com>
4792 * cdb_init.cc: Fix to hopefully compile on Solaris which has a broken
4793 sys/mman.h when used from C++.
4795 Wed Jan 04 20:44:44 GMT 2006 Olly Betts <olly@survex.com>
4797 * query.cc: Fixed to compile with GCC 3.0.
4799 Wed Jan 04 04:33:15 GMT 2006 Olly Betts <olly@survex.com>
4801 * Makefile.am,cdb.h,cdb_find.cc,cdb_hash.cc,cdb_init.cc,cdb_int.h,
4802 cdb_unpack.cc,configfile.cc,configfile.h,docs/omegascript.txt,
4803 omega.conf,query.cc: Add $lookup{CDBFILE,KEY} command to perform
4804 a lookup in a CDB file.
4806 Wed Jan 04 03:06:31 GMT 2006 Olly Betts <olly@survex.com>
4808 * docs/omegascript.txt,docs/overview.txt,query.cc: Added new feature
4809 which allows you to avoid storing fieldnames in every document
4810 (which can save a lot of disk space for a large database). Instead
4811 you just store the field values, one per line, and add something
4812 like "$set{fieldnames,$split{caption sample url}}" to the
4813 OmegaScript template to specify the fieldnames to use.
4814 * docs/omegascript.txt,query.cc: Add new "$split{}" command which
4815 splits a string to give an OmegaScript list.
4816 * query.cc: Fix $url{} to escape "+" to "%2b".
4817 * query.cc: Speed up $highlight{} - only compare terms which are the
4820 Tue Jan 03 22:38:01 GMT 2006 Olly Betts <olly@survex.com>
4822 * configfile.cc: Rename file_readable() to file_exists() to better
4823 reflect what the function actually does!
4825 Tue Jan 03 17:43:40 GMT 2006 Olly Betts <olly@survex.com>
4827 * templates/opensearch: Add missing escaping.
4829 Mon Dec 19 10:27:30 GMT 2005 Olly Betts <olly@survex.com>
4831 * Makefile.am,commonhelp.cc,commonhelp.h,docs/overview.txt,omindex.cc,
4832 scriptindex.cc: Add "--stemmer" option to omindex and scriptindex
4833 to allow the stemming language to be set.
4834 * omindex.cc,scriptindex.cc: More consistent --help and --version
4835 output. Update FSF address.
4837 Mon Dec 19 06:03:31 GMT 2005 Olly Betts <olly@survex.com>
4839 * query.cc: Explicitly use "unsigned char" when %-encoding in $url
4840 so that top bit set characters are correctly handled on platforms
4841 where char is signed by default.
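
To illustrate why the conversion to "unsigned char" matters when %-encoding, here is a small sketch. It is not omega's $url implementation: the function name percent_encode and the choice of which characters are left unescaped are assumptions.

```cpp
#include <string>

// Hypothetical helper (not omega's $url code): %-encode every byte that is
// not an ASCII letter, a digit, or one of "-_.~" (an assumed safe set).
std::string percent_encode(const std::string& in) {
    static const char hex[] = "0123456789abcdef";
    std::string out;
    for (std::string::size_type i = 0; i != in.size(); ++i) {
        // Convert to unsigned char first: where plain char is signed, a
        // top-bit-set byte such as 0xe9 would be negative, and using it
        // to index the hex table below would read out of bounds.
        unsigned char ch = static_cast<unsigned char>(in[i]);
        bool safe = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
                    (ch >= '0' && ch <= '9') ||
                    ch == '-' || ch == '_' || ch == '.' || ch == '~';
        if (safe) {
            out += static_cast<char>(ch);
        } else {
            out += '%';
            out += hex[ch >> 4];
            out += hex[ch & 0x0f];
        }
    }
    return out;
}
```

With this sketch a "+" comes out as "%2b" and a byte with the top bit set comes out as two hex digits regardless of whether char is signed on the platform.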
4843 Sun Dec 11 09:30:44 GMT 2005 Olly Betts <olly@survex.com>
4845 * templates/godmode: If a non-existent docid is specified, report the
4846 error and prompt the user to enter another docid. Fixes bug#60.
4848 Sun Dec 11 09:27:18 GMT 2005 Olly Betts <olly@survex.com>
4850 * docs/cgiparams.txt,omega.cc,omega.h,query.cc: Add "SORTREVERSE"
4851 CGI parameter which allows the sort order to be reversed when
4852 sorting on a value. Remove "SORTBANDS" CGI parameter since it
4853 no longer does anything.
4855 Sun Dec 11 09:26:14 GMT 2005 Olly Betts <olly@survex.com>
4857 * omindex.cc: Improve wording of comment.
4859 Sun Dec 11 09:22:58 GMT 2005 Olly Betts <olly@survex.com>
4861 * docs/overview.txt,omindex.cc: Add support for OpenDocument format
4862 mimetypes and extensions out of the box.
4864 Sun Dec 11 09:16:57 GMT 2005 Olly Betts <olly@survex.com>
4866 * docs/omegascript.txt,query.cc: If executing an OmegaScript command
4867 causes a Xapian exception to be thrown, catch it and copy the error
4868 message into error_msg (which is read by the $error command).
4870 Sun Dec 11 09:12:12 GMT 2005 Olly Betts <olly@survex.com>
4872 * htmlparse.cc: Tweak a few comments; "while (1)" -> "while (true)".
4874 Sun Dec 11 09:09:40 GMT 2005 Olly Betts <olly@survex.com>
4876 * docs/overview.txt: The U prefix (URL term) was grouped with the date
4877 searching prefixes, but it makes more sense to group it with the
4878 prefixes relating to parts of the URL (H for hostname, P for path,
4881 Sun Oct 02 16:28:59 BST 2005 Olly Betts <olly@survex.com>
4883 * scriptindex.cc: Use "int database_mode" (set to the value to pass to
4884 WritableDatabase's ctor) instead of "bool overwrite" to implement
4886 * scriptindex.cc: Remove code to handle "-q" as it no longer actually
4887 controls anything. Just ignore it for backwards compatibility.
4888 * scriptindex.cc: Tweak --help output to not wrap on a default
4891 Sat Sep 10 14:57:19 BST 2005 Olly Betts <olly@survex.com>
4893 * docs/omegascript.txt: Improve descriptions of $collapsed, $value,
4896 Fri Jul 29 10:05:21 BST 2005 James Aylett <james@tartarus.org>
4898 * omindex.cc: add --preserve-nonduplicates / -p option to not
4899 delete any documents that aren't updated, in replace duplicates
4900 mode (so that multiple runs of omindex on different subsites
4901 don't stomp on each other).
4903 * docs/overview.txt: update to match the above.
4905 Fri Jul 15 11:12:28 BST 2005 Olly Betts <olly@survex.com>
4907 * configure.ac: Updated for 0.9.2.
4909 Fri Jul 15 02:18:40 BST 2005 Olly Betts <olly@survex.com>
4911 * NEWS: Updated for 0.9.2.
4913 Sat Jul 02 14:56:35 BST 2005 Olly Betts <olly@survex.com>
4915 * query.cc: Work around further Sun C++ crapness.
4917 Wed Jun 29 03:19:22 BST 2005 Olly Betts <olly@survex.com>
4919 * docs/omegascript.txt,query.cc: Changed $highlight so
4920 if OPEN and CLOSE aren't specified, they default to
4921 highlighting each word from the query with a different
4922 background colour like gmane does (previous default was to use
4923 '<strong>' and '</strong>').
4924 * query.cc: Removed surplus whitespace.
4926 Fri Jun 24 02:51:38 BST 2005 Olly Betts <olly@survex.com>
4928 * query.cc: Call QueryParser::set_database() as this is now used to
4929 decide what to do for terms like "C#".
4930 * docs/omegascript.txt,docs/termprefixes.txt,query.cc: Add the
4931 ability to set boolean prefixes for the QueryParser by setting
4932 a "boolprefix" map in the omegascript template.
4934 Fri Jun 24 02:40:10 BST 2005 Olly Betts <olly@survex.com>
4936 * scriptindex.cc: Fix infinite loop if there's no newline at the end
4939 Thu Jun 23 16:42:41 BST 2005 Olly Betts <olly@survex.com>
4941 * docs/termprefixes.txt: Explain how to use termprefixes with
4942 scriptindex and omega, since that's what most people will want to
4945 Thu Jun 23 16:41:15 BST 2005 Olly Betts <olly@survex.com>
4947 * query.cc,docs/omegascript.txt: Added $length{} and $stoplist{}
4948 commands to OmegaScript.
4949 * docs/omegascript.txt: Use standard "S" prefix for title in example
4950 for $setmap, rather than "XT".
4952 Mon Jun 06 17:59:10 BST 2005 Olly Betts <olly@survex.com>
4954 * NEWS: Another 0.9.1 update.
4956 Mon Jun 06 17:52:44 BST 2005 Olly Betts <olly@survex.com>
4958 * NEWS: Updated for 0.9.1.
4960 Mon Jun 06 17:51:58 BST 2005 Olly Betts <olly@survex.com>
4962 * configure.ac: Updated for 0.9.1.
4964 Mon May 23 23:36:48 BST 2005 Fabrice Colin <fabrice.colin@gmail.com>
4966 * omega.spec.in: Updated for 0.9.0.
4968 Fri May 13 23:21:02 BST 2005 Olly Betts <olly@survex.com>
4970 * NEWS: Updated for 0.9.0.
4972 Fri May 13 00:39:44 BST 2005 Olly Betts <olly@survex.com>
4974 * configure.ac: Updated for 0.9.0.
4976 Fri May 13 00:35:21 BST 2005 Olly Betts <olly@survex.com>
4978 * scriptindex.cc: Improved handling of extra blank lines in dump file;
4979 Strip multiple \r characters from end of line; Complain if a dump
4980 file doesn't appear to have been = escaped correctly; Flush
4981 database after each input file to ensure all changes from a file
4983 * docs/omegascript.txt: Whitespace tweak.
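
As an illustration of the "strip multiple \r characters from end of line" change noted above, a minimal sketch follows. The helper name strip_trailing_cr is hypothetical and this is not the scriptindex code itself.

```cpp
#include <string>

// Hypothetical helper (not from scriptindex): remove every trailing '\r'
// from a line, not just the last one.
void strip_trailing_cr(std::string& line) {
    std::string::size_type last = line.find_last_not_of('\r');
    // If the line is empty or consists only of '\r', find_last_not_of
    // returns npos, and npos + 1 wraps round to 0, giving an empty line.
    line.resize(last + 1);
}
```

A '\r' in the middle of the line is left alone; only the run at the end is removed.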
4985 Wed May 11 02:28:41 BST 2005 Olly Betts <olly@survex.com>
4987 * NEWS: Started to update for 0.9.0.
4989 Sun May 08 02:16:07 BST 2005 Olly Betts <olly@survex.com>
4991 * query.cc: Use Query::get_terms_begin() not
4992 QueryParser::termlist_begin().
4994 Sun May 08 02:11:49 BST 2005 Olly Betts <olly@survex.com>
4996 * Makefile.am: Use AM_CPPFLAGS not CPPFLAGS (CPPFLAGS is for the
4999 Wed May 4 11:32:18 BST 2005 Richard Boulton <richard@tartarus.org>
5001 * configfile.cc: Configuration file is now looked for in various
5002 locations: the first location in which a file is found is used.
5003 Firstly, if the OMEGA_CONFIG_FILE environment variable is set,
5004 the location given in it is checked. Secondly, the file
5005 "omega.conf" in the same directory as the executable is checked.
5006 Finally, the file "${sysconfdir}/omega.conf" (eg, /etc/omega.conf
5007 on Linux) is checked. If none of these locations contain a file,
5008 default values are used.
5009 * docs/overview.txt: Update to describe new configuration file
5011 * Makefile.am: Install omega.conf to ${sysconfdir} by default.
5012 Define CONFIGFILE_SYSTEM with an appropriate value to find the
5013 system configuration file.
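
The search order described in this entry can be sketched roughly as below. This is an illustration, not configfile.cc: the function names are hypothetical, and the three arguments stand in for the value of OMEGA_CONFIG_FILE (empty if unset), the directory containing the executable, and ${sysconfdir}.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical sketch: list the locations to try, in priority order.
std::vector<std::string> config_candidates(const std::string& env_value,
                                           const std::string& exe_dir,
                                           const std::string& sysconfdir) {
    std::vector<std::string> paths;
    // 1. The environment variable, but only if it is actually set.
    if (!env_value.empty()) paths.push_back(env_value);
    // 2. omega.conf next to the executable.
    paths.push_back(exe_dir + "/omega.conf");
    // 3. The system-wide file, e.g. /etc/omega.conf.
    paths.push_back(sysconfdir + "/omega.conf");
    return paths;
}

// Return the first candidate that can be opened for reading, or an empty
// string to signal "no file found, fall back to default values".
std::string find_config(const std::vector<std::string>& candidates) {
    for (const std::string& path : candidates) {
        std::ifstream f(path.c_str());
        if (f) return path;
    }
    return std::string();
}
```

Separating "which paths to try" from "which one exists" keeps the ordering easy to check without touching the filesystem.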
5015 Wed May 4 11:20:26 BST 2005 Richard Boulton <richard@tartarus.org>
5017 * query.cc: Use new set_stemming_strategy() API method, rather than
5018 old set_stemming_options() method. The old method didn't compile
5019 because it's being passed a stemming_strategy value, which there
5020 isn't a prototype for.
5022 Fri Apr 29 10:27:05 BST 2005 Olly Betts <olly@survex.com>
5024 * scriptindex.cc: Improved comments.
5026 Fri Apr 15 03:12:02 BST 2005 Olly Betts <olly@survex.com>
5028 * docs/termprefixes.txt: Updated QueryParser prefix documentation to
5029 remove references to CVS HEAD.
5030 * docs/termprefixes.txt: Capitalise "Month" to indicate why it has
5031 prefix "M" (in line with all the other entries in the list).
5033 Fri Apr 15 02:55:06 BST 2005 Olly Betts <olly@survex.com>
5035 * indextext.cc: Generate terms like "c#".
5036 * query.cc: Highlight words like "C#".
5038 Fri Apr 15 02:53:22 BST 2005 Olly Betts <olly@survex.com>
5040 * query.cc: Clearer code for how boolean filters are added to the
5043 Wed Apr 06 02:47:14 BST 2005 Olly Betts <olly@survex.com>
5045 * omindex.cc: Tweak the hashing of URLs so that it works the same
5046 way on all platforms (previously it would depend on sizeof(long)).
5047 This means an incompatibility with any existing database built on
5048 a platform where sizeof(long) > 4 where URLs were hashed (i.e.
5049 URLs were > 228 bytes if sizeof(long) == 8), but we really want
5050 databases to be portable between platforms.
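
To show the general idea of a hash whose result does not depend on sizeof(long), here is a sketch using a fixed-width type. It deliberately uses 32-bit FNV-1a as a stand-in; it is not the hash function omindex uses, and the name portable_hash is hypothetical.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in (32-bit FNV-1a, NOT omindex's actual hash): doing
// the arithmetic in std::uint32_t means it wraps at 32 bits on every
// platform, whereas "unsigned long" would wrap at 32 or 64 bits depending
// on sizeof(long) and so give different results for the same input.
std::uint32_t portable_hash(const std::string& s) {
    std::uint32_t h = 2166136261u;               // FNV offset basis
    for (std::string::size_type i = 0; i != s.size(); ++i) {
        h ^= static_cast<unsigned char>(s[i]);   // avoid sign extension
        h *= 16777619u;                          // FNV prime
    }
    return h;
}
```

Because the width is fixed, terms generated from long URLs on a 64-bit machine would match those generated on a 32-bit one, which is the portability property the entry is after.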
5052 Wed Apr 06 02:44:58 BST 2005 Olly Betts <olly@survex.com>
5054 * omindex.cc,docs/overview.txt: Removed useless "DUPE_duplicate"
5057 Wed Apr 06 00:48:08 BST 2005 Olly Betts <olly@survex.com>
5059 * omindex.cc,docs/overview.txt: Added support for using pod2text for
5060 indexing Perl documentation.
5062 Wed Apr 06 00:25:47 BST 2005 Olly Betts <olly@survex.com>
5064 * omindex.cc,docs/overview.txt: Replace -l/--no-recurse with
5065 -l/--depth-limit which takes an argument allowing recursion
5066 to be restricted to any depth, not just 0 or infinite!
5068 Tue Apr 05 23:45:39 BST 2005 Olly Betts <olly@survex.com>
5070 * mbox2omega,mbox2omega.script,Makefile.am: Added mbox2omega which
5071 allows a mail folder to be indexed. Mostly it's an example as
5072 there's no mechanism included to show the full original message.
5074 Tue Apr 05 23:41:44 BST 2005 Olly Betts <olly@survex.com>
5076 * scriptindex.cc: Tidy up STL header includes.
5078 Tue Apr 05 23:34:36 BST 2005 Olly Betts <olly@survex.com>
5080 * docs/omegascript.txt: Clarify $field description slightly.
5082 Tue Apr 05 23:33:33 BST 2005 Olly Betts <olly@survex.com>
5084 * indextext.h: Add typedefs to allow AccentNormalisingItor to be used
5087 Tue Apr 05 00:47:52 BST 2005 Olly Betts <olly@survex.com>
5089 * docs/cgiparams.txt,docs/omegascript.txt: Fixed 3 references to
5092 Tue Apr 05 00:41:45 BST 2005 Olly Betts <olly@survex.com>
5094 * debian/.cvsignore,.cvsignore: Remove .cvsignore files, as they're
5097 Mon Mar 21 16:43:07 GMT 2005 Richard Boulton <richard@tartarus.org>
5099 * templates/opensearch: Add new template to implement basic
5100 opensearch feeds of search results.
5101 * Makefile.am: Include opensearch template in distribution.
5103 Thu Mar 03 02:20:26 GMT 2005 Olly Betts <olly@survex.com>
5105 * templates/query2: Remove Sam's unfinished rewrite of the query
5106 template. It's not been worked on for nearly two years, and we
5109 Wed Mar 02 03:09:52 GMT 2005 Olly Betts <olly@survex.com>
5111 * COPYING: Put in CVS.
5113 Tue Mar 01 02:09:35 GMT 2005 Olly Betts <olly@survex.com>
5115 * omindex.cc,docs/overview.txt: Extend -M/--mime-type to allow an
5116 existing mapping to be removed by omitting the type.
5118 Thu Feb 24 17:42:35 GMT 2005 Olly Betts <olly@survex.com>
5120 * Makefile.am: Actually ship docs/termprefixes.txt (and make it harder
5121 to fail to ship new docs in future).
5123 Thu Feb 24 02:10:09 GMT 2005 Olly Betts <olly@survex.com>
5125 * Makefile.am,docs/termprefixes.txt: Added a single document covering
5126 all aspects of term prefixes.
5128 Wed Feb 23 14:59:46 GMT 2005 Olly Betts <olly@survex.com>
5130 * docs/omegascript.txt: Moved $collapsed into correct place
5133 Wed Feb 16 03:46:51 GMT 2005 Olly Betts <olly@survex.com>
5135 * docs/cgiparams.txt,docs/overview.txt: Improved description of how
5136 B filters are handled when building the query.
5138 Wed Feb 16 03:44:24 GMT 2005 Olly Betts <olly@survex.com>
5140 * omindex.cc: Fixed so that we get lstat() prototype on Linux systems
5141 where we have posix_fadvise().
5143 Mon Jan 17 03:35:35 GMT 2005 Olly Betts <olly@survex.com>
5145 * query.cc: Corrected a comment.
5147 Mon Jan 17 03:32:25 GMT 2005 Olly Betts <olly@survex.com>
5149 * query.cc: Updated to use the new QueryParser API.
5151 Wed Jan 05 03:15:43 GMT 2005 Olly Betts <olly@survex.com>
5153 * docs/scriptindex.txt: Note that actions are applied in the specified
5156 Thu Dec 23 19:12:57 GMT 2004 Olly Betts <olly@survex.com>
5158 * INSTALL: "xapian-examples" -> "omega".
5160 Thu Dec 23 19:10:04 GMT 2004 Olly Betts <olly@survex.com>
5162 * configure.ac,NEWS: Version 0.8.5.
5164 Thu Dec 23 19:09:01 GMT 2004 Olly Betts <olly@survex.com>
5166 * INSTALL,README: Added better installation instructions.
5168 Mon Dec 20 17:26:26 GMT 2004 Olly Betts <olly@survex.com>
5170 * configure.ac,omindex.cc: Fixed "ignore symlinks" code to compile on
5171 systems without lstat (e.g. mingw).
5173 Mon Dec 20 12:18:18 GMT 2004 Olly Betts <olly@survex.com>
5175 * omindex.cc: Fix the "ignore symlinks" code to actually compile on
5176 certain Linux boxes.
5178 Mon Dec 20 11:33:59 GMT 2004 Olly Betts <olly@survex.com>
5180 * query.cc: If an exception is thrown, make sure that the HTTP headers
5181 get written so that we don't cause "500 Internal Server Error".
5182 This problem was introduced by the change to allow a user specified
5183 Content-Type in 0.8.0. Partly addresses bug#60.
5185 Fri Dec 17 22:50:01 GMT 2004 Olly Betts <olly@survex.com>
5187 * omindex.cc: Only try to delete removed documents in DUPE_replace
5190 Thu Dec 16 11:43:28 GMT 2004 Olly Betts <olly@survex.com>
5192 * scriptindex.cc: Fixed "Unknown Exception" when trying to "unhtml"
5193 text which contains "</body>" (bug#61). This bug was introduced in
5196 Thu Dec 16 11:28:25 GMT 2004 Olly Betts <olly@survex.com>
5198 * myhtmlparse.cc: <h1> - <h6> and </h1> - </h6> should leave a
5199 space in the dumped HTML.
5201 Wed Dec 15 15:53:55 GMT 2004 Richard Boulton <richard@tartarus.org>
5203 * dbi2omega: Add a comment to the start of the file detailing what
5206 Wed Dec 15 15:08:41 GMT 2004 Richard Boulton <richard@tartarus.org>
5208 * omindex.cc: Change behaviour of crawler such that it doesn't
5209 follow symbolic links any more. Add "--follow" command
5210 line option to turn following of symlinks back on.
5212 Wed Dec 08 16:31:46 GMT 2004 Olly Betts <olly@survex.com>
5214 * NEWS: Final update for 0.8.4.
5216 Tue Dec 07 18:16:32 GMT 2004 Olly Betts <olly@survex.com>
5218 * indextext.h: Fixed to compile with GCC 3.x.
5220 Tue Dec 07 18:15:39 GMT 2004 Olly Betts <olly@survex.com>
5222 * omega.cc,omindex.cc,scriptindex.cc: Use the new
5223 Database/WritableDatabase constructors.
5225 Tue Nov 30 22:02:33 GMT 2004 Olly Betts <olly@survex.com>
5227 * NEWS,configure.ac: Updated for 0.8.4 release.
5229 Wed Nov 24 04:50:52 GMT 2004 Olly Betts <olly@survex.com>
5231 * templates/godmode: Finished off godmode template.
5233 Wed Nov 24 04:12:09 GMT 2004 Olly Betts <olly@survex.com>
5235 * query.cc: If there's only a boolean query so we promote it to be
5236 the query, switch to boolean weights.
5238 Wed Nov 24 03:29:36 GMT 2004 Olly Betts <olly@survex.com>
5240 * Makefile.am,myhtmlparse.cc,myhtmlparse.h,omindex.cc,scriptindex.cc:
5241 Factored out MyHtmlParser into a separate file so it can be used
5242 in scriptindex too to give scriptindex the same improved HTML
5243 parsing which omindex just got.
5245 Wed Nov 24 02:22:49 GMT 2004 Olly Betts <olly@survex.com>
5247 * omindex.cc: Removed bogus extra line from code which was meant to
5248 truncate at a word boundary, but has never actually worked!
5250 Wed Nov 24 02:20:36 GMT 2004 Olly Betts <olly@survex.com>
5252 * omindex.cc: Improved HTML to text conversion - the parser now knows
5253 that some tags should be regarded as word breaks and some shouldn't
5254 (previously all tags were treated as word breaks).
5256 Wed Nov 24 00:22:39 GMT 2004 Olly Betts <olly@survex.com>
5258 * omindex.cc: Removed debug output; don't include \xa0 in the list of
5259 whitespace characters for now, as that's a bit character set
5262 Wed Nov 24 00:04:42 GMT 2004 Olly Betts <olly@survex.com>
5264 * omindex.cc: HTML extraction now strips leading and trailing
5265 whitespace and converts all other consecutive groups of whitespace
5268 Tue Nov 23 20:29:14 GMT 2004 Olly Betts <olly@survex.com>
5270 * Makefile.am: XAPIAN_FLAGS already links with xapianqueryparser
5271 so remove -lxapianqueryparser from omega_LDADD as it was causing
5274 Wed Nov 17 18:51:28 GMT 2004 Olly Betts <olly@survex.com>
5276 * omindex.cc: Index RTF documents with unrtf, if available.
5277 * docs/overview.txt: Document this.
5279 Wed Nov 17 16:31:01 GMT 2004 Olly Betts <olly@survex.com>
5281 * omindex.cc: If a filename to be passed to a filter program has a
5282 leading "-", protect it from possible interpretation as an option
5285 Wed Nov 17 16:29:55 GMT 2004 Olly Betts <olly@survex.com>
5287 * omindex.cc: Index Wordperfect documents with wpd2text, if available.
5288 * docs/overview.txt: Document this.
5290 Wed Nov 17 15:12:08 GMT 2004 Olly Betts <olly@survex.com>
5292 * omindex.cc: Index MS Word documents with antiword, if available.
5293 * docs/overview.txt: Document this.
5295 Wed Nov 17 04:29:15 GMT 2004 Olly Betts <olly@survex.com>
5297 * omindex.cc: Add simple code to index OpenOffice documents.
5298 * docs/overview.txt: Update documentation to mention this.
5300 Tue Nov 09 03:04:44 GMT 2004 Olly Betts <olly@survex.com>
5302 * configure.ac,Makefile.am: We now get -AA or -std strict_ansi from
5303 xapian-config, so we don't need to probe for them ourselves.
5305 Sun Nov 07 16:36:42 GMT 2004 Olly Betts <olly@survex.com>
5307 * utils.cc: Fixed to work with updated snprintf configure test,
5309 Sun Nov 07 04:55:26 GMT 2004 Olly Betts <olly@survex.com>
5311 * configure.ac: rearrange so that libtool is active when we test if
5312 the c++ compiler can link a program so it can pull in libstdc++
5313 through a .la file; updated snprintf test to the new one from
5316 Fri Nov 05 17:20:13 GMT 2004 Olly Betts <olly@survex.com>
5318 * configure.ac: AM_CONFIG_HEADER -> AC_CONFIG_HEADERS; Run tests using
5319 the C++ compiler; select ANSI mode for aCC and cxx; Check GXX not
5320 GCC when choosing warning flags.
5322 Wed Nov 03 20:15:34 GMT 2004 Olly Betts <olly@survex.com>
5324 * query.cc: Updated to use Query::empty() instead of
5327 Wed Nov 03 20:12:37 GMT 2004 Olly Betts <olly@survex.com>
5329 * Makefile.am,getopt.cc,getopt.h,getopt1.cc,gnu_getopt.h,omindex.cc,
5330 scriptindex.cc: Updated to reworked getopt from xapian-core.
5332 Wed Nov 03 04:11:03 GMT 2004 Olly Betts <olly@survex.com>
5334 * getopt.cc: Defining _NO_PROTO is a really bad idea for C++ code!
5336 Tue Nov 02 18:54:12 GMT 2004 Olly Betts <olly@survex.com>
5338 * getopt.cc: Protect getopt definition for possible getopt macro
5339 declared in getopt.h.
5341 Tue Nov 02 17:56:08 GMT 2004 Olly Betts <olly@survex.com>
5343 * indextext.h: Fixed 2 warnings.
5345 Tue Nov 02 06:54:17 GMT 2004 Olly Betts <olly@survex.com>
5347 * getopt.cc,getopt1.cc: Fixed function declarations to not use K&R C
5350 Tue Nov 02 05:40:06 GMT 2004 Olly Betts <olly@survex.com>
5352 * Makefile.am,configure.ac,getopt.c,getopt1.c,getopt.cc,getopt1.cc:
5353 Compile everything as C++.
5355 Mon Sep 20 14:52:24 BST 2004 Olly Betts <olly@survex.com>
5357 * NEWS,configure.ac: Version 0.8.3.
5359 Mon Sep 20 14:49:26 BST 2004 Olly Betts <olly@survex.com>
5361 * Makefile.am,configure.ac: Require same versions of autoconf and
5362 automake that xapian-core does.
5364 Mon Sep 20 14:45:53 BST 2004 Olly Betts <olly@survex.com>
5366 * omega.spec.in: Update from Fabrice Colin. The most notable change
5367 is that the RPM is now called xapian-omega because there's already
5368 an omega RPM (in Fedora Core at least) which is some game.
5370 Thu Sep 16 00:57:13 BST 2004 Olly Betts <olly@survex.com>
5372 * cgiparam.cc,configfile.cc,configfile.h,htmlparse.cc,indextext.cc,
5373 omega.cc,omindex-config.cc: All C++ sources should #include
5374 <config.h> as the first header; no header files should #include
5377 Thu Sep 16 00:54:31 BST 2004 Olly Betts <olly@survex.com>
5379 * scriptindex.cc: --version now actually reports the version. --help
5380 now exits with status 0 rather than status 1.
5382 Tue Sep 14 03:00:32 BST 2004 Olly Betts <olly@survex.com>
5384 * omega.spec.in: Updated URL for sources; include htdig2omega and
5385 htdig2omega.script in the RPM.
5387 Tue Sep 14 02:56:52 BST 2004 Olly Betts <olly@survex.com>
5389 * Makefile.am: Install htdig2omega.script in ${prefix}/share/omega/
5390 rather than ${prefix}/share/.
5392 Mon Sep 13 03:22:55 BST 2004 Olly Betts <olly@survex.com>
5394 * NEWS,configure.ac: Version 0.8.2.
5396 Thu Sep 09 15:11:45 BST 2004 Olly Betts <olly@survex.com>
5400 Thu Sep 09 14:41:41 BST 2004 Olly Betts <olly@survex.com>
5402 * query.cc: Use new checkatleast parameter to Enquire::get_mset to
5405 Thu Sep 02 01:45:46 BST 2004 Olly Betts <olly@survex.com>
5407 * templates/query: Always report database not found - previously we
5408 only did so if there was a query. Also fixed missing </center>
5409 tag which happened in certain cases.
5411 Wed Aug 25 23:19:47 BST 2004 Olly Betts <olly@survex.com>
5413 * omindex.cc: When running with "replace duplicates" mode (the
5414 default), detect documents removed since the last indexing
5415 run and delete them from the database (bug #34).
5417 Tue Aug 24 19:23:55 BST 2004 Olly Betts <olly@survex.com>
5419 * omega.cc: Added FIXME comment noting that SORT and SORTBANDS should
5420 be tracked and the results reset to the first page if they change.
5422 Tue Aug 24 19:23:07 BST 2004 Olly Betts <olly@survex.com>
5424 * Makefile.am: Install htdig2omega and htdig2omega.script.
5426 Mon Aug 23 22:29:53 BST 2004 Olly Betts <olly@survex.com>
5428 * scriptindex.cc: Report index file name and line number when
5429 reporting errors in it. Added warning for redundant actions,
5430 such as "truncate" as the last action in a rule.
5432 Mon Aug 23 22:03:25 BST 2004 Olly Betts <olly@survex.com>
5434 * omindex.cc: Use the new replace_document(term, doc) method.
5436 Sun Aug 22 13:11:23 BST 2004 Olly Betts <olly@survex.com>
5438 * configure.in,configure.ac: Renamed configure.in to configure.ac.
5440 Sat Aug 21 12:41:43 BST 2004 Olly Betts <olly@survex.com>
5442 * docs/omegascript.txt: Added note that $add{$hit,1} gives
5445 Fri Aug 20 20:28:16 BST 2004 Olly Betts <olly@survex.com>
5447 * Makefile.am: Link with -lxapianqueryparser, not -lomqueryparser.
5449 Thu Aug 19 19:13:34 BST 2004 Olly Betts <olly@survex.com>
5451 * Makefile.am: And actually ship htdig2omega and htdig2omega.script!
5453 Thu Aug 19 19:02:40 BST 2004 Olly Betts <olly@survex.com>
5455 * htdig2omega,htdig2omega.script: Added perl script and corresponding
5456 scriptindex index script which allow an ht://dig database to be
5457 imported into Xapian. This provides an easy way to provide a search
5458 of remote websites using omega (by spidering them with ht://dig).
5460 Sun Aug 15 01:48:58 BST 2004 Olly Betts <olly@survex.com>
5462 * indextext.cc,indextext.h,omindex.cc,query.cc,scriptindex.cc,
5463 symboltab.h: Fixed $highlight to understand accented characters
5466 Wed Jun 30 14:58:12 BST 2004 Olly Betts <olly@survex.com>
5468 * NEWS,configure.in: Version 0.8.1.
5470 Tue Jun 29 17:26:41 BST 2004 Richard Boulton <richard@tartarus.org>
5472 * Makefile.am: Remove Debian files from distribution tarballs,
5473 since there will often be multiple patch releases for each
5474 release. Debian files will be available from an apt repository
5477 Tue Jun 29 01:45:06 BST 2004 Olly Betts <olly@survex.com>
5479 * omindex.cc: Renamed hash() to hash_string() to avoid colliding
5480 with something on IRIX; Removed explicit initialisation of
5481 mime_types - perhaps that's spooking the SGI CC prelinker.
5483 Sun Jun 27 23:47:35 BST 2004 Olly Betts <olly@survex.com>
5485 * omega.cc: Change MORELIKE to pick up to 40 terms, rather than up to
5486 6 (feedback on the mailing list suggests this gives much better
5489 Fri Jun 11 02:22:38 BST 2004 Olly Betts <olly@survex.com>
5491 * scriptindex.cc: Added catch for std::bad_alloc.
5493 Mon Apr 19 14:43:17 BST 2004 Olly Betts <olly@survex.com>
5495 * NEWS: Final update for 0.8.0.
5497 Sun Apr 18 22:31:24 BST 2004 Olly Betts <olly@survex.com>
5499 * omindex.cc: Only need _POSIX_C_SOURCE on Linux, and it seems to
5500 cause problems with Sun's C++ compiler.
5502 Sun Apr 18 17:50:35 BST 2004 Olly Betts <olly@survex.com>
5504 * omindex.cc: _POSIX_C_SOURCE works better than _POSIX_SOURCE for
5505 making posix_fadvise prototype visible on Linux.
5507 Thu Apr 15 02:05:49 BST 2004 Olly Betts <olly@survex.com>
5509 * omindex.cc: And another _POSIX_SOURCE attempt!
5511 Thu Apr 15 01:43:51 BST 2004 Olly Betts <olly@survex.com>
5513 * omindex.cc: Another stab at _POSIX_SOURCE...
5515 Thu Apr 15 01:25:29 BST 2004 Olly Betts <olly@survex.com>
5517 * omindex.cc: Added a missing underscore (_POSIX_SOURCE not
5520 Thu Apr 15 00:48:12 BST 2004 Olly Betts <olly@survex.com>
5522 * omindex.cc: Defined POSIX_SOURCE to a suitable value to get
5523 posix_fadvise on some versions of redhat.
5525 Mon Apr 12 01:06:58 BST 2004 Olly Betts <olly@survex.com>
5527 * NEWS,configure.in: Version 0.8.0.
5529 Mon Apr 12 00:03:57 BST 2004 Olly Betts <olly@survex.com>
5531 * indextext.cc,query.cc: Don't create R terms for terms which start
5534 Sun Apr 11 23:47:33 BST 2004 Olly Betts <olly@survex.com>
5536 * omindex.cc: Fixed inconsistent indenting.
5538 Sun Apr 11 23:11:51 BST 2004 Olly Betts <olly@survex.com>
5540 * omindex.cc: Call posix_fadvise with POSIX_FADV_DONTNEED just before
5541 closing an input file. Again should help improve indexing
5544 Fri Apr 02 16:09:03 BST 2004 Olly Betts <olly@survex.com>
5546 * configure.in,omindex.cc: Use O_STREAMING and/or posix_fadvise()
5547 when reading files to be indexed (if available). This helps to
5548 keep the Xapian database in cache, and greatly improve indexing
5551 Tue Mar 30 00:06:15 BST 2004 Olly Betts <olly@survex.com>
5553 * NEWS: We're now putting omega news here rather than in xapian-core
5554 so composed draft version for the forthcoming 0.8.0 release.
5556 Tue Mar 29 23:56:27 BST 2004 Olly Betts <olly@survex.com>
5558 * templates/xml: Remove unused OmegaScript code:
5559 `$set{topterms,$or{$ne{$msize,0},$query}}'.
5561 Tue Mar 29 23:55:40 BST 2004 Olly Betts <olly@survex.com>
5563 * Makefile.am: scriptindex needs to link to getopt.c and getopt1.c.
5565 Tue Mar 23 19:20:19 GMT 2004 Olly Betts <olly@survex.com>
5567 * templates/xml: Correct spelling of `relavence' to `relevance'.
5568 NB: if you're parsing the XML output, you'll need to fix this
5569 spelling in your parser!
5571 Sun Mar 21 14:23:23 GMT 2004 Olly Betts <olly@survex.com>
5573 * scriptindex.cc: Use getopt for option parsing. Change default to
5574 *not* overwriting the database (use --overwrite if you really want
5575 to do this); -u is now accepted but ignored.
5577 Fri Mar 12 02:11:28 GMT 2004 Olly Betts <olly@survex.com>
5579 * templates/xml: "Content-Type: application/xml" is more appropriate
5582 Fri Mar 12 02:09:33 GMT 2004 Olly Betts <olly@survex.com>
5584 * omindex.cc: Added --overwrite option which forces an existing
5585 database to be deleted before indexing begins.
5587 Wed Mar 10 14:39:13 GMT 2004 Olly Betts <olly@survex.com>
5589 * templates/xml: "Content-Type: text/xml".
5591 Wed Mar 10 00:08:40 GMT 2004 Olly Betts <olly@survex.com>
5593 * docs/scriptindex.txt: Make more explicit that boolean produces a
5594 *single* boolean term.
5596 Tue Mar 09 19:08:19 GMT 2004 Olly Betts <olly@survex.com>
5598 * indextext.cc,omindex.cc,scriptindex.cc: Updated to use add_term()
5599 instead of add_term_nopos().
5601 Wed Mar 03 14:55:50 GMT 2004 Olly Betts <olly@survex.com>
5603 * scriptindex.cc: Use true/false for assigning to booleans, not 1/0.
5605 Sat Feb 21 18:33:15 GMT 2004 Olly Betts <olly@survex.com>
5607 * omega.cc,query.cc,docs/omegascript.txt: Added $httpheader
5608 Omegascript to allow arbitrary HTTP headers and alternative
5609 Content-Type headers to be specified.
5611 Sat Feb 14 00:32:06 GMT 2004 Olly Betts <olly@survex.com>
5613 * query.cc: If the probabilistic query was bad, don't try to run the
5616 Sat Feb 14 00:11:52 GMT 2004 Olly Betts <olly@survex.com>
5618 * docs/cgiparams.txt: Note that START and END should be in the format
5621 Sat Feb 14 00:07:41 GMT 2004 Olly Betts <olly@survex.com>
5623 * query.cc: Don't crash if there's a date filter but no probabilistic
5626 Wed Nov 26 22:44:49 GMT 2003 Olly Betts <olly@survex.com>
5628 * indextext.cc: Raw terms with a multicharacter prefix are now indexed
5629 with a : inserted (e.g. as XFOO:Rterm). This matches what the query
5632 Wed Nov 26 16:25:16 GMT 2003 Olly Betts <olly@survex.com>
5634 * configure.in: Version 0.7.5.
5636 Sun Nov 23 03:28:21 GMT 2003 Olly Betts <olly@survex.com>
5638 * query.cc,docs/omegascript.txt: Added note that $setmap{prefix,...}
5639 needs to be used before any commands which require the query to be
5642 Thu Nov 20 02:44:55 GMT 2003 Olly Betts <olly@survex.com>
5644 * docs/omegascript.txt: Expanded documentation of $set and $setmap to
5645 include values which Omega itself makes use of.
5647 Thu Nov 20 02:43:03 GMT 2003 Olly Betts <olly@survex.com>
5649 * omega.cc,query.cc: Set default value for $opt{stemmer} to "english"
5650 rather than taking "" to mean English.
5652 Tue Oct 21 21:29:18 BST 2003 Olly Betts <olly@survex.com>
5654 * query.cc: Fixed $setmap{} to not add bogus entries.
5656 Tue Oct 21 21:20:31 BST 2003 Olly Betts <olly@survex.com>
5658 * query.cc: Allow the QueryParser prefix map to be set up using
5659 $setmap{prefix,...} (e.g. $setmap{prefix,subject,XT,abstract,XA}).
5661 Tue Oct 21 21:13:59 BST 2003 Olly Betts <olly@survex.com>
5663 * query.cc: Only parse probabilistic query once!
5665 Tue Oct 21 20:03:27 BST 2003 Olly Betts <olly@survex.com>
5667 * omega.cc,omega.h,query.cc,query.h: Reworked so that the
5668 probabilistic query isn't parsed until we need some
5669 information from it. This means that we can now use options
5670 set by the omegascript template to control the behaviour of the
5673 Thu Oct 16 21:17:01 BST 2003 Olly Betts <olly@survex.com>
5675 * omega.cc: Renamed `big_buf' to `query_string' and eliminated `more'
5676 flag and use of goto; tidied up order of reading CGI variables; use
5677 const refs to value strings in cgi_params map rather than copying
5680 Sat Oct 11 20:43:04 BST 2003 Olly Betts <olly@survex.com>
5682 * omega.cc,omega.h,query.cc: Make rset an object rather than a pointer
5685 Fri Oct 10 18:06:10 BST 2003 Olly Betts <olly@survex.com>
5687 * query.cc: Removed the unfinished code for caching omegascript
5688 command expansions. Added code to cache $dbsize. The only other
5689 value correctly marked for caching is already being cached!
5691 Thu Oct 02 15:18:19 BST 2003 Olly Betts <olly@survex.com>
5693 * configure.in: Version 0.7.4.
5695 Thu Oct 02 15:16:41 BST 2003 Olly Betts <olly@survex.com>
5697 * query.cc: $date doesn't require the match to be run to work, but
5700 Tue Sep 30 18:32:25 BST 2003 Olly Betts <olly@survex.com>
5702 * query.cc: Cleaner version of T macro.
5704 Tue Sep 30 18:09:30 BST 2003 Olly Betts <olly@survex.com>
5706 * query.cc: Hopefully the final piece in the Sun C++ puzzle.
5708 Tue Sep 30 00:59:50 BST 2003 Olly Betts <olly@survex.com>
5710 * query.cc: Cleaned up a recent fix by using clean generic code which
5711 works on Sun's C++ too.
5713 Mon Sep 29 17:12:10 BST 2003 Olly Betts <olly@survex.com>
5715 * cgiparam.cc: Portability fixes for Sun's C++ compiler.
5717 Mon Sep 29 13:26:22 BST 2003 Olly Betts <olly@survex.com>
5719 * query.cc: Another Sun C++ fix.
5721 Mon Sep 29 11:49:30 BST 2003 Olly Betts <olly@survex.com>
5723 * query.cc,omega.cc: More fixes for Sun's really rather rubbish
5726 Mon Sep 29 01:39:56 BST 2003 Olly Betts <olly@survex.com>
5728 * query.cc: Fixes for compiling with Sun's C++ compiler.
5730 Mon Sep 29 01:17:39 BST 2003 Olly Betts <olly@survex.com>
5732 * omega.cc: Added workaround for compilation problem with Sun's C++.
5734 Fri Aug 08 01:39:51 BST 2003 Olly Betts <olly@survex.com>
5736 * configure.in: Version 0.7.3.
5738 Sat Aug 02 01:52:38 BST 2003 Olly Betts <olly@survex.com>
5740 * configure.in,omindex.cc,query.cc: Fixed to compile on mingw
5741 where ftime() returns void.
5743 Fri Aug 01 20:59:57 BST 2003 Olly Betts <olly@survex.com>
5745 * scriptindex.cc: Added #define for sleep() on __WIN32__.
5747 Wed Jul 30 19:05:17 BST 2003 Olly Betts <olly@survex.com>
5749 * getopt.h: Copied over latest getopt.h from xapian-core.
5751 Sun Jul 27 16:34:19 BST 2003 Olly Betts <olly@survex.com>
5753 * Makefile.am,getopt.c,getopt.h,getopt1.c: Copied our version of GNU
5754 getopt here from xapian-core so we can build omindex on non-glibc
5755 platforms (modifications are for better C++ compatibility).
5757 Mon Jul 21 01:16:59 BST 2003 Olly Betts <olly@survex.com>
5759 * configure.in: Use libtool; OM_PATH_XAPIAN -> XO_LIB_XAPIAN.
5761 Sat Jul 19 19:26:03 BST 2003 Olly Betts <olly@survex.com>
5763 * omindex.cc: Added missing `#include <errno.h>'.
5765 Sat Jul 19 19:24:50 BST 2003 Olly Betts <olly@survex.com>
5767 * indextext.cc: Fixed signed character issue.
5769 Thu Jul 17 00:51:42 BST 2003 Olly Betts <olly@survex.com>
5771 * bootstrap: Removed bootstrap in favour of top-level bootstrap.
5773 Tue Jul 15 16:27:52 BST 2003 Olly Betts <olly@survex.com>
5775 * omindex.cc: file_to_string() and stdout_to_string() now throw an
5776 exception on a read error, avoiding the " "-for-empty-file bodge.
5778 Tue Jul 15 15:18:32 BST 2003 James Aylett <james@tartarus.org>
5780 * omindex.cc: fix file_to_string() to return the file on
5781 success, and not leak memory on empty files. Fix callers
5782 to give up on unreadable files, not vice versa. Fix
5783 logging messages to distinguish re-indexed/added.
5785 Fri Jul 11 15:09:55 BST 2003 Olly Betts <olly@survex.com>
5787 * configure.in: Version 0.7.2.
5789 Fri Jul 11 12:08:57 BST 2003 Olly Betts <olly@survex.com>
5791 * omega.cc: If the same database is listed more than once, only search
5792 the first occurrence.
5794 Fri Jul 11 11:57:24 BST 2003 Olly Betts <olly@survex.com>
5796 * configure.in,utils.cc: Use snprintf.
5798 Tue Jul 08 17:56:39 BST 2003 Olly Betts <olly@survex.com>
5800 * configure.in: Version 0.7.1.
5802 Tue Jul 08 17:34:01 BST 2003 Olly Betts <olly@survex.com>
5804 * omindex.cc: Fixed compilation problem.
5806 Fri Jul 04 22:12:32 BST 2003 Olly Betts <olly@survex.com>
5808 * bootstrap: add missing ';;' as case pattern delimiter
5810 Thu Jul 03 23:34:50 BST 2003 Olly Betts <olly@survex.com>
5812 * configure.in: Version 0.7.0.
5814 Thu Jul 03 23:33:05 BST 2003 Olly Betts <olly@survex.com>
5816 * omindex.cc: Abort parsing of document if it's excluded from
5817 indexing; ignore anything outside of the first <body>...</body>,
5820 Tue Jun 24 00:45:28 BST 2003 Olly Betts <olly@survex.com>
5822 * docs/overview.txt: Added note about hashing of long URL terms and
5823 reworked structure a little.
5825 Mon Jun 23 21:11:41 BST 2003 Olly Betts <olly@survex.com>
5827 * bootstrap: Check for Bison 1.875 which doesn't work with Xapian.
5829 Mon Jun 23 16:52:47 BST 2003 Olly Betts <olly@survex.com>
5831 * omega.cc,omindex.cc,scriptindex.cc: Xapian::PostListIterator ->
5832 Xapian::PostingIterator.
5834 Thu Jun 19 20:02:00 BST 2003 Olly Betts <olly@survex.com>
5836 * symboltab.h: Convert hardspace to space.
5838 Wed Jun 18 16:32:34 BST 2003 Olly Betts <olly@survex.com>
5840 * scriptindex.cc: Removed already disabled unique id hashing to docid
5841 code. Xapian doesn't support setting arbitrary docids - if it ever
5842 does we can retrieve this code from CVS.
5844 Wed Jun 18 16:28:33 BST 2003 Olly Betts <olly@survex.com>
5846 * Makefile.am,indextext.cc,indextext.h,omindex.cc,scriptindex.cc:
5847 Normalise accents in probabilistic terms.
5849 Tue Jun 17 17:54:32 BST 2003 Olly Betts <olly@survex.com>
5851 * omindex.cc: Read output from pstotext and pdftotext via pipes rather
5852 than temporary files to side-step the whole problem of secure
5853 temporary file creation; Use pdfinfo to get the title and keywords
5854 when indexing a PDF; Safe filename escaping tweaked to not
5855 escape common safe punctuation.
5857 Tue Jun 17 17:50:00 BST 2003 Olly Betts <olly@survex.com>
5859 * htmlparse.cc,htmlparse.h: Moved initialisation of named_ents out of
5860 header - it's not a sensible candidate for inlining.
5862 Wed Jun 11 02:32:25 BST 2003 Olly Betts <olly@survex.com>
5864 * date.cc,date.h,omega.cc,omega.h,omindex.cc,query.cc,query.h,
5865 scriptindex.cc: Om -> Xapian::, etc.
5867 Fri Jun 6 01:04:12 BST 2003 Richard Boulton <richard@tartarus.org>
5869 * omindex.cc: Implement an upper limit on the length of URL
5870 terms. Currently, this is set at 240 characters - it can
5871 probably be increased slightly, but I'm not sure exactly
5872 how long a term can safely be. If the URL term would be
5873 longer than this, its last few bytes are replaced by a
5874 hash of the tail of the URL. This means that (apart from
5875 hopefully very rare collisions) urlterms should still be
5876 unique ids for documents.
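The truncation scheme described in this entry can be sketched as follows. This is an illustration only, not the code in omindex.cc: the 240 limit comes from the entry, but the hash function (32-bit FNV-1a), the 8-character hex suffix, and the names limit_url_term and fnv1a_hash are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// 240 is the limit quoted in the entry; the 8-character suffix is an
// assumption made for this sketch only.
const std::size_t MAX_URL_TERM_LENGTH = 240;
const std::size_t HASH_SUFFIX_LENGTH = 8;

// Hypothetical hash (32-bit FNV-1a). The entry does not say which hash
// omindex uses.
std::uint32_t fnv1a_hash(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char ch : s) {
        h ^= ch;
        h *= 16777619u;
    }
    return h;
}

// Return `term` unchanged if it fits; otherwise keep the head and replace
// everything from the cut point onwards with a hex hash of that tail.
std::string limit_url_term(const std::string& term) {
    if (term.size() <= MAX_URL_TERM_LENGTH) return term;
    const std::size_t keep = MAX_URL_TERM_LENGTH - HASH_SUFFIX_LENGTH;
    const std::uint32_t h = fnv1a_hash(term.substr(keep));
    static const char hex[] = "0123456789abcdef";
    std::string result(term, 0, keep);
    for (int shift = 28; shift >= 0; shift -= 4) {
        result += hex[(h >> shift) & 0xf];
    }
    assert(result.size() == MAX_URL_TERM_LENGTH);
    return result;
}
```

Two long URLs which share the first 232 bytes but differ later still get different terms unless their tails collide in the hash, which matches the "hopefully very rare collisions" caveat above.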
5878 Fri Jun 06 00:14:13 BST 2003 Richard Boulton <richard@tartarus.org>
5880 * omindex.cc: Clean up processing of HTML documents:
5881 - Ignore the contents of <script> and <style> tags in HTML.
5882 - Strip initial whitespace in each tag in an HTML document.
5883 - Try not to split words in half when truncating title and
5886 Tue Jun 03 11:15:28 BST 2003 Olly Betts <olly@survex.com>
5888 * templates/query: Create log entry in query.log.
5890 Thu May 29 18:03:54 BST 2003 Olly Betts <olly@survex.com>
5892 * query.cc: Fixed bug in DEFAULT_LOG_ENTRY's Omegascript.
5894 Thu May 29 00:22:28 BST 2003 Olly Betts <olly@survex.com>
5896 * query.cc: Set STEM_LANGUAGE near the start of the file so it's easy
5897 for users to change until we get better configurability.
5899 Thu May 29 00:00:28 BST 2003 Olly Betts <olly@survex.com>
5901 * Makefile.am,date.cc,date.h,query.cc: Split code to build a
5902 date range filter into a separate file.
5904 Wed May 28 23:38:02 BST 2003 Olly Betts <olly@survex.com>
5906 * configfile.cc,configfile.h,omega.cc,omega.conf,query.cc,query.h,
5907 docs/omegascript.txt,docs/overview.txt,docs/quickstart.txt:
5908 Replaced half-hearted logging support with flexible
5909 OmegaScript-based approach with new $log command. Also added
5910 $now to allow the current date/time to be logged.
5912 Tue May 27 17:55:24 BST 2003 Olly Betts <olly@survex.com>
5914 * query.cc: Added missing "#include <assert.h>".
5916 Mon May 26 22:41:26 BST 2003 Olly Betts <olly@survex.com>
5918 * configure.in: Don't use libtool; Use AC_CONFIG_FILES - it's the new
5921 Mon May 26 12:12:22 BST 2003 Olly Betts <olly@survex.com>
5923 * omega.spec.in: Removed %changelog - it hasn't been reliably updated
5924 and only really makes sense when the packaging is done by a third
5927 Mon May 26 12:01:55 BST 2003 Olly Betts <olly@survex.com>
5929 * query.cc: If the query is empty, don't bother running it through
5932 Wed Apr 30 01:18:47 BST 2003 Olly Betts <olly@survex.com>
5934 * docs/cgiparams.txt,docs/omegascript.txt: Minor improvements.
5936 Wed Apr 30 01:14:46 BST 2003 Olly Betts <olly@survex.com>
5938 * query.cc: Use correct types for docid and value_no in $value.
5940 Wed Apr 23 16:15:07 BST 2003 Sam Liddicott <sam@liddicott.com>
5942 * templates/xml: add collapse info to xml template.
5944 Wed Apr 23 14:00:37 BST 2003 Olly Betts <olly@survex.com>
5946 * omega.spec.in: Merged changes from Fabrice Colin.
5948 Thu Apr 10 03:14:51 BST 2003 Olly Betts <olly@survex.com>
5950 * configure.in: Updated for 0.6.5 release.
5952 Wed Apr 09 13:56:14 BST 2003 Olly Betts <olly@survex.com>
5954 * omega.cc,query.cc,omega.h,docs/cgiparams.txt: Renamed DATE1, DATE2,
5955 and DAYSMINUS to the more meaningful START, END, and SPAN (NB SPAN
5956 is days before END, or after START, or before today - whereas
5957 DAYSMINUS was before *DATE1* or before today). The old parameter names
5958 are supported (with the original semantics) for now.
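A rough sketch of how the new SPAN semantics could be resolved into a date range. Days are plain day numbers to avoid date parsing, and the names (resolve_date_range, DayRange, UNSET) are hypothetical. The precedence when both START and END are given alongside SPAN, and the behaviour when SPAN is absent, are assumptions; the entry does not spell them out.

```cpp
#include <cassert>

// Marks a CGI parameter which was not supplied (assumption for this sketch).
const long UNSET = -1;

struct DayRange {
    long first;  // first day covered by the filter
    long last;   // last day covered by the filter
};

DayRange resolve_date_range(long start, long end, long span, long today) {
    assert(today >= 0);
    DayRange r;
    if (span != UNSET) {
        if (end != UNSET) {
            // SPAN days before END.
            r.first = end - span;
            r.last = end;
        } else if (start != UNSET) {
            // SPAN days after START.
            r.first = start;
            r.last = start + span;
        } else {
            // SPAN days before today.
            r.first = today - span;
            r.last = today;
        }
    } else {
        // No SPAN: use START and END directly, defaulting to an open range
        // up to today (assumed behaviour).
        r.first = (start != UNSET) ? start : 0;
        r.last = (end != UNSET) ? end : today;
    }
    return r;
}
```

Under the old DAYSMINUS semantics the middle branch would have counted back from DATE1 rather than forward from it.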
5960 Wed Apr 09 13:44:28 BST 2003 Olly Betts <olly@survex.com>
5962 * Makefile.am: Install docs in /usr/share/doc/omega to be FHS
5964 * omega.spec.in: Consistently use %{contentdir} instead of /var/lib;
5965 removed redundant second setting of %docdir.
5967 Wed Apr 09 01:21:57 BST 2003 Olly Betts <olly@survex.com>
5969 * Makefile.am: Removed bogus extra "\".
5971 Mon Mar 31 19:42:24 BST 2003 Olly Betts <olly@survex.com>
5973 * Makefile.am: Install documentation!
5974 * omega.spec.in: Merged in changes to RPM packaging from Fabrice Colin
5975 and reworked further.
5977 Fri Mar 28 17:47:45 GMT 2003 Olly Betts <olly@survex.com>
5979 * templates/query,templates/query2: Removed bogus setting of defunct
5980 xB parameter; correctly propagate multiple B parameters.
5982 Fri Mar 28 17:45:41 GMT 2003 Olly Betts <olly@survex.com>
5984 * omindex.cc: Report correct version number (was hard-wired to 1.0!)
5986 Tue Mar 25 14:46:10 GMT 2003 Olly Betts <olly@survex.com>
5988 * query.cc: If xP and P are both empty, classify as SAME_QUERY not
5989 NEW_QUERY as there may be a boolean query too.
5990 * query.cc: Fixed off-by-one error in rounding down topdoc - it was
5991 possible to get to an empty page of hits if there were exactly a
5992 multiple of HITSPERPAGE matches and the matcher over-estimated the
5993 number of matches and Omega displayed page links.
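The off-by-one is easy to see in a small sketch. This is not the code from query.cc; round_down_topdoc and its parameters are hypothetical names. The point is that an out-of-range topdoc has to be clamped to the last hit (matches - 1), not to matches, before rounding down to a page boundary.

```cpp
#include <cassert>

// Clamp a requested first-hit offset to the start of the last page which
// actually contains hits.
unsigned round_down_topdoc(unsigned topdoc, unsigned hits_per_page,
                           unsigned matches) {
    assert(hits_per_page > 0);
    if (matches == 0) return 0;
    // Clamping to `matches` instead of `matches - 1` would give an empty
    // page whenever matches is an exact multiple of hits_per_page.
    if (topdoc >= matches) topdoc = matches - 1;
    return (topdoc / hits_per_page) * hits_per_page;
}
```

With 20 matches and 10 hits per page, a request for topdoc 20 is moved back to 10 rather than left on an empty third page.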
5995 Mon Mar 24 09:40:04 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
5997 * templates/query: Added propagation of B boolean filter
5998 * templates/query2: factored about a bit more, query2 is
5999 a more modular version of query which will ultimately
6000 lend itself to customisation a bit more to the uninitiated.
6002 Tue Mar 04 01:02:12 GMT 2003 Olly Betts <olly@survex.com>
6004 * omega.cc: Fixed handling of multiple DB parameters to be as
6007 Fri Feb 28 09:52:03 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
6009 * Added $collapsed to omegascript to give the number of hits
6010 collapsed into the current hit, eg:
6012 $if{$ne{$collapsed,0},$collapsed hidden results
6013 ($value{$cgi{COLLAPSE}})}
6015 * templates/godmode: removed euro ferret icon reference
6016 * templates/godmode: added value dumping, for values from 0-255
6018 Thu Feb 27 11:58:13 GMT 2003 Olly Betts <olly@survex.com>
6020 * Makefile.am,query.cc,docs/omegascript.txt,templates/query:
6021 Added $transform{} which does regexp manipulation (currently
6022 disabled); Added $uniq{} to eliminate duplicates from a sorted
6023 list; Fixed a query with repeated terms to be identified as
6024 SAME_QUERY not EXTENDED_QUERY; remove duplicates from terms
6025 listed in term frequencies.
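What $uniq{} does to a sorted list can be sketched with the standard library. This operates on a vector of strings rather than on an OmegaScript list, and uniq_sorted is a hypothetical name; the sortedness check is a precondition of the sketch, not a statement about query.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Remove duplicates from an already sorted list by dropping each element
// which equals its predecessor.
std::vector<std::string> uniq_sorted(std::vector<std::string> items) {
    assert(std::is_sorted(items.begin(), items.end()));
    items.erase(std::unique(items.begin(), items.end()), items.end());
    return items;
}
```

Because only adjacent duplicates are removed, the input needs to be sorted for every duplicate to be caught.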
6027 Wed Feb 26 17:50:26 GMT 2003 Olly Betts <olly@survex.com>
6029 * scriptindex.cc: Allow '_' in fieldnames. Diagnose bad characters
6030 in fieldnames better.
6032 Wed Feb 26 15:13:02 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
6034 * dbi2omega: Add DBUSER and DBPASSWD env var support so that password
6035 protected DB's can easily be used
6036 * add cgi parameter COLLAPSE to collapse on key values
6037 * Add $value{key[,docid]} support to omegascript
6039 Wed Feb 26 09:58:01 GMT 2003 Sam Liddicott <sam.liddicott@orange.co.uk>
6041 * bootstrap: Fix success message when building in non-src dir
6042 as configure is written to the src dir.
6044 Mon Jan 6 12:47:55 GMT 2003 James Aylett <james@tartarus.org>
6046 * scriptindex.cc: build fix
6048 Tue Dec 24 20:12:23 GMT 2002 Olly Betts <olly@survex.com>
6050 * configure.in: Version 0.6.4.
6052 Tue Dec 24 20:06:47 GMT 2002 Olly Betts <olly@survex.com>
6054 * scriptindex.cc: Minor tweak.
6056 Tue Dec 24 19:58:57 GMT 2002 Olly Betts <olly@survex.com>
6058 * omega.cc,docs/cgiparams.txt: Prefer MINHITS to MIN_HITS and
6059 RAWSEARCH to RAW_SEARCH since none of the other CGI parameter
6060 names have _ separating words. Also support old names for now.
6062 Mon Dec 23 03:23:33 GMT 2002 Olly Betts <olly@survex.com>
6064 * query.cc,docs/omegascript.txt,templates/query: Added $unstem to map
6065 a stemmed term to the form(s) used in the query; $queryterms now
6066 only includes the first occurrence of each stemmed form; $prettyterm
6067 uses the unstem map.
6069 Sat Dec 21 17:47:33 GMT 2002 Olly Betts <olly@survex.com>
6071 * scriptindex.cc,docs/scriptindex.txt: Replaced index=nopos with
6072 indexnopos action; index and indexnopos now take an optional
6073 prefix argument; index=nopos is handled specially for backwards
6076 Sat Dec 21 17:18:02 GMT 2002 Olly Betts <olly@survex.com>
6078 * scriptindex.cc,docs/scriptindex.txt: Added new scriptindex action
6079 date=FORMAT to generate terms for date range searching.
6081 Sat Dec 21 01:51:32 GMT 2002 Olly Betts <olly@survex.com>
6083 * templates/query: Stop topterms sticking out of green box with
6084 gecko based browsers.
6086 Sat Dec 21 01:44:53 GMT 2002 Olly Betts <olly@survex.com>
6088 * Makefile.am: Distribute docs/scriptindex.txt.
6089 * docs/omegascript.txt: It's $setrelevant not $set_relevant.
6091 Sat Dec 14 13:54:10 GMT 2002 Olly Betts <olly@survex.com>
6093 * configure.in: Version 0.6.3; removed -Wno-long-long as we don't use
6095 * query.cc: Compilation fixes.
6096 * templates/query: Don't call $topterms twice!
6098 Sat Dec 14 01:10:48 GMT 2002 Olly Betts <olly@survex.com>
6100 * query.cc: Updated in line with removal of OmSettings.
6102 Wed Dec 11 00:58:49 GMT 2002 Olly Betts <olly@survex.com>
6104 * configure.in,query.cc,docs/omegascript.txt,templates/query:
6105 Added $time which reports how long the match took - when searching
6106 on a remote website, it's hard to gauge how much time is taken by
6107 the search, and how much by the web server and browser; renamed
6108 and_vec to or_vec which better describes its purpose.
6110 Mon Dec 09 17:11:26 GMT 2002 Olly Betts <olly@survex.com>
6112 * query.cc,docs/omegascript.txt,templates/query: Added $dbsize
6113 to return the number of documents in the database being searched.
6114 Use this in the default query template on the "front page" shown
6115 when there's no search.
6117 Mon Dec 09 02:55:46 GMT 2002 Olly Betts <olly@survex.com>
6119 * query.cc,docs/omegascript.txt,templates/query: Added $msizeexact
6120 which returns "true" if $msize is exact (or "" if it is estimated).
6121 This means that you'll see "... of about N matches" less often -
6122 notably it's gone when searching for a single term, which is a
6125 Sun Dec 08 08:42:47 GMT 2002 Olly Betts <olly@survex.com>
6127 * scriptindex.cc: Replaced icky unportable code which set the filename
6128 to "/dev/fd/0" in order to read from stdin.
6130 Sun Dec 08 06:39:30 GMT 2002 Olly Betts <olly@survex.com>
6132 * query.cc,docs/omegascript.txt: Fixed $hitlist to complain if more
6133 than one parameter is passed; $topterms now defaults to 16 terms
6134 rather than 20; $topterms now weeds out terms which stem to the
6135 same as those in the query, or those already in $topterms.
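The weeding described here might look roughly like the following. The stemmer is passed in as a parameter because this sketch does not use the Xapian API; filter_topterms and its signature are hypothetical, and only the default of 16 terms is taken from the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Keep candidate terms, in order, skipping any whose stem matches a query
// term's stem or the stem of a term already kept.
std::vector<std::string> filter_topterms(
    const std::vector<std::string>& candidates,
    const std::vector<std::string>& query_terms,
    const std::function<std::string(const std::string&)>& stem,
    std::size_t max_terms = 16)
{
    assert(static_cast<bool>(stem));
    std::set<std::string> seen;
    for (const std::string& q : query_terms) seen.insert(stem(q));

    std::vector<std::string> result;
    for (const std::string& term : candidates) {
        if (result.size() >= max_terms) break;
        // insert() reports whether this stem is new.
        if (seen.insert(stem(term)).second) result.push_back(term);
    }
    return result;
}
```

Seeding the set with the query's stems means one membership test covers both rules in the entry.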
6137 Sun Dec 08 06:36:04 GMT 2002 Olly Betts <olly@survex.com>
6139 * templates/query: Make background white - the very light grey just
6140 looks dirty; fixed exclusion of TopTerms Javascript when there
6141 are no TopTerms; sample now <small>; language and size now
6142 appear when the corresponding fields are present; fixed
6143 unmatched </small>; fixed missing list of terms matching
6146 Sat Dec 07 21:20:31 GMT 2002 Olly Betts <olly@survex.com>
6148 * configure.in: Version 0.6.2.
6150 Sat Dec 07 21:04:31 GMT 2002 Olly Betts <olly@survex.com>
6152 * query.cc: Prefer "while (true)" to "while (1)".
6154 Fri Dec 06 04:41:05 GMT 2002 Olly Betts <olly@survex.com>
6156 * omindex.cc: Index .php files by default; non-zero return code if
6157 an exception is caught.
6159 Fri Dec 06 04:30:17 GMT 2002 Olly Betts <olly@survex.com>
6161 * htmlparse.cc: Ignore PHP tags and their contents; fixed tag
6162 scanning code to never read one character past the end of
6165 Wed Dec 04 18:42:51 GMT 2002 Olly Betts <olly@survex.com>
6167 * omega.cc,omega.h,omindex.cc,query.cc,scriptindex.cc:
6168 Updated in line with OmSettings related changes to the API.
6170 Wed Dec 04 17:13:43 GMT 2002 Olly Betts <olly@survex.com>
6172 * query.cc: Fixed $dbname to return "default" for the default
6173 database, rather than "" - this fixes paging in searches of the
6175 * templates/query: Removed xDEFAULTOP hidden field which is no longer
6178 Wed Dec 04 11:57:13 GMT 2002 Olly Betts <olly@survex.com>
6180 * templates/query: Removed bogus unmatched '}'.
6182 Thu Nov 28 20:24:08 GMT 2002 Olly Betts <olly@survex.com>
6184 * omega.cc,query.cc: Updated in line with OmEnquire::get_eset() no
6185 longer taking an OmSettings object.
6187 Wed Nov 27 19:02:12 GMT 2002 Olly Betts <olly@survex.com>
6189 * dbi2omega: Return fields in table order; more efficient;
6190 report any error reading a row; if we get a NULL field,
6191 don't output it, and suppress perl warning about use of
6192 an undefined value.
6194 Wed Nov 27 05:22:04 GMT 2002 Olly Betts <olly@survex.com>
6196 * configure.in: Set version to 0.6.0.
6198 Wed Nov 27 05:21:00 GMT 2002 Olly Betts <olly@survex.com>
6200 * configure.in,htmlparse.h,omindex.cc,scriptindex.cc:
6201 Use "-Wall -W" rather than "-Wall -Wunused", and fixed the
6202 warnings this reveals.
6204 Wed Nov 27 04:20:13 GMT 2002 Olly Betts <olly@survex.com>
6206 * Makefile.am,dbi2omega: Added perl script to dump any database
6207 which perl DBI can access into the dump format expected by
6210 Wed Oct 30 02:02:32 GMT 2002 Olly Betts <olly@survex.com>
6212 * omega.spec.in: Use bootstrap instead of buildall; don't use "-j4"
6213 with make - most people don't have quad processor boxes!
6215 Wed Oct 30 01:56:31 GMT 2002 Olly Betts <olly@survex.com>
6217 * buildall: Removed in favour of bootstrap script.
6219 Tue Oct 29 02:01:58 GMT 2002 Olly Betts <olly@survex.com>
6221 * omindex.cc,scriptindex.cc: Added MAX_PROB_TERM_LENGTH (set to
6222 64) to limit size of probabilistic terms.
6224 Sat Oct 12 17:09:55 BST 2002 Olly Betts <olly@survex.com>
6226 * bootstrap: Copied bootstrap script from xapian-core.
6228 Sat Oct 12 17:05:37 BST 2002 Olly Betts <olly@survex.com>
6230 * configure.in: Version 0.5.3.
6232 Wed Oct 09 16:55:56 BST 2002 Olly Betts <olly@survex.com>
6234 * omega.cc,omega.h,query.cc,docs/{cgiparams.txt,omegascript.txt},
6235 templates/query: revamped the "reset first page when filter changes"
6236 scheme - all filtery things are now serialised and put into the
6237 xFILTER CGI parameter, which copes with multiple B values. Support
6238 for the old way (xB, xDATE1, xDATE2, xDAYSMINUS, xDEFAULTOP) is
6239 included for now (but only copes with a single B value). Added (and
6240 documented) $filters Omegascript command to implement this.
6241 * query.cc: fixed handling of case when topdoc is non-zero, but
6242 no matches were found. This was causing topdoc to be set to -6!
6243 * query.cc: fixed handling of prefixes starting with an X.
6245 Wed Oct 09 15:35:54 BST 2002 Olly Betts <olly@survex.com>
6247 * .cvsignore: Added scriptindex and omega-*.tar.gz; removed libtool.
6249 Sun Oct 06 18:56:40 BST 2002 Olly Betts <olly@survex.com>
6251 * configure.in: Version 0.5.2.
6253 Thu Oct 03 16:42:06 BST 2002 Olly Betts <olly@survex.com>
6255 * query.cc: Added CMD_hit to enumeration.
6257 Wed Oct 02 17:02:25 BST 2002 Olly Betts <olly@survex.com>
6259 * configure.in: Version 0.5.1.
6260 * Makefile.am,configure.in: require automake 1.6.3 and autoconf 2.54
6261 since xapian-core does anyway, and it neatens configure.in slightly.
6263 Wed Oct 02 16:58:39 BST 2002 Olly Betts <olly@survex.com>
6265 * query.cc,docs/omegascript.txt: Added $hit which gives the m-set
6266 number of the current hit.
6268 Sun Sep 22 15:47:33 BST 2002 Olly Betts <olly@survex.com>
6270 * configfile.cc: Corrected use of string.data() to string.c_str().
6272 Sun Sep 22 03:53:35 BST 2002 Olly Betts <olly@survex.com>
6274 * templates/query: Updated xapian url to http://www.xapian.org/
6276 Fri Sep 20 15:36:35 BST 2002 Olly Betts <olly@survex.com>
6278 * configure.in: Version 0.5.0.
6280 Sun Sep 15 03:07:31 BST 2002 Richard Boulton <richard.boulton@omsee.com>
6282 * buildall: Update to latest version, to fix bug with VPATH version
6283 checking for autoconf.
6285 Thu Sep 12 15:11:16 BST 2002 Olly Betts <olly@survex.com>
6287 * htmlparse.cc: Add comment about string::replace() invalidating
6290 Thu Sep 12 13:38:05 BST 2002 Olly Betts <olly@survex.com>
6292 * omegascript.vim,omegascript.txt,query.cc: cosmetic tweaks.
6294 Thu Sep 5 14:47:54 BST 2002 Richard Boulton <richard@tartarus.org>
6296 * configure.in: Don't use libtool. I don't know why I ever thought
6299 Thu Sep 5 14:11:51 BST 2002 Richard Boulton <richard@tartarus.org>
6301 * query.cc: Change $and to return true iff all its arguments are
6302 not false, rather than if one or more of the arguments is false.
6303 * docs/omegascript.txt: Update documentation of $and{}
6305 Fri Aug 23 13:27:02 BST 2002 James Aylett <tartarus@users.sourceforge.net>
6307 * docs/quickstart.txt: encourage people to call their first
6308 database 'default' since this will work straight off.
6310 Wed Aug 21 17:52:36 BST 2002 Richard Boulton <richard@tartarus.org>
6312 * query.cc: Add $slice{} command, to slice a list at a set of
6313 positions (given by a second list).
6314 Also, bugfix: require $hitlist{} to take at least one parameter:
6315 it currently segfaults if given none.
6316 * docs/omegascript.txt: Document $slice{}.
6317 * extra/omegascript.vim: Update syntax highlighting.
6319 Wed Aug 21 18:03:43 BST 2002 James Aylett <tartarus@users.sourceforge.net>
6321 * omindex.cc: tidy up output so it doesn't wrap so much
6323 Wed Aug 21 18:01:38 BST 2002 James Aylett <tartarus@users.sourceforge.net>
6325 * htmlparse.cc: fixed bug in entity reference handling
6327 Wed Aug 21 13:21:12 BST 2002 James Aylett <tartarus@users.sourceforge.net>
6329 * omindex.cc: Bugfix to metaterm generation when operating on an
6330 absolute URL that is also at the root of its web server.
6332 Wed Aug 21 10:48:06 BST 2002 Richard Boulton <richard@tartarus.org>
6334 * scriptindex.cc: If a field has multiple instances, keep all of
6335 them (previously only kept the final occurrence).
6336 * docs/scriptindex.txt: Mention that multiple instances of fields
6339 Tue Aug 20 18:02:45 BST 2002 James Aylett <tartarus@users.sourceforge.net>
6341 * docs/quickstart.txt: correct for new(ish) omindex behaviour
6343 Sat Aug 17 13:38:57 BST 2002 Richard Boulton <richard@tartarus.org>
6345 * extra/omegascript.vim: Quick attempt at a vim syntax highlighting
6346 file for omegascript. Recognises files only if they're in a
6347 directory called "templates": perhaps we should adopt a suffix to
6348 make recognition easier.
6349 Read the file for installation instructions.
6351 Thu Aug 15 11:21:20 BST 2002 Richard Boulton <richard@tartarus.org>
6353 * scriptindex.cc: Allow updating of databases by a command line
6354 switch, and also turn off verbose output (can be turned back
6356 * docs/scriptindex.txt: Document the "unique" tag.
6358 Thu Aug 15 11:18:21 BST 2002 Richard Boulton <richard@tartarus.org>
6360 * buildall: Copy buildall from xapian-core - the old one breaks
6361 for me (due to odd aclocal paths) but the new one is fine.
6362 We should make a common module to hold build stuff to be shared
6363 between modules, though.
6365 Mon Aug 12 01:34:42 BST 2002 Richard Boulton <richard@tartarus.org>
6367 * scriptindex.cc: Bug fix - index without positional information
6368 if "nopos" is specified, rather than the other way around.
6369 Bug fix - don't completely eradicate newlines in multiline values,
6370 until they have a chance to be converted to spaces.
6371 Delete documents if no fields other than unique fields are
6373 Add some simple debugging, and write messages to a log file in
6374 the database directory.
6376 * configure.in: Use libtool.
6378 Fri Aug 9 13:57:32 BST 2002 Richard Boulton <richard@tartarus.org>
6380 * scriptindex.cc: Fix compile errors, by changing string
6381 constructors to take begin and end iterators, instead of a begin
6384 Fri Jul 05 19:33:55 BST 2002 Olly Betts <olly@survex.com>
6386 * omega.spec.in: Fixed wrt /usr/lib/omega/bin/omega.
6388 Fri Jul 05 19:20:05 BST 2002 Olly Betts <olly@survex.com>
6390 * Makefile.am, docs/quickstart.txt: Install omega as
6391 ${prefix}/lib/omega/bin/omega.
6393 Thu Jul 04 02:11:46 BST 2002 Olly Betts <olly@survex.com>
6395 * scriptindex.cc, docs/scriptindex.txt: new indexer - indexing
6396 behaviour is controlled by a simple but powerful script.
6398 * Makefile.am: tidied up.
6400 * configfile.cc, docs/quickstart.txt: database and templates default to
6401 being in /var/lib/omega rather than /home/omega.
6403 * docs/quickstart.txt: describe the new test mode (command line) rather
6404 than the old one (stdin).
6406 * omega.cc, docs/cgiparams.txt: If xP isn't set, honour paging and
6407 R-set. So RAW_SEARCH now only disables snapping TOPDOC to a multiple
6410 * query.cc: "using namespace std;"
6412 Fri Jun 14 00:07:20 BST 2002 Olly Betts <olly@survex.com>
6414 * $prettyterm{} no longer adds a trailing '.' if the term also exists
6415 with an R prefix and stems to itself.
6417 Fri Jun 14 00:02:16 BST 2002 Olly Betts <olly@survex.com>
6419 * MORELIKE can now take a termname - this allows MORELIKE to be used
6420 with a unique id from an external database if it has been indexed
6423 Thu Jun 13 00:01:11 BST 2002 Olly Betts <olly@survex.com>
6425 * omega.conf: removed trailing slashes from directory names.
6427 * query.cc: removed extra slash added to template_dir; improved
6428 reporting of errors opening template file.
6430 Wed Jun 12 23:51:11 BST 2002 Olly Betts <olly@survex.com>
6432 * Added an alternative test mode - you can now pass parameters as
6433 command line arguments, which is more convenient for repeating
6434 the same test query, and for automated testing, e.g.:
6436 omega 'P=information retrieval' DB=papers
6438 If the first parameter starts with a "-" and doesn't contain an
6439 "=", omega now outputs the version string and stops (to gracefully
6440 handle "omega --version" and "omega --help").
6442 Wed Jun 12 23:39:20 BST 2002 Olly Betts <olly@survex.com>
6444 * omindex.cc: removed OLD_PREFIXES code - shout if you were using it.
6446 Fri May 17 14:09:25 BST 2002 Olly Betts <olly@survex.com>
6448 * Pass the database to the query parser (not used there at present,
6449 but will allow wildcarded searches, etc to be implemented).
6451 Thu May 16 17:57:34 BST 2002 Olly Betts <olly@survex.com>
6453 * <algo.h> -> <algorithm>.
6455 Thu May 16 15:41:14 BST 2002 Sam Liddicott <sam@ananova.com>
6457 * Removed extra package again!
6459 * Moved images to /var/www/icons/omega till we think of something
6460 better. Should be the most harmless solution that still works
6461 without requiring too much brains on the part of the installer
6463 Thu May 16 14:53:54 BST 2002 Sam Liddicott <sam@ananova.com>
6465 * Moved images to a separate optional package to stop touching
6466 user's web tree until we work out what to do. sysadmin can
6467 still install images if he wants and on a redhat box they will
6468 end up in the right place. This will no doubt get revisited later,
6471 Thu May 16 13:31:27 BST 2002 Sam Liddicott <sam@ananova.com>
6473 * Added loads more missing files like images and templates to the
6476 * Also fixed the templates to use the new images dir (if they used
6477 images, which they actually don't)
6479 Thu May 16 12:56:55 BST 2002 Sam Liddicott <sam@ananova.com>
6481 * Fixes to spec file to add various missing files
6483 Wed May 15 12:59:37 BST 2002 Olly Betts <olly@survex.com>
6485 * omindex now understands acronyms (N.A.T.O. E.T ...).
6487 * $highlight{} now understands "&" (AT&T M&S ...) and acronyms.
6489 Tue May 14 13:08:41 BST 2002 Olly Betts <olly@survex.com>
6491 * Index <word>&<word> as a single term (e.g. AT&T, M&S, A&P).
6493 Tue May 14 12:37:49 BST 2002 Olly Betts <olly@survex.com>
6495 * omindex.cc: cleaned up a little.
6497 Tue May 14 11:24:42 BST 2002 Olly Betts <olly@survex.com>
6499 * Fixed config.h inclusion; using std::*.
6501 Tue May 14 11:18:37 BST 2002 Olly Betts <olly@survex.com>
6505 Tue May 14 11:16:03 BST 2002 Olly Betts <olly@survex.com>
6507 * Added SORT and SORTBANDS.
6509 Mon May 13 12:52:29 BST 2002 Olly Betts <olly@survex.com>
6513 * Commented out omindex-config (since it's unfinished) and XML support
6514 (since only omindex-config uses it).
6516 Thu May 02 16:06:02 BST 2002 Olly Betts <olly@survex.com>
6518 * Updated to reflect removal of OmData.
6520 Wed May 01 11:26:59 BST 2002 Olly Betts <olly@survex.com>
6522 * Changed to use queryparser in libomqueryparser.
6524 Tue Apr 23 15:10:42 BST 2002 Olly Betts <olly@survex.com>
6526 * Make buildall smart enough to generate aclocal.m4 properly and
6527 remove acinclude.m4. It now also extracts the package name from
6528 configure.in so we can use the same buildall everywhere; fixed
6529 problem with double use of AM_CXXFLAGS in Makefile.am.
6531 Tue Apr 23 14:27:29 BST 2002 Olly Betts <olly@survex.com>
6533 * Updated for xapian-config and xapian.m4 changes.
6535 Thu Apr 18 14:37:05 BST 2002 Olly Betts <olly@survex.com>
6537 * Updated buildall; minor tweaks to configure.in.
6539 Wed Apr 17 12:31:18 BST 2002 Olly Betts <olly@survex.com>
6541 * Removed references to xapian-config uninst options.
6543 Fri Apr 12 15:48:33 BST 2002 Olly Betts <olly@survex.com>
6545 * Remove parsequery.cc on "make maintainer-clean".
6547 Fri Apr 12 16:19:19 BST 2002 Olly Betts <olly@survex.com>
6549 * Require automake 1.5.
6551 Fri Apr 12 12:47:04 BST 2002 Olly Betts <olly@survex.com>
6553 * Tweaked what gets interpreted as a phrase.
6555 Fri Apr 12 12:44:00 BST 2002 Olly Betts <olly@survex.com>
6557 * Fixed to use AM_CFLAGS and AM_CXXFLAGS.
6559 Mon Apr 01 23:34:09 BST 2002 Olly Betts <olly@survex.com>
6561 * Fixed support for decimal numeric entities (e.g. "&#246;")
6563 * Added support for all iso-8859-1 named entities (e.g. "&ouml;")
6565 Mon Apr 01 15:07:31 BST 2002 Olly Betts <olly@survex.com>
6567 * Applied patch from "orion orion" to fix problem in HTML parsing.
6569 Mon Mar 25 13:11:14 GMT 2002 Olly Betts <olly@survex.com>
6571 * More tolerant treatment of random punctuation in query.
6573 Mon Feb 4 14:57:36 GMT 2002 Sam Liddicott <sam@ananova.com>
6575 * Added support for repeated fields in document data.
6576 $field{fieldname} may now return multiple tab separated values if
6577 more than one instance of a field exists in the document data
6579 Tue Jan 15 16:29:39 GMT 2002 Sam Liddicott <sam@ananova.com>
6581 * Fixed date_range_filter for the case where DATE1 and DATE2 don't
6582 share the same MONTH and YEAR and M## terms for intermediate months
6583 need calculating between the years.
6585 Thu Jan 10 15:39:43 GMT 2002 Sam Liddicott <sam@ananova.com>
6587 * Added $htmlstrip{} to strip out html tags
6589 Thu Jan 10 14:34:35 GMT 2002 James Aylett <tartarus@users.sourceforge.net>
6591 * toptermsjs snippet now included inside the HEAD, so it's
6592 actually legal HTML. Snippet now sets the required 'type'
6593 attribute as well. (It keeps the technically illegal
6594 'language' attribute because I have a sneaking suspicion it
6595 won't work otherwise.)
6597 Thu Jan 10 14:30:19 GMT 2002 James Aylett <tartarus@users.sourceforge.net>
6599 * $opt with two arguments now acts as a lookup for a $setmap
6600 map. This was previously documented in a misleading fashion.
6601 The new system is backwards compatible with the old.
6603 Wed Jan 9 Sam Liddicott <sam@ananova.com>
6605 * Added RAW_SEARCH as cgi param which when set stops change-search
6606 detection being performed and processes rset, topdoc and page-change
6607 parameters ( [ ] < > 1 2 etc etc ) anyway
6609 * Added MIN_HITS cgi param to request many more hits than can
6610 fit on the page so we can be confident that the next few
6611 consecutive pages will really be needed
6613 * Added xml template which when combined with RAW_SEARCH=1
6614 can be very useful when searching is done from another
6617 Fri Dec 21 17:56:02 GMT 2001 Olly Betts <olly@survex.com>
6619 * Namespace fixes to allow use of find and find_if on Redhat's
6622 Fri Dec 21 17:53:59 GMT 2001 Olly Betts <olly@survex.com>
6624 * Added quick'n'dirty interface to allow experimentation with
6627 Thu Dec 20 14:46:33 GMT 2001 Olly Betts <olly@survex.com>
6629 * Document xDB, xDAYSMINUS, xDATE1, xDATE2, xB.
6631 Thu Dec 20 12:55:29 GMT 2001 Olly Betts <olly@survex.com>
6633 * Use double quotes on parameters to <BODY>.
6635 Mon Dec 17 15:01:43 GMT 2001 Olly Betts <olly@survex.com>
6637 * Get rid of whitespace between hundreds and tens image in page
6640 Fri Dec 14 17:26:48 GMT 2001 Olly Betts <olly@survex.com>
6642 * Force first page of hits if DB, DEFAULTOP, B, DAYSMINUS, DATE1,
6643 or DATE2 changes; also clear relevance judgements if DB changes.
6645 Fri Dec 14 16:21:07 GMT 2001 Olly Betts <olly@survex.com>
6647 * Removed restriction on minimum page size (was 10) - for a shopping
6648 type application with images next to each hit, 5 or fewer per page
6649 might be reasonable; even one result per page makes sense for some
6652 Fri Dec 14 15:37:20 GMT 2001 Olly Betts <olly@survex.com>
6654 * Added $error to make nicer error reporting possible.
6656 Fri Dec 14 14:49:18 GMT 2001 Olly Betts <olly@survex.com>
6658 * Give more helpful messages for query syntax errors in cases where
6659 we can without elaborate YACC hackery.
6661 Thu Dec 13 15:10:24 GMT 2001 Olly Betts <olly@survex.com>
6663 * For image page buttons, display pages 10-999 by using 2 or 3 images.
6665 Thu Dec 13 15:02:16 GMT 2001 Olly Betts <olly@survex.com>
6667 * New operators: $div{}, $mod{}, $mul{}, $sub{}, $ge{}, $gt{}, $le{},
6670 Wed Dec 12 16:37:47 GMT 2001 Olly Betts <olly@survex.com>
6672 * Updated omegascript documentation.
6674 Wed Dec 12 15:43:19 GMT 2001 Olly Betts <olly@survex.com>
6676 * Fixed TOPDOC clipping.
6678 Wed Dec 12 15:36:20 GMT 2001 Olly Betts <olly@survex.com>
6680 * templates/query: Fixed typo which caused "..." to appear after
6681 page buttons when it wasn't appropriate.
6683 Wed Dec 12 15:11:23 GMT 2001 Olly Betts <olly@survex.com>
6685 * omega: Added stopword list (still hardcoded at present though).
6687 Wed Dec 12 12:46:57 GMT 2001 Olly Betts <olly@survex.com>
6689 * omindex: index unstemmed terms with prefix 'R' (mnemonic: Raw).
6691 * omega: $topterms will now return terms with prefix 'R'.
6693 * parsequery.yy: fixed handling of DEFAULT_OP; "+first second" and
6694 "-first second" now work; stopwording queries working (currently
6695 stopword list is hardwired to just "the") - stopwords are ignored
6696 when used as normal terms, but not in phrases, or with + and -.
6698 * templates/query: make use of $prettyterm{}.
6700 Wed Dec 12 11:11:30 GMT 2001 Olly Betts <olly@survex.com>
6702 * $highlight{} now uses find_if not find_first_of (faster).
6704 * Fixed detection of new/old/extended query when a term occurs
6705 in the query more than once.
6707 * Added $prettyterm{TERM} to convert a probabilistic term for
6708 display to the user.
6710 * $map would allow more than two arguments, but ignore them. Fixed
6711 to take exactly two.
6713 Fri Dec 07 15:59:21 GMT 2001 Olly Betts <olly@survex.com>
6715 * Added macros to OmegaScript.
6717 * templates/query: updated to use macros.
6719 * Removed specialcase to allow no-argument commands to accept an empty
6720 argument list (e.g. "$thispage{}" rather than "$thispage"). The only
6721 reason this was useful was to allow "$thispage{}s" which can just as
6722 well be written using a comment to force the parser to do what you want,
6723 e.g. "$thispage${}s".
6725 Thu Dec 06 18:59:34 GMT 2001 Olly Betts <olly@survex.com>
6727 * If a stemmer is set, and all_stem isn't, only stem terms starting
6728 with a lowercase letter.
6730 Thu Dec 06 18:49:40 GMT 2001 Olly Betts <olly@survex.com>
6732 * parsequery.yy: changed to use find_if() (faster than find_first_of()).
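The likely reason for the speed-up: find_first_of() compares each input character against every character in a search set, whereas find_if() with a predicate does a fixed amount of work per character. A minimal sketch of the find_if() form follows. The predicate and function names are hypothetical, and what parsequery.yy actually tests for is not stated in this entry.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical predicate: true for characters that can start a term.
static bool is_term_start(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) != 0;
}

// Offset of the first term-start character at or after `from`,
// or s.size() if there is none.
std::string::size_type find_term_start(const std::string& s,
                                       std::string::size_type from = 0) {
    if (from > s.size()) from = s.size();
    std::string::const_iterator it =
        std::find_if(s.begin() + from, s.end(), is_term_start);
    return static_cast<std::string::size_type>(it - s.begin());
}
```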
6734 Thu Dec 06 17:46:37 GMT 2001 Olly Betts <olly@survex.com>
6736 * Base page links on estimated number of matches, not minimum.
6738 Wed Dec 05 17:07:33 GMT 2001 Olly Betts <olly@survex.com>
6740 * omindex: minor speed tweaks.
6742 Wed Dec 05 16:52:21 GMT 2001 Olly Betts <olly@survex.com>
6744 * omindex: further HTML parser speed-ups.
6746 Wed Dec 05 16:31:33 GMT 2001 Olly Betts <olly@survex.com>
6748 * omindex: sped up HTML parsing.
6750 Wed Dec 05 14:52:53 GMT 2001 Olly Betts <olly@survex.com>
6752 * omindex: parsing terms from text is now twice as fast.
6754 Thu Nov 29 16:53:45 GMT 2001 Olly Betts <olly@survex.com>
6756 * NEAR phrases (e.g. "a NEAR b NEAR c") now work; removed "{a b c}"
6757 syntax for NEAR phrases.
6759 Thu Nov 29 15:25:54 GMT 2001 Olly Betts <olly@survex.com>
6761 * $highlight{} now allows you to specify the tags to use for the
6764 Thu Nov 29 15:24:53 GMT 2001 Olly Betts <olly@survex.com>
6766 * topdoc is unsigned so subtracting and then checking if it's < 0
6769 Wed Nov 28 15:45:39 GMT 2001 Olly Betts <olly@survex.com>
6771 * Fixed clipping of hit page in case when there are a multiple of
6772 HITSPERPAGE matches.
6774 Wed Nov 28 14:03:48 GMT 2001 Olly Betts <olly@survex.com>
6776 * Added $hostname{URL}; $version output now says "Xapian - omega
6779 Wed Nov 28 13:04:46 GMT 2001 Olly Betts <olly@survex.com>
6781 * docs/cgiparams.txt: Minor corrections and updates.
6783 Wed Nov 28 13:03:40 GMT 2001 Olly Betts <olly@survex.com>
6785 * If we're asked for a page of hits beyond the end of the matches, clip
6786 to the last page of matches rather than the first.
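The clipping described above can be sketched as follows. This is an illustrative C++ fragment and not the code in omega itself: the function name clip_topdoc and its parameters are assumptions, with topdoc taken to be the zero-based offset of the first hit on the requested page.

```cpp
#include <cassert>
#include <cstddef>

// Clip a requested first-hit offset so that it falls on the last page
// which actually contains matches.
std::size_t clip_topdoc(std::size_t topdoc, std::size_t matches,
                        std::size_t hits_per_page) {
    assert(hits_per_page > 0);
    if (matches == 0) return 0;
    // Offset of the first hit on the last non-empty page. Dividing
    // (matches - 1) rather than matches stops an exact multiple of
    // hits_per_page from pointing at an empty page.
    std::size_t last_page_start =
        ((matches - 1) / hits_per_page) * hits_per_page;
    return topdoc > last_page_start ? last_page_start : topdoc;
}
```

With 25 matches and 10 hits per page, a request starting at offset 50 is clipped to offset 20. With exactly 30 matches, a request starting at offset 30 is also clipped to 20 instead of showing an empty page.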
6788 Wed Nov 28 13:02:31 GMT 2001 Olly Betts <olly@survex.com>
6790 * For an EXTENDED_QUERY, force the first page of hits.
6792 Wed Nov 28 12:56:56 2001 James Aylett <tartarus@users.sourceforge.net>
6794 * Lower case terms when constructing the query (otherwise why
6795 do we store them in the database that way? :-)
6797 Wed Nov 28 12:36:49 GMT 2001 Olly Betts <olly@survex.com>
6799 * Fettled default query template.
6801 Wed Nov 28 12:33:52 GMT 2001 Olly Betts <olly@survex.com>
6803 * Request one more match than the last we want to display so we can
6804 tell if the next page of hits is empty or not - otherwise we risk
6805 offering a "next page" link when there are no more hits.
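A minimal sketch of this "ask for one extra" technique, in C++. The struct and function names are hypothetical and do not reflect how omega's code is organised.

```cpp
#include <cassert>
#include <cstddef>

struct PageRequest {
    std::size_t first;      // offset of the first hit wanted
    std::size_t max_items;  // how many hits to ask the matcher for
};

// Ask for one hit more than will be displayed on the page.
PageRequest make_request(std::size_t topdoc, std::size_t hits_per_page) {
    assert(hits_per_page > 0);
    PageRequest r = { topdoc, hits_per_page + 1 };
    return r;
}

// `returned` is how many hits actually came back for that request.
// The extra hit is only there if a further page would be non-empty.
bool has_next_page(std::size_t returned, std::size_t hits_per_page) {
    return returned > hits_per_page;
}

// Never display the extra hit itself.
std::size_t hits_to_display(std::size_t returned, std::size_t hits_per_page) {
    return returned < hits_per_page ? returned : hits_per_page;
}
```

If exactly hits_per_page hits come back, the page is full but has_next_page() is false, so no "next page" link is offered.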
6807 Mon Nov 26 16:28:00 2001 James Aylett <tartarus@users.sourceforge.net>
6809 * --no-recurse / -l option added; useful if your sites are
6810 nested in their disc storage (particularly things like
6811 http://example.com/ being a distinct site, with
6812 http://example.com/product being within it)
6814 * --mime-type now really works (it was --mime-map in the code)
6816 * documentation updated further
6818 Mon Nov 26 14:39:00 2001 James Aylett <tartarus@users.sourceforge.net>
6820 * options parsing fixed so minimised/unrecognised long options
6823 Mon Nov 26 14:00:13 2001 James Aylett <tartarus@users.sourceforge.net>
6825 * omindex can now index part of a site (previously 'subsite')
6826 by having an index base within the site's disc storage
6828 Mon Nov 26 13:57:10 2001 James Aylett <tartarus@users.sourceforge.net>
6830 * Documentation updated for recent changes
6832 Thu Nov 22 13:24:45 GMT 2001 Olly Betts <olly@survex.com>
6834 * Use $nice{} in query template, but don't use $freqs. Use numbers as
6835 page image button tooltips on Netscape 4.
6837 Thu Nov 22 13:02:17 GMT 2001 Olly Betts <olly@survex.com>
6839 * Herded escaped CGI parameter mangling code back into cgiparam.cc;
6840 added special handling for numeric image button names.
6842 Thu Nov 22 12:55:00 GMT 2001 Olly Betts <olly@survex.com>
6844 * Fixed $nice to put the comma (or dot) in the right place.
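For reference, one way to place thousands separators correctly is to count digits from the right-hand end. The C++ sketch below shows that idea; the function name and the assumption that the input is a plain string of digits are for this example only, and it does not claim to be the code behind $nice.

```cpp
#include <string>

// Insert `sep` between groups of three digits, counting from the right.
// Assumption: `digits` holds only decimal digits (no sign, no fraction).
std::string nice_number(const std::string& digits, char sep = ',') {
    std::string out;
    const std::string::size_type n = digits.size();
    for (std::string::size_type i = 0; i < n; ++i) {
        // A separator goes before position i when the number of digits
        // still to come (n - i) is a multiple of three, except at the
        // very start of the number.
        if (i != 0 && (n - i) % 3 == 0) out += sep;
        out += digits[i];
    }
    return out;
}
```

Passing '.' as the second argument gives the dot-separated form.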
6846 Tue Nov 20 17:30:19 GMT 2001 Olly Betts <olly@survex.com>
6848 * $lastpage now returns 0 when there are no matches (previously
6849 gave a very large answer).
6851 Tue Nov 20 12:30:47 GMT 2001 Olly Betts <olly@survex.com>
6853 * $terms now only returns terms which were in the parsed query
6854 (boolean filter terms are excluded).
6856 Tue Nov 20 12:07:54 GMT 2001 Olly Betts <olly@survex.com>
6858 * Fixed bug in date range filtering (got it wrong when start and end
6859 date were in the same month).
6861 * DAYSMINUS now counts back from DATE1 (if specified) rather than
6862 always counting back from the present.
6864 Mon Nov 19 17:13:24 GMT 2001 Olly Betts <olly@survex.com>
6866 * Added date-range filtering (not fully tested yet).
6868 Mon Nov 19 15:21:31 GMT 2001 Olly Betts <olly@survex.com>
6870 * Fixed (c) message displayed by -v (BrightStation "PLC" not "Inc.",
6873 Fri Nov 16 11:49:20 GMT 2001 Olly Betts <olly@survex.com>
6875 * New OmegaScript commands: $allterms{<docid>}, $freq{<term>},
6876 $nice{<number>}, $set_relevant{<docid>}.
6878 * $map{} now returns a list (shouldn't affect most users - if
6879 the extra tabs are a problem, change `$map{...}' to
6880 `$list{$map{...},}' ).
6882 * Template `query' now preserves value of THRESHOLD.
6884 * Template `godmode' fixed to actually work.
6886 Wed Nov 14 15:04:13 GMT 2001 Olly Betts <olly@survex.com>
6888 * Fixed to compile with GCC 3.0.
6890 Wed Nov 14 14:54:53 GMT 2001 Olly Betts <olly@survex.com>
6892 * Updated for changes to OmQuery
6894 Tue Nov 06 13:10:15 GMT 2001 Olly Betts <olly@survex.com>
6896 * Updated .cvsignore.
6898 Tue Nov 06 13:02:04 GMT 2001 Olly Betts <olly@survex.com>
6900 * Fixed lookup of CGI parameter THRESHOLD.
6902 Tue Nov 6 12:38:37 GMT 2001 Richard Boulton <richard@tartarus.org>
6904 * Moved configure.ac to configure.in: depending on autoconf 2.13 is
6907 Tue Nov 06 12:23:55 GMT 2001 Olly Betts <olly@survex.com>
6909 * Added support for percentage threshold cutoff (CGI var THRESHOLD);
6910 Code for calculating better percentages has been pushed into Xapian
6911 so removed it from here.
6913 Mon Nov 5 12:42:26 GMT 2001 Richard Boulton <richard@tartarus.org>
6915 * Omega moved to new home, from om-examples/omega.
6916 Standalone build system added.