Simplify how we clamp wdf to <= doclen
commit 71b2d22773df333bc58b6a13f1024f96403ba953
author Olly Betts <olly@survex.com>
Mon, 8 Jul 2024 22:54:45 +0000 (9 10:54 +1200)
committer Olly Betts <olly@survex.com>
Mon, 8 Jul 2024 22:54:45 +0000 (9 10:54 +1200)
tree f33ee0aa4605b22a567369f699b5fa08af6df8f4
parent 020258069aedd101b6fb6adfcc9449463bf61e9a

We need to do this for OP_SYNONYM, where the wdf is approximated.
We now only clamp when need_stat(WDF) and need_stat(DOC_LENGTH)
have both been called. When both have, we can rely on having
both stats, so the clamping is cheap; otherwise the weighting
scheme can't reasonably rely on wdf <= doclen.

This completely avoids the need to compute whether each wildcard or
edit distance expansion is wdf_disjoint.
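
The clamping rule described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the names StatFlag, StatsNeeded and effective_wdf are hypothetical, and the only behaviour taken from the message is "clamp wdf to doclen only when both the WDF and DOC_LENGTH stats were requested". It assumes the approximated wdf can exceed doclen, for instance if it is built by combining wdfs from subqueries that overlap.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stat flags, standing in for the stats a weighting
// scheme can request. Not the real Xapian enum.
enum StatFlag : unsigned {
    STAT_WDF = 1u << 0,
    STAT_DOC_LENGTH = 1u << 1
};

// Hypothetical record of which stats the weighting scheme asked for.
struct StatsNeeded {
    unsigned flags = 0;

    // Record that a stat was requested.
    void need_stat(StatFlag f) { flags |= f; }

    // Report whether a stat was requested.
    bool has(StatFlag f) const { return (flags & f) != 0; }
};

// Return the wdf to pass to the weighting scheme.
// The approximated wdf is clamped to doclen only when both stats were
// requested: in that case both values are already available, so the
// clamp costs a single comparison. Otherwise the value is returned
// unchanged, since the scheme cannot be relying on wdf <= doclen.
std::uint32_t effective_wdf(const StatsNeeded& stats,
                            std::uint32_t approx_wdf,
                            std::uint32_t doclen)
{
    if (stats.has(STAT_WDF) && stats.has(STAT_DOC_LENGTH)) {
        return std::min(approx_wdf, doclen);
    }
    return approx_wdf;
}
```

With this shape the decision depends only on which stats were requested, so no per-expansion analysis of whether the terms are wdf_disjoint is needed in the sketch.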
xapian-core/api/queryinternal.cc
xapian-core/include/xapian/weight.h
xapian-core/matcher/localsubmatch.cc
xapian-core/matcher/localsubmatch.h
xapian-core/matcher/queryoptimiser.h
xapian-core/matcher/synonympostlist.cc
xapian-core/matcher/synonympostlist.h