Neighborhood search for analysis tools {#page_analysisnbsearch}
======================================

The header nbsearch.h declares a C++ interface to a relatively flexible and
efficient neighborhood search. It is currently implemented within the
selection module where it originated, but it does not depend on any of the
other selection code and can easily be split out in the future.

The emphasis is on flexibility and ease of use; one main driver is to have
one common implementation of grid-based searching, both to avoid replicating
it in multiple tools and to let more tools take advantage of the significant
performance improvement it allows. The main features that it provides:

 - Grid-based searching with any triclinic box shape that \Gromacs supports
   (i.e., a triangular box matrix and not too skewed).
 - Grid-based searching with all PBC options except for screw boundary
   conditions.
 - With no PBC, grid-based searching where the grid is constructed based on the
   bounding box of the gridded atoms.
 - Efficient, rectangular grid cells whose size is determined by particle
   density and not limited by the cutoff.
 - Transparent fallback to a simple all-pairs search if the cutoff is too long
   for the algorithm or grid searching is not otherwise supported.
 - Support for either N-vs-M pair search with two sets of coordinates, or for
   all pairs within a single set of coordinates.
 - Support for computing all distances in the XY plane only (and still
   grid-based).
 - Convenience functions for finding the shortest distance or the nearest pair
   between two sets of positions.
 - Basic support for exclusions.
 - Thread-safe handling of multiple concurrent searches with the same cutoff
   with the same or different reference positions.
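
The all-pairs fallback and the XY-only distance mode listed above can be
illustrated with a small standalone sketch. It does not use the nbsearch.h
API: `Vec3`, `distance2` and `minimumDistance2` are hypothetical names
introduced here, and PBC is ignored for brevity.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for a coordinate triple; not a GROMACS type.
struct Vec3
{
    double x, y, z;
};

// Squared distance between two points; in XY mode the Z component is ignored.
double distance2(const Vec3& a, const Vec3& b, bool xyMode)
{
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = xyMode ? 0.0 : a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// All-pairs (N-vs-M) scan for the smallest squared distance between two
// non-empty sets of positions. No PBC handling in this sketch.
double minimumDistance2(const std::vector<Vec3>& ref, const std::vector<Vec3>& test, bool xyMode)
{
    assert(!ref.empty() && !test.empty());
    double best = std::numeric_limits<double>::infinity();
    for (const Vec3& t : test)
    {
        for (const Vec3& r : ref)
        {
            const double d2 = distance2(r, t, xyMode);
            if (d2 < best)
            {
                best = d2;
            }
        }
    }
    return best;
}
```

The cost of such a scan grows with the product of the two set sizes, which is
why the grid-based path is preferred whenever it is applicable.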

Usage
=====

The neighborhood search works conceptually with two different sets of
coordinates:

 - _reference positions_: When initiating the search, you provide one set of
   reference positions that get placed on the search grid and determine the
   size of the grid.
 - _test positions_: For each set of reference positions, you provide a set of
   test positions (or a single position). The search is performed from each
   test position, finding the reference positions within the cutoff from this
   point. It is possible to perform multiple searches against the same set of
   reference positions (and the same grid).
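
The roles of the two sets can be made concrete with a brute-force sketch of
what a pair search reports: for each test position, every reference position
within the cutoff. This is only a reference model of the semantics, not the
grid-based implementation and not the nbsearch.h API. `Vec3`, `Pair` and
`findPairsWithinCutoff` are hypothetical names, PBC is ignored, and treating
a distance exactly equal to the cutoff as a hit is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a coordinate triple; not a GROMACS type.
struct Vec3
{
    double x, y, z;
};

// One reference-test pair found within the cutoff.
struct Pair
{
    std::size_t refIndex;  // index into the reference positions
    std::size_t testIndex; // index into the test positions
    double      distance2; // squared distance between the two positions
};

// Searches from each test position for all reference positions within the
// cutoff. The same reference set can be reused with different test sets.
std::vector<Pair> findPairsWithinCutoff(const std::vector<Vec3>& ref,
                                        const std::vector<Vec3>& test,
                                        double                   cutoff)
{
    assert(cutoff > 0.0);
    const double      cutoff2 = cutoff * cutoff;
    std::vector<Pair> pairs;
    for (std::size_t ti = 0; ti < test.size(); ++ti)
    {
        for (std::size_t ri = 0; ri < ref.size(); ++ri)
        {
            const double dx = ref[ri].x - test[ti].x;
            const double dy = ref[ri].y - test[ti].y;
            const double dz = ref[ri].z - test[ti].z;
            const double d2 = dx * dx + dy * dy + dz * dz;
            if (d2 <= cutoff2)
            {
                pairs.push_back({ ri, ti, d2 });
            }
        }
    }
    return pairs;
}
```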

To start using the neighborhood search, you first need to create an instance of
gmx::AnalysisNeighborhood. This class allows you to set some global properties
for the search (most notably, the cutoff distance). Then you provide the
reference positions as a gmx::AnalysisNeighborhoodPositions and PBC information
to get a gmx::AnalysisNeighborhoodSearch instance. You can then either use
methods directly in this class to find, e.g., the nearest reference point from
a test position, or you can do a full pair search that returns all the
reference-test pairs within a cutoff. The pair search is performed using an
instance of gmx::AnalysisNeighborhoodPairSearch that the search object returns.
Methods that return information about pairs return an instance of
gmx::AnalysisNeighborhoodPair, which can be used to access the indices of
the reference and test positions in the pair, as well as the computed distance.
See the documentation of these classes for details.

For use together with selections, an instance of gmx::Selection or
gmx::SelectionPosition can be transparently passed as the positions for the
neighborhood search.

Implementation
==============

This section provides a high-level overview of the algorithm used. It is not
necessary to understand all the details to use the API, but it can be useful to
get the best performance out of it. The main audience is developers who may
need to extend the API to make it suitable for more cases.

The grid for the search is initialized based on the reference positions and the
PBC information:

 - The grid cells are always rectangular, even for fully triclinic boxes.
 - If there is no PBC, the grid edges are defined from the bounding box of the
   reference positions; with PBC, the grid covers the unit cell.
 - The grid cell size is determined such that on average, each cell contains
   ten particles. Special considerations are in place for cases where the grid
   will only be one- or two-dimensional because of a flat box.
 - If the resulting grid has too few cells in some dimensions, the code
   falls back automatically to an all-pairs search. For correct operation, the
   grid algorithm needs three cells in each dimension, but the code can fall
   back to a non-gridded search for each dimension separately.
 - The initialization also pre-calculates the shifts required across the
   periodic boundaries for triclinic cells, i.e., the fractional number of
   cells that the grid origin is shifted when crossing the periodic boundary in
   Y or Z directions.
 - Finally, all the reference positions are mapped to the grid cells.
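
The initialization steps above can be sketched for the simplest case: no PBC
and a bounding box with nonzero extent in every dimension. This is a
standalone illustration, not the actual implementation. `Grid`, `buildGrid`
and `linearCellIndex` are hypothetical names, and the way the cell count is
derived from the density target (a cube-root estimate truncated to an integer)
is an assumption. Flat boxes, the per-dimension fallback and triclinic shifts
are omitted.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical grid description; not a GROMACS type.
struct Grid
{
    Vec3                          origin;    // lower corner of the bounding box
    Vec3                          cellSize;  // edge lengths of one rectangular cell
    std::array<int, 3>            cellCount; // number of cells along x, y, z
    std::vector<std::vector<int>> cells;     // reference indices stored per cell
};

// Maps a position to the linear index of the cell that contains it.
// Positions on the upper edge are clamped into the last cell.
int linearCellIndex(const Grid& grid, const Vec3& x)
{
    std::array<int, 3> c;
    for (int d = 0; d < 3; ++d)
    {
        const int raw = static_cast<int>((x[d] - grid.origin[d]) / grid.cellSize[d]);
        c[d]          = std::min(std::max(raw, 0), grid.cellCount[d] - 1);
    }
    return c[0] + grid.cellCount[0] * (c[1] + grid.cellCount[1] * c[2]);
}

// Builds a rectangular grid over the bounding box of the reference positions,
// aiming for roughly particlesPerCell positions in each cell.
Grid buildGrid(const std::vector<Vec3>& ref, double particlesPerCell = 10.0)
{
    assert(!ref.empty() && particlesPerCell > 0.0);
    Grid grid;
    grid.origin = ref[0];
    Vec3 upper  = ref[0];
    for (const Vec3& x : ref)
    {
        for (int d = 0; d < 3; ++d)
        {
            grid.origin[d] = std::min(grid.origin[d], x[d]);
            upper[d]       = std::max(upper[d], x[d]);
        }
    }
    double volume = 1.0;
    for (int d = 0; d < 3; ++d)
    {
        assert(upper[d] > grid.origin[d]); // flat boxes are not handled here
        volume *= upper[d] - grid.origin[d];
    }
    // Edge length of a cubic cell that would hold particlesPerCell on average.
    const double targetEdge =
            std::cbrt(volume * particlesPerCell / static_cast<double>(ref.size()));
    int totalCells = 1;
    for (int d = 0; d < 3; ++d)
    {
        const double extent = upper[d] - grid.origin[d];
        grid.cellCount[d]   = std::max(1, static_cast<int>(extent / targetEdge));
        grid.cellSize[d]    = extent / grid.cellCount[d];
        totalCells *= grid.cellCount[d];
    }
    grid.cells.resize(totalCells);
    for (std::size_t i = 0; i < ref.size(); ++i)
    {
        grid.cells[linearCellIndex(grid, ref[i])].push_back(static_cast<int>(i));
    }
    return grid;
}
```

Because the cell count is truncated to an integer, the realized average
occupancy in this sketch is usually somewhat above the target.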

The target of ten particles per cell in the above logic is a heuristic; it has
not been particularly optimized for best performance.

When doing the search for test positions, each test position is considered
independently:

 - The coordinates of the test position are mapped to the grid coordinate
   system. The coordinates here are fractional and may lie outside the grid
   for non-periodic dimensions.
 - The bounding box of the cutoff sphere centered at the mapped coordinates is
   determined, and each grid cell that intersects with this box is used for
   searching the reference positions. So the searched grid cells may vary
   depending on the coordinates of the test position, even if the test position
   is within the same cell.
 - Possible triclinic shifts in the grid are considered when looping over the
   cells in the cutoff box if the coordinates wrap around a periodic dimension.
   This is done by shifting the search range in the other dimensions when the Z
   or Y dimension loop crosses the boundary.
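
The cell selection described above can be sketched along a single dimension.
This is a standalone illustration under simplifying assumptions: the box is
rectangular, so the triclinic shifts are left out, and `cellsInCutoffRange`
is a hypothetical helper, not part of nbsearch.h. It returns the cells whose
extent overlaps the interval covered by the cutoff around the test coordinate,
wrapping around for a periodic dimension and clipping for a non-periodic one.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Cell indices along one dimension that overlap [x - cutoff, x + cutoff].
std::vector<int> cellsInCutoffRange(double x,
                                    double origin,
                                    double cellSize,
                                    int    cellCount,
                                    double cutoff,
                                    bool   periodic)
{
    assert(cellSize > 0.0 && cellCount > 0 && cutoff >= 0.0);
    // Fractional grid coordinates of the interval ends, rounded down to cells.
    int lo = static_cast<int>(std::floor((x - cutoff - origin) / cellSize));
    int hi = static_cast<int>(std::floor((x + cutoff - origin) / cellSize));

    std::vector<int> result;
    if (periodic)
    {
        // Never visit a cell twice, even if the cutoff spans the whole grid.
        hi = std::min(hi, lo + cellCount - 1);
        for (int c = lo; c <= hi; ++c)
        {
            // Wrap the index into [0, cellCount).
            result.push_back(((c % cellCount) + cellCount) % cellCount);
        }
    }
    else
    {
        // The test position may lie outside the grid; clip to existing cells.
        lo = std::max(lo, 0);
        hi = std::min(hi, cellCount - 1);
        for (int c = lo; c <= hi; ++c)
        {
            result.push_back(c);
        }
    }
    return result;
}
```

With five unit-sized cells and a cutoff of 0.3, test coordinates 2.5 and 2.1
both fall in cell 2, yet only the second one also requires searching cell 1.
A full search would combine such ranges for all three dimensions.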