1 The modular simulator {#page_modularsimulator}
2 ==============================================
4 A new modular approach to the GROMACS simulator is described. The
5 simulator in GROMACS is the object which carries out a simulation. The
6 simulator object is created and owned by the runner object, which is
7 outside of the scope of this new approach, and will hence not be further
8 described. The simulator object provides access to some generally used
9 data, most of which is owned by the runner object.
11 ## Legacy implementation
13 In the legacy implementation, the simulator consisted of a number of
14 independent functions carrying out different type of simulations, such
15 as `do_md` (MD simulations), `do_cg` and `do_steep` (minimization),
16 `do_rerun` (force and energy evaluation of simulation trajectories),
17 `do_mimic` (MiMiC QM/MM simulations), `do_nm` (normal mode analysis),
18 and `do_tpi` (test-particle insertion).
20 The legacy approach has some obvious drawbacks:
21 * *Data management:* Each of the `do_*` functions defines local data,
22 including complex objects encapsulating some data and functionality,
23 but also data structures effectively used as "global variables" for
24 communication between different parts of the simulation. Neither the
25 ownership nor the access rights (except for `const` qualifiers) are
27 * *Dependencies:* Many function calls in the `do_*` functions are
28 dependent on each others, i.e. rely on being called in a specific
29 order, but these dependencies are not clearly defined.
30 * *Branches:* The flow of the `do_*` functions are hard to understand
31 due to branching. At setup time, and then at every step of the
32 simulation run, a number of booleans are set (e.g. `bNS` (do neighbor
33 searching), `bCalcEner` (calculate energies), `do_ene` (write
34 energies), `bEner` (energy calculation needed), etc). These booleans
35 enable or disable branches of the code (for the current step or the
36 entire run), mostly encoded as `if(...)` statements in the main `do_*`
37 loop, but also in functions called from there.
38 * *Task scheduling:* Poorly defined dependencies and per-step branching
39 make task scheduling (e.g. parallel execution of independent tasks)
41 * *Error-prone for developers:* Poorly defined dependencies and unclear
42 code flow make changing the simulator functions very error-prone,
43 rendering the implementation of new methods tedious.
45 ## The modular simulator approach
47 The main design goals of the new, fully modular simulator approach
49 * *Extensibility:* We want to ease maintenance and the implementation
50 of new integrator schemes.
51 * *Monte Carlo:* We want to add MC capability, which can be mixed with
52 MD to create hybrid MC/MD schemes.
53 * *Data locality & interfaces:* We aim at localizing data in objects,
54 and offer interfaces if access from other objects is needed.
55 * *Multi-stepping:* We aim at a design which intrinsically supports
56 multi-step integrators, e.g. having force calls at different
57 frequencies, or avoid having branches including rare events
58 (trajectory writing, neighbor search, ...) in the computation loops.
59 * *Task parallelism:* Although not first priority, we want to have a
60 design which can be extended to allow for task parallelism.
62 The general design approach is that of a **task scheduler**. *Tasks*
63 are argument-less functions which perform a part of the computation.
64 Periodically during the simulation, the scheduler builds a
65 *queue of tasks*, i.e. a list of tasks which is then run through in
66 order. Over time, with data dependencies clearly defined, this
67 approach can be modified to have independent tasks run in parallel.
69 The approach is most easily displayed using some pseudo code:
71 class ModularSimulator : public ISimulator
77 std::vector<ISignaller*> signallers_;
78 std::vector<ISimulatorElement*> elements_;
79 std::queue<SimulatorRunFunction*> taskQueue_;
82 void ModularSimulator::run()
84 constructElementsAndSignallers();
88 // Fill the task queue with new tasks (can be precomputed for many steps)
90 // Now simply loop through the queue and run one task after the next
91 for (auto task : taskQueue)
93 (*task)(); // run task
98 This allows for an important division of tasks.
100 * `constructElementsAndSignallers()` is responsible to **store the
101 elements in the right order**. This includes the different order of
102 element in different algorithms (e.g. leap-frog vs. velocity
103 verlet), but also logical dependencies (energy output after compute
105 * `populateTaskQueue()` is responsible to **decide if elements need to
106 run at a specific time step**. The elements get called in order, and
107 decide whether they need to run at a specific step. This can be
108 pre-computed for multiple steps. In the current implementation, the
109 tasks are pre-computed for the entire life-time of the neighbor
111 * **Running the actual simulation tasks** is done after the task queue
112 was filled. This is achieved by simply looping over the task list,
113 no conditionals or branching needed.
115 ### Simulator elements
117 The task scheduler holds a list of *simulator elements*, defined by
118 the `ISimulatorElement` interface. These elements have a
119 `scheduleTask(Step, Time)` function, which gets called by the task
120 scheduler. This allows the simulator element to register one (or more)
121 function pointers to be run at that specific `(Step, Time)`. From the
122 point of view of the element, it is important to note that the
123 computation will not be carried out immediately, but that it will be
124 called later during the actual (partial) simulation run. From the
125 point of view of the builder of the task scheduler, it is important to
126 note that the order of the elements determines the order in which
127 computation is performed. The task scheduler periodically loops over
128 its list of elements, builds a queue of function pointers to run, and
129 returns this list of tasks. As an example, a possible application
130 would be to build a new queue after each domain-decomposition (DD) /
131 neighbor-searching (NS) step, which might occur every 100 steps. The
132 scheduler would loop repeatedly over all its elements, with elements
133 like the trajectory-writing element registering for only one or no
134 step at all, the energy-calculation element registering for every
135 tenth step, and the force, position / velocity propagation, and
136 constraining algorithms registering for every step. The result would
137 be a (long) queue of function pointers including all computations
138 needed until the next DD / NS step, which can be run without any
143 Some elements might require computations by other elements. If for
144 example, the trajectory writing is an element independent from the
145 energy-calculation element, it needs to signal to the energy element
146 that it is about to write a trajectory, and that the energy element
147 should be ready for that (i.e. perform an energy calculation in the
148 upcoming step). This requirement, which replaces the boolean branching
149 in the current implementation, is fulfilled by a Signaller - Client
150 model. Classes implementing the `ISignaller` interface get called
151 *before* every loop of the element list, and can inform registered
152 clients about things happening during that step. The trajectory
153 element, for example, can tell the energy element that it will write
154 to trajectory at the end of this step. The energy element can then
155 register an energy calculation during that step, being ready to write
156 to trajectory when requested.
158 ### Sequence diagrams
161 In the loop preparation, the signallers and elements are created and
162 stored in the right order. The signallers and elements can then
163 perform any setup operations needed.
169 Signallers [label="ModularSimulator::\nSignallers"],
170 Elements [label="ModularSimulator::\nElements"],
171 TaskQueue [label="ModularSimulator::\nTaskQueue"];
173 --- [ label = "constructElementsAndSignallers()" ];
174 ModularSimulator => Signallers [ label = "Create signallers\nand order them" ];
175 ModularSimulator => Elements [ label = "Create elements\nand order them" ];
176 --- [ label = "constructElementsAndSignallers()" ];
180 --- [ label = "setupAllElements()" ];
181 ModularSimulator => Signallers [ label = "Call setup()" ];
182 Signallers box Signallers [ label = "for signaler in Signallers\n signaller->setup()" ];
184 ModularSimulator => Elements [ label = "Call setup()" ];
185 Elements box Elements [ label = "for element in Elements\n element->setup()" ];
186 --- [ label = "setupAllElements()" ];
190 The main loop consists of two parts which are alternately run until the
191 simulation stop criterion is met. The first part is the population of
192 the task queue, which determines all tasks that will have to run to
193 simulate the system for a given time period. In the current implementation,
194 the scheduling period is set equal to the lifetime of the neighborlist.
195 Once the tasks have been predetermined, the simulator runs them in order.
196 This is the actual simulation computation, which can now run without any
203 Signallers [label="ModularSimulator::\nSignallers"],
204 Elements [label="ModularSimulator::\nElements"],
205 TaskQueue [label="ModularSimulator::\nTaskQueue"];
207 ModularSimulator box TaskQueue [ label = "loop: while(not lastStep)" ];
208 ModularSimulator note TaskQueue [ label = "The task queue is empty. The simulation state is at step N.", textbgcolor="yellow" ];
211 ModularSimulator box ModularSimulator [ label = "populateTaskQueue()" ];
212 ModularSimulator =>> TaskQueue [ label = "Fill task queue with tasks until next neighbor-searching step" ];
215 ModularSimulator note TaskQueue [ label = "The task queue now holds all tasks needed to move the simulation from step N to step N + nstlist. The simulation for these steps has not been performed yet, however. The simulation state is hence still at step N.", textbgcolor="yellow" ];
219 ModularSimulator => TaskQueue [ label = "Run all tasks in TaskQueue" ];
220 TaskQueue box TaskQueue [label = "for task in TaskQueue\n run task" ];
221 TaskQueue note TaskQueue [ label = "All simulation computations are happening in this loop!", textbgcolor="yellow" ];
224 ModularSimulator note TaskQueue [ label = "The task queue is now empty. The simulation state is at step N + nstlist.", textbgcolor="yellow" ];
225 ModularSimulator box TaskQueue [ label = "end loop: while(not lastStep)" ];
230 A part of the main loop, the task scheduling in `populateTaskQueue()`
231 allows the elements to push tasks to the task queue. For every scheduling
232 step, the signallers are run first to give the elements information about
233 the upcoming scheduling step. The scheduling routine elements are then
234 called in order, allowing the elements to register their respective tasks.
240 Signallers [label="ModularSimulator::\nSignallers"],
241 Elements [label="ModularSimulator::\nElements"],
242 TaskQueue [label="ModularSimulator::\nTaskQueue"];
244 --- [ label = "populateTaskQueue()" ];
245 ModularSimulator box ModularSimulator [ label = "doDomainDecomposition()\ndoPmeLoadBalancing()" ];
246 ModularSimulator =>> Elements [ label = "Update state and topology" ];
250 ModularSimulator note ModularSimulator [ label = "schedulingStep == N\nsimulationStep == N", textbgcolor="yellow" ];
251 ModularSimulator box TaskQueue [ label = "loop: while(not nextNeighborSearchingStep)" ];
252 ModularSimulator => Signallers [ label = "Run signallers for schedulingStep" ];
253 Signallers box Signallers [label = "for signaller in Signallers\n signaller->signal(scheduleStep)" ];
254 Signallers =>> Elements [ label = "notify" ];
255 Signallers note Elements [ label = "The elements now know if schedulingStep has anything special happening, e.g. neighbor searching, log writing, trajectory writing, ...", textbgcolor="yellow" ];
259 ModularSimulator => Elements [ label = "Schedule run functions for schedulingStep" ];
260 Elements box Elements [label = "for element in Elements\n element->scheduleTask(scheduleStep)" ];
261 Elements =>> TaskQueue [ label = "Push task" ];
262 Elements note TaskQueue [ label = "The elements have now registered everything they will need to do for schedulingStep.", textbgcolor="yellow" ];
263 ModularSimulator note ModularSimulator [ label = "schedulingStep++", textbgcolor="yellow" ];
265 ModularSimulator box TaskQueue [ label = "end loop: while(not nextNeighborSearchingStep)" ];
266 --- [ label = "populateTaskQueue()" ];
267 ModularSimulator note ModularSimulator [ label = "schedulingStep == N + nstlist\nsimulationStep == N", textbgcolor="yellow" ];
271 ## Acceptance tests and further plans
273 Acceptance tests before this can be made default code path (as
274 defined with Mark Jan 2019)
275 * End-to-end tests pass on both `do_md` and the new loop in
276 Jenkins pre- and post-submit matrices
277 * Physical validation cases pass on the new loop
278 * Performance on different sized benchmark cases, x86 CPU-only
279 and NVIDIA GPU are at most 1% slower -
280 https://github.com/ptmerz/gmxbenchmark has been developed to
283 After the NVE MD bare minimum, we will want to add support for
284 * Thermo- / barostats
289 Using the new modular simulator framework, we will then explore
290 adding new functionality to GROMACS, including
291 * Monte Carlo barostat
292 * hybrid MC/MD schemes
293 * multiple-time-stepping integration
295 We sill also explore optimization opportunities, including
296 * re-use of the same queue if conditions created by user input are
297 sufficiently favorable (e.g. by design or when observed)
298 * simultaneous execution of independent tasks
300 We will probably not prioritize support for (and might consider
301 deprecating from do_md for GROMACS 2020)
302 * Simulated annealing
304 * Simulated tempering
309 * Fancy mdp options for FEP output
314 * Constant acceleration groups
315 * Ensemble-averaged restraints
316 * Time-averaged restraints
317 * Freeze, deform, cos-acceleration
319 ## Signallers and elements
321 The current implementation of the modular simulator consists of
322 the following signallers and elements:
326 All signallers have a list of pointers to clients, objects that
327 implement a respective interface and get notified of events the
328 signaller is communicating.
330 * `NeighborSearchSignaller`: Informs its clients whether the
331 current step is a neighbor-searching step.
332 * `LastStepSignaller`: Informs its clients when the current step
333 is the last step of the simulation.
334 * `LoggingSignaller`: Informs its clients whether output to the
335 log file is written in the current step.
336 * `EnergySignaller`: Informs its clients about energy related
337 special steps, namely energy calculation steps, virial
338 calculation steps, and free energy calculation steps.
339 * `TrajectoryElement`: Informs its clients if writing to
340 trajectory (state [x/v/f] and/or energy) is planned for the
341 current step. Note that the `TrajectoryElement` is not a
342 pure signaller, but also implements the `ISimulatorElement`
343 interface (see section "Simulator Elements" below).
345 ### Simulator Elements
347 #### `TrajectoryElement`
348 The `TrajectoryElement` is a special element, as it
349 is both implementing the `ISimulatorElement` and the `ISignaller`
350 interfaces. During the signaller phase, it is signalling its
351 _signaller clients_ that the trajectory will be written at the
352 end of the current step. During the simulator run phase, it is
353 calling its _trajectory clients_ (which do not necessarily need
354 to be identical with the signaller clients), passing them a valid
355 output pointer and letting them write to trajectory. Unlike the
356 legacy implementation, the trajectory element itself knows nothing
357 about the data that is written to file - it is only responsible
358 to inform clients about trajectory steps, and providing a valid
359 file pointer to the objects that need to write to trajectory.
361 #### `StatePropagatorData`
362 The `StatePropagatorData` takes part in the simulator run, as it might
363 have to save a valid state at the right moment during the
364 integration. Placing the StatePropagatorData correctly is for now the
365 duty of the simulator builder - this might be automated later
366 if we have enough meta-data of the variables (i.e., if
367 `StatePropagatorData` knows at which time the variables currently are,
368 and can decide when a valid state (full-time step of all
369 variables) is reached. The `StatePropagatorData` is also a client of
370 both the trajectory signaller and writer - it will save a
371 state for later writeout during the simulator step if it
372 knows that trajectory writing will occur later in the step,
373 and it knows how to write to file given a file pointer by
374 the `TrajectoryElement`.
377 The `EnergyElement` takes part in the simulator run, as it
378 does either add data (at energy calculation steps), or
379 record a non-calculation step (all other steps). It is the
380 responsibility of the simulator builder to ensure that the
381 `EnergyElement` is called at a point of the simulator run
382 at which it has access to a valid energy state.
387 ### `StatePropagatorData`
388 The `StatePropagatorData` contains a little more than the pure
389 statistical-physical micro state, namely the positions,
390 velocities, forces, and box matrix, as well as a backup of
391 the positions and box of the last time step. While it takes
392 part in the simulator loop to be able to backup positions /
393 boxes and save the current state if needed, it's main purpose
394 is to offer access to its data via getter methods. All elements
395 reading or writing to this data need a pointer to the
396 `StatePropagatorData` and need to request their data explicitly. This
397 will later simplify the understanding of data dependencies
400 Note that the `StatePropagatorData` can be converted to and from the
401 legacy `t_state` object. This is useful when dealing with
402 functionality which has not yet been adapted to use the new
403 data approach - of the elements currently implemented, only
404 domain decomposition, PME load balancing, and the initial
405 constraining are using this.
408 The EnergyElement owns the EnergyObject, and is hence responsible
409 for saving energy data and writing it to trajectory. It also owns
410 the tensors for the different virials and the pressure as well as
411 the total dipole vector.
413 It subscribes to the trajectory signaller, the energy signaller,
414 and the logging signaller to know when an energy calculation is
415 needed and when a non-recording step is enough. The simulator
416 builder is responsible to place the element in a location at
417 which a valid energy state is available. The EnergyElement is
418 also a subscriber to the trajectory writer element, as it is
419 responsible to write energy data to trajectory.
421 The EnergyElement offers an interface to add virial contributions,
422 but also allows access to the raw pointers to tensor data, the
423 dipole vector, and the legacy energy data structures.
426 The topology object owns the local topology and holds a constant reference
427 to the global topology owned by the ISimulator.
429 The local topology is only infrequently changed if domain decomposition is
430 on, and never otherwise. The topology holder therefore offers elements to register
431 as ITopologyHolderClients. If they do so, they get a handle to the updated local
432 topology whenever it is changed, and can rely that their handle is valid
433 until the next update. The domain decomposition element is defined as friend
434 class to be able to update the local topology when needed.
436 The topology holder is not a `ISimulatorElement`, i.e. it does not take part in the