docs/doxygen/lib/modularsimulator.md

   1 The modular simulator {#page_modularsimulator}
   2 ==============================================
   3
   4 A new modular approach to the GROMACS simulator is described. The
   5 simulator in GROMACS is the object which carries out a simulation. The
   6 simulator object is created and owned by the runner object, which is
   7 outside of the scope of this new approach, and will hence not be further
   8 described. The simulator object provides access to some generally used
   9 data, most of which is owned by the runner object.
  10
  11 ## Legacy implementation
  12
  13 In the legacy implementation, the simulator consisted of a number of
  14 independent functions carrying out different type of simulations, such
  15 as `do_md` (MD simulations), `do_cg` and `do_steep` (minimization),
  16 `do_rerun` (force and energy evaluation of simulation trajectories),
  17 `do_mimic` (MiMiC QM/MM simulations), `do_nm` (normal mode analysis),
  18 and `do_tpi` (test-particle insertion).
  19
  20 The legacy approach has some obvious drawbacks:
  21 * *Data management:* Each of the `do_*` functions defines local data,
  22   including complex objects encapsulating some data and functionality,
  23   but also data structures effectively used as "global variables" for
  24   communication between different parts of the simulation. Neither the
  25   ownership nor the access rights (except for `const` qualifiers) are
  26   clearly defined.
  27 * *Dependencies:* Many function calls in the `do_*` functions are
  28   dependent on each others, i.e. rely on being called in a specific
  29   order, but these dependencies are not clearly defined.
  30 * *Branches:* The flow of the `do_*` functions are hard to understand
  31   due to branching. At setup time, and then at every step of the
  32   simulation run, a number of booleans are set (e.g. `bNS` (do neighbor
  33   searching), `bCalcEner` (calculate energies), `do_ene` (write
  34   energies), `bEner` (energy calculation needed), etc). These booleans
  35   enable or disable branches of the code (for the current step or the
  36   entire run), mostly encoded as `if(...)` statements in the main `do_*`
  37   loop, but also in functions called from there.
  38 * *Task scheduling:* Poorly defined dependencies and per-step branching
  39   make task scheduling (e.g. parallel execution of independent tasks)
  40   very difficult.
  41 * *Error-prone for developers:* Poorly defined dependencies and unclear
  42   code flow make changing the simulator functions very error-prone,
  43   rendering the implementation of new methods tedious.
  44
  45 ## The modular simulator approach
  46
  47 The main design goals of the new, fully modular simulator approach
  48 include
  49 * *Extensibility:* We want to ease maintenance and the implementation
  50   of new integrator schemes.
  51 * *Monte Carlo:* We want to add MC capability, which can be mixed with
  52   MD to create hybrid MC/MD schemes.
  53 * *Data locality & interfaces:* We aim at localizing data in objects,
  54   and offer interfaces if access from other objects is needed.
  55 * *Multi-stepping:* We aim at a design which intrinsically supports
  56   multi-step integrators, e.g. having force calls at different
  57   frequencies, or avoid having branches including rare events
  58   (trajectory writing, neighbor search, ...) in the computation loops.
  59 * *Task parallelism:* Although not first priority, we want to have a
  60   design which can be extended to allow for task parallelism.
  61
  62 The general design approach is that of a **task scheduler**. *Tasks*
  63 are argument-less functions which perform a part of the computation.
  64 Periodically during the simulation, the scheduler builds a
  65 *queue of tasks*, i.e. a list of tasks which is then run through in
  66 order. Over time, with data dependencies clearly defined, this
  67 approach can be modified to have independent tasks run in parallel.
  68
  69 The approach is most easily displayed using some pseudo code:
  70
  71     class ModularSimulator : public ISimulator
  72     {
  73         public:
  74             //! Run the simulator
  75             void run() override;
  76         private:
  77             std::vector<ISignaller*> signallers_;
  78             std::vector<ISimulatorElement*> elements_;
  79             std::queue<SimulatorRunFunction*> taskQueue_;
  80     }
  81
  82     void ModularSimulator::run()
  83     {
  84         constructElementsAndSignallers();
  85         setupAllElements();
  86         while (not lastStep)
  87         {
  88             // Fill the task queue with new tasks (can be precomputed for many steps)
  89             populateTaskQueue();
  90             // Now simply loop through the queue and run one task after the next
  91             for (auto task : taskQueue)
  92             {
  93                 (*task)();  // run task
  94             }
  95         }
  96     }
  97
  98 This allows for an important division of tasks.
  99
 100 * `constructElementsAndSignallers()` is responsible to **store the
 101   elements in the right order**. This includes the different order of
 102   element in different algorithms (e.g. leap-frog vs. velocity
 103   verlet), but also logical dependencies (energy output after compute
 104   globals).
 105 * `populateTaskQueue()` is responsible to **decide if elements need to
 106   run at a specific time step**. The elements get called in order, and
 107   decide whether they need to run at a specific step. This can be
 108   pre-computed for multiple steps. In the current implementation, the
 109   tasks are pre-computed for the entire life-time of the neighbor
 110   list.
 111 * **Running the actual simulation tasks** is done after the task queue
 112   was filled.  This is achieved by simply looping over the task list,
 113   no conditionals or branching needed.
 114
 115 ### Simulator elements
 116
 117 The task scheduler holds a list of *simulator elements*, defined by
 118 the `ISimulatorElement` interface. These elements have a
 119 `scheduleTask(Step, Time)` function, which gets called by the task
 120 scheduler. This allows the simulator element to register one (or more)
 121 function pointers to be run at that specific `(Step, Time)`. From the
 122 point of view of the element, it is important to note that the
 123 computation will not be carried out immediately, but that it will be
 124 called later during the actual (partial) simulation run. From the
 125 point of view of the builder of the task scheduler, it is important to
 126 note that the order of the elements determines the order in which
 127 computation is performed. The task scheduler periodically loops over
 128 its list of elements, builds a queue of function pointers to run, and
 129 returns this list of tasks. As an example, a possible application
 130 would be to build a new queue after each domain-decomposition (DD) /
 131 neighbor-searching (NS) step, which might occur every 100 steps. The
 132 scheduler would loop repeatedly over all its elements, with elements
 133 like the trajectory-writing element registering for only one or no
 134 step at all, the energy-calculation element registering for every
 135 tenth step, and the force, position / velocity propagation, and
 136 constraining algorithms registering for every step. The result would
 137 be a (long) queue of function pointers including all computations
 138 needed until the next DD / NS step, which can be run without any
 139 branching.
 140
 141 ### Signallers
 142
 143 Some elements might require computations by other elements. If for
 144 example, the trajectory writing is an element independent from the
 145 energy-calculation element, it needs to signal to the energy element
 146 that it is about to write a trajectory, and that the energy element
 147 should be ready for that (i.e. perform an energy calculation in the
 148 upcoming step). This requirement, which replaces the boolean branching
 149 in the current implementation, is fulfilled by a Signaller - Client
 150 model. Classes implementing the `ISignaller` interface get called
 151 *before* every loop of the element list, and can inform registered
 152 clients about things happening during that step. The trajectory
 153 element, for example, can tell the energy element that it will write
 154 to trajectory at the end of this step. The energy element can then
 155 register an energy calculation during that step, being ready to write
 156 to trajectory when requested.
 157
 158 ### Sequence diagrams
 159
 160 #### Pre-loop
 161 In the loop preparation, the signallers and elements are created and
 162 stored in the right order. The signallers and elements can then
 163 perform any setup operations needed.
 164
 165 \msc
 166 hscale="2";
 167
 168 ModularSimulator,
 169 Signallers [label="ModularSimulator::\nSignallers"],
 170 Elements [label="ModularSimulator::\nElements"],
 171 TaskQueue [label="ModularSimulator::\nTaskQueue"];
 172
 173 --- [ label = "constructElementsAndSignallers()" ];
 174     ModularSimulator => Signallers [ label = "Create signallers\nand order them" ];
 175     ModularSimulator => Elements [ label = "Create elements\nand order them" ];
 176 --- [ label = "constructElementsAndSignallers()" ];
 177 |||;
 178 |||;
 179
 180 --- [ label = "setupAllElements()" ];
 181     ModularSimulator => Signallers [ label = "Call setup()" ];
 182     Signallers box Signallers [ label = "for signaler in Signallers\n    signaller->setup()" ];
 183     |||;
 184     ModularSimulator => Elements [ label = "Call setup()" ];
 185     Elements box Elements [ label = "for element in Elements\n    element->setup()" ];
 186 --- [ label = "setupAllElements()" ];
 187 \endmsc
 188
 189 #### Main loop
 190 The main loop consists of two parts which are alternately run until the
 191 simulation stop criterion is met. The first part is the population of
 192 the task queue, which determines all tasks that will have to run to
 193 simulate the system for a given time period. In the current implementation,
 194 the scheduling period is set equal to the lifetime of the neighborlist.
 195 Once the tasks have been predetermined, the simulator runs them in order.
 196 This is the actual simulation computation, which can now run without any
 197 branching.
 198
 199 \msc
 200 hscale="2";
 201
 202 ModularSimulator,
 203 Signallers [label="ModularSimulator::\nSignallers"],
 204 Elements [label="ModularSimulator::\nElements"],
 205 TaskQueue [label="ModularSimulator::\nTaskQueue"];
 206
 207 ModularSimulator box TaskQueue [ label = "loop: while(not lastStep)" ];
 208 ModularSimulator note TaskQueue [ label = "The task queue is empty. The simulation state is at step N.", textbgcolor="yellow" ];
 209 |||;
 210 |||;
 211 ModularSimulator box ModularSimulator [ label = "populateTaskQueue()" ];
 212 ModularSimulator =>> TaskQueue [ label = "Fill task queue with tasks until next neighbor-searching step" ];
 213 |||;
 214 |||;
 215 ModularSimulator note TaskQueue [ label = "The task queue now holds all tasks needed to move the simulation from step N to step N + nstlist. The simulation for these steps has not been performed yet, however. The simulation state is hence still at step N.", textbgcolor="yellow" ];
 216 |||;
 217 |||;
 218
 219 ModularSimulator => TaskQueue [ label = "Run all tasks in TaskQueue" ];
 220 TaskQueue box TaskQueue [label = "for task in TaskQueue\n    run task" ];
 221 TaskQueue note TaskQueue [ label = "All simulation computations are happening in this loop!", textbgcolor="yellow" ];
 222 |||;
 223 |||;
 224 ModularSimulator note TaskQueue [ label = "The task queue is now empty. The simulation state is at step N + nstlist.", textbgcolor="yellow" ];
 225 ModularSimulator box TaskQueue [ label = "end loop: while(not lastStep)" ];
 226
 227 \endmsc
 228
 229 #### Task scheduling
 230 A part of the main loop, the task scheduling in `populateTaskQueue()`
 231 allows the elements to push tasks to the task queue. For every scheduling
 232 step, the signallers are run first to give the elements information about
 233 the upcoming scheduling step. The scheduling routine elements are then
 234 called in order, allowing the elements to register their respective tasks.
 235
 236 \msc
 237 hscale="2";
 238
 239 ModularSimulator,
 240 Signallers [label="ModularSimulator::\nSignallers"],
 241 Elements [label="ModularSimulator::\nElements"],
 242 TaskQueue [label="ModularSimulator::\nTaskQueue"];
 243
 244 --- [ label = "populateTaskQueue()" ];
 245     ModularSimulator box ModularSimulator [ label = "doDomainDecomposition()\ndoPmeLoadBalancing()" ];
 246     ModularSimulator =>> Elements [ label = "Update state and topology" ];
 247     |||;
 248     |||;
 249
 250     ModularSimulator note ModularSimulator [ label = "schedulingStep == N\nsimulationStep == N", textbgcolor="yellow" ];
 251     ModularSimulator box TaskQueue [ label = "loop: while(not nextNeighborSearchingStep)" ];
 252         ModularSimulator => Signallers [ label = "Run signallers for schedulingStep" ];
 253         Signallers box Signallers [label = "for signaller in Signallers\n    signaller->signal(scheduleStep)" ];
 254         Signallers =>> Elements [ label = "notify" ];
 255         Signallers note Elements [ label = "The elements now know if schedulingStep has anything special happening, e.g. neighbor searching, log writing, trajectory writing, ...", textbgcolor="yellow" ];
 256         |||;
 257         |||;
 258
 259         ModularSimulator => Elements [ label = "Schedule run functions for schedulingStep" ];
 260         Elements box Elements [label = "for element in Elements\n    element->scheduleTask(scheduleStep)" ];
 261         Elements =>> TaskQueue [ label = "Push task" ];
 262         Elements note TaskQueue [ label = "The elements have now registered everything they will need to do for schedulingStep.", textbgcolor="yellow" ];
 263         ModularSimulator note ModularSimulator [ label = "schedulingStep++", textbgcolor="yellow" ];
 264
 265     ModularSimulator box TaskQueue [ label = "end loop: while(not nextNeighborSearchingStep)" ];
 266 --- [ label = "populateTaskQueue()" ];
 267 ModularSimulator note ModularSimulator [ label = "schedulingStep == N + nstlist\nsimulationStep == N", textbgcolor="yellow" ];
 268
 269 \endmsc
 270
 271 ## Acceptance tests and further plans
 272
 273 Acceptance tests before this can be made default code path (as
 274 defined with Mark Jan 2019)
 275 * End-to-end tests pass on both `do_md` and the new loop in
 276   Jenkins pre- and post-submit matrices
 277 * Physical validation cases pass on the new loop
 278 * Performance on different sized benchmark cases, x86 CPU-only
 279   and NVIDIA GPU are at most 1% slower -
 280   https://github.com/ptmerz/gmxbenchmark has been developed to
 281   this purpose.
 282
 283 After the NVE MD bare minimum, we will want to add support for
 284 * Thermo- / barostats
 285 * FEP
 286 * Pulling
 287 * Checkpointing
 288
 289 Using the new modular simulator framework, we will then explore
 290 adding new functionality to GROMACS, including
 291 * Monte Carlo barostat
 292 * hybrid MC/MD schemes
 293 * multiple-time-stepping integration
 294
 295 We sill also explore optimization opportunities, including
 296 * re-use of the same queue if conditions created by user input are
 297   sufficiently favorable (e.g. by design or when observed)
 298 * simultaneous execution of independent tasks
 299
 300 We will probably not prioritize support for (and might consider
 301 deprecating from do_md for GROMACS 2020)
 302 * Simulated annealing
 303 * REMD
 304 * Simulated tempering
 305 * Multi-sim
 306 * Membrane embedding
 307 * QM/MM
 308 * FEP lambda vectors
 309 * Fancy mdp options for FEP output
 310 * MTTK
 311 * Shell particles
 312 * Essential dynamics
 313 * Enforced rotation
 314 * Constant acceleration groups
 315 * Ensemble-averaged restraints
 316 * Time-averaged restraints
 317 * Freeze, deform, cos-acceleration
 318
 319 ## Signallers and elements
 320
 321 The current implementation of the modular simulator consists of
 322 the following signallers and elements:
 323
 324 ### Signallers
 325
 326 All signallers have a list of pointers to clients, objects that
 327 implement a respective interface and get notified of events the
 328 signaller is communicating.
 329
 330 * `NeighborSearchSignaller`: Informs its clients whether the
 331   current step is a neighbor-searching step.
 332 * `LastStepSignaller`: Informs its clients when the current step
 333   is the last step of the simulation.
 334 * `LoggingSignaller`: Informs its clients whether output to the
 335   log file is written in the current step.
 336 * `EnergySignaller`: Informs its clients about energy related
 337   special steps, namely energy calculation steps, virial
 338   calculation steps, and free energy calculation steps.
 339 * `TrajectoryElement`: Informs its clients if writing to
 340   trajectory (state [x/v/f] and/or energy) is planned for the
 341   current step. Note that the `TrajectoryElement` is not a
 342   pure signaller, but also implements the `ISimulatorElement`
 343   interface (see section "Simulator Elements" below).
 344
 345 ### Simulator Elements
 346
 347 #### `TrajectoryElement`
 348 The `TrajectoryElement` is a special element, as it
 349 is both implementing the `ISimulatorElement` and the `ISignaller`
 350 interfaces. During the signaller phase, it is signalling its
 351 _signaller clients_ that the trajectory will be written at the
 352 end of the current step. During the simulator run phase, it is
 353 calling its _trajectory clients_ (which do not necessarily need
 354 to be identical with the signaller clients), passing them a valid
 355 output pointer and letting them write to trajectory. Unlike the
 356 legacy implementation, the trajectory element itself knows nothing
 357 about the data that is written to file - it is only responsible
 358 to inform clients about trajectory steps, and providing a valid
 359 file pointer to the objects that need to write to trajectory.
 360
 361 #### `StatePropagatorData`
 362 The `StatePropagatorData` takes part in the simulator run, as it might
 363 have to save a valid state at the right moment during the
 364 integration. Placing the StatePropagatorData correctly is for now the
 365 duty of the simulator builder - this might be automated later
 366 if we have enough meta-data of the variables (i.e., if
 367 `StatePropagatorData` knows at which time the variables currently are,
 368 and can decide when a valid state (full-time step of all
 369 variables) is reached. The `StatePropagatorData` is also a client of
 370 both the trajectory signaller and writer - it will save a
 371 state for later writeout during the simulator step if it
 372 knows that trajectory writing will occur later in the step,
 373 and it knows how to write to file given a file pointer by
 374 the `TrajectoryElement`.
 375
 376 #### `EnergyElement`
 377 The `EnergyElement` takes part in the simulator run, as it
 378 does either add data (at energy calculation steps), or
 379 record a non-calculation step (all other steps). It is the
 380 responsibility of the simulator builder to ensure that the
 381 `EnergyElement` is called at a point of the simulator run
 382 at which it has access to a valid energy state.
 383
 384
 385 ## Data structures
 386
 387 ### `StatePropagatorData`
 388 The `StatePropagatorData` contains a little more than the pure
 389 statistical-physical micro state, namely the positions,
 390 velocities, forces, and box matrix, as well as a backup of
 391 the positions and box of the last time step. While it takes
 392 part in the simulator loop to be able to backup positions /
 393 boxes and save the current state if needed, it's main purpose
 394 is to offer access to its data via getter methods. All elements
 395 reading or writing to this data need a pointer to the
 396 `StatePropagatorData` and need to request their data explicitly. This
 397 will later simplify the understanding of data dependencies
 398 between elements.
 399
 400 Note that the `StatePropagatorData` can be converted to and from the
 401 legacy `t_state` object. This is useful when dealing with
 402 functionality which has not yet been adapted to use the new
 403 data approach - of the elements currently implemented, only
 404 domain decomposition, PME load balancing, and the initial
 405 constraining are using this.
 406
 407 ### `EnergyElement`
 408 The EnergyElement owns the EnergyObject, and is hence responsible
 409 for saving energy data and writing it to trajectory. It also owns
 410 the tensors for the different virials and the pressure as well as
 411 the total dipole vector.
 412
 413 It subscribes to the trajectory signaller, the energy signaller,
 414 and the logging signaller to know when an energy calculation is
 415 needed and when a non-recording step is enough. The simulator
 416 builder is responsible to place the element in a location at
 417 which a valid energy state is available. The EnergyElement is
 418 also a subscriber to the trajectory writer element, as it is
 419 responsible to write energy data to trajectory.
 420
 421 The EnergyElement offers an interface to add virial contributions,
 422 but also allows access to the raw pointers to tensor data, the
 423 dipole vector, and the legacy energy data structures.
 424
 425 ### `TopologyHolder`
 426 The topology object owns the local topology and holds a constant reference
 427 to the global topology owned by the ISimulator.
 428
 429 The local topology is only infrequently changed if domain decomposition is
 430 on, and never otherwise. The topology holder therefore offers elements to register
 431 as ITopologyHolderClients. If they do so, they get a handle to the updated local
 432 topology whenever it is changed, and can rely that their handle is valid
 433 until the next update. The domain decomposition element is defined as friend
 434 class to be able to update the local topology when needed.
 435
 436 The topology holder is not a `ISimulatorElement`, i.e. it does not take part in the
 437 simulator loop.