2 % This file is part of the GROMACS molecular simulation package.
4 % Copyright (c) 2013,2014,2015,2016,2017, by the GROMACS development team, led by
5 % Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
6 % and including many others, as listed in the AUTHORS file in the
7 % top-level source directory and at http://www.gromacs.org.
9 % GROMACS is free software; you can redistribute it and/or
10 % modify it under the terms of the GNU Lesser General Public License
11 % as published by the Free Software Foundation; either version 2.1
12 % of the License, or (at your option) any later version.
14 % GROMACS is distributed in the hope that it will be useful,
15 % but WITHOUT ANY WARRANTY; without even the implied warranty of
16 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
17 % Lesser General Public License for more details.
19 % You should have received a copy of the GNU Lesser General Public
20 % License along with GROMACS; if not, see
21 % http://www.gnu.org/licenses, or write to the Free Software Foundation,
22 % Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
24 % If you want to redistribute modifications to GROMACS, please
25 % consider that scientific software is very special. Version
26 % control is crucial - bugs must be traceable. We will be happy to
27 % consider code for inclusion in the official distribution, but
28 % derived work must not be called official GROMACS. Details are found
29 % in the README & COPYING files - if they are missing, get the
30 % official version at http://www.gromacs.org.
32 % To help us fund GROMACS development, we humbly ask that you cite
33 % the research papers on the package. Check out http://www.gromacs.org.
35 \newcommand{\nproc}{\mbox{$M$
}}
36 \newcommand{\natom}{\mbox{$N$
}}
37 \newcommand{\nx}{\mbox{$n_x$
}}
38 \newcommand{\ny}{\mbox{$n_y$
}}
39 \newcommand{\nz}{\mbox{$n_z$
}}
40 \newcommand{\nsgrid}{NS grid
}
41 \newcommand{\fftgrid}{FFT grid
}
42 \newcommand{\dgrid}{\mbox{$
\delta_{grid
}$
}}
43 \newcommand{\bfv}[1]{{\mbox{\boldmath{$
#1$
}}}}
44 % non-italicized boldface for math (e.g. matrices)
45 \newcommand{\bfm}[1]{{\bf #1}}
46 \newcommand{\dt}{\Delta t
}
47 \newcommand{\rv}{\bfv{r
}}
48 \newcommand{\vv}{\bfv{v
}}
49 \newcommand{\F}{\bfv{F
}}
50 \newcommand{\pb}{\bfv{p
}}
51 \newcommand{\veps}{v_
{\epsilon}}
52 \newcommand{\peps}{p_
{\epsilon}}
53 \newcommand{\sinhx}[1]{\frac{\sinh{\left(
#1\right)
}}{#1}}
56 \section{Introduction
}
57 In this chapter we first give describe some general concepts used in
58 {\gromacs}:
{\em periodic boundary conditions
} (
\secref{pbc
})
59 and the
{\em group concept
} (
\secref{groupconcept
}). The MD algorithm is
60 described in
\secref{MD
}: first a global form of the algorithm is
61 given, which is refined in subsequent subsections. The (simple) EM
62 (Energy Minimization) algorithm is described in
\secref{EM
}. Some
63 other algorithms for special purpose dynamics are described after
66 %\ifthenelse{\equal{\gmxlite}{1}}{}{
67 %In the final \secref{par} of this chapter a few principles are
68 %given on which parallelization of {\gromacs} is based. The
69 %parallelization is hardly visible for the user and is therefore not
71 %} % Brace matches ifthenelse test for gmxlite
73 A few issues are of general interest. In all cases the
{\em system
}
74 must be defined, consisting of molecules. Molecules again consist of
75 particles with defined interaction functions. The detailed
76 description of the
{\em topology
} of the molecules and of the
{\em force
77 field
} and the calculation of forces is given in
78 \chref{ff
}. In the present chapter we describe
79 other aspects of the algorithm, such as pair list generation, update of
80 velocities and positions, coupling to external temperature and
81 pressure, conservation of constraints.
82 \ifthenelse{\equal{\gmxlite}{1}}{}{
83 The
{\em analysis
} of the data generated by an MD simulation is treated in
\chref{analysis
}.
84 } % Brace matches ifthenelse test for gmxlite
86 \section{Periodic boundary conditions
\index{periodic boundary conditions
}}
89 \centerline{\includegraphics[width=
9cm
]{plots/pbctric
}}
90 \caption {Periodic boundary conditions in two dimensions.
}
93 The classical way to minimize edge effects in a finite system is to
94 apply
{\em periodic boundary conditions
}. The atoms of the system to
95 be simulated are put into a space-filling box, which is surrounded by
96 translated copies of itself (
\figref{pbc
}). Thus there are no
97 boundaries of the system; the artifact caused by unwanted boundaries
98 in an isolated cluster is now replaced by the artifact of periodic
99 conditions. If the system is crystalline, such boundary conditions are
100 desired (although motions are naturally restricted to periodic motions
101 with wavelengths fitting into the box). If one wishes to simulate
102 non-periodic systems, such as liquids or solutions, the periodicity by
103 itself causes errors. The errors can be evaluated by comparing various
104 system sizes; they are expected to be less severe than the errors
105 resulting from an unnatural boundary with vacuum.
107 There are several possible shapes for space-filling unit cells. Some,
108 like the
{\em \normindex{rhombic dodecahedron
}} and the
109 {\em \normindex{truncated octahedron
}}~
\cite{Adams79
} are closer to being a sphere
110 than a cube is, and are therefore better suited to the
111 study of an approximately spherical macromolecule in solution, since
112 fewer solvent molecules are required to fill the box given a minimum
113 distance between macromolecular images. At the same time, rhombic
114 dodecahedra and truncated octahedra are special cases of
{\em triclinic
}
115 unit cells
\index{triclinic unit cell
}; the most general space-filling unit cells
116 that comprise all possible space-filling shapes~
\cite{Bekker95
}.
117 For this reason,
{\gromacs} is based on the triclinic unit cell.
119 {\gromacs} uses periodic boundary conditions, combined with the
{\em
120 \normindex{minimum image convention
}}: only one -- the nearest -- image of each
121 particle is considered for short-range non-bonded interaction terms.
122 For long-range electrostatic interactions this is not always accurate
123 enough, and
{\gromacs} therefore also incorporates lattice sum methods
124 such as Ewald Sum, PME and PPPM.
126 {\gromacs} supports triclinic boxes of any shape.
127 The simulation box (unit cell) is defined by the
3 box vectors
128 $
{\bf a
}$,$
{\bf b
}$ and $
{\bf c
}$.
129 The box vectors must satisfy the following conditions:
135 \label{eqn:box_shift1
}
136 a_x>
0,~~~~b_y>
0,~~~~c_z>
0
139 \label{eqn:box_shift2
}
140 |b_x|
\leq \frac{1}{2} \, a_x,~~~~
141 |c_x|
\leq \frac{1}{2} \, a_x,~~~~
142 |c_y|
\leq \frac{1}{2} \, b_y
144 Equations
\ref{eqn:box_rot
} can always be satisfied by rotating the box.
145 Inequalities (
\ref{eqn:box_shift1
}) and (
\ref{eqn:box_shift2
}) can always be
146 satisfied by adding and subtracting box vectors.
148 Even when simulating using a triclinic box,
{\gromacs} always keeps the
149 particles in a brick-shaped volume for efficiency,
150 as illustrated in
\figref{pbc
} for a
2-dimensional system.
151 Therefore, from the output trajectory it might seem that the simulation was
152 done in a rectangular box. The program
{\tt trjconv
} can be used to convert
153 the trajectory to a different unit-cell representation.
155 It is also possible to simulate without periodic boundary conditions,
156 but it is usually more efficient to simulate an isolated cluster of molecules
157 in a large periodic box, since fast grid searching can only be used
158 in a periodic system.
162 \includegraphics[width=
5cm
]{plots/rhododec
}
163 ~~~~
\includegraphics[width=
5cm
]{plots/truncoct
}
165 \caption {A rhombic dodecahedron and truncated octahedron
166 (arbitrary orientations).
}
167 \label{fig:boxshapes
}
170 \subsection{Some useful box types
}
173 \begin{tabular
}{|c|c|c|ccc|ccc|
}
175 box type & image & box &
\multicolumn{3}{c|
}{box vectors
} &
\multicolumn{3}{c|
}{box vector angles
} \\
176 & distance & volume & ~
{\bf a
}~ &
{\bf b
} &
{\bf c
} &
177 $
\angle{\bf bc
}$ & $
\angle{\bf ac
}$ & $
\angle{\bf ab
}$ \\
179 & & & $d$ &
0 &
0 & & & \\
180 cubic & $d$ & $d^
3$ &
0 & $d$ &
0 & $
90^
\circ$ & $
90^
\circ$ & $
90^
\circ$ \\
181 & & &
0 &
0 & $d$ & & & \\
183 rhombic & & & $d$ &
0 & $
\frac{1}{2}\,d$ & & & \\
184 dodecahedron & $d$ & $
\frac{1}{2}\sqrt{2}\,d^
3$ &
0 & $d$ & $
\frac{1}{2}\,d$ & $
60^
\circ$ & $
60^
\circ$ & $
90^
\circ$ \\
185 (xy-square) & & $
0.707\,d^
3$ &
0 &
0 & $
\frac{1}{2}\sqrt{2}\,d$ & & & \\
187 rhombic & & & $d$ & $
\frac{1}{2}\,d$ & $
\frac{1}{2}\,d$ & & & \\
188 dodecahedron & $d$ & $
\frac{1}{2}\sqrt{2}\,d^
3$ &
0 & $
\frac{1}{2}\sqrt{3}\,d$ & $
\frac{1}{6}\sqrt{3}\,d$ & $
60^
\circ$ & $
60^
\circ$ & $
60^
\circ$ \\
189 (xy-hexagon) & & $
0.707\,d^
3$ &
0 &
0 & $
\frac{1}{3}\sqrt{6}\,d$ & & & \\
191 truncated & & & $d$ & $
\frac{1}{3}\,d$ & $-
\frac{1}{3}\,d$ & & &\\
192 octahedron & $d$ & $
\frac{4}{9}\sqrt{3}\,d^
3$ &
0 & $
\frac{2}{3}\sqrt{2}\,d$ & $
\frac{1}{3}\sqrt{2}\,d$ & $
71.53^
\circ$ & $
109.47^
\circ$ & $
71.53^
\circ$ \\
193 & & $
0.770\,d^
3$ &
0 &
0 & $
\frac{1}{3}\sqrt{6}\,d$ & & & \\
197 \caption{The cubic box, the rhombic
\normindex{dodecahedron
} and the truncated
198 \normindex{octahedron
}.
}
201 The three most useful box types for simulations of solvated systems
202 are described in
\tabref{boxtypes
}. The rhombic dodecahedron
203 (
\figref{boxshapes
}) is the smallest and most regular space-filling
204 unit cell. Each of the
12 image cells is at the same distance. The
205 volume is
71\% of the volume of a cube having the same image
206 distance. This saves about
29\% of CPU-time when simulating a
207 spherical or flexible molecule in solvent. There are two different
208 orientations of a rhombic dodecahedron that satisfy equations
209 \ref{eqn:box_rot
},
\ref{eqn:box_shift1
} and
\ref{eqn:box_shift2
}.
210 The program
{\tt editconf
} produces the orientation
211 which has a square intersection with the xy-plane. This orientation
212 was chosen because the first two box vectors coincide with the x and
213 y-axis, which is easier to comprehend. The other orientation can be
214 useful for simulations of membrane proteins. In this case the
215 cross-section with the xy-plane is a hexagon, which has an area which
216 is
14\% smaller than the area of a square with the same image
217 distance. The height of the box ($c_z$) should be changed to obtain
218 an optimal spacing. This box shape not only saves CPU time, it
219 also results in a more uniform arrangement of the proteins.
221 \subsection{Cut-off restrictions
}
222 The
\normindex{minimum image convention
} implies that the cut-off radius used to
223 truncate non-bonded interactions may not exceed half the shortest box
226 \label{eqn:physicalrc
}
227 R_c <
\half \min(\|
{\bf a
}\|,\|
{\bf b
}\|,\|
{\bf c
}\|),
229 because otherwise more than one image would be within the cut-off distance
230 of the force. When a macromolecule, such as a protein, is studied in
231 solution, this restriction alone is not sufficient: in principle, a single
232 solvent molecule should not be able
233 to `see' both sides of the macromolecule. This means that the length of
234 each box vector must exceed the length of the macromolecule in the
235 direction of that edge
{\em plus
} two times the cut-off radius $R_c$.
236 It is, however, common to compromise in this respect, and make the solvent
237 layer somewhat smaller in order to reduce the computational cost.
238 For efficiency reasons the cut-off with triclinic boxes is more restricted.
239 For grid search the extra restriction is weak:
242 R_c <
\min(a_x,b_y,c_z)
244 For simple search the extra restriction is stronger:
247 R_c <
\half \min(a_x,b_y,c_z)
250 Each unit cell (cubic, rectangular or triclinic)
251 is surrounded by
26 translated images. A
252 particular image can therefore always be identified by an index pointing to one
253 of
27 {\em translation vectors
} and constructed by applying a
254 translation with the indexed vector (see
\ssecref{forces
}).
255 Restriction (
\ref{eqn:gridrc
}) ensures that only
26 images need to be
258 %\ifthenelse{\equal{\gmxlite}{1}}{}{
259 \section{The group concept
}
260 \label{sec:groupconcept
}\index{group
}
261 The
{\gromacs} MD and analysis programs use user-defined
{\em groups
} of
262 atoms to perform certain actions on. The maximum number of groups is
263 256, but each atom can only belong to six different groups, one
264 each of the following:
266 \item[\swapindex{temperature-coupling
}{group
}]
267 The
\normindex{temperature coupling
} parameters (reference
268 temperature, time constant, number of degrees of freedom, see
269 \ssecref{update
}) can be defined for each T-coupling group
270 separately. For example, in a solvated macromolecule the solvent (that
271 tends to generate more heating by force and integration errors) can be
272 coupled with a shorter time constant to a bath than is a macromolecule,
273 or a surface can be kept cooler than an adsorbing molecule. Many
274 different T-coupling groups may be defined. See also center of mass
277 \item[\swapindex{freeze
}{group
}\index{frozen atoms
}]
278 Atoms that belong to a freeze group are kept stationary in the
279 dynamics. This is useful during equilibration,
{\eg} to avoid badly
280 placed solvent molecules giving unreasonable kicks to protein atoms,
281 although the same effect can also be obtained by putting a restraining
282 potential on the atoms that must be protected. The freeze option can
283 be used, if desired, on just one or two coordinates of an atom,
284 thereby freezing the atoms in a plane or on a line. When an atom is
285 partially frozen, constraints will still be able to move it, even in a
286 frozen direction. A fully frozen atom can not be moved by constraints.
287 Many freeze groups can be defined. Frozen coordinates are unaffected
288 by pressure scaling; in some cases this can produce unwanted results,
289 particularly when constraints are also used (in this case you will
290 get very large pressures). Accordingly, it is recommended to avoid
291 combining freeze groups with constraints and pressure coupling. For the
292 sake of equilibration it could suffice to start with freezing in a
293 constant volume simulation, and afterward use position restraints in
294 conjunction with constant pressure.
296 \item[\swapindex{accelerate
}{group
}]
297 On each atom in an ``accelerate group'' an acceleration
298 $
\ve{a
}^g$ is imposed. This is equivalent to an external
299 force. This feature makes it possible to drive the system into a
300 non-equilibrium state and enables the performance of
301 \swapindex{non-equilibrium
}{MD
} and hence to obtain transport properties.
303 \item[\swapindex{energy-monitor
}{group
}]
304 Mutual interactions between all energy-monitor groups are compiled
305 during the simulation. This is done separately for Lennard-Jones and
306 Coulomb terms. In principle up to
256 groups could be defined, but
307 that would lead to
256$
\times$
256 items! Better use this concept
310 All non-bonded interactions between pairs of energy-monitor groups can
311 be excluded
\index{exclusions
}
312 \ifthenelse{\equal{\gmxlite}{1}}
314 {(see details in the User Guide).
}
315 Pairs of particles from excluded pairs of energy-monitor groups
316 are not put into the pair list.
317 This can result in a significant speedup
318 for simulations where interactions within or between parts of the system
321 \item[\swapindex{center of mass
}{group
}\index{removing COM motion
}]
322 In
\gromacs\ the center of mass (COM) motion can be removed, for
323 either the complete system or for groups of atoms. The latter is
324 useful,
{\eg} for systems where there is limited friction (
{\eg} gas
325 systems) to prevent center of mass motion to occur. It makes sense to
326 use the same groups for temperature coupling and center of mass motion
329 \item[\swapindex{Compressed position output
}{group
}]
331 In order to further reduce the size of the compressed trajectory file
332 (
{\tt .xtc
{\index{XTC
}}} or
{\tt .tng
{\index{TNG
}}}), it is possible
333 to store only a subset of all particles. All x-compression groups that
334 are specified are saved, the rest are not. If no such groups are
335 specified, than all atoms are saved to the compressed trajectory file.
338 The use of groups in
{\gromacs} tools is described in
339 \secref{usinggroups
}.
340 %} % Brace matches ifthenelse test for gmxlite
342 \section{Molecular Dynamics
}
346 \addtolength{\fboxsep}{0.5cm
}
347 \begin{shadowenv
}[12cm
]
348 {\large \bf THE GLOBAL MD ALGORITHM
}
349 \rule{\textwidth}{2pt
} \\
350 {\bf 1. Input initial conditions
}\\
[2ex
]
351 Potential interaction $V$ as a function of atom positions\\
352 Positions $
\ve{r
}$ of all atoms in the system\\
353 Velocities $
\ve{v
}$ of all atoms in the system \\
355 \rule{\textwidth}{1pt
}\\
356 {\bf repeat
2,
3,
4} for the required number of steps:\\
357 \rule{\textwidth}{1pt
}\\
358 {\bf 2. Compute forces
} \\
[1ex
]
359 The force on any atom \\
[1ex
]
360 $
\ve{F
}_i = -
\displaystyle\frac{\partial V
}{\partial \ve{r
}_i
}$ \\
[1ex
]
361 is computed by calculating the force between non-bonded atom pairs: \\
362 $
\ve{F
}_i =
\sum_j \ve{F
}_
{ij
}$ \\
363 plus the forces due to bonded interactions (which may depend on
1,
2,
364 3, or
4 atoms), plus restraining and/or external forces. \\
365 The potential and kinetic energies and the pressure tensor may be computed. \\
367 {\bf 3. Update configuration
} \\
[1ex
]
368 The movement of the atoms is simulated by numerically solving Newton's
369 equations of motion \\
[1ex
]
371 \frac {\de^
2\ve{r
}_i
}{\de t^
2} =
\frac{\ve{F
}_i
}{m_i
} $ \\
374 \frac{\de\ve{r
}_i
}{\de t
} =
\ve{v
}_i ; \;\;
375 \frac{\de\ve{v
}_i
}{\de t
} =
\frac{\ve{F
}_i
}{m_i
} $ \\
[1ex
]
377 {\bf 4.
} if required:
{\bf Output step
} \\
378 write positions, velocities, energies, temperature, pressure, etc. \\
380 \caption{The global MD algorithm
}
384 A global flow scheme for MD is given in
\figref{global
}. Each
385 MD or EM run requires as input a set of initial coordinates and --
386 optionally -- initial velocities of all particles involved. This
387 chapter does not describe how these are obtained; for the setup of an
388 actual MD run check the online manual at
{\wwwpage}.
390 \subsection{Initial conditions
}
391 \subsubsection{Topology and force field
}
392 The system topology, including a description of the force field, must
394 \ifthenelse{\equal{\gmxlite}{1}}
396 {Force fields and topologies are described in
\chref{ff
}
397 and
\ref{ch:top
}, respectively.
}
398 All this information is static; it is never modified during the run.
400 \subsubsection{Coordinates and velocities
}
402 \centerline{\includegraphics[width=
8cm
]{plots/maxwell
}}
403 \caption{A Maxwell-Boltzmann velocity distribution, generated from
408 Then, before a run starts, the box size and the coordinates and
409 velocities of all particles are required. The box size and shape is
410 determined by three vectors (nine numbers) $
\ve{b
}_1,
\ve{b
}_2,
\ve{b
}_3$,
411 which represent the three basis vectors of the periodic box.
413 If the run starts at $t=t_0$, the coordinates at $t=t_0$ must be
414 known. The
{\em leap-frog algorithm
}, the default algorithm used to
415 update the time step with $
\Dt$ (see
\ssecref{update
}), also requires
416 that the velocities at $t=t_0 -
\hDt$ are known. If velocities are not
417 available, the program can generate initial atomic velocities
418 $v_i, i=
1\ldots 3N$ with a
\index{Maxwell-Boltzmann distribution
}
419 (
\figref{maxwell
}) at a given absolute temperature $T$:
421 p(v_i) =
\sqrt{\frac{m_i
}{2 \pi kT
}}\exp\left(-
\frac{m_i v_i^
2}{2kT
}\right)
423 where $k$ is Boltzmann's constant (see
\chref{defunits
}).
424 To accomplish this, normally distributed random numbers are generated
425 by adding twelve random numbers $R_k$ in the range $
0 \le R_k <
1$ and
426 subtracting
6.0 from their sum. The result is then multiplied by the
427 standard deviation of the velocity distribution $
\sqrt{kT/m_i
}$. Since
428 the resulting total energy will not correspond exactly to the required
429 temperature $T$, a correction is made: first the center-of-mass motion
430 is removed and then all velocities are scaled so that the total
431 energy corresponds exactly to $T$ (see
\eqnref{E-T
}).
432 % Why so complicated? What's wrong with Box-Mueller transforms?
434 \subsubsection{Center-of-mass motion
\index{removing COM motion
}}
435 The
\swapindex{center-of-mass
}{velocity
} is normally set to zero at
436 every step; there is (usually) no net external force acting on the
437 system and the center-of-mass velocity should remain constant. In
438 practice, however, the update algorithm introduces a very slow change in
439 the center-of-mass velocity, and therefore in the total kinetic energy of
440 the system -- especially when temperature coupling is used. If such
441 changes are not quenched, an appreciable center-of-mass motion
442 can develop in long runs, and the temperature will be
443 significantly misinterpreted. Something similar may happen due to overall
444 rotational motion, but only when an isolated cluster is simulated. In
445 periodic systems with filled boxes, the overall rotational motion is
446 coupled to other degrees of freedom and does not cause such problems.
449 \subsection{Neighbor searching
\swapindexquiet{neighbor
}{searching
}}
451 As mentioned in
\chref{ff
}, internal forces are
452 either generated from fixed (static) lists, or from dynamic lists.
453 The latter consist of non-bonded interactions between any pair of particles.
454 When calculating the non-bonded forces, it is convenient to have all
455 particles in a rectangular box.
456 As shown in
\figref{pbc
}, it is possible to transform a
457 triclinic box into a rectangular box.
458 The output coordinates are always in a rectangular box, even when a
459 dodecahedron or triclinic box was used for the simulation.
460 Equation
\ref{eqn:box_rot
} ensures that we can reset particles
461 in a rectangular box by first shifting them with
462 box vector $
{\bf c
}$, then with $
{\bf b
}$ and finally with $
{\bf a
}$.
463 Equations
\ref{eqn:box_shift2
},
\ref{eqn:physicalrc
} and
\ref{eqn:gridrc
}
464 ensure that we can find the
14 nearest triclinic images within
465 a linear combination that does not involve multiples of box vectors.
467 \subsubsection{Pair lists generation
}
468 The non-bonded pair forces need to be calculated only for those pairs
469 $i,j$ for which the distance $r_
{ij
}$ between $i$ and the
470 \swapindex{nearest
}{image
}
471 of $j$ is less than a given cut-off radius $R_c$. Some of the particle
472 pairs that fulfill this criterion are excluded, when their interaction
473 is already fully accounted for by bonded interactions.
{\gromacs}
474 employs a
{\em pair list
} that contains those particle pairs for which
475 non-bonded forces must be calculated. The pair list contains particles
476 $i$, a displacement vector for particle $i$, and all particles $j$ that
477 are within
\verb'rlist' of this particular image of particle $i$. The
478 list is updated every
\verb'nstlist' steps.
480 To make the
\normindex{neighbor list
}, all particles that are close
481 (
{\ie} within the neighbor list cut-off) to a given particle must be found.
482 This searching, usually called neighbor search (NS) or pair search,
483 involves periodic boundary conditions and determining the
{\em image
}
484 (see
\secref{pbc
}). The search algorithm is $O(N)$, although a simpler
485 $O(N^
2)$ algorithm is still available under some conditions.
487 \subsubsection{\normindex{Cut-off schemes
}: group versus Verlet
}
488 From version
4.6,
{\gromacs} supports two different cut-off scheme
489 setups: the original one based on particle groups and one using a Verlet
490 buffer. There are some important differences that affect results,
491 performance and feature support. The group scheme can be made to work
492 (almost) like the Verlet scheme, but this will lead to a decrease in
493 performance. The group scheme is especially fast for water molecules,
494 which are abundant in many simulations, but on the most recent x86
495 processors, this advantage is negated by the better instruction-level
496 parallelism available in the Verlet-scheme implementation. The group
497 scheme is deprecated in version
5.0, and will be removed in a future
498 version. For practical details of choosing and setting up
499 cut-off schemes, please see the User Guide.
501 In the group scheme, a neighbor list is generated consisting of pairs
502 of groups of at least one particle. These groups were originally
503 \swapindex{charge
}{group
}s
\ifthenelse{\equal{\gmxlite}{1}}{}{(see
504 \secref{chargegroup
})
}, but with a proper treatment of long-range
505 electrostatics, performance in unbuffered simulations is their only advantage. A pair of groups
506 is put into the neighbor list when their center of geometry is within
507 the cut-off distance. Interactions between all particle pairs (one from
508 each charge group) are calculated for a certain number of MD steps,
509 until the neighbor list is updated. This setup is efficient, as the
510 neighbor search only checks distance between charge-group pair, not
511 particle pairs (saves a factor of $
3 \times 3 =
9$ with a three-particle water
512 model) and the non-bonded force kernels can be optimized for, say, a
513 water molecule ``group''. Without explicit buffering, this setup leads
514 to energy drift as some particle pairs which are within the cut-off don't
515 interact and some outside the cut-off do interact. This can be caused
518 \item particles moving across the cut-off between neighbor search steps, and/or
519 \item for charge groups consisting of more than one particle, particle pairs
520 moving in/out of the cut-off when their charge group center of
521 geometry distance is outside/inside of the cut-off.
523 Explicitly adding a buffer to the neighbor list will remove such
524 artifacts, but this comes at a high computational cost. How severe the
525 artifacts are depends on the system, the properties in which you are
526 interested, and the cut-off setup.
528 The Verlet cut-off scheme uses a buffered pair list by default. It
529 also uses clusters of particles, but these are not static as in the group
530 scheme. Rather, the clusters are defined spatially and consist of
4 or
531 8 particles, which is convenient for stream computing, using e.g. SSE, AVX
532 or CUDA on GPUs. At neighbor search steps, a pair list is created
533 with a Verlet buffer, ie. the pair-list cut-off is larger than the
534 interaction cut-off. In the non-bonded kernels, interactions are only
535 computed when a particle pair is within the cut-off distance at that
536 particular time step. This ensures that as particles move between pair
537 search steps, forces between nearly all particles within the cut-off
538 distance are calculated. We say
{\em nearly
} all particles, because
539 {\gromacs} uses a fixed pair list update frequency for
540 efficiency. A particle-pair, whose distance was outside the cut-off,
541 could possibly move enough during this fixed number of
542 steps that its distance is now within the cut-off. This
543 small chance results in a small energy drift, and the size of the
544 chance depends on the temperature. When temperature
545 coupling is used, the buffer size can be determined automatically,
546 given a certain tolerance on the energy drift.
548 The Verlet cut-off scheme is implemented in a very efficient fashion
549 based on clusters of particles. The simplest example is a cluster size
550 of
4 particles. The pair list is then constructed based on cluster
551 pairs. The cluster-pair search is much faster searching based on
552 particle pairs, because $
4 \times 4 =
16$ particle pairs are put in
553 the list at once. The non-bonded force calculation kernel can then
554 calculate many particle-pair interactions at once, which maps nicely
555 to SIMD or SIMT units on modern hardware, which can perform multiple
556 floating operations at once. These non-bonded kernels
557 are much faster than the kernels used in the group scheme for most
558 types of systems, particularly on newer hardware.
560 Additionally, when the list buffer is determined automatically as
561 described below, we also apply dynamic pair list pruning. The pair list
562 can be constructed infrequently, but that can lead to a lot of pairs
563 in the list that are outside the cut-off range for all or most of
564 the life time of this pair list. Such pairs can be pruned out by
565 applying a cluster-pair kernel that only determines which clusters
566 are in range. Because of the way the non-bonded data is regularized
567 in
{\gromacs}, this kernel is an order of magnitude faster than
568 the search and the interaction kernel. On the GPU this pruning is
569 overlapped with the integration on the CPU, so it is free in most
570 cases. Therefore we can prune every
4-
10 integration steps with
571 little overhead and significantly reduce the number of cluster pairs
572 in the interaction kernel. This procedure is applied automatically,
573 unless the user set the pair-list buffer size manually.
575 \ifthenelse{\equal{\gmxlite}{1}}{}{
576 \subsubsection{Energy drift and pair-list buffering
}
577 For a canonical (NVT) ensemble, the average energy error caused by
578 diffusion of $j$ particles from outside the pair-list cut-off
579 $r_
\ell$ to inside the interaction cut-off $r_c$ over the lifetime
580 of the list can be determined from the atomic
581 displacements and the shape of the potential at the cut-off.
582 %Since we are interested in the small drift regime, we will assume
583 %#that atoms will only move within the cut-off distance in the last step,
584 %$n_\mathrm{ps}-1$, of the pair list update interval $n_\mathrm{ps}$.
585 %Over this number of steps the displacment of an atom with mass $m$
586 The displacement distribution along one dimension for a freely moving
587 particle with mass $m$ over time $t$ at temperature $T$ is
589 of zero mean and variance $
\sigma^
2 = t^
2 k_B T/m$. For the distance
590 between two particles, the variance changes to $
\sigma^
2 =
\sigma_{12}^
2 =
591 t^
2 k_B T(
1/m_1+
1/m_2)$. Note that in practice particles usually
592 interact with (bump into) other particles over time $t$ and therefore the real
593 displacement distribution is much narrower. Given a non-bonded
594 interaction cut-off distance of $r_c$ and a pair-list cut-off
595 $r_
\ell=r_c+r_b$ for $r_b$ the Verlet buffer size, we can then
596 write the average energy error after time $t$ for all missing pair
597 interactions between a single $i$ particle of type
1 surrounded
598 by all $j$ particles that are of type
2 with number density $
\rho_2$,
599 when the inter-particle distance changes from $r_0$ to $r_t$, as:
601 \langle \Delta V
\rangle =
602 \int_{0}^
{r_c
} \int_{r_
\ell}^
\infty 4 \pi r_0^
2 \rho_2 V(r_t) G\!
\left(
\frac{r_t-r_0
}{\sigma}\right) d r_0\, d r_t
604 To evaluate this analytically, we need to make some approximations. First we replace $V(r_t)$ by a Taylor expansion around $r_c$, then we can move the lower bound of the integral over $r_0$ to $-
\infty$ which will simplify the result:
606 \langle \Delta V
\rangle &
\approx&
607 \int_{-
\infty}^
{r_c
} \int_{r_
\ell}^
\infty 4 \pi r_0^
2 \rho_2 \Big[ V'(r_c) (r_t - r_c) +
610 \phantom{\int_{-
\infty}^
{r_c
} \int_{r_
\ell}^
\infty 4 \pi r_0^
2 \rho_2 \Big[}
611 V''(r_c)
\frac{1}{2}(r_t - r_c)^
2 +
614 \phantom{\int_{-
\infty}^
{r_c
} \int_{r_
\ell}^
\infty 4 \pi r_0^
2 \rho_2 \Big[}
615 V'''(r_c)
\frac{1}{6}(r_t - r_c)^
3 +
618 \phantom{\int_{-
\infty}^
{r_c
} \int_{r_
\ell}^
\infty 4 \pi r_0^
2 \rho_2 \Big[}
619 O \!
\left((r_t - r_c)^
4 \right)
\Big] G\!
\left(
\frac{r_t-r_0
}{\sigma}\right) d r_0 \, d r_t
621 Replacing the factor $r_0^
2$ by $(r_
\ell +
\sigma)^
2$, which results in a slight overestimate, allows us to calculate the integrals analytically:
623 \langle \Delta V
\rangle \!
625 4 \pi (r_
\ell+
\sigma)^
2 \rho_2
626 \int_{-
\infty}^
{r_c
} \int_{r_
\ell}^
\infty \Big[ V'(r_c) (r_t - r_c) +
629 \phantom{4 \pi (r_
\ell+
\sigma)^
2 \rho_2 \int_{-
\infty}^
{r_c
} \int_{r_
\ell}^
\infty \Big[}
630 V''(r_c)
\frac{1}{2}(r_t - r_c)^
2 +
633 \phantom{4 \pi (r_
\ell+
\sigma)^
2 \rho_2 \int_{-
\infty}^
{r_c
} \int_{r_
\ell}^
\infty \Big[}
634 V'''(r_c)
\frac{1}{6}(r_t - r_c)^
3 \Big] G\!
\left(
\frac{r_t-r_0
}{\sigma}\right)
637 4 \pi (r_
\ell+
\sigma)^
2 \rho_2 \bigg\
{
638 \frac{1}{2}V'(r_c)
\left[r_b
\sigma G\!
\left(
\frac{r_b
}{\sigma}\right) - (r_b^
2+
\sigma^
2)E\!
\left(
\frac{r_b
}{\sigma}\right)
\right] +
641 \phantom{4 \pi (r_
\ell+
\sigma)^
2 \rho_2 \bigg\
{ }
642 \frac{1}{6}V''(r_c)
\left[ \sigma(r_b^
2+
2\sigma^
2) G\!
\left(
\frac{r_b
}{\sigma}\right) - r_b(r_b^
2+
3\sigma^
2 ) E\!
\left(
\frac{r_b
}{\sigma}\right)
\right] +
645 \phantom{4 \pi (r_
\ell+
\sigma)^
2 \rho_2 \bigg\
{ }
646 \frac{1}{24}V'''(r_c)
\bigg[ r_b
\sigma(r_b^
2+
5\sigma^
2) G\!
\left(
\frac{r_b
}{\sigma}\right)
649 \phantom{4 \pi (r_
\ell+
\sigma)^
2 \rho_2 \bigg\
{ \frac{1}{24}V'''(r_c)
\bigg[ }
650 - (r_b^
4+
6r_b^
2\sigma^
2+
3\sigma^
4 ) E\!
\left(
\frac{r_b
}{\sigma}\right)
\bigg]
654 where $G(x)$ is a Gaussian distribution with
0 mean and unit variance and
655 $E(x)=
\frac{1}{2}\mathrm{erfc
}(x/
\sqrt{2})$. We always want to achieve
656 small energy error, so $
\sigma$ will be small compared to both $r_c$
657 and $r_
\ell$, thus the approximations in the equations above are good,
658 since the Gaussian distribution decays rapidly. The energy error needs
659 to be averaged over all particle pair types and weighted with the
660 particle counts. In
{\gromacs} we don't allow cancellation of error
661 between pair types, so we average the absolute values. To obtain the
662 average energy error per unit time, it needs to be divided by the
663 neighbor-list life time $t = (
{\tt nstlist
} -
1)
\times{\tt dt
}$. The
664 function can not be inverted analytically, so we use bisection to
665 obtain the buffer size $r_b$ for a target drift. Again we note that
666 in practice the error we usually be much smaller than this estimate,
667 as in the condensed phase particle displacements will be much smaller
668 than for freely moving particles, which is the assumption used here.
670 When (bond) constraints are present, some particles will have fewer
671 degrees of freedom. This will reduce the energy errors. For simplicity,
672 we only consider one constraint per particle, the heaviest particle
673 in case a particle is involved in multiple constraints.
674 This simplification overestimates the displacement. The motion of
675 a constrained particle is a superposition of the
3D motion of the
676 center of mass of both particles and a
2D rotation around the center of mass.
677 The displacement in an arbitrary direction of a particle with
2 degrees
678 of freedom is not Gaussian, but rather follows the complementary error
681 \frac{\sqrt{\pi}}{2\sqrt{2}\sigma}\,
\mathrm{erfc
}\left(
\frac{|r|
}{\sqrt{2}\,
\sigma}\right)
684 where $
\sigma^
2$ is again $t^
2 k_B T/m$. This distribution can no
685 longer be integrated analytically to obtain the energy error. But we
686 can generate a tight upper bound using a scaled and shifted Gaussian
687 distribution (not shown). This Gaussian distribution can then be used
688 to calculate the energy error as described above. The rotation displacement
689 around the center of mass can not be more than the length of the arm.
690 To take this into account, we scale $
\sigma$ in
\eqnref{2D_disp
} (details
691 not presented here) to obtain an overestimate of the real displacement.
692 This latter effect significantly reduces the buffer size for longer
693 neighborlist lifetimes in e.g. water, as constrained hydrogens are by far
694 the fastest particles, but they can not move further than
0.1 nm
695 from the heavy atom they are connected to.
698 There is one important implementation detail that reduces the energy
699 errors caused by the finite Verlet buffer list size. The derivation
700 above assumes a particle pair-list. However, the
{\gromacs}
701 implementation uses a cluster pair-list for efficiency. The pair list
702 consists of pairs of clusters of
4 particles in most cases, also
703 called a $
4 \times 4$ list, but the list can also be $
4 \times 8$ (GPU
704 CUDA kernels and AVX
256-bit single precision kernels) or $
4 \times 2$
705 (SSE double-precision kernels). This means that the pair-list is
706 effectively much larger than the corresponding $
1 \times 1$ list. Thus
707 slightly beyond the pair-list cut-off there will still be a large
708 fraction of particle pairs present in the list. This fraction can be
709 determined in a simulation and accurately estimated under some
710 reasonable assumptions. The fraction decreases with increasing
711 pair-list range, meaning that a smaller buffer can be used. For
712 typical all-atom simulations with a cut-off of
0.9 nm this fraction is
713 around
0.9, which gives a reduction in the energy errors of a factor of
714 10. This reduction is taken into account during the automatic Verlet
715 buffer calculation and results in a smaller buffer size.
718 \centerline{\includegraphics[width=
9cm
]{plots/verlet-drift
}}
719 \caption {Energy drift per atom for an SPC/E water system at
300K with
720 a time step of
2 fs and a pair-list update period of
10 steps
721 (pair-list life time:
18 fs). PME was used with
{\tt ewald-rtol
} set
722 to
10$^
{-
5}$; this parameter affects the shape of the potential at
723 the cut-off. Error estimates due to finite Verlet buffer size are
724 shown for a $
1 \times 1$ atom pair list and $
4 \times 4$ atom pair
725 list without and with (dashed line) cancellation of positive and
726 negative errors. Real energy drift is shown for simulations using
727 double- and mixed-precision settings. Rounding errors in the SETTLE
728 constraint algorithm from the use of single precision causes
729 the drift to become negative
730 at large buffer size. Note that at zero buffer size, the real drift
731 is small because positive (H-H) and negative (O-H) energy errors
733 \label{fig:verletdrift
}
736 In
\figref{verletdrift
} one can see that for small buffer sizes the drift
737 of the total energy is much smaller than the pair energy error tolerance,
738 due to cancellation of errors. For larger buffer size, the error estimate
739 is a factor of
6 higher than drift of the total energy, or alternatively
740 the buffer estimate is
0.024 nm too large. This is because the protons
741 don't move freely over
18 fs, but rather vibrate.
742 %At a buffer size of zero there is cancellation of
743 %drift due to repulsive (H-H) and attractive (O-H) interactions.
745 \subsubsection{Cut-off artifacts and switched interactions
}
746 With the Verlet scheme, the pair potentials are shifted to be zero at
747 the cut-off, which makes the potential the integral of the force.
748 This is only possible in the group scheme if the shape of the potential
749 is such that its value is zero at the cut-off distance.
750 However, there can still be energy drift when the
751 forces are non-zero at the cut-off. This effect is extremely small and
752 often not noticeable, as other integration errors (e.g. from constraints)
754 completely avoid cut-off artifacts, the non-bonded forces can be
755 switched exactly to zero at some distance smaller than the neighbor
756 list cut-off (there are several ways to do this in
{\gromacs}, see
757 \secref{mod_nb_int
}). One then has a buffer with the size equal to the
758 neighbor list cut-off less the longest interaction cut-off.
760 } % Brace matches ifthenelse test for gmxlite
762 \subsubsection{Simple search
\swapindexquiet{simple
}{search
}}
763 Due to
\eqnsref{box_rot
}{simplerc
}, the vector $
\rvij$
764 connecting images within the cut-off $R_c$ can be found by constructing:
766 \ve{r
}''' & = &
\ve{r
}_j-
\ve{r
}_i \\
767 \ve{r
}'' & = &
\ve{r
}''' -
{\bf c
}*
\verb'round'(r'''_z/c_z) \\
768 \ve{r
}' & = &
\ve{r
}'' -
{\bf b
}*
\verb'round'(r''_y/b_y) \\
769 \ve{r
}_
{ij
} & = &
\ve{r
}' -
{\bf a
}*
\verb'round'(r'_x/a_x)
771 When distances between two particles in a triclinic box are needed
772 that do not obey
\eqnref{box_rot
},
773 many shifts of combinations of box vectors need to be considered to find
776 \ifthenelse{\equal{\gmxlite}{1}}{}{
779 \centerline{\includegraphics[width=
8cm
]{plots/nstric
}}
780 \caption {Grid search in two dimensions. The arrows are the box vectors.
}
784 \subsubsection{Grid search
\swapindexquiet{grid
}{search
}}
786 The grid search is schematically depicted in
\figref{grid
}. All
787 particles are put on the
{\nsgrid}, with the smallest spacing $
\ge$
788 $R_c/
2$ in each of the directions. In the direction of each box
789 vector, a particle $i$ has three images. For each direction the image
790 may be -
1,
0 or
1, corresponding to a translation over -
1,
0 or +
1 box
791 vector. We do not search the surrounding
{\nsgrid} cells for neighbors
792 of $i$ and then calculate the image, but rather construct the images
793 first and then search neighbors corresponding to that image of $i$.
794 As
\figref{grid
} shows, some grid cells may be searched more than once
795 for different images of $i$. This is not a problem, since, due to the
796 minimum image convention, at most one image will ``see'' the
797 $j$-particle. For every particle, fewer than
125 (
5$^
3$) neighboring
798 cells are searched. Therefore, the algorithm scales linearly with the
799 number of particles. Although the prefactor is large, the scaling
800 behavior makes the algorithm far superior over the standard $O(N^
2)$
801 algorithm when there are more than a few hundred particles. The
802 grid search is equally fast for rectangular and triclinic boxes. Thus
803 for most protein and peptide simulations the rhombic dodecahedron will
804 be the preferred box shape.
805 } % Brace matches ifthenelse test for gmxlite
807 \ifthenelse{\equal{\gmxlite}{1}}{}{
808 \subsubsection{Charge groups
}
809 \label{sec:chargegroup
}\swapindexquiet{charge
}{group
}%
810 Charge groups were originally introduced to reduce cut-off artifacts
811 of Coulomb interactions. When a plain cut-off is used, significant
812 jumps in the potential and forces arise when atoms with (partial) charges
813 move in and out of the cut-off radius. When all chemical moieties have
814 a net charge of zero, these jumps can be reduced by moving groups
815 of atoms with net charge zero, called charge groups, in and
816 out of the neighbor list. This reduces the cut-off effects from
817 the charge-charge level to the dipole-dipole level, which decay
818 much faster. With the advent of full range electrostatics methods,
819 such as particle-mesh Ewald (
\secref{pme
}), the use of charge groups is
820 no longer required for accuracy. It might even have a slight negative effect
821 on the accuracy or efficiency, depending on how the neighbor list is made
822 and the interactions are calculated.
824 But there is still an important reason for using ``charge groups'': efficiency with the group cut-off scheme.
825 Where applicable, neighbor searching is carried out on the basis of
826 charge groups which are defined in the molecular topology.
827 If the nearest image distance between the
{\em
828 geometrical centers
} of the atoms of two charge groups is less than
829 the cut-off radius, all atom pairs between the charge groups are
830 included in the pair list.
831 The neighbor searching for a water system, for instance,
832 is $
3^
2=
9$ times faster when each molecule is treated as a charge group.
833 Also the highly optimized water force loops (see
\secref{waterloops
})
834 only work when all atoms in a water molecule form a single charge group.
835 Currently the name
{\em neighbor-search group
} would be more appropriate,
836 but the name charge group is retained for historical reasons.
837 When developing a new force field, the advice is to use charge groups
838 of
3 to
4 atoms for optimal performance. For all-atom force fields
839 this is relatively easy, as one can simply put hydrogen atoms, and in some
840 case oxygen atoms, in the same charge group as the heavy atom they
841 are connected to; for example: CH$_3$, CH$_2$, CH, NH$_2$, NH, OH, CO$_2$, CO.
843 With the Verlet cut-off scheme, charge groups are ignored.
845 } % Brace matches ifthenelse test for gmxlite
847 \subsection{Compute forces
}
848 \label{subsec:forces
}
850 \subsubsection{Potential energy
}
851 When forces are computed, the
\swapindex{potential
}{energy
} of each
852 interaction term is computed as well. The total potential energy is
853 summed for various contributions, such as Lennard-Jones, Coulomb, and
854 bonded terms. It is also possible to compute these contributions for
855 {\em energy-monitor groups
} of atoms that are separately defined (see
856 \secref{groupconcept
}).
858 \subsubsection{Kinetic energy and temperature
}
859 The
\normindex{temperature
} is given by the total
860 \swapindex{kinetic
}{energy
} of the $N$-particle system:
862 E_
{kin
} =
\half \sum_{i=
1}^N m_i v_i^
2
864 From this the absolute temperature $T$ can be computed using:
866 \half N_
{\mathrm{df
}} kT = E_
{\mathrm{kin
}}
869 where $k$ is Boltzmann's constant and $N_
{df
}$ is the number of
870 degrees of freedom which can be computed from:
872 N_
{\mathrm{df
}} ~=~
3 N - N_c - N_
{\mathrm{com
}}
874 Here $N_c$ is the number of
{\em \normindex{constraints
}} imposed on the system.
875 When performing molecular dynamics $N_
{\mathrm{com
}}=
3$ additional degrees of
876 freedom must be removed, because the three
877 center-of-mass velocities are constants of the motion, which are usually
878 set to zero. When simulating in vacuo, the rotation around the center of mass
879 can also be removed, in this case $N_
{\mathrm{com
}}=
6$.
880 When more than one temperature-coupling group
\index{temperature-coupling group
} is used, the number of degrees
881 of freedom for group $i$ is:
883 N^i_
{\mathrm{df
}} ~=~ (
3 N^i - N^i_c)
\frac{3 N - N_c - N_
{\mathrm{com
}}}{3 N - N_c
}
886 The kinetic energy can also be written as a tensor, which is necessary
887 for pressure calculation in a triclinic system, or systems where shear
890 {\bf E
}_
{\mathrm{kin
}} =
\half \sum_i^N m_i
\vvi \otimes \vvi
893 \subsubsection{Pressure and virial
}
894 The
\normindex{pressure
}
895 tensor
{\bf P
} is calculated from the difference between
896 kinetic energy $E_
{\mathrm{kin
}}$ and the
\normindex{virial
} $
{\bf \Xi}$:
898 {\bf P
} =
\frac{2}{V
} (
{\bf E
}_
{\mathrm{kin
}}-
{\bf \Xi})
901 where $V$ is the volume of the computational box.
902 The scalar pressure $P$, which can be used for pressure coupling in the case
903 of isotropic systems, is computed as:
905 P =
{\rm trace
}(
{\bf P
})/
3
908 The virial $
{\bf \Xi}$ tensor is defined as:
910 {\bf \Xi} = -
\half \sum_{i<j
} \rvij \otimes \Fvij
914 \ifthenelse{\equal{\gmxlite}{1}}{}{
915 The
{\gromacs} implementation of the virial computation is described
917 } % Brace matches ifthenelse test for gmxlite
920 \subsection{The
\swapindex{leap-frog
}{integrator
}}
921 \label{subsec:update
}
923 \centerline{\includegraphics[width=
8cm
]{plots/leapfrog
}}
924 \caption[The Leap-Frog integration method.
]{The Leap-Frog integration method. The algorithm is called Leap-Frog because $
\ve{r
}$ and $
\ve{v
}$ are leaping
925 like frogs over each other's backs.
}
929 The default MD integrator in
{\gromacs} is the so-called
{\em leap-frog
}
930 algorithm~
\cite{Hockney74
} for the integration of the equations of
931 motion. When extremely accurate integration with temperature
932 and/or pressure coupling is required, the velocity Verlet integrators
933 are also present and may be preferable (see
\ssecref{vverlet
}). The leap-frog
934 algorithm uses positions $
\ve{r
}$ at time $t$ and
935 velocities $
\ve{v
}$ at time $t-
\hDt$; it updates positions and
936 velocities using the forces
937 $
\ve{F
}(t)$ determined by the positions at time $t$ using these relations:
939 \label{eqn:leapfrogv
}
940 \ve{v
}(t+
\hDt) &~=~&
\ve{v
}(t-
\hDt)+
\frac{\Dt}{m
}\ve{F
}(t) \\
941 \ve{r
}(t+
\Dt) &~=~&
\ve{r
}(t)+
\Dt\ve{v
}(t+
\hDt)
943 The algorithm is visualized in
\figref{leapfrog
}.
944 It produces trajectories that are identical to the Verlet~
\cite{Verlet67
} algorithm,
945 whose position-update relation is
947 \ve{r
}(t+
\Dt)~=~
2\ve{r
}(t) -
\ve{r
}(t-
\Dt) +
\frac{1}{m
}\ve{F
}(t)
\Dt^
2+O(
\Dt^
4)
949 The algorithm is of third order in $
\ve{r
}$ and is time-reversible.
950 See ref.~
\cite{Berendsen86b
} for the merits of this algorithm and comparison
951 with other time integration algorithms.
953 The
\swapindex{equations of
}{motion
} are modified for temperature
954 coupling and pressure coupling, and extended to include the
955 conservation of constraints, all of which are described below.
957 \subsection{The
\swapindex{velocity Verlet
}{integrator
}}
958 \label{subsec:vverlet
}
959 The velocity Verlet algorithm~
\cite{Swope82
} is also implemented in
960 {\gromacs}, though it is not yet fully integrated with all sets of
961 options. In velocity Verlet, positions $
\ve{r
}$ and velocities
962 $
\ve{v
}$ at time $t$ are used to integrate the equations of motion;
963 velocities at the previous half step are not required.
\bea
964 \label{eqn:velocityverlet1
}
965 \ve{v
}(t+
\hDt) &~=~&
\ve{v
}(t)+
\frac{\Dt}{2m
}\ve{F
}(t) \\
966 \ve{r
}(t+
\Dt) &~=~&
\ve{r
}(t)+
\Dt\,
\ve{v
}(t+
\hDt) \\
967 \ve{v
}(t+
\Dt) &~=~&
\ve{v
}(t+
\hDt)+
\frac{\Dt}{2m
}\ve{F
}(t+
\Dt)
971 \label{eqn:velocityverlet2
}
972 \ve{r
}(t+
\Dt) &~=~&
\ve{r
}(t)+
\Dt\,
\ve{v
} +
\frac{\Dt^
2}{2m
}\ve{F
}(t) \\
973 \ve{v
}(t+
\Dt) &~=~&
\ve{v
}(t)+
\frac{\Dt}{2m
}\left[\ve{F
}(t) +
\ve{F
}(t+
\Dt)
\right]
975 With no temperature or pressure coupling, and with
{\em corresponding
}
976 starting points, leap-frog and velocity Verlet will generate identical
977 trajectories, as can easily be verified by hand from the equations
978 above. Given a single starting file with the
{\em same
} starting
979 point $
\ve{x
}(
0)$ and $
\ve{v
}(
0)$, leap-frog and velocity Verlet will
980 {\em not
} give identical trajectories, as leap-frog will interpret the
981 velocities as corresponding to $t=-
\hDt$, while velocity Verlet will
982 interpret them as corresponding to the timepoint $t=
0$.
984 \subsection{Understanding reversible integrators: The Trotter decomposition
}
985 To further understand the relationship between velocity Verlet and
986 leap-frog integration, we introduce the reversible Trotter formulation
987 of dynamics, which is also useful to understanding implementations of
988 thermostats and barostats in
{\gromacs}.
990 A system of coupled, first-order differential equations can be evolved
991 from time $t =
0$ to time $t$ by applying the evolution operator
993 \Gamma(t) &=&
\exp(iLt)
\Gamma(
0)
\nonumber \\
994 iL &=&
\dot{\Gamma}\cdot \nabla_{\Gamma},
996 where $L$ is the Liouville operator, and $
\Gamma$ is the
997 multidimensional vector of independent variables (positions and
999 A short-time approximation to the true operator, accurate at time $
\Dt
1000 = t/P$, is applied $P$ times in succession to evolve the system as
1002 \Gamma(t) =
\prod_{i=
1}^P
\exp(iL
\Dt)
\Gamma(
0)
1004 For NVE dynamics, the Liouville operator is
1006 iL =
\sum_{i=
1}^
{N
} \vv_i \cdot \nabla_{\rv_i} +
\sum_{i=
1}^N
\frac{1}{m_i
}\F(r_i)
\cdot \nabla_{\vv_i}.
1008 This can be split into two additive operators
1010 iL_1 &=&
\sum_{i=
1}^N
\frac{1}{m_i
}\F(r_i)
\cdot \nabla_{\vv_i} \nonumber \\
1011 iL_2 &=&
\sum_{i=
1}^
{N
} \vv_i \cdot \nabla_{\rv_i}
1013 Then a short-time, symmetric, and thus reversible approximation of the true dynamics will be
1015 \exp(iL
\Dt) =
\exp(iL_2
\hDt)
\exp(iL_1
\Dt)
\exp(iL_2
\hDt) +
\mathcal{O
}(
\Dt^
3).
1016 \label{eq:NVE_Trotter
}
1018 This corresponds to velocity Verlet integration. The first
1019 exponential term over $
\hDt$ corresponds to a velocity half-step, the
1020 second exponential term over $
\Dt$ corresponds to a full velocity
1021 step, and the last exponential term over $
\hDt$ is the final velocity
1022 half step. For future times $t = n
\Dt$, this becomes
1024 \exp(iLn
\Dt) &
\approx&
\left(
\exp(iL_2
\hDt)
\exp(iL_1
\Dt)
\exp(iL_2
\hDt)
\right)^n
\nonumber \\
1025 &
\approx&
\exp(iL_2
\hDt)
\bigg(
\exp(iL_1
\Dt)
\exp(iL_2
\Dt)
\bigg)^
{n-
1} \nonumber \\
1026 & & \;\;\;\;
\exp(iL_1
\Dt)
\exp(iL_2
\hDt)
1028 This formalism allows us to easily see the difference between the
1029 different flavors of Verlet integrators. The leap-frog integrator can
1030 be seen as starting with Eq.~
\ref{eq:NVE_Trotter
} with the
1031 $
\exp\left(iL_1
\dt\right)$ term, instead of the half-step velocity
1034 \exp(iLn
\dt) &=&
\exp\left(iL_1
\dt\right)
\exp\left(iL_2
\Dt \right) +
\mathcal{O
}(
\Dt^
3).
1036 Here, the full step in velocity is between $t-
\hDt$ and $t+
\hDt$,
1037 since it is a combination of the velocity half steps in velocity
1038 Verlet. For future times $t = n
\Dt$, this becomes
1040 \exp(iLn
\dt) &
\approx&
\bigg(
\exp\left(iL_1
\dt\right)
\exp\left(iL_2
\Dt \right)
\bigg)^
{n
}.
1042 Although at first this does not appear symmetric, as long as the full velocity
1043 step is between $t-
\hDt$ and $t+
\hDt$, then this is simply a way of
1044 starting velocity Verlet at a different place in the cycle.
1046 Even though the trajectory and thus potential energies are identical
1047 between leap-frog and velocity Verlet, the kinetic energy and
1048 temperature will not necessarily be the same. Standard velocity
1049 Verlet uses the velocities at the $t$ to calculate the kinetic energy
1050 and thus the temperature only at time $t$; the kinetic energy is then a sum over all particles
1052 KE_
{\mathrm{full
}}(t) &=&
\sum_i \left(
\frac{1}{2m_i
}\ve{v
}_i(t)
\right)^
2 \nonumber\\
1053 &=&
\sum_i \frac{1}{2m_i
}\left(
\frac{1}{2}\ve{v
}_i(t-
\hDt)+
\frac{1}{2}\ve{v
}_i(t+
\hDt)
\right)^
2,
1055 with the square on the
{\em outside
} of the average. Standard
1056 leap-frog calculates the kinetic energy at time $t$ based on the
1057 average kinetic energies at the timesteps $t+
\hDt$ and $t-
\hDt$, or
1058 the sum over all particles
1060 KE_
{\mathrm{average
}}(t) &=&
\sum_i \frac{1}{2m_i
}\left(
\frac{1}{2}\ve{v
}_i(t-
\hDt)^
2+
\frac{1}{2}\ve{v
}_i(t+
\hDt)^
2\right),
1062 where the square is
{\em inside
} the average.
1064 A non-standard variant of velocity Verlet which averages the kinetic
1065 energies $KE(t+
\hDt)$ and $KE(t-
\hDt)$, exactly like leap-frog, is also
1066 now implemented in
{\gromacs} (as
{\tt .mdp
} file option
{\tt md-vv-avek
}). Without
1067 temperature and pressure coupling, velocity Verlet with
1068 half-step-averaged kinetic energies and leap-frog will be identical up
1069 to numerical precision. For temperature- and pressure-control schemes,
1070 however, velocity Verlet with half-step-averaged kinetic energies and
1071 leap-frog will be different, as will be discussed in the section in
1072 thermostats and barostats.
1074 The half-step-averaged kinetic energy and temperature are slightly more
1075 accurate for a given step size; the difference in average kinetic
1076 energies using the half-step-averaged kinetic energies (
{\em md
} and
1077 {\em md-vv-avek
}) will be closer to the kinetic energy obtained in the
1078 limit of small step size than will the full-step kinetic energy (using
1079 {\em md-vv
}). For NVE simulations, this difference is usually not
1080 significant, since the positions and velocities of the particles are
1081 still identical; it makes a difference in the way the the temperature
1082 of the simulations are
{\em interpreted
}, but
{\em not
} in the
1083 trajectories that are produced. Although the kinetic energy is more
1084 accurate with the half-step-averaged method, meaning that it changes
1085 less as the timestep gets large, it is also more noisy. The RMS deviation
1086 of the total energy of the system (sum of kinetic plus
1087 potential) in the half-step-averaged kinetic energy case will be
1088 higher (about twice as high in most cases) than the full-step kinetic
1089 energy. The drift will still be the same, however, as again, the
1090 trajectories are identical.
1092 For NVT simulations, however, there
{\em will
} be a difference, as
1093 discussed in the section on temperature control, since the velocities
1094 of the particles are adjusted such that kinetic energies of the
1095 simulations, which can be calculated either way, reach the
1096 distribution corresponding to the set temperature. In this case, the
1097 three methods will not give identical results.
1099 Because the velocity and position are both defined at the same time
1100 $t$ the velocity Verlet integrator can be used for some methods,
1101 especially rigorously correct pressure control methods, that are not
1102 actually possible with leap-frog. The integration itself takes
1103 negligibly more time than leap-frog, but twice as many communication
1104 calls are currently required. In most cases, and especially for large
1105 systems where communication speed is important for parallelization and
1106 differences between thermodynamic ensembles vanish in the $
1/N$ limit,
1107 and when only NVT ensembles are required, leap-frog will likely be the
1108 preferred integrator. For pressure control simulations where the fine
1109 details of the thermodynamics are important, only velocity Verlet
1110 allows the true ensemble to be calculated. In either case, simulation
1111 with double precision may be required to get fine details of
1112 thermodynamics correct.
1114 \subsection{Multiple time stepping
}
1115 Several other simulation packages uses multiple time stepping for
1116 bonds and/or the PME mesh forces. In
{\gromacs} we have not implemented
1117 this (yet), since we use a different philosophy. Bonds can be constrained
1118 (which is also a more sound approximation of a physical quantum
1119 oscillator), which allows the smallest time step to be increased
1120 to the larger one. This not only halves the number of force calculations,
1121 but also the update calculations. For even larger time steps, angle vibrations
1122 involving hydrogen atoms can be removed using virtual interaction
1123 \ifthenelse{\equal{\gmxlite}{1}}
1125 {sites (see
\secref{rmfast
}),
}
1126 which brings the shortest time step up to
1127 PME mesh update frequency of a multiple time stepping scheme.
1129 \subsection{Temperature coupling
\index{temperature coupling
}}
1130 While direct use of molecular dynamics gives rise to the NVE (constant
1131 number, constant volume, constant energy ensemble), most quantities
1132 that we wish to calculate are actually from a constant temperature
1133 (NVT) ensemble, also called the canonical ensemble.
{\gromacs} can use
1134 the
{\em weak-coupling
} scheme of Berendsen~
\cite{Berendsen84
},
1135 stochastic randomization through the Andersen
1136 thermostat~
\cite{Andersen80
}, the extended ensemble Nos
{\'e
}-Hoover
1137 scheme~
\cite{Nose84,Hoover85
}, or a velocity-rescaling
1138 scheme~
\cite{Bussi2007a
} to simulate constant temperature, with
1139 advantages of each of the schemes laid out below.
1141 There are several other reasons why it might be necessary to control
1142 the temperature of the system (drift during equilibration, drift as a
1143 result of force truncation and integration errors, heating due to
1144 external or frictional forces), but this is not entirely correct to do
1145 from a thermodynamic standpoint, and in some cases only masks the
1146 symptoms (increase in temperature of the system) rather than the
1147 underlying problem (deviations from correct physics in the dynamics).
1148 For larger systems, errors in ensemble averages and structural
1149 properties incurred by using temperature control to remove slow drifts
1150 in temperature appear to be negligible, but no completely
1151 comprehensive comparisons have been carried out, and some caution must
1152 be taking in interpreting the results.
1154 When using temperature and/or pressure coupling the total energy is
1155 no longer conserved. Instead there is a
\normindex{conserved energy quantity
}
1156 the formula of which will depend on the combination or temperature and
1157 pressure coupling algorithm used. For all coupling algorithms, except
1158 for Andersen temperature coupling and Parrinello-Rahman pressure coupling
1159 combined with shear stress, the conserved energy quantity is computed
1160 and stored in the energy and log file. Note that this quantity will not
1161 be conserved when external forces are applied to the system, such as
1162 pulling on group with a changing distance or an electric field.
1163 Furthermore, how well the energy is conserved depends on the accuracy
1164 of all algorithms involved in the simulation. Usually the algorithms that
1165 cause most drift are constraints and the pair-list buffer, depending
1166 on the parameters used.
1168 \subsubsection{Berendsen temperature coupling
\pawsindexquiet{Berendsen
}{temperature coupling
}\index{weak coupling
}}
1169 The Berendsen algorithm mimics weak coupling with first-order
1170 kinetics to an external heat bath with given temperature $T_0$.
1171 See ref.~
\cite{Berendsen91
} for a comparison with the
1172 Nos
{\'e
}-Hoover scheme. The effect of this algorithm is
1173 that a deviation of the system temperature from $T_0$ is slowly
1174 corrected according to:
1176 \frac{\de T
}{\de t
} =
\frac{T_0-T
}{\tau}
1177 \label{eqn:Tcoupling
}
1179 which means that a temperature deviation decays exponentially with a
1180 time constant $
\tau$.
1181 This method of coupling has the advantage that the strength of the
1182 coupling can be varied and adapted to the user requirement: for
1183 equilibration purposes the coupling time can be taken quite short
1184 (
{\eg} 0.01 ps), but for reliable equilibrium runs it can be taken much
1185 longer (
{\eg} 0.5 ps) in which case it hardly influences the
1186 conservative dynamics.
1188 The Berendsen thermostat suppresses the fluctuations of the kinetic
1189 energy. This means that one does not generate a proper canonical
1190 ensemble, so rigorously, the sampling will be incorrect. This
1191 error scales with $
1/N$, so for very large systems most ensemble
1192 averages will not be affected significantly, except for the
1193 distribution of the kinetic energy itself. However, fluctuation
1194 properties, such as the heat capacity, will be affected. A similar
1195 thermostat which does produce a correct ensemble is the velocity
1196 rescaling thermostat~
\cite{Bussi2007a
} described below.
1198 The heat flow into or out of the system is affected by scaling the
1199 velocities of each particle every step, or every $n_
\mathrm{TC
}$ steps,
1200 with a time-dependent factor $
\lambda$, given by:
1202 \lambda =
\left[ 1 +
\frac{n_
\mathrm{TC
} \Delta t
}{\tau_T}
1203 \left\
{\frac{T_0
}{T(t -
\hDt)
} -
1 \right\
} \right]^
{1/
2}
1206 The parameter $
\tau_T$ is close, but not exactly equal, to the time constant
1207 $
\tau$ of the temperature coupling (
\eqnref{Tcoupling
}):
1209 \tau =
2 C_V
\tau_T / N_
{df
} k
1211 where $C_V$ is the total heat capacity of the system, $k$ is Boltzmann's
1212 constant, and $N_
{df
}$ is the total number of degrees of freedom. The
1213 reason that $
\tau \neq \tau_T$ is that the kinetic energy change
1214 caused by scaling the velocities is partly redistributed between
1215 kinetic and potential energy and hence the change in temperature is
1216 less than the scaling energy. In practice, the ratio $
\tau /
\tau_T$
1217 ranges from
1 (gas) to
2 (harmonic solid) to
3 (water). When we use
1218 the term ``temperature coupling time constant,'' we mean the parameter
1219 \normindex{$
\tau_T$
}.
1220 {\bf Note
} that in practice the scaling factor $
\lambda$ is limited to
1221 the range of
0.8 $<=
\lambda <=$
1.25, to avoid scaling by very large
1222 numbers which may crash the simulation. In normal use,
1223 $
\lambda$ will always be much closer to
1.0.
1225 The thermostat modifies the kinetic energy at each scaling step by:
1227 \Delta E_k = (
\lambda -
1)^
2 E_k
1229 The sum of these changes over the run needs to subtracted from the total energy
1230 to obtain the conserved energy quantity.
1232 \subsubsection{Velocity-rescaling temperature coupling
\pawsindexquiet{velocity-rescaling
}{temperature coupling
}}
1233 The velocity-rescaling thermostat~
\cite{Bussi2007a
} is essentially a Berendsen
1234 thermostat (see above) with an additional stochastic term that ensures
1235 a correct kinetic energy distribution by modifying it according to
1237 \de K = (K_0 - K)
\frac{\de t
}{\tau_T} +
2 \sqrt{\frac{K K_0
}{N_f
}} \frac{\de W
}{\sqrt{\tau_T}},
1238 \label{eqn:vrescale
}
1240 where $K$ is the kinetic energy, $N_f$ the number of degrees of freedom and $
\de W$ a Wiener process.
1241 There are no additional parameters, except for a random seed.
1242 This thermostat produces a correct canonical ensemble and still has
1243 the advantage of the Berendsen thermostat: first order decay of
1244 temperature deviations and no oscillations.
1246 \subsubsection{\normindex{Andersen thermostat
}}
1247 One simple way to maintain a thermostatted ensemble is to take an
1248 $NVE$ integrator and periodically re-select the velocities of the
1249 particles from a Maxwell-Boltzmann distribution.~
\cite{Andersen80
}
1250 This can either be done by randomizing all the velocities
1251 simultaneously (massive collision) every $
\tau_T/
\Dt$ steps (
{\tt andersen-massive
}), or by
1252 randomizing every particle with some small probability every timestep (
{\tt andersen
}),
1253 equal to $
\Dt/
\tau$, where in both cases $
\Dt$ is the timestep and
1254 $
\tau_T$ is a characteristic coupling time scale.
1255 Because of the way constraints operate, all particles in the same
1256 constraint group must be randomized simultaneously. Because of
1257 parallelization issues, the
{\tt andersen
} version cannot currently (
5.0) be
1258 used in systems with constraints.
{\tt andersen-massive
} can be used regardless of constraints.
1259 This thermostat is also currently only possible with velocity Verlet algorithms,
1260 because it operates directly on the velocities at each timestep.
1262 This algorithm completely avoids some of the ergodicity issues of other thermostatting
1263 algorithms, as energy cannot flow back and forth between energetically
1264 decoupled components of the system as in velocity scaling motions.
1265 However, it can slow down the kinetics of system by randomizing
1266 correlated motions of the system, including slowing sampling when
1267 $
\tau_T$ is at moderate levels (less than
10 ps). This algorithm
1268 should therefore generally not be used when examining kinetics or
1269 transport properties of the system.~
\cite{Basconi2013
}
1271 % \ifthenelse{\equal{\gmxlite}{1}}{}{
1272 \subsubsection{Nos
{\'e
}-Hoover temperature coupling
\index{Nose-Hoover temperature coupling@Nos
{\'e
}-Hoover temperature coupling|see
{temperature coupling, Nos
{\'e
}-Hoover
}}{\index{temperature coupling Nose-Hoover@temperature coupling Nos
{\'e
}-Hoover
}}\index{extended ensemble
}}
1274 The Berendsen weak-coupling algorithm is
1275 extremely efficient for relaxing a system to the target temperature,
1276 but once the system has reached equilibrium it might be more
1277 important to probe a correct canonical ensemble. This is unfortunately
1278 not the case for the weak-coupling scheme.
1280 To enable canonical ensemble simulations,
{\gromacs} also supports the
1281 extended-ensemble approach first proposed by Nos
{\'e
}~
\cite{Nose84
}
1282 and later modified by Hoover~
\cite{Hoover85
}. The system Hamiltonian is
1283 extended by introducing a thermal reservoir and a friction term in the
1284 equations of motion. The friction force is proportional to the
1285 product of each particle's velocity and a friction parameter, $
\xi$.
1286 This friction parameter (or ``heat bath'' variable) is a fully
1287 dynamic quantity with its own momentum ($p_
{\xi}$) and equation of
1288 motion; the time derivative is calculated from the difference between
1289 the current kinetic energy and the reference temperature.
1291 In this formulation, the particles' equations of motion in
1292 \figref{global
} are replaced by:
1294 \frac {\de^
2\ve{r
}_i
}{\de t^
2} =
\frac{\ve{F
}_i
}{m_i
} -
1295 \frac{p_
{\xi}}{Q
}\frac{\de \ve{r
}_i
}{\de t
} ,
1296 \label{eqn:NH-eqn-of-motion
}
1297 \eeq where the equation of motion for the heat bath parameter $
\xi$ is:
1298 \beq \frac {\de p_
{\xi}}{\de t
} =
\left( T - T_0
\right).
\eeq The
1299 reference temperature is denoted $T_0$, while $T$ is the current
1300 instantaneous temperature of the system. The strength of the coupling
1301 is determined by the constant $Q$ (usually called the ``mass parameter''
1302 of the reservoir) in combination with the reference
1303 temperature.~
\footnote{Note that some derivations, an alternative
1304 notation $
\xi_{\mathrm{alt
}} = v_
{\xi} = p_
{\xi}/Q$ is used.
}
1306 The conserved quantity for the Nos
{\'e
}-Hoover equations of motion is not
1307 the total energy, but rather
1309 H =
\sum_{i=
1}^
{N
} \frac{\pb_i}{2m_i
} + U
\left(
\rv_1,
\rv_2,
\ldots,
\rv_N\right) +
\frac{p_
{\xi}^
2}{2Q
} + N_fkT
\xi,
1311 where $N_f$ is the total number of degrees of freedom.
1313 In our opinion, the mass parameter is a somewhat awkward way of
1314 describing coupling strength, especially due to its dependence on
1315 reference temperature (and some implementations even include the
1316 number of degrees of freedom in your system when defining $Q$). To
1317 maintain the coupling strength, one would have to change $Q$ in
1318 proportion to the change in reference temperature. For this reason, we
1319 prefer to let the
{\gromacs} user work instead with the period
1320 $
\tau_T$ of the oscillations of kinetic energy between the system and
1321 the reservoir instead. It is directly related to $Q$ and $T_0$ via:
1323 Q =
\frac {\tau_T^
2 T_0
}{4 \pi^
2}.
1325 This provides a much more intuitive way of selecting the
1326 Nos
{\'e
}-Hoover coupling strength (similar to the weak-coupling
1327 relaxation), and in addition $
\tau_T$ is independent of system size
1328 and reference temperature.
1330 It is however important to keep the difference between the
1331 weak-coupling scheme and the Nos
{\'e
}-Hoover algorithm in mind:
1332 Using weak coupling you get a
1333 strongly damped
{\em exponential relaxation
},
1334 while the Nos
{\'e
}-Hoover approach
1335 produces an
{\em oscillatory relaxation
}.
1336 The actual time it takes to relax with Nos
{\'e
}-Hoover coupling is
1337 several times larger than the period of the
1338 oscillations that you select. These oscillations (in contrast
1339 to exponential relaxation) also means that
1340 the time constant normally should be
4--
5 times larger
1341 than the relaxation time used with weak coupling, but your
1344 Nos
{\'e
}-Hoover dynamics in simple systems such as collections of
1345 harmonic oscillators, can be
{\em nonergodic
}, meaning that only a
1346 subsection of phase space is ever sampled, even if the simulations
1347 were to run for infinitely long. For this reason, the Nos
{\'e
}-Hoover
1348 chain approach was developed, where each of the Nos
{\'e
}-Hoover
1349 thermostats has its own Nos
{\'e
}-Hoover thermostat controlling its
1350 temperature. In the limit of an infinite chain of thermostats, the
1351 dynamics are guaranteed to be ergodic. Using just a few chains can
1352 greatly improve the ergodicity, but recent research has shown that the
1353 system will still be nonergodic, and it is still not entirely clear
1354 what the practical effect of this~
\cite{Cooke2008
}. Currently, the
1355 default number of chains is
10, but this can be controlled by the
1356 user. In the case of chains, the equations are modified in the
1357 following way to include a chain of thermostatting
1358 particles~
\cite{Martyna1992
}:
1361 \frac {\de^
2\ve{r
}_i
}{\de t^
2} &~=~&
\frac{\ve{F
}_i
}{m_i
} -
\frac{p_
{{\xi}_1
}}{Q_1
} \frac{\de \ve{r
}_i
}{\de t
} \nonumber \\
1362 \frac {\de p_
{{\xi}_1
}}{\de t
} &~=~&
\left( T - T_0
\right) - p_
{{\xi}_1
} \frac{p_
{{\xi}_2
}}{Q_2
} \nonumber \\
1363 \frac {\de p_
{{\xi}_
{i=
2\ldots N
}}}{\de t
} &~=~&
\left(
\frac{p_
{\xi_{i-
1}}^
2}{Q_
{i-
1}} -kT
\right) - p_
{\xi_i} \frac{p_
{\xi_{i+
1}}}{Q_
{i+
1}} \nonumber \\
1364 \frac {\de p_
{\xi_N}}{\de t
} &~=~&
\left(
\frac{p_
{\xi_{N-
1}}^
2}{Q_
{N-
1}}-kT
\right)
1365 \label{eqn:NH-chain-eqn-of-motion
}
1367 The conserved quantity for Nos
{\'e
}-Hoover chains is
1369 H =
\sum_{i=
1}^
{N
} \frac{\pb_i}{2m_i
} + U
\left(
\rv_1,
\rv_2,
\ldots,
\rv_N\right) +
\sum_{k=
1}^M
\frac{p^
2_
{\xi_k}}{2Q^
{\prime}_k
} + N_fkT
\xi_1 + kT
\sum_{k=
2}^M
\xi_k
1371 The values and velocities of the Nos
{\'e
}-Hoover thermostat variables
1372 are generally not included in the output, as they take up a fair
1373 amount of space and are generally not important for analysis of
1374 simulations, but this can be overridden by defining the environment
1375 variable
{\tt GMX_NOSEHOOVER_CHAINS
}, which will print the values of all
1376 the positions and velocities of all Nos
{\'e
}-Hoover particles in the
1377 chain to the
{\tt .edr
} file. Leap-frog simulations currently can only have
1378 Nos
{\'e
}-Hoover chain lengths of
1, but this will likely be updated in
1381 As described in the integrator section, for temperature coupling, the
1382 temperature that the algorithm attempts to match to the reference
1383 temperature is calculated differently in velocity Verlet and leap-frog
1384 dynamics. Velocity Verlet (
{\em md-vv
}) uses the full-step kinetic
1385 energy, while leap-frog and
{\em md-vv-avek
} use the half-step-averaged
1388 We can examine the Trotter decomposition again to better understand
1389 the differences between these constant-temperature integrators. In
1390 the case of Nos
{\'e
}-Hoover dynamics (for simplicity, using a chain
1391 with $N=
1$, with more details in Ref.~
\cite{Martyna1996
}), we split
1392 the Liouville operator as
1394 iL = iL_1 + iL_2 + iL_
{\mathrm{NHC
}},
1398 iL_1 &=&
\sum_{i=
1}^N
\left[\frac{\pb_i}{m_i
}\right]\cdot \frac{\partial}{\partial \rv_i} \nonumber \\
1399 iL_2 &=&
\sum_{i=
1}^N
\F_i\cdot \frac{\partial}{\partial \pb_i} \nonumber \\
1400 iL_
{\mathrm{NHC
}} &=&
\sum_{i=
1}^N-
\frac{p_
{\xi}}{Q
}\vv_i\cdot \nabla_{\vv_i} +
\frac{p_
{\xi}}{Q
}\frac{\partial }{\partial \xi} +
\left( T - T_0
\right)
\frac{\partial }{\partial p_
{\xi}}
1402 For standard velocity Verlet with Nos
{\'e
}-Hoover temperature control, this becomes
1404 \exp(iL
\dt) &=&
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\exp\left(iL_2
\dt/
2\right)
\nonumber \\
1405 &&
\exp\left(iL_1
\dt\right)
\exp\left(iL_2
\dt/
2\right)
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right) +
\mathcal{O
}(
\Dt^
3).
1407 For half-step-averaged temperature control using
{\em md-vv-avek
},
1408 this decomposition will not work, since we do not have the full step
1409 temperature until after the second velocity step. However, we can
1410 construct an alternate decomposition that is still reversible, by
1411 switching the place of the NHC and velocity portions of the
1414 \exp(iL
\dt) &=&
\exp\left(iL_2
\dt/
2\right)
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\exp\left(iL_1
\dt\right)
\nonumber \\
1415 &&
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\exp\left(iL_2
\dt/
2\right)+
\mathcal{O
}(
\Dt^
3)
1416 \label{eq:half_step_NHC_integrator
}
1418 This formalism allows us to easily see the difference between the
1419 different flavors of velocity Verlet integrator. The leap-frog
1420 integrator can be seen as starting with
1421 Eq.~
\ref{eq:half_step_NHC_integrator
} just before the $
\exp\left(iL_1
1422 \dt\right)$ term, yielding:
1424 \exp(iL
\dt) &=&
\exp\left(iL_1
\dt\right)
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\nonumber \\
1425 &&
\exp\left(iL_2
\dt\right)
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right) +
\mathcal{O
}(
\Dt^
3)
1427 and then using some algebra tricks to solve for some quantities are
1428 required before they are actually calculated~
\cite{Holian95
}.
1432 \subsubsection{Group temperature coupling
}\index{temperature-coupling group
}%
1433 In
{\gromacs} temperature coupling can be performed on groups of
1434 atoms, typically a protein and solvent. The reason such algorithms
1435 were introduced is that energy exchange between different components
1436 is not perfect, due to different effects including cut-offs etc. If
1437 now the whole system is coupled to one heat bath, water (which
1438 experiences the largest cut-off noise) will tend to heat up and the
1439 protein will cool down. Typically
100 K differences can be obtained.
1440 With the use of proper electrostatic methods (PME) these difference
1441 are much smaller but still not negligible. The parameters for
1442 temperature coupling in groups are given in the
{\tt mdp
} file.
1443 Recent investigation has shown that small temperature differences
1444 between protein and water may actually be an artifact of the way
1445 temperature is calculated when there are finite timesteps, and very
1446 large differences in temperature are likely a sign of something else
1447 seriously going wrong with the system, and should be investigated
1448 carefully~
\cite{Eastwood2010
}.
1450 One special case should be mentioned: it is possible to temperature-couple only
1451 part of the system, leaving other parts without temperature
1452 coupling. This is done by specifying $
{-
1}$ for the time constant
1453 $
\tau_T$ for the group that should not be thermostatted. If only
1454 part of the system is thermostatted, the system will still eventually
1455 converge to an NVT system. In fact, one suggestion for minimizing
1456 errors in the temperature caused by discretized timesteps is that if
1457 constraints on the water are used, then only the water degrees of
1458 freedom should be thermostatted, not protein degrees of freedom, as
1459 the higher frequency modes in the protein can cause larger deviations
1460 from the ``true'' temperature, the temperature obtained with small
1461 timesteps~
\cite{Eastwood2010
}.
1463 \subsection{Pressure coupling
\index{pressure coupling
}}
1464 In the same spirit as the temperature coupling, the system can also be
1465 coupled to a ``pressure bath.''
{\gromacs} supports both the Berendsen
1466 algorithm~
\cite{Berendsen84
} that scales coordinates and box vectors
1467 every step, the extended-ensemble Parrinello-Rahman approach~
\cite{Parrinello81,Nose83
}, and for
1468 the velocity Verlet variants, the Martyna-Tuckerman-Tobias-Klein
1469 (MTTK) implementation of pressure
1470 control~
\cite{Martyna1996
}. Parrinello-Rahman and Berendsen can be
1471 combined with any of the temperature coupling methods above. MTTK can
1472 only be used with Nos
{\'e
}-Hoover temperature control. From
5.1 afterwards,
1473 it can only used when the system does not have constraints.
1475 \subsubsection{Berendsen pressure coupling
\pawsindexquiet{Berendsen
}{pressure coupling
}\index{weak coupling
}}
1476 \label{sec:berendsen_pressure_coupling
}
1477 The Berendsen algorithm rescales the
1478 coordinates and box vectors every step, or every $n_
\mathrm{PC
}$ steps,
1479 with a matrix
{\boldmath $
\mu$
},
1480 which has the effect of a first-order kinetic relaxation of the pressure
1481 towards a given reference pressure $
{\bf P
}_0$ according to
1483 \frac{\de {\bf P
}}{\de t
} =
\frac{{\bf P
}_0-
{\bf P
}}{\tau_p}.
1485 The scaling matrix
{\boldmath $
\mu$
} is given by
1488 =
\delta_{ij
} -
\frac{n_
\mathrm{PC
}\Delta t
}{3\,
\tau_p} \beta_{ij
} \
{P_
{0ij
} - P_
{ij
}(t) \
}.
1491 \index{isothermal compressibility
}
1492 \index{compressibility
}
1493 Here,
{\boldmath $
\beta$
} is the isothermal compressibility of the system.
1494 In most cases this will be a diagonal matrix, with equal elements on the
1495 diagonal, the value of which is generally not known.
1496 It suffices to take a rough estimate because the value of
{\boldmath $
\beta$
}
1497 only influences the non-critical time constant of the
1498 pressure relaxation without affecting the average pressure itself.
1499 For water at
1 atm and
300 K
1500 $
\beta =
4.6 \times 10^
{-
10}$ Pa$^
{-
1} =
4.6 \times 10^
{-
5}$ bar$^
{-
1}$,
1501 which is $
7.6 \times 10^
{-
4}$ MD units (see
\chref{defunits
}).
1502 Most other liquids have similar values.
1503 When scaling completely anisotropically, the system has to be rotated in
1504 order to obey
\eqnref{box_rot
}.
1505 This rotation is approximated in first order in the scaling, which is usually
1506 less than $
10^
{-
4}$. The actual scaling matrix
{\boldmath $
\mu'$
} is
1508 \mbox{\boldmath $
\mu'$
} =
1509 \left(
\begin{array
}{ccc
}
1510 \mu_{xx
} &
\mu_{xy
} +
\mu_{yx
} &
\mu_{xz
} +
\mu_{zx
} \\
1511 0 &
\mu_{yy
} &
\mu_{yz
} +
\mu_{zy
} \\
1515 The velocities are neither scaled nor rotated.
1516 Since the equations of motion are modified by pressure coupling, the conserved
1517 energy quantity also needs to be modified. For first order pressure coupling,
1518 the work the barostat applies to the system every step needs to
1519 be subtracted from the total energy to obtain the conserved energy quantity:
1521 -
\sum_{i,j
} (
\mu_{ij
} -
\delta_{ij
}) P_
{ij
} V =
1522 \sum_{i,j
} 2(
\mu_{ij
} -
\delta_{ij
})
\Xi_{ij
}
1524 where $
\delta_{ij
}$ is the Kronecker delta and $
{\bf \Xi}$ is the virial.
1525 Note that the factor
2 originates from the factor $
\frac{1}{2}$
1526 in the virial definition (
\eqnref{Xi
}).
1529 In
{\gromacs}, the Berendsen scaling can also be done isotropically,
1530 which means that instead of $
\ve{P
}$ a diagonal matrix with elements of size
1531 trace$(
\ve{P
})/
3$ is used. For systems with interfaces, semi-isotropic
1532 scaling can be useful.
1533 In this case, the $x/y$-directions are scaled isotropically and the $z$
1534 direction is scaled independently. The compressibility in the $x/y$ or
1535 $z$-direction can be set to zero, to scale only in the other direction(s).
1537 If you allow full anisotropic deformations and use constraints you
1538 might have to scale more slowly or decrease your timestep to avoid
1539 errors from the constraint algorithms. It is important to note that
1540 although the Berendsen pressure control algorithm yields a simulation
1541 with the correct average pressure, it does not yield the exact NPT
1542 ensemble, and it is not yet clear exactly what errors this approximation
1545 % \ifthenelse{\equal{\gmxlite}{1}}{}{
1546 \subsubsection{Parrinello-Rahman pressure coupling
\pawsindexquiet{Parrinello-Rahman
}{pressure coupling
}}
1548 In cases where the fluctuations in pressure or volume are important
1549 {\em per se
} (
{\eg} to calculate thermodynamic properties), especially
1550 for small systems, it may be a problem that the exact ensemble is not
1551 well defined for the weak-coupling scheme, and that it does not
1552 simulate the true NPT ensemble.
1554 {\gromacs} also supports constant-pressure simulations using the
1555 Parrinello-Rahman approach~
\cite{Parrinello81,Nose83
}, which is similar
1556 to the Nos
{\'e
}-Hoover temperature coupling, and in theory gives the
1557 true NPT ensemble. With the Parrinello-Rahman barostat, the box
1558 vectors as represented by the matrix
\ve{b
} obey the matrix equation
1559 of motion
\footnote{The box matrix representation
\ve{b
} in
{\gromacs}
1560 corresponds to the transpose of the box matrix representation
\ve{h
}
1561 in the paper by Nos
{\'e
} and Klein. Because of this, some of our
1562 equations will look slightly different.
}
1564 \frac{\de \ve{b
}^
2}{\de t^
2}= V
\ve{W
}^
{-
1} \ve{b
}'^
{-
1} \left(
\ve{P
} -
\ve{P
}_
{ref
}\right).
1567 The volume of the box is denoted $V$, and $
\ve{W
}$ is a matrix parameter that determines
1568 the strength of the coupling. The matrices
\ve{P
} and
\ve{P
}$_
{ref
}$ are the
1569 current and reference pressures, respectively.
1571 The equations of motion for the particles are also changed, just as
1572 for the Nos
{\'e
}-Hoover coupling. In most cases you would combine the
1573 Parrinello-Rahman barostat with the Nos
{\'e
}-Hoover
1574 thermostat, but to keep it simple we only show the Parrinello-Rahman
1575 modification here. The modified Hamiltonian, which will be conserved, is:
1577 E_
\mathrm{pot
} + E_
\mathrm{kin
} +
\sum_i P_
{ii
} V +
1578 \sum_{i,j
} \frac{1}{2} W_
{ij
} \left(
\frac{\de b_
{ij
}}{\de t
} \right)^
2
1580 The equations of motion for the atoms, obtained from the Hamiltonian are:
1581 \bea \frac {\de^
2\ve{r
}_i
}{\de t^
2} & = &
\frac{\ve{F
}_i
}{m_i
} -
1582 \ve{M
} \frac{\de \ve{r
}_i
}{\de t
} , \\
\ve{M
} & = &
\ve{b
}^
{-
1} \left[
1583 \ve{b
} \frac{\de \ve{b
}'
}{\de t
} +
\frac{\de \ve{b
}}{\de t
} \ve{b
}'
1584 \right] \ve{b
}'^
{-
1}.
\eea The (inverse) mass parameter matrix
1585 $
\ve{W
}^
{-
1}$ determines the strength of the coupling, and how the box
1586 can be deformed. The box restriction (
\ref{eqn:box_rot
}) will be
1587 fulfilled automatically if the corresponding elements of $
\ve{W
}^
{-
1}$
1588 are zero. Since the coupling strength also depends on the size of your
1589 box, we prefer to calculate it automatically in
{\gromacs}. You only
1590 have to provide the approximate isothermal compressibilities
1591 {\boldmath $
\beta$
} and the pressure time constant $
\tau_p$ in the
1592 input file ($L$ is the largest box matrix element):
\beq \left(
1593 \ve{W
}^
{-
1} \right)_
{ij
} =
\frac{4 \pi^
2 \beta_{ij
}}{3 \tau_p^
2 L
}.
1594 \eeq Just as for the Nos
{\'e
}-Hoover thermostat, you should realize
1595 that the Parrinello-Rahman time constant is
{\em not
} equivalent to
1596 the relaxation time used in the Berendsen pressure coupling algorithm.
1597 In most cases you will need to use a
4--
5 times larger time constant
1598 with Parrinello-Rahman coupling. If your pressure is very far from
1599 equilibrium, the Parrinello-Rahman coupling may result in very large
1600 box oscillations that could even crash your run. In that case you
1601 would have to increase the time constant, or (better) use the weak-coupling
1602 scheme to reach the target pressure, and then switch to
1603 Parrinello-Rahman coupling once the system is in equilibrium.
1604 Additionally, using the leap-frog algorithm, the pressure at time $t$
1605 is not available until after the time step has completed, and so the
1606 pressure from the previous step must be used, which makes the algorithm
1607 not directly reversible, and may not be appropriate for high precision
1608 thermodynamic calculations.
1610 \subsubsection{Surface-tension coupling
\pawsindexquiet{surface-tension
}{pressure coupling
}}
1611 When a periodic system consists of more than one phase, separated by
1612 surfaces which are parallel to the $xy$-plane,
1613 the surface tension and the $z$-component of the pressure can be coupled
1614 to a pressure bath. Presently, this only works with the Berendsen
1615 pressure coupling algorithm in
{\gromacs}.
1616 The average surface tension $
\gamma(t)$ can be calculated from
1617 the difference between the normal and the lateral pressure
1620 \frac{1}{n
} \int_0^
{L_z
}
1621 \left\
{ P_
{zz
}(z,t) -
\frac{P_
{xx
}(z,t) + P_
{yy
}(z,t)
}{2} \right\
} \mbox{d
}z \\
1623 \frac{L_z
}{n
} \left\
{ P_
{zz
}(t) -
\frac{P_
{xx
}(t) + P_
{yy
}(t)
}{2} \right\
},
1625 where $L_z$ is the height of the box and $n$ is the number of surfaces.
1626 The pressure in the z-direction is corrected by scaling the height of
1627 the box with $
\mu_{zz
}$
1629 \Delta P_
{zz
} =
\frac{\Delta t
}{\tau_p} \
{ P_
{0zz
} - P_
{zz
}(t) \
}
1632 \mu_{zz
} =
1 +
\beta_{zz
} \Delta P_
{zz
}
1634 This is similar to normal pressure coupling, except that the factor
1635 of $
1/
3$ is missing.
1636 The pressure correction in the $z$-direction is then used to get the
1637 correct convergence for the surface tension to the reference value $
\gamma_0$.
1638 The correction factor for the box length in the $x$/$y$-direction is
1640 \mu_{x/y
} =
1 +
\frac{\Delta t
}{2\,
\tau_p} \beta_{x/y
}
1641 \left(
\frac{n
\gamma_0}{\mu_{zz
} L_z
}
1642 -
\left\
{ P_
{zz
}(t)+
\Delta P_
{zz
} -
\frac{P_
{xx
}(t) + P_
{yy
}(t)
}{2} \right\
}
1645 The value of $
\beta_{zz
}$ is more critical than with normal pressure
1646 coupling. Normally an incorrect compressibility will just scale $
\tau_p$,
1647 but with surface tension coupling it affects the convergence of the surface
1649 When $
\beta_{zz
}$ is set to zero (constant box height), $
\Delta P_
{zz
}$ is also set
1650 to zero, which is necessary for obtaining the correct surface tension.
1652 \subsubsection{MTTK pressure control algorithms
}
1654 As mentioned in the previous section, one weakness of leap-frog
1655 integration is in constant pressure simulations, since the pressure
1656 requires a calculation of both the virial and the kinetic energy at
1657 the full time step; for leap-frog, this information is not available
1658 until
{\em after
} the full timestep. Velocity Verlet does allow the
1659 calculation, at the cost of an extra round of global communication,
1660 and can compute, mod any integration errors, the true NPT ensemble.
1662 The full equations, combining both pressure coupling and temperature
1663 coupling, are taken from Martyna
{\em et al.
}~
\cite{Martyna1996
} and
1664 Tuckerman~
\cite{Tuckerman2006
} and are referred to here as MTTK
1665 equations (Martyna-Tuckerman-Tobias-Klein). We introduce for
1666 convenience $
\epsilon = (
1/
3)
\ln (V/V_0)$, where $V_0$ is a reference
1667 volume. The momentum of $
\epsilon$ is $
\veps = p_
{\epsilon}/W =
1668 \dot{\epsilon} =
\dot{V
}/
3V$, and define $
\alpha =
1 +
3/N_
{dof
}$ (see
1669 Ref~
\cite{Tuckerman2006
})
1671 The isobaric equations are
1673 \dot{\rv}_i &=&
\frac{\pb_i}{m_i
} +
\frac{\peps}{W
} \rv_i \nonumber \\
1674 \frac{\dot{\pb}_i
}{m_i
} &=&
\frac{1}{m_i
}\F_i -
\alpha\frac{\peps}{W
} \frac{\pb_i}{m_i
} \nonumber \\
1675 \dot{\epsilon} &=&
\frac{\peps}{W
} \nonumber \\
1676 \frac{\dot{\peps}}{W
} &=&
\frac{3V
}{W
}(P_
{\mathrm{int
}} - P) + (
\alpha-
1)
\left(
\sum_{n=
1}^N
\frac{\pb_i^
2}{m_i
}\right),\\
1680 P_
{\mathrm{int
}} &=& P_
{\mathrm{kin
}} -P_
{\mathrm{vir
}} =
\frac{1}{3V
}\left[\sum_{i=
1}^N
\left(
\frac{\pb_i^
2}{2m_i
} -
\rv_i \cdot \F_i\
1683 The terms including $
\alpha$ are required to make phase space
1684 incompressible~
\cite{Tuckerman2006
}. The $
\epsilon$ acceleration term
1687 \frac{\dot{\peps}}{W
} &=&
\frac{3V
}{W
}\left(
\alpha P_
{\mathrm{kin
}} - P_
{\mathrm{vir
}} - P
\right)
1689 In terms of velocities, these equations become
1691 \dot{\rv}_i &=&
\vv_i +
\veps \rv_i \nonumber \\
1692 \dot{\vv}_i &=&
\frac{1}{m_i
}\F_i -
\alpha\veps \vv_i \nonumber \\
1693 \dot{\epsilon} &=&
\veps \nonumber \\
1694 \dot{\veps} &=&
\frac{3V
}{W
}(P_
{\mathrm{int
}} - P) + (
\alpha-
1)
\left(
\sum_{n=
1}^N
\frac{1}{2} m_i
\vv_i^
2\right)
\nonumber \\
1695 P_
{\mathrm{int
}} &=& P_
{\mathrm{kin
}} - P_
{\mathrm{vir
}} =
\frac{1}{3V
}\left[\sum_{i=
1}^N
\left(
\frac{1}{2} m_i
\vv_i^
2 -
\rv_i \cdot \F_i\right)
\right]
1697 For these equations, the conserved quantity is
1699 H =
\sum_{i=
1}^
{N
} \frac{\pb_i^
2}{2m_i
} + U
\left(
\rv_1,
\rv_2,
\ldots,
\rv_N\right) +
\frac{p_
\epsilon}{2W
} + PV
1701 The next step is to add temperature control. Adding Nos
{\'e
}-Hoover
1702 chains, including to the barostat degree of freedom, where we use
1703 $
\eta$ for the barostat Nos
{\'e
}-Hoover variables, and $Q^
{\prime}$
1704 for the coupling constants of the thermostats of the barostats, we get
1706 \dot{\rv}_i &=&
\frac{\pb_i}{m_i
} +
\frac{\peps}{W
} \rv_i \nonumber \\
1707 \frac{\dot{\pb}_i
}{m_i
} &=&
\frac{1}{m_i
}\F_i -
\alpha\frac{\peps}{W
} \frac{\pb_i}{m_i
} -
\frac{p_
{\xi_1}}{Q_1
}\frac{\pb_i}{m_i
}\nonumber \\
1708 \dot{\epsilon} &=&
\frac{\peps}{W
} \nonumber \\
1709 \frac{\dot{\peps}}{W
} &=&
\frac{3V
}{W
}(
\alpha P_
{\mathrm{kin
}} - P_
{\mathrm{vir
}} - P) -
\frac{p_
{\eta_1}}{Q^
{\prime}_1
}\peps \nonumber \\
1710 \dot{\xi}_k &=&
\frac{p_
{\xi_k}}{Q_k
} \nonumber \\
1711 \dot{\eta}_k &=&
\frac{p_
{\eta_k}}{Q^
{\prime}_k
} \nonumber \\
1712 \dot{p
}_
{\xi_k} &=& G_k -
\frac{p_
{\xi_{k+
1}}}{Q_
{k+
1}} \;\;\;\; k=
1,
\ldots, M-
1 \nonumber \\
1713 \dot{p
}_
{\eta_k} &=& G^
\prime_k -
\frac{p_
{\eta_{k+
1}}}{Q^
\prime_{k+
1}} \;\;\;\; k=
1,
\ldots, M-
1 \nonumber \\
1714 \dot{p
}_
{\xi_M} &=& G_M
\nonumber \\
1715 \dot{p
}_
{\eta_M} &=& G^
\prime_M,
\nonumber \\
1719 P_
{\mathrm{int
}} &=& P_
{\mathrm{kin
}} - P_
{\mathrm{vir
}} =
\frac{1}{3V
}\left[\sum_{i=
1}^N
\left(
\frac{\pb_i^
2}{2m_i
} -
\rv_i \cdot \F_i\right)
\right] \nonumber \\
1720 G_1 &=&
\sum_{i=
1}^N
\frac{\pb^
2_i
}{m_i
} - N_f kT
\nonumber \\
1721 G_k &=&
\frac{p^
2_
{\xi_{k-
1}}}{2Q_
{k-
1}} - kT \;\; k =
2,
\ldots,M
\nonumber \\
1722 G^
\prime_1 &=&
\frac{\peps^
2}{2W
} - kT
\nonumber \\
1723 G^
\prime_k &=&
\frac{p^
2_
{\eta_{k-
1}}}{2Q^
\prime_{k-
1}} - kT \;\; k =
2,
\ldots,M
1725 The conserved quantity is now
1727 H =
\sum_{i=
1}^
{N
} \frac{\pb_i}{2m_i
} + U
\left(
\rv_1,
\rv_2,
\ldots,
\rv_N\right) +
\frac{p^
2_
\epsilon}{2W
} + PV +
\nonumber \\
1728 \sum_{k=
1}^M
\frac{p^
2_
{\xi_k}}{2Q_k
} +
\sum_{k=
1}^M
\frac{p^
2_
{\eta_k}}{2Q^
{\prime}_k
} + N_fkT
\xi_1 + kT
\sum_{i=
2}^M
\xi_k + kT
\sum_{k=
1}^M
\eta_k
1730 Returning to the Trotter decomposition formalism, for pressure control and temperature control~
\cite{Martyna1996
} we get:
1732 iL = iL_1 + iL_2 + iL_
{\epsilon,
1} + iL_
{\epsilon,
2} + iL_
{\mathrm{NHC-baro
}} + iL_
{\mathrm{NHC
}}
1734 where ``NHC-baro'' corresponds to the Nos
{\`e
}-Hoover chain of the barostat,
1735 and NHC corresponds to the NHC of the particles,
1737 iL_1 &=&
\sum_{i=
1}^N
\left[\frac{\pb_i}{m_i
} +
\frac{\peps}{W
}\rv_i\right]\cdot \frac{\partial}{\partial \rv_i} \\
1738 iL_2 &=&
\sum_{i=
1}^N
\F_i -
\alpha \frac{\peps}{W
}\pb_i \cdot \frac{\partial}{\partial \pb_i} \\
1739 iL_
{\epsilon,
1} &=&
\frac{p_
\epsilon}{W
} \frac{\partial}{\partial \epsilon}\\
1740 iL_
{\epsilon,
2} &=& G_
{\epsilon} \frac{\partial}{\partial p_
\epsilon}
1744 G_
{\epsilon} =
3V
\left(
\alpha P_
{\mathrm{kin
}} - P_
{\mathrm{vir
}} - P
\right)
1746 Using the Trotter decomposition, we get
1748 \exp(iL
\dt) &=&
\exp\left(iL_
{\mathrm{NHC-baro
}}\dt/
2\right)
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\nonumber \nonumber \\
1749 &&
\exp\left(iL_
{\epsilon,
2}\dt/
2\right)
\exp\left(iL_2
\dt/
2\right)
\nonumber \nonumber \\
1750 &&
\exp\left(iL_
{\epsilon,
1}\dt\right)
\exp\left(iL_1
\dt\right)
\nonumber \nonumber \\
1751 &&
\exp\left(iL_2
\dt/
2\right)
\exp\left(iL_
{\epsilon,
2}\dt/
2\right)
\nonumber \nonumber \\
1752 &&
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\exp\left(iL_
{\mathrm{NHC-baro
}}\dt/
2\right) +
\mathcal{O
}(
\dt^
3)
1754 The action of $
\exp\left(iL_1
\dt\right)$ comes from the solution of
1755 the the differential equation
1756 $
\dot{\rv}_i =
\vv_i +
\veps \rv_i$
1757 with $
\vv_i =
\pb_i/m_i$ and $
\veps$ constant with initial condition
1758 $
\rv_i(
0)$, evaluate at $t=
\Delta t$. This yields the evolution
1760 \rv_i(
\dt) =
\rv_i(
0)e^
{\veps \dt} +
\Delta t
\vv_i(
0) e^
{\veps \dt/
2} \sinhx{\veps \dt/
2}.
1762 The action of $
\exp\left(iL_2
\dt/
2\right)$ comes from the solution
1763 of the differential equation $
\dot{\vv}_i =
\frac{\F_i}{m_i
} -
1764 \alpha\veps\vv_i$, yielding
1766 \vv_i(
\dt/
2) =
\vv_i(
0)e^
{-
\alpha\veps \dt/
2} +
\frac{\Delta t
}{2m_i
}\F_i(
0) e^
{-
\alpha\veps \dt/
4}\sinhx{\alpha\veps \dt/
4}.
1768 {\em md-vv-avek
} uses the full step kinetic energies for determining the pressure with the pressure control,
1769 but the half-step-averaged kinetic energy for the temperatures, which can be written as a Trotter decomposition as
1771 \exp(iL
\dt) &=&
\exp\left(iL_
{\mathrm{NHC-baro
}}\dt/
2\right)
\nonumber \exp\left(iL_
{\epsilon,
2}\dt/
2\right)
\exp\left(iL_2
\dt/
2\right)
\nonumber \\
1772 &&
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\exp\left(iL_
{\epsilon,
1}\dt\right)
\exp\left(iL_1
\dt\right)
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\nonumber \\
1773 &&
\exp\left(iL_2
\dt/
2\right)
\exp\left(iL_
{\epsilon,
2}\dt/
2\right)
\exp\left(iL_
{\mathrm{NHC-baro
}}\dt/
2\right) +
\mathcal{O
}(
\dt^
3)
1776 With constraints, the equations become significantly more complicated,
1777 in that each of these equations need to be solved iteratively for the
1778 constraint forces. Before
{\gromacs} 5.1, these iterative
1779 constraints were solved as described in~
\cite{Yu2010
}. From
{\gromacs}
1780 5.1 onward, MTTK with constraints has been removed because of
1781 numerical stability issues with the iterations.
1783 \subsubsection{Infrequent evaluation of temperature and pressure coupling
}
1785 Temperature and pressure control require global communication to
1786 compute the kinetic energy and virial, which can become costly if
1787 performed every step for large systems. We can rearrange the Trotter
1788 decomposition to give alternate symplectic, reversible integrator with
1789 the coupling steps every $n$ steps instead of every steps. These new
1790 integrators will diverge if the coupling time step is too large, as
1791 the auxiliary variable integrations will not converge. However, in
1792 most cases, long coupling times are more appropriate, as they disturb
1793 the dynamics less~
\cite{Martyna1996
}.
1795 Standard velocity Verlet with Nos
{\'e
}-Hoover temperature control has a Trotter expansion
1797 \exp(iL
\dt) &
\approx&
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right)
\exp\left(iL_2
\dt/
2\right)
\nonumber \\
1798 &&
\exp\left(iL_1
\dt\right)
\exp\left(iL_2
\dt/
2\right)
\exp\left(iL_
{\mathrm{NHC
}}\dt/
2\right).
1800 If the Nos
{\'e
}-Hoover chain is sufficiently slow with respect to the motions of the system, we can
1801 write an alternate integrator over $n$ steps for velocity Verlet as
1803 \exp(iL
\dt) &
\approx& (
\exp\left(iL_
{\mathrm{NHC
}}(n
\dt/
2)
\right)
\left[\exp\left(iL_2
\dt/
2\right)
\right.
\nonumber \\
1804 &&
\left.
\exp\left(iL_1
\dt\right)
\exp\left(iL_2
\dt/
2\right)
\right]^n
\exp\left(iL_
{\mathrm{NHC
}}(n
\dt/
2)
\right).
1806 For pressure control, this becomes
1808 \exp(iL
\dt) &
\approx&
\exp\left(iL_
{\mathrm{NHC-baro
}}(n
\dt/
2)
\right)
\exp\left(iL_
{\mathrm{NHC
}}(n
\dt/
2)
\right)
\nonumber \nonumber \\
1809 &&
\exp\left(iL_
{\epsilon,
2}(n
\dt/
2)
\right)
\left[\exp\left(iL_2
\dt/
2\right)
\right.
\nonumber \nonumber \\
1810 &&
\exp\left(iL_
{\epsilon,
1}\dt\right)
\exp\left(iL_1
\dt\right)
\nonumber \nonumber \\
1811 &&
\left.
\exp\left(iL_2
\dt/
2\right)
\right]^n
\exp\left(iL_
{\epsilon,
2}(n
\dt/
2)
\right)
\nonumber \nonumber \\
1812 &&
\exp\left(iL_
{\mathrm{NHC
}}(n
\dt/
2)
\right)
\exp\left(iL_
{\mathrm{NHC-baro
}}(n
\dt/
2)
\right),
1814 where the box volume integration occurs every step, but the auxiliary variable
1815 integrations happen every $n$ steps.
1817 % } % Brace matches ifthenelse test for gmxlite
1820 \subsection{The complete update algorithm
}
1823 \addtolength{\fboxsep}{0.5cm
}
1824 \begin{shadowenv
}[12cm
]
1825 {\large \bf THE UPDATE ALGORITHM
}
1826 \rule{\textwidth}{2pt
} \\
1828 Positions $
\ve{r
}$ of all atoms at time $t$ \\
1829 Velocities $
\ve{v
}$ of all atoms at time $t-
\hDt$ \\
1830 Accelerations $
\ve{F
}/m$ on all atoms at time $t$.\\
1831 (Forces are computed disregarding any constraints)\\
1832 Total kinetic energy and virial at $t-
\Dt$\\
1834 {\bf 1.
} Compute the scaling factors $
\lambda$ and $
\mu$\\
1835 according to
\eqnsref{lambda
}{mu
}\\
1837 {\bf 2.
} Update and scale velocities: $
\ve{v
}' =
\lambda (
\ve{v
} +
1838 \ve{a
} \Delta t)$ \\
1840 {\bf 3.
} Compute new unconstrained coordinates: $
\ve{r
}' =
\ve{r
} +
\ve{v
}'
1843 {\bf 4.
} Apply constraint algorithm to coordinates: constrain($
\ve{r
}^
{'
} \rightarrow \ve{r
}'';
1846 {\bf 5.
} Correct velocities for constraints: $
\ve{v
} = (
\ve{r
}'' -
1847 \ve{r
}) /
\Delta t$ \\
1849 {\bf 6.
} Scale coordinates and box: $
\ve{r
} =
\mu \ve{r
}'';
\ve{b
} =
1852 \caption{The MD update algorithm with the leap-frog integrator
}
1853 \label{fig:complete-update
}
1856 The complete algorithm for the update of velocities and coordinates is
1857 given using leap-frog in
\figref{complete-update
}. The SHAKE algorithm of step
1858 4 is explained below.
1860 {\gromacs} has a provision to ``freeze'' (prevent motion of) selected
1861 particles
\index{frozen atoms
}, which must be defined as a ``
\swapindex{freeze
}{group
}.'' This is implemented
1862 using a
{\em freeze factor $
\ve{f
}_g$
}, which is a vector, and differs for each
1863 freeze group (see
\secref{groupconcept
}). This vector contains only
1864 zero (freeze) or one (don't freeze).
1865 When we take this freeze factor and the external acceleration $
\ve{a
}_h$ into
1866 account the update algorithm for the velocities becomes
1868 \ve{v
}(t+
\hdt)~=~
\ve{f
}_g *
\lambda *
\left[ \ve{v
}(t-
\hdt) +
\frac{\ve{F
}(t)
}{m
}\Delta t +
\ve{a
}_h
\Delta t
\right],
1870 where $g$ and $h$ are group indices which differ per atom.
1872 \subsection{Output step
}
1873 The most important output of the MD run is the
{\em
1874 \swapindex{trajectory
}{file
}}, which contains particle coordinates
1875 and (optionally) velocities at regular intervals.
1876 The trajectory file contains frames that could include positions,
1877 velocities and/or forces, as well as information about the dimensions
1878 of the simulation volume, integration step, integration time, etc. The
1879 interpretation of the time varies with the integrator chosen, as
1880 described above. For Velocity Verlet integrators, velocities labeled
1881 at time $t$ are for that time. For other integrators (e.g. leap-frog,
1882 stochastic dynamics), the velocities labeled at time $t$ are for time
1885 Since the trajectory
1886 files are lengthy, one should not save every step! To retain all
1887 information it suffices to write a frame every
15 steps, since at
1888 least
30 steps are made per period of the highest frequency in the
1889 system, and Shannon's
\normindex{sampling
} theorem states that two samples per
1890 period of the highest frequency in a band-limited signal contain all
1891 available information. But that still gives very long files! So, if
1892 the highest frequencies are not of interest,
10 or
20 samples per ps
1893 may suffice. Be aware of the distortion of high-frequency motions by
1894 the
{\em stroboscopic effect
}, called
{\em aliasing
}: higher frequencies
1895 are mirrored with respect to the sampling frequency and appear as
1898 {\gromacs} can also write reduced-precision coordinates for a subset of
1899 the simulation system to a special compressed trajectory file
1900 format. All the other tools can read and write this format. See
1901 the User Guide for details on how to set up your
{\tt .mdp
} file
1902 to have
{\tt mdrun
} use this feature.
1904 % \ifthenelse{\equal{\gmxlite}{1}}{}{
1905 \section{Shell molecular dynamics
}
1906 {\gromacs} can simulate
\normindex{polarizability
} using the
1907 \normindex{shell model
} of Dick and Overhauser~
\cite{Dick58
}. In such models
1908 a shell particle representing the electronic degrees of freedom is
1909 attached to a nucleus by a spring. The potential energy is minimized with
1910 respect to the shell position at every step of the simulation (see below).
1911 Successful applications of shell models in
{\gromacs} have been published
1912 for $N_2$~
\cite{Jordan95
} and water~
\cite{Maaren2001a
}.
1914 \subsection{Optimization of the shell positions
}
1915 The force
\ve{F
}$_S$ on a shell particle $S$ can be decomposed into two
1918 \ve{F
}_S ~=~
\ve{F
}_
{bond
} +
\ve{F
}_
{nb
}
1920 where
\ve{F
}$_
{bond
}$ denotes the component representing the
1921 polarization energy, usually represented by a harmonic potential and
1922 \ve{F
}$_
{nb
}$ is the sum of Coulomb and van der Waals interactions. If we
1923 assume that
\ve{F
}$_
{nb
}$ is almost constant we can analytically derive the
1924 optimal position of the shell, i.e. where
\ve{F
}$_S$ =
0. If we have the
1925 shell S connected to atom A we have
1927 \ve{F
}_
{bond
} ~=~ k_b
\left(
\ve{x
}_S -
\ve{x
}_A
\right).
1929 In an iterative solver, we have positions
\ve{x
}$_S(n)$ where $n$ is
1930 the iteration count. We now have at iteration $n$
1932 \ve{F
}_
{nb
} ~=~
\ve{F
}_S - k_b
\left(
\ve{x
}_S(n) -
\ve{x
}_A
\right)
1934 and the optimal position for the shells $x_S(n+
1)$ thus follows from
1936 \ve{F
}_S - k_b
\left(
\ve{x
}_S(n) -
\ve{x
}_A
\right) + k_b
\left(
\ve{x
}_S(n+
1) -
\ve{x
}_A
\right) =
0
1940 \Delta \ve{x
}_S =
\ve{x
}_S(n+
1) -
\ve{x
}_S(n)
1944 \Delta \ve{x
}_S =
\ve{F
}_S/k_b
1946 which then yields the algorithm to compute the next trial in the optimization
1949 \ve{x
}_S(n+
1) ~=~
\ve{x
}_S(n) +
\ve{F
}_S/k_b.
1951 % } % Brace matches ifthenelse test for gmxlite
1953 \section{Constraint algorithms
\index{constraint algorithms
}}
1954 Constraints can be imposed in
{\gromacs} using LINCS (default) or
1955 the traditional SHAKE method.
1957 \subsection{\normindex{SHAKE
}}
1958 \label{subsec:SHAKE
}
1959 The SHAKE~
\cite{Ryckaert77
} algorithm changes a set of unconstrained
1960 coordinates $
\ve{r
}^
{'
}$ to a set of coordinates $
\ve{r
}''$ that
1961 fulfill a list of distance constraints, using a set $
\ve{r
}$
1964 {\rm SHAKE
}(
\ve{r
}^
{'
} \rightarrow \ve{r
}'';\,
\ve{r
})
1966 This action is consistent with solving a set of Lagrange multipliers
1967 in the constrained equations of motion. SHAKE needs a
{\em relative tolerance
};
1968 it will continue until all constraints are satisfied within
1969 that relative tolerance. An error message is
1970 given if SHAKE cannot reset the coordinates because the deviation is
1971 too large, or if a given number of iterations is surpassed.
1973 Assume the equations of motion must fulfill $K$ holonomic constraints,
1976 \sigma_k(
\ve{r
}_1
\ldots \ve{r
}_N) =
0; \;\; k=
1 \ldots K.
1978 For example, $(
\ve{r
}_1 -
\ve{r
}_2)^
2 - b^
2 =
0$.
1979 Then the forces are defined as
1981 -
\frac{\partial}{\partial \ve{r
}_i
} \left( V +
\sum_{k=
1}^K
\lambda_k
1984 where $
\lambda_k$ are Lagrange multipliers which must be solved to
1985 fulfill the constraint equations. The second part of this sum
1986 determines the
{\em constraint forces
} $
\ve{G
}_i$, defined by
1988 \ve{G
}_i = -
\sum_{k=
1}^K
\lambda_k \frac{\partial \sigma_k}{\partial
1991 The displacement due to the constraint forces in the leap-frog or
1992 Verlet algorithm is equal to $(
\ve{G
}_i/m_i)(
\Dt)^
2$. Solving the
1993 Lagrange multipliers (and hence the displacements) requires the
1994 solution of a set of coupled equations of the second degree. These are
1995 solved iteratively by SHAKE.
1996 % \ifthenelse{\equal{\gmxlite}{1}}{}{
1997 \label{subsec:SETTLE
}
1998 For the special case of rigid water molecules, that often make up more
1999 than
80\% of the simulation system we have implemented the
2001 algorithm~
\cite{Miyamoto92
} (
\secref{constraints
}).
2003 For velocity Verlet, an additional round of constraining must be
2004 done, to constrain the velocities of the second velocity half step,
2005 removing any component of the velocity parallel to the bond vector.
2006 This step is called RATTLE, and is covered in more detail in the
2007 original Andersen paper~
\cite{Andersen1983a
}.
2009 % } % Brace matches ifthenelse test for gmxlite
2014 \newcommand{\fs}[1]{\begin{equation
} \label{eqn:
#1}}
2015 \newcommand{\fe}{\end{equation
}}
2016 \newcommand{\p}{\partial}
2017 \newcommand{\Bm}{\ve{B
}}
2018 \newcommand{\M}{\ve{M
}}
2019 \newcommand{\iM}{\M^
{-
1}}
2020 \newcommand{\Tm}{\ve{T
}}
2021 \newcommand{\Sm}{\ve{S
}}
2022 \newcommand{\fo}{\ve{f
}}
2023 \newcommand{\con}{\ve{g
}}
2024 \newcommand{\lenc}{\ve{d
}}
2026 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2027 \subsection{\normindex{LINCS
}}
2028 \label{subsec:lincs
}
2030 \subsubsection{The LINCS algorithm
}
2031 LINCS is an algorithm that resets bonds to their correct lengths
2032 after an unconstrained update~
\cite{Hess97
}.
2033 The method is non-iterative, as it always uses two steps.
2034 Although LINCS is based on matrices, no matrix-matrix multiplications are
2035 needed. The method is more stable and faster than SHAKE,
2036 but it can only be used with bond constraints and
2037 isolated angle constraints, such as the proton angle in OH.
2038 Because of its stability, LINCS is especially useful for Brownian dynamics.
2039 LINCS has two parameters, which are explained in the subsection parameters.
2040 The parallel version of LINCS, P-LINCS, is described
2041 in subsection
\ssecref{plincs
}.
2043 \subsubsection{The LINCS formulas
}
2044 We consider a system of $N$ particles, with positions given by a
2045 $
3N$ vector $
\ve{r
}(t)$.
2046 For molecular dynamics the equations of motion are given by Newton's Law
2048 {\de^
2 \ve{r
} \over \de t^
2} =
\iM \ve{F
},
2050 where $
\ve{F
}$ is the $
3N$ force vector
2051 and $
\M$ is a $
3N
\times 3N$ diagonal matrix,
2052 containing the masses of the particles.
2053 The system is constrained by $K$ time-independent constraint equations
2055 g_i(
\ve{r
}) = |
\ve{r
}_
{i_1
}-
\ve{r
}_
{i_2
} | - d_i =
0 ~~~~~~i=
1,
\ldots,K.
2058 In a numerical integration scheme, LINCS is applied after an
2059 unconstrained update, just like SHAKE. The algorithm works in two
2060 steps (see figure
\figref{lincs
}). In the first step, the projections
2061 of the new bonds on the old bonds are set to zero. In the second step,
2062 a correction is applied for the lengthening of the bonds due to
2063 rotation. The numerics for the first step and the second step are very
2064 similar. A complete derivation of the algorithm can be found in
2065 \cite{Hess97
}. Only a short description of the first step is given
2069 \centerline{\includegraphics[height=
50mm
]{plots/lincs
}}
2070 \caption[The three position updates needed for one time step.
]{The
2071 three position updates needed for one time step. The dashed line is
2072 the old bond of length $d$, the solid lines are the new bonds. $l=d
2073 \cos \theta$ and $p=(
2 d^
2 - l^
2)^
{1 \over 2}$.
}
2077 A new notation is introduced for the gradient matrix of the constraint
2078 equations which appears on the right hand side of this equation:
2080 B_
{hi
} =
{\p g_h
\over \p r_i
}
2082 Notice that $
\Bm$ is a $K
\times 3N$ matrix, it contains the directions
2084 The following equation shows how the new constrained coordinates
2085 $
\ve{r
}_
{n+
1}$ are related to the unconstrained coordinates
2086 $
\ve{r
}_
{n+
1}^
{unc
}$ by
2089 \ve{r
}_
{n+
1}=(
\ve{I
}-
\Tm_n \ve{B
}_n)
\ve{r
}_
{n+
1}^
{unc
} +
\Tm_n \lenc=
2091 \ve{r
}_
{n+
1}^
{unc
} -
2092 \iM \Bm_n (
\Bm_n \iM \Bm_n^T)^
{-
1} (
\Bm_n \ve{r
}_
{n+
1}^
{unc
} -
\lenc)
2095 where $
\Tm =
\iM \Bm^T (
\Bm \iM \Bm^T)^
{-
1}$.
2096 The derivation of this equation from
\eqnsref{c1
}{c2
} can be found
2099 This first step does not set the real bond lengths to the prescribed lengths,
2100 but the projection of the new bonds onto the old directions of the bonds.
2101 To correct for the rotation of bond $i$, the projection of the
2102 bond, $p_i$, on the old direction is set to
2104 p_i=
\sqrt{2 d_i^
2 - l_i^
2},
2106 where $l_i$ is the bond length after the first projection.
2107 The corrected positions are
2109 \ve{r
}_
{n+
1}^*=(
\ve{I
}-
\Tm_n \Bm_n)
\ve{r
}_
{n+
1} +
\Tm_n \ve{p
}.
2111 This correction for rotational effects is actually an iterative process,
2112 but during MD only one iteration is applied.
2113 The relative constraint deviation after this procedure will be less than
2114 0.0001 for every constraint.
2115 In energy minimization, this might not be accurate enough, so the number
2116 of iterations is equal to the order of the expansion (see below).
2118 Half of the CPU time goes to inverting the constraint coupling
2119 matrix $
\Bm_n \iM \Bm_n^T$, which has to be done every time step.
2120 This $K
\times K$ matrix
2121 has $
1/m_
{i_1
} +
1/m_
{i_2
}$ on the diagonal.
2122 The off-diagonal elements are only non-zero when two bonds are connected,
2124 $
\cos \phi /m_c$, where $m_c$ is
2125 the mass of the atom connecting the
2126 two bonds and $
\phi$ is the angle between the bonds.
2128 The matrix $
\Tm$ is inverted through a power expansion.
2129 A $K
\times K$ matrix $
\ve{S
}$ is
2130 introduced which is the inverse square root of
2131 the diagonal of $
\Bm_n \iM \Bm_n^T$.
2132 This matrix is used to convert the diagonal elements
2133 of the coupling matrix to one:
2136 (
\Bm_n \iM \Bm_n^T)^
{-
1}
2137 =
\Sm \Sm^
{-
1} (
\Bm_n \iM \Bm_n^T)^
{-
1} \Sm^
{-
1} \Sm \\
[2mm
]
2138 =
\Sm (
\Sm \Bm_n \iM \Bm_n^T
\Sm)^
{-
1} \Sm =
2139 \Sm (
\ve{I
} -
\ve{A
}_n)^
{-
1} \Sm
2142 The matrix $
\ve{A
}_n$ is symmetric and sparse and has zeros on the diagonal.
2143 Thus a simple trick can be used to calculate the inverse:
2145 (
\ve{I
}-
\ve{A
}_n)^
{-
1}=
2146 \ve{I
} +
\ve{A
}_n +
\ve{A
}_n^
2 +
\ve{A
}_n^
3 +
\ldots
2149 This inversion method is only valid if the absolute values of all the
2150 eigenvalues of $
\ve{A
}_n$ are smaller than one.
2151 In molecules with only bond constraints, the connectivity is so low
2152 that this will always be true, even if ring structures are present.
2153 Problems can arise in angle-constrained molecules.
2154 By constraining angles with additional distance constraints,
2155 multiple small ring structures are introduced.
2156 This gives a high connectivity, leading to large eigenvalues.
2157 Therefore LINCS should NOT be used with coupled angle-constraints.
2159 For molecules with all bonds constrained the eigenvalues of $A$
2160 are around
0.4. This means that with each additional order
2161 in the expansion
\eqnref{m3
} the deviations decrease by a factor
0.4.
2162 But for relatively isolated triangles of constraints the largest
2163 eigenvalue is around
0.7.
2164 Such triangles can occur when removing hydrogen angle vibrations
2165 with an additional angle constraint in alcohol groups
2166 or when constraining water molecules with LINCS, for instance
2167 with flexible constraints.
2168 The constraints in such triangles converge twice as slow as
2169 the other constraints. Therefore, starting with
{\gromacs} 4,
2170 additional terms are added to the expansion for such triangles
2172 (
\ve{I
}-
\ve{A
}_n)^
{-
1} \approx
2173 \ve{I
} +
\ve{A
}_n +
\ldots +
\ve{A
}_n^
{N_i
} +
2174 \left(
\ve{A
}^*_n +
\ldots +
{\ve{A
}_n^*
}^
{N_i
} \right)
\ve{A
}_n^
{N_i
}
2176 where $N_i$ is the normal order of the expansion and
2177 $
\ve{A
}^*$ only contains the elements of $
\ve{A
}$ that couple
2178 constraints within rigid triangles, all other elements are zero.
2179 In this manner, the accuracy of angle constraints comes close
2180 to that of the other constraints, while the series of matrix vector
2181 multiplications required for determining the expansion
2182 only needs to be extended for a few constraint couplings.
2183 This procedure is described in the P-LINCS paper
\cite{Hess2008a
}.
2185 \subsubsection{The LINCS Parameters
}
2186 The accuracy of LINCS depends on the number of matrices used
2187 in the expansion
\eqnref{m3
}. For MD calculations a fourth order
2188 expansion is enough. For Brownian dynamics with
2189 large time steps an eighth order expansion may be necessary.
2190 The order is a parameter in the
{\tt *.mdp
} file.
2191 The implementation of LINCS is done in such a way that the
2192 algorithm will never crash. Even when it is impossible to
2193 to reset the constraints LINCS will generate a conformation
2194 which fulfills the constraints as well as possible.
2195 However, LINCS will generate a warning when in one step a bond
2196 rotates over more than a predefined angle.
2197 This angle is set by the user in the
{\tt *.mdp
} file.
2199 % } % Brace matches ifthenelse test for gmxlite
2202 \section{Simulated Annealing
}
2204 The well known
\swapindex{simulated
}{annealing
}
2205 (SA) protocol is supported in
{\gromacs}, and you can even couple multiple
2206 groups of atoms separately with an arbitrary number of reference temperatures
2207 that change during the simulation. The annealing is implemented by simply
2208 changing the current reference temperature for each group in the temperature
2209 coupling, so the actual relaxation and coupling properties depends on the
2210 type of thermostat you use and how hard you are coupling it. Since we are
2211 changing the reference temperature it is important to remember that the system
2212 will NOT instantaneously reach this value - you need to allow for the inherent
2213 relaxation time in the coupling algorithm too. If you are changing the
2214 annealing reference temperature faster than the temperature relaxation you
2215 will probably end up with a crash when the difference becomes too large.
2217 The annealing protocol is specified as a series of corresponding times and
2218 reference temperatures for each group, and you can also choose whether you only
2219 want a single sequence (after which the temperature will be coupled to the
2220 last reference value), or if the annealing should be periodic and restart at
2221 the first reference point once the sequence is completed. You can mix and
2222 match both types of annealing and non-annealed groups in your simulation.
2224 \newcommand{\vrond}{\stackrel{\circ}{\ve{r
}}}
2225 \newcommand{\rond}{\stackrel{\circ}{r
}}
2226 \newcommand{\ruis}{\ve{r
}^G
}
2228 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2229 \section{Stochastic Dynamics
\swapindexquiet{stochastic
}{dynamics
}}
2231 Stochastic or velocity
\swapindex{Langevin
}{dynamics
} adds a friction
2232 and a noise term to Newton's equations of motion, as
2235 m_i
{\de^
2 \ve{r
}_i
\over \de t^
2} =
2236 - m_i
\gamma_i {\de \ve{r
}_i
\over \de t
} +
\ve{F
}_i(
\ve{r
}) +
\vrond_i,
2238 where $
\gamma_i$ is the friction constant $
[1/
\mbox{ps
}]$ and
2239 $
\vrond_i\!\!(t)$ is a noise process with
2240 $
\langle \rond_i\!\!(t)
\rond_j\!\!(t+s)
\rangle =
2241 2 m_i
\gamma_i k_B T
\delta(s)
\delta_{ij
}$.
2242 When $
1/
\gamma_i$ is large compared to the time scales present in the system,
2243 one could see stochastic dynamics as molecular dynamics with stochastic
2244 temperature-coupling. The advantage compared to MD with Berendsen
2245 temperature-coupling is that in case of SD the generated ensemble is known.
2246 For simulating a system in vacuum there is the additional advantage that there is no
2247 accumulation of errors for the overall translational and rotational
2249 When $
1/
\gamma_i$ is small compared to the time scales present in the system,
2250 the dynamics will be completely different from MD, but the sampling is
2253 In
{\gromacs} there is one simple and efficient implementation. Its
2254 accuracy is equivalent to the normal MD leap-frog and
2255 Velocity Verlet integrator. It is nearly identical to the common way of discretizing the Langevin equation, but the friction and velocity term are applied in an impulse fashion~
\cite{Goga2012
}.
2256 It can be described as:
2259 \ve{v
}' &~=~&
\ve{v
}(t-
\hDt) +
\frac{1}{m
}\ve{F
}(t)
\Dt \\
2260 \Delta\ve{v
} &~=~& -
\alpha \,
\ve{v
}'(t+
\hDt) +
\sqrt{\frac{k_B T
}{m
}(
1 -
\alpha^
2)
} \,
\ruis_i \\
2261 \ve{r
}(t+
\Dt) &~=~&
\ve{r
}(t)+
\left(
\ve{v
}' +
\frac{1}{2}\Delta \ve{v
}\right)
\Dt \label{eqn:sd1_x_upd
}\\
2262 \ve{v
}(t+
\hDt) &~=~&
\ve{v
}' +
\Delta \ve{v
} \\
2263 \alpha &~=~&
1 - e^
{-
\gamma \Dt}
2265 where $
\ruis_i$ is Gaussian distributed noise with $
\mu =
0$, $
\sigma =
1$.
2266 The velocity is first updated a full time step without friction and noise to get $
\ve{v
}'$, identical to the normal update in leap-frog. The friction and noise are then applied as an impulse at step $t+
\Dt$. The advantage of this scheme is that the velocity-dependent terms act at the full time step, which makes the correct integration of forces that depend on both coordinates and velocities, such as constraints and dissipative particle dynamics (DPD, not implented yet), straightforward. With constraints, the coordinate update
\eqnref{sd1_x_upd
} is split into a normal leap-frog update and a $
\Delta \ve{v
}$. After both of these updates the constraints are applied to coordinates and velocities.
2268 When using SD as a thermostat, an appropriate value for $
\gamma$ is e.g.
0.5 ps$^
{-
1}$,
2269 since this results in a friction that is lower than the internal friction
2270 of water, while it still provides efficient thermostatting.
2273 \section{Brownian Dynamics
\swapindexquiet{Brownian
}{dynamics
}}
2275 In the limit of high friction, stochastic dynamics reduces to
2276 Brownian dynamics, also called position Langevin dynamics.
2277 This applies to over-damped systems,
2278 {\ie} systems in which the inertia effects are negligible.
2281 {\de \ve{r
}_i
\over \de t
} =
\frac{1}{\gamma_i} \ve{F
}_i(
\ve{r
}) +
\vrond_i
2283 where $
\gamma_i$ is the friction coefficient $
[\mbox{amu/ps
}]$ and
2284 $
\vrond_i\!\!(t)$ is a noise process with
2285 $
\langle \rond_i\!\!(t)
\rond_j\!\!(t+s)
\rangle =
2286 2 \delta(s)
\delta_{ij
} k_B T /
\gamma_i$.
2287 In
{\gromacs} the equations are integrated with a simple, explicit scheme
2289 \ve{r
}_i(t+
\Delta t) =
\ve{r
}_i(t) +
2290 {\Delta t
\over \gamma_i} \ve{F
}_i(
\ve{r
}(t))
2291 +
\sqrt{2 k_B T
{\Delta t
\over \gamma_i}}\,
\ruis_i,
2293 where $
\ruis_i$ is Gaussian distributed noise with $
\mu =
0$, $
\sigma =
1$.
2294 The friction coefficients $
\gamma_i$ can be chosen the same for all
2295 particles or as $
\gamma_i = m_i\,
\gamma_i$, where the friction constants
2296 $
\gamma_i$ can be different for different groups of atoms.
2297 Because the system is assumed to be over-damped, large timesteps
2298 can be used. LINCS should be used for the constraints since SHAKE
2299 will not converge for large atomic displacements.
2300 BD is an option of the
{\tt mdrun
} program.
2301 % } % Brace matches ifthenelse test for gmxlite
2303 \section{Energy Minimization
}
2304 \label{sec:EM
}\index{energy minimization
}%
2305 Energy minimization in
{\gromacs} can be done using steepest descent,
2306 conjugate gradients, or l-bfgs (limited-memory
2307 Broyden-Fletcher-Goldfarb-Shanno quasi-Newtonian minimizer...we
2308 prefer the abbreviation). EM is just an option of the
{\tt mdrun
}
2311 \subsection{Steepest Descent
\index{steepest descent
}}
2312 Although steepest descent is certainly not the most efficient
2313 algorithm for searching, it is robust and easy to implement.
2315 We define the vector $
\ve{r
}$ as the vector of all $
3N$ coordinates.
2316 Initially a maximum displacement $h_0$ (
{\eg} 0.01 nm) must be given.
2318 First the forces $
\ve{F
}$ and potential energy are calculated.
2319 New positions are calculated by
2321 \ve{r
}_
{n+
1} =
\ve{r
}_n +
\frac{\ve{F
}_n
}{\max (|
\ve{F
}_n|)
} h_n,
2323 where $h_n$ is the maximum displacement and $
\ve{F
}_n$ is the force,
2324 or the negative gradient of the potential $V$. The notation $
\max
2325 (|
\ve{F
}_n|)$ means the largest of the absolute values of the force
2326 components. The forces and energy are again computed for the new positions \\
2327 If ($V_
{n+
1} < V_n$) the new positions are accepted and $h_
{n+
1} =
1.2
2329 If ($V_
{n+
1} \geq V_n$) the new positions are rejected and $h_n =
0.2 h_n$.
2331 The algorithm stops when either a user-specified number of force
2332 evaluations has been performed (
{\eg} 100), or when the maximum of the absolute
2333 values of the force (gradient) components is smaller than a specified
2335 Since force truncation produces some noise in the
2336 energy evaluation, the stopping criterion should not be made too tight
2337 to avoid endless iterations. A reasonable value for $
\epsilon$ can be
2338 estimated from the root mean square force $f$ a harmonic oscillator would exhibit at a
2339 temperature $T$. This value is
2341 f =
2 \pi \nu \sqrt{ 2mkT
},
2343 where $
\nu$ is the oscillator frequency, $m$ the (reduced) mass, and
2344 $k$ Boltzmann's constant. For a weak oscillator with a wave number of
2345 100 cm$^
{-
1}$ and a mass of
10 atomic units, at a temperature of
1 K,
2346 $f=
7.7$ kJ~mol$^
{-
1}$~nm$^
{-
1}$. A value for $
\epsilon$ between
1 and
2349 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2350 \subsection{Conjugate Gradient
\index{conjugate gradient
}}
2351 Conjugate gradient is slower than steepest descent in the early stages
2352 of the minimization, but becomes more efficient closer to the energy
2353 minimum. The parameters and stop criterion are the same as for
2354 steepest descent. In
{\gromacs} conjugate gradient can not be used
2355 with constraints, including the SETTLE algorithm for
2356 water~
\cite{Miyamoto92
}, as this has not been implemented. If water is
2357 present it must be of a flexible model, which can be specified in the
2358 {\tt *.mdp
} file by
{\tt define = -DFLEXIBLE
}.
2360 This is not really a restriction, since the accuracy of conjugate
2361 gradient is only required for minimization prior to a normal-mode
2362 analysis, which cannot be performed with constraints. For most other
2363 purposes steepest descent is efficient enough.
2364 % } % Brace matches ifthenelse test for gmxlite
2366 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2367 \subsection{\normindex{L-BFGS
}}
2368 The original BFGS algorithm works by successively creating better
2369 approximations of the inverse Hessian matrix, and moving the system to
2370 the currently estimated minimum. The memory requirements for this are
2371 proportional to the square of the number of particles, so it is not
2372 practical for large systems like biomolecules. Instead, we use the
2373 L-BFGS algorithm of Nocedal~
\cite{Byrd95a,Zhu97a
}, which approximates
2374 the inverse Hessian by a fixed number of corrections from previous
2375 steps. This sliding-window technique is almost as efficient as the
2376 original method, but the memory requirements are much lower -
2377 proportional to the number of particles multiplied with the correction
2378 steps. In practice we have found it to converge faster than conjugate
2379 gradients, but due to the correction steps it is not yet parallelized.
2380 It is also noteworthy that switched or shifted interactions usually
2381 improve the convergence, since sharp cut-offs mean the potential
2382 function at the current coordinates is slightly different from the
2383 previous steps used to build the inverse Hessian approximation.
2384 % } % Brace matches ifthenelse test for gmxlite
2386 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2387 \section{Normal-Mode Analysis
\index{normal-mode analysis
}\index{NMA
}}
2388 Normal-mode analysis~
\cite{Levitt83,Go83,BBrooks83b
}
2389 can be performed using
{\gromacs}, by diagonalization of the mass-weighted
2390 \normindex{Hessian
} $H$:
2392 R^T M^
{-
1/
2} H M^
{-
1/
2} R &=&
\mbox{diag
}(
\lambda_1,
\ldots,
\lambda_{3N
})
2394 \lambda_i &=& (
2 \pi \omega_i)^
2
2396 where $M$ contains the atomic masses, $R$ is a matrix that contains
2397 the eigenvectors as columns, $
\lambda_i$ are the eigenvalues
2398 and $
\omega_i$ are the corresponding frequencies.
2400 First the Hessian matrix, which is a $
3N
\times 3N$ matrix where $N$
2401 is the number of atoms, needs to be calculated:
2403 H_
{ij
} &=&
\frac{\partial^
2 V
}{\partial x_i
\partial x_j
}
2405 where $x_i$ and $x_j$ denote the atomic x, y or z coordinates.
2406 In practice, this equation is not used, but the Hessian is
2407 calculated numerically from the force as:
2410 \frac{f_i(
{\bf x
}+h
{\bf e
}_j) - f_i(
{\bf x
}-h
{\bf e
}_j)
}{2h
}
2412 f_i &=& -
\frac{\partial V
}{\partial x_i
}
2414 where $
{\bf e
}_j$ is the unit vector in direction $j$.
2415 It should be noted that
2416 for a usual normal-mode calculation, it is necessary to completely minimize
2417 the energy prior to computation of the Hessian.
2418 The tolerance required depends on the type of system,
2419 but a rough indication is
0.001 kJ mol$^
{-
1}$.
2420 Minimization should be done with conjugate gradients or L-BFGS in double precision.
2422 A number of
{\gromacs} programs are involved in these
2423 calculations. First, the energy should be minimized using
{\tt mdrun
}.
2424 Then,
{\tt mdrun
} computes the Hessian.
{\bf Note
} that for generating
2425 the run input file, one should use the minimized conformation from
2426 the full precision trajectory file, as the structure file is not
2428 {\tt \normindex{gmx nmeig
}} does the diagonalization and
2429 the sorting of the normal modes according to their frequencies.
2430 Both
{\tt mdrun
} and
{\tt gmx nmeig
} should be run in double precision.
2431 The normal modes can be analyzed with the program
{\tt gmx anaeig
}.
2432 Ensembles of structures at any temperature and for any subset of
2433 normal modes can be generated with
{\tt \normindex{gmx nmens
}}.
2434 An overview of normal-mode analysis and the related principal component
2435 analysis (see
\secref{covanal
}) can be found in~
\cite{Hayward95b
}.
2436 % } % Brace matches ifthenelse test for gmxlite
2438 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2440 \section{Free energy calculations
\index{free energy calculations
}}
2442 \subsection{Slow-growth methods
\index{slow-growth methods
}}
2443 Free energy calculations can be performed
2444 in
{\gromacs} using a number of methods, including ``slow-growth.'' An example problem
2445 might be calculating the difference in free energy of binding of an inhibitor
{\bf I
}
2446 to an enzyme
{\bf E
} and to a mutated enzyme
{\bf E$^
{\prime}$
}. It
2447 is not feasible with computer simulations to perform a docking
2448 calculation for such a large complex, or even releasing the inhibitor from
2449 the enzyme in a reasonable amount of computer time with reasonable accuracy.
2450 However, if we consider the free energy cycle in~
\figref{free
}A
2453 \Delta G_1 -
\Delta G_2 =
\Delta G_3 -
\Delta G_4
2456 If we are interested in the left-hand term we can equally well compute
2457 the right-hand term.
2459 \centerline{\includegraphics[width=
6cm,angle=
270]{plots/free1
}\hspace{2cm
}\includegraphics[width=
6cm,angle=
270]{plots/free2
}}
2460 \caption[Free energy cycles.
]{Free energy cycles.
{\bf A:
} to
2461 calculate $
\Delta G_
{12}$, the free energy difference between the
2462 binding of inhibitor
{\bf I
} to enzymes
{\bf E
} respectively
{\bf
2463 E$^
{\prime}$
}.
{\bf B:
} to calculate $
\Delta G_
{12}$, the free energy
2464 difference for binding of inhibitors
{\bf I
} respectively
{\bf I$^
{\prime}$
} to
2469 If we want to compute the difference in free energy of binding of two
2470 inhibitors
{\bf I
} and
{\bf I$^
{\prime}$
} to an enzyme
{\bf E
} (
\figref{free
}B)
2471 we can again use
\eqnref{ddg
} to compute the desired property.
2473 \newcommand{\sA}{^
{\mathrm{A
}}}
2474 \newcommand{\sB}{^
{\mathrm{B
}}}
2475 Free energy differences between two molecular species can
2476 be calculated in
{\gromacs} using the ``slow-growth'' method.
2477 Such free energy differences between different molecular species are
2478 physically meaningless, but they can be used to obtain meaningful
2479 quantities employing a thermodynamic cycle.
2480 The method requires a simulation during which the Hamiltonian of the
2481 system changes slowly from that describing one system (A) to that
2482 describing the other system (B). The change must be so slow that the
2483 system remains in equilibrium during the process; if that requirement
2484 is fulfilled, the change is reversible and a slow-growth simulation from B to A
2485 will yield the same results (but with a different sign) as a slow-growth
2486 simulation from A to B. This is a useful check, but the user should be
2487 aware of the danger that equality of forward and backward growth results does
2488 not guarantee correctness of the results.
2490 The required modification of the Hamiltonian $H$ is realized by making
2491 $H$ a function of a
\textit{coupling parameter
} $
\lambda:
2492 H=H(p,q;
\lambda)$ in such a way that $
\lambda=
0$ describes system A
2493 and $
\lambda=
1$ describes system B:
2495 H(p,q;
0)=H
\sA (p,q);~~~~ H(p,q;
1)=H
\sB (p,q).
2497 In
{\gromacs}, the functional form of the $
\lambda$-dependence is
2498 different for the various force-field contributions and is described
2499 in section
\secref{feia
}.
2501 The Helmholtz free energy $A$ is related to the
2502 partition function $Q$ of an $N,V,T$ ensemble, which is assumed to be
2503 the equilibrium ensemble generated by a MD simulation at constant
2504 volume and temperature. The generally more useful Gibbs free energy
2505 $G$ is related to the partition function $
\Delta$ of an $N,p,T$
2506 ensemble, which is assumed to be the equilibrium ensemble generated by
2507 a MD simulation at constant pressure and temperature:
2509 A(
\lambda) &=& -k_BT
\ln Q \\
2510 Q &=& c
\int\!\!
\int \exp[-
\beta H(p,q;
\lambda)
]\,dp\,dq \\
2511 G(
\lambda) &=& -k_BT
\ln \Delta \\
2512 \Delta &=& c
\int\!\!
\int\!\!
\int \exp[-
\beta H(p,q;
\lambda) -
\beta
2516 where $
\beta =
1/(k_BT)$ and $c = (N! h^
{3N
})^
{-
1}$.
2517 These integrals over phase space cannot be evaluated from a
2518 simulation, but it is possible to evaluate the derivative with
2519 respect to $
\lambda$ as an ensemble average:
2521 \frac{dA
}{d
\lambda} =
\frac{\int\!\!
\int (
\partial H/
\partial
2522 \lambda)
\exp[-
\beta H(p,q;
\lambda)
]\,dp\,dq
}{\int\!\!
\int \exp[-
\beta
2523 H(p,q;
\lambda)
]\,dp\,dq
} =
2524 \left\langle \frac{\partial H
}{\partial \lambda} \right\rangle_{NVT;
\lambda},
2526 with a similar relation for $dG/d
\lambda$ in the $N,p,T$
2527 ensemble. The difference in free energy between A and B can be found
2528 by integrating the derivative over $
\lambda$:
2530 A
\sB(V,T)-A
\sA(V,T) &=&
\int_0^
1 \left\langle \frac{\partial
2531 H
}{\partial \lambda} \right\rangle_{NVT;
\lambda} \,d
\lambda
2533 G
\sB(p,T)-G
\sA(p,T) &=&
\int_0^
1 \left\langle \frac{\partial
2534 H
}{\partial \lambda} \right\rangle_{NpT;
\lambda} \,d
\lambda.
2537 If one wishes to evaluate $G
\sB(p,T)-G
\sA(p,T)$,
2538 the natural choice is a constant-pressure simulation. However, this
2539 quantity can also be obtained from a slow-growth simulation at
2540 constant volume, starting with system A at pressure $p$ and volume $V$
2541 and ending with system B at pressure $p_B$, by applying the following
2542 small (but, in principle, exact) correction:
2545 A
\sB(V)-A
\sA(V) -
\int_p^
{p
\sB}[V
\sB(p')-V
]\,dp'
2547 Here we omitted the constant $T$ from the notation. This correction is
2548 roughly equal to $-
\frac{1}{2} (p
\sB-p)
\Delta V=(
\Delta V)^
2/(
2
2549 \kappa V)$, where $
\Delta V$ is the volume change at $p$ and $
\kappa$
2550 is the isothermal compressibility. This is usually
2551 small; for example, the growth of a water molecule from nothing
2552 in a bath of
1000 water molecules at constant volume would produce an
2553 additional pressure of as much as
22 bar, but a correction to the
2554 Helmholtz free energy of just -
1 kJ mol$^
{-
1}$.
%-20 J/mol.
2556 In Cartesian coordinates, the kinetic energy term in the Hamiltonian
2557 depends only on the momenta, and can be separately integrated and, in
2558 fact, removed from the equations. When masses do not change, there is
2559 no contribution from the kinetic energy at all; otherwise the
2560 integrated contribution to the free energy is $-
\frac{3}{2} k_BT
\ln
2561 (m
\sB/m
\sA)$.
{\bf Note
} that this is only true in the absence of constraints.
2563 \subsection{Thermodynamic integration
\index{thermodynamic integration
}\index{BAR
}\index{Bennett's acceptance ratio
}}
2564 {\gromacs} offers the possibility to integrate eq.~
\ref{eq:delA
} or
2565 eq.
\ref{eq:delG
} in one simulation over the full range from A to
2566 B. However, if the change is large and insufficient sampling can be
2567 expected, the user may prefer to determine the value of $
\langle
2568 dG/d
\lambda \rangle$ accurately at a number of well-chosen
2569 intermediate values of $
\lambda$. This can easily be done by setting
2570 the stepsize
{\tt delta_lambda
} to zero. Each simulation can be
2571 equilibrated first, and a proper error estimate can be made for each
2572 value of $dG/d
\lambda$ from the fluctuation of $
\partial H/
\partial
2573 \lambda$. The total free energy change is then determined afterward
2574 by an appropriate numerical integration procedure.
2576 {\gromacs} now also supports the use of Bennett's Acceptance Ratio~
\cite{Bennett1976
}
2577 for calculating values of $
\Delta$G for transformations from state A to state B using
2578 the program
{\tt \normindex{gmx bar
}}. The same data can also be used to calculate free
2579 energies using MBAR~
\cite{Shirts2008
}, though the analysis currently requires external tools from
2580 the external
{\tt pymbar
} package, at https://SimTK.org/home/pymbar.
2582 The $
\lambda$-dependence for the force-field contributions is
2583 described in detail in section
\secref{feia
}.
2584 % } % Brace matches ifthenelse test for gmxlite
2586 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2587 \section{Replica exchange
\index{replica exchange
}}
2588 Replica exchange molecular dynamics (
\normindex{REMD
})
2589 is a method that can be used to speed up
2590 the sampling of any type of simulation, especially if
2591 conformations are separated by relatively high energy barriers.
2592 It involves simulating multiple replicas of the same system
2593 at different temperatures and randomly exchanging the complete state
2594 of two replicas at regular intervals with the probability:
2596 P(
1 \leftrightarrow 2)=
\min\left(
1,
\exp\left[
2597 \left(
\frac{1}{k_B T_1
} -
\frac{1}{k_B T_2
}\right)(U_1 - U_2)
2600 where $T_1$ and $T_2$ are the reference temperatures and $U_1$ and $U_2$
2601 are the instantaneous potential energies of replicas
1 and
2 respectively.
2602 After exchange the velocities are scaled by $(T_1/T_2)^
{\pm0.5
}$
2603 and a neighbor search is performed the next step.
2604 This combines the fast sampling and frequent barrier-crossing
2605 of the highest temperature with correct Boltzmann sampling at
2606 all the different temperatures~
\cite{Hukushima96a,Sugita99
}.
2607 We only attempt exchanges for neighboring temperatures as the probability
2608 decreases very rapidly with the temperature difference.
2609 One should not attempt exchanges for all possible pairs in one step.
2610 If, for instance, replicas
1 and
2 would exchange, the chance of
2611 exchange for replicas
2 and
3 not only depends on the energies of
2612 replicas
2 and
3, but also on the energy of replica
1.
2613 In
{\gromacs} this is solved by attempting exchange for all ``odd''
2614 pairs on ``odd'' attempts and for all ``even'' pairs on ``even'' attempts.
2615 If we have four replicas:
0,
1,
2 and
3, ordered in temperature
2616 and we attempt exchange every
1000 steps, pairs
0-
1 and
2-
3
2617 will be tried at steps
1000,
3000 etc. and pair
1-
2 at steps
2000,
4000 etc.
2619 How should one choose the temperatures?
2620 The energy difference can be written as:
2622 U_1 - U_2 = N_
{df
} \frac{c
}{2} k_B (T_1 - T_2)
2624 where $N_
{df
}$ is the total number of degrees of freedom of one replica
2625 and $c$ is
1 for harmonic potentials and around
2 for protein/water systems.
2626 If $T_2 = (
1+
\epsilon) T_1$ the probability becomes:
2628 P(
1 \leftrightarrow 2)
2629 =
\exp\left( -
\frac{\epsilon^
2 c\,N_
{df
}}{2 (
1+
\epsilon)
} \right)
2630 \approx \exp\left(-
\epsilon^
2 \frac{c
}{2} N_
{df
} \right)
2632 Thus for a probability of $e^
{-
2}\approx 0.135$
2633 one obtains $
\epsilon \approx 2/
\sqrt{c\,N_
{df
}}$.
2634 With all bonds constrained one has $N_
{df
} \approx 2\, N_
{atoms
}$
2635 and thus for $c$ =
2 one should choose $
\epsilon$ as $
1/
\sqrt{N_
{atoms
}}$.
2636 However there is one problem when using pressure coupling. The density at
2637 higher temperatures will decrease, leading to higher energy~
\cite{Seibert2005a
},
2638 which should be taken into account. The
{\gromacs} website features a
2639 so-called ``REMD calculator,'' that lets you type in the temperature range and
2640 the number of atoms, and based on that proposes a set of temperatures.
2642 An extension to the REMD for the isobaric-isothermal ensemble was
2643 proposed by Okabe
{\em et al.
}~
\cite{Okabe2001a
}. In this work the
2644 exchange probability is modified to:
2646 P(
1 \leftrightarrow 2)=
\min\left(
1,
\exp\left[
2647 \left(
\frac{1}{k_B T_1
} -
\frac{1}{k_B T_2
}\right)(U_1 - U_2) +
2648 \left(
\frac{P_1
}{k_B T_1
} -
\frac{P_2
}{k_B T_2
}\right)
\left(V_1-V_2
\right)
2651 where $P_1$ and $P_2$ are the respective reference pressures and $V_1$ and
2652 $V_2$ are the respective instantaneous volumes in the simulations.
2653 In most cases the differences in volume are so small that the second
2654 term is negligible. It only plays a role when the difference between
2655 $P_1$ and $P_2$ is large or in phase transitions.
2657 Hamiltonian replica exchange is also supported in
{\gromacs}. In
2658 Hamiltonian replica exchange, each replica has a different
2659 Hamiltonian, defined by the free energy pathway specified for the simulation. The
2660 exchange probability to maintain the correct ensemble probabilities is:
2661 \beq P(
1 \leftrightarrow 2)=
\min\left(
1,
\exp\left[
2662 \left(
\frac{1}{k_B T
} -
\frac{1}{k_B T
}\right)((U_1(x_2) - U_1(x_1)) + (U_2(x_1) - U_2(x_2)))
2666 The separate Hamiltonians are defined by the free energy functionality
2667 of
{\gromacs}, with swaps made between the different values of
2668 $
\lambda$ defined in the mdp file.
2670 Hamiltonian and temperature replica exchange can also be performed
2671 simultaneously, using the acceptance criteria:
2673 P(
1 \leftrightarrow 2)=
\min\left(
1,
\exp\left[
2674 \left(
\frac{1}{k_B T
} -
\right)(
\frac{U_1(x_2) - U_1(x_1)
}{k_B T_1
} +
\frac{U_2(x_1) - U_2(x_2)
}{k_B T_2
})
2678 Gibbs sampling replica exchange has also been implemented in
2679 {\gromacs}~
\cite{Chodera2011
}. In Gibbs sampling replica exchange, all
2680 possible pairs are tested for exchange, allowing swaps between
2681 replicas that are not neighbors.
2683 Gibbs sampling replica exchange requires no additional potential
2684 energy calculations. However there is an additional communication
2685 cost in Gibbs sampling replica exchange, as for some permutations,
2686 more than one round of swaps must take place. In some cases, this
2687 extra communication cost might affect the efficiency.
2689 All replica exchange variants are options of the
{\tt mdrun
}
2690 program. It will only work when MPI is installed, due to the inherent
2691 parallelism in the algorithm. For efficiency each replica can run on a
2692 separate rank. See the manual page of
{\tt mdrun
} on how to use these
2695 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2697 \section{Essential Dynamics sampling
\index{essential dynamics
}\index{principal component analysis
}\seeindexquiet{PCA
}{covariance analysis
}}
2698 The results from Essential Dynamics (see
\secref{covanal
})
2699 of a protein can be used to guide MD simulations. The idea is that
2700 from an initial MD simulation (or from other sources) a definition of
2701 the collective fluctuations with largest amplitude is obtained. The
2702 position along one or more of these collective modes can be
2703 constrained in a (second) MD simulation in a number of ways for
2704 several purposes. For example, the position along a certain mode may
2705 be kept fixed to monitor the average force (free-energy gradient) on
2706 that coordinate in that position. Another application is to enhance
2707 sampling efficiency with respect to usual MD
2708 \cite{Degroot96a,Degroot96b
}. In this case, the system is encouraged
2709 to sample its available configuration space more systematically than
2710 in a diffusion-like path that proteins usually take.
2712 Another possibility to enhance sampling is
\normindex{flooding
}.
2713 Here a flooding potential is added to certain
2714 (collective) degrees of freedom to expel the system out
2715 of a region of phase space
\cite{Lange2006a
}.
2717 The procedure for essential dynamics sampling or flooding is as follows.
2718 First, the eigenvectors and eigenvalues need to be determined
2719 using covariance analysis (
{\tt gmx covar
})
2720 or normal-mode analysis (
{\tt gmx nmeig
}).
2721 Then, this information is fed into
{\tt make_edi
},
2722 which has many options for selecting vectors and setting parameters,
2723 see
{\tt gmx make_edi -h
}.
2724 The generated
{\tt edi
} input file is then passed to
{\tt mdrun
}.
2726 % } % Brace matches ifthenelse test for gmxlite
2728 % \ifthenelse{\equal{\gmxlite}{1}}{}{
2729 \section{\normindex{Expanded Ensemble
}}
2731 In an expanded ensemble simulation~
\cite{Lyubartsev1992
}, both the coordinates and the
2732 thermodynamic ensemble are treated as configuration variables that can
2733 be sampled over. The probability of any given state can be written as:
2735 P(
\vec{x
},k)
\propto \exp\left(-
\beta_k U_k + g_k
\right),
2737 where $
\beta_k =
\frac{1}{k_B T_k
}$ is the $
\beta$ corresponding to the $k$th
2738 thermodynamic state, and $g_k$ is a user-specified weight factor corresponding
2739 to the $k$th state. This space is therefore a
{\em mixed
},
{\em generalized
}, or
{\em
2740 expanded
} ensemble which samples from multiple thermodynamic
2741 ensembles simultaneously. $g_k$ is chosen to give a specific weighting
2742 of each subensemble in the expanded ensemble, and can either be fixed,
2743 or determined by an iterative procedure. The set of $g_k$ is
2744 frequently chosen to give each thermodynamic ensemble equal
2745 probability, in which case $g_k$ is equal to the free energy in
2746 non-dimensional units, but they can be set to arbitrary values as
2747 desired. Several different algorithms can be used to equilibrate
2748 these weights, described in the mdp option listings.
2749 % } % Brace matches ifthenelse test for gmxlite
2751 In
{\gromacs}, this space is sampled by alternating sampling in the $k$
2752 and $
\vec{x
}$ directions. Sampling in the $
\vec{x
}$ direction is done
2753 by standard molecular dynamics sampling; sampling between the
2754 different thermodynamics states is done by Monte Carlo, with several
2755 different Monte Carlo moves supported. The $k$ states can be defined
2756 by different temperatures, or choices of the free energy $
\lambda$
2757 variable, or both. Expanded ensemble simulations thus represent a
2758 serialization of the replica exchange formalism, allowing a single
2759 simulation to explore many thermodynamic states.
2763 \section{Parallelization
\index{parallelization
}}
2764 The CPU time required for a simulation can be reduced by running the simulation
2765 in parallel over more than one core.
2766 Ideally, one would want to have linear scaling: running on $N$ cores
2767 makes the simulation $N$ times faster. In practice this can only be
2768 achieved for a small number of cores. The scaling will depend
2769 a lot on the algorithms used. Also, different algorithms can have different
2770 restrictions on the interaction ranges between atoms.
2772 \section{Domain decomposition
\index{domain decomposition
}}
2773 Since most interactions in molecular simulations are local,
2774 domain decomposition is a natural way to decompose the system.
2775 In domain decomposition, a spatial domain is assigned to each rank,
2776 which will then integrate the equations of motion for the particles
2777 that currently reside in its local domain. With domain decomposition,
2778 there are two choices that have to be made: the division of the unit cell
2779 into domains and the assignment of the forces to domains.
2780 Most molecular simulation packages use the half-shell method for assigning
2781 the forces. But there are two methods that always require less communication:
2782 the eighth shell~
\cite{Liem1991
} and the midpoint~
\cite{Shaw2006
} method.
2783 {\gromacs} currently uses the eighth shell method, but for certain systems
2784 or hardware architectures it might be advantageous to use the midpoint
2785 method. Therefore, we might implement the midpoint method in the future.
2786 Most of the details of the domain decomposition can be found
2787 in the
{\gromacs} 4 paper~
\cite{Hess2008b
}.
2789 \subsection{Coordinate and force communication
}
2790 In the most general case of a triclinic unit cell,
2791 the space in divided with a
1-,
2-, or
3-D grid in parallelepipeds
2792 that we call domain decomposition cells.
2793 Each cell is assigned to a particle-particle rank.
2794 The system is partitioned over the ranks at the beginning
2795 of each MD step in which neighbor searching is performed.
2796 Since the neighbor searching is based on charge groups, charge groups
2797 are also the units for the domain decomposition.
2798 Charge groups are assigned to the cell where their center of geometry resides.
2799 Before the forces can be calculated, the coordinates from some
2800 neighboring cells need to be communicated,
2801 and after the forces are calculated, the forces need to be communicated
2802 in the other direction.
2803 The communication and force assignment is based on zones that
2804 can cover one or multiple cells.
2805 An example of a zone setup is shown in
\figref{ddcells
}.
2808 \centerline{\includegraphics[width=
6cm
]{plots/dd-cells
}}
2810 A non-staggered domain decomposition grid of
3$
\times$
2$
\times$
2 cells.
2811 Coordinates in zones
1 to
7 are communicated to the corner cell
2812 that has its home particles in zone
0.
2813 $r_c$ is the cut-off radius.
2818 The coordinates are communicated by moving data along the ``negative''
2819 direction in $x$, $y$ or $z$ to the next neighbor. This can be done in one
2820 or multiple pulses. In
\figref{ddcells
} two pulses in $x$ are required,
2821 then one in $y$ and then one in $z$. The forces are communicated by
2822 reversing this procedure. See the
{\gromacs} 4 paper~
\cite{Hess2008b
}
2823 for details on determining which non-bonded and bonded forces
2824 should be calculated on which rank.
2826 \subsection{Dynamic load balancing
\swapindexquiet{dynamic
}{load balancing
}}
2827 When different ranks have a different computational load
2828 (load imbalance), all ranks will have to wait for the one
2829 that takes the most time. One would like to avoid such a situation.
2830 Load imbalance can occur due to four reasons:
2832 \item inhomogeneous particle distribution
2833 \item inhomogeneous interaction cost distribution (charged/uncharged,
2834 water/non-water due to
{\gromacs} water innerloops)
2835 \item statistical fluctuation (only with small particle numbers)
2836 \item differences in communication time, due to network topology and/or other jobs on the machine interfering with our communication
2838 So we need a dynamic load balancing algorithm
2839 where the volume of each domain decomposition cell
2840 can be adjusted
{\em independently
}.
2841 To achieve this, the
2- or
3-D domain decomposition grids need to be
2842 staggered.
\figref{ddtric
} shows the most general case in
2-D.
2843 Due to the staggering, one might require two distance checks
2844 for deciding if a charge group needs to be communicated:
2845 a non-bonded distance and a bonded distance check.
2848 \centerline{\includegraphics[width=
7cm
]{plots/dd-tric
}}
2850 The zones to communicate to the rank of zone
0,
2851 see the text for details. $r_c$ and $r_b$ are the non-bonded
2852 and bonded cut-off radii respectively, $d$ is an example
2853 of a distance between following, staggered boundaries of cells.
2858 By default,
{\tt mdrun
} automatically turns on the dynamic load
2859 balancing during a simulation when the total performance loss
2860 due to the force calculation imbalance is
2\% or more.
2861 {\bf Note
} that the reported force load imbalance numbers might be higher,
2862 since the force calculation is only part of work that needs to be done
2863 during an integration step.
2864 The load imbalance is reported in the log file at log output steps
2865 and when the
{\tt -v
} option is used also on screen.
2866 The average load imbalance and the total performance loss
2867 due to load imbalance are reported at the end of the log file.
2869 There is one important parameter for the dynamic load balancing,
2870 which is the minimum allowed scaling. By default, each dimension
2871 of the domain decomposition cell can scale down by at least
2872 a factor of
0.8. For
3-D domain decomposition this allows cells
2873 to change their volume by about a factor of
0.5, which should allow
2874 for compensation of a load imbalance of
100\%.
2875 The minimum allowed scaling can be changed with the
{\tt -dds
}
2876 option of
{\tt mdrun
}.
2878 The load imbalance is measured by timing a single region of the MD step
2879 on each MPI rank. This region can not include MPI communication, as
2880 timing of MPI calls does not allow separating wait due to imbalance
2881 from actual communication.
2882 The domain volumes are then scaled, with under-relaxation, inversely
2883 proportional with the measured time. This procedure will decrease the
2884 load imbalance when the change in load in the measured region correlates
2885 with the change in domain volume and the load outside
2886 the measured region does not depend strongly on the domain volume.
2887 In CPU-only simulations, the load is measured between the coordinate
2888 and the force communication. In simulations with non-bonded
2889 work on GPUs, we overlap communication and
2890 work on the CPU with calculation on the GPU. Therefore we
2891 measure from the last communication before the force calculation to
2892 when the CPU or GPU is finished, whichever is last.
2893 When not using PME ranks, we subtract the time in PME from the CPU time,
2894 as this includes MPI calls and the PME load is independent of domain size.
2895 This generally works well, unless the non-bonded load is low and there is
2896 imbalance in the bonded interactions. Then two issues can arise.
2897 Dynamic load balancing can increase the imbalance in update and constraints
2898 and with PME the coordinate and force redistribution time can go up
2899 significantly. Although dynamic load balancing
2900 can significantly improve performance in cases where there is imbalance in
2901 the bonded interactions on the CPU, there are many situations in which
2902 some domains continue decreasing in size and the load imbalance increases
2903 and/or PME coordinate and force redistribution cost increases significantly.
2904 As of version
2016.1,
{\tt mdrun
} disables the dynamic load balancing when
2905 measurement indicates that it deteriorates performance. This means that in most
2906 cases the user will get good performance with the default, automated
2907 dynamic load balancing setting.
2909 \subsection{Constraints in parallel
\index{constraints
}}
2910 \label{subsec:plincs
}
2911 Since with domain decomposition parts of molecules can reside
2912 on different ranks, bond constraints can cross cell boundaries.
2913 Therefore a parallel constraint algorithm is required.
2914 {\gromacs} uses the
\normindex{P-LINCS
} algorithm~
\cite{Hess2008a
},
2915 which is the parallel version of the
\normindex{LINCS
} algorithm~
\cite{Hess97
}
2916 % \ifthenelse{\equal{\gmxlite}{1}}
2918 {(see
\ssecref{lincs
}).
}
2919 The P-LINCS procedure is illustrated in
\figref{plincs
}.
2920 When molecules cross the cell boundaries, atoms in such molecules
2921 up to (
{\tt lincs_order +
1}) bonds away are communicated over the cell boundaries.
2922 Then, the normal LINCS algorithm can be applied to the local bonds
2923 plus the communicated ones. After this procedure, the local bonds
2924 are correctly constrained, even though the extra communicated ones are not.
2925 One coordinate communication step is required for the initial LINCS step
2926 and one for each iteration. Forces do not need to be communicated.
2929 \centerline{\includegraphics[width=
6cm
]{plots/par-lincs2
}}
2931 Example of the parallel setup of P-LINCS with one molecule
2932 split over three domain decomposition cells, using a matrix
2933 expansion order of
3.
2934 The top part shows which atom coordinates need to be communicated
2935 to which cells. The bottom parts show the local constraints (solid)
2936 and the non-local constraints (dashed) for each of the three cells.
2941 \subsection{Interaction ranges
}
2942 Domain decomposition takes advantage of the locality of interactions.
2943 This means that there will be limitations on the range of interactions.
2944 By default,
{\tt mdrun
} tries to find the optimal balance between
2945 interaction range and efficiency. But it can happen that a simulation
2946 stops with an error message about missing interactions,
2947 or that a simulation might run slightly faster with shorter
2948 interaction ranges. A list of interaction ranges
2949 and their default values is given in
\tabref{dd_ranges
}.
2953 \begin{tabular
}{|c|c|ll|
}
2955 interaction & range & option & default \\
2957 non-bonded & $r_c$ = max($r_
{\mathrm{list
}}$,$r_
{\mathrm{VdW
}}$,$r_
{\mathrm{Coul
}}$) &
{\tt mdp
} file & \\
2958 two-body bonded & max($r_
{\mathrm{mb
}}$,$r_c$) &
{\tt mdrun -rdd
} & starting conf. +
10\% \\
2959 multi-body bonded & $r_
{\mathrm{mb
}}$ &
{\tt mdrun -rdd
} & starting conf. +
10\% \\
2960 constraints & $r_
{\mathrm{con
}}$ &
{\tt mdrun -rcon
} & est. from bond lengths \\
2961 virtual sites & $r_
{\mathrm{con
}}$ &
{\tt mdrun -rcon
} &
0 \\
2965 \caption{The interaction ranges with domain decomposition.
}
2966 \label{tab:dd_ranges
}
2969 In most cases the defaults of
{\tt mdrun
} should not cause the simulation
2970 to stop with an error message of missing interactions.
2971 The range for the bonded interactions is determined from the distance
2972 between bonded charge-groups in the starting configuration, with
10\% added
2973 for headroom. For the constraints, the value of $r_
{\mathrm{con
}}$ is determined by
2974 taking the maximum distance that (
{\tt lincs_order +
1}) bonds can cover
2975 when they all connect at angles of
120 degrees.
2976 The actual constraint communication is not limited by $r_
{\mathrm{con
}}$,
2977 but by the minimum cell size $L_C$, which has the following lower limit:
2979 L_C
\geq \max(r_
{\mathrm{mb
}},r_
{\mathrm{con
}})
2981 Without dynamic load balancing the system is actually allowed to scale
2982 beyond this limit when pressure scaling is used.
2983 {\bf Note
} that for triclinic boxes, $L_C$ is not simply the box diagonal
2984 component divided by the number of cells in that direction,
2985 rather it is the shortest distance between the triclinic cells borders.
2986 For rhombic dodecahedra this is a factor of $
\sqrt{3/
2}$ shorter
2989 When $r_
{\mathrm{mb
}} > r_c$,
{\tt mdrun
} employs a smart algorithm to reduce
2990 the communication. Simply communicating all charge groups within
2991 $r_
{\mathrm{mb
}}$ would increase the amount of communication enormously.
2992 Therefore only charge-groups that are connected by bonded interactions
2993 to charge groups which are not locally present are communicated.
2994 This leads to little extra communication, but also to a slightly
2995 increased cost for the domain decomposition setup.
2996 In some cases,
{\eg} coarse-grained simulations with a very short cut-off,
2997 one might want to set $r_
{\mathrm{mb
}}$ by hand to reduce this cost.
2999 \subsection{Multiple-Program, Multiple-Data PME parallelization
\index{PME
}}
3000 \label{subsec:mpmd_pme
}
3001 Electrostatics interactions are long-range, therefore special
3002 algorithms are used to avoid summation over many atom pairs.
3003 In
{\gromacs} this is usually
3004 % \ifthenelse{\equal{\gmxlite}{1}}
3006 {PME (
\secref{pme
}).
}
3007 Since with PME all particles interact with each other, global communication
3008 is required. This will usually be the limiting factor for
3009 scaling with domain decomposition.
3010 To reduce the effect of this problem, we have come up with
3011 a Multiple-Program, Multiple-Data approach~
\cite{Hess2008b
}.
3012 Here, some ranks are selected to do only the PME mesh calculation,
3013 while the other ranks, called particle-particle (PP) ranks,
3014 do all the rest of the work.
3015 For rectangular boxes the optimal PP to PME rank ratio is usually
3:
1,
3016 for rhombic dodecahedra usually
2:
1.
3017 When the number of PME ranks is reduced by a factor of
4, the number
3018 of communication calls is reduced by about a factor of
16.
3019 Or put differently, we can now scale to
4 times more ranks.
3020 In addition, for modern
4 or
8 core machines in a network,
3021 the effective network bandwidth for PME is quadrupled,
3022 since only a quarter of the cores will be using the network connection
3023 on each machine during the PME calculations.
3026 \centerline{\includegraphics[width=
12cm
]{plots/mpmd-pme
}}
3028 Example of
8 ranks without (left) and with (right) MPMD.
3029 The PME communication (red arrows) is much higher on the left
3030 than on the right. For MPMD additional PP - PME coordinate
3031 and force communication (blue arrows) is required,
3032 but the total communication complexity is lower.
3033 \label{fig:mpmd_pme
}
3037 {\tt mdrun
} will by default interleave the PP and PME ranks.
3038 If the ranks are not number consecutively inside the machines,
3039 one might want to use
{\tt mdrun -ddorder pp_pme
}.
3040 For machines with a real
3-D torus and proper communication software
3041 that assigns the ranks accordingly one should use
3042 {\tt mdrun -ddorder cartesian
}.
3044 To optimize the performance one should usually set up the cut-offs
3045 and the PME grid such that the PME load is
25 to
33\% of the total
3046 calculation load.
{\tt grompp
} will print an estimate for this load
3047 at the end and also
{\tt mdrun
} calculates the same estimate
3048 to determine the optimal number of PME ranks to use.
3049 For high parallelization it might be worthwhile to optimize
3050 the PME load with the
{\tt mdp
} settings and/or the number
3051 of PME ranks with the
{\tt -npme
} option of
{\tt mdrun
}.
3052 For changing the electrostatics settings it is useful to know
3053 the accuracy of the electrostatics remains nearly constant
3054 when the Coulomb cut-off and the PME grid spacing are scaled
3056 {\bf Note
} that it is usually better to overestimate than to underestimate
3057 the number of PME ranks, since the number of PME ranks is smaller
3058 than the number of PP ranks, which leads to less total waiting time.
3060 The PME domain decomposition can be
1-D or
2-D along the $x$ and/or
3061 $y$ axis.
2-D decomposition is also known as
\normindex{pencil decomposition
} because of
3062 the shape of the domains at high parallelization.
3063 1-D decomposition along the $y$ axis can only be used when
3064 the PP decomposition has only
1 domain along $x$.
2-D PME decomposition
3065 has to have the number of domains along $x$ equal to the number of
3066 the PP decomposition.
{\tt mdrun
} automatically chooses
1-D or
2-D
3067 PME decomposition (when possible with the total given number of ranks),
3068 based on the minimum amount of communication for the coordinate redistribution
3069 in PME plus the communication for the grid overlap and transposes.
3070 To avoid superfluous communication of coordinates and forces
3071 between the PP and PME ranks, the number of DD cells in the $x$
3072 direction should ideally be the same or a multiple of the number
3073 of PME ranks. By default,
{\tt mdrun
} takes care of this issue.
3075 \subsection{Domain decomposition flow chart
}
3076 In
\figref{dd_flow
} a flow chart is shown for domain decomposition
3077 with all possible communication for different algorithms.
3078 For simpler simulations, the same flow chart applies,
3079 without the algorithms and communication for
3080 the algorithms that are not used.
3083 \centerline{\includegraphics[width=
12cm
]{plots/flowchart
}}
3085 Flow chart showing the algorithms and communication (arrows)
3086 for a standard MD simulation with virtual sites, constraints
3087 and separate PME-mesh ranks.
3093 \section{Implicit solvation
\index{implicit solvation
}\index{Generalized Born methods
}}
3095 Implicit solvent models provide an efficient way of representing
3096 the electrostatic effects of solvent molecules, while saving a
3097 large piece of the computations involved in an accurate, aqueous
3098 description of the surrounding water in molecular dynamics simulations.
3099 Implicit solvation models offer several advantages compared with
3100 explicit solvation, including eliminating the need for the equilibration of water
3101 around the solute, and the absence of viscosity, which allows the protein
3102 to more quickly explore conformational space.
3104 Implicit solvent calculations in
{\gromacs} can be done using the
3105 generalized Born-formalism, and the Still~
\cite{Still97
}, HCT~
\cite{Truhlar96
},
3106 and OBC~
\cite{Case04
} models are available for calculating the Born radii.
3108 Here, the free energy $G_
{\mathrm{solv
}}$ of solvation is the sum of three terms,
3109 a solvent-solvent cavity term ($G_
{\mathrm{cav
}}$), a solute-solvent van der
3110 Waals term ($G_
{\mathrm{vdw
}}$), and finally a solvent-solute electrostatics
3111 polarization term ($G_
{\mathrm{pol
}}$).
3113 The sum of $G_
{\mathrm{cav
}}$ and $G_
{\mathrm{vdw
}}$ corresponds to the (non-polar)
3114 free energy of solvation for a molecule from which all charges
3115 have been removed, and is commonly called $G_
{\mathrm{np
}}$,
3116 calculated from the total solvent accessible surface area
3117 multiplied with a surface tension.
3118 The total expression for the solvation free energy then becomes:
3121 G_
{\mathrm{solv
}} = G_
{\mathrm{np
}} + G_
{\mathrm{pol
}}
3125 Under the generalized Born model, $G_
{\mathrm{pol
}}$ is calculated from the generalized Born equation~
\cite{Still97
}:
3128 G_
{\mathrm{pol
}} =
\left(
1-
\frac{1}{\epsilon}\right)
\sum_{i=
1}^n
\sum_{j>i
}^n
\frac {q_i q_j
}{\sqrt{r^
2_
{ij
} + b_i b_j
\exp\left(
\frac{-r^
2_
{ij
}}{4 b_i b_j
}\right)
}}
3129 \label{eqn:gb_still
}
3132 In
{\gromacs}, we have introduced the substitution~
\cite{Larsson10
}:
3135 c_i=
\frac{1}{\sqrt{b_i
}}
3136 \label{eqn:gb_subst
}
3139 which makes it possible to introduce a cheap transformation to a new
3140 variable $x$ when evaluating each interaction, such that:
3143 x=
\frac{r_
{ij
}}{\sqrt{b_i b_j
}} = r_
{ij
} c_i c_j
3144 \label{eqn:gb_subst2
}
3147 In the end, the full re-formulation of~
\ref{eqn:gb_still
} becomes:
3150 G_
{\mathrm{pol
}} =
\left(
1-
\frac{1}{\epsilon}\right)
\sum_{i=
1}^n
\sum_{j>i
}^n
\frac{q_i q_j
}{\sqrt{b_i b_j
}} ~
\xi (x) =
\left(
1-
\frac{1}{\epsilon}\right)
\sum_{i=
1}^n q_i c_i
\sum_{j>i
}^n q_j c_j~
\xi (x)
3151 \label{eqn:gb_final
}
3154 The non-polar part ($G_
{\mathrm{np
}}$) of Equation~
\ref{eqn:gb_solv
} is calculated
3155 directly from the Born radius of each atom using a simple ACE type
3156 approximation by Schaefer
{\em et al.
}~
\cite{Karplus98
}, including a
3157 simple loop over all atoms.
3158 This requires only one extra solvation parameter, independent of atom type,
3159 but differing slightly between the three Born radii models.
3161 % LocalWords: GROningen MAchine BIOSON Groningen GROMACS Berendsen der Spoel
3162 % LocalWords: Drunen Comp Phys Comm ROck NS FFT pbc EM ifthenelse gmxlite ff
3163 % LocalWords: octahedra triclinic Ewald PME PPPM trjconv xy solvated
3164 % LocalWords: boxtypes boxshapes editconf Lennard COM XTC TNG kT defunits
3165 % LocalWords: Boltzmann's Mueller nb int mdrun chargegroup simplerc prefactor
3166 % LocalWords: pme waterloops CH NH CO df com virial integrator Verlet vverlet
3167 % LocalWords: integrators ref timepoint timestep timesteps mdp md vv avek NVE
3168 % LocalWords: NVT off's leapfrogv lll LR rmfast SPC fs Nos physicality ps GMX
3169 % LocalWords: Tcoupling nonergodic thermostatting NOSEHOOVER algorithmes ij yx
3170 % LocalWords: Parrinello Rahman rescales atm anisotropically ccc xz zx yy yz
3171 % LocalWords: zy zz se barostat compressibilities MTTK NPT Martyna al isobaric
3172 % LocalWords: Tuckerman vir PV fkT iLt iL Liouville NHC Eq baro mu trj mol bc
3173 % LocalWords: freezegroup Shannon's polarizability Overhauser barostats iLn KE
3174 % LocalWords: negligibly thermostatted Tobias rhombic maxwell et xtc tng TC rlist
3175 % LocalWords: waals LINCS holonomic plincs lincs unc ang SA Langevin SD amu BD
3176 % LocalWords: bfgs Broyden Goldfarb Shanno mkT kJ DFLEXIBLE Nocedal diag nmeig
3177 % LocalWords: diagonalization anaeig nmens covanal ddg feia BT dp dq pV dV dA
3178 % LocalWords: NpT eq stepsize REMD constrainted website Okabe MPI covar edi dd
3179 % LocalWords: progman NMR ddcells innerloops ddtric tric dds rdd conf rcon est
3180 % LocalWords: mb PP MPMD ddorder pp cartesian grompp npme parallelizable edr
3181 % LocalWords: macromolecule nstlist vacuo parallelization dof indices MBAR AVX
3182 % LocalWords: TOL numerics parallelized eigenvectors dG parallelepipeds VdW np
3183 % LocalWords: Coul multi solvation HCT OBC solv cav vdw Schaefer symplectic dt
3184 % LocalWords: pymbar multinode subensemble Monte solute subst groupconcept GPU
3185 % LocalWords: dodecahedron octahedron dodecahedra equilibration usinggroups nm
3186 % LocalWords: topologies rlistlong CUDA GPUs rcoulomb SIMD BlueGene FPUs erfc
3187 % LocalWords: cutoffschemesupport unbuffered bondeds OpenMP ewald rtol
3188 % LocalWords: verletdrift peptide RMS rescaling ergodicity ergodic discretized
3189 % LocalWords: isothermal compressibility isotropically anisotropic iteratively
3190 % LocalWords: incompressible integrations translational biomolecules NMA PCA
3191 % LocalWords: Bennett's equilibrated Hamiltonians covariance equilibrate
3192 % LocalWords: inhomogeneous conformational online other's th