# Theoretical background¶

## Mapping¶

The mapping is an operator that establishes a link between the atomistic and coarse-grained representations of the system. An atomistic system is described by specifying the values of the Cartesian coordinates and momenta

of the \(n\) atoms in the system. [1] On a coarse-grained level, the coordinates and momenta are specified by the positions and momenta of CG sites

Note that capitalized symbols are used for the CG sites while lower case letters are used for the atomistic system.

The mapping operator \({\mathbf c}_I\) is defined by a matrix for each bead \(I\) and links the two descriptions

for all \(I = 1,\dots,N\).

If an atomistic system is translated by a constant vector, the corresponding coarse-grained system is also translated by the same vector. This implies that, for all \(I\),

In some cases it is useful to define the CG mapping in such a way that
certain atoms belong to several CG beads at the same
time [Fritz:2009]. Following
ref. [Noid:2008.1], we define two sets of atoms for
each of the \(N\) CG beads. For each site \(I\), a set of
*involved* atoms is defined as

An atom \(i\) in the atomistic model is involved in a CG site, *I*,
if and only if this atom provides a nonzero contribution to the sum in
the equation above.

A set of *specific* atoms is defined as

In other words, atom \(i\) is specific to site \(I\) if and only if this atom is involved in site \(I\) and is not involved in the definition of any other site.

The CG model will generate an equilibrium distribution of momenta that
is consistent with an underlying atomistic model if all the atoms are
*specific* and if the mass of the \(I^\text{th}\) CG site is given
by [Noid:2008.1]

If all atoms are specific and the center of mass of a bead is used for mapping, then \(c_{Ii} = \frac{m_i}{M_I}\), and the condition is automatically satisfied.

Footnotes

## Boltzmann inversion¶

Boltzmann inversion is mostly used for *bonded* potentials, such as
bonds, angles, and torsions [Tschoep:1998]. Boltzmann
inversion is structure-based and only requires positions of atoms.

The idea of Boltzmann inversion stems from the fact that in a canonical
ensemble *independent* degrees of freedom \(q\) obey the Boltzmann
distribution, i. e.

where is a partition function, . Once \(P(q)\) is known, one can obtain the coarse-grained potential, which in this case is a potential of mean force, by inverting the probability distribution \(P(q)\) of a variable \(q\), which is either a bond length, bond angle, or torsion angle

The normalization factor \(Z\) is not important since it would only enter the coarse-grained potential \(U(q)\) as an irrelevant additive constant.

Note that the histograms for the bonds \(H_r(r)\), angles \(H_\theta(\theta)\), and torsion angles \(H_\varphi(\varphi)\) have to be rescaled in order to obtain the volume normalized distribution functions \(P_r(r)\), \(P_\theta(\theta)\), and \(P_\varphi(\varphi)\), respectively,

where \(r\) is the bond length \(r\), \(\theta\) is the bond angle, and \(\varphi\) is the torsion angle. The bonded coarse-grained potential can then be written as a sum of distribution functions

On the technical side, the implementation of the Boltzmann inversion
method requires *smoothing* of \(U(q)\) to provide a continuous
force. Splines can be used for this purpose. Poorly and unsampled
regions, that is regions with high \(U(q)\), shall be
*extrapolated*. Since the contribution of these regions to the canonical
density of states is small, the exact shape of the extrapolation is less
important.

Another crucial issue is the cross-correlation of the coarse-grained degrees of freedom. Independence of the coarse-grained degrees of freedom is the main assumption that allows factorization of the probability distribution and the potential as in the above equation. Hence, one has to carefully check whether this assumption holds in practice. This can be done by performing coarse-grained simulations and comparing cross-correlations for all pairs of degrees of freedom in atomistic and coarse-grained resolution, e. g. using a two-dimensional histogram, analogous to a Ramachandran plot. [2]

Footnotes

Checking the linear correlation coefficient does not guarantee statistical independence of variables, for example \(c(x, x^2)=0\) if \(x\) has a symmetric probability density \(P(x) = P(-x)\). This case is often encountered in systems used for coarse-graining.

### Separation of bonded and non-bonded degrees of freedom¶

When coarse-graining polymeric systems, it is convenient to treat bonded and non-bonded interactions separately [Tschoep:1998]. In this case, sampling of the atomistic system shall be performed on a special system where non-bonded interactions are artificially removed, so that the non-bonded interactions in the reference system do not contribute to the bonded interactions of the coarse-grained model.

This can be done by employing exclusion lists using with the option
`--excl`

. This is described in detail in Generating exclusion lists.

## Iterative methods¶

Iterative workflow control is essential for the IBI and IMC methods. The general idea of iterative workflow is sketched in the block-scheme above. A run starts with an initial guess during the global initialization phase. This guess is used for the first sampling step, followed by an update of the potential. The update itself often requires additional postprocessing such as smoothing, interpolation, extrapolation or fitting. Different methods are available to update the potential, for instance Iterative Boltzmann Inversion (see Iterative Boltzmann Inversion) or Inverse Monte Carlo (see Inverse Monte Carlo). The whole procedure is then iterated until a convergence criterion is satisfied.

## Iterative Boltzmann Inversion¶

Iterative Boltzmann inversion (IBI) is a natural extension of the Boltzmann inversion method. Since the goal of the coarse-grained model is to reproduce the distribution functions of the reference system as accurately as possible, one can also iteratively refine the coarse-grained potentials using some numerical scheme.

In IBIthe potential update \(\Delta U\) is given by [Reith:2003]

Here \(\lambda \in (0,1]\) is a numerical factor which helps to stabilize the scheme.

The convergence is reached as soon as the distribution function \(P^{(n)}\) matches the reference distribution function \(P_{\rm ref}\), or, in other words, the potential of mean force, \(U_\text{PMF}^{(n)}\), converges to the reference potential of mean force.

IBIcan be used to refine both bonded and non-bonded potentials. It is primarily used for simple fluids with the aim to reproduce the radial distribution function of the reference system in order to obtain non-bonded interactions. On the implementation side, IBIhas the same issues as the inverse Boltzmann method, i. e. smoothing and extrapolation of the potential must be used.

## Inverse Monte Carlo¶

Inverse Monte Carlo (IMC) is an iterative scheme which additionally includes cross correlations of distributions. A detailed derivation of the IMCmethod can be found in ref. [Lyubartsev:1995].

The potential update \(\Delta U\) of the IMCmethod is calculated by solving a set of linear equations

where

and \(S\) the histogram of a coarse-grained variable of interest. For example, in case of coarse-graining of the non-bonded interactions which depend only on the distance \(r_{ij}\) between particles \(i\) and \(j\) and assuming that the interaction potential is short-ranged, i.e. \(U(r_{ij})=0\) if \(r_{ij} \ge r_{\text{cut} }\), the average value of \(S_{\alpha}\) is related to the radial distribution function \(g(r_{\alpha})\) by

where \(N\) is the number of atoms in the system (\(\frac{1}{2} N(N-1)\) is then the number of all pairs), \(\Delta r\) is the grid spacing, \(r_{\text{cut}}/M\), \(V\) is the total volume of the system. In other words, in this particular case the physical meaning of \(S_{\alpha}\) is the number of particle pairs with interparticle distances \(r_{ij} = r_{\alpha}\) which correspond to the tabulated value of the potential \(U_{\alpha}\).

### Regularization of Inverse Monte Carlo¶

To get a well defined cross correlation matrix, \(A_{\alpha \gamma}\), enough sampling is needed. If there is not enough smapling or the initial potential guess is far from the real solution of the inverse problem, the algorithm might not converge to a stable solution. To overcome this instability problem one could reformulate the above equation by addition of a penalty term. In this case the potential update is computed as follows:[Murtola:2007]

This equation is known as Tikhonov regularization, where \(R\) is the regularization operator, which here is the identity matrix and \(\lambda >0 ` is the regularization parameter. The optimal choice for :math:\)lambda` can only be determined if the exact solution of the inverse problem is known, which in practice is not the case. To get a good initial guess on the magnitude of the regularization parameter a singular value decomposition of the matrix \(A_{\alpha \gamma}\) might help. A good \(\lambda\) parameter should dominate the smallest singular values (squared) but is itself small compared to the larger ones. [Rosenberger:2016]

## Iterative Integral Equation methods¶

Iterative Integral Equation (IIE) methods are best compared to the IMC method. The main difference is that the Jacobian is not sampled from particle number fluctuations, but approximately infered from the RDFs. The connection of potential and RDF is obtained from inverting the Ornstein-Zernicke (OZ) equation and a closing relation, e.g. the hypernetted-chain (HNC) equation

where \(h\) is \(g - 1\), \(c\) is the direct correlation function, \(\rho\) is the density, \(u\) is the pair potential, and \(*\) denotes a 3D convolution. For the case of bonds, the reference interaction site model (RISM) form of the OZ equation is used. For multiple bead types, the OZ equation becomes a matrix equation but this case is currently not implemented.

The Gauss-Newton formalism allows the incorporation of one or multiple constraints into the potential update. Those have to be expressible in terms of the RDF and the potential or force. Currently, only a pressure constraint is implemented, which is defined by

Here \(p_\text{tgt}\) and \(p_k\) are the target and current pressure, respectively, and \(\vec{f_k}\) is the current pair force. Element \(\alpha\) of vector \(\vec{l}\) is defined as

where \(\rho\) is the particle density and \(r\) is the radius.

The exact formulas and their derivation can be found in ref. [Delbary:2020] and [Bernhardt:2021].

## Force Matching¶

Force matching (FM) is another approach to evaluate corse-grained potentials [Ercolessi:1994,Izvekov:2005,Noid:2007]_. In contrast to the structure-based approaches, its aim is not to reproduce various distribution functions, but instead to match the multibody potential of mean force as close as possible with a given set of coarse-grained interactions.

The method works as follows. We first assume that the coarse-grained force-field (and hence the forces) depends on \(M\) parameters \(g_1,...,g_M\). These parameters can be prefactors of analytical functions, tabulated values of the interaction potentials, or coefficients of splines used to describe these potentials.

In order to determine these parameters, the reference forces on coarse-grained beads are calculated by summing up the forces on the atoms

where the sum is over all atoms of the CG site *I* (see
Mapping). The \(d_{Ij}\) coefficients can, in
principle, be chosen arbitrarily, provided that the condition
:math:` sum_{i=1}^{n}d_{Ii}=1` is
satisfied [Noid:2008.1]. If mapping coefficients for
the forces are not provided, it is assumed that \(d_{Ij} = c_{Ij}\)
(see also Input files).

By calculating the reference forces for \(L\) snapshots we can write down \(N \times L\) equations

Here \({{{{\mathbf F}}}}_{Il}^\text{ref}\) is the force on the bead \(I\) and \({{{{\mathbf F}}}}_{Il}^\text{cg} ` is the coarse-grained representation of this force. The index :math:`l\) enumerates snapshots picked for coarse-graining. By running the simulations long enough one can always ensure that \(M < N \times L\). In this case the set of equations is overdetermined and can be solved in a least-squares manner.

\({\mathbf F}_{il}^\text{cg}\) is, in principle, a non-linear function of its parameters \(\{g_i\}\). Therefore, it is useful to represent the coarse-grained force-field in such a way that equations become linear functions of \({g_i}\). This can be done using splines to describe the functional form of the forces [Izvekov:2005]. Implementation details are discussed in ref. [Ruehle:2009.a].

Note that an adequate sampling of the system requires a large number of snapshots \(L\). Hence, the applicability of the method is often constrained by the amount of memory available. To remedy the situation, one can split the trajectory into blocks, find the coarse-grained potential for each block and then perform averaging over all blocks.

### Three-body Stillinger-Weber interactions¶

In addition to non-bonded pair interactions, it is also possible to parametrize non-bonded three-body interactions of the Stillinger Weber type [stillinger_computer_1985]:

where \(I\) is the index of the central bead of a triplet of a CG sites, and \(J\) and \(K\) are the other two bead indices of a triplet of beads with an angular interaction term \(f^{\left(3b\right)}\left(\theta_{IJK}\right)\) and two exponential terms, \(\exp\left(\frac{\gamma\sigma}{r-a\sigma}\right)\), which guarantee a smooth switching on of the three-body forces at the cutoff distance. We do not limit ourselves to an analytic expression of \(f^{\left(3b\right)}\left(\theta_{IJK}\right)\) as in the original SW potential, but allow for a flexible angular dependence of \(f^{\left(3b\right)}\left(\theta_{IJK}\right)\). This is done by using a cubic spline representation of \(f^{\left(3b\right)}\left(\theta\right)\). Treating the two exponential terms as prefactors yields a linear dependence of the potential and thus the CG force \({\mathbf F}_{Il}^\text{cg}\) on the \(M\) spline coefficients. Therefore, the remaining parameters \(a_{IJ}\), \(a_{IK}\) (three-body short-range cutoff), \(\sigma_{IJ}\), \(\sigma_{IK}\) (additional parameter, can be set to 1), \(\gamma_{IJ}\), and \(\gamma_{IK}\) (steepness of switching), have to be set before solving the set of linear equations. For a more detailed description, we refer to ref. [scherer_understanding_2018].

## Relative Entropy¶

Relative entropy is a method which quantifies the extent of the configurational phase-space overlap between two molecular ensembles [Wu2005]. It can be used as a measure of the discrepancies between various properties of the CG system’s and the target all-atom (AA) ensemble. It has been shown by Shell S. [Shell2008] that one can minimize the relative entropy metric between the model CG system and the target AA system to optimize CG potential parameters such that the CG ensemble would mimic the target AA ensemble.

Relative entropy, \(S_{\text{rel}}\), is defined as [Shell2008]

where the sum is over all the configurations of the reference AA system, \(r=\{r_i\} (i=1,2,...)\), \(M\) is the mapping operation to generate a corresponding CG configuration, \(R_I\), from a AA configuration, \(r_i\), i.e., \(R_I = M(r_i)\), \(p_\text{AA}\) and \(p_\text{CG}\) are the configurational probabilities based on the AA and CG potentials, respectively, and \(\langle S_{\text{map}}\rangle_{\text{AA}}\) is the mapping entropy due to the average degeneracy of AA configurations mapping to the same CG configuration, given by

where \(\delta\) is the Kronecker delta function. Physically, \(S_{\text{rel}}\) can be interpreted as the likelihood that one test configuration of the model CG ensemble is representative of the target AA ensemble, and when the likelihood is a maximum, \(S_{\text{rel}}\) is at a minimum. Hence, the numerical minimization of \(S_{\text{rel}}\) with respect to the parameters of the CG model can be used to optimize the CG model.

In a canonical ensemble, substituting canonical configurational probabilities into the definition for the relative entropy, it simplifies to

where \(\beta={1}/{k_{\text{B}}T}\), \(k_{\text{B}}\) is the Boltzmann constant, \(T\) is the temperature, \(U_\text{CG}\) and \(U_\text{AA}\) are the total potential energies from the CG and AA potentials, respectively, \(A_\text{CG}\) and \(A_\text{AA}\) are the configurational part of the Helmholtz free energies from the CG and AA potentials, respectively, and all the averages are computed in the reference AA ensemble.

Consider a model CG system defined by the CG potentials between various CG sites such that the CG potentials depend on the parameters \(\boldsymbol\lambda=\{\lambda_1,\lambda_2,...\lambda_n\}\). Then \(\boldsymbol\lambda\) are optimized by the relative entropy minimization. We use the Newton-Raphson strategy for the relative entropy minimization described in ref. [Chaimovich2011]. In this strategy, the CG potential parameters, \(\boldsymbol\lambda\), are refined iteratively as

where \(k\) is the iteration index, \(\chi\in(0...1)\) is the scaling parameter that can be adjusted to ensure convergence, \(\nabla_{\lambda}S_{\text{rel}}\) is the vector of the first derivatives of \(S_{\text{rel}}\) with respect to \(boldsymbollambda\), which can be computed from the above equation<theory_eq_srelcan> as

and \(\mathbf{H}\) is the Hessian matrix of \(S_{\text{rel}}\) given by

To compute \(\nabla_{\lambda}S_{\text{rel}}\) and \(\mathbf{H}\) from those two equations, we need average CG energy derivatives in the AA and CG ensembles. For two-body CG pair potentials, \(u_{\text{CG}}\), between CG sites, the ensemble averages of the CG energy derivatives can be computed as

where the sum is performed over all the CG site pairs \((i,j)\), \(a\) stands for the 1\(^{\text{st}}\), 2\(^{\text{nd}}\),… derivatives and \(b\) stands for the different powers, i.e., \(b=1,2,...\). For the averages in the AA ensemble, first a single AA system simulation can be performed and RDFs between the CG sites in the AA ensemble can be saved, then the average CG energy derivatives in AA ensemble can be computed by processing the CG RDFs in the AA ensemble using the CG potentials at each iteration. For the averages in the CG ensemble, since the CG ensemble changes with the CG parameters, \(\boldsymbol\lambda\), a short CG simulation is performed at each iteration to generate corresponding CG configurations.

Comparisons between relative entropy and other coarse-graining methods are made in ref. [rudzinski_coarse-graining_2011] and [Chaimovich2011]. Chaimovich and Shell [Chaimovich2011] have shown that for certain CG models relative entropy minimization produces the same CG potentials as other methods, e.g., it is equivalent to the IBI when CG interactions are modeled using finely tabulated pair additive potentials, and to the FM when a CG model is based on \(N-\)body interactions, where \(N\) is the number of degrees of freedom in the CG model. However, there are some advantages of using relative entropy based coarse-graining. Relative entropy method allows to use analytical function forms for CG potentials, which are desired in theoretical treatments, such as parametric study of CG potentials, whereas, methods, like IBI, use tabulated potentials. Recently Lyubartsev et. al [lyubartsev2010systematic] have shows how to use IMC with an analytical function form, too. BI, IBI, and IMC methods are based on pair correlations and hence, they are only useful to optimize 2-body CG potentials, whereas, relative entropy uses more generic metric which offers more flexibility in modeling CG interactions and not only 2-body, but also 3-body (for example see ref. [lu_coarse-graining_2014]) and N-body CG potentials can be optimized. In addition to the CG potential optimization, the relative entropy metric can also be used to optimize an AA to CG mapping operator.