Department of Chemical Engineering, University of Virginia, Charlottesville, Virginia 22904, United States.
Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80309, United States.
J Chem Theory Comput. 2016 Apr 12;12(4):1806-23. doi: 10.1021/acs.jctc.5b00869. Epub 2016 Feb 29.
We show how thermodynamic properties of molecular models can be computed over a large, multidimensional parameter space by combining multistate reweighting analysis with a linear basis function approach. This approach reduces the computational cost of estimating thermodynamic properties from molecular simulations for over 130,000 tested parameter combinations from over 1000 CPU years to tens of CPU days. This speed increase is achieved primarily by computing the potential energy as a linear combination of basis functions, which are computed either from modified simulation code or as the difference in energy between two reference states; the latter can be done without any simulation code modification. The thermodynamic properties are then estimated with the Multistate Bennett Acceptance Ratio (MBAR) as a function of multiple model parameters without the need to define a priori how the states are connected by a pathway. Instead, we adaptively sample a set of points in parameter space to create mutual configuration space overlap. The existence of regions of poor configuration space overlap is detected by analyzing the eigenvalues of the sampled states' overlap matrix. The configuration space overlap with the sampled states is monitored alongside the mean and maximum uncertainty to determine convergence, as neither the uncertainty nor the configuration space overlap alone is a sufficient metric of convergence. This adaptive sampling scheme is demonstrated by estimating with high precision the solvation free energies of charged particles of Lennard-Jones plus Coulomb functional form with charges between -2 and +2 and generally physical values of σij and ϵij in TIP3P water.
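The linear basis function idea can be illustrated with a minimal sketch. It assumes a single solute site interacting with solvent sites through Lennard-Jones plus Coulomb terms, reduced units with the Coulomb prefactor set to 1, and no cutoffs or periodic-image terms. The function names (`basis_values`, `coefficients`, `energy`, `direct_energy`) are hypothetical and are not taken from the paper or from any simulation package. The point is only that the configuration-dependent sums are stored once, after which the energy at any (q, ϵ, σ) is a three-term dot product.

```python
def basis_values(distances, solvent_charges):
    """Configuration-dependent basis functions for one solute site.

    Returns (sum r^-12, sum r^-6, sum q_j / r). Reduced units are an
    assumption of this sketch, not something stated in the abstract.
    """
    r12 = sum(r ** -12 for r in distances)
    r6 = sum(r ** -6 for r in distances)
    coul = sum(qj / r for qj, r in zip(solvent_charges, distances))
    return (r12, r6, coul)


def coefficients(q, epsilon, sigma):
    """Parameter-dependent multipliers of the three basis functions."""
    return (4.0 * epsilon * sigma ** 12, -4.0 * epsilon * sigma ** 6, q)


def energy(basis, q, epsilon, sigma):
    """Solute-solvent energy as a linear combination of stored basis values."""
    return sum(h * b for h, b in zip(coefficients(q, epsilon, sigma), basis))


def direct_energy(distances, solvent_charges, q, epsilon, sigma):
    """Brute-force pairwise sum, used here only to check the linear form."""
    total = 0.0
    for qj, r in zip(solvent_charges, distances):
        total += 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
        total += q * qj / r
    return total


# Usage: compute the basis once per configuration, then scan parameters cheaply.
distances = [0.9, 1.1, 1.3]           # hypothetical solute-solvent distances
solvent_charges = [-0.8, 0.4, 0.4]    # hypothetical solvent partial charges
stored = basis_values(distances, solvent_charges)
scan = [energy(stored, q, 0.5, 1.0) for q in (-2.0, -1.0, 0.0, 1.0, 2.0)]
```

In this form, re-evaluating the energy for a new parameter combination never touches the coordinates again, which is the property that makes reweighting over many parameter combinations inexpensive.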
We also compute entropy, enthalpy, and radial distribution functions of arbitrary unsampled parameter combinations using only the data from these sampled states and use the estimates of free energies over the entire space to examine the deviation of atomistic simulations from the Born approximation to the solvation free energy.
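For reference, the Born approximation mentioned above treats the ion as a charged sphere of radius R in a dielectric continuum, giving ΔG = -(q²/(8πε₀R))(1 - 1/ε_r). The sketch below evaluates that expression in kJ/mol. The function name, the choice of Born radius, and the default dielectric constant (roughly the experimental value for water, not a TIP3P-specific one) are assumptions made for illustration and are not taken from the paper.

```python
# e^2 * N_A / (4 * pi * eps0), in kJ mol^-1 nm
COULOMB_KJ_NM = 138.935458


def born_free_energy(q, radius_nm, dielectric=78.4):
    """Born solvation free energy in kJ/mol.

    q is in units of the elementary charge and radius_nm in nanometers.
    The default dielectric is an illustrative assumption.
    """
    return -0.5 * COULOMB_KJ_NM * q ** 2 / radius_nm * (1.0 - 1.0 / dielectric)


# Usage: the continuum estimate depends on q squared, so it cannot
# distinguish a cation from an anion of the same size.
cation = born_free_energy(+1.0, 0.2)
anion = born_free_energy(-1.0, 0.2)
```

Because the expression depends only on q², any dependence of an atomistic solvation free energy on the sign of the charge shows up as a deviation from this estimate.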