Tan Zhiqiang, Xia Junchao, Zhang Bin W, Levy Ronald M
Department of Statistics, Rutgers University, Piscataway, New Jersey 08854, USA.
Center for Biophysics and Computational Biology, Department of Chemistry and Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122, USA.
J Chem Phys. 2016 Jan 21;144(3):034107. doi: 10.1063/1.4939768.
The weighted histogram analysis method (WHAM), including its binless extension, has been developed independently in several different contexts and is widely used in chemistry, physics, and statistics for computing free energies and expectations from multiple ensembles. However, this method, while statistically efficient, is computationally costly or even infeasible when a large number of distributions (hundreds or more) are studied. We develop a locally weighted histogram analysis method (local WHAM) from the perspective of simulations of simulations (SOS), using generalized serial tempering (GST) to resample simulated data from multiple ensembles. The local WHAM equations based on one jump attempt per GST cycle can be solved by optimization algorithms orders of magnitude faster than standard implementations of global WHAM, yet yield free energy estimates of similar accuracy to those from global WHAM. Moreover, we propose an adaptive SOS procedure for solving local WHAM equations stochastically when multiple jump attempts are performed per GST cycle. Such a stochastic procedure can lead to more accurate estimates of equilibrium distributions than local WHAM with one jump attempt per cycle. The proposed methods are broadly applicable when the original data to be "WHAMMED" are obtained properly by any sampling algorithm, including serial tempering and parallel tempering (replica exchange). To illustrate the methods, we estimated absolute binding free energies and binding energy distributions using the binding energy distribution analysis method from one- and two-dimensional replica exchange molecular dynamics simulations for the beta-cyclodextrin-heptanoate host-guest system. In addition to the computational advantage of handling large datasets, our two-dimensional WHAM analysis also demonstrates that accurate results similar to those from well-converged data can be obtained from simulations for which sampling is limited and not fully equilibrated.
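To make the cost argument in the abstract concrete, the following is a minimal sketch of the standard global binless WHAM self-consistent iteration. It is not the authors' local WHAM or SOS procedure, and it is not taken from the paper. The function name `binless_wham`, the harmonic toy states, and all parameter values are assumptions chosen for illustration. Each iteration touches every (state, sample) pair, so the work grows as K × N for K states and N pooled samples, which is the scaling that becomes burdensome with hundreds of states.

```python
import numpy as np


def binless_wham(u, n_samples, tol=1e-10, max_iter=10000):
    """Global binless WHAM by self-consistent iteration (illustrative sketch).

    u[k, n]      reduced potential of pooled sample n evaluated at state k
    n_samples[k] number of samples originally drawn from state k
    Returns f[k] = -ln Z_k, shifted so that f[0] = 0.
    """
    log_n = np.log(np.asarray(n_samples, dtype=float))
    f = np.zeros(u.shape[0])
    for _ in range(max_iter):
        # log of the mixture denominator sum_j N_j exp(f_j - u_j(x_n)), per sample
        log_denom = np.logaddexp.reduce(log_n[:, None] + f[:, None] - u, axis=0)
        # f_k = -ln sum_n exp(-u_k(x_n)) / denominator_n
        f_new = -np.logaddexp.reduce(-u - log_denom[None, :], axis=1)
        f_new -= f_new[0]  # free energies are defined up to a constant
        if np.max(np.abs(f_new - f)) < tol:
            return f_new
        f = f_new
    return f


# Hypothetical toy problem: three harmonic states sampled exactly.
rng = np.random.default_rng(0)
mu = np.array([0.0, 1.0, 2.0])      # assumed well centers
sigma = np.array([1.0, 1.5, 2.0])   # assumed well widths
n_per_state = np.array([2000, 2000, 2000])

# Pool the samples from all states, then evaluate every state's potential on them.
x = np.concatenate([rng.normal(m, s, n) for m, s, n in zip(mu, sigma, n_per_state)])
u = 0.5 * ((x[None, :] - mu[:, None]) / sigma[:, None]) ** 2

f_est = binless_wham(u, n_per_state)
exact = -np.log(sigma / sigma[0])   # Z_k is proportional to sigma_k
print("estimated:", f_est)
print("exact:    ", exact)
```

For this toy case the exact relative free energies are known analytically, so the printed estimates can be compared against them directly; the agreement is subject to sampling noise.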