John D. Chodera, William C. Swope, Jed W. Pitera, Chaok Seok, Ken A. Dill
Graduate Group in Biophysics and Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, California 94143, IBM Almaden Research Center, 650 Harry Road, San Jose, California 95120, and Department of Chemistry, College of Natural Sciences, Seoul National University, Gwanak-gu, Shillim-dong, san 56-1 Seoul 151-747, Republic of Korea.
J Chem Theory Comput. 2007 Jan;3(1):26-41. doi: 10.1021/ct0502864.
The growing adoption of generalized-ensemble algorithms for biomolecular simulation has resulted in a resurgence in the use of the weighted histogram analysis method (WHAM) to make use of all data generated by these simulations. Unfortunately, the original presentation of WHAM by Kumar et al. is not directly applicable to data generated by these methods. WHAM was originally formulated to combine data from independent samplings of the canonical ensemble, whereas many generalized-ensemble algorithms sample from mixtures of canonical ensembles at different temperatures. Sorting configurations generated from a parallel tempering simulation by temperature obscures the temporal correlation in the data and results in an improper treatment of the statistical uncertainties used in constructing the estimate of the density of states. Here we present variants of WHAM, STWHAM and PTWHAM, derived with the same set of assumptions, that can be directly applied to several generalized-ensemble algorithms, including simulated tempering, parallel tempering (better known as replica-exchange among temperatures), and replica-exchange simulated tempering. We present methods that explicitly capture the considerable temporal correlation in sequentially generated configurations using autocorrelation analysis. This allows estimation of the statistical uncertainty in WHAM estimates of expectations for the canonical ensemble. We test the method with a one-dimensional model system and then apply it to the estimation of potentials of mean force from parallel tempering simulations of the alanine dipeptide in both implicit and explicit solvent.
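The role of autocorrelation analysis in the uncertainty estimate can be illustrated with a minimal sketch. It assumes a generic statistical-inefficiency estimator, g = 1 + 2 Σ_t (1 − t/N) C(t), with the sum truncated at the first non-positive value of the normalized autocorrelation function C(t). This is a common textbook form and is not necessarily the exact procedure used in the paper. The function names (`statistical_inefficiency`, `mean_with_uncertainty`, `ar1_series`) and the AR(1) test signal are hypothetical, introduced only for illustration.

```python
import math
import random


def statistical_inefficiency(series):
    """Estimate g = 1 + 2 * sum_t (1 - t/N) * C(t) for a time series.

    C(t) is the normalized autocorrelation function. The sum is truncated
    at the first lag where C(t) is not positive (an assumed heuristic).
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    if var == 0.0:
        return 1.0  # a constant series carries no correlation information
    g = 1.0
    for t in range(1, n - 1):
        c = sum(dev[i] * dev[i + t] for i in range(n - t)) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)


def mean_with_uncertainty(series):
    """Return (mean, standard error corrected for correlation, g)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / (n - 1)
    g = statistical_inefficiency(series)
    # Only about n / g of the samples are effectively independent.
    return mean, math.sqrt(var * g / n), g


def ar1_series(n, phi, seed=0):
    """Hypothetical test signal: x[k+1] = phi * x[k] + Gaussian noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    data = ar1_series(4000, 0.9, seed=1)
    mean, err, g = mean_with_uncertainty(data)
    print(f"mean = {mean:.3f} +/- {err:.3f}, statistical inefficiency g = {g:.1f}")
```

For a strongly correlated series, g is much larger than 1, so an error bar computed as if all N samples were independent would be too small by a factor of roughly √g. The abstract describes this kind of improper treatment of uncertainty when temporal correlation is obscured.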