Vitalis Andreas, Caflisch Amedeo
Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland.
J Chem Theory Comput. 2012 Mar 13;8(3):1108-20. doi: 10.1021/ct200801b. Epub 2012 Feb 10.
The coarse-graining of data from molecular simulations yields conformational space networks that may be used to predict the system's long time scale behavior, to discover structural pathways connecting free energy basins, or simply to represent accessible phase space regions of interest and their connectivities in a two-dimensional plot. In this contribution, we present a tree-based algorithm to partition conformations of biomolecules into sets of similar microstates, i.e., to coarse-grain trajectory data into mesostates. Because its architecture is similar to that of established tree-based algorithms, the proposed scheme operates in near-linear time with respect to data set size. We derive the expressions needed for the fast evaluation of mesostate properties and distances under typical choices of similarity measure between microstates. Using both a pedagogically useful and a real-world application, the algorithm is shown to be robust with respect to tree height, which, in addition to mesostate threshold size, is the main adjustable parameter. It is demonstrated that the derived mesostate networks can preserve information regarding the free energy basins and barriers that characterize the system.
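The abstract does not reproduce the derived expressions, so the following is only a minimal sketch of the general idea, under stated assumptions: microstates are plain coordinate vectors compared by Euclidean distance, and each mesostate keeps running sums (count, linear sum, sum of squared norms) so that its centroid and radius can be evaluated without revisiting its members. The names `Mesostate` and `coarse_grain` and the flat, single-level search are hypothetical and are not taken from the paper; the tree hierarchy that gives the published scheme its near-linear scaling is omitted here.

```python
import math


class Mesostate:
    """Running summary of a set of similar microstates (hypothetical helper)."""

    def __init__(self, dim):
        self.n = 0                      # number of member snapshots
        self.linear_sum = [0.0] * dim   # component-wise sum of members
        self.square_sum = 0.0           # sum of squared norms of members

    def add(self, x):
        """Absorb one snapshot by updating the running sums only."""
        self.n += 1
        for i, xi in enumerate(x):
            self.linear_sum[i] += xi
        self.square_sum += sum(xi * xi for xi in x)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root-mean-square distance of members from the centroid."""
        c = self.centroid()
        variance = self.square_sum / self.n - sum(ci * ci for ci in c)
        return math.sqrt(max(variance, 0.0))  # guard against round-off

    def distance_to(self, x):
        """Euclidean distance from the centroid to snapshot x (assumed metric)."""
        return math.dist(self.centroid(), x)


def coarse_grain(snapshots, threshold):
    """Assign each snapshot to the nearest mesostate within `threshold`,
    creating a new mesostate otherwise. Flat search, for illustration only."""
    states, labels = [], []
    for x in snapshots:
        best, best_d = None, float("inf")
        for k, state in enumerate(states):
            d = state.distance_to(x)
            if d < best_d:
                best, best_d = k, d
        if best is None or best_d > threshold:
            states.append(Mesostate(len(x)))
            best = len(states) - 1
        states[best].add(x)
        labels.append(best)
    return states, labels
```

Calling `coarse_grain(snapshots, threshold)` on a list of coordinate tuples returns the mesostate summaries and one label per snapshot. Because this flat version compares every snapshot with every existing mesostate, its cost grows with the number of mesostates; restricting the search to one branch of a tree is what would remove that factor.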