Boninsegna Lorenzo, Gobbo Gianpaolo, Noé Frank, Clementi Cecilia
Center for Theoretical Biological Physics and Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States.
Maxwell Institute for Mathematical Sciences and School of Mathematics, The University of Edinburgh, Peter Guthrie Tait Road, Edinburgh EH9 3FD, United Kingdom.
J Chem Theory Comput. 2015 Dec 8;11(12):5947-60. doi: 10.1021/acs.jctc.5b00749. Epub 2015 Nov 18.
Identification of the collective coordinates that describe rare events in complex molecular transitions such as protein folding has been a key challenge in the theoretical molecular sciences. In the Diffusion Map approach, one assumes that the molecular configurations sampled have been generated by a diffusion process, and one uses the eigenfunctions of the corresponding diffusion operator as reaction coordinates. While diffusion coordinates (DCs) appear to provide a good approximation to the true dynamical reaction coordinates, they are not parametrized using dynamical information. Thus, their approximation quality could not, as yet, be validated, nor could the diffusion map eigenvalues be used to compute relaxation rate constants of the system. Here we combine the Diffusion Map approach with the recently proposed Variational Approach for Conformation Dynamics (VAC). Diffusion Map coordinates are used as a basis set, and their optimal linear combination is sought using the VAC, which employs time-correlation information on the molecular dynamics (MD) trajectories. We have applied this approach to ultra-long MD simulations of the Fip35 WW domain and found that the first DCs are indeed a good approximation to the true reaction coordinates of the system, but they could be further improved using the VAC. Using the Diffusion Map basis, excellent approximations to the relaxation rates of the system are obtained. Finally, we evaluate the quality of different metric spaces and find that pairwise minimal root-mean-square deviation performs poorly, while operating in the recently introduced kinetic maps based on the time-lagged independent component analysis gives the best performance.
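The combination described in the abstract (diffusion coordinates as a basis set, with the VAC finding their optimal linear combination from time-correlation information) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the data come from a toy two-dimensional double-well model rather than Fip35 simulations, the diffusion map is a basic fixed-bandwidth Gaussian-kernel variant with Euclidean distances, the covariance estimators are symmetrized, and the names `diffusion_map`, `vac`, and `implied_timescales` are hypothetical helpers.

```python
import numpy as np

def diffusion_map(points, epsilon, n_coords):
    """Basic diffusion map on an (N, d) array (fixed bandwidth, no density correction)."""
    sq_dist = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * epsilon))
    degree = kernel.sum(axis=1)
    # Symmetric matrix with the same eigenvalues as the Markov matrix D^-1 K.
    sym = kernel / np.sqrt(np.outer(degree, degree))
    evals, evecs = np.linalg.eigh(sym)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Convert to right eigenvectors of D^-1 K and skip the trivial constant one.
    psi = evecs / np.sqrt(degree)[:, None]
    return evals[1:n_coords + 1], psi[:, 1:n_coords + 1]

def vac(basis, lag):
    """Solve C(lag) b = lambda C(0) b for basis values along a time-ordered trajectory.

    basis has shape (T, n); returns eigenvalues (descending) and coefficient columns.
    """
    x0, xt = basis[:-lag], basis[lag:]
    mean = 0.5 * (x0.mean(axis=0) + xt.mean(axis=0))
    x0, xt = x0 - mean, xt - mean  # removes the stationary (constant) process
    n = len(x0)
    c0 = (x0.T @ x0 + xt.T @ xt) / (2.0 * n)  # symmetrized instantaneous covariance
    ct = (x0.T @ xt + xt.T @ x0) / (2.0 * n)  # symmetrized time-lagged covariance
    # Whiten with C(0), then solve an ordinary symmetric eigenproblem.
    s, u = np.linalg.eigh(c0)
    keep = s > 1e-10 * s.max()
    whiten = u[:, keep] / np.sqrt(s[keep])
    lam, v = np.linalg.eigh(whiten.T @ ct @ whiten)
    order = np.argsort(lam)[::-1]
    return lam[order], (whiten @ v)[:, order]

def implied_timescales(eigenvalues, lag_time):
    """Relaxation timescales t_i = -lag_time / ln(lambda_i)."""
    return -lag_time / np.log(eigenvalues)

# Toy trajectory: overdamped Langevin dynamics in V(x, y) = (x^2 - 1)^2 + 2 y^2, kT = 1.
rng = np.random.default_rng(0)
dt, stride, n_frames = 0.01, 10, 1000
pos = np.array([-1.0, 0.0])
frames = np.empty((n_frames, 2))
for i in range(n_frames * stride):
    force = np.array([-4.0 * pos[0] * (pos[0] ** 2 - 1.0), -4.0 * pos[1]])
    pos = pos + force * dt + np.sqrt(2.0 * dt) * rng.standard_normal(2)
    if (i + 1) % stride == 0:
        frames[(i + 1) // stride - 1] = pos

# Diffusion coordinates evaluated on every frame serve as the basis set.
dm_evals, dcs = diffusion_map(frames, epsilon=0.05, n_coords=4)

lag = 5  # lag in saved frames
vac_evals, coeffs = vac(dcs, lag)

# With a single basis function, the VAC eigenvalue is that function's autocorrelation.
single = [vac(dcs[:, [i]], lag)[0][0] for i in range(dcs.shape[1])]

print("autocorrelation of each diffusion coordinate:", single)
print("leading VAC eigenvalue:", vac_evals[0])
print("leading implied timescale:", implied_timescales(vac_evals[:1], lag * stride * dt)[0])
```

By the variational principle, the leading VAC eigenvalue in this sketch cannot fall below the autocorrelation of any individual diffusion coordinate at the same lag, which mirrors the abstract's observation that the first DCs are a good approximation that the VAC can improve further. The timescale formula is what turns VAC eigenvalues into relaxation-rate estimates, something the diffusion map eigenvalues alone do not provide.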