Brandt Simon, Sittel Florian, Ernst Matthias, Stock Gerhard
Biomolecular Dynamics, Institute of Physics, Albert Ludwigs University, 79104 Freiburg, Germany.
J Phys Chem Lett. 2018 May 3;9(9):2144-2150. doi: 10.1021/acs.jpclett.8b00759. Epub 2018 Apr 12.
We present a systematic approach to reduce the dimensionality of a complex molecular system. Starting with a data set of molecular coordinates (obtained from experiment or simulation) and an associated set of metastable conformational states (obtained from clustering the data), a supervised machine learning model is trained to assign unknown molecular structures to the set of metastable states. In this way, the model learns to determine the features of the molecular coordinates that are most important to discriminate the states. Using a new algorithm that exploits this feature importance via an iterative exclusion principle, we identify the essential internal coordinates (such as specific interatomic distances or dihedral angles) of the system, which are shown to represent versatile reaction coordinates that account for the dynamics of the slow degrees of freedom and explain the mechanism of the underlying processes. Moreover, these coordinates give rise to a free energy landscape that may reveal previously hidden intermediate states of the system.
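The iterative exclusion idea described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the model or the importance measure, so this example assumes a nearest-centroid classifier and a Fisher-style importance score as simple stand-ins, and it runs on synthetic data rather than molecular coordinates. All function names (`make_toy_data`, `importance`, `iterative_exclusion`) are hypothetical. The sketch repeatedly discards the currently most important feature and records how well the remaining features still discriminate the states; a sharp drop in accuracy marks the coordinates that were essential.

```python
import random
from statistics import mean, pvariance


def make_toy_data(n_samples=600, n_noise=4, seed=0):
    """Toy 'coordinates': features 0 and 1 encode three states, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        state = rng.randrange(3)
        row = [4.0 * (state >= 1) + rng.gauss(0.0, 0.5),  # separates state 0 from 1 and 2
               4.0 * (state == 2) + rng.gauss(0.0, 0.5)]  # separates state 2 from 0 and 1
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(state)
    return X, y


def centroids(X, y, features):
    """Per-state mean of each selected feature."""
    return {s: [mean(x[j] for x, t in zip(X, y) if t == s) for j in features]
            for s in sorted(set(y))}


def importance(X, y, j):
    """Assumed stand-in score: variance of state means over mean within-state variance."""
    groups = [[x[j] for x, t in zip(X, y) if t == s] for s in sorted(set(y))]
    between = pvariance([mean(g) for g in groups])
    within = mean(pvariance(g) for g in groups)
    return between / within


def accuracy(X, y, features, cents):
    """Fraction of samples assigned to the correct state by nearest centroid."""
    hits = 0
    for x, t in zip(X, y):
        dist = {s: sum((x[j] - c[k]) ** 2 for k, j in enumerate(features))
                for s, c in cents.items()}
        hits += min(dist, key=dist.get) == t
    return hits / len(y)


def iterative_exclusion(X, y, n_train=400):
    """Drop the most important remaining feature each round; record test accuracy."""
    X_tr, y_tr, X_te, y_te = X[:n_train], y[:n_train], X[n_train:], y[n_train:]
    remaining = list(range(len(X[0])))
    history = []  # (feature excluded after this round, accuracy before excluding it)
    while remaining:
        cents = centroids(X_tr, y_tr, remaining)
        acc = accuracy(X_te, y_te, remaining, cents)
        top = max(remaining, key=lambda j: importance(X_tr, y_tr, j))
        history.append((top, acc))
        remaining.remove(top)
    return history


if __name__ == "__main__":
    X, y = make_toy_data()
    for feature, acc in iterative_exclusion(X, y):
        print(f"accuracy {acc:.2f} before excluding feature {feature}")
```

In this toy setting the two informative features are excluded first, and the accuracy falls toward chance level once both are gone. With real data, the features would be internal coordinates such as interatomic distances or dihedral angles, and the classifier would be whatever supervised model is actually trained on the metastable states.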