Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA.
Department of Chemical and Biomolecular Engineering, University of Maryland, College Park, Maryland 20742, USA.
J Chem Phys. 2022 May 14;156(18):184103. doi: 10.1063/5.0085607.
Finding a low-dimensional representation of data from long-timescale trajectories of biomolecular processes, such as protein folding or ligand-receptor binding, is of fundamental importance, and kinetic models, such as Markov models, have proven useful in describing the kinetics of these systems. Recently, an unsupervised machine learning technique called VAMPNet was introduced to learn the low-dimensional representation and the linear dynamical model in an end-to-end manner. VAMPNet is based on the variational approach for Markov processes and relies on neural networks to learn the coarse-grained dynamics. In this paper, we combine VAMPNet and graph neural networks into an end-to-end framework that efficiently learns high-level dynamics and metastable states from long-timescale molecular dynamics trajectories. This method inherits the advantages of graph representation learning and uses graph message passing operations to generate an embedding for each datapoint, which VAMPNet then uses to build a coarse-grained dynamical model. This type of molecular representation yields a higher-resolution and more interpretable Markov model than the standard VAMPNet, enabling a more detailed kinetic study of biomolecular processes. Our GraphVAMPNet approach is also enhanced with an attention mechanism to identify the residues that are important for classification into different metastable states.
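To make the pipeline described above more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single mean-aggregation message-passing step, mean pooling over nodes, a softmax readout into state probabilities, and the commonly used VAMP-2 score as the variational objective. All function names, layer choices, array sizes, and the random toy data are hypothetical and chosen only for illustration; the actual GraphVAMPNet layers and attention mechanism are not reproduced here.

```python
import numpy as np

def message_passing_embedding(node_feats, adjacency, weight):
    """One mean-aggregation message-passing step, then mean pooling over nodes.

    node_feats: (n_nodes, n_feats), adjacency: (n_nodes, n_nodes),
    weight: (n_feats, n_hidden). Returns a (n_hidden,) graph embedding.
    """
    degree = adjacency.sum(axis=1, keepdims=True)
    messages = adjacency @ node_feats / np.maximum(degree, 1.0)
    updated = np.tanh((node_feats + messages) @ weight)
    return updated.mean(axis=0)

def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def inv_sqrt(mat, eps=1e-6):
    """Inverse square root of a symmetric PSD matrix, dropping tiny eigenvalues."""
    vals, vecs = np.linalg.eigh(mat)
    keep = vals > eps
    return (vecs[:, keep] / np.sqrt(vals[keep])) @ vecs[:, keep].T

def vamp2_score(chi_0, chi_tau):
    """VAMP-2 score for features at time t (chi_0) and time t + tau (chi_tau)."""
    chi_0 = chi_0 - chi_0.mean(axis=0)
    chi_tau = chi_tau - chi_tau.mean(axis=0)
    n = chi_0.shape[0]
    c00 = chi_0.T @ chi_0 / n
    c0t = chi_0.T @ chi_tau / n
    ctt = chi_tau.T @ chi_tau / n
    koopman = inv_sqrt(c00) @ c0t @ inv_sqrt(ctt)
    # The +1 accounts for the constant function removed by mean-centering
    # (a convention assumed here).
    return 1.0 + np.sum(koopman ** 2)

# Toy data (hypothetical): random node features on a fully connected graph.
rng = np.random.default_rng(0)
n_frames, n_nodes, n_feats, n_hidden, n_states, lag = 200, 5, 4, 8, 3, 10
frames = rng.normal(size=(n_frames, n_nodes, n_feats))
adjacency = np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)
w_msg = rng.normal(size=(n_feats, n_hidden))
w_out = rng.normal(size=(n_hidden, n_states))

# Graph embedding per frame -> soft state assignments -> VAMP-2 score.
embeddings = np.stack(
    [message_passing_embedding(f, adjacency, w_msg) for f in frames]
)
chi = softmax(embeddings @ w_out)
score = vamp2_score(chi[:-lag], chi[lag:])
```

In an actual training loop the weights would be optimized by gradient ascent on the score using an automatic differentiation framework; here they are fixed random matrices, so the resulting score carries no physical meaning.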