Size-and-Shape Space Gaussian Mixture Models for Structural Clustering of Molecular Dynamics Trajectories.

Affiliations

Department of Chemistry, Colorado State University, Fort Collins, Colorado 80523, United States.

Department of Chemistry, New York University, New York, New York 10003, United States.

Publication information

J Chem Theory Comput. 2022 May 10;18(5):3218-3230. doi: 10.1021/acs.jctc.1c01290. Epub 2022 Apr 28.

Abstract

Determining the optimal number and identity of structural clusters from an ensemble of molecular configurations continues to be a challenge. Recent structural clustering methods have focused on the use of internal coordinates due to the innate rotational and translational invariance of these features. The vast number of possible internal coordinates necessitates a feature space supervision step to make clustering tractable but yields a protocol that can be system type-specific. Particle positions offer an appealing alternative to internal coordinates but suffer from a lack of rotational and translational invariance, as well as a perceived insensitivity to regions of structural dissimilarity. Here, we present a method, denoted shape-GMM, that overcomes the shortcomings of particle positions using a weighted maximum likelihood alignment procedure. This alignment strategy is then built into an expectation maximization Gaussian mixture model (GMM) procedure to capture metastable states in the free-energy landscape. The resulting algorithm distinguishes between a variety of different structures, including those indistinguishable by root-mean-square displacement and pairwise distances, as demonstrated on several model systems. Shape-GMM results on an extensive simulation of the fast-folding HP35 Nle/Nle mutant protein support a four-state folding/unfolding mechanism, which is consistent with previous experimental results and provides kinetic details comparable to previous state-of-the-art clustering approaches, as measured by the VAMP-2 score. Currently, training of shape-GMMs is recommended for systems (or subsystems) that can be represented by ≲200 particles and ≲100k configurations to estimate high-dimensional covariance matrices and balance computational expense. Once a shape-GMM is trained, it can be used to predict the cluster identities of millions of configurations.
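To make the invariance problem with particle positions concrete, the sketch below removes translation and rotation from one configuration by superimposing it on a reference. It is only an illustration: it uses the standard uniform-weight Kabsch superposition, not the weighted maximum likelihood alignment described in the abstract, and the function name `kabsch_align` and the synthetic coordinates are assumptions introduced here.

```python
import numpy as np

def kabsch_align(frame, ref):
    """Superimpose frame (N, 3) onto ref (N, 3) with uniform particle weights.

    Returns the centered, rotated copy of `frame` that best matches the
    centered `ref` in the least-squares sense. Hypothetical helper for
    illustration; not the shape-GMM alignment itself.
    """
    # Remove translation by centering both point sets on their centroids
    frame_c = frame - frame.mean(axis=0)
    ref_c = ref - ref.mean(axis=0)
    # 3x3 cross-covariance between the centered point sets
    h = frame_c.T @ ref_c
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so the result is a proper rotation
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return frame_c @ rot

# Synthetic example: a rotated and translated copy of a random structure
rng = np.random.default_rng(0)
ref = rng.normal(size=(10, 3))
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0,            0.0,           1.0]])
frame = ref @ rot_z + np.array([5.0, -2.0, 1.0])
aligned = kabsch_align(frame, ref)
```

After alignment, `aligned` coincides with the centered reference, so the rigid-body motion no longer contributes to any distance computed between the two configurations. In a clustering setting, each frame would be aligned to a cluster mean in this spirit before evaluating its likelihood under that cluster.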
