Behavior of Linear and Nonlinear Dimensionality Reduction for Collective Variable Identification of Small Molecule Solution-Phase Reactions.

Affiliations

Department of Chemistry, Washington State University, Pullman, Washington 99164, United States.

Materials Science and Engineering, Rensselaer Polytechnic Institute, Troy, New York 12180, United States.

Publication information

J Chem Theory Comput. 2022 Mar 8;18(3):1286-1296. doi: 10.1021/acs.jctc.1c00983. Epub 2022 Feb 28.

Abstract

Identifying collective variables (CVs) for chemical reactions is essential to reduce the 3N-dimensional energy landscape into lower dimensional basins and barriers of interest. However, in condensed phase processes, the nonmeaningful motions of bulk solvent often overpower the ability of dimensionality reduction methods to identify the correlated motions that underpin collective variables. At the same time, solvent can play important indirect or direct roles in reactivity, and much can be lost through treatments that remove or dampen solvent motion. This has been amply demonstrated within principal component analysis (PCA), although less is known about the behavior of nonlinear dimensionality reduction methods, e.g., uniform manifold approximation and projection (UMAP), that have recently come into use. The latter present an interesting alternative to linear methods, though often at the expense of interpretability. This work presents distance-attenuated projection methods of atomic coordinates that facilitate the application of both PCA and UMAP to identify collective variables in the presence of explicit solvent and, further, to identify the specific solvent molecules that participate in chemical reactions. The performance of both methods is examined in detail for two reactions where the explicit solvent plays very different roles within the collective variables. When applied to raw molecular dynamics data in solution, both PCA and UMAP representations are dominated by bulk solvent motions. On the other hand, when applied to data preprocessed by our attenuated projection methods, both PCA and UMAP identify the appropriate collective variables (though their sensitivity varies because of the explicit solvent that the projection method retains). Importantly, this approach allows identification of the specific solvent molecules that are relevant to the CVs, along with their importance.
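The general idea of attenuating solvent motion by distance before dimensionality reduction can be illustrated with a minimal sketch. The abstract does not give the functional form of the projection, so everything specific here is an assumption made for illustration: the sigmoid switching function, its `cutoff` and `width` parameters, the helper names, and the synthetic trajectory (a two-atom "solute" whose bond length switches between two basins, surrounded by noisy distant "solvent" atoms). It is not the authors' method.

```python
import numpy as np

def attenuation(dist, cutoff=4.0, width=0.5):
    # Hypothetical smooth switching function: ~1 near the solute, ~0 in the bulk.
    return 1.0 / (1.0 + np.exp((dist - cutoff) / width))

def attenuate(coords, solute_idx, cutoff=4.0, width=0.5):
    # coords has shape (n_frames, n_atoms, 3).
    # Scale each atom's displacement from its mean position by a weight that
    # decays with its per-frame distance to the solute centroid.
    center = coords[:, solute_idx, :].mean(axis=1, keepdims=True)
    dist = np.linalg.norm(coords - center, axis=2)
    weights = attenuation(dist, cutoff, width)
    weights[:, solute_idx] = 1.0  # solute atoms are never attenuated
    displacements = coords - coords.mean(axis=0, keepdims=True)
    return displacements * weights[:, :, None]

def pca(features, n_components=2):
    # Plain PCA via SVD of the mean-centered, flattened feature matrix.
    X = features.reshape(len(features), -1)
    X = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    variances = s**2 / (len(X) - 1)
    return vt[:n_components], variances[:n_components]

def solute_share(component, solute_idx, n_atoms):
    # Fraction of a component's squared loading carried by solute atoms.
    load = (component.reshape(n_atoms, 3) ** 2).sum(axis=1)
    return load[solute_idx].sum() / load.sum()

# Synthetic trajectory (illustrative data only, not from the paper).
rng = np.random.default_rng(0)
n_frames, n_solvent = 400, 30
state = rng.integers(0, 2, size=n_frames)  # which basin each frame sits in
solute = np.zeros((n_frames, 2, 3))
solute[:, 1, 0] = 1.0 + state + 0.05 * rng.normal(size=n_frames)

direction = rng.normal(size=(n_solvent, 3))
direction /= np.linalg.norm(direction, axis=1, keepdims=True)
solvent_mean = direction * rng.uniform(10.0, 20.0, size=(n_solvent, 1))
solvent = solvent_mean + rng.normal(scale=1.0, size=(n_frames, n_solvent, 3))

coords = np.concatenate([solute, solvent], axis=1)
solute_idx = [0, 1]
n_atoms = coords.shape[1]

raw_components, _ = pca(coords)
att_components, _ = pca(attenuate(coords, solute_idx))

raw_share = solute_share(raw_components[0], solute_idx, n_atoms)
att_share = solute_share(att_components[0], solute_idx, n_atoms)
print(f"solute share of PC1, raw coordinates: {raw_share:.3f}")
print(f"solute share of PC1, attenuated:      {att_share:.3f}")
```

In this toy setting the first principal component of the raw coordinates is mostly bulk solvent noise, while the attenuated features put the solute bond stretch into the first component. The same attenuated feature matrix could, in principle, be passed to a nonlinear method such as UMAP instead of PCA.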
