Identifying molecular structural features by pattern recognition methods.

Author information

Lu Qing

Affiliation

Beijing National Laboratory for Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, China

Publication information

RSC Adv. 2022 Jun 14;12(27):17559-17569. doi: 10.1039/d2ra00764a. eCollection 2022 Jun 7.

Abstract

Identifying molecular structural features is a central task in computational chemistry, and pattern recognition techniques could make that identification easier. At present, structural dissimilarity is mainly quantified by root-mean-square deviation (RMSD) calculations, for example in molecular dynamics simulations. However, RMSD performs poorly for large molecules, where it suffers from the so-called "curse of dimensionality". It also requires the atoms of the two structures being compared to be ordered consistently, which takes nontrivial effort. In this work, we propose to use point cloud recognition based on convex hulls to recognize molecular structural features. The method has two main advantages. First, the dimension of the input data is reduced substantially, from the number of atoms in the molecule to the number of atoms on its convex hull. The curse of dimensionality is therefore avoided, and no atom-ordering step is needed. Second, constructing convex hulls makes it possible to define new molecular descriptors, such as the contact area of molecular interactions. These descriptors have different properties from existing ones, so they are expected to behave differently in certain machine learning studies. Several illustrative applications have been carried out, and they give promising results for structure-activity studies.
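Two points in the abstract can be illustrated with a small sketch: index-paired RMSD depends on atom ordering, and a convex hull keeps only the outermost points of a point cloud. The code below is not the paper's implementation. The function names `rmsd` and `hull_vertex_indices`, the toy coordinates, and the brute-force hull test are all assumptions made for illustration. The RMSD here skips the superposition step used in practice, and the hull routine is only suitable for very small point sets (it also reports points lying flat on a hull face, which a dedicated hull library would not count as vertices).

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two coordinate lists, pairing atoms by index (no superposition)."""
    if len(a) != len(b):
        raise ValueError("structures must have the same number of atoms")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def hull_vertex_indices(points, eps=1e-9):
    """Brute-force hull test: keep every point that lies on a supporting plane.

    A plane through three points is 'supporting' if all other points lie on
    one side of it. This is O(n^4) and meant only as an illustration.
    """
    on_hull = set()
    for i, j, k in combinations(range(len(points)), 3):
        p, q, r = points[i], points[j], points[k]
        u = [q[d] - p[d] for d in range(3)]
        v = [r[d] - p[d] for d in range(3)]
        # Plane normal from the cross product u x v.
        n = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
        if all(abs(c) < eps for c in n):
            continue  # the three points are collinear, so no plane is defined
        side = [sum(n[d] * (s[d] - p[d]) for d in range(3)) for s in points]
        if all(x <= eps for x in side) or all(x >= -eps for x in side):
            on_hull.update((i, j, k))
    return sorted(on_hull)


# Toy "molecule": the same three atoms, listed in two different orders.
mol = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
swapped = [mol[1], mol[0], mol[2]]
print(rmsd(mol, mol))      # identical ordering: 0.0
print(rmsd(mol, swapped))  # same geometry, different ordering: nonzero

# Toy point cloud: 8 cube corners plus 2 interior points.
corners = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
cloud = corners + [(0.5, 0.5, 0.5), (0.25, 0.5, 0.75)]
hull = hull_vertex_indices(cloud)
print(len(cloud), len(hull))  # 10 points in the cloud, 8 on the hull
```

In this toy case the swapped ordering gives a nonzero RMSD although the geometry is unchanged, while the hull depends only on the set of points and keeps 8 of the 10.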

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7525/9192268/1892544e2fa9/d2ra00764a-f1.jpg
