Quantum Chemical Roots of Machine-Learning Molecular Similarity Descriptors.

Authors

Stefan Gugler, Markus Reiher

Affiliation

Laboratorium für Physikalische Chemie, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland.

Publication

J Chem Theory Comput. 2022 Nov 8;18(11):6670-6689. doi: 10.1021/acs.jctc.2c00718. Epub 2022 Oct 11.

DOI: 10.1021/acs.jctc.2c00718
PMID: 36218328

Abstract

In this work, we explore the quantum chemical foundations of descriptors for molecular similarity. Such descriptors are key for traversing chemical compound space with machine learning. Our focus is on the Coulomb matrix and on the smooth overlap of atomic positions (SOAP). We adopt a basic framework that allows us to connect both descriptors to electronic structure theory. This framework enables us to then define two new descriptors that are more closely related to electronic structure theory, which we call Coulomb lists and smooth overlap of electron densities (SOED). By investigating their usefulness as molecular similarity descriptors, we gain new insights into how and why Coulomb matrix and SOAP work. Moreover, Coulomb lists avoid the somewhat mysterious diagonalization step of the Coulomb matrix and might provide a direct means to extract subsystem information that can be compared across Born-Oppenheimer surfaces of varying dimension. For the electron density, we derive the necessary formalism to create the SOED measure in close analogy to SOAP. Because this formalism is more involved than that of SOAP, we review the essential theory as well as introduce a set of approximations that eventually allow us to work with SOED in terms of the same implementation available for the evaluation of SOAP. We focus our analysis on elementary reaction steps, where transition state structures are more similar to either reactant or product structures than the latter two are with respect to one another. The prediction of electronic energies of transition state structures can, however, be more difficult than that of stable intermediates due to multi-configurational effects. The question arises to what extent molecular similarity descriptors rooted in electronic structure theory can resolve these intricate effects.
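
The abstract refers to the Coulomb matrix and its diagonalization step without giving a formula. As a minimal sketch, assuming the commonly used definition (diagonal entries 0.5 Z_i^2.4, off-diagonal entries Z_i Z_j / |R_i - R_j| in atomic units) and a descending sort of the eigenvalues, the construction might look as follows. The function names `coulomb_matrix` and `eigenvalue_descriptor` are hypothetical and are not taken from the paper; the Coulomb lists and SOED descriptors proposed by the authors are not reproduced here.

```python
import numpy as np


def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix under the commonly used definition (an assumption here).

    charges: nuclear charges Z_i, length N
    coords:  Cartesian positions R_i in bohr, shape (N, 3)
    """
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    # Pairwise distances |R_i - R_j| via broadcasting.
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    # Placeholder on the diagonal so the division below never hits zero.
    np.fill_diagonal(dist, 1.0)
    m = np.outer(z, z) / dist
    # Diagonal entries: 0.5 * Z_i^2.4.
    np.fill_diagonal(m, 0.5 * z ** 2.4)
    return m


def eigenvalue_descriptor(charges, coords):
    """Eigenvalue spectrum of the Coulomb matrix, sorted in descending order."""
    eigenvalues = np.linalg.eigvalsh(coulomb_matrix(charges, coords))
    return np.sort(eigenvalues)[::-1]


if __name__ == "__main__":
    # Illustrative water-like geometry in bohr (values chosen for the example only).
    z = [8, 1, 1]
    xyz = [[0.0, 0.0, 0.0], [1.8, 0.0, 0.0], [-0.45, 1.74, 0.0]]
    print(eigenvalue_descriptor(z, xyz))
```

Relabeling the atoms permutes rows and columns of the matrix together, which leaves the eigenvalues unchanged. That invariance is the usual motivation for the diagonalization step, at the cost of discarding the per-atom information held in the individual matrix entries.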


Similar articles

1. Quantum Chemical Roots of Machine-Learning Molecular Similarity Descriptors.
   J Chem Theory Comput. 2022 Nov 8;18(11):6670-6689. doi: 10.1021/acs.jctc.2c00718. Epub 2022 Oct 11.
2. Do Machine-Learning Atomic Descriptors and Order Parameters Tell the Same Story? The Case of Liquid Water.
   J Chem Theory Comput. 2023 Jul 25;19(14):4596-4605. doi: 10.1021/acs.jctc.2c01205. Epub 2023 Mar 15.
3. Graph theoretical descriptors differentiate d-Mannose isomers in the principal component proposed feature space: A computational approach.
   Carbohydr Res. 2024 Jul;541:109147. doi: 10.1016/j.carres.2024.109147. Epub 2024 May 19.
4. Locally coupled open subsystems: A formalism for affordable electronic structure calculations featuring fractional charges and size consistency.
   J Chem Phys. 2018 Jul 21;149(3):034105. doi: 10.1063/1.5038557.
5. Analyzing the substitution effect on the CoMFA results within the framework of density functional theory (DFT).
   J Mol Model. 2016 Jul;22(7):164. doi: 10.1007/s00894-016-3036-7. Epub 2016 Jun 21.
6. Bias-Free Chemically Diverse Test Sets from Machine Learning.
   ACS Comb Sci. 2017 Aug 14;19(8):544-554. doi: 10.1021/acscombsci.7b00087. Epub 2017 Jul 27.
7. Classification of biomass reactions and predictions of reaction energies through machine learning.
   J Chem Phys. 2020 Jul 28;153(4):044126. doi: 10.1063/5.0014828.
8. Kernel Methods for Predicting Yields of Chemical Reactions.
   J Chem Inf Model. 2022 May 9;62(9):2077-2092. doi: 10.1021/acs.jcim.1c00699. Epub 2021 Oct 26.
9. Performance and Cost Assessment of Machine Learning Interatomic Potentials.
   J Phys Chem A. 2020 Jan 30;124(4):731-745. doi: 10.1021/acs.jpca.9b08723. Epub 2020 Jan 22.
10. Electronic structure evaluation through quantum chemical descriptors of 17β-aminoestrogens with an anticoagulant effect.
    Eur J Med Chem. 2011 Jun;46(6):2463-8. doi: 10.1016/j.ejmech.2011.03.032. Epub 2011 Mar 23.

Cited by

1. iSIM: instant similarity.
   Digit Discov. 2024 May 7;3(6):1160-1171. doi: 10.1039/d4dd00041b. eCollection 2024 Jun 12.
2. SPAM(a,b): Encoding the Density Information from Guess Hamiltonian in Quantum Machine Learning Representations.
   J Chem Theory Comput. 2024 Feb 13;20(3):1108-1117. doi: 10.1021/acs.jctc.3c01040. Epub 2024 Jan 16.
3. SaPt-CNN-LSTM-AR-EA: a hybrid ensemble learning framework for time series-based multivariate DNA sequence prediction.
   PeerJ. 2023 Oct 4;11:e16192. doi: 10.7717/peerj.16192. eCollection 2023.
4. Lifelong Machine Learning Potentials.
   J Chem Theory Comput. 2023 Jun 27;19(12):3509-3525. doi: 10.1021/acs.jctc.3c00279. Epub 2023 Jun 8.