Importance of Engineered and Learned Molecular Representations in Predicting Organic Reactivity, Selectivity, and Chemical Properties.

Affiliations

Department of Chemistry, Colorado State University, Fort Collins, Colorado 80523, United States.

Biosciences Center, National Renewable Energy Laboratory, 15103 Denver West Parkway, Golden, Colorado 80401, United States.

Publication information

Acc Chem Res. 2021 Feb 16;54(4):827-836. doi: 10.1021/acs.accounts.0c00745. Epub 2021 Feb 3.

DOI: 10.1021/acs.accounts.0c00745
PMID: 33534534

Abstract

Machine-readable chemical structure representations are foundational in all attempts to harness machine learning for the prediction of reactivities, selectivities, and chemical properties directly from molecular structure. The featurization of discrete chemical structures into a continuous vector space is a critical phase undertaken before model selection, and the development of new ways to quantitatively encode molecules is an active area of research. In this Account, we highlight the application and suitability of different representations, from expert-guided "engineered" descriptors to automatically "learned" features, in different prediction tasks relevant to organic and organometallic chemistry, where differing amounts of training data are available. These tasks include statistical models of stereo- and enantioselectivity, thermochemistry, and kinetics developed using experimental and quantum chemical data.

The use of expert-guided molecular descriptors provides an opportunity to incorporate chemical knowledge, domain expertise, and physical constraints into statistical modeling. In applications to stereoselective organic and organometallic catalysis, where data sets may be relatively small and 3D-geometries and conformations play an important role, mechanistically informed features can be used successfully to obtain predictive statistical models that are also chemically interpretable. We provide an overview of several recent applications of this approach to obtain quantitative models for reactivity and selectivity, where topological descriptors, quantum mechanical calculations of electronic and steric properties, along with conformational ensembles, all feature as essential ingredients of the molecular representations used.

Alternatively, more flexible, general-purpose molecular representations such as attributed molecular graphs can be used with machine learning approaches to learn the complex relationship between a structure and prediction target. This approach has the potential to out-perform more traditional representation methods such as "hand-crafted" molecular descriptors, particularly as data set sizes grow. One area where this is particularly relevant is in the use of large sets of quantum mechanical data to train quantitative structure-property relationships. A general approach toward curating useful data sets and training highly accurate graph neural network models is discussed in the context of organic bond dissociation enthalpies, where this strategy outperforms regression using precomputed descriptors.

Finally, we describe how graph neural network predictions can be incorporated into mechanistically informed statistical models of chemical reactivity and selectivity. Once trained, this approach avoids the expensive computational overhead associated with quantum mechanical calculations, while maintaining chemical interpretability. We illustrate examples for which fast predictions of bond dissociation enthalpy and of the identities of radicals formed through cleavage of a molecule's weakest bond are used in simple physical models of site-selectivity and reactivity.
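
The contrast the abstract draws between "engineered" descriptors and attributed molecular graphs can be illustrated with a small Python sketch. It is not the authors' code. The class names, the chosen atom features, the descriptor layout, and the bond enthalpy numbers are all hypothetical and chosen only for illustration. The message-passing step uses a plain neighbor sum with no learned weights, so it shows only how information moves along bonds, not a trained graph neural network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    symbol: str
    atomic_number: int
    num_hydrogens: int


@dataclass(frozen=True)
class Bond:
    begin: int  # index into the atom list
    end: int
    order: int


# Heavy-atom graph of ethanol (CH3-CH2-OH); hydrogens are folded into atom attributes.
ATOMS = [Atom("C", 6, 3), Atom("C", 6, 2), Atom("O", 8, 1)]
BONDS = [Bond(0, 1, 1), Bond(1, 2, 1)]


def engineered_descriptor(atoms, bonds):
    """Hand-chosen, fixed-length vector: [heavy atoms, heteroatoms, hydrogens, bonds]."""
    heteroatoms = sum(1 for a in atoms if a.symbol != "C")
    hydrogens = sum(a.num_hydrogens for a in atoms)
    return [len(atoms), heteroatoms, hydrogens, len(bonds)]


def message_passing_step(atoms, bonds):
    """One round of neighbor-sum aggregation on per-atom features.

    A real graph neural network would apply learned weights and nonlinearities;
    this only shows how information moves along bonds.
    """
    features = [[a.atomic_number, a.num_hydrogens] for a in atoms]
    updated = [list(f) for f in features]
    for bond in bonds:
        for src, dst in ((bond.begin, bond.end), (bond.end, bond.begin)):
            for k, value in enumerate(features[src]):
                updated[dst][k] += value
    return updated


def weakest_bond(bde_by_bond):
    """Return the bond (atom-index pair) with the lowest dissociation enthalpy."""
    return min(bde_by_bond, key=bde_by_bond.get)


# Placeholder numbers in arbitrary units, NOT computed or measured enthalpies.
PLACEHOLDER_BDE = {(0, 1): 2.0, (1, 2): 1.0}

print(engineered_descriptor(ATOMS, BONDS))
print(message_passing_step(ATOMS, BONDS))
print(weakest_bond(PLACEHOLDER_BDE))
```

In this sketch the engineered descriptor fixes in advance what is counted, whereas the graph keeps per-atom and per-bond attributes that a model could combine in whatever way the training data support. The `weakest_bond` helper stands in for the last step the abstract mentions, where predicted bond dissociation enthalpies feed a simple site-selectivity model.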

Similar articles

1
Importance of Engineered and Learned Molecular Representations in Predicting Organic Reactivity, Selectivity, and Chemical Properties.
Acc Chem Res. 2021 Feb 16;54(4):827-836. doi: 10.1021/acs.accounts.0c00745. Epub 2021 Feb 3.
2
A big data approach to the ultra-fast prediction of DFT-calculated bond energies.
J Cheminform. 2013 Jul 12;5:34. doi: 10.1186/1758-2946-5-34. eCollection 2013.
3
Molecular Machine Learning for Chemical Catalysis: Prospects and Challenges.
Acc Chem Res. 2023 Feb 7;56(3):402-412. doi: 10.1021/acs.accounts.2c00801. Epub 2023 Jan 30.
4
When Do Quantum Mechanical Descriptors Help Graph Neural Networks to Predict Chemical Properties?
J Am Chem Soc. 2024 Aug 21;146(33):23103-23120. doi: 10.1021/jacs.4c04670. Epub 2024 Aug 6.
5
Navigating Transition-Metal Chemical Space: Artificial Intelligence for First-Principles Design.
Acc Chem Res. 2021 Feb 2;54(3):532-545. doi: 10.1021/acs.accounts.0c00686. Epub 2021 Jan 22.
6
Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure-Property Relationships.
J Phys Chem A. 2017 Nov 22;121(46):8939-8954. doi: 10.1021/acs.jpca.7b08750. Epub 2017 Nov 15.
7
Predicting Energetics Materials' Crystalline Density from Chemical Structure by Machine Learning.
J Chem Inf Model. 2021 May 24;61(5):2147-2158. doi: 10.1021/acs.jcim.0c01318. Epub 2021 Apr 26.
8
Many-Body Descriptors for Predicting Molecular Properties with Machine Learning: Analysis of Pairwise and Three-Body Interactions in Molecules.
J Chem Theory Comput. 2018 Jun 12;14(6):2991-3003. doi: 10.1021/acs.jctc.8b00110. Epub 2018 May 31.
9
Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations.
Chem Sci. 2018 Nov 19;10(6):1692-1701. doi: 10.1039/c8sc04175j. eCollection 2019 Feb 14.
10
Improving VAE based molecular representations for compound property prediction.
J Cheminform. 2022 Oct 14;14(1):69. doi: 10.1186/s13321-022-00648-x.

Cited by

1
Molecular Rotors as Reactivity Probes: Predicting Electrophilicity from the Speed of Rotation.
Angew Chem Int Ed Engl. 2025 Sep 1;64(36):e202510556. doi: 10.1002/anie.202510556. Epub 2025 Jul 29.
2
AI Approaches to Homogeneous Catalysis with Transition Metal Complexes.
ACS Catal. 2025 May 14;15(11):9089-9105. doi: 10.1021/acscatal.5c01202. eCollection 2025 Jun 6.
3
Quinoline Quest: Kynurenic Acid Strategies for Next-Generation Therapeutics via Rational Drug Design.
Pharmaceuticals (Basel). 2025 Apr 22;18(5):607. doi: 10.3390/ph18050607.
4
Data-Driven Virtual Screening of Conformational Ensembles of Transition-Metal Complexes.
J Chem Theory Comput. 2025 May 27;21(10):5334-5345. doi: 10.1021/acs.jctc.5c00303. Epub 2025 May 9.
5
Machine-Learning-Based Design of Metallocene Catalysts for Controlled Olefin Copolymerization.
Chemistry. 2025 Jun 6;31(32):e202500316. doi: 10.1002/chem.202500316. Epub 2025 May 7.
6
Impact of Model Selection and Conformational Effects on the Descriptors for In Silico Screening Campaigns: A Case Study of Rh-Catalyzed Acrylate Hydrogenation.
J Phys Chem C Nanomater Interfaces. 2024 May 2;128(19):7987-7998. doi: 10.1021/acs.jpcc.4c01631. eCollection 2024 May 16.
7
The Importance of Atomic Charges for Predicting Site-Selective Ir-, Ru-, and Rh-Catalyzed C-H Borylations.
J Org Chem. 2025 May 2;90(17):6000-6012. doi: 10.1021/acs.joc.5c00343. Epub 2025 Apr 23.
8
Transfer Learning-Enabled Ligand Prediction for Ni-Catalyzed Atroposelective Suzuki-Miyaura Cross-Coupling Based on Mechanistic Similarity: Leveraging Pd Knowledge for Ni Discovery.
J Am Chem Soc. 2025 May 7;147(18):15318-15328. doi: 10.1021/jacs.5c00838. Epub 2025 Mar 28.
9
Computational tools for the prediction of site- and regioselectivity of organic reactions.
Chem Sci. 2025 Mar 4;16(13):5383-5412. doi: 10.1039/d5sc00541h. eCollection 2025 Mar 26.
10
Chemically Informed Deep Learning for Interpretable Radical Reaction Prediction.
J Chem Inf Model. 2025 Feb 10;65(3):1228-1242. doi: 10.1021/acs.jcim.4c01901. Epub 2025 Jan 28.