Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies.

Authors

Bauer Christoph A, Schneider Gisbert, Göller Andreas H

Affiliations

Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH), 8093, Zurich, Switzerland.

Bayer AG, Pharmaceuticals, R&D, 42096, Wuppertal, Germany.

Publication information

J Cheminform. 2019 Sep 11;11(1):59. doi: 10.1186/s13321-019-0381-4.

DOI:10.1186/s13321-019-0381-4

PMID:33430967

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6737620/

Abstract

We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation to the reference molecules 4-fluorophenol and acetone serve as our target values. Our acceptor and donor databases are the largest on record with 4426 and 1036 data points, respectively. After scanning over radial atomic descriptors and ML methods, our final trained HBA and HBD ML models achieve RMSEs of 3.8 kJ mol⁻¹ (acceptors) and 2.3 kJ mol⁻¹ (donors) on experimental test sets. This performance is comparable with previous models that are trained on experimental hydrogen bonding free energies, indicating that molecular QC data can serve as a substitute for experiment. The potential ramifications thereof could lead to a full replacement of wetlab chemistry for HBA/HBD strength determination by QC. As a possible chemical application of our ML models, we highlight our predicted HBA and HBD strengths as possible descriptors in two case studies on trends in intramolecular hydrogen bonding.
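The RMSE figures quoted in the abstract compare predicted free energies against experimental reference values on a held-out test set. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `rmse` and all numerical values are hypothetical and chosen purely for illustration.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical free energies in kJ/mol, invented for illustration only;
# they are not taken from the paper's data sets.
predicted_dg = [-12.0, -8.5, -20.1, -3.2]
experimental_dg = [-10.0, -9.5, -18.1, -5.2]

error = rmse(predicted_dg, experimental_dg)
print(f"RMSE = {error:.2f} kJ/mol")
```

Because the errors are squared before averaging, a few large deviations raise the RMSE more than many small ones, which is worth keeping in mind when comparing the acceptor and donor values.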
