Explainable Supervised Machine Learning Model To Predict Solvation Gibbs Energy.

Affiliations

Department of Chemistry and Biochemistry - Faculty of Sciences, University of Porto - Rua do Campo Alegre, S/N, 4169-007 Porto, Portugal.

Centre of Chemistry, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal.

Publication information

J Chem Inf Model. 2024 Apr 8;64(7):2250-2262. doi: 10.1021/acs.jcim.3c00544. Epub 2023 Aug 21.

DOI:10.1021/acs.jcim.3c00544

PMID:37603608

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11005042/

Abstract

Many challenges persist in developing accurate computational models for predicting solvation free energy (ΔG_solv). Despite recent developments in Machine Learning (ML) methodologies that have outperformed traditional quantum mechanical models, several issues remain concerning explanatory insights for broad chemical predictions with an acceptable speed-accuracy trade-off. To overcome this, we present a novel supervised ML model to predict ΔG_solv for an array of solvent-solute pairs. Using two different ensemble regressor algorithms, we made fast and accurate property predictions using open-source chemical features, encoding complex electronic, structural, and surface area descriptors for every solvent and solute. By integrating molecular properties and chemical interaction features, we analyzed individual descriptor importance and optimized our model through explanatory information from feature groups. On aqueous and organic solvent databases, the ML models revealed the predictive relevance of solutes with increasing polar surface area and decreasing polarizability, yielding better results than state-of-the-art benchmark Neural Network methods (without complex quantum mechanical or molecular dynamics simulations). Both algorithms successfully outperformed previous ΔG_solv prediction methods, with a maximum absolute error of 0.22 ± 0.02 kcal mol⁻¹, further validated on an external benchmark database and with solvent hold-out tests. These explanatory and statistical insights allow a thoughtful application of this method to the prediction of other thermodynamic properties, stressing the relevance of ML modeling for further complex computational chemistry problems.
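The solvent hold-out tests mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code. The records, the field names (`solvent`, `solute`, `dg_solv`), and the energy values are made-up placeholders, and a trivial mean-value baseline stands in for the ensemble regressors, which the abstract does not name. The mean absolute error is used only as an illustrative metric. The point is the splitting logic: every pair involving one solvent is withheld from training, so the score reflects performance on an unseen solvent.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy records: labels and energies (kcal/mol) are placeholders,
# NOT values taken from the paper or from any database.
pairs = [
    {"solvent": "water", "solute": "A", "dg_solv": -4.0},
    {"solvent": "water", "solute": "B", "dg_solv": -6.0},
    {"solvent": "ethanol", "solute": "A", "dg_solv": -3.0},
    {"solvent": "ethanol", "solute": "B", "dg_solv": -5.0},
    {"solvent": "hexane", "solute": "A", "dg_solv": -2.0},
    {"solvent": "hexane", "solute": "B", "dg_solv": -2.0},
]


def solvent_holdout_splits(records):
    """Yield (held_out_solvent, train, test), withholding one solvent at a time."""
    by_solvent = defaultdict(list)
    for rec in records:
        by_solvent[rec["solvent"]].append(rec)
    for held_out in sorted(by_solvent):
        train = [r for r in records if r["solvent"] != held_out]
        yield held_out, train, by_solvent[held_out]


def fit_mean_baseline(train):
    """Stand-in for a real regressor: always predicts the training-set mean."""
    avg = mean(r["dg_solv"] for r in train)
    return lambda rec: avg


def mean_absolute_error(model, test):
    """Average absolute difference between predictions and reference values."""
    return mean(abs(model(r) - r["dg_solv"]) for r in test)


# Score the baseline once per held-out solvent.
mae_by_solvent = {}
for held_out, train, test in solvent_holdout_splits(pairs):
    model = fit_mean_baseline(train)
    mae_by_solvent[held_out] = mean_absolute_error(model, test)

# Summarize as mean ± sample standard deviation across held-out solvents.
scores = list(mae_by_solvent.values())
summary = (mean(scores), stdev(scores))
print(f"hold-out MAE: {summary[0]:.2f} ± {summary[1]:.2f} kcal/mol")
```

In practice the baseline would be replaced by a trained regressor, and the grouping could be delegated to a library routine for group-wise cross-validation. The explicit loop is kept here so the no-leakage property is visible.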


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b365/11005042/d5c481ddabad/ci3c00544_0001.jpg
