Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment.

Authors

Bass Lewis, Elder Luke H, Folescu Dan E, Forouzesh Negin, Tolokh Igor S, Karpatne Anuj, Onufriev Alexey V

Affiliations

Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States.

Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States.

Publication information

J Chem Theory Comput. 2024 Jan 9;20(1):396-410. doi: 10.1021/acs.jctc.3c00981. Epub 2023 Dec 27.

Abstract

The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for the training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching the lower limit of about 20% relative accuracy gain compared with that of the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
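The residual-correction idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the paper uses a graph convolutional neural network trained on FreeSolv, whereas the code below substitutes a small ridge-regularized linear model and purely synthetic data. All names (`fit_residual_model`, `apply_correction`, `relative_improvement`), the three made-up descriptors, and the error magnitudes are assumptions chosen only to show the workflow: learn the difference between experiment and the physics-based prediction, add the predicted difference back as a postprocessing step, and report the relative RMSE reduction on held-out data.

```python
import numpy as np


def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


def fit_residual_model(features, physics_hfe, experiment_hfe, ridge=1e-3):
    """Fit a linear model to the residual (experiment - physics).

    Hypothetical stand-in for the paper's graph neural network.
    The ridge penalty is applied to all coefficients, including the
    intercept, purely for simplicity.
    """
    X = np.column_stack([np.ones(len(features)), features])
    residual = np.asarray(experiment_hfe) - np.asarray(physics_hfe)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ residual)


def apply_correction(weights, features, physics_hfe):
    """Postprocessing step: physics prediction plus predicted residual."""
    X = np.column_stack([np.ones(len(features)), features])
    return np.asarray(physics_hfe) + X @ weights


def relative_improvement(rmse_before, rmse_after):
    """Fractional RMSE reduction, e.g. 0.47 for a 47% improvement."""
    return (rmse_before - rmse_after) / rmse_before


# Synthetic data only (assumption): no real FreeSolv values are used here.
rng = np.random.default_rng(0)
n = 400
features = rng.normal(size=(n, 3))                    # made-up descriptors
experiment = rng.normal(loc=-4.0, scale=3.0, size=n)  # "experimental" HFEs

# The mock physics model has a systematic, feature-dependent error
# (its "blind spots") plus random noise that cannot be learned.
systematic_error = 0.8 * features[:, 0] - 0.5 * features[:, 1] + 0.3
physics = experiment + systematic_error + rng.normal(scale=0.5, size=n)

train, test = slice(0, 300), slice(300, n)
weights = fit_residual_model(features[train], physics[train], experiment[train])
corrected = apply_correction(weights, features[test], physics[test])

rmse_physics = rmse(physics[test], experiment[test])
rmse_corrected = rmse(corrected, experiment[test])
gain = relative_improvement(rmse_physics, rmse_corrected)

print(f"physics-only RMSE: {rmse_physics:.2f}")
print(f"ML-corrected RMSE: {rmse_corrected:.2f}")
print(f"relative improvement: {100 * gain:.0f}%")
```

Because the correction is added to the physics-based value rather than replacing it, the prediction falls back toward the physics model wherever the learned residual is small, which is one way to read the abstract's claim that the overall trend is preserved outside the training set.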
