Multitask Deep Ensemble Prediction of Molecular Energetics in Solution: From Quantum Mechanics to Experimental Properties.

Author information

Xia Song, Zhang Dongdong, Zhang Yingkai

Affiliations

Department of Chemistry, New York University, New York, New York 10003, United States.

Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States.

Publication information

J Chem Theory Comput. 2023 Jan 6. doi: 10.1021/acs.jctc.2c01024.

Abstract

The past few years have witnessed significant advances in machine learning methods for predicting molecular energetics, including electronic energies calculated with high-level quantum mechanical methods and experimental properties such as solvation free energy and logP. Typically, task-specific machine learning models are developed for distinct prediction tasks. In this work, we present a multitask deep ensemble model, sPhysNet-MT-ens5, which can simultaneously and accurately predict electronic energies of molecules in gas, water, and octanol phases, as well as transfer free energies at both calculated and experimental levels. On the calculated data set Frag20-solv-678k, which is developed in this work and contains 678,916 conformations of molecules with up to 20 heavy atoms, together with their properties calculated at the B3LYP/6-31G* level of theory with continuum solvent models, sPhysNet-MT-ens5 predicts density functional theory (DFT)-level electronic energies directly from force field-optimized geometries within chemical accuracy. On the experimental data sets, sPhysNet-MT-ens5 achieves state-of-the-art performance, predicting experimental hydration free energy with an RMSE of 0.620 kcal/mol on the FreeSolv data set and experimental logP with an RMSE of 0.393 on the PHYSPROP data set. Furthermore, sPhysNet-MT-ens5 provides a reasonable estimate of model uncertainty that correlates with prediction error. Finally, by analyzing the atomic contributions to its predictions, we find that the model is aware of the chemical environment of each atom, assigning atomic contributions consistent with chemical knowledge.
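The abstract does not spell out how the ensemble members are combined or how the uncertainty is derived. As a minimal sketch, assuming the common deep-ensemble recipe (the mean of the member predictions is the estimate and their standard deviation is the uncertainty), the aggregation and the RMSE metric could look like the following. The function names and all numbers are hypothetical and are not taken from the paper.

```python
import math
from statistics import fmean, pstdev


def ensemble_predict(member_predictions):
    """Return per-molecule ensemble mean and spread.

    member_predictions: one list of predictions per ensemble member,
    all of the same length (one value per molecule).
    Assumption: mean = prediction, population std = uncertainty.
    """
    per_molecule = list(zip(*member_predictions))  # regroup values by molecule
    means = [fmean(values) for values in per_molecule]
    spreads = [pstdev(values) for values in per_molecule]
    return means, spreads


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(fmean(squared_errors))


if __name__ == "__main__":
    # Made-up hydration free energies (kcal/mol) from five hypothetical
    # ensemble members for three molecules; not data from the paper.
    members = [
        [-3.1, -6.0, 0.9],
        [-2.9, -6.4, 1.1],
        [-3.0, -5.8, 1.0],
        [-3.2, -6.1, 0.7],
        [-2.8, -6.2, 1.3],
    ]
    experiment = [-3.0, -6.5, 1.2]  # made-up reference values

    means, spreads = ensemble_predict(members)
    for mean, spread in zip(means, spreads):
        print(f"prediction {mean:6.2f} +/- {spread:.2f} kcal/mol")
    print(f"RMSE vs. reference: {rmse(means, experiment):.3f} kcal/mol")
```

A larger spread marks a molecule on which the members disagree, which is the sense in which an ensemble uncertainty can be compared against the actual prediction error.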


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a85/10323048/b298eb8e3bf8/nihms-1863301-f0002.jpg
