Fast and Accurate Uncertainty Estimation in Chemical Machine Learning.

Affiliations

Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland.

Machine Learning & Optimization Laboratory, IC, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland.

Publication information

J Chem Theory Comput. 2019 Feb 12;15(2):906-915. doi: 10.1021/acs.jctc.8b00959. Epub 2019 Jan 18.

DOI:10.1021/acs.jctc.8b00959

PMID:30605342

Abstract

We present a scheme to obtain an inexpensive and reliable estimate of the uncertainty associated with the predictions of a machine-learning model of atomic and molecular properties. The scheme is based on resampling, with multiple models being generated based on subsampling of the same training data. The accuracy of the uncertainty prediction can be benchmarked by maximum likelihood estimation, which can also be used to correct for correlations between resampled models and to improve the performance of the uncertainty estimation by a cross-validation procedure. In the case of sparse Gaussian Process Regression models, this resampled estimator can be evaluated at negligible cost. We demonstrate the reliability of these estimates for the prediction of molecular and materials energetics and for the estimation of nuclear chemical shieldings in molecular crystals. Extension to estimate the uncertainty in energy differences, forces, or other correlated predictions is straightforward. This method can be easily applied to other machine-learning schemes and will be beneficial to make data-driven predictions more reliable and to facilitate training-set optimization and active-learning strategies.
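The resampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a dense kernel ridge regression model as a stand-in for the sparse Gaussian Process Regression models mentioned above, a one-dimensional toy input, and a single scaling factor `alpha` for the committee spread, obtained by maximizing a Gaussian likelihood on held-out data (which gives `alpha**2 = mean((y - mean)**2 / var)`). All function names, the kernel, and the hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Gaussian kernel between two sets of 1-D inputs (assumed kernel choice)."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)


def fit_krr(x, y, reg=1e-2):
    """Kernel ridge regression weights for the training points x."""
    k = rbf_kernel(x, x)
    return np.linalg.solve(k + reg * np.eye(len(x)), y)


def committee_predict(x_train, y_train, x_new, n_models=8, frac=0.5, seed=0):
    """Train n_models on random subsamples and return each model's predictions."""
    rng = np.random.default_rng(seed)
    n_sub = int(frac * len(x_train))
    preds = []
    for _ in range(n_models):
        # Subsample the same training set without replacement.
        idx = rng.choice(len(x_train), size=n_sub, replace=False)
        w = fit_krr(x_train[idx], y_train[idx])
        preds.append(rbf_kernel(x_new, x_train[idx]) @ w)
    return np.array(preds)  # shape (n_models, n_new)


def calibrate_alpha(preds_val, y_val):
    """Maximum-likelihood scaling of the committee spread (Gaussian assumption)."""
    mean = preds_val.mean(axis=0)
    var = preds_val.var(axis=0, ddof=1)
    return np.sqrt(np.mean((y_val - mean) ** 2 / var))


def gaussian_nll(y, mean, sigma):
    """Average negative log-likelihood of y under N(mean, sigma^2)."""
    return np.mean(0.5 * np.log(2 * np.pi * sigma**2)
                   + 0.5 * (y - mean) ** 2 / sigma**2)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.uniform(-3, 3, 120)
    y = np.sin(x) + 0.1 * rng.normal(size=120)  # synthetic toy data
    x_train, y_train, x_val, y_val = x[:80], y[:80], x[80:], y[80:]

    preds = committee_predict(x_train, y_train, x_val)
    mean, sigma = preds.mean(axis=0), preds.std(axis=0, ddof=1)
    alpha = calibrate_alpha(preds, y_val)
    print("alpha:", alpha)
    print("NLL raw:", gaussian_nll(y_val, mean, sigma))
    print("NLL calibrated:", gaussian_nll(y_val, mean, alpha * sigma))
```

In this sketch the committee mean serves as the prediction and `alpha * sigma` as the calibrated uncertainty. The scaling is fitted and evaluated on the same held-out set for brevity; the abstract instead describes a cross-validation procedure, and its estimator also corrects for correlations between resampled models, which this simplified version does not attempt.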
