Machine learning for potential energy surfaces: An extensive database and assessment of methods.

Author information

Schmitz Gunnar, Godtliebsen Ian Heide, Christiansen Ove

Affiliation

Department of Chemistry, Aarhus Universitet, DK-8000 Aarhus, Denmark.

Publication information

J Chem Phys. 2019 Jun 28;150(24):244113. doi: 10.1063/1.5100141.

Abstract

On the basis of a new extensive database constructed for the purpose, we assess various Machine Learning (ML) algorithms for predicting energies in the framework of potential energy surface (PES) construction and discuss their black-box character, robustness, and efficiency. The database for training ML algorithms in energy predictions based on the molecular structure contains SCF, RI-MP2, RI-MP2-F12, and CCSD(F12)(T) data for around 10.5 × 10⁶ configurations of 15 small molecules. The electronic energies as a function of molecular structure are computed from both static and iteratively refined grids in the context of automated PES construction for anharmonic vibrational computations within the n-mode expansion. We explore the performance of a range of algorithms including Gaussian Process Regression (GPR), Kernel Ridge Regression, Support Vector Regression, and Neural Networks (NNs). We also explore methods related to GPR such as Sparse Gaussian Process Regression, Gaussian Process Markov Chains, and Sparse Gaussian Process Markov Chains. For NNs, we report some explorations of architecture, activation functions, and numerical settings. Different delta-learning strategies are considered, and delta learning that targets CCSD(F12)(T) predictions by combining, for example, RI-MP2 energies with machine-learned CCSD(F12)(T)-RI-MP2 differences is found to be an attractive option.
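The delta-learning idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional functions `low_level` and `high_level` are hypothetical stand-ins for a cheap method (such as RI-MP2) and an expensive one (such as CCSD(F12)(T)), and the kernel width, regularization, and training grid are arbitrary assumptions. Kernel Ridge Regression, one of the methods named above, is fitted only to the difference between the two levels, and the prediction adds that learned difference back to the cheap energy.

```python
import math

def low_level(x):
    # Hypothetical cheap model: harmonic potential with force constant 1.
    return 0.5 * x * x

def high_level(x):
    # Hypothetical expensive model: Morse potential (D = 0.5, a = 1),
    # which has the same harmonic limit as low_level.
    return 0.5 * (1.0 - math.exp(-x)) ** 2

def gaussian_kernel(a, b, length=0.5):
    # Squared-exponential kernel; the length scale is an assumed value.
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_krr(xs, ys, reg=1e-8):
    # Kernel ridge regression weights: (K + reg * I) alpha = y.
    K = [[gaussian_kernel(a, b) + (reg if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return solve(K, ys)

def predict_krr(xs_train, alpha, x):
    return sum(w * gaussian_kernel(x, xt) for w, xt in zip(alpha, xs_train))

# Train on the difference between the two levels at nine grid points in [-1, 1].
train_x = [-1.0 + 0.25 * i for i in range(9)]
train_delta = [high_level(x) - low_level(x) for x in train_x]
alpha = fit_krr(train_x, train_delta)

def delta_predict(x):
    # High-level estimate = cheap energy + machine-learned correction.
    return low_level(x) + predict_krr(train_x, alpha, x)

if __name__ == "__main__":
    for x in (-0.4, 0.6):
        print(x, low_level(x), delta_predict(x), high_level(x))
```

In this toy setting the expensive function is evaluated only at the training points, while the cheap function is still evaluated at every prediction point. That trade-off is the general motivation for delta learning.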
