General Protocol for the Accurate Prediction of Molecular C/H NMR Chemical Shifts via Machine Learning Augmented DFT.

Affiliations

School of Chemistry and Molecular Bioscience, University of Wollongong, Wollongong, NSW 2500, Australia.

Physical Sciences Division, Pacific Northwest National Laboratory (PNNL), Richland, Washington 99352, United States.

Publication information

J Chem Inf Model. 2020 Aug 24;60(8):3746-3754. doi: 10.1021/acs.jcim.0c00388. Epub 2020 Jul 20.

Abstract

An accurate prediction of NMR chemical shifts at affordable computational cost is very important for different types of structural assignments in experimental studies. Density functional theory (DFT) and gauge-including atomic orbital (GIAO) are two of the most popular computational methods for NMR calculation, yet they often fail to resolve ambiguities in structural assignments. Here, we present a new method that uses machine learning (ML) techniques (DFT + ML) that significantly increases the accuracy of C/H NMR chemical shift prediction for a variety of organic molecules. The input of the generalizable DFT + ML model contains two critical parts: one is a vector providing insights into chemical environments, which can be evaluated without knowing the exact geometry of the molecule; the other one is the DFT-calculated isotropic shielding constant. The DFT + ML model was trained with a data set containing 476 C and 270 H experimental chemical shifts. For the DFT methods used here, the root mean square deviations (RMSDs) for the errors between predicted and experimental C/H chemical shifts can be as small as 2.10/0.18 ppm, which is much lower than those from simple DFT (5.54/0.25 ppm), or DFT + linear regression (LR) (4.77/0.23 ppm) approaches. It also has a smaller maximum absolute error than two previously proposed NMR-predicting ML models. The robustness of the DFT + ML model is tested on two classes of organic molecules (TIC10 and hyacinthacines), where the correct isomers were unambiguously assigned to the experimental ones. Overall, the DFT + ML model shows promise for structural assignments in a variety of systems, including stereoisomers, that are often challenging to determine experimentally.
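The two baselines named in the abstract, simple DFT and DFT + linear regression (LR), and the RMSD metric used to compare them can be illustrated with a minimal sketch. This is not the authors' code: the shielding and shift values below are made-up placeholders, the reference shielding `SIGMA_REF` is a hypothetical constant, and `build_features` only shows one plausible way to join an environment vector with a DFT shielding constant as the abstract describes. The actual descriptor and ML model are not specified here.

```python
import math

def fit_linear(sigma, delta):
    """Ordinary least-squares fit of delta ~ slope * sigma + intercept."""
    n = len(sigma)
    mean_s = sum(sigma) / n
    mean_d = sum(delta) / n
    sxx = sum((s - mean_s) ** 2 for s in sigma)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(sigma, delta))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_s
    return slope, intercept

def rmsd(pred, ref):
    """Root mean square deviation between predicted and reference shifts (ppm)."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def build_features(env_vector, sigma):
    """Hypothetical ML input: environment descriptor followed by DFT shielding."""
    return list(env_vector) + [sigma]

# Illustrative placeholder data, NOT taken from the paper.
sigma_calc = [160.2, 130.5, 55.1, 20.3, 10.8]   # DFT isotropic shieldings (ppm)
delta_exp = [25.0, 52.1, 128.4, 160.9, 171.3]   # "experimental" shifts (ppm)

# Simple DFT baseline: shift relative to an assumed reference shielding.
SIGMA_REF = 185.0  # hypothetical value, for illustration only
delta_ref = [SIGMA_REF - s for s in sigma_calc]

# DFT + LR baseline: empirical linear scaling of the shieldings.
slope, intercept = fit_linear(sigma_calc, delta_exp)
delta_lr = [slope * s + intercept for s in sigma_calc]

rmsd_ref = rmsd(delta_ref, delta_exp)
rmsd_lr = rmsd(delta_lr, delta_exp)
print(f"simple DFT RMSD: {rmsd_ref:.2f} ppm, DFT + LR RMSD: {rmsd_lr:.2f} ppm")
```

On the fitted data, the least-squares line can never have a larger RMSD than the fixed-reference conversion, because that conversion is itself a line with slope -1. The abstract's claim is that replacing the linear step with an ML model that also sees the chemical environment reduces the error further.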