College of Physical Science and Technology, Huazhong Normal University, Wuhan 430079, China.
Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China.
J Chem Inf Model. 2023 Aug 28;63(16):5097-5106. doi: 10.1021/acs.jcim.3c00892. Epub 2023 Aug 10.
Accurate determination of thermal rate constants for combustion reactions is highly challenging, both experimentally and theoretically. In recent years, machine learning has proven to be a powerful tool for predicting reaction rate constants. In this work, three supervised machine learning algorithms (XGB, FNN, and the hybrid XGB-FNN) are used to develop quantitative structure-property relationship models for estimating the rate constants of hydrogen abstraction reactions from alkanes by the free radicals CH3, H, and O. Molecular similarity based on Morgan molecular fingerprints, combined with topological indices, is proposed to represent chemical reactions in the machine learning models. Using the newly constructed descriptors, the hybrid XGB-FNN algorithm yields average deviations of 65.4%, 12.1%, and 64.5% on the prediction sets of alkanes + CH3, H, and O, respectively. This performance is comparable, and in some cases superior, to that of the corresponding models that use the activation energy as a descriptor. The use of activation energy as a descriptor has previously been shown to significantly improve prediction accuracy (2022, 322, 124150), but it typically requires cumbersome ab initio calculations. In addition, the XGB-FNN models reasonably predict the rate constants of hydrogen abstraction from different sites of alkanes and their isomers, indicating good generalization ability. The reaction descriptors proposed in this work are expected to be applicable to building machine learning models for other reactions.
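As a rough illustration of two quantities mentioned in the abstract, the sketch below computes a similarity score between two fingerprints and an average percent deviation between predicted and reference rate constants. It rests on assumptions that the abstract does not confirm: Tanimoto similarity is used as the similarity measure, fingerprints are represented as sets of on-bit indices (in practice they would come from a cheminformatics toolkit), and the average deviation is taken as the mean of |k_pred - k_ref| / k_ref expressed in percent. The function names and the sample values are hypothetical.

```python
from typing import Iterable, Sequence


def tanimoto(bits_a: Iterable[int], bits_b: Iterable[int]) -> float:
    """Tanimoto similarity of two fingerprints given as on-bit indices.

    Assumption: the abstract does not say which similarity measure is used;
    Tanimoto is a common choice for Morgan-type fingerprints.
    """
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(union)


def average_percent_deviation(predicted: Sequence[float],
                              reference: Sequence[float]) -> float:
    """Mean of |k_pred - k_ref| / |k_ref|, in percent.

    Assumption: this is one plausible reading of "average deviation";
    the abstract does not give the exact definition.
    """
    if len(predicted) != len(reference) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return 100.0 * total / len(reference)


if __name__ == "__main__":
    # Hypothetical on-bit indices for two molecules.
    fp_a = {1, 5, 9, 42}
    fp_b = {1, 5, 17}
    print("similarity:", tanimoto(fp_a, fp_b))

    # Hypothetical rate constants in arbitrary units.
    k_pred = [1.2, 0.9]
    k_ref = [1.0, 1.0]
    print("average deviation (%):", average_percent_deviation(k_pred, k_ref))
```

A relative measure is used here because rate constants span many orders of magnitude across temperatures, so an absolute error would be dominated by the largest values.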