Zhang Lei, Ye Lili, Wang Fan, Gao Wei, Yu Jinhui, Zhang Lidong
School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China.
State Key Laboratory of Fire Science, University of Science and Technology of China, Hefei, Anhui 230026, China.
J Phys Chem A. 2024 Feb 1;128(4):761-772. doi: 10.1021/acs.jpca.3c06917. Epub 2024 Jan 18.
Hydrogen abstraction reactions between hydrocarbons and hydroxyl radicals are important propagation steps in radical chain reactions and play a crucial role in atmospheric and combustion chemistry. This study uses machine learning (ML) methods to predict rate constants for a prototype of the hydrogen abstraction reaction class, namely primary allylic hydrogen abstraction from alkenes by the OH radical. Three models were employed: a feedforward neural network (FNN), support vector regression (SVR), and Gaussian process regression (GPR). We propose a feature selection strategy that combines descriptor preprocessing, a pairwise linear correlation analysis, and a model-specific Wrapper method. For each of the three models, the selected feature subset was then evaluated with two cross-validation techniques, leave-one-group-out (LOGO) and K-fold cross-validation, to assess predictive performance and stability. The results show that the FNN model, trained with seven representative descriptors, outperforms the other two models. For the FNN model, the average percentage deviation on the test set is 39.06% under LOGO cross-validation, while repeated 10-fold cross-validation gives a percentage prediction deviation of 19.1%. Two larger alkenes with 10 carbons were selected to test the trained FNN model on primary allylic hydrogen abstraction. The predicted rate constants are well described by the modified three-parameter Arrhenius equation, indicating that the FNN performs reliably in predicting hydrogen abstraction rate constants, especially for the primary allylic site. We hope this work sheds useful light on the application of ML to generating chemical kinetic parameters for hydrocarbon combustion chemistry.
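As an illustration of what it means for predicted rate constants to follow the modified three-parameter Arrhenius equation, k(T) = A T^n exp(-Ea/(RT)), the sketch below fits A, n, and Ea to a set of rate constants by linear least squares on ln k. It is not the authors' code: the function names, the temperature grid, and the parameter values used to generate the synthetic data are all hypothetical, and the normal-equation solver is chosen only to keep the example dependency-free.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def modified_arrhenius(T, A, n, Ea):
    """k(T) = A * T**n * exp(-Ea / (R*T)); Ea in J/mol, T in K."""
    return A * T**n * math.exp(-Ea / (R * T))


def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (aug[i][size] - tail) / aug[i][i]
    return x


def fit_modified_arrhenius(temps, ks):
    """Least-squares fit of ln k = ln A + n ln T - (Ea/R)(1/T).

    The 1/T column is scaled to 1000/T to keep the normal
    equations reasonably conditioned.
    """
    rows = [(1.0, math.log(T), -1000.0 / T) for T in temps]
    y = [math.log(k) for k in ks]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    ln_A, n, c = solve_linear(M, b)
    return math.exp(ln_A), n, c * 1000.0 * R  # A, n, Ea (J/mol)


# Synthetic check with hypothetical parameters (not values from the paper).
temps = [300.0 + 100.0 * i for i in range(18)]  # 300 K to 2000 K
ks = [modified_arrhenius(T, 2.0e4, 2.5, 5000.0) for T in temps]
A_fit, n_fit, Ea_fit = fit_modified_arrhenius(temps, ks)
```

In practice, `ks` would hold the model-predicted rate constants at each temperature, and the residuals of the fit in ln k would indicate how closely the predictions follow the modified Arrhenius form.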