Li Maogang, Lai Weipeng, Li Ruirui, Zhou Jiajun, Liu Yingzhe, Yu Tao, Zhang Tianlong, Tang Hongsheng, Li Hua
Key Laboratory of Synthetic and Natural Functional Molecule of the Ministry of Education, College of Chemistry & Materials Science, Northwest University, Xi'an 710127, China.
Xi'an Modern Chemistry Research Institute, Xi'an 710065, China.
ACS Omega. 2023 Jan 4;8(2):2752-2759. doi: 10.1021/acsomega.2c07436. eCollection 2023 Jan 17.
With the continued development of green chemistry, new-generation energetic materials are expected to combine higher insensitivity, higher density, and higher energy. Precise molecular design and green, efficient synthesis of energetic materials are therefore among the serious challenges facing the field. To predict the detonation performance of energetic materials accurately, an ensemble modeling strategy is proposed that combines Monte Carlo (MC), random forest (RF) improved by variable importance measurement (VIM), and quantitative structure-property relationship (QSPR); it was successfully applied to density prediction of energetic materials. First, the structures of 162 energetic compounds were optimized with Gaussian software, and molecular descriptors were calculated from the optimized structures with CODESSA software. The MCVIMRF_Med ensemble model was then built from these descriptors and the corresponding compound densities. The data set was partitioned with the sample set partitioning based on joint x-y distances (SPXY) algorithm, and MC was used to divide the calibration set further into multiple subsets for building the ensemble model. The subset size and the number of iterations of the MCVIMRF_Med ensemble model were optimized by MC cross-validation. With these parameters fixed, the final output strategy of the ensemble model was optimized: an output optimization method based on median screening is proposed and was successfully applied to improve the prediction performance of the MCVIMRF_Med ensemble model. To assess the MCVIMRF_Med ensemble model further, its performance was compared with that of partial least squares, RF, VIMRF, and MCVIMRF calibration models.
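The partitioning and ensemble steps described above can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: SPXY is implemented in its commonly published form (Kennard-Stone selection on the sum of normalized descriptor and response distances), "median screening" is read as taking the median of the submodel predictions, and a 1-nearest-neighbour learner stands in for the VIM-improved RF. All function names and default parameters are hypothetical.

```python
import math
import random
import statistics


def _euclid(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def spxy_split(X, y, n_cal):
    """Return (calibration_indices, prediction_indices) chosen by SPXY.

    Assumed form: Kennard-Stone selection on d = dx/max(dx) + dy/max(dy).
    """
    n = len(X)
    if not 2 <= n_cal <= n:
        raise ValueError("n_cal must be between 2 and len(X)")
    dx = [[_euclid(X[i], X[j]) for j in range(n)] for i in range(n)]
    dy = [[abs(y[i] - y[j]) for j in range(n)] for i in range(n)]
    max_dx = max(map(max, dx)) or 1.0  # guard against all-identical samples
    max_dy = max(map(max, dy)) or 1.0
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)]
         for i in range(n)]
    # Seed the calibration set with the two most distant samples.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    i0, j0 = max(pairs, key=lambda p: d[p[0]][p[1]])
    selected = [i0, j0]
    remaining = set(range(n)) - {i0, j0}
    # Repeatedly add the sample farthest from its nearest selected sample.
    while len(selected) < n_cal:
        nxt = max(sorted(remaining),
                  key=lambda k: min(d[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, sorted(remaining)


def nearest_neighbour_fit(X, y):
    """Hypothetical stand-in base learner (NOT the VIM-improved RF)."""
    def predict(x):
        j = min(range(len(X)), key=lambda i: _euclid(X[i], x))
        return y[j]
    return predict


def mc_median_ensemble(fit, X_cal, y_cal, X_new,
                       n_models=50, subset_frac=0.8, seed=0):
    """Fit one submodel per random calibration subset; output the median.

    Taking the plain median is an assumed reading of "median screening".
    """
    rng = random.Random(seed)
    k = max(1, int(subset_frac * len(X_cal)))
    per_model = []
    for _ in range(n_models):
        idx = rng.sample(range(len(X_cal)), k)  # MC subset, no replacement
        model = fit([X_cal[i] for i in idx], [y_cal[i] for i in idx])
        per_model.append([model(x) for x in X_new])
    # Column-wise median across submodels, one value per new sample.
    return [statistics.median(col) for col in zip(*per_model)]
```

In this sketch, `n_models` and `subset_frac` play the role of the number of iterations and the subset size, which the authors tuned by MC cross-validation; any learner exposing the same `fit(X, y) -> predict` shape could replace the stand-in.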
The results show that the MCVIMRF_Med ensemble model gives better predictions of the density of energetic materials, with a coefficient of determination (R²) of 0.9596 and an RMSECV of 0.0437 g/cm³ in cross-validation, an R² of 0.9768 and an RMSEP of 0.0578 g/cm³ for the prediction set, and a relative analysis deviation of the prediction set of 3.951. The MCVIMRF_Med ensemble modeling strategy combined with QSPR is therefore an effective approach for predicting the density of energetic materials. This work is expected to provide new research ideas and technical support for the accurate prediction of the detonation performance of energetic materials.
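For readers unfamiliar with the reported figures of merit, the following minimal sketch shows how such metrics are commonly computed. The abstract does not state the exact definitions the authors used, so the formulas here are assumptions: R² as one minus the ratio of residual to total sum of squares, and the relative analysis deviation as the sample standard deviation of the reference values divided by the root-mean-square error. The function names are hypothetical.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error; RMSECV or RMSEP depending on the data used."""
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_t = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_analysis_deviation(y_true, y_pred):
    """Assumed definition: sample SD of reference values divided by RMSE."""
    return statistics.stdev(y_true) / rmse(y_true, y_pred)
```

Under these definitions a larger relative analysis deviation means the prediction error is small compared with the natural spread of the densities, which is why it is reported alongside RMSEP.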