Zhou Xiaoke, He Sisi, Xiao Min, He Jing, Wang Yuan, Zhu Yuanqin, He Haixiang
College of Chemistry and Chemical Engineering, Guangxi Key Laboratory of Electrochemical Energy Materials, Guangxi University, Nanning, 530004, China.
State Key Laboratory of Polyolefins and Catalysis, Shanghai, 200062, China.
Mol Divers. 2025 Aug;29(4):3411-3422. doi: 10.1007/s11030-025-11147-0. Epub 2025 Mar 7.
This study systematically investigates the structure-activity relationships of 30 Ti-phenoxy-imine (FI-Ti) catalysts using machine learning (ML) approaches. Among the tested algorithms, XGBoost gave the best predictive performance, with R² values of 0.998 (training set) and 0.859 (test set) and a cross-validated Q² of 0.617. Feature importance analysis identified three composite descriptors (ODI_HOMO_1_Neg_Average GGI2, ALIEmax GATS8d, and Mol_Size_L) as critical contributors, collectively accounting for more than 63% of the model's predictive power. Polynomial feature expansion effectively captured nonlinear interactions between descriptors, while SHAP and ICE analyses enhanced interpretability, revealing threshold effects and descriptor-specific trends. However, the model's generalizability may be constrained by the limited dataset size (30 samples) and by its reliance on density functional theory (DFT)-derived descriptors, so experimental validation is needed. Additionally, the study focused solely on ethylene polymerization at 40 °C; applicability to other catalytic systems or reaction conditions requires further validation. These findings provide a data-driven framework for catalyst design, though future work should integrate experimental validation and expand the dataset to improve predictive robustness.
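As a rough illustration of two ideas named in the abstract, the sketch below shows a degree-2 polynomial feature expansion and the R² and leave-one-out Q² statistics. It is not the authors' code: the one-variable least-squares line is a toy stand-in for XGBoost, every function name is hypothetical, and Q² is computed here against the overall mean of the responses, which is one common convention among several.

```python
from itertools import combinations_with_replacement
from statistics import mean


def polynomial_expand(row, degree=2):
    """Return all monomials of the inputs from degree 1 up to `degree`."""
    expanded = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for idx in combo:
                term *= row[idx]
            expanded.append(term)
    return expanded


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (toy stand-in for the real model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def q_squared_loo(xs, ys):
    """Leave-one-out Q2: each sample is predicted by a model fit without it."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        preds.append(intercept + slope * xs[i])
    return r_squared(ys, preds)


if __name__ == "__main__":
    # Two descriptors expand to x1, x2, x1^2, x1*x2, x2^2.
    print(polynomial_expand([2.0, 3.0]))  # [2.0, 3.0, 4.0, 6.0, 9.0]

    # Made-up data, for illustration only.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.1, 1.9, 3.2, 3.8, 5.3]
    intercept, slope = fit_line(xs, ys)
    fitted = [intercept + slope * x for x in xs]
    print("R2 (fit):", r_squared(ys, fitted))
    print("Q2 (LOO):", q_squared_loo(xs, ys))
```

Because each held-out sample is predicted by a model that never saw it, Q² for a least-squares fit cannot exceed the fitted R², which is why a gap between training R² and cross-validated Q² is commonly read as a sign of limited generalizability on small datasets.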