Geng Ranran, Deng Wenjuan, Hu Zhiqiang, Wang Jianlei, Zhao Yuanyuan, Zhou Baichuan, Tian Guocai
Faculty of Metallurgical and Energy Engineering, Kunming University of Science and Technology, Kunming 650093, China.
State Key Laboratory of Complex Non-Ferrous Metal Resource Clean Utilization, Kunming University of Science and Technology, Kunming 650093, China.
Phys Chem Chem Phys. 2025 Jul 10;27(27):14482-14491. doi: 10.1039/d5cp01972a.
Reducing, converting and utilizing carbon dioxide emissions are pressing and difficult problems worldwide. As a new class of green solvents, ionic liquids (ILs) are widely used in CO₂ capture and conversion, but there are many kinds of ILs (more than 10), so selecting and screening appropriate ILs for CO₂ capture is an urgent problem. It is therefore of great significance to establish a quantitative structure-property relationship (QSPR) for ILs in CO₂ capture. From the practical standpoint of IL design and synthesis, a new functional structure descriptor (FSD) based on the group contribution (GC) method was constructed. In addition, departing from the conventional machine-learning approach of adding dimensions to improve accuracy, the feasibility of reducing dimensionality while maintaining accuracy was examined, and a dimensionless molecular descriptor, CORE, was constructed. Based on these two new molecular descriptors, the performance of six common ensemble learning models (CatBoost, LightGBM, XGBoost, GBDT, RF and AdaBoost) in predicting CO₂ solubility in ILs was evaluated. All six ensemble learning models perform well, with CatBoost performing best: the CatBoost-FSD model achieves an R² of 0.9945 and an MAE of 0.0108, while the CatBoost-CORE model achieves an R² of 0.9925 and an MAE of 0.0120. The interpretability of the CatBoost-FSD model was analyzed and its key features were identified. Based on the CORE descriptor, the best experimental conditions were obtained, and nine ILs with superior performance are recommended.
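The two metrics quoted in the abstract, the coefficient of determination (R²) and the mean absolute error (MAE), can be computed as in the following minimal sketch. It uses the standard textbook definitions only; the function names and the sample solubility values are illustrative assumptions and are not taken from the paper's data or code.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left unexplained by the model.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed vs. predicted solubilities (made-up numbers).
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.39]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MAE = {mae:.4f}, R^2 = {r2:.4f}")
```

An R² close to 1 together with a small MAE, as reported for the CatBoost models, indicates that predictions track the observed values closely both in trend and in absolute size.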