Abbas Usman L, Zhang Yuxuan, Tapia Joseph, Md Selim, Chen Jin, Shi Jian, Shao Qing
Department of Chemical and Materials Engineering, University of Kentucky, Lexington, KY 40506, USA.
Department of Biosystems and Agricultural Engineering, University of Kentucky, Lexington, KY 40506, USA.
Engineering (Beijing). 2024 Aug;39:74-83. doi: 10.1016/j.eng.2023.10.020. Epub 2024 Jul 9.
Non-ionic deep eutectic solvents (DESs) are designer solvents with applications in catalysis, extraction, carbon capture, and pharmaceuticals. However, discovering new DES candidates is challenging because efficient tools that accurately predict DES formation are lacking. The search for DESs relies heavily on intuition or trial and error, leading to low success rates and missed opportunities. Recognizing that hydrogen bonds (HBs) play a central role in DES formation, we aim to identify HB features that distinguish DES from non-DES systems and to use them to develop machine learning (ML) models for discovering new DES systems. We first analyze the HB properties of 38 known DES and 111 known non-DES systems using their molecular dynamics (MD) simulation trajectories. The analysis reveals two features that distinguish DES systems from non-DES systems: (1) a greater imbalance between the numbers of the two types of intra-component HBs and (2) more numerous and stronger inter-component HBs. Based on these results, we develop 30 ML models using ten algorithms and three types of HB-based descriptors. Model performance is first benchmarked using the average and minimum values of the area under the receiver operating characteristic curve (ROC-AUC). We also analyze the importance of individual features in the models, and the results are consistent with the simulation-based statistical analysis. Finally, we validate the models against experimental data for 34 systems. The extra trees forest model outperforms the other models in the validation, with an ROC-AUC of 0.88. Our work illustrates the importance of HBs in DES formation and shows the potential of ML for discovering new DESs.
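The benchmarking step described above (scoring each model by its average and minimum ROC-AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the model names, the use of cross-validation folds as the source of the repeated evaluations, and the toy labels and scores are all hypothetical, and the abstract does not say how the average and minimum were obtained. ROC-AUC is computed here from its pairwise definition, the probability that a randomly chosen DES system receives a higher score than a randomly chosen non-DES system.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for DES, 0 for non-DES. Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(fold_results):
    """Return (average AUC, minimum AUC) over a list of (labels, scores) folds."""
    aucs = [roc_auc(labels, scores) for labels, scores in fold_results]
    return mean(aucs), min(aucs)


def rank_models(results_by_model):
    """Sort models by (average AUC, minimum AUC), best first."""
    summary = {name: summarize(folds) for name, folds in results_by_model.items()}
    return sorted(summary.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up fold results for two hypothetical models (not data from the paper).
toy_results = {
    "model_a": [
        ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
        ([1, 0, 1, 0], [0.7, 0.6, 0.4, 0.2]),
    ],
    "model_b": [
        ([1, 1, 0, 0], [0.6, 0.4, 0.5, 0.1]),
        ([1, 0, 1, 0], [0.8, 0.3, 0.5, 0.5]),
    ],
}

ranking = rank_models(toy_results)
for name, (avg_auc, min_auc) in ranking:
    print(f"{name}: average AUC = {avg_auc:.3f}, minimum AUC = {min_auc:.3f}")
```

Reporting the minimum alongside the average penalizes a model that does well on most evaluations but fails badly on one, which is one plausible reason to benchmark on both.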