A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia.
J Chromatogr A. 2024 Aug 16;1730:465144. doi: 10.1016/j.chroma.2024.465144. Epub 2024 Jul 6.
Ionic liquids, i.e., organic salts with a low melting point, can be used as liquid stationary phases in gas chromatography. These stationary phases offer distinctive selectivity, high polarity, and thermal stability, and many previous works are devoted to them. However, sufficiently large retention data sets of structurally diverse compounds are still lacking for these phases. Consequently, very few works address quantitative structure-retention relationships (QSRR) for ionic liquid-based stationary phases. This work aims to close that gap. Three ionic liquids with substituted pyridinium cations are considered. We provide data sets of 123-158 compounds, large enough to be used in further work on QSRR and related methods. Using these data sets, we carry out a QSRR study and demonstrate the following. The retention index for a polyethylene glycol stationary phase (denoted RI_PEG), predicted using another model, can be used as a molecular descriptor, and this descriptor significantly improves the accuracy of the QSRR model. Both deep learning-based and linear models were considered for RI_PEG prediction. We show that retention indices for ionic liquid-based stationary phases can be predicted with high accuracy. Particular attention is paid to the reproducibility and reliability of the QSRR study. Adding or removing a few compounds, or otherwise slightly perturbing the data set, can considerably affect results such as descriptor importance and model accuracy. This sensitivity has to be taken into account in order to avoid misleading conclusions. For the QSRR research, we developed a software tool with a graphical user interface, which we called CHERESHNYA. It is intended for selecting molecular descriptors and constructing linear equations that connect molecular descriptors with gas chromatographic retention indices for any stationary phase.
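As an illustration only, a linear equation of the kind mentioned above can be written in the generic form below. The symbols are assumptions introduced here and are not taken from the paper: RI_IL is the retention index on the ionic liquid phase, d_i are the selected molecular descriptors (one of which may be the predicted RI_PEG), and b_0, b_i are fitted coefficients.

```latex
\mathrm{RI}_{\mathrm{IL}} = b_0 + \sum_{i=1}^{n} b_i \, d_i ,
\qquad \text{e.g. } d_1 = \mathrm{RI}_{\mathrm{PEG}}^{\mathrm{pred}}
```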
The software allows the user to generate several hundred one-dimensional and two-dimensional molecular descriptors. These include predicted retention indices for popular stationary phases such as polydimethylsiloxane and polyethylene glycol. Various methods for selecting molecular descriptors and assessing their importance are implemented, including the Boruta algorithm, partial least squares, genetic algorithms, and L1-regularized regression (LASSO). The software is free, open-source, and available online: https://github.com/mtshn/chereshnya.
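To make the ideas of L1-regularized descriptor selection and sensitivity to data-set perturbations more concrete, here is a minimal sketch in plain Python. It is not CHERESHNYA's implementation and does not use its code or data. The data are synthetic, and the descriptor names (`RI_PEG_pred`, `noise_a`, `noise_b`), the function names, and all parameter values are hypothetical choices made for illustration. The sketch fits a LASSO model by coordinate descent, then repeatedly removes a few compounds and records how often each descriptor keeps a non-zero coefficient.

```python
import random


def standardize(col):
    """Center a column and scale it to unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
    return [(v - mean) / sd for v in col]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(columns, y, lam, n_iter=100):
    """Coordinate-descent LASSO on standardized columns and centered y.

    Minimizes (1/2n) * ||y - X b||^2 + lam * ||b||_1.
    """
    n = len(y)
    X = [standardize(col) for col in columns]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    beta = [0.0] * len(X)
    for _ in range(n_iter):
        for j, xj in enumerate(X):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(x * (r + beta[j] * x) for x, r in zip(xj, resid)) / n
            new = soft_threshold(rho, lam)  # valid because mean(xj^2) == 1
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * x for r, x in zip(resid, xj)]
                beta[j] = new
    return beta


def selection_frequency(descriptors, y, lam, n_drop, n_repeats, seed):
    """Fraction of perturbed fits in which each descriptor is selected."""
    rng = random.Random(seed)
    n = len(y)
    counts = {name: 0 for name in descriptors}
    for _ in range(n_repeats):
        dropped = set(rng.sample(range(n), n_drop))  # remove a few compounds
        keep = [i for i in range(n) if i not in dropped]
        cols = [[col[i] for i in keep] for col in descriptors.values()]
        beta = lasso(cols, [y[i] for i in keep], lam)
        for name, b in zip(descriptors, beta):
            if b != 0.0:
                counts[name] += 1
    return {name: c / n_repeats for name, c in counts.items()}


# Synthetic example (assumption): the target retention index depends
# linearly on a predicted RI_PEG value plus noise; two descriptors are noise.
rng = random.Random(0)
n_compounds = 60
ri_peg = [rng.uniform(800, 2000) for _ in range(n_compounds)]
descriptors = {
    "RI_PEG_pred": ri_peg,
    "noise_a": [rng.gauss(0, 1) for _ in range(n_compounds)],
    "noise_b": [rng.gauss(0, 1) for _ in range(n_compounds)],
}
y = [1.05 * v + rng.gauss(0, 20) for v in ri_peg]

freq = selection_frequency(descriptors, y, lam=10.0, n_drop=5,
                           n_repeats=30, seed=1)
print(freq)
```

A descriptor whose selection frequency stays near 1 across perturbed data sets is a more trustworthy candidate than one that appears only in some fits. On real data the penalty `lam` would normally be tuned, for example by cross-validation, rather than fixed by hand as it is here.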