Acevedo-Martínez Jorge, Escalona-Arranz Julio Cesar, Villar-Rojas Alberto, Téllez-Palmero Franklin, Pérez-Rosés Renato, González Luis, Carrasco-Velar Ramón
Dpto. Química, Fac. Ciencias Naturales, Universidad de Oriente, Patricio Lumumba s/n, Santiago de Cuba, Cuba.
J Chromatogr A. 2006 Jan 13;1102(1-2):238-44. doi: 10.1016/j.chroma.2005.10.019. Epub 2005 Nov 8.
The Kováts retention index is one of the most widely used descriptors of the performance of organic compounds in gas chromatography (GC). The mathematical modeling of this index remains an open problem in analytical chemistry. In this paper, two models for predicting the Kováts retention index are presented. Topological, topographic and quantum-chemical descriptors were used as structural descriptors. The first model was obtained by multiple linear regression (MLR) analysis, using a forward stepwise procedure for variable selection. For the second, an ensemble of artificial neural networks (ANNs) was constructed using a pruning algorithm. Both models were validated with an external set of compounds, by the Golbraikh and Tropsha method, and by the leave-one-out (LOO) and leave-many-out (LMO) procedures. The R2, RMScv and Q2 values for the training sets were 0.884, 0.589 and 0.830 for the ANN model and 0.974, 0.417 and 0.970 for the MLR model, respectively. The robustness of both models was demonstrated. Both portray the chromatographic performance of the sample, but in this case the MLR equation gives better results than the ANN ensemble. The MLR model is recommended because of its simplicity.
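As an illustration of the validation statistics named in the abstract, the following is a minimal sketch of forward stepwise MLR scored by leave-one-out Q2. It is not the authors' implementation: the selection criterion (LOO Q2), the stopping rule, the Q2 convention (1 - PRESS/SS_tot about the full-sample mean), the function names and the synthetic data are all assumptions made for the example, and none of the numbers it produces relate to the values reported above.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict with coefficients returned by fit_mlr."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def r2(y, y_hat):
    """Coefficient of determination for fitted values."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def loo_q2(X, y):
    """Leave-one-out Q2 and cross-validated RMS error (assumed convention)."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # hold out compound i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)      # predictive residual sum of squares
    q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)
    rms_cv = np.sqrt(press / n)
    return q2, rms_cv

def forward_select(X, y, max_terms=3):
    """Greedy forward selection: add the descriptor that most improves LOO Q2."""
    selected, best_q2 = [], -np.inf
    while len(selected) < max_terms:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        scores = {j: loo_q2(X[:, selected + [j]], y)[0] for j in candidates}
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best_q2:     # stop when Q2 no longer improves
            break
        selected.append(j_best)
        best_q2 = scores[j_best]
    return selected, best_q2

# Synthetic stand-in for a descriptor matrix and retention indices.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + 0.1 * rng.normal(size=40)

selected, q2 = forward_select(X, y)
coef = fit_mlr(X[:, selected], y)
print("selected descriptors:", selected)
print("training R2:", r2(y, predict_mlr(coef, X[:, selected])))
print("LOO Q2:", q2)
```

Scoring each candidate descriptor by cross-validated Q2 rather than by training R2 is one way to keep the stepwise procedure from rewarding overfitting; an F-test or p-value criterion would be an equally plausible reading of "forward stepwise".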