Xiang Zheng, Liang Yizeng, Hu Qiannan
Research Center of Modernization of Chinese Herbal Medicines, College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, China.
Se Pu. 2005 Mar;23(2):117-22.
Quantitative structure-property relationships (QSPR) are a proven tool in chromatography and have been used to build simple models that explain and predict the chromatographic behavior of various classes of compounds. The study of quantitative structure-retention index relationships (QSRR) is an important subject in the chromatographic field. In this work, 127 topological descriptors were calculated for 207 methylalkane structures. GAPLS, a variable selection method that combines genetic algorithms (GA), backward stepwise selection, and partial least squares (PLS), was introduced for variable selection in modeling the relationship between structure and gas chromatographic (GC) retention index. GAPLS selected seven of the 127 topological descriptors to build a QSRR model with high regression quality: a squared correlation coefficient (R²) of 0.99998 and a standard deviation (S) of 2.88. The model error is similar to the experimental error. The model was validated by leave-one-out cross-validation, which indicated that it is reliable and stable, with high prediction quality: a leave-one-out squared correlation coefficient (R²cv) of 0.99997 and a standard deviation of leave-one-out predictions (Scv) of 2.95. The QSPR method thus gave a successful interpretation of the complex relationship between the GC retention indices of methylalkanes and their chemical structure. The seven variables in the model were also given a rational interpretation, which indicates that methylalkane retention indices are precisely represented by topological descriptors.
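The leave-one-out validation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: ordinary least squares stands in for PLS to keep the example dependency-free, and the function names (`solve`, `fit_ols`, `predict`, `loo_cv`) are hypothetical. The formulas used for the cross-validated statistics, 1 - PRESS/SS_tot for the squared correlation coefficient and sqrt(PRESS/n) for the standard deviation, are common conventions assumed here; the abstract does not state which definitions the paper used.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations.

    Assumption: OLS is used here in place of the PLS regression named in the abstract.
    """
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, x):
    """Predicted response for one descriptor vector x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def loo_cv(X, y):
    """Return (r2_cv, s_cv) from leave-one-out cross-validation.

    Each compound is held out once, the model is refitted on the rest,
    and the squared prediction error is accumulated into PRESS.
    """
    n = len(y)
    press = 0.0
    for i in range(n):
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_tot, math.sqrt(press / n)
```

Here `X` would be a list of descriptor vectors (one per compound, restricted to the selected descriptors) and `y` the corresponding retention indices. Refitting the model for every held-out compound keeps the sketch close to the definition of leave-one-out validation, at the cost of n separate fits.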