Riahi Siavash, Beheshti Abolghasem, Ganjali Mohammad Reza, Norouzi Parviz
Institute of Petroleum Engineering, Faculty of Engineering, University of Tehran, Tehran, Iran.
Electrophoresis. 2008 Oct;29(19):4027-35. doi: 10.1002/elps.200800038.
The migration times (MT) of a set of drugs were studied with a quantitative structure-property relationship (QSPR) approach, using new descriptors that predict MT with high accuracy. MT was modeled as a function of these theoretically derived descriptors by multiple linear regression (MLR) and partial least-squares (PLS) regression. A genetic algorithm was used to select the variables that gave the best-fitted models. To identify the descriptors most relevant to MT and to illustrate the degree to which each one affects it, linear models with 1-14 variables were constructed and compared on the basis of the F-value, the cross-validated squared correlation coefficient (Q2), the adjusted R2 (R2adj), and the standard error of estimate (S). The model with ten variables was selected as the best. For the test set, the standard deviation error was 0.559 and 0.616 and the relative error was 7.648% and 8.497% for the MLR and PLS models, respectively, confirming the good predictive ability of the model. Because the capillary lengths were not the same for all drugs in the data set, MT values were normalized to a specific capillary before modeling; this normalization is also an advantage of the method, since it allows the model to be used for different capillary lengths.
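The statistics used to compare candidate models (F-value, Q2, R2adj, and S) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `mlr_statistics`, the synthetic descriptor matrix, and the use of leave-one-out cross-validation for Q2 are all assumptions made for illustration. The genetic-algorithm selection step is not reproduced; nested descriptor subsets stand in for it.

```python
import numpy as np


def mlr_statistics(X, y):
    """Fit an ordinary least-squares model with intercept and return
    R2, adjusted R2, F-value, standard error of estimate S, and
    leave-one-out Q2 (an assumed cross-validation scheme)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    ss_res = float(resid @ resid)                 # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    dof = n - p - 1                               # residual degrees of freedom
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / dof
    f_value = ((ss_tot - ss_res) / p) / (ss_res / dof)
    s = float(np.sqrt(ss_res / dof))
    # Leave-one-out residuals from the hat matrix: e_i / (1 - h_ii)
    hat_diag = np.diag(A @ np.linalg.pinv(A))
    press = float(((resid / (1.0 - hat_diag)) ** 2).sum())
    q2 = 1.0 - press / ss_tot
    return {"R2": r2, "R2adj": r2_adj, "F": f_value, "S": s, "Q2": q2}


if __name__ == "__main__":
    # Synthetic stand-in data (hypothetical): 30 "drugs", 4 "descriptors",
    # of which only the first two actually influence the response.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 4))
    y = 1.0 + X[:, :2] @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=30)
    for k in range(1, 5):                         # models with 1..4 variables
        stats = mlr_statistics(X[:, :k], y)
        print(k, {name: round(value, 3) for name, value in stats.items()})
```

In a comparison of this kind, adding descriptors always raises R2, whereas Q2 and R2adj can level off or fall once the added variables stop carrying information. This is why they are useful for choosing the model size.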