Doroudi Zohreh, Niazi Ali
Department of Chemistry, Arak Branch, Islamic Azad University, Arak, Iran.
Department of Chemistry, Central Tehran Branch, Islamic Azad University, Tehran, Iran.
Iran J Pharm Res. 2019 Summer;18(3):1239-1252. doi: 10.22037/ijpr.2019.1100731.
A quantitative structure-activity relationship (QSAR) analysis was carried out by chemometric methods on a series of 107 anti-HIV HEPT compounds with antiviral activity. Pixels were calculated from two-dimensional images, and multivariate image analysis (MIA) was applied to QSAR modelling of the anti-HIV potential of the HEPT analogues by means of multivariate calibration methods such as principal component regression (PCR) and partial least squares (PLS). In this paper, we investigated the effect of pixel selection by genetic algorithms (GAs) on the PLS model. GAs are very useful for variable selection in modelling and calibration because the presence or absence of variables in a calibration model strongly affects the prediction ability of the model itself. The subset of pixels that resulted in a low prediction error was selected by the genetic algorithm. The resulting GA-PLS model had high statistical quality (RMSEP = 0.0423, R = 0.9412) compared with PCR (RMSEP = 0.4559, R = 0.7929) and PLS (RMSEP = 0.3275, R = 0.8427) for predicting the activity of the compounds. Because of the high correlation between predicted and experimental activities, MIA-QSAR proved to be a highly predictive approach.
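The idea of selecting a pixel subset by a genetic algorithm and scoring it by prediction error can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define RMSEP or R, so the usual definitions (root mean square error of prediction and the Pearson correlation coefficient) are assumed here. The names `rmsep`, `pearson_r`, `ga_select` and `toy_fitness`, the GA settings, and the synthetic data are all hypothetical. In particular, the scoring model is a toy stand-in (the prediction is simply the sum of the selected columns), not a PLS regression.

```python
import math
import random


def rmsep(y_true, y_pred):
    """Root mean square error of prediction (assumed standard definition)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation between experimental and predicted values."""
    mt = sum(y_true) / len(y_true)
    mp = sum(y_pred) / len(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def ga_select(n_vars, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a boolean inclusion mask that minimises `fitness`.

    Each chromosome is a list of booleans, one per variable (pixel).
    Hypothetical settings: keep the better half, one-point crossover,
    bit-flip mutation.
    """
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                # lower error is better
        elite = pop[: pop_size // 2]         # survivors are carried over unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)      # two distinct parents
            cut = rng.randrange(1, n_vars)   # one-point crossover position
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=fitness)


# Synthetic data (not from the paper): only columns 2, 5 and 7 carry signal.
data_rng = random.Random(1)
n_samples, n_vars = 30, 12
informative = {2, 5, 7}
X = [[data_rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_samples)]
y = [sum(row[j] for j in informative) for row in X]


def toy_fitness(mask):
    """Stand-in for a calibration model: predict y as the sum of kept columns."""
    pred = [sum(x for x, keep in zip(row, mask) if keep) for row in X]
    return rmsep(y, pred)


best = ga_select(n_vars, toy_fitness)
print("selected columns:", [j for j, keep in enumerate(best) if keep])
print("RMSEP of selected subset:", toy_fitness(best))
```

To mirror the study more closely, `toy_fitness` would be replaced by a function that fits a PLS model on the selected pixels and returns its prediction error on held-out compounds, while the search loop stays the same.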