Roy Kunal, Mandal Asim Sattwa
Division of Medicinal and Pharmaceutical Chemistry, Drug Theoretics and Cheminformatics Lab, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India.
J Enzyme Inhib Med Chem. 2009 Feb;24(1):205-23. doi: 10.1080/14756360802051297.
Quantitative structure-activity relationship (QSAR) studies have been performed on piperidine derivatives (n = 119) as CCR5 antagonists. The whole data set was divided into a training set (75% of the data set) and a test set (the remaining 25%) using K-means clustering. Models were developed from the training set, and their predictive potential was assessed on the test set compounds. Initially, classical QSAR models were developed from structural, spatial, electronic, physicochemical and/or topological parameters using stepwise regression, partial least squares (PLS), and factor analysis followed by multiple linear regression (FA-MLR). Using topological and structural parameters, FA-MLR provided the best equation based on internal validation (Q(2) = 0.514), but the best externally validated model was obtained with PLS (R(2)pred = 0.565). When structural, physicochemical, spatial and electronic descriptors were used, the best Q(2) value (0.562) was obtained from the stepwise regression model, whereas the best R(2)pred value (0.571) came from the PLS model. When topological descriptors were used in combination with the structural, physicochemical, spatial and electronic descriptors, the best Q(2) and R(2)pred values obtained were 0.530 (stepwise regression) and 0.580 (PLS), respectively. An attempt was made to develop 3D-QSAR models using molecular shape analysis descriptors in combination with structural, physicochemical, spatial and electronic parameters. Linear models were developed using the genetic function algorithm coupled with multiple linear regression. However, the results from the 3D-QSAR study were not superior to those of the classical QSAR models. Finally, artificial neural networks (ANN) were employed to develop nonlinear models. The ANN models showed acceptable values of the squared correlation coefficient between the observed and predicted values of the test set compounds.
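The two validation statistics quoted above can be illustrated with a minimal sketch. It assumes the conventional definitions: leave-one-out Q(2) = 1 - PRESS / SS(training), and external predictive R(2) = 1 - (sum of squared test-set errors) / (sum of squared deviations of the test-set observations from the training-set mean). The helper names (`fit_mlr`, `predict_mlr`, `q2_loo`, `r2_pred`), the synthetic descriptors, and the simple 75/25 positional split are all hypothetical; they stand in for the real descriptors and the K-means based split of the study, which are not reproduced here.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict responses for descriptor matrix X from fitted coefficients."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def q2_loo(X, y):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_total (training set)."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # drop compound i, refit, predict it
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def r2_pred(y_train, y_test, y_test_hat):
    """External predictive R^2, referenced to the training-set mean activity."""
    ss_err = np.sum((y_test - y_test_hat) ** 2)
    ss_ref = np.sum((y_test - y_train.mean()) ** 2)
    return 1.0 - ss_err / ss_ref

# Synthetic stand-in data (NOT the CCR5 data set): 40 "compounds", 3 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = 0.3 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=40)

# Plain 75/25 positional split; the study itself used K-means clustering.
X_train, X_test = X[:30], X[30:]
y_train, y_test = y[:30], y[30:]

coef = fit_mlr(X_train, y_train)
q2 = q2_loo(X_train, y_train)
r2p = r2_pred(y_train, y_test, predict_mlr(coef, X_test))
print(f"Q2 (LOO) = {q2:.3f}, R2pred = {r2p:.3f}")
```

Because R(2)pred is referenced to the training-set mean rather than the test-set mean, a model that simply predicts the training mean for every test compound scores exactly zero.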
In terms of external predictability, selected ANN models were superior to the linear QSAR models. All reported models satisfy the criteria for external validation recommended by Golbraikh and Tropsha (J Mol Graphics Mod 2002; 20: 269-276), and the majority of the models have a modified r(2) (r(2)m) value for the test set above 0.5, the threshold suggested by Roy and Roy (QSAR Comb Sci 2008; 27: 302-313).
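The external validation checks mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it takes the modified r(2) as r(2)m = r(2) * (1 - sqrt(|r(2) - r0(2)|)), where r0(2) comes from regressing observed on predicted values through the origin, and it uses the commonly quoted Golbraikh and Tropsha thresholds for the test set (r(2) > 0.6, (r(2) - r0(2)) / r(2) < 0.1, slope k between 0.85 and 1.15). The choice of axis for the through-origin regression and all function names are assumptions made for this example.

```python
import numpy as np

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    return np.corrcoef(obs, pred)[0, 1] ** 2

def r0_squared(obs, pred):
    """Through-origin regression of observed on predicted: returns (r0^2, slope k)."""
    k = np.sum(obs * pred) / np.sum(pred ** 2)
    r02 = 1.0 - np.sum((obs - k * pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
    return r02, k

def rm_squared(obs, pred):
    """Modified r^2: penalizes r^2 by its distance from the through-origin r0^2."""
    r2 = r_squared(obs, pred)
    r02, _ = r0_squared(obs, pred)
    return r2 * (1.0 - np.sqrt(abs(r2 - r02)))

def passes_golbraikh_tropsha(obs, pred):
    """Test-set part of the commonly quoted criteria (assumed thresholds)."""
    r2 = r_squared(obs, pred)
    r02, k = r0_squared(obs, pred)
    return bool(r2 > 0.6 and (r2 - r02) / r2 < 0.1 and 0.85 <= k <= 1.15)

# Hypothetical test-set activities: predictions perfectly correlated with the
# observations but shifted by a constant offset of one unit.
obs = np.array([1.0, 2.0, 3.0, 4.0])
pred_shifted = obs + 1.0

print(r_squared(obs, pred_shifted))            # correlation alone ignores the offset
print(rm_squared(obs, pred_shifted))           # r2m is lowered by the offset
print(passes_golbraikh_tropsha(obs, pred_shifted))
```

The shifted predictions show why a correlation-only statistic is not sufficient: the squared correlation stays at 1 while r(2)m and the slope criterion both register the systematic error.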