Comelli Nieves C, Duchowicz Pablo R, Castro Eduardo A
Facultad de Ciencias Agrarias, Universidad Nacional de Catamarca, Av. Belgrano y Maestro Quiroga, 4700 Catamarca, Argentina.
Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas INIFTA (UNLP, CCT La Plata-CONICET), Diag. 113 y 64, C.C. 16, Sucursal 4, 1900 La Plata, Argentina.
Eur J Pharm Sci. 2014 Oct 1;62:171-9. doi: 10.1016/j.ejps.2014.05.029. Epub 2014 Jun 6.
The inhibitory activity of 103 thiophene and 33 imidazopyridine derivatives against Polo-Like Kinase 1 (PLK1), expressed as pIC50 (-logIC50), was predicted by QSAR modeling. Multivariate linear regression (MLR) was employed to model the relationship between 0D and 3D molecular descriptors and the biological activities of the molecules, using the replacement method (MR) as the variable selection tool. The 136 compounds were separated into several training and test sets. Two splitting approaches, distribution of biological data and structural diversity, and the statistical experimental design procedure D-optimal distance were applied to the dataset. The significance of the training set models was confirmed by statistically higher values of the internal leave-one-out cross-validated coefficient of determination (Q2) and of the external predictive coefficient of determination for the test set (Rtest2). The model developed from a training set obtained with the D-optimal distance protocol, using the 3D descriptor space along with activity values, separated chemical features that made it possible to distinguish high and low pIC50 values reasonably well. We then verified that this model was sufficient to predict the activity of structurally diverse external compounds reliably and accurately. Model robustness was characterized by means of standard procedures, and the applicability domain (AD) was analyzed by the leverage method.
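The validation steps named in the abstract (an MLR fit, the leave-one-out Q2, and a leverage-based applicability domain) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the function names are hypothetical, and the warning leverage h* = 3(p+1)/n is a commonly used convention that the abstract itself does not state. It is not the authors' code or descriptor set.

```python
import numpy as np

def design(X):
    # Prepend an intercept column to the descriptor matrix.
    return np.column_stack([np.ones(len(X)), X])

def fit_mlr(X, y):
    # Ordinary least-squares fit of y on the descriptors plus intercept.
    coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return coef

def predict(coef, X):
    return design(X) @ coef

def q2_loo(X, y):
    # Leave-one-out Q2 = 1 - PRESS / total sum of squares.
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def leverages(X_train, X_query):
    # h_i = x_i^T (A^T A)^-1 x_i, with A the training design matrix.
    A, Q = design(X_train), design(X_query)
    inv = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, inv, Q)

def warning_leverage(X_train):
    # Assumed convention: h* = 3 (p + 1) / n for p descriptors, n compounds.
    n, p = X_train.shape
    return 3.0 * (p + 1) / n

# Synthetic stand-in for a training set: 40 "compounds", 3 "descriptors".
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = 6.0 + X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.1, size=40)

q2 = q2_loo(X, y)
h_star = warning_leverage(X)
h_far = leverages(X, np.array([[10.0, 10.0, 10.0]]))[0]
outside_domain = h_far > h_star  # a far-away query exceeds the warning leverage
```

Under this convention, a query compound whose leverage exceeds h* would be treated as lying outside the applicability domain, so its predicted pIC50 would be regarded as an extrapolation.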