Fernández Michael, Caballero Julio
Molecular Modeling Group, Center for Biotechnological Studies, University of Matanzas, Matanzas, Cuba.
Bioorg Med Chem. 2006 Jan 1;14(1):280-94. doi: 10.1016/j.bmc.2005.08.022. Epub 2005 Oct 3.
Artificial neural networks (ANNs) were used to model both inhibition of HIV-1 protease (K_i) and inhibition of HIV replication (IC90) for 55 cyclic urea derivatives using constitutional and 2D descriptors. As a preliminary step, linear relationships were established by multiple linear regression (MLR), with the relevant descriptors selected by genetic algorithm (GA) feature selection. For the ANN models, non-linear GA feature selection was also applied. Using four properties, non-linear modeling of K_i outperformed the linear approach in terms of both the Pearson correlation coefficient (R = 0.931 vs. 0.862) and leave-one-out (LOO) cross-validation (Q²_LOO = 0.703 vs. 0.510). IC90, by contrast, could not be modeled by a linear approach: no predictive model was obtained. A non-linear relationship was found, however, as indicated by the statistics (R = 0.891; Q²_LOO = 0.568). The best non-linear models suggested that the presence of nitrogen atoms and the molecular volume distribution in the inhibitor structures influence HIV-1 protease inhibition, and that inhibition of HIV replication depends on the occurrence of five-membered rings. Finally, the inhibitors were well distributed according to their activity levels in a Kohonen self-organizing map built from the input variables of the best non-linear models.
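The two statistics quoted in the abstract, the Pearson R of the fitted model and the leave-one-out Q², can be illustrated with a minimal sketch. It assumes the common definition Q²_LOO = 1 - PRESS / SS_total (the abstract does not state which formula the authors used), applies it to an ordinary least-squares MLR model rather than to the ANN, and runs on synthetic data: the 55 x 4 descriptor matrix, the coefficients, and the helper names `fit_mlr`, `predict_mlr`, `pearson_r` and `q2_loo` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict responses for the rows of X from fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def pearson_r(y, y_hat):
    """Pearson correlation between observed and calculated values."""
    return float(np.corrcoef(y, y_hat)[0, 1])


def q2_loo(X, y):
    """Leave-one-out Q^2, assumed here to be 1 - PRESS / SS_total."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                # hold out compound i
        coef = fit_mlr(X[keep], y[keep])        # refit on the other n - 1
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)            # predictive residual sum of squares
    ss_total = np.sum((y - y.mean()) ** 2)
    return float(1.0 - press / ss_total)


# Synthetic stand-in data: 55 "compounds", 4 "descriptors" (not the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(55, 4))
y = X @ np.array([1.0, -0.5, 0.8, 0.3]) + rng.normal(scale=0.5, size=55)

r_fit = pearson_r(y, predict_mlr(fit_mlr(X, y), X))
q2 = q2_loo(X, y)
print(f"R = {r_fit:.3f}, Q2_LOO = {q2:.3f}")
```

For a least-squares model with an intercept, Q²_LOO computed this way can never exceed R² of the fit, which is why a model can show a high R yet a much lower Q², as in the linear and non-linear comparisons reported above.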