González-Díaz Humberto, Cruz-Monteagudo Maykel, Viña Dolores, Santana Lourdes, Uriarte Eugenio, De Clercq Erik
Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782, Spain.
Bioorg Med Chem Lett. 2005 Mar 15;15(6):1651-7. doi: 10.1016/j.bmcl.2005.01.047.
The unified representation of spectral moments, classic topological indices, quadratic indices, and stochastic molecular descriptors shows that all of these molecular descriptors belong to the same family. Consequently, the same prior probability of a successful quantitative structure-activity relationship (QSAR) may be expected irrespective of which indices are selected. Herein, we used stochastic spectral moments as molecular descriptors to seek a QSAR using a database of 221 bioactive compounds previously tested against diverse RNA viruses and 402 nonactive ones. The QSAR model thus obtained correctly classifies 90.9% of the compounds in training. The model also correctly classifies 87.9% of the 207 compounds in an additional external prediction series, 73 of which have anti-RNA-virus activity and 134 of which are nonactive. In addition, all compounds were regrouped into five different subsets for leave-group-out studies: (1) anti-influenza, (2) anti-picornavirus, (3) anti-paramyxovirus, (4) anti-RSV/anti-influenza, and (5) broad-range anti-RNA-virus activity. The model retained overall accuracies of about 90% in these studies, supporting its robustness. Finally, we exemplify the practical use of the model with the discovery of compounds 124 and 128. These compounds presented MIC50 values of 3.2 and 8 µg/mL, respectively, against respiratory syncytial virus (RSV). Both compounds also have low cytotoxicity, with minimal cytotoxic concentrations above 400 µg/mL for HeLa cells. The present approach represents an effort toward the formalization and application of molecular indices in bioorganic and medicinal chemistry.
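The abstract does not define the descriptors, so the following is only a rough sketch of the general idea. It assumes that a stochastic spectral moment of order k can be illustrated as the trace of the k-th power of a row-stochastic matrix built from the molecular graph. The unweighted adjacency matrix with self-loops, and the names `row_stochastic` and `stochastic_spectral_moments`, are hypothetical choices made for illustration and are not the authors' implementation, which may weight atoms and bonds differently.

```python
from typing import List

Matrix = List[List[float]]


def row_stochastic(weights: Matrix) -> Matrix:
    """Divide each row by its sum so that every row sums to 1."""
    result = []
    for row in weights:
        total = sum(row)
        if total <= 0:
            raise ValueError("each row needs a positive sum")
        result.append([value / total for value in row])
    return result


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two square matrices of equal size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def trace(m: Matrix) -> float:
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


def stochastic_spectral_moments(weights: Matrix, max_order: int) -> List[float]:
    """Return [Tr(P^1), ..., Tr(P^max_order)] for the row-normalized matrix P."""
    p = row_stochastic(weights)
    power = p
    moments = [trace(power)]
    for _ in range(max_order - 1):
        power = matmul(power, p)
        moments.append(trace(power))
    return moments


# Hypothetical toy input: a three-atom chain (A-B-C) with self-loops.
# Assumption: every atom and bond has weight 1; real descriptors would
# likely use chemically meaningful weights instead.
chain = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
moments = stochastic_spectral_moments(chain, 3)
```

In a QSAR setting such as the one described, a vector of these moments per compound would serve as the input features for a classifier separating active from nonactive compounds.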