González-Díaz Humberto, Prado-Prado Francisco J, Santana Lourdes, Uriarte Eugenio
Department of Organic Chemistry, University of Santiago de Compostela 15782, Spain.
Bioorg Med Chem. 2006 Sep 1;14(17):5973-80. doi: 10.1016/j.bmc.2006.05.018. Epub 2006 Jun 8.
Most molecular descriptors reported to date encode only information about molecular structure. In previous papers, we extended stochastic descriptors to encode additional information such as target site, partition system, or biological species [Bioorg. Med. Chem. Lett. 2005, 15, 551; Bioorg. Med. Chem. 2005, 13, 1119]. This work develops a unified Markov model that describes, with a single linear equation, the biological activity of 74 drugs tested in the literature against fungal species drawn from a list of 87 species (491 cases in total). The data were processed by linear discriminant analysis (LDA), classifying each drug as active or non-active against each tested fungal species. The model correctly classifies 338 out of 368 active compounds (91.85%) and 89 out of 123 non-active compounds (72.36%). Overall training predictability was 86.97% (427 out of 491 cases). The model was validated by a leave-species-out (LSO) procedure. After removing, one species at a time, all drugs tested against that species, we recorded the percentage of left-out compounds that were correctly classified (LSO-predictability). We also considered the robustness of the model to the removal of these compounds (LSO-robustness), measured as the change (Δ) in the percentage of correct classification of the modified model relative to the original one. Average LSO-predictability was 86.41 ± 0.95% (average ± SD) and Δ = -0.55%, with an average of 6 drugs tested against each fungal species. Results for some of the 87 studied species were: Candida albicans, 43 tested compounds, 100% LSO-predictability, Δ = -3.49%; Candida parapsilosis, 23, 100%, Δ = -0.86%; Aspergillus fumigatus, 21, 95.20%, Δ = 0.05%; Microsporum canis, 12, 91.60%, Δ = -2.84%; Trichophyton mentagrophytes, 11, 100%, Δ = -0.51%; Cryptococcus neoformans, 10, 90%, Δ = -0.90%.
This is the first reported unified model that allows the antifungal activity of any organic compound to be predicted against a very wide diversity of fungal pathogens.
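As a reading aid, the reported classification percentages can be reproduced from the counts given in the abstract, and the leave-species-out idea can be sketched as a data split. The Python below is only an illustrative sketch: the function names (`percent`, `leave_species_out`), the record layout, and the toy cases are assumptions made for this example and are not taken from the paper, and no LDA model is fitted here.

```python
from collections import defaultdict


def percent(correct, total):
    """Percentage of correctly classified cases, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


# Counts reported in the abstract.
active_correct, active_total = 338, 368
inactive_correct, inactive_total = 89, 123

active_rate = percent(active_correct, active_total)
inactive_rate = percent(inactive_correct, inactive_total)
overall_rate = percent(active_correct + inactive_correct,
                       active_total + inactive_total)


def leave_species_out(cases):
    """Yield (species, training_cases, held_out_cases), one split per species.

    Each case is a dict with a "species" key (hypothetical layout). All cases
    tested against the held-out species are removed from the training set.
    """
    by_species = defaultdict(list)
    for case in cases:
        by_species[case["species"]].append(case)
    for species, held_out in by_species.items():
        training = [c for c in cases if c["species"] != species]
        yield species, training, held_out


# Hypothetical toy data, only to show the shape of the splits.
toy_cases = [
    {"drug": "drug_A", "species": "Candida albicans", "active": True},
    {"drug": "drug_B", "species": "Candida albicans", "active": False},
    {"drug": "drug_A", "species": "Aspergillus fumigatus", "active": True},
    {"drug": "drug_C", "species": "Aspergillus fumigatus", "active": True},
]
splits = list(leave_species_out(toy_cases))

print(active_rate, inactive_rate, overall_rate)
for species, training, held_out in splits:
    print(species, len(training), len(held_out))
```

In a full LSO run, a classifier would be refitted on each `training` set and scored on the matching `held_out` set to obtain the LSO-predictability, with Δ taken as the change in training classification relative to the model fitted on all cases.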