Identification of active molecules against Mycobacterium tuberculosis through machine learning.

Affiliations

College of Pharmaceutical Sciences at Zhejiang University, China.

Publication information

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab068.

Abstract

Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis (Mtb) and remains one of the top 10 causes of death globally. Extensively drug-resistant tuberculosis (XDR-TB), which is extensively resistant to the commonly used first-line drugs, has emerged as a major challenge to TB treatment. Hence, it is necessary to discover novel drug candidates for TB treatment. In this study, based on different types of molecular representations, four machine learning (ML) algorithms, namely support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost) and deep neural networks (DNN), were used to develop classification models that distinguish Mtb inhibitors from noninhibitors. The results demonstrate that, among the individual models, the XGBoost model exhibits the best prediction performance. Two consensus strategies were then employed to integrate the predictions from multiple models. The evaluation results show that the consensus model built by stacking the RF, XGBoost and DNN predictions offers the best predictions, with areas under the receiver operating characteristic curve (AUC) of 0.842 and 0.942 for the 10-fold cross-validated training set and the external test set, respectively. In addition, the association between the important descriptors and the bioactivities of molecules was interpreted using the Shapley additive explanations (SHAP) method. Finally, an online webserver called ChemTB (http://cadd.zju.edu.cn/chemtb/) was developed as a freely available computational tool for detecting potential Mtb inhibitors.
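
To make the reported metric concrete, the following is a minimal sketch of how an area under the receiver operating characteristic curve can be computed, and how the scores of several models might be combined into a consensus score. Everything in it is an assumption made for illustration: the labels and predicted probabilities are made-up toy values, the function name `roc_auc` is hypothetical, and the simple averaging shown here is only a stand-in for a consensus step. It is not the stacking procedure described in the abstract, which would train a second-level model on the base-model predictions.

```python
from statistics import mean


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    Equals the fraction of (positive, negative) pairs in which the positive
    receives the higher score, counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels: 1 = Mtb inhibitor, 0 = noninhibitor.
labels = [1, 1, 1, 0, 0, 0]

# Made-up predicted probabilities from three base models (not real results).
model_scores = {
    "rf":      [0.9, 0.4, 0.7, 0.5, 0.2, 0.1],
    "xgboost": [0.8, 0.6, 0.3, 0.4, 0.5, 0.2],
    "dnn":     [0.6, 0.7, 0.8, 0.3, 0.1, 0.65],
}

# Averaging consensus (illustrative stand-in, not stacking):
# mean of the base-model probabilities for each molecule.
consensus = [mean(column) for column in zip(*model_scores.values())]

for name, scores in model_scores.items():
    print(f"{name}: AUC = {roc_auc(labels, scores):.3f}")
print(f"consensus: AUC = {roc_auc(labels, consensus):.3f}")
```

In this contrived example each base model misranks at least one inhibitor/noninhibitor pair, while the averaged score ranks every toy inhibitor above every toy noninhibitor. The toy values were chosen to show the mechanics only and say nothing about the performance of the published models.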
