Multiple Machine Learning Comparisons of HIV Cell-based and Reverse Transcriptase Data Sets.

Affiliations

Collaborations Pharmaceuticals, Inc. , Main Campus Drive, Lab 3510 , Raleigh , North Carolina 27606 , United States.

The Rutgers Center for Computational and Integrative Biology , Camden , New Jersey 08102 , United States.

Publication information

Mol Pharm. 2019 Apr 1;16(4):1620-1632. doi: 10.1021/acs.molpharmaceut.8b01297. Epub 2019 Feb 26.

Abstract

The human immunodeficiency virus (HIV) causes over a million deaths every year and has a huge economic impact in many countries. The first class of drugs approved was nucleoside reverse transcriptase inhibitors. A newer generation of reverse transcriptase inhibitors has become susceptible to drug-resistant strains of HIV, and hence, alternatives are urgently needed. We have recently pioneered the use of Bayesian machine learning to generate models with public data to identify new compounds for testing against different disease targets. The current study has used the NIAID ChemDB HIV, Opportunistic Infection and Tuberculosis Therapeutics Database for machine learning studies. We curated and cleaned data from HIV-1 wild-type cell-based and reverse transcriptase (RT) DNA polymerase inhibition assays. Compounds from this database with ≤1 μM HIV-1 RT DNA polymerase activity inhibition and cell-based HIV-1 inhibition are correlated (Pearson r = 0.44, n = 1137, p < 0.0001). Models were trained using multiple machine learning approaches (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, support vector classification, k-Nearest Neighbors, and deep neural networks as well as consensus approaches) and then their predictive abilities were compared. Our comparison of different machine learning methods demonstrated that support vector classification, deep learning, and a consensus were generally comparable and not significantly different from each other using 5-fold cross validation and using 24 training and test set combinations. This study demonstrates findings in line with our previous studies for various targets that training and testing with multiple data sets does not demonstrate a significant difference between support vector machine and deep neural networks.
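
The 5-fold cross-validation comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the toy binary "fingerprints", the labels, the choice of accuracy as the metric, and all function names (`make_toy_data`, `k_fold_indices`, `cross_validate`, and so on) are assumptions made for illustration. Only two of the listed method families (Bernoulli naive Bayes and k-nearest neighbours) are shown.

```python
import math
import random


def make_toy_data(n=120, n_bits=32, seed=0):
    """Synthetic binary 'fingerprints' with labels (1 = active, 0 = inactive).

    Invented data for illustration only, not drawn from ChemDB.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = []
        for b in range(n_bits):
            if b < 10:            # bits enriched in actives
                p = 0.9 if label == 1 else 0.1
            elif b < 20:          # bits enriched in inactives
                p = 0.1 if label == 1 else 0.9
            else:                 # uninformative noise bits
                p = 0.5
            row.append(1 if rng.random() < p else 0)
        X.append(row)
        y.append(label)
    return X, y


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def bernoulli_nb_predict(X_train, y_train, x):
    """Bernoulli naive Bayes with Laplace smoothing; returns the likelier label."""
    best_label, best_score = None, -math.inf
    for label in (0, 1):
        rows = [r for r, t in zip(X_train, y_train) if t == label]
        score = math.log(len(rows) / len(X_train))  # class prior
        for b, bit in enumerate(x):
            p_on = (sum(r[b] for r in rows) + 1) / (len(rows) + 2)
            score += math.log(p_on if bit else 1.0 - p_on)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length bit vectors."""
    both = sum(1 for p, q in zip(a, b) if p and q)
    either = sum(1 for p, q in zip(a, b) if p or q)
    return both / either if either else 0.0


def knn_predict(X_train, y_train, x, k=5):
    """Majority vote among the k most Tanimoto-similar training rows."""
    nearest = sorted(zip(X_train, y_train),
                     key=lambda pair: tanimoto(pair[0], x),
                     reverse=True)[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def cross_validate(predict, X, y, k=5, seed=0):
    """Return per-fold accuracy; each fold is held out exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        X_train = [X[i] for i in range(len(X)) if i not in held_out]
        y_train = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(predict(X_train, y_train, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


X, y = make_toy_data()
results = {
    "Bernoulli naive Bayes": cross_validate(bernoulli_nb_predict, X, y),
    "k-nearest neighbours": cross_validate(knn_predict, X, y),
}
for name, scores in results.items():
    mean = sum(scores) / len(scores)
    print(f"{name}: mean accuracy {mean:.3f} over {len(scores)} folds")
```

Both models are scored on identical folds (same seed), so their per-fold scores can be compared pairwise. A real comparison would likely use an established library and several metrics rather than accuracy alone.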

Cited by

Comparing LD/LC Machine Learning Models for Multiple Species.
J Chem Health Saf. 2023 Mar 27;30(2):83-97. doi: 10.1021/acs.chas.2c00088. Epub 2023 Feb 23.
