Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery.

Affiliations

Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States.

Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, North Carolina 27599, United States.

Publication information

Mol Pharm. 2018 Oct 1;15(10):4346-4360. doi: 10.1021/acs.molpharmaceut.8b00083. Epub 2018 Apr 26.

DOI:10.1021/acs.molpharmaceut.8b00083

PMID:29672063

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC6167198/

Abstract

Tuberculosis is a major global health problem. In 2016, the WHO reported 10.4 million new cases and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens, which have identified many thousands of new compounds active in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules at activity cutoffs of 10 μM, 1 μM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics, and to generate predictions for additional molecules published in 2017. One Mtb model, a Bayesian model built on combined in vitro and in vivo data at a 100 nM activity cutoff, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017; when used to test our model, it gave comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms and found that Bayesian machine learning models constructed with literature Mtb data generated by different laboratories generally were equivalent to or outperformed deep neural networks on external test sets. Finally, we have compared our training and test sets to show that they were sufficiently diverse and distinct from one another for the test sets to serve as useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.
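
The six metrics quoted in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the counts in the usage example are hypothetical illustrations, not values or code from the paper, and the sketch assumes every row and column of the confusion matrix is non-empty (so no division by zero occurs).

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return common binary classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; assumes no row or column of the
    confusion matrix is empty, so every denominator below is non-zero.
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)      # fraction of predicted actives that are active
    recall = tp / (tp + fn)         # fraction of true actives that are recovered
    specificity = tn / (tn + fp)    # fraction of true inactives correctly rejected

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "kappa": kappa,
        "mcc": mcc,
    }


# Made-up counts, chosen only to exercise the formulas.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because precision depends on the ratio of true to false positives, it can stay low even when recall, specificity, and accuracy are all high, which is the typical pattern when actives are a small fraction of the data set. Kappa and MCC account for this imbalance, which is one reason to report them alongside accuracy.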

Similar articles

Machine Learning Models for Activity: Prediction and Target Visualization.

Mol Pharm. 2022 Feb 7;19(2):674-689. doi: 10.1021/acs.molpharmaceut.1c00791. Epub 2021 Dec 29.

Fusing dual-event data sets for Mycobacterium tuberculosis machine learning models and their evaluation.

J Chem Inf Model. 2013 Nov 25;53(11):3054-63. doi: 10.1021/ci400480s. Epub 2013 Oct 30.

Machine Learning Model Analysis and Data Visualization with Small Molecules Tested in a Mouse Model of Mycobacterium tuberculosis Infection (2014-2015).

J Chem Inf Model. 2016 Jul 25;56(7):1332-43. doi: 10.1021/acs.jcim.6b00004. Epub 2016 Jul 1.

Combining computational methods for hit to lead optimization in Mycobacterium tuberculosis drug discovery.

Pharm Res. 2014 Feb;31(2):414-35. doi: 10.1007/s11095-013-1172-7. Epub 2013 Oct 17.

Identification of active molecules against Mycobacterium tuberculosis through machine learning.

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab068.

Enhancing hit identification in Mycobacterium tuberculosis drug discovery using validated dual-event Bayesian models.

PLoS One. 2013 May 7;8(5):e63240. doi: 10.1371/journal.pone.0063240. Print 2013.

Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis.

J Chem Inf Model. 2014 Jul 28;54(7):2157-65. doi: 10.1021/ci500264r. Epub 2014 Jul 17.

Comparing Multiple Machine Learning Algorithms and Metrics for Estrogen Receptor Binding Prediction.

Mol Pharm. 2018 Oct 1;15(10):4361-4370. doi: 10.1021/acs.molpharmaceut.8b00546. Epub 2018 Aug 28.

Prediction of Mycobacterium tuberculosis cell wall permeability using machine learning methods.

Mol Divers. 2024 Aug;28(4):2317-2329. doi: 10.1007/s11030-024-10952-3. Epub 2024 Aug 12.

Cited by

Machine Learning in Tuberculosis Research: A Global Bibliometric Analysis of Diagnostic, Prognostic, and Drug Discovery Trends.

Ther Innov Regul Sci. 2025 Aug 21. doi: 10.1007/s43441-025-00866-z.

Computational Approaches for Predicting Drug Interactions with Human Organic Anion Transporter 4 (OAT4).

Mol Pharm. 2025 Apr 7;22(4):1847-1858. doi: 10.1021/acs.molpharmaceut.4c00984. Epub 2025 Mar 20.

Sulfate Ester Dioxygenase Rv3406 Is Able to Inactivate the RCB18350 Compound.

ACS Infect Dis. 2025 Apr 11;11(4):986-997. doi: 10.1021/acsinfecdis.4c01030. Epub 2025 Mar 20.

Machine learning-enabled virtual screening indicates the anti-tuberculosis activity of aldoxorubicin and quarfloxin with verification by molecular docking, molecular dynamics simulations, and biological evaluations.

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae696.

Discovery of Dual Targeting GSK-3β/HIV-1 Reverse Transcriptase Inhibitors as Neuroprotective Antiviral Agents.

ACS Chem Neurosci. 2025 Jan 1;16(1):77-84. doi: 10.1021/acschemneuro.4c00725. Epub 2024 Dec 11.

TamGen: drug design with target-aware molecule generation through a chemical language model.

Nat Commun. 2024 Oct 29;15(1):9360. doi: 10.1038/s41467-024-53632-4.

Predicting the Hallucinogenic Potential of Molecules Using Artificial Intelligence.

ACS Chem Neurosci. 2024 Aug 21;15(16):3078-3089. doi: 10.1021/acschemneuro.4c00405. Epub 2024 Aug 2.

Near-Term Quantum Classification Algorithms Applied to Antimalarial Drug Discovery.

J Chem Inf Model. 2024 Aug 12;64(15):5922-5930. doi: 10.1021/acs.jcim.4c00953. Epub 2024 Jul 16.

Identification of New Modulators and Inhibitors of Palmitoyl-Protein Thioesterase 1 for CLN1 Batten Disease and Cancer.

ACS Omega. 2024 Feb 28;9(10):11870-11882. doi: 10.1021/acsomega.3c09607. eCollection 2024 Mar 12.

Computational drug repositioning identifies niclosamide and tribromsalan as inhibitors of Mycobacterium tuberculosis and Mycobacterium abscessus.

Tuberculosis (Edinb). 2024 May;146:102500. doi: 10.1016/j.tube.2024.102500. Epub 2024 Feb 27.
