Accurate and rapid prediction of tuberculosis drug resistance from genome sequence data using traditional machine learning algorithms and CNN.

Affiliations

Center for Translational Data Science, The University of Chicago, Chicago, IL, 60615, USA.

Department of Medicine, The University of Chicago, Chicago, IL, 60637, USA.

Publication information

Sci Rep. 2022 Feb 14;12(1):2427. doi: 10.1038/s41598-022-06449-4.

DOI:10.1038/s41598-022-06449-4
PMID:35165358
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8844416/
Abstract

Effective and timely antibiotic treatment depends on accurate and rapid in silico antimicrobial-resistant (AMR) predictions. Existing statistical rule-based Mycobacterium tuberculosis (MTB) drug resistance prediction methods using bacterial genomic sequencing data often achieve varying results: high accuracy on some antibiotics but relatively low accuracy on others. Traditional machine learning (ML) approaches have been applied to classify drug resistance for MTB and have shown more stable performance. However, there is no study that uses deep learning architecture like Convolutional Neural Network (CNN) on a large and diverse cohort of MTB samples for AMR prediction. We developed 24 binary classifiers of MTB drug resistance status across eight anti-MTB drugs and three different ML algorithms: logistic regression, random forest and 1D CNN using a training dataset of 10,575 MTB isolates collected from 16 countries across six continents, where an extended pan-genome reference was used for detecting genetic features. Our 1D CNN architecture was designed to integrate both sequential and non-sequential features. In terms of F1-scores, 1D CNN models are our best classifiers that are also more accurate and stable than the state-of-the-art rule-based tool Mykrobe predictor (81.1 to 93.8%, 93.7 to 96.2%, 93.1 to 94.8%, 95.9 to 97.2% and 97.1 to 98.2% for ethambutol, rifampicin, pyrazinamide, isoniazid and ofloxacin respectively). We applied filter-based feature selection to find AMR relevant features. All selected variant features are AMR-related ones in CARD database. 78.8% of them are also in the catalogue of MTB mutations that were recently identified as drug resistance-associated ones by WHO. To facilitate ML model development for AMR prediction, we packaged every step into an automated pipeline and shared the source code at https://github.com/KuangXY3/MTB-AMR-classification-CNN .
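
The F1-score metric and the filter-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which filter statistic was used, so the chi-square score below is an assumption. The toy matrix `X_toy`, the labels `y_toy`, and all helper names are hypothetical and are not taken from the authors' pipeline.

```python
from typing import List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (1 = resistant, 0 = susceptible)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def chi2_score(feature: Sequence[int], labels: Sequence[int]) -> float:
    """Chi-square statistic of the 2x2 table:
    variant present/absent vs. resistant/susceptible.
    Assumed filter statistic; the source does not name one."""
    n = len(labels)
    a = sum(1 for f, y in zip(feature, labels) if f == 1 and y == 1)
    b = sum(1 for f, y in zip(feature, labels) if f == 1 and y == 0)
    c = sum(1 for f, y in zip(feature, labels) if f == 0 and y == 1)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # constant feature or single-class labels carry no signal
    return n * (a * d - b * c) ** 2 / denom


def select_top_k(X: Sequence[Sequence[int]], y: Sequence[int], k: int) -> List[int]:
    """Rank binary variant features by chi-square score; return top-k column indices."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: 6 isolates (rows) x 3 variant features (columns).
X_toy = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
y_toy = [1, 1, 1, 0, 0, 0]  # 1 = resistant to one drug, 0 = susceptible

top_features = select_top_k(X_toy, y_toy, k=1)
toy_f1 = f1_score(y_toy, [1, 1, 0, 0, 0, 1])
```

In this toy setup, one such selector and one classifier would be fitted per drug, which matches the abstract's framing of resistance prediction as a set of independent binary classification problems.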

Figures (hosted on PMC)

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1efe/8844416/5905bf6dd1bd/41598_2022_6449_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1efe/8844416/1d6d037e7297/41598_2022_6449_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1efe/8844416/08b5e37cf9bd/41598_2022_6449_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1efe/8844416/744f52cae7b8/41598_2022_6449_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1efe/8844416/f8232dce1121/41598_2022_6449_Fig5_HTML.jpg
