Enhancing predictions of antimicrobial resistance of pathogens by expanding the potential resistance gene repertoire using a pan-genome-based feature selection approach.

Authors

Yang Ming-Ren, Wu Yu-Wei

Affiliations

Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 250 Wuxing St., Sinyi District, Taipei, 11031, Taiwan.

Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, 106, Taiwan.

Publication information

BMC Bioinformatics. 2022 Apr 15;23(Suppl 4):131. doi: 10.1186/s12859-022-04666-2.

DOI:10.1186/s12859-022-04666-2
PMID:35428201
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9011928/
Abstract

BACKGROUND

Predicting which pathogens might exhibit antimicrobial resistance (AMR) based on genomics data is one of the promising ways to swiftly and precisely identify AMR pathogens. Currently, the most widely used genomics approach is through identifying known AMR genes from genomic information in order to predict whether a pathogen might be resistant to certain antibiotic drugs. The list of known AMR genes, however, is still far from comprehensive and may result in inaccurate AMR pathogen predictions. We thus felt the need to expand the AMR gene set and proposed a pan-genome-based feature selection method to identify potential gene sets for AMR prediction purposes.

RESULTS

By building pan-genome datasets and extracting gene presence/absence patterns from four bacterial species, each with more than 2000 strains, we showed that machine learning models built from pan-genome data can be very promising for predicting AMR pathogens. The gene set selected by the eXtreme Gradient Boosting (XGBoost) feature selection approach further improved prediction outcomes, and an incremental approach selecting subsets of XGBoost-selected features brought the machine learning model performance to the next level. Investigating selected gene sets revealed that on average about 50% of genes had no known function and very few of them were known AMR genes, indicating the potential of the selected gene sets to expand resistance gene repertoires.

CONCLUSIONS

We demonstrated that a pan-genome-based feature selection approach is suitable for building machine learning models for predicting AMR pathogens. The extracted gene sets may provide future clues to expand our knowledge of known AMR genes and provide novel hypotheses for inferring bacterial AMR mechanisms.
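
The workflow summarized in the abstract (gene presence/absence patterns, a ranked gene list, then incremental evaluation of top-ranked subsets) can be illustrated with a minimal, standard-library-only Python sketch. This is not the authors' implementation. Two substitutions are made so that it runs without extra packages: genes are ranked by the difference in presence frequency between resistant and susceptible strains rather than by XGBoost feature importances, and subsets are scored with a leave-one-out 1-nearest-neighbor classifier rather than the models used in the paper. The strain names, gene names, and phenotype labels are hypothetical toy data.

```python
from typing import Dict, List, Sequence, Set, Tuple


def presence_absence(strain_genes: Dict[str, Set[str]]) -> Tuple[List[str], List[str], List[List[int]]]:
    """Build a strains x genes 0/1 matrix from per-strain gene sets."""
    strains = sorted(strain_genes)
    genes = sorted(set().union(*strain_genes.values()))
    matrix = [[1 if g in strain_genes[s] else 0 for g in genes] for s in strains]
    return strains, genes, matrix


def rank_genes(matrix: Sequence[Sequence[int]], labels: Sequence[int]) -> List[int]:
    """Rank gene columns by |P(present | resistant) - P(present | susceptible)|.

    Assumption: this simple score stands in for XGBoost feature importances.
    """
    res = [row for row, y in zip(matrix, labels) if y == 1]
    sus = [row for row, y in zip(matrix, labels) if y == 0]

    def score(j: int) -> float:
        return abs(sum(r[j] for r in res) / len(res) - sum(r[j] for r in sus) / len(sus))

    # Highest score first; ties are broken by column index for determinism.
    return sorted(range(len(matrix[0])), key=lambda j: (-score(j), j))


def loo_accuracy(matrix: Sequence[Sequence[int]], labels: Sequence[int], cols: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier (Hamming distance)."""
    correct = 0
    for i, row in enumerate(matrix):
        nearest = min(
            (k for k in range(len(matrix)) if k != i),
            key=lambda k: (sum(row[j] != matrix[k][j] for j in cols), k),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(matrix)


def incremental_select(matrix, labels, ranked: Sequence[int], max_k: int) -> Tuple[List[int], float]:
    """Evaluate the top-1, top-2, ... top-max_k genes; keep the smallest best subset."""
    best_k, best_acc = 0, -1.0
    for k in range(1, max_k + 1):
        acc = loo_accuracy(matrix, labels, ranked[:k])
        if acc > best_acc:  # strict '>' keeps the smaller subset on ties
            best_k, best_acc = k, acc
    return list(ranked[:best_k]), best_acc


# Hypothetical toy pan-genome: gene clusters observed in each strain.
strain_genes = {
    "s1": {"core", "geneA", "geneB"},
    "s2": {"core", "geneA"},
    "s3": {"core", "geneA", "geneC"},
    "s4": {"core", "geneB"},
    "s5": {"core"},
    "s6": {"core", "geneC"},
}
# Hypothetical phenotypes: 1 = resistant, 0 = susceptible.
phenotype = {"s1": 1, "s2": 1, "s3": 1, "s4": 0, "s5": 0, "s6": 0}

strains, genes, matrix = presence_absence(strain_genes)
labels = [phenotype[s] for s in strains]
ranked = rank_genes(matrix, labels)
selected_cols, best_acc = incremental_select(matrix, labels, ranked, max_k=len(genes))
selected_genes = [genes[j] for j in selected_cols]
print(selected_genes, best_acc)
```

In this toy example only `geneA` separates the two phenotypes, so the incremental step stops at a single-gene subset. With real data, the ranking step would more plausibly come from a trained gradient-boosting model, and the evaluation from held-out or cross-validated predictions.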
