Xie Fei, Wang Lifeng, Li Song, Hu Long, Wen Yanhua, Li Xuming, Ye Kun, Duan Zhimei, Wang Qi, Guan Yuanlin, Zhang Ye, Shi Qiqi, Yang Jiyong, Xia Han, Xie Lixin
College of Pulmonary and Critical Care Medicine, Chinese PLA General Hospital, Beijing, China.
Laboratory Medicine Department, First Medical Center of Chinese PLA General Hospital, Beijing, China.
mBio. 2025 Mar 12;16(3):e0285224. doi: 10.1128/mbio.02852-24. Epub 2025 Feb 20.
The organism studied here, a prominent nosocomial pathogen renowned for its extensive resistance to antimicrobial agents, poses a significant challenge for the accurate prediction of antimicrobial resistance (AMR) from genomic data. Despite extensive research on the molecular mechanisms of AMR, gaps remain in our understanding of key contributors. This study used a rule-based method and three machine learning models to predict AMR phenotypes, aiming to identify key genomic factors associated with AMR. Genomes and antibiotic resistance phenotypes from 1,012 public isolates were used for model construction and training. To validate the models, a data set of 164 self-collected strains underwent next-generation sequencing, nanopore long-read sequencing, and antimicrobial susceptibility testing using the broth dilution method. The presence of antibiotic resistance genes (ARGs) alone was insufficient to accurately predict the AMR phenotype for the majority of antibiotics (90%, 18 out of 20) in the public data set. In contrast, combining ARGs with insertion sequence (IS) elements significantly enhanced predictive performance. The Random Forest model outperformed the support vector machine (SVM), the logistic regression model, and the rule-based method across all 20 antibiotics, with accuracies ranging from 83.80% to 97.70%. In the validation data set, even higher accuracies were achieved, ranging from 85.63% to 99.31%. Furthermore, conserved sequence patterns between IS elements and ARGs were validated using self-collected long-read sequencing data, substantially enhancing the accuracy of AMR prediction in this pathogen. This study underscores the pivotal role of IS elements in AMR.
The interplay between insertion sequences (ISs) and antibiotic resistance genes (ARGs) in this pathogen contributes to resistance against specific antibiotics. Conventionally, genetic variations and ARGs have been used to predict resistance phenotypes, while the potentially pivotal role of IS elements has been largely overlooked. Our study advances this approach by integrating both rule-based and machine learning models to predict AMR in this pathogen. Including IS elements significantly enhances the accuracy of AMR prediction, emphasizing their pivotal function in antibiotic resistance. Notably, we uncover a series of conserved sequence patterns linking IS elements and ARGs, which outperform ARGs alone in phenotypic prediction. Our findings are crucial for bioinformatics strategies aimed at studying and tracking AMR, offering novel insights into combating the escalating AMR challenge.
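The contrast between predicting from ARGs alone and predicting from ARGs combined with IS elements can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The feature names `argA` and `IS1`, the isolates, and both rules are hypothetical and made up for illustration. Simple co-presence of the two features stands in for the conserved IS-ARG sequence patterns described above, and the accuracies it produces say nothing about the study's results.

```python
from typing import Callable, List, NamedTuple


class Isolate(NamedTuple):
    features: frozenset  # genomic features detected in the isolate
    resistant: bool      # phenotype from susceptibility testing


# Toy, made-up isolates; "argA" and "IS1" are hypothetical feature names.
ISOLATES: List[Isolate] = [
    Isolate(frozenset({"argA", "IS1"}), True),
    Isolate(frozenset({"argA", "IS1"}), True),
    Isolate(frozenset({"argA"}), False),  # gene present, isolate susceptible
    Isolate(frozenset({"argA"}), False),
    Isolate(frozenset({"IS1"}), False),
    Isolate(frozenset(), False),
]


def predict_arg_only(isolate: Isolate, arg: str = "argA") -> bool:
    """Rule 1: call the isolate resistant whenever the ARG is present."""
    return arg in isolate.features


def predict_arg_with_is(
    isolate: Isolate, arg: str = "argA", is_element: str = "IS1"
) -> bool:
    """Rule 2: call the isolate resistant only when the ARG and the IS
    element are both present (a simplification of an IS-ARG pattern)."""
    return arg in isolate.features and is_element in isolate.features


def accuracy(predict: Callable[[Isolate], bool], isolates: List[Isolate]) -> float:
    """Fraction of isolates whose predicted phenotype matches the observed one."""
    correct = sum(predict(iso) == iso.resistant for iso in isolates)
    return correct / len(isolates)


if __name__ == "__main__":
    print("ARG only:", accuracy(predict_arg_only, ISOLATES))
    print("ARG + IS:", accuracy(predict_arg_with_is, ISOLATES))
```

On this toy data the ARG-only rule misclassifies the two susceptible isolates that carry the gene without the IS element, while the combined rule classifies them correctly. A machine learning model would instead take both kinds of features as inputs and learn such interactions from training data.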