Astudillo César A, López-Cortés Xaviera A, Ocque Elias, Manríquez-Troncoso José M
Computer Science Department, Engineering Faculty, Universidad de Talca, Talca, Chile.
Department of Computer Sciences and Industries, Universidad Católica del Maule, Talca, Chile.
Sci Rep. 2024 Dec 28;14(1):31283. doi: 10.1038/s41598-024-82697-w.
Antimicrobial resistance (AMR) poses a significant global health challenge, necessitating advanced predictive models to support clinical decision-making. In this study, we explore multi-label classification as a novel approach to predict antibiotic resistance across four clinically relevant bacteria: E. coli, S. aureus, K. pneumoniae, and P. aeruginosa. Using multiple datasets from the DRIAMS repository, we evaluated the performance of four algorithms (Multi-Layer Perceptron, Support Vector Classifier, Random Forest, and Extreme Gradient Boosting) under both single-label and multi-label frameworks. Our results demonstrate that the multi-label approach delivers competitive performance compared to traditional single-label models, with no statistically significant differences in most cases. The multi-label framework naturally captures the complex, interconnected nature of AMR data, reflecting real-world scenarios more accurately. We further validated the models on external datasets (DRIAMS B and C), confirming their generalizability and robustness. Additionally, we investigated the impact of oversampling techniques and provided a reproducible methodology for handling MALDI-TOF data, ensuring scalability for future studies. These findings underscore the potential of multi-label classification to enhance predictive accuracy in AMR research, offering valuable insights for developing diagnostic tools and guiding clinical interventions.
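The difference between the single-label and multi-label framings described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic toy vectors rather than MALDI-TOF spectra from DRIAMS, the labels `drug_A`, `drug_B`, and `drug_C` are hypothetical, and a nearest-centroid classifier stands in for the four algorithms named in the abstract. The multi-label variant uses label powerset (treating each full resistance profile as one class), which is only one of several multi-label strategies and is not confirmed to be the one used in the study.

```python
import random

DRUGS = ["drug_A", "drug_B", "drug_C"]  # hypothetical antibiotic labels


def make_isolate(rng):
    """Toy isolate: (features, labels); labels[j] == 1 means resistant to DRUGS[j]."""
    a = int(rng.random() < 0.5)
    b = int(rng.random() < (0.8 if a else 0.2))  # resistance to B is correlated with A
    c = int(rng.random() < 0.3)
    labels = (a, b, c)
    # Each resistance shifts one feature; Gaussian noise stands in for spectral variation.
    features = [lab * 2.0 + rng.gauss(0.0, 1.0) for lab in labels]
    return features, labels


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit_nearest_centroid(X, y):
    """Return {class: centroid}; works for any hashable class labels."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {label: centroid(rows) for label, rows in groups.items()}


def predict_nearest_centroid(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda label: sq_dist(model[label], x))


def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label decisions that are wrong."""
    wrong = sum(t != p for yt, yp in zip(Y_true, Y_pred) for t, p in zip(yt, yp))
    return wrong / (len(Y_true) * len(Y_true[0]))


rng = random.Random(0)
data = [make_isolate(rng) for _ in range(600)]
X_train, Y_train = zip(*data[:400])
X_test, Y_test = zip(*data[400:])

# Single-label framing: one independent binary model per drug.
single_models = [
    fit_nearest_centroid(X_train, [y[j] for y in Y_train])
    for j in range(len(DRUGS))
]
Y_single = [
    tuple(predict_nearest_centroid(m, x) for m in single_models) for x in X_test
]

# Multi-label framing: one model predicts the whole resistance profile at once
# (label powerset: each observed label combination is treated as a class).
multi_model = fit_nearest_centroid(X_train, Y_train)
Y_multi = [predict_nearest_centroid(multi_model, x) for x in X_test]

print("single-label Hamming loss:", round(hamming_loss(Y_test, Y_single), 3))
print("multi-label  Hamming loss:", round(hamming_loss(Y_test, Y_multi), 3))
```

In this sketch the single-label route trains three separate models and cannot use the dependence between `drug_A` and `drug_B`, whereas the multi-label route learns one centroid per observed resistance profile. Hamming loss is used only as a simple per-label error measure; the abstract does not state which metrics the study reports.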