A large-scale investigation and identification of methicillin-resistant Staphylococcus aureus based on peaks binning of matrix-assisted laser desorption ionization-time of flight MS spectra.

Affiliations

Department of Laboratory Medicine, Chang Gung Memorial Hospital at Linkou, Taoyuan City, Taiwan.

Department of Computer Science and Information Engineering, National Central University.

Publication information

Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa138.

Abstract

Recent studies have demonstrated that matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) can be used to detect superbugs such as methicillin-resistant Staphylococcus aureus (MRSA). Owing to an increasing clinical need to distinguish MRSA from methicillin-sensitive Staphylococcus aureus (MSSA) efficiently and effectively, we were motivated to develop a systematic pipeline based on a large-scale dataset of MS spectra. However, peak shifting in MS spectra reduces the effectiveness of classification between MRSA and MSSA isolates. Unlike previous works that emphasize specific peaks, this study employs a binning method to cluster shifted MS ions into several representative peaks. A variety of bin sizes were evaluated to coalesce drifted or shifted MS peaks into well-defined structured data. Various machine learning methods were then applied to classify MRSA and MSSA samples. In total, 4858 MS spectra of unique S. aureus isolates, comprising 2500 MRSA and 2358 MSSA instances, were collected by the Linkou and Kaohsiung branches of Chang Gung Memorial Hospital, Taiwan. Based on Pearson correlation coefficients and a forward feature selection strategy, a total of 200 peaks (with a bin size of 10 Da) were identified as marker attributes for constructing predictive models. These selected peaks, such as bins 2410-2419, 2450-2459 and 6590-6599 Da, showed remarkable differences between MRSA and MSSA and were effective in predicting MRSA. Independent testing revealed that the random forest model provides a promising prediction, with an area under the receiver operating characteristic curve (AUC) of 0.8450. Compared with previous works conducted with hundreds of MS spectra, the proposed scheme demonstrates that combining machine learning methods with a large-scale dataset of clinical MS spectra may be a feasible means for clinical physicians to administer the correct antibiotics with a shorter turnaround time, which could in the future reduce mortality, avoid drug resistance and shorten the length of hospital stay.
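The binning and correlation-ranking steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`bin_spectrum`, `pearson`, `rank_bins`), the choice to keep the maximum intensity per bin, the 1/0 label encoding, and the toy spectra are all assumptions made for illustration. Only the 10 Da bin width and the use of Pearson correlation come from the abstract.

```python
from collections import defaultdict
from math import sqrt

BIN_SIZE = 10  # Da; the bin width reported in the abstract


def bin_spectrum(peaks, bin_size=BIN_SIZE):
    """Collapse (m/z, intensity) peaks into fixed-width bins.

    Returns {bin_start: intensity}. Keeping the maximum intensity per bin
    is an assumption; the abstract does not state the aggregation rule.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        start = int(mz // bin_size) * bin_size  # e.g. 2413.7 falls in bin 2410
        binned[start] = max(binned[start], intensity)
    return dict(binned)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_bins(spectra, labels, bin_size=BIN_SIZE):
    """Rank bins by |Pearson r| between bin intensity and the class label.

    labels: 1 for MRSA, 0 for MSSA (hypothetical encoding).
    Bins absent from a spectrum are treated as intensity 0.0.
    """
    binned = [bin_spectrum(s, bin_size) for s in spectra]
    all_bins = sorted({b for spec in binned for b in spec})
    scores = {}
    for b in all_bins:
        column = [spec.get(b, 0.0) for spec in binned]
        scores[b] = pearson(column, labels)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy, made-up spectra: peaks drift by a few Da but land in the same bin.
toy_spectra = [
    [(2412.3, 0.9), (6593.1, 0.2)],  # labelled MRSA
    [(2416.8, 0.8), (6591.0, 0.1)],  # labelled MRSA
    [(2413.0, 0.1), (6597.4, 0.7)],  # labelled MSSA
    [(2418.9, 0.2), (6594.2, 0.9)],  # labelled MSSA
]
toy_labels = [1, 1, 0, 0]

for bin_start, r in rank_bins(toy_spectra, toy_labels):
    print(f"bin {bin_start}-{bin_start + BIN_SIZE - 1} Da: r = {r:+.3f}")
```

In a fuller workflow, the top-ranked bins would then feed a forward feature selection loop and a classifier such as a random forest, with AUC measured on held-out data; those steps are omitted here.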

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ec5/8138823/55dbfb7e97e7/bbaa138f1.jpg
