Institute of Life Sciences & Resources and Department of Food Science and Biotechnology, Kyung Hee University, Yongin 17104, Republic of Korea.
Department of Smart Farm Science, Kyung Hee University, Yongin 17104, Republic of Korea.
Food Chem. 2025 Jan 1;462:140931. doi: 10.1016/j.foodchem.2024.140931. Epub 2024 Aug 20.
This research focused on distinguishing the matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) spectral signatures of three Enterococcus species. We evaluated and compared the predictive performance of four supervised machine learning algorithms, including K-nearest neighbor (KNN), support vector machine (SVM), and random forest (RF), for classifying Enterococcus species. The study used a dataset of 410 strains, which yielded 1640 individual spectra through on-plate and off-plate protein extraction methods. Although the commercial database correctly identified 76.9% of the strains, the machine learning classifiers performed better (accuracy 0.991). In the RF model, the top informative peaks played a significant role in the classification. Whole-genome sequencing showed that the most informative peaks are biomarkers connected to proteins, which are essential for understanding bacterial classification and evolution. The integration of MALDI-TOF MS and machine learning provides a rapid and accurate method for identifying Enterococcus species, which can improve healthcare and food safety.
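To make the classification idea concrete, the following is a minimal sketch of how peak lists might be turned into feature vectors and classified with a nearest-neighbor rule. It is not the study's pipeline: the spectra are synthetic, the species labels, m/z range, bin width, and marker positions are all hypothetical, and only a hand-written KNN is shown (the SVM and RF models from the abstract are not reproduced here).

```python
import math
import random
from collections import Counter

# Illustrative assumptions, not values from the study.
MZ_MIN, MZ_MAX = 2000.0, 12000.0   # hypothetical m/z window
BIN_WIDTH = 500.0                  # hypothetical bin width in Da


def bin_spectrum(peaks):
    """Turn a list of (m/z, intensity) peaks into a fixed-length vector
    whose entries sum to 1 (total-intensity normalization)."""
    n_bins = int((MZ_MAX - MZ_MIN) / BIN_WIDTH)
    vec = [0.0] * n_bins
    for mz, intensity in peaks:
        if MZ_MIN <= mz < MZ_MAX:
            vec[int((mz - MZ_MIN) // BIN_WIDTH)] += intensity
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query
    (Euclidean distance). `train` is a list of (vector, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def synthetic_spectrum(marker_mz, rng):
    """One strong species-specific marker peak plus a weaker peak shared
    by all species, each with a small random m/z shift."""
    return [
        (marker_mz + rng.uniform(-50, 50), rng.uniform(80, 100)),
        (4750.0 + rng.uniform(-50, 50), rng.uniform(20, 40)),
    ]


# Hypothetical marker positions for three placeholder species.
MARKERS = {"species_A": 3250.0, "species_B": 6250.0, "species_C": 9250.0}

rng = random.Random(0)
train, test = [], []
for label, marker in MARKERS.items():
    for _ in range(10):
        train.append((bin_spectrum(synthetic_spectrum(marker, rng)), label))
    for _ in range(5):
        test.append((bin_spectrum(synthetic_spectrum(marker, rng)), label))

correct = sum(knn_predict(train, vec) == label for vec, label in test)
accuracy = correct / len(test)
print(f"held-out accuracy on synthetic spectra: {accuracy:.3f}")
```

Because the synthetic markers fall in well-separated bins, this toy problem is far easier than real spectra; the sketch only illustrates the binning, normalization, and voting steps, and says nothing about the performance figures reported above.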