Sangar Srushti, Kolage Prathamesh, Chunarkar-Patil Pritee
Department of Bioinformatics, Rajiv Gandhi Institute of IT and Biotechnology, Bharati Vidyapeeth (Deemed to be University), Pune, Maharashtra, India.
Bioinformation. 2024 Sep 30;20(9):986-989. doi: 10.6026/973206300200986. eCollection 2024.
Bacterial identification is a critical process in microbiology, clinical diagnostics, environmental monitoring, and food safety. Machine learning holds great promise for improving bacterial identification by increasing accuracy, speed, and scalability. However, challenges such as data dependency, model interpretability, and computational demands must be addressed to fully realize its potential. The k-mer based bacterial identification algorithm described here is an attempt to address these issues. Sequence matching is performed with the k-nearest neighbors (KNN) technique. The workflow includes feature extraction, dataset preparation, classifier training, and label prediction based on the similarity of k-mer frequency distributions. The algorithm's performance was evaluated with metrics such as F1 score and precision, and it achieved an accuracy of 93%.
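The workflow summarized in the abstract (k-mer feature extraction followed by KNN label prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the k-mer length of 3, the Euclidean distance, the neighbor count, the function names, and the toy sequences and labels ("taxon_A", "taxon_B") are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k=3):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    The vector has one entry per possible k-mer over A, C, G, T
    (4**k entries). k-mers containing other characters are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in vocab)
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[kmer] / total for kmer in vocab]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, n_neighbors=3, k=3):
    """Predict a label for `query` by majority vote of its nearest neighbors.

    `train` is a list of (sequence, label) pairs.
    """
    q = kmer_frequencies(query, k)
    distances = sorted(
        (euclidean(kmer_frequencies(seq, k), q), label) for seq, label in train
    )
    votes = Counter(label for _, label in distances[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: AT-rich vs GC-rich fragments, not real genomes.
TRAIN = [
    ("ATATATATTATAATATAT", "taxon_A"),
    ("AATTAATTATATATTAAT", "taxon_A"),
    ("TATATAATATTTAATATA", "taxon_A"),
    ("GCGCGCGCCGCGGCGCGC", "taxon_B"),
    ("GGCCGGCCGCGCGCCGGC", "taxon_B"),
    ("CGCGCGGCGCCCGGCGCG", "taxon_B"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "ATATTATAATATATAT"))
    print(knn_predict(TRAIN, "GCGCCGCGGCGCGCGC"))
```

In practice the k-mer length and the number of neighbors would need to be tuned on held-out data, and metrics such as precision and F1 score would be computed on a labeled test set rather than on a handful of toy sequences.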