An Improvised Machine Learning Model Based on Mutual Information Feature Selection Approach for Microbes Classification.

Authors

Dhindsa Anaahat, Bhatia Sanjay, Agrawal Sunil, Sohi Balwinder Singh

Affiliations

Department of Electronics and Communication Engineering, Chandigarh University, Gharuan, Punjab 140413, India.

University Institute of Engineering and Technology, Panjab University, Chandigarh 160014, India.

Publication

Entropy (Basel). 2021 Feb 23;23(2):257. doi: 10.3390/e23020257.

Abstract

Accurate classification of microbes is critical for monitoring the ecological balance of a habitat. This work therefore implements a novel method to automate the identification of microorganisms. To extract the bodies of microorganisms accurately, a generalized segmentation mechanism is proposed that combines a convolution filter (Kirsch) with a variance-based pixel clustering algorithm (Otsu). After exhaustive corroboration, a set of twenty-five features was identified to map the characteristics and morphology of all kinds of microbes. Multiple feature selection techniques were tested, and mutual information (MI)-based models gave the best performance. Exhaustive hyperparameter tuning was performed for multilayer perceptron (MLP), k-nearest neighbors (KNN), quadratic discriminant analysis (QDA), logistic regression (LR), and support vector machine (SVM) classifiers. The SVM with a radial kernel was found to require further improvement to reach the maximum possible accuracy. A comparative analysis of SVM and the improvised SVM (ISVM) using 10-fold cross-validation ultimately showed that ISVM gave 2% higher performance in terms of accuracy (98.2%), precision (98.2%), recall (98.1%), and F1 score (98.1%).
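The segmentation step named in the abstract pairs a Kirsch convolution filter with Otsu's variance-based threshold. The abstract does not say how the two are chained, so the sketch below assumes one plausible arrangement: compute the Kirsch edge magnitude, then apply Otsu's threshold to that edge map. The function names (`kirsch_magnitude`, `otsu_threshold`, `segment`) and the toy image are illustrative and are not taken from the paper.

```python
# Sketch: Kirsch compass edge response followed by Otsu thresholding.
# Assumption: Otsu is applied to the Kirsch edge map; the paper may differ.

# Offsets of the 8 neighbours of a pixel, clockwise from the top-left.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
# Kirsch weights along that ring for one direction; the other seven
# compass kernels are rotations of this list.
BASE_WEIGHTS = [5, 5, 5, -3, -3, -3, -3, -3]


def kirsch_magnitude(image):
    """Maximum Kirsch response over 8 directions, clipped to 0..255.

    `image` is a list of rows of integer grey levels; border pixels stay 0.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            ring = [image[r + dr][c + dc] for dr, dc in RING]
            best = 0
            for shift in range(8):
                resp = sum(BASE_WEIGHTS[(i - shift) % 8] * ring[i] for i in range(8))
                best = max(best, resp)
            out[r][c] = min(best, 255)
    return out


def otsu_threshold(values, levels=256):
    """Threshold t that maximises between-class variance (classes: <= t and > t)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    w0 = 0
    sum0 = 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def segment(image):
    """Binary mask of pixels whose Kirsch response exceeds the Otsu threshold."""
    edges = kirsch_magnitude(image)
    t = otsu_threshold([v for row in edges for v in row])
    return [[1 if v > t else 0 for v in row] for row in edges]


# Toy image: dark left half, bright right half.
toy = [[0, 0, 100, 100, 100] for _ in range(5)]
mask = segment(toy)
```

In practice a library such as OpenCV or scikit-image would replace these loops; the pure-Python version is only meant to make the two steps explicit.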

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93cd/7927045/87199c786bb2/entropy-23-00257-g001.jpg
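The abstract reports that mutual information (MI)-based feature selection performed best among the techniques tested. As a rough illustration of the idea, the sketch below estimates the MI between each feature and the class label, $I(X;Y) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}$, and ranks features by that score. The equal-width binning of continuous features, the helper names, and the toy data are assumptions made for the example; the paper's actual estimator and feature set are not described in the abstract.

```python
# Sketch: rank features by estimated mutual information with the class label.
# Assumption: continuous features are discretised with equal-width bins.
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X;Y) in nats from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def rank_features(rows, labels, bins=4):
    """Return (score, feature_index) pairs sorted from most to least informative."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((mutual_information(discretize(column, bins), labels), j))
    return sorted(scores, reverse=True)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is constant.
rows = [[0.10, 1.0], [0.20, 1.0], [0.15, 1.0],
        [0.90, 1.0], [0.80, 1.0], [0.95, 1.0]]
labels = [0, 0, 0, 1, 1, 1]
ranked = rank_features(rows, labels)
top_k = [j for _, j in ranked[:1]]  # keep the single most informative feature
```

The selected columns would then be passed to a classifier such as an SVM and scored with 10-fold cross-validation, as the abstract describes.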