

A Comparative Study of Deep Learning Classification Methods on a Small Environmental Microorganism Image Dataset (EMDS-6): From Convolutional Neural Networks to Visual Transformers.

Authors

Zhao Peng, Li Chen, Rahaman Md Mamunur, Xu Hao, Yang Hechen, Sun Hongzan, Jiang Tao, Grzegorzek Marcin

Affiliations

Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China.

Shengjing Hospital, China Medical University, Shenyang, China.

Publication Information

Front Microbiol. 2022 Mar 2;13:792166. doi: 10.3389/fmicb.2022.792166. eCollection 2022.

Abstract

In recent years, deep learning has achieved remarkable results in environmental microorganism (EM) image classification. However, image classification on small EM datasets has still not produced good results, so researchers must spend considerable time searching for models that classify well and suit their current hardware environment. To provide a reliable reference for researchers, we conduct a series of comparison experiments on 21 deep learning models. The experiments include direct classification, imbalanced training, and hyper-parameter tuning. During the experiments, we find complementarities among the 21 models, which form the basis for feature-fusion experiments. We also find that data augmentation by geometric deformation does little to improve the performance of the visual transformer (VT) family of models (ViT, DeiT, BotNet, and T2T-ViT). In terms of model performance, Xception achieves the best classification performance, the vision transformer (ViT) model requires the least training time, and the ShuffleNet-V2 model has the fewest parameters.
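The geometric-deformation augmentation mentioned above can be sketched as generating flipped and rotated variants of each training image. This is a minimal illustration, assuming the standard eight dihedral variants (rotations plus mirror flips); the function name and the exact set of transforms are illustrative, not the authors' published pipeline.

```python
import numpy as np

def geometric_augmentations(image: np.ndarray) -> list:
    """Return the 8 dihedral variants (rotations + flips) of an H x W (x C) image.

    This is an illustrative sketch of geometric-deformation augmentation,
    not the exact transform set used in the paper.
    """
    variants = []
    for k in range(4):                       # 0, 90, 180, 270 degree rotations
        rotated = np.rot90(image, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # horizontal flip of each rotation
    return variants

# Example: a tiny 2x2 "image"
img = np.array([[1, 2],
                [3, 4]])
augmented = geometric_augmentations(img)
print(len(augmented))  # 8 variants per source image
```

Because these transforms only permute pixel positions, they preserve the local texture statistics that convolutional filters exploit; one plausible reading of the paper's finding is that VT models, which rely on global token interactions rather than translation-equivariant filters, benefit less from such label-preserving geometric variants.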


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bed1/8924496/941b12c363e4/fmicb-13-792166-g0001.jpg
