Peng Chen, Qian Zhiqin, Wang Kunyu, Zhang Lanzhu, Luo Qi, Bi Zhuming, Zhang Wenjun
School of Mechanical and Power Engineering, East China University of Science and Technology, Shanghai 200237, China.
Department of Engineering, Purdue University, West Lafayette, IN 47907, USA.
Sensors (Basel). 2024 Nov 23;24(23):7473. doi: 10.3390/s24237473.
Accurate polyp image segmentation is important because it supports the detection of polyps. Convolutional neural networks (CNNs) are a common approach to automatic segmentation, but their main disadvantage is long training time. The Transformer is another architecture that can be adapted to automatic segmentation; its self-attention mechanism assigns different importance weights to each piece of information, which yields high computational efficiency during segmentation. A potential drawback of the Transformer, however, is the risk of information loss. This study applies the well-known hybridization principle to combine a CNN and a Transformer so as to retain the strengths of both. Specifically, we applied the method to the early detection of colonic polyps and implemented a model called MugenNet for colonic polyp image segmentation. We conducted comprehensive experiments comparing MugenNet with other CNN models on five publicly available datasets, together with an ablation study of MugenNet. The results show that MugenNet achieves a mean Dice of 0.714 on the ETIS dataset, the best performance among the compared models on that dataset, at an inference speed of 56 FPS. The overall outcome of this study is a method to optimally combine two complementary machine learning methods.
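For readers unfamiliar with the reported metric, the following is a minimal sketch of how a Dice score between a predicted and a ground-truth binary mask is commonly computed, and how a mean Dice could be taken over a dataset. It is an illustration only: the function names `dice_coefficient` and `mean_dice`, the nested-list mask representation, the per-image averaging, and the convention of returning 1.0 for two empty masks are all assumptions, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two binary masks.

    Masks are nested lists of 0/1 values (hypothetical representation).
    """
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as polyp in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pairs):
    """Average the per-image Dice over (prediction, ground truth) pairs."""
    scores = [dice_coefficient(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    print(dice_coefficient(pred, target))
    print(mean_dice([(pred, target), (target, target)]))
```

A score of 1.0 means the predicted mask matches the ground truth exactly, and 0.0 means no overlap, so a mean Dice of 0.714 indicates substantial but imperfect overlap on average.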