Wang Chuantao, Wang Saishuo, Shao Shuo, Zhai Jiliang
Beijing University of Civil Engineering and Architecture, School of Electromechanical and Vehicle Engineering, Beijing, China.
Department of Orthopedics, Peking Union Medical College Hospital, Beijing, China.
Quant Imaging Med Surg. 2024 Dec 5;14(12):8551-8567. doi: 10.21037/qims-24-985. Epub 2024 Nov 13.
Fast and accurate automatic segmentation of polyps in colonoscopy plays a crucial role in the early diagnosis and treatment of colon cancer. However, current polyp segmentation algorithms based on deep neural networks tend to be large while still falling short on segmentation accuracy. Accurate polyp segmentation also improves doctors' diagnostic efficiency, which motivates the development of lightweight models that can be easily embedded in clinical devices to meet the requirements of practical applications. This study aims to provide effective technical support for the rapid and precise segmentation of polyps in clinical applications.
This study proposes DeepNeXt, a polyp segmentation model built on multi-scale attention mechanisms. DeepNeXt incorporates a multi-stage, lightweight convolutional encoder module that uses several lightweight convolutional layers for efficient and accurate feature extraction. It also features a novel multi-stage feature fusion structure designed to avoid the loss of feature information during the encoding phase. In addition, the model employs a multi-scale attentional feature encoding module that applies multi-branch deep strip convolutions to the encoded feature maps to extract multi-dimensional information, thereby enhancing the network's ability to capture diverse features.
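To illustrate why strip convolutions suit a lightweight design, the following minimal sketch compares weight counts for square depthwise kernels against pairs of 1 x k and k x 1 strip kernels. It assumes that "strip convolution" here means such a depthwise pair; the channel width (64), the kernel sizes (7, 11, 21), and all function names are hypothetical choices for illustration and are not taken from the DeepNeXt paper.

```python
def depthwise_square_params(channels: int, k: int) -> int:
    """Weights in a depthwise k x k convolution (one k x k filter per channel, no bias)."""
    return channels * k * k


def depthwise_strip_pair_params(channels: int, k: int) -> int:
    """Weights in a depthwise 1 x k convolution followed by a depthwise k x 1 one (no bias)."""
    return channels * (k + k)


def multi_branch_strip_params(channels: int, kernel_sizes) -> int:
    """Total weights across parallel strip-convolution branches, one branch per kernel size."""
    return sum(depthwise_strip_pair_params(channels, k) for k in kernel_sizes)


# Assumed values for illustration only; not the configuration used by DeepNeXt.
channels = 64
kernel_sizes = (7, 11, 21)

square_total = sum(depthwise_square_params(channels, k) for k in kernel_sizes)
strip_total = multi_branch_strip_params(channels, kernel_sizes)

print(f"square depthwise branches: {square_total} weights")  # 39104
print(f"strip-pair branches:       {strip_total} weights")   # 4992
```

Under these assumptions the strip branches cover the same spatial extents with roughly one eighth of the weights, because the cost per channel grows as 2k rather than k squared.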
Experimental validation on the Kvasir Segmentation Dataset (Kvasir-SEG) and the Colorectal Cancer-Clinic Database (CVC-ClinicDB) demonstrates that DeepNeXt requires fewer parameters and floating-point operations (FLOPs) than mainstream networks such as U-net, U-net++, TransUnet, SwinUnet, and TGANet. DeepNeXt needs only 3.04 G FLOPs and 1.51 M parameters (Params) while still delivering strong segmentation performance: it reached a mean intersection over union (mIOU) of 83.91 on Kvasir-SEG and 87.37 on CVC-ClinicDB. The Dice and Recall metrics were also superior, indicating that DeepNeXt strikes a favorable balance between computational efficiency, model compactness, and segmentation accuracy.
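For readers unfamiliar with the reported metrics, the sketch below computes intersection over union, Dice, and recall for a single pair of binary masks using their standard definitions. The function names and the toy masks are hypothetical, and the abstract does not state how the study averages per-image scores into mIOU, so this is not a reproduction of the authors' evaluation code.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives and false negatives over two binary masks."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred, target, eps=1e-7):
    """Return IoU, Dice and recall for one predicted mask against its ground truth."""
    tp, fp, fn = confusion_counts(pred, target)
    return {
        "iou": tp / (tp + fp + fn + eps),
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "recall": tp / (tp + fn + eps),
    }


# Toy 2 x 3 masks (1 = polyp pixel); made up purely for illustration.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

scores = segmentation_metrics(pred, target)
print(scores)  # iou is about 0.5, dice about 0.667, recall about 0.667
```

A dataset-level figure such as mIOU would then be an average of such scores, for example over all test images; which averaging scheme the study used is an open assumption here.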
In conclusion, we have proposed the DeepNeXt network, a novel lightweight multi-scale attention segmentation network tailored for computationally limited medical devices, which provides strong support for accurate and efficient polyp segmentation in clinical applications.