Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, China.
Comput Assist Surg (Abingdon). 2024 Dec;29(1):2329675. doi: 10.1080/24699322.2024.2329675. Epub 2024 Mar 20.
Real-time performance is a critical requirement for image segmentation in laparoscopic surgical assistance systems. Although traditional deep learning models can achieve high segmentation accuracy, they carry a heavy computational burden; in the practical setting of most hospitals, where powerful computing resources are unavailable, such models cannot meet real-time demands. We propose SwinD-Net, a novel skip-connection-based network that incorporates depthwise separable convolutions and Swin Transformer blocks. To reduce computational overhead, we remove the skip connection in the first layer and reduce the number of channels in the shallow feature maps. In addition, although Swin Transformer blocks carry a larger computational and parameter footprint, we introduce them to extract global information and capture high-level semantic features. With these modifications, the network achieves desirable performance while maintaining a lightweight design. We conduct experiments on the CholecSeg8k dataset to validate the effectiveness of our approach. Compared with other models, it attains high accuracy while significantly reducing computational and parameter overhead. Specifically, our model requires only 98.82 M floating-point operations (FLOPs) and 0.52 M parameters, with an inference time of 47.49 ms per image on a CPU. Compared with the recently proposed lightweight segmentation network UNeXt, our model not only outperforms it on the Dice metric but also uses only 1/3 of the parameters and 1/22 of the FLOPs, and it runs 2.4 times faster at inference, demonstrating improvements in both accuracy and speed. Overall, SwinD-Net effectively reduces the parameter count and computational complexity and improves inference speed while maintaining comparable accuracy. The source code will be available at https://github.com/ouyangshuiming/SwinDNet.
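The abstract does not include code, but the depthwise separable convolution building block it relies on is straightforward to sketch. The following is a minimal, hypothetical PyTorch illustration: the class name and the normalization and activation choices are our assumptions for exposition, not the authors' released implementation. The factorization into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution is what cuts parameters and FLOPs relative to a standard convolution:

    import torch
    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        """Illustrative depthwise separable convolution: a per-channel
        spatial convolution followed by a 1x1 channel-mixing convolution,
        reducing cost roughly by a factor of the kernel area."""
        def __init__(self, in_channels, out_channels, kernel_size=3):
            super().__init__()
            # Depthwise: one filter per input channel (groups=in_channels).
            self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                       padding=kernel_size // 2,
                                       groups=in_channels)
            # Pointwise: 1x1 convolution mixes information across channels.
            self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)
            self.bn = nn.BatchNorm2d(out_channels)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))

    # Example: a block mapping 16 -> 32 channels on a 256x256 feature map.
    block = DepthwiseSeparableConv(16, 32)
    out = block(torch.randn(1, 16, 256, 256))  # -> shape (1, 32, 256, 256)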
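The reported efficiency figures (parameter count in millions and per-image CPU inference time in milliseconds) can be measured for any candidate model with a short harness. The sketch below assumes PyTorch and an arbitrary 1x3x256x256 input, since the abstract does not state the input resolution; FLOPs are typically obtained separately with a profiler such as thop or fvcore:

    import time
    import torch

    def count_parameters_m(model):
        # Trainable parameters in millions (the abstract reports 0.52 M).
        return sum(p.numel() for p in model.parameters()
                   if p.requires_grad) / 1e6

    @torch.no_grad()
    def cpu_inference_time_ms(model, input_shape=(1, 3, 256, 256), runs=50):
        # Average forward-pass latency per image on CPU, in milliseconds.
        model = model.eval().to("cpu")
        x = torch.randn(*input_shape)
        for _ in range(5):  # warm-up so one-time setup does not skew timing
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        return (time.perf_counter() - start) / runs * 1000.0

    # Example, using the illustrative block above as a stand-in model:
    print(count_parameters_m(block), "M parameters")
    print(cpu_inference_time_ms(block, input_shape=(1, 16, 256, 256)), "ms/image")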