Ma Jiajun, Yuan Gang, Guo Chenhua, Gang Xiaoming, Zheng Minting
Shenhua Hollysys Information Technology Co., Ltd., Beijing, China.
The First Affiliated Hospital of Dalian Medical University, Dalian, China.
Front Med (Lausanne). 2023 Sep 28;10:1273441. doi: 10.3389/fmed.2023.1273441. eCollection 2023.
Medical images are information carriers that visually reflect and record the anatomical structure of the human body, and they play an important role in clinical diagnosis, teaching, and research. Modern medicine has become increasingly inseparable from the intelligent processing of medical images. In recent years, there have been growing attempts to apply deep learning theory to medical image segmentation tasks, making it imperative to explore a simple and efficient deep learning algorithm for medical image segmentation. In this paper, we investigate the segmentation of lung nodule images. Addressing the problems of medical image segmentation algorithms noted above, we study medical image fusion algorithms based on a hybrid channel-spatial attention mechanism and medical image segmentation algorithms with a hybrid architecture of Convolutional Neural Networks (CNNs) and Vision Transformers. To address the difficulty that medical image segmentation algorithms have in capturing long-range feature dependencies, this paper proposes SW-UNet, a medical image segmentation model based on a hybrid CNN and Vision Transformer (ViT) framework. The self-attention mechanism and sliding-window design of the Vision Transformer are used to capture global feature associations and overcome the receptive field limitation that convolutional operations incur through their inductive bias. At the same time, a widened self-attention vector is used to streamline the number of modules and compress the model size, fitting the small scale of typical medical datasets, on which larger models are prone to overfitting. Experiments on the LUNA16 lung nodule image dataset validate the algorithm and show that the proposed network achieves efficient medical image segmentation at a lightweight scale. In addition, to validate the transferability of the model, we performed additional validation on other tumor datasets with satisfactory results.
Our research addresses the crucial need for improved medical image segmentation algorithms. By introducing the SW-UNet model, which combines a CNN with a ViT, we successfully capture long-range feature dependencies and overcome the receptive field limitations of traditional convolutional operations. This approach not only improves the efficiency of medical image segmentation but also maintains the model's scalability and adaptability to small medical datasets. The positive outcomes on various tumor datasets underscore the transferability and broad applicability of our proposed model in the field of medical image analysis.
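The windowed self-attention described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden simplification, not the paper's actual layer: it partitions a feature map into non-overlapping windows and computes self-attention within each window, using identity Q/K/V projections in place of learned weights and omitting the window shifting and widened attention vectors of SW-UNet.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(x, window=4):
    """Self-attention restricted to non-overlapping windows (Swin-style sketch).

    x: (H, W, C) feature map; H and W must be divisible by `window`.
    Hypothetical illustration: real layers use learned Q/K/V projections.
    """
    H, W, C = x.shape
    out = np.empty_like(x)
    for i in range(0, H, window):
        for j in range(0, W, window):
            # Flatten one window into a sequence of window*window tokens.
            win = x[i:i + window, j:j + window].reshape(-1, C)
            q = k = v = win  # identity projections stand in for learned weights
            attn = softmax(q @ k.T / np.sqrt(C))  # (tokens, tokens)
            out[i:i + window, j:j + window] = (attn @ v).reshape(window, window, C)
    return out

feat = np.random.default_rng(0).standard_normal((8, 8, 16))
y = window_self_attention(feat, window=4)
print(y.shape)  # (8, 8, 16)
```

Because attention is computed only within each window, the cost grows linearly with the number of windows rather than quadratically with the full image size; cross-window interaction would come from shifting the window grid between successive layers.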