Department of Computer Engineering & Applications, G.L.A. University, Mathura, India.
Chitkara University Institute of Engineering and Technology, Centre for Research Impact & Outcome, Chitkara University, Rajpura, Punjab, India.
PLoS One. 2024 Nov 15;19(11):e0311080. doi: 10.1371/journal.pone.0311080. eCollection 2024.
Accurate segmentation of lung lesions in CT-scan images is essential for diagnosing lung cancer. Lung nodule diagnosis is challenging because nodules are small and diverse in appearance. We designed a transformer-based model, EDTNet (Encoder-Decoder Transformer Network), for pulmonary nodule segmentation (PNS). Traditional CNN-based encoders and decoders are hindered by their inability to capture long-range spatial dependencies, leading to suboptimal performance in complex object segmentation tasks. To address this limitation, we leverage an enhanced spatial attention-based Vision Transformer (ViT) as both encoder and decoder in the EDTNet. The EDTNet integrates two successive transformer blocks, a patch-expanding layer, down-sampling layers, and up-sampling layers to improve segmentation capability. In addition, ESLA (Enhanced Spatial aware Local Attention) and EGLA (Enhanced Global aware Local Attention) blocks are added to attend to spatial features. Furthermore, skip connections are introduced to enable symmetrical interaction between corresponding encoder and decoder layers, allowing intricate details to be recovered in the output. On DS1 and DS2, the performance of EDTNet is compared with several models, including Unet, ResUNet++, U-NET 3+, DeepLabV3+, SegNet, Trans-Unet, and Swin-UNet, and it demonstrates superior quantitative and visual results. On DS1, EDTNet achieved a precision of 96.27%, an IoU (Intersection over Union) of 95.81%, and a DSC (Sørensen-Dice coefficient) of 96.15%. On DS2, the model demonstrated a sensitivity of 98.84%, an IoU of 96.06%, and a DSC of 97.85%.
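To make the described architecture concrete, the following is a minimal PyTorch-style sketch of the general encoder-decoder-transformer pattern outlined above: patch embedding, two successive transformer blocks per stage, down-sampling and patch-expanding layers, and a symmetric skip connection between the corresponding encoder and decoder stages. It is not the authors' EDTNet; standard multi-head self-attention stands in for the ESLA/EGLA blocks, and the names (ToyEDT, Stage) and hyperparameters are hypothetical.

```python
# Minimal sketch (not the authors' code) of a ViT-style encoder-decoder
# segmenter with a skip connection; standard multi-head self-attention
# is assumed in place of the paper's ESLA/EGLA attention blocks.
import torch
import torch.nn as nn


def tokens_to_map(x, h, w):
    # (B, H*W, C) -> (B, C, H, W)
    b, n, c = x.shape
    return x.transpose(1, 2).reshape(b, c, h, w)


def map_to_tokens(x):
    # (B, C, H, W) -> (B, H*W, C), plus spatial size for later reshaping
    b, c, h, w = x.shape
    return x.flatten(2).transpose(1, 2), h, w


class Stage(nn.Module):
    """Two successive transformer blocks applied to a token sequence."""
    def __init__(self, dim, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=4 * dim,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x):
        return self.blocks(x)


class ToyEDT(nn.Module):
    """Hypothetical encoder-decoder transformer for binary lesion masks."""
    def __init__(self, in_ch=1, dim=64, patch=4):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        self.enc1 = Stage(dim)
        self.down = nn.Conv2d(dim, 2 * dim, kernel_size=2, stride=2)        # down-sampling
        self.enc2 = Stage(2 * dim)
        self.up = nn.ConvTranspose2d(2 * dim, dim, kernel_size=2, stride=2)  # patch-expanding
        self.fuse = nn.Conv2d(2 * dim, dim, kernel_size=1)                   # merge skip connection
        self.dec1 = Stage(dim)
        self.head = nn.Sequential(
            nn.Upsample(scale_factor=patch, mode="bilinear", align_corners=False),
            nn.Conv2d(dim, 1, kernel_size=1))                                # per-pixel mask logits

    def forward(self, x):
        x = self.embed(x)                           # (B, dim, H/4, W/4)
        t, h, w = map_to_tokens(x)
        skip = tokens_to_map(self.enc1(t), h, w)    # encoder stage 1

        x = self.down(skip)                         # (B, 2*dim, H/8, W/8)
        t, h2, w2 = map_to_tokens(x)
        x = tokens_to_map(self.enc2(t), h2, w2)     # encoder stage 2 (bottleneck)

        x = self.up(x)                              # back to (B, dim, H/4, W/4)
        x = self.fuse(torch.cat([x, skip], dim=1))  # symmetric skip connection
        t, h, w = map_to_tokens(x)
        x = tokens_to_map(self.dec1(t), h, w)       # decoder stage
        return self.head(x)                         # (B, 1, H, W) mask logits


if __name__ == "__main__":
    model = ToyEDT()
    ct_slice = torch.randn(1, 1, 128, 128)          # one grayscale CT slice
    print(model(ct_slice).shape)                    # torch.Size([1, 1, 128, 128])
```

In this sketch the skip connection concatenates the first encoder stage's feature map with the up-sampled decoder features before a 1x1 fusion convolution, mirroring the symmetrical encoder-decoder interaction described in the abstract; metrics such as IoU and DSC would be computed on the thresholded sigmoid of the output logits.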