Wu Yuke, Liu Xiang, Shi Yunyu, Chen Xinyi, Wang Zhenglei, Xu YuQing, Wang ShuoHong
The College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201600, China.
The Department of Medical Imaging, Shanghai Electric Power Hospital, Shanghai, 200000, China.
Med Biol Eng Comput. 2025 Aug 20. doi: 10.1007/s11517-025-03425-8.
Accurate segmentation of lung adenocarcinoma nodules in computed tomography (CT) images is critical for clinical staging and diagnosis. However, irregular nodule shapes and ambiguous boundaries pose significant challenges for existing methods. This study introduces STU-Net, a hybrid CNN-Transformer architecture designed to enhance feature extraction, feature fusion, and global context modeling. The model integrates three key innovations: (1) structured convolution blocks (DWF-Conv/DBR-Conv) for multi-scale feature extraction and overfitting mitigation; (2) S-MLP Link, a spatial-shift-enhanced skip-connection module that improves multi-level feature fusion; and (3) a residual-based superpixel vision transformer (RM-SViT) that captures long-range dependencies efficiently. Evaluated on the LIDC-IDRI dataset, STU-Net achieves a Dice score of 89.04%, precision of 90.73%, and IoU of 90.70%, outperforming recent methods by 4.52% in Dice. Validation on the EPDB dataset further confirms its generalizability (Dice of 86.40%). By integrating structured convolutions with superpixel-based transformers, this work helps bridge the gap between local feature sensitivity and global context awareness, offering a robust tool for clinical decision support.
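The abstract reports Dice, precision, and IoU but does not say how they were computed or averaged (per slice, per nodule, or per scan). As a point of reference only, the following minimal sketch applies the standard textbook definitions to a pair of flattened binary masks. The function names, the toy masks, and the choice to return 0.0 for an empty denominator are illustrative assumptions, not the authors' evaluation code.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives, and false negatives
    for two flattened binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    return tp, fp, fn


def safe_div(num, den):
    # Assumption: an undefined ratio (empty denominator) is reported as 0.0.
    return num / den if den else 0.0


def segmentation_metrics(pred, target):
    """Standard overlap metrics for binary segmentation."""
    tp, fp, fn = confusion_counts(pred, target)
    return {
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),  # 2|A∩B| / (|A| + |B|)
        "iou": safe_div(tp, tp + fp + fn),           # |A∩B| / |A∪B|
        "precision": safe_div(tp, tp + fp),          # TP / (TP + FP)
    }


# Toy example: 8-pixel masks with 3 true positives, 1 false positive, 1 false negative.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
target = [0, 1, 1, 0, 0, 1, 1, 0]
print(segmentation_metrics(pred, target))
# {'dice': 0.75, 'iou': 0.6, 'precision': 0.75}
```

For real CT volumes the same counts would typically be taken over thresholded network outputs, and the reported figures depend on how per-case scores are aggregated, which the abstract does not specify.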