Tian Yi, Mao Qi, Wang Wenfeng, Zhang Yan
College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai, 201620, People's Republic of China.
Biomed Phys Eng Express. 2025 Mar 12;11(2). doi: 10.1088/2057-1976/adbafa.
Accurate and timely segmentation of COVID-19 infection regions is critical for effective diagnosis and treatment. While convolutional neural networks (CNNs) perform strongly in medical image segmentation, they struggle with complex lesion morphologies that have irregular boundaries. Transformer-based approaches capture global context better, but they suffer from high computational cost and suboptimal multi-scale feature integration. To address these limitations, we propose the Hierarchical Agent Transformer Network (HATNet), a hierarchical encoder-bridge-decoder architecture that balances segmentation accuracy against computational efficiency. The encoder employs novel agent Transformer blocks, which use agent tokens with linear computational complexity and are specifically designed to capture the subtle features of small COVID-19 lesions. A diversity restoration module (DRM) is embedded within each agent Transformer block to counteract feature degradation. The hierarchical structure simultaneously extracts high-resolution shallow features and low-resolution fine features, ensuring comprehensive feature representation. The bridge stage incorporates an improved pyramid pooling module (IPPM) that establishes hierarchical global priors, improving contextual understanding for the decoder. The decoder integrates a full-scale bidirectional feature pyramid network (FsBiFPN) with a dedicated border-refinement module (BRM), which together enhance edge precision. HATNet was evaluated on the COVID-19-CT-Seg and CC-CCII datasets, where it achieved Dice scores of 84.14% and 81.22%, respectively, outperforming state-of-the-art models in segmentation. Furthermore, it showed notable advantages in parameter count and computational complexity, highlighting its potential for clinical deployment.
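For readers unfamiliar with the reported metric, the Dice score between a predicted mask P and a ground-truth mask G is 2|P∩G| / (|P| + |G|). The following is a minimal sketch of that standard definition in plain Python, not the authors' evaluation code. The function name `dice_score`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Illustrative helper (hypothetical name); not taken from the HATNet code.
    """
    # Flatten the 2-D masks into 0/1 integer lists.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # |P ∩ G|: pixels marked as lesion in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 3 predicted pixels, 3 true pixels, 2 overlapping.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(f"Dice = {dice_score(pred, target):.4f}")  # 2*2 / (3+3) = 0.6667
```

A score of 84.14% therefore corresponds to a value of 0.8414 on this scale. How the score is aggregated (per slice, per volume, or over a whole dataset) is not stated in the abstract.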