MLAgg-UNet: Advancing Medical Image Segmentation with Efficient Transformer and Mamba-Inspired Multi-Scale Sequence.

Authors

Jiang Jiaxu, Lei Sen, Li HengChao, Sun Yongjian

Publication information

IEEE J Biomed Health Inform. 2025 Aug 7;PP. doi: 10.1109/JBHI.2025.3596648.

DOI:10.1109/JBHI.2025.3596648
PMID:40773398
Abstract

Transformers and state space sequence models (SSMs) have attracted interest in biomedical image segmentation for their ability to capture long-range dependencies. However, traditional visual state space (VSS) methods suffer from the incompatibility of image tokens with the autoregressive assumption. Although Transformer attention does not require this assumption, its high computational cost limits effective channel-wise information utilization. To overcome these limitations, we propose the Mamba-Like Aggregated UNet (MLAgg-UNet), which introduces a Mamba-inspired mechanism to enrich the Transformer channel representation and exploit the implicit autoregressive characteristic within the U-shaped architecture. To establish dependencies among image tokens at a single scale, the Mamba-Like Aggregated Attention (MLAgg) block is designed to balance representational ability and computational efficiency. Inspired by the human foveal vision system, the Mamba macro-structure, and differential attention, the MLAgg block can slide its focus over each image token, suppress irrelevant tokens, and simultaneously strengthen channel-wise information utilization. Moreover, leveraging causal relationships between consecutive low-level and high-level features in the U-shaped architecture, we propose the Multi-Scale Mamba Module with Implicit Causality (MSMM) to optimize complementary information across scales. Embedded within skip connections, this module enhances semantic consistency between encoder and decoder features. Extensive experiments on four benchmark datasets (AbdomenMRI, ACDC, BTCV, and EndoVis17), which cover MRI, CT, and endoscopy modalities, demonstrate that the proposed MLAgg-UNet consistently outperforms state-of-the-art CNN-based, Transformer-based, and Mamba-based methods. Specifically, it achieves improvements of at least 1.24%, 0.20%, 0.33%, and 0.39% in DSC scores on these datasets, respectively. These results highlight the model's ability to effectively capture feature correlations and integrate complementary multi-scale information, providing a robust solution for medical image segmentation. The implementation is publicly available at https://github.com/aticejiang/MLAgg-UNet.
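The abstract reports its gains as DSC (Dice similarity coefficient) scores. As a reminder of what that metric measures, here is a minimal sketch for a single pair of binary masks. The function name `dice_coefficient`, the flattened-list input, and the empty-mask convention are assumptions made for illustration; the abstract does not say how the authors average DSC across classes or cases, and this is not taken from their repository.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient, 2|A∩B| / (|A| + |B|), for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Count pixels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Sum of foreground pixel counts in each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: the masks agree on 2 foreground pixels, each has 3 in total.
pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]
print(f"DSC = {dice_coefficient(pred, target):.3f}")  # DSC = 0.667
```

A DSC of 1.0 means the predicted and reference masks overlap perfectly, and 0.0 means they share no foreground pixels.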


Similar articles

1
MLAgg-UNet: Advancing Medical Image Segmentation with Efficient Transformer and Mamba-Inspired Multi-Scale Sequence.
IEEE J Biomed Health Inform. 2025 Aug 7;PP. doi: 10.1109/JBHI.2025.3596648.
2
VMDU-net: a dual encoder multi-scale fusion network for polyp segmentation with Vision Mamba and Cross-Shape Transformer integration.
Front Artif Intell. 2025 Jun 18;8:1557508. doi: 10.3389/frai.2025.1557508. eCollection 2025.
3
SegMamba-V2: Long-range Sequential Modeling Mamba For General 3D Medical Image Segmentation.
IEEE Trans Med Imaging. 2025 Jul 18;PP. doi: 10.1109/TMI.2025.3589797.
4
CLT-MambaSeg: An integrated model of Convolution, Linear Transformer and Multiscale Mamba for medical image segmentation.
Comput Biol Med. 2025 Sep;196(Pt B):110736. doi: 10.1016/j.compbiomed.2025.110736. Epub 2025 Jul 26.
5
SCFMUNet: A fusion architecture based on multi-scale state space model and channel attention for medical image segmentation.
Neural Netw. 2025 Jul 29;192:107919. doi: 10.1016/j.neunet.2025.107919.
6
Multi-level channel-spatial attention and light-weight scale-fusion network (MCSLF-Net): multi-level channel-spatial attention and light-weight scale-fusion transformer for 3D brain tumor segmentation.
Quant Imaging Med Surg. 2025 Jul 1;15(7):6301-6325. doi: 10.21037/qims-2025-354. Epub 2025 Jun 30.
7
A novel recursive transformer-based U-Net architecture for enhanced multi-scale medical image segmentation.
Comput Biol Med. 2025 Sep;196(Pt A):110658. doi: 10.1016/j.compbiomed.2025.110658. Epub 2025 Jul 6.
8
DCMC-UNet: A Novel Segmentation Model for Carbon Traces in Oil-Immersed Transformers Improved with Dynamic Feature Fusion and Adaptive Illumination Enhancement.
Sensors (Basel). 2025 Jun 23;25(13):3904. doi: 10.3390/s25133904.
9
TLTNet: A novel transscale cascade layered transformer network for enhanced retinal blood vessel segmentation.
Comput Biol Med. 2024 Aug;178:108773. doi: 10.1016/j.compbiomed.2024.108773. Epub 2024 Jun 25.
10
UTSRMorph: A Unified Transformer and Superresolution Network for Unsupervised Medical Image Registration.
IEEE Trans Med Imaging. 2025 Feb;44(2):891-902. doi: 10.1109/TMI.2024.3467919. Epub 2025 Feb 4.