HAFormer: Unleashing the Power of Hierarchy-Aware Features for Lightweight Semantic Segmentation.

Authors

Xu Guoan, Jia Wenjing, Wu Tao, Chen Ligeng, Gao Guangwei

Publication

IEEE Trans Image Process. 2024;33:4202-4214. doi: 10.1109/TIP.2024.3425048. Epub 2024 Jul 22.

Abstract

Both Convolutional Neural Networks (CNNs) and Transformers have shown great success in semantic segmentation tasks. Efforts have been made to integrate CNNs with Transformer models to capture both local and global context interactions. However, there is still room for enhancement, particularly when considering constraints on computational resources. In this paper, we introduce HAFormer, a model that combines the hierarchical features extraction ability of CNNs with the global dependency modeling capability of Transformers to tackle lightweight semantic segmentation challenges. Specifically, we design a Hierarchy-Aware Pixel-Excitation (HAPE) module for adaptive multi-scale local feature extraction. During the global perception modeling, we devise an Efficient Transformer (ET) module streamlining the quadratic calculations associated with traditional Transformers. Moreover, a correlation-weighted Fusion (cwF) module selectively merges diverse feature representations, significantly enhancing predictive accuracy. HAFormer achieves high performance with minimal computational overhead and compact model size, achieving 74.2% mIoU on Cityscapes and 71.1% mIoU on CamVid test datasets, with frame rates of 105FPS and 118FPS on a single 2080Ti GPU. The source codes are available at https://github.com/XU-GITHUB-curry/HAFormer.
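The correlation-weighted fusion (cwF) idea described above can be illustrated with a toy sketch. Note the function name, the use of plain Pearson correlation, and the sigmoid gate are all assumptions made here for illustration; the paper's actual cwF module operates on multi-channel feature maps with learned projections rather than on flat vectors.

```python
import math

def correlation_weighted_fusion(f_cnn, f_trans):
    """Toy sketch of correlation-weighted fusion of two feature vectors.

    Computes a Pearson-style correlation between the CNN (local) and
    Transformer (global) features, turns it into a gate in (0, 1) via a
    sigmoid, and fuses the vectors as a convex combination. This is an
    illustrative simplification, not the paper's cwF formulation.
    """
    n = len(f_cnn)
    mean_c = sum(f_cnn) / n
    mean_t = sum(f_trans) / n
    cov = sum((c - mean_c) * (t - mean_t) for c, t in zip(f_cnn, f_trans))
    std_c = math.sqrt(sum((c - mean_c) ** 2 for c in f_cnn))
    std_t = math.sqrt(sum((t - mean_t) ** 2 for t in f_trans))
    corr = cov / (std_c * std_t + 1e-8)      # Pearson correlation in [-1, 1]
    w = 1.0 / (1.0 + math.exp(-corr))        # sigmoid gate -> weight in (0, 1)
    # Highly correlated features lean toward the CNN branch; anti-correlated
    # features lean toward the Transformer branch.
    return [w * c + (1.0 - w) * t for c, t in zip(f_cnn, f_trans)]
```

Because the gate is a convex weight, each fused element always lies between the two input features, so the fusion selectively blends rather than amplifies either representation.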

