
A hybrid attention multi-scale fusion network for real-time semantic segmentation.

Authors

Ye Baofeng, Xue Renzheng, Wu Qianlong

Affiliations

School of Computer and Control Engineering, Qiqihar University, Qiqihar, 161003, China.

Heilongjiang Key Laboratory of Big Data Network Security Detection and Analysis, Qiqihar University, Qiqihar, 161000, China.

Publication

Sci Rep. 2025 Jan 6;15(1):872. doi: 10.1038/s41598-024-84685-6.

Abstract

In semantic segmentation research, spatial information and receptive fields are essential. However, most current algorithms focus on acquiring semantic information and discard a large amount of spatial detail, so accuracy drops sharply even as real-time inference speed improves. This paper proposes a new method to address this issue. Specifically, we design a new module (HFRM) that combines channel attention and spatial attention to recover the spatial information lost during downsampling and to improve object classification accuracy. To fuse spatial and semantic information, we design a new module (HFFM) that merges features from two different levels more effectively and captures a larger receptive field through an attention mechanism. Additionally, edge detection is incorporated to strengthen the extraction of boundary information. Experimental results show that, for an input size of 512 × 1024, the proposed method achieves 73.6% mIoU at 176 frames per second (FPS) on the Cityscapes dataset and 70.0% mIoU at 146 FPS on CamVid. Compared with existing networks, our model achieves faster inference while maintaining accuracy, enhancing its practicality.
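The abstract describes HFRM as a combination of channel attention and spatial attention applied to features that lose detail during downsampling. As a rough illustration only, and not the authors' implementation, the sketch below shows one common way such a hybrid attention block can be composed in PyTorch: an SE-style channel branch followed by a CBAM-style spatial branch. The class name `HybridAttention`, the reduction ratio, and the ordering of the two branches are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class HybridAttention(nn.Module):
    """Generic channel + spatial attention sketch.

    NOTE: this is not the paper's HFRM; it only illustrates the idea of
    re-weighting a feature map with two complementary attention maps.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, predict per-channel weights.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: compress channels to a single H x W weight map.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-weight channels first.
        x = x * self.channel_mlp(x)
        # Build a spatial map from per-pixel mean and max over channels.
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        x = x * self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x


if __name__ == "__main__":
    feats = torch.randn(1, 64, 64, 128)      # low-level feature map
    out = HybridAttention(64)(feats)
    print(out.shape)                          # torch.Size([1, 64, 64, 128])
```

Applying the channel branch before the spatial branch is only one possible ordering; the paper's HFRM and HFFM may combine or sequence the two attentions differently.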


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5205/11701099/f22ca7c5c5c0/41598_2024_84685_Fig1_HTML.jpg
