Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion Utilizing a Multi-Scale Dilated Convolutional Pyramid.

Authors

Zhao Shan, Wang Zihao, Huo Zhanqiang, Zhang Fukai

Affiliation

School of Software, Henan Polytechnic University, Jiaozuo 454000, China.

Publication information

Sensors (Basel). 2024 Aug 16;24(16):5305. doi: 10.3390/s24165305.

Abstract

Deep learning has recently made significant progress in semantic segmentation. However, current methods face critical challenges: the segmentation process often lacks sufficient contextual information and attention mechanisms, low-level features lack semantic richness, and high-level features suffer from poor resolution. These limitations reduce a model's ability to understand and process scene details accurately, particularly in complex scenarios, and lead to inaccurate boundary delineation, misclassified regions, and poor handling of small or overlapping objects. To address these challenges, this paper proposes a Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion with the Multi-Scale Dilated Convolutional Pyramid (SDAMNet). Specifically, the Dilated Convolutional Atrous Spatial Pyramid Pooling (DCASPP) module is developed to enhance contextual information in semantic segmentation. Additionally, a Semantic Channel Space Details Module (SCSDM) is devised to improve the extraction of significant features through multi-scale feature fusion and adaptive feature selection, enhancing the model's perceptual capability for key regions and improving semantic understanding and segmentation performance. Furthermore, a Semantic Features Fusion Module (SFFM) is constructed to address the semantic deficiency in low-level features and the low resolution of high-level features. The effectiveness of SDAMNet is demonstrated on two datasets, where it improves Mean Intersection over Union (MIOU) by 2.89% and 2.13%, respectively, compared to the DeepLabv3+ network.
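For readers unfamiliar with the reported metric, the following is a minimal sketch of how Mean Intersection over Union is commonly computed from predicted and ground-truth label maps. It is not taken from the paper: the function name `mean_iou`, the confusion-matrix approach, and the choice to skip classes absent from both maps are assumptions made for illustration, and the authors' evaluation code may differ.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection

    # Assumption: classes absent from both maps are excluded from the mean.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy 2x3 label maps with three classes.
pred = np.array([[0, 0, 1], [1, 1, 2]])
target = np.array([[0, 1, 1], [1, 1, 2]])
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 0.5, 0.75, and 1.0, so `score` is 0.75. Whether absent classes are skipped or counted as zero changes the mean, so the convention should be checked before comparing numbers across papers.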

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8f00/11359814/c8ea843878c3/sensors-24-05305-g001.jpg
