Zhao Shan, Wang Zihao, Huo Zhanqiang, Zhang Fukai
School of Software, Henan Polytechnic University, Jiaozuo 454000, China.
Sensors (Basel). 2024 Aug 16;24(16):5305. doi: 10.3390/s24165305.
Deep learning has recently driven significant progress in semantic segmentation, but current methods still face critical challenges: the segmentation process often lacks sufficient contextual information and attention mechanisms, low-level features lack semantic richness, and high-level features suffer from poor resolution. These limitations reduce a model's ability to understand and process scene details accurately, particularly in complex scenes, and lead to inaccurate boundary delineation, misclassified regions, and poor handling of small or overlapping objects. To address these challenges, this paper proposes SDAMNet, a Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion with the Multi-Scale Dilated Convolutional Pyramid. Specifically, a Dilated Convolutional Atrous Spatial Pyramid Pooling (DCASPP) module is developed to enrich the contextual information available for segmentation. A Semantic Channel Space Details Module (SCSDM) is devised to improve the extraction of salient features through multi-scale feature fusion and adaptive feature selection, strengthening the model's perception of key regions and improving semantic understanding and segmentation performance. Finally, a Semantic Features Fusion Module (SFFM) is constructed to address the semantic deficiency of low-level features and the low resolution of high-level features. The effectiveness of SDAMNet is demonstrated on two datasets, where it improves mean Intersection over Union (mIoU) by 2.89% and 2.13%, respectively, compared with the DeepLabv3+ network.
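For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean Intersection over Union is commonly computed from flattened per-pixel labels. It is not taken from the paper: the function names `confusion_matrix` and `mean_iou`, the toy labels, and the choice to skip classes that appear in neither the ground truth nor the prediction are assumptions made for illustration, and the abstract does not state which averaging convention the authors used.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels by (true class, predicted class). Hypothetical helper."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both labels and predictions are skipped
    rather than counted as IoU 0 or 1.
    """
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                  # true c, predicted other
        fp = sum(cm[r][c] for r in range(num_classes)) - tp  # predicted c, true other
        denom = tp + fp + fn
        if denom == 0:
            continue  # class never occurs; leave it out of the mean
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example: six "pixels", three classes (made-up data).
    truth = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    print(f"mIoU = {mean_iou(truth, pred, 3):.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. In practice the confusion matrix is accumulated over every image in a dataset before the per-class ratios are taken.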