School of Automation and Electronic Information, Xiangtan University, Xiangtan 411105, China.
Shanghai Aerospace Control Technology Institute, Shanghai 201109, China.
Sensors (Basel). 2022 Jun 13;22(12):4477. doi: 10.3390/s22124477.
Contextual information and the dependencies between dimensions are vital in image semantic segmentation. In this paper, we propose a multiple-attention mechanism network (MANet) for semantic segmentation that is both effective and efficient. Concretely, our contributions are as follows: (1) a novel dual-attention mechanism that captures feature dependencies in the spatial and channel dimensions, where adjacent-position attention captures the dependencies between pixels well; (2) a new cross-dimensional interactive attention feature fusion module, which strengthens the fusion of fine location-structure information in low-level features with category-semantic information in high-level features. We conduct extensive experiments on semantic segmentation benchmarks, including the PASCAL VOC 2012 and Cityscapes datasets. Our MANet achieves mIoU scores of 75.5% and 72.8% on PASCAL VOC 2012 and Cityscapes, respectively, outperforming previous popular semantic segmentation networks under the same conditions.
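The channel branch of a dual-attention design of this kind can be sketched as follows. This is a minimal NumPy illustration of self-attention over channels (each channel reweighted by its affinity to every other channel), not the paper's actual implementation; the function names and the residual weight `gamma` are assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat, gamma=0.5):
    """Channel self-attention over a feature map of shape (C, H, W).

    Each channel is re-expressed as a weighted sum of all channels,
    with weights given by pairwise channel affinities, so global
    inter-channel dependencies are captured. `gamma` (hypothetical
    here; learned in practice) scales the attended residual.
    """
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)        # flatten spatial dims: (C, N)
    energy = x @ x.T                  # (C, C) channel affinity matrix
    attn = softmax(energy, axis=-1)   # row-normalised attention weights
    out = attn @ x                    # aggregate features across channels
    return (gamma * out + x).reshape(C, H, W)

feat = np.random.rand(4, 8, 8)
out = channel_attention(feat)
print(out.shape)  # (4, 8, 8)
```

A spatial-attention branch would be analogous but would compute an (N, N) affinity matrix over the H*W positions instead; the two branch outputs are then combined before fusion with low-level features.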