Li Xiujuan, Li Junhuai
School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, 710048, China.
School of Information, Xi'an University of Finance and Economics, Xi'an, 710100, China.
Sci Rep. 2024 Mar 8;14(1):5745. doi: 10.1038/s41598-024-56211-1.
Semantic segmentation of remote sensing images (RSI) is an important research direction in remote sensing technology. This paper proposes a multi-feature fusion and channel attention network, MFCA-Net, which aims to improve the segmentation accuracy of remote sensing images and the recognition of small target objects. The architecture follows an encoder-decoder structure. The encoder consists of an improved MobileNet V2 (IMV2) and a multi-feature dense fusion (MFDF) module. In IMV2, an attention mechanism is introduced twice to strengthen feature extraction, while MFDF is designed to yield denser feature sampling points and larger receptive fields. In the decoder, three branches of shallow features from the backbone network are fused with the deep features, and upsampling is then applied to achieve pixel-level classification. Comparative experiments against six state-of-the-art methods show that the proposed network significantly improves segmentation accuracy and recognizes small target objects more reliably. For example, MFCA-Net improves MIoU by about 3.65-23.55% on the Vaihingen dataset.
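For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection over union (MIoU) is commonly computed for a segmentation result. It is not taken from the paper: the function name `mean_iou`, the flattened toy label lists, and the choice to skip classes that appear in neither the prediction nor the ground truth are all assumptions made for illustration.

```python
def mean_iou(pred, truth, num_classes):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and truth are flat sequences of integer class labels, one per pixel.
    (Hypothetical helper; skipping absent classes is an assumed convention.)
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if p == t:
            # Pixel is correct: it belongs to both the intersection and union of class p.
            intersection[p] += 1
            union[p] += 1
        else:
            # Pixel is wrong: it enlarges the union of the predicted and the true class.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, truth, 3), 4))  # per-class IoU: 1/3, 2/3, 1/2 -> 0.5
```

Because MIoU averages per-class scores, a class covering few pixels counts as much as a large one, which is why the metric is sensitive to how well small target objects are segmented.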