Zhang Guanjin, Roslan Siti Nur Aliaa Binti, Wang Ci, Quan Ling
Department of Civil Engineering, Faculty of Engineering, University Putra Malaysia, 43400, Serdang, Selangor, Malaysia.
College of Resource and Environment, Anhui Science and Technology University, Chuzhou, 233100, China.
Sci Rep. 2023 Sep 28;13(1):16275. doi: 10.1038/s41598-023-43317-1.
In recent years, remote sensing images of various types have been widely applied in resource exploration, environmental protection, and land cover classification. However, relying on a single optical or synthetic aperture radar (SAR) image as the data source for land cover classification may not achieve the accuracy required for ground information monitoring. U-Net, a classical semantic segmentation network, is widely used for land cover classification of remote sensing images. Nonetheless, U-Net has limitations: limited classification accuracy, misclassification and omission of small-area terrains, and a large number of network parameters. To address these challenges, this paper proposes an improved approach that stacks optical and SAR images band-wise for land cover classification and enhances the U-Net network. The approach makes several modifications to the network architecture. First, the encoder-decoder framework serves as the backbone terrain-extraction network. Second, a convolutional block attention mechanism is introduced in the terrain-extraction stage. Third, convolutions with a stride of 2 replace the pooling layers, and Leaky ReLU is used as the network's activation function. This design offers several advantages: it enhances the network's ability to capture terrain characteristics along both the spatial and channel dimensions, mitigates the loss of terrain map information while reducing network parameters, and ensures non-zero gradients during training. The effectiveness of the proposed method is evaluated through land cover classification experiments on optical, SAR, and combined optical and SAR datasets.
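The three non-attention ingredients named above (band-wise stacking of optical and SAR images, stride-2 convolution in place of pooling, and Leaky ReLU) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy pixel values, the averaging kernel, and the negative slope of 0.01 are all assumptions chosen for illustration, and the convolutional block attention mechanism is omitted.

```python
def stack_bands(optical, sar):
    """Concatenate two images along the band (channel) axis.

    Each image is a list of bands; each band is a list of equal-length rows.
    """
    height, width = len(optical[0]), len(optical[0][0])
    for band in optical + sar:
        if len(band) != height or any(len(row) != width for row in band):
            raise ValueError("all bands must share the same height and width")
    return optical + sar


def leaky_relu(x, negative_slope=0.01):
    """Identity for x >= 0, a small linear slope otherwise (assumed slope)."""
    return x if x >= 0 else negative_slope * x


def strided_conv2d(image, kernels, stride=2):
    """Unpadded multi-band convolution producing one output channel.

    `kernels` holds one k x k kernel per input band. The per-band responses
    are summed and passed through Leaky ReLU. With stride=2 the output is
    roughly half the input size, which is the role pooling would play.
    """
    k = len(kernels[0])
    height, width = len(image[0]), len(image[0][0])
    out = []
    for top in range(0, height - k + 1, stride):
        row = []
        for left in range(0, width - k + 1, stride):
            total = 0.0
            for band, kernel in zip(image, kernels):
                for i in range(k):
                    for j in range(k):
                        total += band[top + i][left + j] * kernel[i][j]
            row.append(leaky_relu(total))
        out.append(row)
    return out


# Toy 4 x 4 scene: one optical band and one SAR band (made-up values).
optical = [[[1.0, 2.0, 3.0, 4.0] for _ in range(4)]]
sar = [[[-1.0, -1.0, -1.0, -1.0] for _ in range(4)]]
stacked = stack_bands(optical, sar)  # 2 bands, each 4 x 4

# One 2 x 2 averaging kernel per band (fixed here; learned in a real network).
mean_kernel = [[0.25, 0.25], [0.25, 0.25]]
downsampled = strided_conv2d(stacked, [mean_kernel, mean_kernel], stride=2)
print(downsampled)  # 2 x 2 feature map
```

Unlike max pooling, the stride-2 convolution has weights, so in a trained network the downsampling itself is learned. Leaky ReLU keeps a small non-zero gradient for negative inputs, which is the property the abstract refers to.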
The results show that the method achieves classification accuracies of 0.8905, 0.8609, and 0.908 on the optical, SAR, and combined datasets, respectively, with corresponding mean intersection over union (mIoU) values of 0.8104, 0.7804, and 0.8667. Compared with the traditional U-Net network, the method improves both classification accuracy and mIoU to some extent.
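The abstract does not state how the two metrics are computed. The sketch below assumes the common definitions: overall pixel accuracy (correctly labelled pixels divided by all pixels) and mIoU as the unweighted mean of per-class intersection over union, both derived from a confusion matrix. The function names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts pixels of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of pixels on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    return correct / total


def mean_iou(matrix):
    """Unweighted mean of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the labels nor the predictions
    are skipped to avoid division by zero.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp
        fn = sum(matrix[c]) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: six pixels, three land cover classes (made-up labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy = overall_accuracy(cm)
miou = mean_iou(cm)
print(f"accuracy={accuracy:.4f}, mIoU={miou:.4f}")
```

Because mIoU penalizes both false positives and false negatives for every class equally, it is typically lower than overall accuracy, which matches the pattern in the reported figures.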