Yu Chih-Chang, Chen Yuan-Di, Cheng Hsu-Yung, Jiang Chi-Lun
Department of Information and Computer Engineering, Chung Yuan Christian University, Taoyuan City 320, Taiwan.
Department of Computer Science and Information Engineering, National Central University, Taoyuan City 320, Taiwan.
Sensors (Basel). 2024 Oct 10;24(20):6539. doi: 10.3390/s24206539.
Advancements in satellite and aerial imagery technology have made it easier to obtain high-resolution remote sensing images, leading to widespread research and applications across many fields. Remote sensing image semantic segmentation is a crucial task that provides semantic and localization information for target objects. In addition to the large scale variations common to most semantic segmentation datasets, aerial images present unique challenges, including high background complexity and imbalanced foreground-background ratios. General semantic segmentation methods, however, primarily address scale variations in natural scenes and often neglect challenges specific to remote sensing imagery, resulting in inadequate foreground modeling. In this paper, we present a foreground-aware remote sensing semantic segmentation model. The model introduces a multi-scale convolutional attention mechanism and uses a feature pyramid network architecture to extract multi-scale features, addressing the scale-variation problem. Additionally, we introduce a Foreground-Scene Relation Module to mitigate false alarms: the model enhances foreground features by modeling the relationship between the foreground and the scene. In the loss function, a Soft Focal Loss is employed to focus on foreground samples during training, alleviating the foreground-background imbalance. Experimental results indicate that our proposed method outperforms current state-of-the-art general semantic segmentation methods and transformer-based methods on the LS dataset benchmark.
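The abstract does not give the formula for the Soft Focal Loss, but it describes the standard focal-loss idea of down-weighting easy samples so training concentrates on the rarer foreground class. A minimal sketch of that re-weighting, assuming the conventional focal-loss form (the function name, `gamma`, and `alpha` values here are illustrative, not the paper's):

```python
import numpy as np

def focal_style_loss(probs, targets, gamma=2.0, alpha=0.75):
    """Illustrative focal-style loss for binary foreground/background pixels.

    probs:   predicted foreground probabilities, shape (N,)
    targets: binary ground-truth labels (1 = foreground), shape (N,)
    gamma:   down-weights well-classified (easy) pixels
    alpha:   up-weights the rarer foreground class

    Parameter names and default values are assumptions; the paper's
    Soft Focal Loss may differ in its exact weighting scheme.
    """
    eps = 1e-7
    # Probability assigned to the true class of each pixel
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # Class-balancing weight: alpha for foreground, (1 - alpha) for background
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the loss of confidently correct pixels
    loss = -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + eps)
    return loss.mean()
```

The `(1 - p_t)^gamma` factor is what makes a hard foreground pixel (low predicted probability) dominate the gradient over many easy background pixels, which is the imbalance-mitigation behavior the abstract attributes to its loss.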