Zheng Zhuo, Zhong Yanfei, Wang Junjue, Ma Ailong, Zhang Liangpei
IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):13715-13729. doi: 10.1109/TPAMI.2023.3296757. Epub 2023 Oct 3.
Geospatial object segmentation, a fundamental Earth vision task, consistently suffers from three problems in high spatial resolution (HSR) remote sensing imagery: scale variation, large intra-class variance of the background, and foreground-background imbalance. Generic semantic segmentation methods mainly address scale variation in natural scenes; the other two problems are insufficiently considered in large-area Earth observation scenarios. In this paper, we propose a foreground-aware relation network (FarSeg++) that alleviates these two problems through relation-based, optimization-based, and objectness-based foreground modeling. From the perspective of relations, the foreground-scene relation module improves the discrimination of foreground features by using the foreground-correlated contexts associated with the object-scene relation. From the perspective of optimization, foreground-aware optimization focuses training on foreground examples and hard background examples to achieve balanced optimization. From the perspective of objectness, a foreground-aware decoder improves the objectness representation, alleviating the objectness prediction problem, which an empirical upper bound analysis reveals to be the main bottleneck. We also introduce a new large-scale, high-resolution urban vehicle segmentation dataset to verify the effectiveness of the proposed method and to further advance objectness prediction. The experimental results suggest that FarSeg++ outperforms state-of-the-art generic semantic segmentation methods and achieves a better trade-off between speed and accuracy.
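The idea behind foreground-aware optimization, concentrating the training signal on foreground examples and hard background examples, can be illustrated with a focal-style weighting of per-pixel binary cross-entropy. The sketch below is an assumption for illustration only, not the loss defined in the paper: the function name `focal_weighted_bce`, the `gamma` parameter, and the weight `(1 - p_t) ** gamma` are all hypothetical choices borrowed from the generic focal-loss family.

```python
import math


def focal_weighted_bce(probs, labels, gamma=2.0, eps=1e-7):
    """Mean binary cross-entropy with focal-style per-pixel weights.

    probs  -- predicted foreground probabilities, one per pixel
    labels -- ground-truth labels (1 = foreground, 0 = background)
    gamma  -- hypothetical focusing parameter; 0 gives plain cross-entropy
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        # p_t is the probability assigned to the correct class.
        p_t = p if y == 1 else 1.0 - p
        # Easy pixels (p_t near 1) get a weight near 0, so the many
        # easy background pixels contribute little to the loss.
        total += (1.0 - p_t) ** gamma * -math.log(p_t)
    return total / len(probs)
```

Under this weighting, a background pixel that the model already rejects with confidence contributes almost nothing, while a background pixel mistaken for foreground keeps most of its loss. Setting `gamma=0.0` recovers the unweighted cross-entropy, which makes it easy to compare the two behaviors.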