Li Yansheng, Luo Junwei, Zhang Yongjun, Tan Yihua, Yu Jin-Gang, Bai Song
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11507-11523. doi: 10.1109/TPAMI.2024.3393024. Epub 2024 Nov 6.
Bridge detection in remote sensing images (RSIs) plays a crucial role in various applications, but it poses unique challenges compared to the detection of other objects. In RSIs, bridges exhibit considerable variations in their spatial scales and aspect ratios. Therefore, to ensure the visibility and integrity of bridges, it is essential to perform holistic bridge detection in large-size very-high-resolution (VHR) RSIs. However, the lack of datasets with large-size VHR RSIs limits the performance of deep learning algorithms on bridge detection. Because GPU memory limits the processing of large-size images, deep learning-based object detection methods commonly adopt a cropping strategy, which inevitably results in label fragmentation and discontinuous prediction. To ameliorate the scarcity of datasets, this paper proposes a large-scale dataset named GLH-Bridge comprising 6,000 VHR RSIs sampled from diverse geographic locations across the globe. These images encompass a wide range of sizes, varying from 2,048 × 2,048 to 16,384 × 16,384 pixels, and collectively feature 59,737 bridges. These bridges span diverse backgrounds, and each has been manually annotated with both an oriented bounding box (OBB) and a horizontal bounding box (HBB). Furthermore, we present an efficient network for holistic bridge detection (HBD-Net) in large-size RSIs. HBD-Net introduces a separate detector-based feature fusion (SDFF) architecture and is optimized via a shape-sensitive sample re-weighting (SSRW) strategy. The SDFF architecture performs inter-layer feature fusion (IFF) to incorporate multi-scale context in the dynamic image pyramid (DIP) of the large-size image, and the SSRW strategy is employed to balance the regression weights of bridges with various aspect ratios.
Based on the proposed GLH-Bridge dataset, we establish a bridge detection benchmark including the OBB and HBB tasks, and validate the effectiveness of the proposed HBD-Net. Additionally, cross-dataset generalization experiments on two publicly available datasets illustrate the strong generalization capability of the GLH-Bridge dataset.
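The relationship between the two annotation types and the label fragmentation caused by cropping can be illustrated with a minimal sketch. It is not taken from the paper: the (cx, cy, w, h, angle) box parameterization, the function names obb_to_hbb and tiles_touched, the non-overlapping 1,024-pixel tiling, and the example bridge are all assumptions chosen for illustration.

```python
import math


def obb_to_hbb(cx, cy, w, h, angle_deg):
    """Return the axis-aligned box (xmin, ymin, xmax, ymax) enclosing an oriented box.

    Assumed parameterization: center (cx, cy), side lengths w and h,
    rotation angle_deg in degrees.
    """
    t = math.radians(angle_deg)
    # Half-extents of the rotated rectangle along the x and y axes.
    half_x = (abs(w * math.cos(t)) + abs(h * math.sin(t))) / 2
    half_y = (abs(w * math.sin(t)) + abs(h * math.cos(t))) / 2
    return (cx - half_x, cy - half_y, cx + half_x, cy + half_y)


def tiles_touched(hbb, tile):
    """Count the non-overlapping tile x tile crops that a horizontal box intersects.

    This is an upper bound on the crops the object itself crosses, because the
    horizontal box of a rotated, elongated object also covers background.
    """
    xmin, ymin, xmax, ymax = hbb
    cols = int(xmax // tile) - int(xmin // tile) + 1
    rows = int(ymax // tile) - int(ymin // tile) + 1
    return cols * rows


# Hypothetical bridge: 3,000 px long, 40 px wide, rotated by 30 degrees.
bridge_hbb = obb_to_hbb(4096, 4096, 3000, 40, 30)
print(bridge_hbb, tiles_touched(bridge_hbb, 1024))
```

Under these assumptions, a single long bridge falls across several crops, so each crop sees only a fragment of the annotation. This is the kind of label fragmentation that motivates detecting bridges holistically in the full-size image.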