Jin Xu, Junping Yin, Juan Zhang, Tianyan Gao
Institute of Applied Physics and Computational Mathematics, China Academy of Engineering Physics, Beijing, 100193, China.
Data Fusion Laboratory, Shanghai Zhangjiang Institute of Mathematics, Shanghai, 201210, China.
Sci Rep. 2025 Sep 29;15(1):33431. doi: 10.1038/s41598-025-18935-6.
Cross-view geo-localization aims to match images of the same location captured from different perspectives, such as drone and satellite views. This task is inherently challenging due to significant visual discrepancies caused by viewpoint variations. Existing approaches often rely on global descriptors or limited directional cues, failing to effectively integrate diverse spatial information and global-local interactions. To address these limitations, we propose the Global-Local Quadrant Interaction Network (GLQINet), which enhances feature representation through two key components: the Quadrant Insight Module (QIM) and the Integrated Global-Local Attention Module (IGLAM). QIM partitions feature maps into directional quadrants, refining multi-scale spatial representations while preserving intra-class consistency. Meanwhile, IGLAM bridges global and local features by aggregating high-association feature stripes, reinforcing semantic coherence and spatial correlations. Extensive experiments on the University-1652 and SUES-200 benchmarks demonstrate that GLQINet significantly improves geo-localization accuracy, achieving state-of-the-art performance and effectively mitigating cross-view discrepancies.
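The quadrant partitioning performed by QIM can be illustrated with a minimal sketch: split a feature map into four directional quadrants and pool each into a local descriptor. This is a hypothetical simplification for intuition only; the paper's actual QIM additionally performs multi-scale refinement and intra-class consistency preservation, which are omitted here.

```python
import numpy as np

def quadrant_partition(feat):
    """Split a (C, H, W) feature map into four directional quadrants
    (NW, NE, SW, SE) and average-pool each into a C-dim descriptor.

    Hypothetical sketch of quadrant-style partitioning; not the
    authors' implementation (QIM's multi-scale refinement is omitted).
    """
    c, h, w = feat.shape
    hh, hw = h // 2, w // 2
    quadrants = [
        feat[:, :hh, :hw],  # NW
        feat[:, :hh, hw:],  # NE
        feat[:, hh:, :hw],  # SW
        feat[:, hh:, hw:],  # SE
    ]
    # One pooled descriptor per quadrant -> (4, C) local representation
    return np.stack([q.mean(axis=(1, 2)) for q in quadrants])

feat = np.random.rand(8, 16, 16)      # toy feature map: 8 channels, 16x16
descriptors = quadrant_partition(feat)
print(descriptors.shape)              # (4, 8)
```

The four pooled descriptors give direction-aware local cues that a matching head can compare across views, complementing a single global descriptor.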