
Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization.

Publication Info

IEEE Trans Image Process. 2022;31:3780-3792. doi: 10.1109/TIP.2022.3175601. Epub 2022 Jun 2.

DOI: 10.1109/TIP.2022.3175601
PMID: 35604972
Abstract

In this paper, we study the cross-view geo-localization problem to match images from different viewpoints. The key motivation underpinning this task is to learn a discriminative viewpoint-invariant visual representation. Inspired by the human visual system for mining local patterns, we propose a new framework called RK-Net to jointly learn the discriminative Representation and detect salient Keypoints with a single Network. Specifically, we introduce a Unit Subtraction Attention Module (USAM) that can automatically discover representative keypoints from feature maps and draw attention to the salient regions. USAM contains very few learning parameters but yields significant performance improvement and can be easily plugged into different networks. We demonstrate through extensive experiments that (1) by incorporating USAM, RK-Net facilitates end-to-end joint learning without the prerequisite of extra annotations. Representation learning and keypoint detection are two highly-related tasks. Representation learning aids keypoint detection. Keypoint detection, in turn, enriches the model capability against large appearance changes caused by viewpoint variants. (2) USAM is easy to implement and can be integrated with existing methods, further improving the state-of-the-art performance. We achieve competitive geo-localization accuracy on three challenging datasets, i.e., University-1652, CVUSA and CVACT. Our code is available at https://github.com/AggMan96/RK-Net.
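The abstract describes USAM as a parameter-light module that scores each feature-map unit by subtraction against its surroundings and gates the map toward salient regions. A minimal NumPy sketch of that idea follows; this is not the paper's implementation (see the linked repository for that), and the 4-neighbour kernel, absolute-difference response, and sigmoid gate here are all assumptions made for illustration:

```python
import numpy as np

def unit_subtraction_attention(feat):
    """Illustrative subtraction-based attention over a (H, W) feature map.

    Each unit's response is the summed absolute difference to its
    4-connected neighbours; high local contrast is treated as salient,
    and a sigmoid of the response re-weights the original features.
    """
    padded = np.pad(feat, 1, mode="edge")
    center = padded[1:-1, 1:-1]
    diffs = (
        np.abs(center - padded[:-2, 1:-1])    # up
        + np.abs(center - padded[2:, 1:-1])   # down
        + np.abs(center - padded[1:-1, :-2])  # left
        + np.abs(center - padded[1:-1, 2:])   # right
    )
    attn = 1.0 / (1.0 + np.exp(-diffs))       # sigmoid gate
    return feat * attn                        # re-weighted feature map
```

Because the gate is computed from the features themselves, the sketch has no learned parameters at all, which mirrors the abstract's claim that USAM adds very few parameters and can be dropped into different networks.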


Similar Articles

1. Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization.
   IEEE Trans Image Process. 2022;31:3780-3792. doi: 10.1109/TIP.2022.3175601. Epub 2022 Jun 2.
2. Perspectively Equivariant Keypoint Learning for Omnidirectional Images.
   IEEE Trans Image Process. 2023;32:2552-2567. doi: 10.1109/TIP.2023.3270032. Epub 2023 May 5.
3. UAV's Status Is Worth Considering: A Fusion Representations Matching Method for Geo-Localization.
   Sensors (Basel). 2023 Jan 8;23(2):720. doi: 10.3390/s23020720.
4. Variational Structured Attention Networks for Deep Visual Representation Learning.
   IEEE Trans Image Process. 2022 Mar 2;PP. doi: 10.1109/TIP.2021.3137647.
5. Dynamic Keypoint Detection Network for Image Matching.
   IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14404-14419. doi: 10.1109/TPAMI.2023.3307889. Epub 2023 Nov 3.
6. Object and spatial discrimination makes weakly supervised local feature better.
   Neural Netw. 2024 Dec;180:106697. doi: 10.1016/j.neunet.2024.106697. Epub 2024 Sep 12.
7. Tasks Integrated Networks: Joint Detection and Retrieval for Image Search.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jan;44(1):456-473. doi: 10.1109/TPAMI.2020.3009758. Epub 2021 Dec 7.
8. Attention-Guided Discriminative Region Localization and Label Distribution Learning for Bone Age Assessment.
   IEEE J Biomed Health Inform. 2022 Mar;26(3):1208-1218. doi: 10.1109/JBHI.2021.3095128. Epub 2022 Mar 7.
9. Learning Semantic-Aware Local Features for Long Term Visual Localization.
   IEEE Trans Image Process. 2022;31:4842-4855. doi: 10.1109/TIP.2022.3187565. Epub 2022 Jul 20.
10. Multi-Task Structure-Aware Context Modeling for Robust Keypoint-Based Object Tracking.
    IEEE Trans Pattern Anal Mach Intell. 2019 Apr;41(4):915-927. doi: 10.1109/TPAMI.2018.2818132. Epub 2018 Mar 22.

Cited By

1. UPANets: Learning from the Universal Pixel Attention Neworks.
   Entropy (Basel). 2022 Sep 4;24(9):1243. doi: 10.3390/e24091243.