IEEE Trans Image Process. 2023;32:2552-2567. doi: 10.1109/TIP.2023.3270032. Epub 2023 May 5.
Robust keypoint detection on omnidirectional images under large perspective variations is a key problem in many computer vision tasks. In this paper, we propose a perspectively equivariant keypoint learning framework, named OmniKL, to address this problem. Specifically, the framework is composed of a perspective module and a spherical module, each comprising a keypoint detector specific to the type of the input image and a shared descriptor that provides a uniform description for omnidirectional and perspective images. In these detectors, we propose a differentiable candidate position sorting operation for localizing keypoints, which directly sorts the scores of the candidate positions in a differentiable manner and returns the globally top-K keypoints on the image. This approach preserves the differentiability of the two modules, so they are end-to-end trainable. Moreover, we design a novel training strategy that combines self-supervised and co-supervised methods to train the framework without any labeled data. Extensive experiments on synthetic and real-world 360° image datasets demonstrate the effectiveness of OmniKL in detecting perspectively equivariant keypoints on omnidirectional images. Our source code is available online at https://github.com/vandeppce/sphkpt.
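The abstract's core technical idea, selecting the globally top-K keypoints through a differentiable sorting of candidate scores, can be illustrated with a generic soft-selection relaxation. The sketch below is an assumption for illustration only (it is not the exact OmniKL operation, whose details are in the paper): each of the K keypoints is a softmax-weighted average of candidate positions, and the winner of each round is suppressed before the next, so gradients flow through the selection.

```python
import numpy as np

def soft_top_k(scores, positions, k, tau=0.1):
    """Differentiable relaxation of global top-K keypoint selection.

    Hypothetical sketch: replaces the hard, non-differentiable argmax
    with a temperature-controlled softmax over candidate scores, then
    soft-suppresses the selected candidate between rounds.

    scores:    (N,) candidate keypoint scores
    positions: (N, 2) candidate (x, y) image positions
    returns:   (k, 2) soft-selected keypoint positions
    """
    s = scores.astype(float).copy()
    picked = []
    for _ in range(k):
        w = np.exp(s / tau)
        w /= w.sum()                  # soft one-hot over candidates
        picked.append(w @ positions)  # soft-selected (x, y) position
        s = s - 1e6 * w               # suppress the chosen candidate
    return np.stack(picked)

# As the temperature tau -> 0, the softmax weights approach hard
# one-hot vectors and the output approaches exact top-K selection.
keypoints = soft_top_k(np.array([0.1, 0.9, 0.5]),
                       np.array([[0., 0.], [1., 1.], [2., 2.]]),
                       k=2, tau=0.01)
```

At a small temperature this recovers the two highest-scoring candidates in score order; at larger temperatures the selection is smoother and more amenable to gradient-based training.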