Colomer Sylvain, Cuperlier Nicolas, Bresson Guillaume, Gaussier Philippe, Romain Olivier
Institut de Recherche Vedecom, Versailles, France.
Laboratoire ETIS UMR8051, CY Cergy Paris Université, ENSEA, CNRS, Cergy, France.
Front Robot AI. 2022 Feb 4;8:703811. doi: 10.3389/frobt.2021.703811. eCollection 2021.
Autonomous vehicles require precise and reliable self-localization to cope with dynamic environments. The field of visual place recognition (VPR) aims to meet this challenge by relying on the visual modality to recognize a place despite changes in the appearance of the perceived visual scene. In this paper, we propose to tackle the VPR problem following a neuro-cybernetic approach. To this end, the Log-Polar Max-Pi (LPMP) model is introduced. This bio-inspired neural network builds a neural representation of the environment through unsupervised one-shot learning. Inspired by the spatial cognition of mammals, visual information in the LPMP model is processed through two distinct pathways: a "what" pathway that extracts and learns the local visual signatures (landmarks) of a visual scene, and a "where" pathway that computes their azimuth. These two pieces of information are then merged to build a visuospatial code that is characteristic of the place where the visual scene was perceived. Three main contributions are presented in this article: 1) the LPMP model is studied and compared with NetVLAD and CoHog, two state-of-the-art VPR models; 2) a test benchmark for the evaluation of VPR models according to the type of environment traveled is proposed based on the ; and 3) the impact of using a novelty detector, which leads to an uneven paving of the environment, is evaluated in terms of localization performance and compared with a regular paving. Our experiments show that the LPMP model can achieve localization performance comparable to or better than that of NetVLAD and CoHog.
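The merging of the "what" and "where" pathways described above can be pictured as binding each landmark's identity to its azimuth, then matching the resulting visuospatial code against previously stored places. The Python sketch below is a loose illustration of that idea only; it is not the authors' LPMP implementation (whose Max-Pi merge and neural coding differ in detail), and all names here (PlaceCodeSketch, encode, learn_place, recognize, N_LANDMARKS, N_AZIMUTHS) as well as the outer-product binding and cosine matching are simplifying assumptions introduced for illustration.

    import numpy as np

    # Illustrative sketch of binding "what" (landmark identity) to
    # "where" (azimuth) activities into a visuospatial place code.
    # Sizes and mechanisms are assumptions, not the LPMP model itself.

    N_LANDMARKS = 32   # "what": landmark identity units
    N_AZIMUTHS = 64    # "where": azimuth units (coarse heading bins)

    class PlaceCodeSketch:
        """Stores one-shot visuospatial codes and recalls the best match."""

        def __init__(self):
            self.place_codes = []  # one (what x where) code per learned place

        def encode(self, what, where):
            # Outer product binds each landmark identity to its azimuth,
            # giving a code characteristic of the perceived viewpoint.
            return np.outer(what, where)

        def learn_place(self, what, where):
            # Unsupervised one-shot learning: store the code directly.
            self.place_codes.append(self.encode(what, where))

        def recognize(self, what, where):
            # Compare the current code against all stored places and
            # return the index and cosine similarity of the best match.
            code = self.encode(what, where).ravel()
            sims = [
                float(code @ p.ravel())
                / (np.linalg.norm(code) * np.linalg.norm(p) + 1e-12)
                for p in self.place_codes
            ]
            best = int(np.argmax(sims))
            return best, sims[best]

    # Minimal usage: learn five random "places", then query one of them.
    rng = np.random.default_rng(0)
    mem = PlaceCodeSketch()
    views = [(rng.random(N_LANDMARKS), rng.random(N_AZIMUTHS)) for _ in range(5)]
    for what, where in views:
        mem.learn_place(what, where)
    print(mem.recognize(*views[2]))  # expected best match: index 2

The one-shot storage step above also hints at why a novelty detector (contribution 3 in the abstract) yields an uneven paving: if new codes are stored only when no existing place matches well enough, places end up denser where the scene changes quickly and sparser where it does not.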