

Towards a Robust Visual Place Recognition in Large-Scale vSLAM Scenarios Based on a Deep Distance Learning.

Affiliation

School of Mechanical and Electric Engineering, Soochow University, Suzhou 215131, China.

Publication Information

Sensors (Basel). 2021 Jan 5;21(1):310. doi: 10.3390/s21010310.

Abstract

The application of deep learning is blooming in the field of visual place recognition, which plays a critical role in visual Simultaneous Localization and Mapping (vSLAM) applications. The use of convolutional neural networks (CNNs) achieves better performance than handcrafted feature descriptors. However, visual place recognition remains a challenging task due to two major problems: perceptual aliasing and perceptual variability. Therefore, designing a customized distance learning method to express the intrinsic distance constraints in large-scale vSLAM scenarios is of great importance. Traditional deep distance learning methods usually use the triplet loss, which requires the mining of anchor images. This may, however, result in tedious, inefficient training and anomalous distance relationships. In this paper, a novel deep distance learning framework for visual place recognition is proposed. Through an in-depth analysis of the multiple constraints on the distance relationships in the visual place recognition problem, a multi-constraint loss function is proposed to optimize the distance constraint relationships in Euclidean space. The new framework can support any kind of CNN, such as AlexNet, VGGNet, and other user-defined networks, to extract more distinguishing features. We have compared the results with the traditional deep distance learning method, and the results show that the proposed method improves performance by 19-28%. Additionally, compared to some contemporary visual place recognition techniques, the proposed method improves performance on average by 40%/36% and 27%/24% on VGGNet/AlexNet using the New College and the TUM datasets, respectively. This verifies that the method is capable of handling appearance changes in complex environments.
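The abstract names both the standard triplet loss and the proposed multi-constraint loss but gives neither formulation. As a rough, hypothetical sketch of the contrast it draws, the PyTorch snippet below places a classic triplet loss next to one plausible multi-constraint variant that hinges every same-place distance against every different-place distance within a batch, so no single anchor image has to be mined per triplet. The function names, margin value, and exact constraint form are assumptions made for illustration, not the paper's actual definitions.

```python
# Hypothetical sketch (PyTorch): classic triplet loss vs. a batch-level
# multi-constraint distance loss of the general kind the abstract describes.
# The paper's exact constraint set is not given in the abstract.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Classic triplet loss: one mined anchor per (anchor, pos, neg) triplet."""
    d_pos = F.pairwise_distance(anchor, positive)  # anchor-positive distances
    d_neg = F.pairwise_distance(anchor, negative)  # anchor-negative distances
    return F.relu(d_pos - d_neg + margin).mean()

def multi_constraint_loss(embeddings, labels, margin=0.5):
    """Assumed multi-constraint variant: require every same-place pair in the
    batch to be closer (in Euclidean space) than every different-place pair,
    removing the need to mine individual anchor images."""
    d = torch.cdist(embeddings, embeddings)              # all pairwise distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # same-place mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=embeddings.device)
    pos = d[same & ~eye]                                 # same-place distances
    neg = d[~same]                                       # different-place distances
    # Hinge every positive distance against every negative distance.
    return F.relu(pos.unsqueeze(1) - neg.unsqueeze(0) + margin).mean()

# Example usage (shapes assumed): embeddings = cnn(images)  -> (N, D)
# loss = multi_constraint_loss(embeddings, place_ids)       -> scalar
```

In this batch-level form, a mini-batch containing several images per place supplies many positive/negative distance constraints at once, which is consistent with the abstract's claim that per-triplet anchor mining becomes unnecessary; whether the paper uses this exact pairing scheme cannot be determined from the abstract alone.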


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b85f/7796086/804ebdbb3d99/sensors-21-00310-g001.jpg
