
Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis

Authors

Zhang Zichao, Sattler Torsten, Scaramuzza Davide

Affiliations

Robotics and Perception Group, University of Zurich, Zürich, Switzerland.

Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Prague, Czech Republic.

Publication

Int J Comput Vis. 2021;129(4):821-844. doi: 10.1007/s11263-020-01399-8. Epub 2020 Dec 23.

Abstract

Visual localization is one of the key enabling technologies for autonomous driving and augmented reality. High-quality datasets with accurate 6 Degree-of-Freedom (DoF) reference poses are the foundation for benchmarking and improving existing methods. Traditionally, reference poses have been obtained via Structure-from-Motion (SfM). However, SfM itself relies on local features, which are prone to fail when images are taken under different conditions, e.g., day/night changes. At the same time, manually annotating feature correspondences is not scalable and potentially inaccurate. In this work, we propose a semi-automated approach to generate reference poses based on feature matching between renderings of a 3D model and real images via learned features. Given an initial pose estimate, our approach iteratively refines the pose based on feature matches against a rendering of the model from the current pose estimate. We significantly improve the nighttime reference poses of the popular Aachen Day-Night dataset, showing that state-of-the-art visual localization methods perform better (up to 47%) than predicted by the original reference poses. We extend the dataset with new nighttime test images, provide uncertainty estimates for our new reference poses, and introduce a new evaluation criterion. We will make our reference poses and our framework publicly available upon publication.
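The render-match-refine loop described in the abstract can be illustrated with a heavily simplified toy sketch. Everything below is a hypothetical stand-in, not the paper's implementation: the pose is reduced to a 3-DoF translation (the paper refines full 6-DoF poses), "rendering" is replaced by pinhole projection of a handful of model points, and the learned feature matches are assumed given. Each iteration "renders" from the current pose estimate, compares against the observed 2D features, and applies a Gauss-Newton update on the reprojection error.

```python
import numpy as np

# Hypothetical 3D points standing in for a textured 3D model.
MODEL_POINTS = np.array([[0.0, 0.0, 5.0],
                         [1.0, 0.0, 6.0],
                         [0.0, 1.0, 5.5],
                         [1.0, 1.0, 6.5]])

def project(points, pose_t):
    """Pinhole projection of 3D points after translating by pose_t
    (rotation omitted in this toy version)."""
    p = points - pose_t
    return p[:, :2] / p[:, 2:3]

def refine_pose(observed_2d, pose_init, iters=20):
    """Iteratively refine a translation-only pose so that the reprojected
    ("rendered") model points match the observed 2D features."""
    pose = pose_init.astype(float).copy()
    eps = 1e-6
    for _ in range(iters):
        rendered = project(MODEL_POINTS, pose)   # "render" from current estimate
        residual = (observed_2d - rendered).ravel()
        # Numerical Jacobian of the projection w.r.t. the pose (finite differences).
        J = np.zeros((residual.size, 3))
        for k in range(3):
            d = np.zeros(3)
            d[k] = eps
            J[:, k] = ((project(MODEL_POINTS, pose + d) - rendered) / eps).ravel()
        # Gauss-Newton step: solve J @ step ≈ residual in the least-squares sense.
        step, *_ = np.linalg.lstsq(J, residual, rcond=None)
        pose += step
    return pose
```

In the actual pipeline the matches come from learned features between a rendering and a real image, and the update is a full 6-DoF pose solver; this sketch only conveys the shape of the iteration.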


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2437/8550273/22d28b5b995e/11263_2020_1399_Fig1_HTML.jpg
