RGB-D Object SLAM Using Quadrics for Indoor Environments

Affiliation

Robotics Institute, Beihang University, Beijing 100191, China.

Publication information

Sensors (Basel). 2020 Sep 9;20(18):5150. doi: 10.3390/s20185150.

DOI:10.3390/s20185150

PMID:32917023

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7571184/

Abstract

Indoor service robots need to build an object-centric semantic map to understand and execute human instructions. Conventional visual simultaneous localization and mapping (SLAM) systems build maps using geometric features such as points, lines, and planes as landmarks, but they lack a semantic understanding of the environment. This paper proposes an object-level semantic SLAM algorithm based on RGB-D data that uses a quadric surface as the object model, compactly representing each object's position, orientation, and shape. It proposes and derives two RGB-D camera-quadric observation models: a complete model and a partial model. The complete model combines object detection and point cloud data to estimate a complete ellipsoid from a single RGB-D frame. The partial model is activated when depth data are severely missing because of illumination or occlusion; it uses bounding boxes from object detection to constrain objects. Compared with state-of-the-art quadric SLAM algorithms that use a monocular observation model, the RGB-D observation model requires fewer observations and smaller viewing-angle changes, which helps improve accuracy and robustness. The paper also introduces a nonparametric pose graph to solve data association in the back end and applies it to the quadric surface model. The algorithm was evaluated on two public datasets and an author-collected mobile robot dataset recorded in a home-like environment. It showed clear improvements in localization accuracy and mapping quality over two state-of-the-art object SLAM algorithms.
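
To make the quadric object model concrete, the following is a minimal sketch of how an ellipsoid can be encoded as a dual quadric and projected into an image to obtain an axis-aligned bounding box, which is the kind of constraint a bounding-box observation provides. It uses the standard dual-quadric formulation from the quadric SLAM literature (Q* = Z diag(a², b², c², -1) Zᵀ and C* = P Q* Pᵀ), not necessarily the exact parameterization derived in the paper. The function names, the yaw-only rotation, and the assumption that the camera sits at the world origin are all illustrative choices made here.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as nested lists."""
    return [list(col) for col in zip(*m)]


def dual_quadric(center, semi_axes, yaw=0.0):
    """Build the 4x4 dual quadric of an ellipsoid.

    Assumption for brevity: the ellipsoid is rotated only about the
    z axis (yaw); a full model would use a general 3D rotation.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = center
    a, b, h = semi_axes
    # Homogeneous pose of the ellipsoid in the world frame.
    Z = [[c, -s, 0.0, tx],
         [s, c, 0.0, ty],
         [0.0, 0.0, 1.0, tz],
         [0.0, 0.0, 0.0, 1.0]]
    # Shape: squared semi-axes on the diagonal, -1 in the last entry.
    D = [[a * a, 0.0, 0.0, 0.0],
         [0.0, b * b, 0.0, 0.0],
         [0.0, 0.0, h * h, 0.0],
         [0.0, 0.0, 0.0, -1.0]]
    return matmul(matmul(Z, D), transpose(Z))


def project_to_bbox(Q, K):
    """Project a dual quadric to an image bounding box.

    Assumption: the camera is at the world origin looking along +z,
    so the projection matrix is P = K [I | 0].
    Returns (xmin, ymin, xmax, ymax) in pixels.
    """
    P = [list(row) + [0.0] for row in K]          # 3x4 projection matrix
    C = matmul(matmul(P, Q), transpose(P))        # 3x3 dual conic

    def tangent_pair(i):
        # A line l is tangent to the conic when l^T C l = 0. For the
        # vertical line x = u, l = (1, 0, -u), which gives the quadratic
        # C[2][2] u^2 - 2 C[0][2] u + C[0][0] = 0 (same form for y).
        disc = C[i][2] ** 2 - C[i][i] * C[2][2]
        if disc < 0.0 or C[2][2] == 0.0:
            raise ValueError("ellipsoid does not project to a closed ellipse")
        root = math.sqrt(disc)
        lo, hi = sorted(((C[i][2] - root) / C[2][2],
                         (C[i][2] + root) / C[2][2]))
        return lo, hi

    xmin, xmax = tangent_pair(0)
    ymin, ymax = tangent_pair(1)
    return xmin, ymin, xmax, ymax
```

For example, calling `project_to_bbox(dual_quadric((0.0, 0.0, 5.0), (1.0, 1.0, 1.0)), K)` with a pinhole intrinsic matrix `K` returns a box centered on the principal point, because a sphere on the optical axis projects symmetrically. In a SLAM back end, the difference between such a predicted box and the detector's box would serve as the residual for a bounding-box constraint.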
