Visual Object Recognition with 3D-Aware Features in KITTI Urban Scenes

Author information

Yebes J Javier, Bergasa Luis M, García-Garrido Miguel Ángel

Affiliation

Department of Electronics, University of Alcalá, Alcalá de Henares 28871, Spain.

Publication information

Sensors (Basel). 2015 Apr 20;15(4):9228-50. doi: 10.3390/s150409228.

DOI:10.3390/s150409228

PMID:25903553

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4431302/

Abstract

Driver assistance systems and autonomous robots rely on several sensors for environment perception. Compared to LiDAR systems, inexpensive vision sensors can capture the 3D scene as a driver perceives it, in terms of appearance and depth cues. Providing vehicles with 3D image understanding is therefore an essential step toward inferring scene semantics in urban environments. One of the challenges that arise when navigating naturalistic urban scenarios is the detection of road participants (e.g., cyclists, pedestrians and vehicles). In this regard, this paper tackles the detection and orientation estimation of cars, pedestrians and cyclists on the challenging, naturalistic KITTI images. This work proposes 3D-aware features computed from stereo color images in order to capture the appearance and depth characteristics of the objects in road scenes. The successful part-based object detector known as the deformable part model (DPM) is extended to learn richer models from the 2.5D data (color and disparity), and the training pipeline is analyzed in detail. A large set of experiments evaluates the proposals, and the best-performing approach is ranked on the KITTI website. This is the first work to report results with stereo data for the KITTI object challenge, achieving higher detection rates for the car and cyclist classes than a baseline DPM.
