
The ApolloScape Open Dataset for Autonomous Driving and Its Application.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2702-2719. doi: 10.1109/TPAMI.2019.2926463. Epub 2019 Jul 2.

Abstract

Autonomous driving has attracted tremendous attention, especially in the past few years. The key techniques for a self-driving car include solving tasks such as 3D map construction, self-localization, parsing the driving road, and understanding objects, which enable vehicles to reason and act. However, large-scale datasets for training and system evaluation are still a bottleneck for developing robust perception models. In this paper, we present the ApolloScape dataset [1] and its applications for autonomous driving. Compared with existing public datasets from real scenes, e.g., KITTI [2] or Cityscapes [3], ApolloScape contains much larger and richer labelling, including holistic semantic dense point clouds for each site, stereo imagery, per-pixel semantic labelling, lane-mark labelling, instance segmentation, 3D car instances, and highly accurate locations for every frame in various driving videos from multiple sites, cities, and daytimes. For each task, it contains at least 15x more images than state-of-the-art datasets. To label such a complete dataset, we developed various tools and algorithms specialized for each task to accelerate the labelling process, such as joint 3D-2D segment labelling and active labelling in videos. Based on ApolloScape, we are able to develop algorithms that jointly consider the learning and inference of multiple tasks. In this paper, we provide a sensor fusion scheme integrating camera videos, consumer-grade motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robust self-localization and semantic segmentation for autonomous driving. We show that, in practice, sensor fusion and joint learning of multiple tasks are beneficial for achieving a more robust and accurate system. We expect that our dataset and the proposed algorithms can support and motivate researchers toward further development of multi-sensor fusion and multi-task learning in the field of computer vision.
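To make the fusion idea in the abstract concrete, below is a minimal, illustrative sketch, not the authors' implementation: it assumes the GPS/IMU pose prior and a pose obtained by aligning the camera view against the 3D semantic map are fused by covariance weighting, and that a label prior rendered from the semantic map at the fused pose is blended with per-pixel CNN scores. All function names (fuse_pose, segmentation_with_map_prior) and the specific weighting rules are assumptions for illustration, not part of the ApolloScape toolkit or the paper's actual networks.

```python
# Illustrative sketch only (assumed design, not the paper's method):
# fuse a noisy GPS/IMU pose with a map-alignment pose, then use the
# semantic map rendered at the fused pose as a prior for segmentation.
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax over the class axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def fuse_pose(gps_imu_pose, map_alignment_pose, gps_imu_cov, alignment_cov):
    """Fuse two 6-DoF pose estimates (x, y, z, roll, pitch, yaw) by
    covariance-weighted averaging -- a simple stand-in for a learned
    pose-fusion module."""
    w_prior = np.linalg.inv(gps_imu_cov)
    w_align = np.linalg.inv(alignment_cov)
    fused_cov = np.linalg.inv(w_prior + w_align)
    fused = fused_cov @ (w_prior @ gps_imu_pose + w_align @ map_alignment_pose)
    return fused, fused_cov


def segmentation_with_map_prior(cnn_logits, rendered_map_labels, prior_weight=0.3):
    """Blend per-pixel CNN class scores with a one-hot semantic prior
    rendered from the 3D semantic map at the fused pose."""
    num_classes = cnn_logits.shape[-1]
    prior = np.eye(num_classes)[rendered_map_labels]  # (H, W, C) one-hot
    scores = (1.0 - prior_weight) * softmax(cnn_logits) + prior_weight * prior
    return scores.argmax(axis=-1)


if __name__ == "__main__":
    # Toy example: two 6-DoF pose estimates with diagonal covariances,
    # where the map-alignment pose is trusted more than GPS/IMU.
    prior_pose = np.array([10.0, 5.0, 0.0, 0.0, 0.0, 1.57])
    align_pose = np.array([10.2, 4.9, 0.0, 0.0, 0.0, 1.55])
    fused, _ = fuse_pose(prior_pose, align_pose,
                         np.diag([2.0] * 6), np.diag([0.5] * 6))
    print("fused pose:", fused.round(3))

    # Toy segmentation: 4x4 image, 3 classes, random logits and map labels.
    logits = np.random.randn(4, 4, 3)
    map_labels = np.random.randint(0, 3, size=(4, 4))
    print("labels:\n", segmentation_with_map_prior(logits, map_labels))
```

In this sketch the covariance weights play the role the paper assigns to learned fusion: the lower-variance source dominates the fused pose, and the map prior only nudges the segmentation rather than overriding the image evidence.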

