Joint Object Detection and Re-Identification for 3D Obstacle Multi-Camera Systems

Authors

Irene Cortés, Jorge Beltrán, Arturo de la Escalera, Fernando García

Affiliations

Department of Systems Engineering and Automation, Universidad Carlos III de Madrid (UC3M), 28911 Madrid, Spain.

Department of Signal Theory, Telematics, and Computer Science, Rey Juan Carlos University (URJC), 28922 Madrid, Spain.

Publication information

Sensors (Basel). 2023 Nov 25;23(23):9395. doi: 10.3390/s23239395.

DOI:10.3390/s23239395

PMID:38067768

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10708695/

Abstract

Growing on-board processing capabilities have led to more complex sensor configurations, enabling autonomous car prototypes to expand their operational scope. The joint use of LiDAR data and multiple cameras is now almost standard and poses new challenges for existing multi-modal perception pipelines, such as dealing with contradictory or redundant detections caused by inference on overlapping images. In this paper, we address this last issue in the context of sequential schemes like F-PointNets, where object candidates are obtained in image space and the final 3D bounding box is then inferred from point cloud information. To this end, we propose adding a re-identification branch to the 2D detector, i.e., Faster R-CNN, so that objects seen from adjacent cameras can be handled before the 3D box estimation takes place, removing duplicates and completing the object's point cloud. Extensive experimental evaluations covering both the 2D and 3D domains confirm the effectiveness of the proposed method. The results indicate that our approach outperforms conventional Non-Maximum Suppression (NMS) methods. In particular, we observed a significant gain of over 5% in accuracy for cars in camera overlap regions. These results highlight the potential of the combined detection and re-identification system in practical autonomous driving scenarios.
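
The abstract does not specify how re-identification features from adjacent cameras are matched, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each detection carries a re-ID embedding, a detector score, and the set of LiDAR point indices falling in its frustum. The dictionary keys, the cosine-similarity criterion, the greedy one-to-one matching, and the 0.8 threshold are all hypothetical choices made for illustration. Embeddings are assumed to be non-zero vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as sequences."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def merge_cross_camera(dets_a, dets_b, threshold=0.8):
    """Merge detections of the same object seen by two adjacent cameras.

    Each detection is a dict with hypothetical keys:
      "embedding": re-ID feature vector (sequence of floats)
      "score":     2D detector confidence
      "points":    set of LiDAR point indices inside the detection's frustum
    """
    # Score every cross-camera pair, then visit pairs from most to least similar.
    pairs = sorted(
        (
            (cosine(a["embedding"], b["embedding"]), i, j)
            for i, a in enumerate(dets_a)
            for j, b in enumerate(dets_b)
        ),
        reverse=True,
    )

    used_a, used_b, merged = set(), set(), []
    for sim, i, j in pairs:
        if sim < threshold:
            break  # all remaining pairs are even less similar
        if i in used_a or j in used_b:
            continue  # enforce one-to-one matching
        used_a.add(i)
        used_b.add(j)
        a, b = dets_a[i], dets_b[j]
        best = a if a["score"] >= b["score"] else b
        merged.append({
            "embedding": best["embedding"],
            "score": best["score"],
            # The union of both frustums' points completes the object's cloud.
            "points": a["points"] | b["points"],
        })

    # Detections visible in only one camera pass through unchanged.
    merged += [dict(d) for k, d in enumerate(dets_a) if k not in used_a]
    merged += [dict(d) for k, d in enumerate(dets_b) if k not in used_b]
    return merged
```

Unlike box-overlap NMS, this kind of matching does not require the two detections to share an image plane, which is why an appearance-based criterion suits camera overlap regions. Each merged entry would then feed a single frustum to the 3D box estimator instead of two partial ones.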
