Cai Qi, Pan Yingwei, Yao Ting, Mei Tao
IEEE Trans Image Process. 2022;31:5706-5719. doi: 10.1109/TIP.2022.3201469. Epub 2022 Sep 2.
Recent progress in 2D object detection has featured Cascade RCNN, which uses a sequence of cascaded detectors to progressively improve proposal quality for high-quality object detection. However, there has been no evidence in support of building such cascade structures for 3D object detection, a challenging detection scenario with highly sparse LiDAR point clouds. In this work, we present a simple yet effective cascade architecture, named 3D Cascade RCNN, that allocates multiple detectors based on the voxelized point clouds in a cascade paradigm, progressively pursuing a higher-quality 3D object detector. Furthermore, we quantitatively define the sparsity level of the points within the 3D bounding box of each object as the point completeness score, which is used as the task weight for each proposal to guide the learning of each stage's detector. The idea is to assign higher weights to high-quality proposals with relatively complete point distributions, while down-weighting proposals with extremely sparse points, which often introduce noise during training. This completeness-aware re-weighting design makes the cascade paradigm better suited to sparse input data without increasing the FLOP budget. Through extensive experiments on both the KITTI dataset and the Waymo Open Dataset, we validate the superiority of the proposed 3D Cascade RCNN over state-of-the-art 3D object detection techniques. The source code is publicly available at https://github.com/caiqi/Cascasde-3D.
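The completeness-aware re-weighting described above can be illustrated with a small sketch. The abstract does not state the formula for the point completeness score, so everything below is an assumption made for illustration only: the score is taken to be a saturating ratio of the observed point count to a hypothetical reference count `n_ref`, boxes are treated as axis-aligned (yaw is ignored), and the function names are invented here rather than taken from the released code.

```python
import numpy as np

def points_in_box_count(points, center, size):
    """Count points inside an axis-aligned 3D box (assumption: yaw ignored)."""
    half = np.asarray(size, dtype=float) / 2.0
    offsets = np.abs(np.asarray(points, dtype=float) - np.asarray(center, dtype=float))
    inside = np.all(offsets <= half, axis=1)
    return int(inside.sum())

def completeness_score(n_points, n_ref=50):
    """Hypothetical score in [0, 1]: observed count over a reference count, capped at 1."""
    return min(1.0, n_points / n_ref)

def reweighted_loss(per_proposal_loss, scores):
    """Weighted mean of per-proposal losses, using completeness scores as weights."""
    weights = np.asarray(scores, dtype=float)
    losses = np.asarray(per_proposal_loss, dtype=float)
    # Guard against division by zero when every proposal has zero weight.
    return float((weights * losses).sum() / max(weights.sum(), 1e-8))

if __name__ == "__main__":
    pts = np.array([[0.0, 0.0, 0.0], [0.4, 0.4, 0.4], [2.0, 2.0, 2.0]])
    n = points_in_box_count(pts, center=(0.0, 0.0, 0.0), size=(1.0, 1.0, 1.0))
    score = completeness_score(n, n_ref=4)
    print(n, score, reweighted_loss([1.0, 3.0], [score, 0.0]))
```

In this sketch a proposal with few points contributes less to the stage loss than a densely observed one, and the weighting only rescales existing loss terms, so it adds no cost to the network's forward pass.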