Zhou Yunsong, He Yuan, Zhu Hongzi, Wang Cheng, Li Hongyang, Jiang Qinhong
IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):10114-10128. doi: 10.1109/TPAMI.2021.3136899. Epub 2022 Nov 7.
Monocular 3D object detection is an important task in autonomous driving. It easily becomes intractable when the ego-car's pose changes with respect to the ground plane, which is common because of slight fluctuations in road smoothness and slope. Lacking insight from industrial applications, existing methods developed on open datasets neglect camera pose information, which inevitably leaves the detector susceptible to changes in the camera extrinsic parameters. Such perturbation of objects is very common in most autonomous driving scenarios for industrial products. To this end, we propose a novel method that captures the camera pose so that the detector is free from extrinsic perturbation. Specifically, the proposed framework predicts the camera extrinsic parameters by detecting the vanishing point and horizon change. A converter is designed to rectify the perturbed features in the latent space. As a result, our 3D detector works independently of extrinsic parameter variations and produces accurate results in realistic cases, e.g., potholed and uneven roads, which almost all existing monocular detectors fail to handle. Experiments demonstrate that our method outperforms other state-of-the-art methods by a large margin on both the KITTI 3D and nuScenes datasets.
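The abstract says that camera extrinsic parameters are predicted from the vanishing point and the horizon change. The sketch below illustrates only the textbook pinhole-camera relation between those image cues and the camera's pitch and roll. It is not the authors' network or code. The names `Intrinsics`, `pitch_from_vanishing_point`, and `roll_from_horizon`, the sign conventions, and the sample numbers are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole intrinsics in pixels (hypothetical container, not from the paper)."""
    fx: float
    fy: float
    cx: float
    cy: float


def pitch_from_vanishing_point(v_vp: float, K: Intrinsics) -> float:
    """Pitch in radians from the image row of the forward vanishing point.

    Assumed convention: image y grows downward, and pitch is positive when
    the camera tilts upward, which moves the vanishing point below cy.
    """
    return math.atan2(v_vp - K.cy, K.fy)


def roll_from_horizon(p1, p2, K: Intrinsics) -> float:
    """Roll in radians from two horizon points (u, v), with p2 right of p1.

    The pixel offsets are normalized by the focal lengths so the angle is
    measured in the normalized image plane. The sign is a convention choice.
    """
    (u1, v1), (u2, v2) = p1, p2
    return math.atan2((v2 - v1) / K.fy, (u2 - u1) / K.fx)


# Synthetic check: build image cues from known angles, then recover them.
K = Intrinsics(fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
v_vp = K.cy + K.fy * math.tan(math.radians(2.0))        # camera pitched up 2 degrees
right_v = 350.0 + 1280.0 * math.tan(math.radians(1.0))  # horizon tilted 1 degree
pitch = pitch_from_vanishing_point(v_vp, K)
roll = roll_from_horizon((0.0, 350.0), (1280.0, right_v), K)
```

In a full pipeline, the vanishing point and horizon would come from a detector rather than from synthetic values. How the recovered pose is used to rectify latent features is specific to the paper and is not reproduced here.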