HeightFormer: Explicit Height Modeling Without Extra Data for Camera-Only 3D Object Detection in Bird's Eye View

Authors

Wu Yiming, Li Ruixiang, Qin Zequn, Zhao Xinhai, Li Xi

Publication information

IEEE Trans Image Process. 2025;34:689-700. doi: 10.1109/TIP.2024.3427701. Epub 2025 Jan 28.

Abstract

Vision-based Bird's Eye View (BEV) representation is an emerging perception formulation for autonomous driving. The core challenge is to construct the BEV space from multi-camera features, which is a one-to-many ill-posed problem. Examining all previous BEV representation generation methods, we find that most fall into two types: modeling depths in image views or modeling heights in the BEV space, mostly in an implicit way. In this work, we propose to explicitly model heights in the BEV space, which, compared to modeling depths, needs no extra data such as LiDAR and can fit arbitrary camera rigs and types. Theoretically, we prove the equivalence between height-based and depth-based methods. Given this equivalence and the advantages of modeling heights, we propose HeightFormer, which models heights and uncertainties in a self-recursive way. Without any extra data, HeightFormer can estimate heights in BEV accurately. Benchmark results show that HeightFormer achieves state-of-the-art (SOTA) performance among camera-only methods.