HeightFormer: Explicit Height Modeling Without Extra Data for Camera-Only 3D Object Detection in Bird's Eye View.

Authors

Wu Yiming, Li Ruixiang, Qin Zequn, Zhao Xinhai, Li Xi

Publication information

IEEE Trans Image Process. 2025;34:689-700. doi: 10.1109/TIP.2024.3427701. Epub 2025 Jan 28.

DOI: 10.1109/TIP.2024.3427701
PMID: 39250369
Abstract

Vision-based Bird's Eye View (BEV) representation is an emerging perception formulation for autonomous driving. The core challenge is to construct BEV space with multi-camera features, which is a one-to-many ill-posed problem. Diving into all previous BEV representation generation methods, we found that most of them fall into two types: modeling depths in image views or modeling heights in the BEV space, mostly in an implicit way. In this work, we propose to explicitly model heights in the BEV space, which needs no extra data like LiDAR and can fit arbitrary camera rigs and types compared to modeling depths. Theoretically, we give proof of the equivalence between height-based methods and depth-based methods. Considering the equivalence and some advantages of modeling heights, we propose HeightFormer, which models heights and uncertainties in a self-recursive way. Without any extra data, the proposed HeightFormer could estimate heights in BEV accurately. Benchmark results show that the performance of HeightFormer achieves SOTA compared with those camera-only methods.
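The claimed equivalence between height-based and depth-based modeling can be made intuitive with a small geometric sketch. This is not the paper's proof. It assumes an idealized pinhole camera whose optical axis is parallel to a flat ground plane, and all names (`height_from_depth`, `depth_from_height`, `f`, `cy`, `cam_height`) are hypothetical. Under those assumptions, a point on the ray through image row `v` has a depth and a height that determine each other, except when the ray runs parallel to the ground.

```python
# Illustrative only: level pinhole camera above a flat ground plane.
# Camera frame: y points down, z points forward along the optical axis.
# f is the focal length in pixels, cy the principal-point row,
# cam_height the camera height above the ground in metres.


def height_from_depth(v, depth, f, cy, cam_height):
    """Height above ground of the point at `depth` on the ray through row v."""
    slope = (v - cy) / f  # downward drop per metre of forward depth
    return cam_height - depth * slope


def depth_from_height(v, height, f, cy, cam_height):
    """Depth of the point at `height` on the ray through row v."""
    slope = (v - cy) / f
    if slope == 0:
        # Every point on this ray has height cam_height, so depth is ambiguous.
        raise ValueError("ray is parallel to the ground; height does not fix depth")
    return (cam_height - height) / slope


if __name__ == "__main__":
    f, cy, cam_height = 1000.0, 450.0, 1.5  # assumed example values
    v, depth = 500.0, 20.0
    h = height_from_depth(v, depth, f, cy, cam_height)
    print("height:", h)
    print("recovered depth:", depth_from_height(v, h, f, cy, cam_height))
```

In this simplified setting the two functions are inverses of each other along any ray that is not parallel to the ground, so predicting one quantity carries the same information as predicting the other. Real camera rigs add rotation and lens distortion, which this sketch ignores.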

Similar articles

1
Kalman Filter-Based Fusion of LiDAR and Camera Data in Bird's Eye View for Multi-Object Tracking in Autonomous Vehicles.
Sensors (Basel). 2024 Dec 3;24(23):7718. doi: 10.3390/s24237718.
2
Delving Into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe.
IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2151-2170. doi: 10.1109/TPAMI.2023.3333838. Epub 2024 Mar 6.
3
Robust BEV 3D Object Detection for Vehicles with Tire Blow-Out.
Sensors (Basel). 2024 Jul 9;24(14):4446. doi: 10.3390/s24144446.
4
Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8665-8679. doi: 10.1109/TPAMI.2024.3414835. Epub 2024 Nov 6.
5
BEVFormer: Learning Bird's-Eye-View Representation From LiDAR-Camera Via Spatiotemporal Transformers.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec 10;PP. doi: 10.1109/TPAMI.2024.3515454.
6
Free Space Detection Using Camera-LiDAR Fusion in a Bird's Eye View Plane.
Sensors (Basel). 2021 Nov 17;21(22):7623. doi: 10.3390/s21227623.
7
IRBEVF-Q: Optimization of Image-Radar Fusion Algorithm Based on Bird's Eye View Features.
Sensors (Basel). 2024 Jul 16;24(14):4602. doi: 10.3390/s24144602.
8
PSANet: Pyramid Splitting and Aggregation Network for 3D Object Detection in Point Cloud.
Sensors (Basel). 2020 Dec 28;21(1):136. doi: 10.3390/s21010136.
9
Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving.
IEEE Trans Pattern Anal Mach Intell. 2025 May;47(5):3878-3894. doi: 10.1109/TPAMI.2025.3535960. Epub 2025 Apr 8.