

Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline.

Authors

Li Yangguang, Huang Bin, Chen Zeren, Cui Yufeng, Liang Feng, Shen Mingzhu, Liu Fenggang, Xie Enze, Sheng Lu, Ouyang Wanli, Shao Jing

Publication

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8665-8679. doi: 10.1109/TPAMI.2024.3414835. Epub 2024 Nov 6.

DOI: 10.1109/TPAMI.2024.3414835
PMID: 38875097
Abstract

Recently, perception tasks based on the Bird's-Eye View (BEV) representation have drawn increasing attention, and the BEV representation is promising as the foundation for next-generation Autonomous Vehicle (AV) perception. However, most existing BEV solutions either require considerable resources for on-vehicle inference or deliver modest performance. This paper proposes a simple yet effective framework, termed Fast-BEV, which performs faster BEV perception on on-vehicle chips. Toward this goal, we first empirically find that the BEV representation can be sufficiently powerful without an expensive transformer-based transformation or depth representation. Our Fast-BEV consists of five parts. We innovatively propose (1) a lightweight, deployment-friendly view transformation that quickly transfers 2D image features to 3D voxel space, (2) a multi-scale image encoder that leverages multi-scale information for better performance, and (3) an efficient BEV encoder specifically designed to speed up on-vehicle inference. We further introduce (4) a strong data augmentation strategy for both image and BEV space to avoid over-fitting, and (5) a multi-frame feature fusion mechanism to leverage temporal information. Among these, (1) and (3) make Fast-BEV fast to infer and easy to deploy on on-vehicle chips, while (2), (4), and (5) ensure that Fast-BEV has competitive performance. Together, these make Fast-BEV a solution with high performance and fast inference speed that is deployment-friendly on the on-vehicle chips of autonomous driving. In experiments on a 2080Ti platform, our R50 model runs at 52.6 FPS with 47.3% NDS on the nuScenes validation set, exceeding the 41.3 FPS and 47.5% NDS of the BEVDepth-R50 model (Li et al., 2022) and the 30.2 FPS and 45.7% NDS of the BEVDet4D-R50 model (J. Huang and G. Huang, 2022). Our largest model (R101@900×1600) establishes a competitive 53.5% NDS on the nuScenes validation set. We further develop a benchmark with considerable accuracy and efficiency on current popular on-vehicle chips.
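The lightweight view transformation in (1) can be understood as projecting each 3D voxel center into the camera image plane and copying the 2D feature at that pixel, with the voxel-to-pixel correspondence precomputed once per camera calibration so that runtime work reduces to a table lookup. A minimal NumPy sketch of that idea follows; the function names, array shapes, and the single-camera setup are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def build_lut(voxel_centers, cam_to_img, img_hw):
    """Precompute, for each 3D voxel center, the image pixel it projects to.

    voxel_centers: (N, 3) voxel centers in the ego/vehicle frame.
    cam_to_img:    (3, 4) projection matrix (intrinsics @ extrinsics).
    img_hw:        (H, W) size of the 2D feature map.
    Returns an (N,) array of flattened pixel indices, or -1 when the voxel
    falls behind the camera or outside the image.
    """
    H, W = img_hw
    homo = np.concatenate([voxel_centers, np.ones((len(voxel_centers), 1))], axis=1)
    proj = homo @ cam_to_img.T          # (N, 3) homogeneous image coordinates
    z = proj[:, 2]
    valid = z > 1e-3                    # keep only points in front of the camera
    u = np.full(len(proj), -1.0)
    v = np.full(len(proj), -1.0)
    u[valid] = proj[valid, 0] / z[valid]
    v[valid] = proj[valid, 1] / z[valid]
    inside = valid & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    return np.where(inside, v.astype(int) * W + u.astype(int), -1)

def view_transform(img_feat, lut, voxel_shape):
    """Scatter 2D image features into the 3D voxel grid via the precomputed LUT."""
    C = img_feat.shape[0]
    flat = img_feat.reshape(C, -1)                    # (C, H*W)
    vox = np.zeros((C, len(lut)), dtype=flat.dtype)   # (C, N) voxel features
    hit = lut >= 0
    vox[:, hit] = flat[:, lut[hit]]                   # gather by lookup, no depth net
    return vox.reshape(C, *voxel_shape)
```

The point of the sketch is the split between `build_lut` (expensive, done once offline) and `view_transform` (a pure gather, cheap on embedded chips); the depth-estimation network used by lift-splat-style methods is deliberately absent.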


Similar Articles

1. Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline. IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8665-8679. doi: 10.1109/TPAMI.2024.3414835. Epub 2024 Nov 6.
2. Robust BEV 3D Object Detection for Vehicles with Tire Blow-Out. Sensors (Basel). 2024 Jul 9;24(14):4446. doi: 10.3390/s24144446.
3. Fast vehicle detection based on colored point cloud with bird's eye view representation. Sci Rep. 2023 May 8;13(1):7447. doi: 10.1038/s41598-023-34479-z.
4. Dense projection fusion for 3D object detection. Sci Rep. 2024 Oct 8;14(1):23492. doi: 10.1038/s41598-024-74679-9.
5. Delving Into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe. IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2151-2170. doi: 10.1109/TPAMI.2023.3333838. Epub 2024 Mar 6.
6. IRBEVF-Q: Optimization of Image-Radar Fusion Algorithm Based on Bird's Eye View Features. Sensors (Basel). 2024 Jul 16;24(14):4602. doi: 10.3390/s24144602.
7. HeightFormer: Explicit Height Modeling Without Extra Data for Camera-Only 3D Object Detection in Bird's Eye View. IEEE Trans Image Process. 2025;34:689-700. doi: 10.1109/TIP.2024.3427701. Epub 2025 Jan 28.
8. Divide and Conquer: Improving Multi-Camera 3D Perception With 2D Semantic-Depth Priors and Input-Dependent Queries. IEEE Trans Image Process. 2024;33:897-909. doi: 10.1109/TIP.2024.3352808. Epub 2024 Jan 23.
9. Surrounding-aware representation prediction in Birds-Eye-View using transformers. Front Neurosci. 2023 Jul 4;17:1219363. doi: 10.3389/fnins.2023.1219363. eCollection 2023.
10. Fully Sparse Fusion for 3D Object Detection. IEEE Trans Pattern Anal Mach Intell. 2024 Nov;46(11):7217-7231. doi: 10.1109/TPAMI.2024.3392303. Epub 2024 Oct 3.

Cited By

1. CrossInteraction: Multi-Modal Interaction and Alignment Strategy for 3D Perception. Sensors (Basel). 2025 Sep 16;25(18):5775. doi: 10.3390/s25185775.