


EPMF: Efficient Perception-Aware Multi-Sensor Fusion for 3D Semantic Segmentation.

Authors

Tan Mingkui, Zhuang Zhuangwei, Chen Sitao, Li Rong, Jia Kui, Wang Qicheng, Li Yuanqing

Publication

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8258-8273. doi: 10.1109/TPAMI.2024.3402232. Epub 2024 Nov 6.

DOI: 10.1109/TPAMI.2024.3402232
PMID: 38809744
Abstract

We study multi-sensor fusion for 3D semantic segmentation that is important to scene understanding for many applications, such as autonomous driving and robotics. Existing fusion-based methods, however, may not achieve promising performance due to the vast difference between the two modalities. In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF) to effectively exploit perceptual information from two modalities, namely, appearance information from RGB images and spatio-depth information from point clouds. To this end, we project point clouds to the camera coordinate using perspective projection, and process both inputs from LiDAR and cameras in 2D space while preventing the information loss of RGB images. Then, we propose a two-stream network to extract features from the two modalities, separately. The extracted features are fused by effective residual-based fusion modules. Moreover, we introduce additional perception-aware losses to measure the perceptual difference between the two modalities. Last, we propose an improved version of PMF, i.e., EPMF, which is more efficient and effective by optimizing data pre-processing and network architecture under perspective projection. Specifically, we propose cross-modal alignment and cropping to obtain tight inputs and reduce unnecessary computational costs. We then explore more efficient contextual modules under perspective projection and fuse the LiDAR features into the camera stream to boost the performance of the two-stream network. Extensive experiments on benchmark data sets show the superiority of our method. For example, on nuScenes test set, our EPMF outperforms the state-of-the-art method, i.e., RangeFormer, by 0.9% in mIoU.
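The abstract's first step, projecting LiDAR point clouds into the camera coordinate frame via perspective projection so both modalities can be processed in 2D space, can be illustrated with a minimal sketch. This is not the authors' code; the intrinsics `K` and the extrinsic rotation/translation `R`, `t` are hypothetical calibration values for a simple pinhole camera.

```python
# Illustrative sketch of perspective projection: map a 3D LiDAR point into
# image-plane pixel coordinates. K is a 3x3 intrinsic matrix, R and t a
# LiDAR-to-camera rigid transform (all values here are made up).

def project_point(p_lidar, K, R, t):
    # Transform from LiDAR to camera coordinates: p_cam = R @ p + t
    p_cam = [sum(R[i][j] * p_lidar[j] for j in range(3)) + t[i]
             for i in range(3)]
    if p_cam[2] <= 0:
        return None  # point lies behind the camera; no valid projection
    # Apply intrinsics, then divide by depth (perspective division)
    u = (K[0][0] * p_cam[0] + K[0][2] * p_cam[2]) / p_cam[2]
    v = (K[1][1] * p_cam[1] + K[1][2] * p_cam[2]) / p_cam[2]
    return u, v

# Example: identity extrinsics, pinhole intrinsics with focal length 500 px
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
print(project_point([1.0, 0.5, 10.0], K, R, t))  # point 10 m ahead -> (370.0, 265.0)
```

Points that project outside the image bounds would be discarded; EPMF's cross-modal alignment and cropping then tightens the inputs around the region where the two modalities overlap.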

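The reported 0.9% gain over RangeFormer is in mean Intersection-over-Union (mIoU), the standard metric for semantic segmentation. A minimal sketch of the metric, on toy labels rather than any data from the paper:

```python
# Illustrative sketch of mean Intersection-over-Union (mIoU): per class,
# IoU = |pred ∩ gt| / |pred ∪ gt|, averaged over classes present in
# either the prediction or the ground truth. Toy labels only.

def miou(pred, gt, num_classes):
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union > 0:  # skip classes absent from both pred and gt
            ious.append(inter / union)
    return sum(ious) / len(ious)

pred = [0, 0, 1, 1, 2, 2]
gt   = [0, 1, 1, 1, 2, 0]
print(miou(pred, gt, 3))  # (1/3 + 2/3 + 1/2) / 3 = 0.5
```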

Similar articles

1
EPMF: Efficient Perception-Aware Multi-Sensor Fusion for 3D Semantic Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8258-8273. doi: 10.1109/TPAMI.2024.3402232. Epub 2024 Nov 6.
2
SyS3DS: Systematic Sampling of Large-Scale LiDAR Point Clouds for Semantic Segmentation in Forestry Robotics.
Sensors (Basel). 2024 Jan 26;24(3):823. doi: 10.3390/s24030823.
3
Enhanced Perception for Autonomous Driving Using Semantic and Geometric Data Fusion.
Sensors (Basel). 2022 Jul 5;22(13):5061. doi: 10.3390/s22135061.
4
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-Based Perception.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):6807-6822. doi: 10.1109/TPAMI.2021.3098789. Epub 2022 Sep 14.
5
Uni-to-Multi Modal Knowledge Distillation for Bidirectional LiDAR-Camera Semantic Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11059-11072. doi: 10.1109/TPAMI.2024.3451658. Epub 2024 Nov 6.
6
Middle-Level Feature Fusion for Lightweight RGB-D Salient Object Detection.
IEEE Trans Image Process. 2022;31:6621-6634. doi: 10.1109/TIP.2022.3214092. Epub 2022 Oct 26.
7
GMNet: Graded-Feature Multilabel-Learning Network for RGB-Thermal Urban Scene Semantic Segmentation.
IEEE Trans Image Process. 2021;30:7790-7802. doi: 10.1109/TIP.2021.3109518. Epub 2021 Sep 14.
8
FGCN: Image-Fused Point Cloud Semantic Segmentation with Fusion Graph Convolutional Network.
Sensors (Basel). 2023 Oct 9;23(19):8338. doi: 10.3390/s23198338.
9
A 3D hierarchical cross-modality interaction network using transformers and convolutions for brain glioma segmentation in MR images.
Med Phys. 2024 Nov;51(11):8371-8389. doi: 10.1002/mp.17354. Epub 2024 Aug 13.
10
A Two-Phase Cross-Modality Fusion Network for Robust 3D Object Detection.
Sensors (Basel). 2020 Oct 23;20(21):6043. doi: 10.3390/s20216043.