Learning Dynamic Scene-Conditioned 3D Object Detectors.

Authors

Zheng Yu, Duan Yueqi, Li Zongtai, Zhou Jie, Lu Jiwen

Publication information

IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):2981-2996. doi: 10.1109/TPAMI.2023.3336874. Epub 2024 Apr 3.

DOI: 10.1109/TPAMI.2023.3336874
PMID: 38015703
Abstract

In this paper, we propose a dynamic 3D object detector named HyperDet3D, which adapts on the fly to hyper scene-level knowledge. Existing methods learn object-level representations of local elements and their relations without scene-level priors; because they rely only on individual points and object candidates, they suffer from ambiguity between similarly structured objects. Instead, we design scene-conditioned hypernetworks that simultaneously learn scene-agnostic embeddings, which exploit sharable abstractions across various 3D scenes, and scene-specific knowledge, which adapts the 3D detector to the given scene at test time. As a result, the lower-level ambiguity in object representations can be resolved by the hierarchical context in scene priors. However, the upstream hypernetwork in HyperDet3D takes raw scenes as input, which contain noise and redundancy, so the parameters it produces for the 3D detector are sub-optimal when constrained only by the downstream detection losses. Because the downstream 3D detection task can be factorized into object-level semantic classification and bounding box regression, we further propose HyperFormer3D, which designs corresponding scene-level prior tasks in the upstream hypernetworks, namely Semantic Occurrence and Objectness Localization. To this end, we design a transformer-based hypernetwork that translates the task-oriented scene priors into parameters of the downstream detector, avoiding the noise and redundancy of raw scenes. Extensive experimental results on the ScanNet, SUN RGB-D and MatterPort3D datasets demonstrate the effectiveness of the proposed methods.
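
The core idea of a scene-conditioned hypernetwork can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the class name `SceneConditionedHead`, the mean-pooling scene descriptor, the random linear maps, and the toy "bathroom" and "office" inputs are all assumptions made for illustration. It only shows the general pattern in which a shared (scene-agnostic) embedding and a per-scene (scene-specific) embedding are combined to generate the weights of a downstream classification head.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SceneConditionedHead:
    """Hypothetical toy head whose weights are generated per scene."""

    def __init__(self, feat_dim, embed_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.num_classes = num_classes
        # Scene-agnostic embedding: one vector shared by every scene.
        self.agnostic = [rng.uniform(-1, 1) for _ in range(embed_dim)]
        # Maps a pooled scene descriptor to a scene-specific embedding.
        self.scene_proj = random_matrix(embed_dim, feat_dim, rng)
        # Hypernetwork: maps [agnostic ; specific] to a flattened weight matrix.
        self.hyper = random_matrix(num_classes * feat_dim, 2 * embed_dim, rng)

    def generate_weights(self, scene_points):
        """Produce a (num_classes x feat_dim) weight matrix for this scene."""
        n = len(scene_points)
        # Assumption: mean pooling is used as a stand-in scene descriptor.
        pooled = [sum(p[d] for p in scene_points) / n for d in range(self.feat_dim)]
        specific = matvec(self.scene_proj, pooled)
        flat = matvec(self.hyper, self.agnostic + specific)
        return [
            flat[c * self.feat_dim:(c + 1) * self.feat_dim]
            for c in range(self.num_classes)
        ]

    def classify(self, scene_points, candidate_feature):
        """Score one object candidate with the scene-generated weights."""
        return matvec(self.generate_weights(scene_points), candidate_feature)


# Toy inputs (made up): two scenes, one identical object candidate.
head = SceneConditionedHead(feat_dim=3, embed_dim=4, num_classes=2)
bathroom = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4]]
office = [[0.9, 0.8, 0.1], [0.7, 0.9, 0.2]]
candidate = [0.5, 0.5, 0.5]

scores_bathroom = head.classify(bathroom, candidate)
scores_office = head.classify(office, candidate)
print(scores_bathroom, scores_office)
```

The same candidate feature receives different class scores in the two scenes because the head's weights depend on the scene, which is the mechanism the abstract credits with resolving ambiguity between similarly structured objects. In a trained system the random matrices would be learned parameters, and HyperFormer3D additionally supervises the hypernetwork with scene-level prior tasks, which this sketch does not model.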
