Suppr 超能文献



Effects of Motion-Relevant Knowledge From Unlabeled Video to Human-Object Interaction Detection.

Publication Info

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5760-5773. doi: 10.1109/TNNLS.2021.3131154. Epub 2023 Sep 1.

DOI: 10.1109/TNNLS.2021.3131154
PMID: 34890337
Abstract

Existing works on human-object interaction (HOI) detection usually rely on expensive large-scale labeled image datasets. However, in real scenes, labeled data may be insufficient, and some rare HOI categories have few samples. This poses great challenges for deep-learning-based HOI detection models. Existing works tackle this by introducing compositional learning or word embeddings, but they still need large-scale labeled data or rely heavily on well-learned prior knowledge. In contrast, freely available unlabeled videos contain rich motion-relevant information that can help infer rare HOIs. In this article, we propose a multitask learning (MTL) perspective to assist HOI detection with the aid of motion-relevant knowledge learned from unlabeled videos. Specifically, we design an appearance reconstruction loss (ARL) and a sequential motion mining module in a self-supervised manner to learn more generalizable motion representations that promote the detection of rare HOIs. Moreover, to better transfer motion-related knowledge from unlabeled videos to HOI images, a domain discriminator is introduced to decrease the gap between the two domains. Extensive experiments on the HICO-DET dataset with rare categories and the V-COCO dataset with minimal supervision demonstrate the effectiveness of the motion-aware knowledge implied in unlabeled videos for HOI detection.
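The paper itself provides no code. As a rough illustration only (not the authors' implementation), the multitask objective the abstract describes — an HOI detection loss combined with a self-supervised appearance reconstruction loss and an adversarial domain-discriminator term — could be sketched as below; all function names, shapes, and loss weights here are hypothetical:

```python
import math

def appearance_reconstruction_loss(frame, reconstructed):
    """Self-supervised ARL sketch: mean squared error between flattened
    frame pixel values and their reconstruction from motion features."""
    n = len(frame)
    return sum((a - b) ** 2 for a, b in zip(frame, reconstructed)) / n

def domain_bce(pred_probs, is_video_domain):
    """Binary cross-entropy for a domain discriminator that tries to
    tell video-derived features (label 1) from HOI-image features (label 0)."""
    y = 1.0 if is_video_domain else 0.0
    eps = 1e-7
    total = 0.0
    for p in pred_probs:
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(pred_probs)

def multitask_loss(l_hoi, l_arl, l_dom, w_arl=0.5, w_dom=0.1):
    """Weighted sum of the detection, reconstruction, and domain losses.
    The weights w_arl and w_dom are illustrative, not taken from the paper."""
    return l_hoi + w_arl * l_arl + w_dom * l_dom
```

In the adversarial setup the abstract implies, the discriminator minimizes `domain_bce` while the feature extractor is trained to fool it (e.g., via a gradient reversal layer), shrinking the video-to-image domain gap.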


Similar Articles

1. Effects of Motion-Relevant Knowledge From Unlabeled Video to Human-Object Interaction Detection.
   IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5760-5773. doi: 10.1109/TNNLS.2021.3131154. Epub 2023 Sep 1.
2. Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning.
   Comput Intell Neurosci. 2021 Jun 9;2021:9922697. doi: 10.1155/2021/9922697. eCollection 2021.
3. Transferable Interactiveness Knowledge for Human-Object Interaction Detection.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jul;44(7):3870-3882. doi: 10.1109/TPAMI.2021.3054048. Epub 2022 Jun 3.
4. Learning Human-Object Interaction via Interactive Semantic Reasoning.
   IEEE Trans Image Process. 2021;30:9294-9305. doi: 10.1109/TIP.2021.3125258. Epub 2021 Nov 12.
5. Cascaded Parsing of Human-Object Interaction Recognition.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):2827-2840. doi: 10.1109/TPAMI.2021.3049156. Epub 2022 May 5.
6. A Novel Part Refinement Tandem Transformer for Human-Object Interaction Detection.
   Sensors (Basel). 2024 Jul 1;24(13):4278. doi: 10.3390/s24134278.
7. FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection.
   IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2415-2429. doi: 10.1109/TPAMI.2023.3331738. Epub 2024 Mar 6.
8. ERNet: An Efficient and Reliable Human-Object Interaction Detection Network.
   IEEE Trans Image Process. 2023;32:964-979. doi: 10.1109/TIP.2022.3231528.
9. Semi-supervised learning with progressive unlabeled data excavation for label-efficient surgical workflow recognition.
   Med Image Anal. 2021 Oct;73:102158. doi: 10.1016/j.media.2021.102158. Epub 2021 Jul 8.
10. IPGN: Interactiveness Proposal Graph Network for Human-Object Interaction Detection.
    IEEE Trans Image Process. 2021;30:6583-6593. doi: 10.1109/TIP.2021.3096333. Epub 2021 Jul 21.