Zero-Shot Human-Object Interaction Detection via Similarity Propagation.

Authors

Zong Daoming, Sun Shiliang

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17805-17816. doi: 10.1109/TNNLS.2023.3309104. Epub 2024 Dec 2.

DOI:10.1109/TNNLS.2023.3309104
PMID:37672372
Abstract

Human-object interaction (HOI) detection involves identifying interactions represented as ⟨human, action, object⟩, requiring the localization of human-object pairs and interaction classification within an image. This work focuses on the challenge of detecting HOIs with unseen objects using the prevalent Transformer architecture. Our empirical analysis reveals that the performance degradation of novel HOI instances primarily arises from misclassifying unseen objects as confusable seen objects. To address this issue, we propose a similarity propagation (SP) scheme that leverages cosine similarity distance to regulate the prediction margin between seen and unseen objects. In addition, we introduce pseudo-supervision for unseen objects based on class semantic similarities during training. Furthermore, we incorporate semantic-aware instance-level and interaction-level contrastive losses with Transformer to enhance intraclass compactness and interclass separability, resulting in improved visual representations. Extensive experiments on two challenging benchmarks, V-COCO and HICO-DET, demonstrate the effectiveness of our model, outperforming current state-of-the-art methods under various zero-shot settings.
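
The pseudo-supervision idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy embedding vectors, and the softmax temperature are all assumptions made for illustration. It only shows how cosine similarity between class embeddings could turn a seen-class label into a soft target over unseen classes.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(seen_label, seen_embeddings, unseen_embeddings, temperature=0.1):
    """Soft target over unseen classes for a box labeled with a seen class.

    Hypothetical helper: unseen classes whose embeddings are closer to the
    seen class receive a larger share of the pseudo-supervision.
    """
    anchor = seen_embeddings[seen_label]
    names = sorted(unseen_embeddings)
    sims = [cosine_similarity(anchor, unseen_embeddings[n]) for n in names]
    return dict(zip(names, softmax(sims, temperature)))


# Made-up 3-d class embeddings; a real system would use word or text embeddings.
SEEN = {"horse": [0.9, 0.1, 0.0], "cup": [0.0, 0.2, 0.9]}
UNSEEN = {"zebra": [0.8, 0.3, 0.1], "bottle": [0.1, 0.1, 0.95]}

targets = pseudo_label("horse", SEEN, UNSEEN)
print(targets)
```

In this toy setup a box labeled "horse" places most of its pseudo-target mass on "zebra" rather than "bottle", which is the kind of signal that could help a classifier separate an unseen object from its most confusable seen neighbor.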

Similar articles

1
Zero-Shot Human-Object Interaction Detection via Similarity Propagation.
IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17805-17816. doi: 10.1109/TNNLS.2023.3309104. Epub 2024 Dec 2.
2
FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection.
IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2415-2429. doi: 10.1109/TPAMI.2023.3331738. Epub 2024 Mar 6.
3
Towards zero-shot human-object interaction detection via vision-language integration.
Neural Netw. 2025 Jul;187:107348. doi: 10.1016/j.neunet.2025.107348. Epub 2025 Mar 10.
4
A Novel Part Refinement Tandem Transformer for Human-Object Interaction Detection.
Sensors (Basel). 2024 Jul 1;24(13):4278. doi: 10.3390/s24134278.
5
Learning Human-Object Interaction via Interactive Semantic Reasoning.
IEEE Trans Image Process. 2021;30:9294-9305. doi: 10.1109/TIP.2021.3125258. Epub 2021 Nov 12.
6
Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning.
Comput Intell Neurosci. 2021 Jun 9;2021:9922697. doi: 10.1155/2021/9922697. eCollection 2021.
7
Semantics-Guided Contrastive Network for Zero-Shot Object Detection.
IEEE Trans Pattern Anal Mach Intell. 2024 Mar;46(3):1530-1544. doi: 10.1109/TPAMI.2021.3140070. Epub 2024 Feb 6.
8
ERNet: An Efficient and Reliable Human-Object Interaction Detection Network.
IEEE Trans Image Process. 2023;32:964-979. doi: 10.1109/TIP.2022.3231528.
9
Transformer-Based Approach Via Contrastive Learning for Zero-Shot Detection.
Int J Neural Syst. 2023 Jul;33(7):2350035. doi: 10.1142/S0129065723500351. Epub 2023 Jun 14.
10
Semantic-Aware Dynamic Generation Networks for Few-Shot Human-Object Interaction Recognition.
IEEE Trans Neural Netw Learn Syst. 2024 Sep;35(9):12564-12575. doi: 10.1109/TNNLS.2023.3263660. Epub 2024 Sep 3.