Cascaded Parsing of Human-Object Interaction Recognition.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):2827-2840. doi: 10.1109/TPAMI.2021.3049156. Epub 2022 May 5.

DOI:10.1109/TPAMI.2021.3049156
PMID:33400648
Abstract

This paper addresses the task of detecting and recognizing human-object interactions (HOI) in images. Considering the intrinsic complexity and structural nature of the task, we introduce a cascaded parsing network (CP-HOI) for multi-stage, structured HOI understanding. At each cascade stage, an instance detection module progressively refines HOI proposals and feeds them into a structured interaction reasoning module. Each of the two modules is also connected to its predecessor in the previous stage, enabling efficient cross-stage information propagation. The structured interaction reasoning module is built upon a graph parsing neural network (GPNN), which efficiently models potential HOI structures as graphs and mines rich context for comprehensive relation understanding. In particular, GPNN infers a parse graph that i) interprets meaningful HOI structures through a learnable adjacency matrix, and ii) predicts action (edge) labels. Within an end-to-end, message-passing framework, GPNN blends learning and inference, iteratively parsing HOI structures and reasoning over HOI representations (i.e., instance and relation features). Beyond relation detection at the bounding-box level, our framework is flexible enough to perform fine-grained, pixel-wise relation segmentation; this provides a new glimpse into better relation modeling. A preliminary version of our CP-HOI model took 1st place in the ICCV2019 Person in Context Challenge, on both relation detection and segmentation. In addition, our CP-HOI shows promising results on two popular HOI recognition benchmarks, i.e., V-COCO and HICO-DET.
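
The parse-graph idea in the abstract (a soft adjacency matrix, message passing over it, then edge labels) can be illustrated with a small pure-Python sketch. This is not the authors' GPNN: the dot-product link function, the averaging update, the fixed `action_weights`, and all function names are assumptions chosen for illustration, and nothing here is learned.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def link_scores(nodes):
    """Soft adjacency matrix: entry (i, j) in (0, 1) scores how strongly
    instances i and j are linked. A dot product stands in for the learned
    link function (assumption)."""
    n = len(nodes)
    return [[0.0 if i == j else sigmoid(dot(nodes[i], nodes[j]))
             for j in range(n)] for i in range(n)]

def message_pass(nodes, adj, step=0.5):
    """One message-passing round: each node moves toward the
    adjacency-weighted mean of the other nodes' features."""
    n, dim = len(nodes), len(nodes[0])
    updated = []
    for i in range(n):
        total = sum(adj[i][j] for j in range(n) if j != i)
        msg = [
            sum(adj[i][j] * nodes[j][d] for j in range(n) if j != i) / total
            if total > 0 else 0.0
            for d in range(dim)
        ]
        updated.append([(1 - step) * nodes[i][d] + step * msg[d]
                        for d in range(dim)])
    return updated

def parse_graph(nodes, action_weights, iterations=2):
    """Alternate structure inference (adjacency) and feature updates, then
    read out one action label per ordered pair of distinct nodes."""
    for _ in range(iterations):
        adj = link_scores(nodes)
        nodes = message_pass(nodes, adj)
    adj = link_scores(nodes)
    edge_labels = {}
    for i in range(len(nodes)):
        for j in range(len(nodes)):
            if i == j:
                continue
            # Hypothetical relation feature: element-wise sum of both nodes.
            relation = [a + b for a, b in zip(nodes[i], nodes[j])]
            scores = {name: dot(w, relation)
                      for name, w in action_weights.items()}
            edge_labels[(i, j)] = max(scores, key=scores.get)
    return adj, edge_labels

if __name__ == "__main__":
    # Toy instance features (e.g., one human and two objects); made-up values.
    features = [[1.0, 0.0], [0.8, 0.2], [-1.0, 0.5]]
    weights = {"hold": [1.0, 0.0], "no_interaction": [-1.0, 0.0]}
    adjacency, labels = parse_graph(features, weights)
    print(adjacency)
    print(labels)
```

In this toy version the adjacency matrix is recomputed after every feature update, which mirrors the abstract's description of iteratively parsing structure and refining representations; in a trained model the link, message, and readout functions would be learned end to end.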

Similar articles

1
Cascaded Parsing of Human-Object Interaction Recognition.
IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):2827-2840. doi: 10.1109/TPAMI.2021.3049156. Epub 2022 May 5.
2
FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection.
IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2415-2429. doi: 10.1109/TPAMI.2023.3331738. Epub 2024 Mar 6.
3
Learning Human-Object Interaction via Interactive Semantic Reasoning.
IEEE Trans Image Process. 2021;30:9294-9305. doi: 10.1109/TIP.2021.3125258. Epub 2021 Nov 12.
4
Transferable Interactiveness Knowledge for Human-Object Interaction Detection.
IEEE Trans Pattern Anal Mach Intell. 2022 Jul;44(7):3870-3882. doi: 10.1109/TPAMI.2021.3054048. Epub 2022 Jun 3.
5
A Novel Part Refinement Tandem Transformer for Human-Object Interaction Detection.
Sensors (Basel). 2024 Jul 1;24(13):4278. doi: 10.3390/s24134278.
6
IPGN: Interactiveness Proposal Graph Network for Human-Object Interaction Detection.
IEEE Trans Image Process. 2021;30:6583-6593. doi: 10.1109/TIP.2021.3096333. Epub 2021 Jul 21.
7
Human-Object Interaction detection via Global Context and Pairwise-level Fusion Features Integration.
Neural Netw. 2024 Feb;170:242-253. doi: 10.1016/j.neunet.2023.11.002. Epub 2023 Nov 13.
8
Toward a Unified Transformer-Based Framework for Scene Graph Generation and Human-Object Interaction Detection.
IEEE Trans Image Process. 2023;32:6274-6288. doi: 10.1109/TIP.2023.3330304. Epub 2023 Nov 20.
9
Point-Based Learnable Query Generator for Human-Object Interaction Detection.
IEEE Trans Image Process. 2023;32:6469-6484. doi: 10.1109/TIP.2023.3334100. Epub 2023 Dec 1.
10
Hierarchical Reasoning Network for Human-Object Interaction Detection.
IEEE Trans Image Process. 2021;30:8306-8317. doi: 10.1109/TIP.2021.3093784. Epub 2021 Oct 5.

Cited by

1
PoseNet++: A multi-scale and optimized feature extraction network for high-precision human pose estimation.
PLoS One. 2025 Jun 25;20(6):e0326232. doi: 10.1371/journal.pone.0326232. eCollection 2025.
2
Intraretinal Layer Segmentation Using Cascaded Compressed U-Nets.
J Imaging. 2022 May 17;8(5):139. doi: 10.3390/jimaging8050139.
3
Congested Crowd Counting via Adaptive Multi-Scale Context Learning.
Sensors (Basel). 2021 May 29;21(11):3777. doi: 10.3390/s21113777.