


Symbiotic Attention for Egocentric Action Recognition With Object-Centric Alignment.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):6605-6617. doi: 10.1109/TPAMI.2020.3015894. Epub 2023 May 5.

DOI: 10.1109/TPAMI.2020.3015894
PMID: 32780698
Abstract

In this paper, we propose to tackle egocentric action recognition by suppressing background distractors and enhancing action-relevant interactions. The existing approaches usually utilize two independent branches to recognize egocentric actions, i.e., a verb branch and a noun branch. However, the mechanism to suppress distracting objects and exploit local human-object correlations is missing. To this end, we introduce two extra sources of information, i.e., the candidate objects spatial location and their discriminative features, to enable concentration on the occurring interactions. We design a Symbiotic Attention with Object-centric feature Alignment framework (SAOA) to provide meticulous reasoning between the actor and the environment. First, we introduce an object-centric feature alignment method to inject the local object features to the verb branch and noun branch. Second, we propose a symbiotic attention mechanism to encourage the mutual interaction between the two branches and select the most action-relevant candidates for classification. The framework benefits from the communication among the verb branch, the noun branch, and the local object information. Experiments based on different backbones and modalities demonstrate the effectiveness of our method. Notably, our framework achieves the state-of-the-art on the largest egocentric video dataset.
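The two-branch design the abstract describes can be illustrated with a minimal NumPy toy: candidate object features are injected into each branch (object-centric alignment), and each branch then attends over the aligned candidates using the other branch's feature as the query (symbiotic attention). All dimensions, the additive fusion, and the scaled dot-product scoring below are illustrative assumptions for the sketch, not the paper's exact operators.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

D = 8       # feature dimension (hypothetical)
N_OBJ = 5   # number of candidate object proposals (hypothetical)

rng = np.random.default_rng(0)
verb_feat = rng.normal(size=(D,))        # global verb-branch feature
noun_feat = rng.normal(size=(D,))        # global noun-branch feature
obj_feats = rng.normal(size=(N_OBJ, D))  # local object-centric features

def object_centric_alignment(branch_feat, obj_feats):
    # Inject local object features into a branch: fuse each candidate
    # object feature with the branch's global feature (simple additive
    # fusion as a stand-in for the paper's alignment module).
    return obj_feats + branch_feat  # broadcasts to (N_OBJ, D)

def symbiotic_attention(query_feat, aligned_feats):
    # Attend over candidates using the *other* branch as the query, so
    # the two branches guide each other's candidate selection.
    scores = aligned_feats @ query_feat / np.sqrt(aligned_feats.shape[-1])
    weights = softmax(scores)              # (N_OBJ,) sums to 1
    return weights, weights @ aligned_feats  # weighted candidate summary

# Each branch is aligned with object features, then attended by the
# opposite branch's feature (verb queried by noun, and vice versa).
verb_aligned = object_centric_alignment(verb_feat, obj_feats)
noun_aligned = object_centric_alignment(noun_feat, obj_feats)
w_v, verb_out = symbiotic_attention(noun_feat, verb_aligned)
w_n, noun_out = symbiotic_attention(verb_feat, noun_aligned)

print(verb_out.shape, noun_out.shape)  # (8,) (8,)
```

The outputs `verb_out` and `noun_out` would feed the verb and noun classifiers; the attention weights indicate which candidate object each branch considers most action-relevant.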


Similar Articles

1
Symbiotic Attention for Egocentric Action Recognition With Object-Centric Alignment.
IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):6605-6617. doi: 10.1109/TPAMI.2020.3015894. Epub 2023 May 5.
2
Learning to Recognize Actions on Objects in Egocentric Video With Attention Dictionaries.
IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):6674-6687. doi: 10.1109/TPAMI.2021.3058649. Epub 2023 May 5.
3
Semantic-Disentangled Transformer With Noun-Verb Embedding for Compositional Action Recognition.
IEEE Trans Image Process. 2024;33:297-309. doi: 10.1109/TIP.2023.3341297. Epub 2023 Dec 21.
4
Learning Visual Affordance Grounding From Demonstration Videos.
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):16857-16871. doi: 10.1109/TNNLS.2023.3298638. Epub 2024 Oct 29.
5
Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video.
IEEE Trans Pattern Anal Mach Intell. 2021 Nov;43(11):4021-4036. doi: 10.1109/TPAMI.2020.2992889. Epub 2021 Oct 1.
6
STAC: Spatial-Temporal Attention on Compensation Information for Activity Recognition in FPV.
Sensors (Basel). 2021 Feb 5;21(4):1106. doi: 10.3390/s21041106.
7
Egocentric Action Recognition by Automatic Relation Modeling.
IEEE Trans Pattern Anal Mach Intell. 2023 Jan;45(1):489-507. doi: 10.1109/TPAMI.2022.3148790. Epub 2022 Dec 5.
8
Hierarchical Reasoning Network for Human-Object Interaction Detection.
IEEE Trans Image Process. 2021;30:8306-8317. doi: 10.1109/TIP.2021.3093784. Epub 2021 Oct 5.
9
Deep Attention Network for Egocentric Action Recognition.
IEEE Trans Image Process. 2019 Aug;28(8):3703-3713. doi: 10.1109/TIP.2019.2901707. Epub 2019 Feb 26.
10
Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning.
Comput Intell Neurosci. 2021 Jun 9;2021:9922697. doi: 10.1155/2021/9922697. eCollection 2021.

Cited By

1
Hierarchical query design and distributed attention in transformer for player group activity recognition in sports analysis.
Sci Rep. 2025 Aug 27;15(1):31571. doi: 10.1038/s41598-025-16752-5.
2
The real-time hand and object recognition for virtual interaction.
PeerJ Comput Sci. 2024 Jun 27;10:e2110. doi: 10.7717/peerj-cs.2110. eCollection 2024.