Structured Attention Composition for Temporal Action Localization.

Authors

Yang Le, Han Junwei, Zhao Tao, Liu Nian, Zhang Dingwen

Publication information

IEEE Trans Image Process. 2022 Jun 13;PP. doi: 10.1109/TIP.2022.3180925.

DOI: 10.1109/TIP.2022.3180925
PMID: 35696480
Abstract

Temporal action localization aims to localize action instances in untrimmed videos. Existing works have designed various effective modules to precisely localize action instances based on appearance and motion features. However, by treating these two kinds of features with equal importance, previous works cannot take full advantage of each modality's features, leaving the learned model sub-optimal. To tackle this issue, we make an early effort to study temporal action localization from the perspective of multi-modality feature learning, based on the observation that different actions exhibit specific preferences for the appearance or the motion modality. Specifically, we build a novel structured attention composition module. Unlike conventional attention, the proposed module does not infer frame attention and modality attention independently. Instead, by casting the relationship between the modality attention and the frame attention as an attention assignment process, the structured attention composition module learns to encode the frame-modality structure and, based on optimal transport theory, uses it to regularize the inferred frame attention and modality attention. The final frame-modality attention is obtained by composing the two individual attentions. The proposed structured attention composition module can be deployed as a plug-and-play module in existing action localization frameworks. Extensive experiments on two widely used benchmarks show that the proposed structured attention composition consistently improves four state-of-the-art temporal action localization methods and sets a new state of the art on THUMOS14.
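
The abstract does not give the module's equations, so the following is only an illustrative sketch of the two ideas it names, not the authors' implementation. It assumes that composition is an outer product of a frame attention vector and a modality attention vector, and that the optimal-transport step can be approximated by a generic entropic (Sinkhorn) solver whose marginals are those two attentions. All function names, the affinity matrix, and the scores are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose(frame_attn, modality_attn):
    """Assumed composition: entry [t][m] = frame_attn[t] * modality_attn[m]."""
    return [[f * g for g in modality_attn] for f in frame_attn]


def sinkhorn(affinity, row_marginal, col_marginal, eps=1.0, n_iters=100):
    """Generic entropic optimal transport (Sinkhorn iterations).

    Returns a plan whose row sums match row_marginal and whose column
    sums match col_marginal, with more mass where affinity is higher.
    """
    n_rows, n_cols = len(row_marginal), len(col_marginal)
    kernel = [[math.exp(affinity[t][m] / eps) for m in range(n_cols)]
              for t in range(n_rows)]
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iters):
        # Rescale rows, then columns, to match the target marginals.
        u = [row_marginal[t] / sum(kernel[t][m] * v[m] for m in range(n_cols))
             for t in range(n_rows)]
        v = [col_marginal[m] / sum(kernel[t][m] * u[t] for t in range(n_rows))
             for m in range(n_cols)]
    return [[u[t] * kernel[t][m] * v[m] for m in range(n_cols)]
            for t in range(n_rows)]


# Hypothetical toy input: 4 frames, 2 modalities (appearance, motion).
frame_attn = softmax([0.2, 1.5, 1.0, -0.5])
modality_attn = softmax([0.3, 0.9])
# Hypothetical frame-modality affinity; frame 0 favors appearance.
affinity = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [0.1, 0.7]]

joint = compose(frame_attn, modality_attn)            # unstructured composition
plan = sinkhorn(affinity, frame_attn, modality_attn)  # structure-aware assignment
```

Both `joint` and `plan` distribute the same total attention over frame-modality pairs and share the same marginals. The difference is that `plan` shifts each frame's mass toward the modality it has higher affinity with, which is one way to read the "attention assignment process" described above.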

Similar articles

1
Structured Attention Composition for Temporal Action Localization.
IEEE Trans Image Process. 2022 Jun 13;PP. doi: 10.1109/TIP.2022.3180925.
2
A Temporal-Aware Relation and Attention Network for Temporal Action Localization.
IEEE Trans Image Process. 2022;31:4746-4760. doi: 10.1109/TIP.2022.3182866. Epub 2022 Jul 14.
3
StochasticFormer: Stochastic Modeling for Weakly Supervised Temporal Action Localization.
IEEE Trans Image Process. 2023;32:1379-1389. doi: 10.1109/TIP.2023.3244411. Epub 2023 Feb 23.
4
PCG-TAL: Progressive Cross-Granularity Cooperation for Temporal Action Localization.
IEEE Trans Image Process. 2021;30:2103-2113. doi: 10.1109/TIP.2020.3044218. Epub 2021 Jan 25.
5
Confidence-Guided Self Refinement for Action Prediction in Untrimmed Videos.
IEEE Trans Image Process. 2020 Apr 17. doi: 10.1109/TIP.2020.2987425.
6
Graph Convolutional Module for Temporal Action Localization in Videos.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):6209-6223. doi: 10.1109/TPAMI.2021.3090167. Epub 2022 Sep 14.
7
Revisiting Anchor Mechanisms for Temporal Action Localization.
IEEE Trans Image Process. 2020 Aug 19;PP. doi: 10.1109/TIP.2020.3016486.
8
Adaptive Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization.
IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):4136-4151. doi: 10.1109/TPAMI.2022.3189662. Epub 2023 Mar 7.
9
Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization.
IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12427-12443. doi: 10.1109/TPAMI.2023.3287208. Epub 2023 Sep 5.
10
Weakly supervised temporal action localization with actionness-guided false positive suppression.
Neural Netw. 2024 Jul;175:106307. doi: 10.1016/j.neunet.2024.106307. Epub 2024 Apr 15.