Online action proposal generation using spatio-temporal attention network.

Affiliations

Graduate School of Artificial Intelligence, Kyungpook National University, Daegu, 41566, South Korea.

KNU-LG Electronics Convergence Research Center, AI Institute of Technology, Kyungpook National University, Daegu, 41566, South Korea.

Publication information

Neural Netw. 2022 Sep;153:518-529. doi: 10.1016/j.neunet.2022.06.032. Epub 2022 Jun 30.

DOI: 10.1016/j.neunet.2022.06.032
PMID: 35835013

Abstract

Temporal action proposal generation aims to generate temporal boundaries that contain action instances. In real-time applications such as surveillance cameras, autonomous driving, and traffic monitoring, the online localization and recognition of human activities occurring in short temporal intervals are important areas of research. Existing approaches to temporal action proposal generation work offline and aggregate only frame-level features along the temporal dimension. These offline methods also generate many redundant, irrelevant proposal regions as temporal boundaries, which raises computational cost and slows processing, making them unsuitable for online tasks. In this study, we propose a novel spatio-temporal attention network for online action proposal generation, in contrast to existing offline proposal generation methods. The approach incorporates the inter-dependency between the spatial and temporal context information of each incoming video clip to generate more relevant online temporal action proposals. First, we propose a windowed spatial attention module to capture the inter-spatial relationships between the features of incoming frames. This module produces a more robust clip-level feature representation and deals efficiently with noisy features such as occlusion or background scenes. Second, we introduce a temporal attention module that captures relevant temporal dynamics together with the localized spatial information to model long-range inter-frame temporal relationships, since most real-life online videos are untrimmed. By applying these two attention modules sequentially, the proposed spatio-temporal network can generate precise action boundaries at a particular instant of time. In addition, the model generates fewer, more discriminative temporal action proposals while maintaining a low computational cost and high processing speed suitable for online settings.
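
The abstract does not specify the architecture, so the following is only a minimal sketch of the general idea of applying spatial attention within windows and then temporal attention across frames. Everything in it is an assumption made for illustration: the function names (`attend`, `windowed_spatial_attention`, `temporal_attention`, `clip_feature`), plain scaled dot-product attention without learned weights, non-overlapping windows, mean pooling, and the choice to query with the newest frame. It is not the authors' model.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out

def windowed_spatial_attention(frame, window):
    """Assumed form: self-attention restricted to non-overlapping windows
    of `window` patch vectors, then mean pooling to one frame vector."""
    attended = []
    for start in range(0, len(frame), window):
        group = frame[start:start + window]
        attended.extend(attend(group, group, group))
    dim = len(frame[0])
    return [sum(p[d] for p in attended) / len(attended) for d in range(dim)]

def temporal_attention(frame_vectors):
    """Assumed form: the newest frame attends to all frames seen so far,
    which keeps the computation online (no future frames are used)."""
    return attend([frame_vectors[-1]], frame_vectors, frame_vectors)[0]

def clip_feature(clip, window=2):
    """Apply the two modules sequentially: spatial first, then temporal."""
    frame_vectors = [windowed_spatial_attention(f, window) for f in clip]
    return temporal_attention(frame_vectors)

if __name__ == "__main__":
    # Toy clip: 3 frames, 4 patch vectors per frame, 2 features per patch.
    toy_clip = [
        [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]],
        [[0.5, 0.5], [0.2, 0.8], [0.9, 0.1], [0.0, 1.0]],
        [[0.3, 0.7], [0.6, 0.4], [1.0, 0.0], [0.1, 0.9]],
    ]
    print(clip_feature(toy_clip, window=2))
```

In a real system the resulting clip-level vector would feed a learned head that scores action boundaries; that part is omitted here because the abstract gives no detail about it.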

Similar articles

1. Online action proposal generation using spatio-temporal attention network.
   Neural Netw. 2022 Sep;153:518-529. doi: 10.1016/j.neunet.2022.06.032. Epub 2022 Jun 30.
2. Spatio-Temporal Action Detection in Untrimmed Videos by Using Multimodal Features and Region Proposals.
   Sensors (Basel). 2019 Mar 3;19(5):1085. doi: 10.3390/s19051085.
3. MEST: An Action Recognition Network with Motion Encoder and Spatio-Temporal Module.
   Sensors (Basel). 2022 Sep 1;22(17):6595. doi: 10.3390/s22176595.
4. Two-Level Attention Module Based on Spurious-3D Residual Networks for Human Action Recognition.
   Sensors (Basel). 2023 Feb 3;23(3):1707. doi: 10.3390/s23031707.
5. Robust Online Tracking via Contrastive Spatio-Temporal Aware Network.
   IEEE Trans Image Process. 2021;30:1989-2002. doi: 10.1109/TIP.2021.3050314. Epub 2021 Jan 20.
6. Multi-Level Content-Aware Boundary Detection for Temporal Action Proposal Generation.
   IEEE Trans Image Process. 2023;32:6090-6101. doi: 10.1109/TIP.2023.3328471. Epub 2023 Nov 8.
7. YoTube: Searching Action Proposal Via Recurrent and Static Regression Networks.
   IEEE Trans Image Process. 2018 Jun;27(6):2609-2622. doi: 10.1109/TIP.2018.2806279.
8. Real-Time Video Super-Resolution with Spatio-Temporal Modeling and Redundancy-Aware Inference.
   Sensors (Basel). 2023 Sep 14;23(18):7880. doi: 10.3390/s23187880.
9. STA-TSN: Spatial-Temporal Attention Temporal Segment Network for action recognition in video.
   PLoS One. 2022 Mar 17;17(3):e0265115. doi: 10.1371/journal.pone.0265115. eCollection 2022.
10. Recurrent Spatial-Temporal Attention Network for Action Recognition in Videos.
    IEEE Trans Image Process. 2018 Mar;27(3):1347-1360. doi: 10.1109/TIP.2017.2778563. Epub 2017 Nov 29.