

Non-Local Temporal Difference Network for Temporal Action Detection

Affiliations

Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610081, China.

School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China.

Publication

Sensors (Basel). 2022 Nov 1;22(21):8396. doi: 10.3390/s22218396.

DOI: 10.3390/s22218396
PMID: 36366106
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9655564/
Abstract

As an important part of video understanding, temporal action detection (TAD) has wide application scenarios. It aims to simultaneously predict the boundary position and class label of every action instance in an untrimmed video. Most existing temporal action detection methods adopt a stacked-convolution strategy to model long temporal structures. However, much of the information between adjacent frames is redundant, and distant information is weakened after multiple convolution operations. In addition, the durations of action instances vary widely, making it difficult for single-scale modeling to fit complex video structures. To address these issues, we propose a non-local temporal difference network (NTD), comprising a chunk convolution (CC) module, a multiple temporal coordination (MTC) module, and a temporal difference (TD) module. The TD module adaptively enhances motion information and boundary features with temporal attention weights. The CC module evenly divides the input sequence into N chunks and uses multiple independent convolution blocks to extract features from the chunks simultaneously. It thereby delivers information from distant frames while avoiding the locality of stacked convolutions. The MTC module adopts a cascade residual architecture that achieves multiscale temporal feature aggregation without introducing additional parameters. NTD achieves state-of-the-art performance on two large-scale datasets: 36.2% mAP@avg on ActivityNet-v1.3 and 71.6% mAP@0.5 on THUMOS-14.
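The chunk convolution idea described in the abstract — splitting the temporal feature sequence into N chunks and filtering each chunk with its own independent convolution so that distant frames are processed in parallel — can be sketched as below. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name, the depthwise (per-channel) kernel layout, and the edge padding are all illustrative choices.

```python
import numpy as np

def chunk_conv(x, kernels):
    """Sketch of a chunk convolution (CC) step.

    x       : (T, C) temporal feature sequence.
    kernels : list of (k, C) filters, one independent kernel per
              chunk (k odd). Each chunk is convolved with its own
              kernel, so no single deep local conv stack has to
              carry information across the whole sequence.
    Returns an array of the same shape as x.
    """
    n = len(kernels)
    chunks = np.array_split(x, n, axis=0)  # evenly divide into N chunks
    outs = []
    for chunk, w in zip(chunks, kernels):
        pad = w.shape[0] // 2
        # edge-pad in time so the chunk length is preserved
        padded = np.pad(chunk, ((pad, pad), (0, 0)), mode="edge")
        # depthwise temporal conv: each channel filtered independently
        out = np.stack(
            [np.convolve(padded[:, c], w[:, c], mode="valid")
             for c in range(chunk.shape[1])],
            axis=1,
        )
        outs.append(out[:chunk.shape[0]])
    return np.concatenate(outs, axis=0)
```

With a box (averaging) kernel and a constant input, the output equals the input, which gives a quick sanity check that temporal length and channel count are preserved across the chunk split.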


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f821/9655564/b9b8052c5d3a/sensors-22-08396-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f821/9655564/1a936a7c3027/sensors-22-08396-g002.jpg

Similar articles

1. Non-Local Temporal Difference Network for Temporal Action Detection.
   Sensors (Basel). 2022 Nov 1;22(21):8396. doi: 10.3390/s22218396.
2. End-to-End Temporal Action Detection With Transformer.
   IEEE Trans Image Process. 2022;31:5427-5441. doi: 10.1109/TIP.2022.3195321. Epub 2022 Aug 17.
3. STA-TSN: Spatial-Temporal Attention Temporal Segment Network for action recognition in video.
   PLoS One. 2022 Mar 17;17(3):e0265115. doi: 10.1371/journal.pone.0265115. eCollection 2022.
4. MCMNET: Multi-Scale Context Modeling Network for Temporal Action Detection.
   Sensors (Basel). 2023 Aug 31;23(17):7563. doi: 10.3390/s23177563.
5. Two-Level Attention Module Based on Spurious-3D Residual Networks for Human Action Recognition.
   Sensors (Basel). 2023 Feb 3;23(3):1707. doi: 10.3390/s23031707.
6. Leveraging spatial residual attention and temporal Markov networks for video action understanding.
   Neural Netw. 2024 Jan;169:378-387. doi: 10.1016/j.neunet.2023.10.047. Epub 2023 Oct 31.
7. Motion sensitive network for action recognition in control and decision-making of autonomous systems.
   Front Neurosci. 2024 Mar 25;18:1370024. doi: 10.3389/fnins.2024.1370024. eCollection 2024.
8. Multi-Modality Adaptive Feature Fusion Graph Convolutional Network for Skeleton-Based Action Recognition.
   Sensors (Basel). 2023 Jun 7;23(12):5414. doi: 10.3390/s23125414.
9. Multimodal and multiscale feature fusion for weakly supervised video anomaly detection.
   Sci Rep. 2024 Oct 1;14(1):22835. doi: 10.1038/s41598-024-73462-0.
10. A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention.
    Entropy (Basel). 2022 Mar 4;24(3):368. doi: 10.3390/e24030368.

Cited by

1. MCMNET: Multi-Scale Context Modeling Network for Temporal Action Detection.
   Sensors (Basel). 2023 Aug 31;23(17):7563. doi: 10.3390/s23177563.

References

1. End-to-End Temporal Action Detection With Transformer.
   IEEE Trans Image Process. 2022;31:5427-5441. doi: 10.1109/TIP.2022.3195321. Epub 2022 Aug 17.
2. Revisiting Anchor Mechanisms for Temporal Action Localization.
   IEEE Trans Image Process. 2020 Aug 19;PP. doi: 10.1109/TIP.2020.3016486.
3. Res2Net: A New Multi-Scale Backbone Architecture.
   IEEE Trans Pattern Anal Mach Intell. 2021 Feb;43(2):652-662. doi: 10.1109/TPAMI.2019.2938758. Epub 2021 Jan 8.
4. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
   IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.