


Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition.

Authors

Weng Zhengkui, Jin Zhipeng, Chen Shuangxi, Shen Quanquan, Ren Xiangyang, Li Wuzhao

Affiliations

Jiaxing Vocational and Technical College, Jiaxing, Zhejiang, China.

Medical 3D Printing Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China.

Publication

Comput Intell Neurosci. 2021 Mar 27;2021:8890808. doi: 10.1155/2021/8890808. eCollection 2021.

DOI: 10.1155/2021/8890808
PMID: 33859682
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8024088/
Abstract

Convolutional neural networks (CNNs) have advanced rapidly in recent years. However, high dimensionality, rich human dynamics, and varied background interference make it difficult for traditional CNNs to capture complicated motion in videos. We propose a novel framework, the attention-based temporal encoding network (ATEN) with a background-independent motion mask (BIMM), for video action recognition. First, we introduce a motion segmentation approach based on a boundary prior, using the minimal geodesic distance in an undirected weighted graph. We then propose a dynamic contrast segmentation strategy to segment moving objects in complicated environments. Next, we build the BIMM to enhance the moving object by suppressing the irrelevant background in each frame. Furthermore, we design a long-range attention mechanism within ATEN that effectively models the long-term dependencies of complex, non-periodic actions by automatically focusing on semantically vital frames rather than treating all sampled frames equally; the attention mechanism thereby suppresses temporal redundancy and highlights discriminative frames. Finally, we evaluate the framework on the UCF101 and HMDB51 datasets, where ATEN with BIMM achieves 94.5% and 70.6% accuracy, respectively, outperforming a number of existing methods on both benchmarks.
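The long-range attention idea described above (weighting semantically important frames instead of treating all sampled frames equally) can be sketched as follows. This is a minimal illustration, not the paper's architecture: the feature dimensions, the toy inputs, and the linear scoring vector `w` are all illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_features, w):
    """Aggregate per-frame feature vectors into one clip descriptor.

    frame_features: list of T feature vectors (each a list of D floats),
        one per sampled frame.
    w: scoring vector of length D; frames whose features align with w
        receive higher attention weight, so redundant frames are suppressed.
    """
    scores = [sum(f_i * w_i for f_i, w_i in zip(f, w)) for f in frame_features]
    alpha = softmax(scores)  # attention weights over frames, sum to 1
    D = len(frame_features[0])
    clip = [sum(a * f[d] for a, f in zip(alpha, frame_features))
            for d in range(D)]
    return clip, alpha

# Toy example: 3 frames with 2-dim features; frame 2 aligns best with w,
# so it dominates the pooled clip descriptor.
feats = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
clip, alpha = attention_pool(feats, w=[1.0, 1.0])
```

The key design point mirrored here is that pooling is a learned weighted sum rather than a uniform average, which is what lets an attention mechanism suppress temporally redundant frames.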

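The background-independent motion mask amounts to an elementwise mask that keeps the moving object and zeroes out the rest of the frame. The sketch below uses a simple frame-difference threshold as an illustrative stand-in for the paper's geodesic-distance and dynamic-contrast segmentation; the function name and threshold are assumptions for the example.

```python
def apply_motion_mask(frame, prev_frame, thresh=25):
    """Suppress background pixels with a simple frame-difference mask.

    frame, prev_frame: 2-D lists of grayscale intensities (0-255).
    A pixel whose temporal difference exceeds `thresh` is kept as moving
    foreground; every other pixel is zeroed out as background.
    """
    return [
        [px if abs(px - prev_px) > thresh else 0
         for px, prev_px in zip(row, prev_row)]
        for row, prev_row in zip(frame, prev_frame)
    ]

# Toy example: static 4x4 background, one bright patch appears in the
# current frame; only the patch survives the mask.
prev = [[0] * 4 for _ in range(4)]
cur = [[0] * 4 for _ in range(4)]
for r in range(1, 3):
    for c in range(1, 3):
        cur[r][c] = 200
masked = apply_motion_mask(cur, prev)
```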

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/ae7961669f10/CIN2021-8890808.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/eeda52530152/CIN2021-8890808.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/9ee3c16c7f15/CIN2021-8890808.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/d999453ede63/CIN2021-8890808.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/6c92f77a5b52/CIN2021-8890808.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/c43db980461e/CIN2021-8890808.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/cc8204471402/CIN2021-8890808.007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/ee17083caec7/CIN2021-8890808.008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/40ca8c11f8ae/CIN2021-8890808.009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7661/8024088/36b0a69fba5b/CIN2021-8890808.010.jpg

Similar articles

1. Attention-Based Temporal Encoding Network with Background-Independent Motion Mask for Action Recognition.
Comput Intell Neurosci. 2021 Mar 27;2021:8890808. doi: 10.1155/2021/8890808. eCollection 2021.
2. Recurrent Spatial-Temporal Attention Network for Action Recognition in Videos.
IEEE Trans Image Process. 2018 Mar;27(3):1347-1360. doi: 10.1109/TIP.2017.2778563. Epub 2017 Nov 29.
3. STA-TSN: Spatial-Temporal Attention Temporal Segment Network for action recognition in video.
PLoS One. 2022 Mar 17;17(3):e0265115. doi: 10.1371/journal.pone.0265115. eCollection 2022.
4. STA-CNN: Convolutional Spatial-Temporal Attention Learning for Action Recognition.
IEEE Trans Image Process. 2020 Apr 7. doi: 10.1109/TIP.2020.2984904.
5. MEST: An Action Recognition Network with Motion Encoder and Spatio-Temporal Module.
Sensors (Basel). 2022 Sep 1;22(17):6595. doi: 10.3390/s22176595.
6. Action Recognition by an Attention-Aware Temporal Weighted Convolutional Neural Network.
Sensors (Basel). 2018 Jun 21;18(7):1979. doi: 10.3390/s18071979.
7. Two-Level Attention Module Based on Spurious-3D Residual Networks for Human Action Recognition.
Sensors (Basel). 2023 Feb 3;23(3):1707. doi: 10.3390/s23031707.
8. An Efficient Human Instance-Guided Framework for Video Action Recognition.
Sensors (Basel). 2021 Dec 12;21(24):8309. doi: 10.3390/s21248309.
9. Global and Local Knowledge-Aware Attention Network for Action Recognition.
IEEE Trans Neural Netw Learn Syst. 2021 Jan;32(1):334-347. doi: 10.1109/TNNLS.2020.2978613. Epub 2021 Jan 4.
10. DroneAttention: Sparse weighted temporal attention for drone-camera based activity recognition.
Neural Netw. 2023 Feb;159:57-69. doi: 10.1016/j.neunet.2022.12.005. Epub 2022 Dec 13.

Cited by

1. ASNet: Auto-Augmented Siamese Neural Network for Action Recognition.
Sensors (Basel). 2021 Jul 10;21(14):4720. doi: 10.3390/s21144720.

References

1. STA-CNN: Convolutional Spatial-Temporal Attention Learning for Action Recognition.
IEEE Trans Image Process. 2020 Apr 7. doi: 10.1109/TIP.2020.2984904.
2. Long-Term Temporal Convolutions for Action Recognition.
IEEE Trans Pattern Anal Mach Intell. 2018 Jun;40(6):1510-1517. doi: 10.1109/TPAMI.2017.2712608. Epub 2017 Jun 6.
3. Saliency-Aware Video Object Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2018 Jan;40(1):20-33. doi: 10.1109/TPAMI.2017.2662005. Epub 2017 Jan 31.
4. Long-Term Recurrent Convolutional Networks for Visual Recognition and Description.
IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):677-691. doi: 10.1109/TPAMI.2016.2599174. Epub 2016 Sep 1.
5. SLIC superpixels compared to state-of-the-art superpixel methods.
IEEE Trans Pattern Anal Mach Intell. 2012 Nov;34(11):2274-82. doi: 10.1109/TPAMI.2012.120.
6. 3D convolutional neural networks for human action recognition.
IEEE Trans Pattern Anal Mach Intell. 2013 Jan;35(1):221-31. doi: 10.1109/TPAMI.2012.59.
7. Context-aware saliency detection.
IEEE Trans Pattern Anal Mach Intell. 2012 Oct;34(10):1915-26. doi: 10.1109/TPAMI.2011.272.
8. A computational approach to edge detection.
IEEE Trans Pattern Anal Mach Intell. 1986 Jun;8(6):679-98.