A Pairwise Attentive Adversarial Spatiotemporal Network for Cross-Domain Few-Shot Action Recognition-R2.

Author information

Gao Zan, Guo Leming, Guan Weili, Liu An-An, Ren Tongwei, Chen Shengyong

Publication information

IEEE Trans Image Process. 2021;30:767-782. doi: 10.1109/TIP.2020.3038372. Epub 2020 Dec 4.

DOI:10.1109/TIP.2020.3038372

PMID:33232234

Abstract

Action recognition is a popular research topic in the computer vision and machine learning domains. Although many action recognition methods have been proposed, only a few researchers have focused on cross-domain few-shot action recognition, which must often be performed in real security surveillance. Because action recognition, domain adaptation, and few-shot learning must be solved simultaneously, cross-domain few-shot action recognition is a challenging task. To solve these issues, in this work, we develop a novel end-to-end pairwise attentive adversarial spatiotemporal network (PASTN) to perform the cross-domain few-shot action recognition task, in which spatiotemporal information acquisition, few-shot learning, and video domain adaptation are realised in a unified framework. Specifically, the ResNet-50 network is selected as the backbone of the PASTN, and a 3D convolution block is embedded in the top layer of the 2D CNN (ResNet-50) to capture the spatiotemporal representations. Moreover, a novel attentive adversarial network architecture is designed to align the spatiotemporal dynamics of actions that have higher domain discrepancies. In addition, a pairwise margin discrimination loss is designed for the pairwise network architecture to improve the discrimination of the learned domain-invariant spatiotemporal features. The results of extensive experiments performed on three public cross-domain action recognition benchmarks (SDAI Action I, SDAI Action II, and UCF50-OlympicSport) demonstrate that the proposed PASTN can significantly outperform the state-of-the-art cross-domain action recognition methods in terms of both accuracy and computational time. Even when only two labelled training samples per category are considered in the office1 scenario of the SDAI Action I dataset, the accuracy of the PASTN is improved by 6.1%, 10.9%, 16.8%, and 14% compared to that of the TAN, TemporalPooling, I3D, and P3D methods, respectively.
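
The abstract names a pairwise margin discrimination loss but does not give its formula. As a rough illustration of the general idea only, the sketch below implements the standard contrastive margin loss on a pair of feature vectors: same-class pairs are pulled together, and different-class pairs are pushed apart until they are at least a margin away. This is an assumption about the family of loss involved, not the authors' definition; the function names, the Euclidean distance, and the default margin are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_margin_loss(feat_a, feat_b, same_class, margin=1.0):
    """Generic contrastive margin loss for one pair (illustrative only).

    Assumption: this is the textbook contrastive form, not the loss
    defined in the paper, whose formula the abstract does not state.
    """
    d = euclidean(feat_a, feat_b)
    if same_class:
        # Same-class pairs are penalised by their squared distance.
        return d ** 2
    # Different-class pairs are penalised only while closer than the margin.
    return max(0.0, margin - d) ** 2


# Example pair, e.g. one source-domain and one target-domain feature.
source_feat = [0.0, 0.0]
target_feat = [3.0, 4.0]
print(pairwise_margin_loss(source_feat, target_feat, same_class=True))
print(pairwise_margin_loss(source_feat, target_feat, same_class=False))
```

In a pairwise network of the kind the abstract describes, one input of each pair would plausibly come from the source domain and the other from the target domain, so a loss of this shape would encourage features that are both domain-invariant and class-discriminative.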

