Structural Knowledge Distillation for Efficient Skeleton-Based Action Recognition.

Authors

Bian Cunling, Feng Wei, Wan Liang, Wang Song

Publication information

IEEE Trans Image Process. 2021;30:2963-2976. doi: 10.1109/TIP.2021.3056895. Epub 2021 Feb 17.

DOI:10.1109/TIP.2021.3056895

Abstract

Skeleton data have been extensively used for action recognition since they can robustly accommodate dynamic circumstances and complex backgrounds. To guarantee action-recognition performance, we prefer to use advanced and time-consuming algorithms to get more accurate and complete skeletons from the scene. However, this may not be acceptable in time- and resource-stringent applications. In this paper, we explore the feasibility of using low-quality skeletons, which can be quickly and easily estimated from the scene, for action recognition. While the use of low-quality skeletons will surely lead to degraded action-recognition accuracy, we propose a structural knowledge distillation scheme to minimize this accuracy degradation and improve the recognition model's robustness to uncontrollable skeleton corruptions. More specifically, a teacher which observes high-quality skeletons obtained from a scene is used to help train a student which only sees low-quality skeletons generated from the same scene. At inference time, only the student network is deployed for processing low-quality skeletons. In the proposed network, a graph matching loss is used to distill the graph structural knowledge at an intermediate representation level. We also propose a new gradient revision strategy to seek a balance between mimicking the teacher model and directly improving the student model's accuracy. Experiments are conducted on the Kinetics400, NTU RGB+D, and Penn Action datasets, and the comparison results demonstrate the effectiveness of our scheme.
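
The abstract does not give the exact form of the graph matching loss or of the gradient revision rule, so the following is only an illustrative sketch under stated assumptions. It assumes the structural loss compares pairwise cosine-affinity matrices of per-joint features from the student and teacher, and that gradient revision removes the component of the distillation gradient that opposes the task gradient. The function names `affinity`, `graph_matching_loss`, and `revise_gradient` are hypothetical and are not taken from the paper.

```python
import math

def affinity(features):
    """Cosine-similarity matrix between per-joint feature vectors (assumed structure measure)."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors
        return [x / norm for x in v]
    u = [unit(v) for v in features]
    return [[sum(a * b for a, b in zip(p, q)) for q in u] for p in u]

def graph_matching_loss(student_feats, teacher_feats):
    """Mean squared difference between student and teacher affinity matrices (assumption)."""
    s, t = affinity(student_feats), affinity(teacher_feats)
    n = len(s)
    return sum((s[i][j] - t[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

def revise_gradient(g_distill, g_task):
    """Assumed revision rule: if the distillation gradient opposes the task
    gradient, project out the conflicting component; otherwise keep it as is."""
    dot = sum(a * b for a, b in zip(g_distill, g_task))
    if dot >= 0:
        return list(g_distill)
    scale = dot / sum(b * b for b in g_task)
    return [a - scale * b for a, b in zip(g_distill, g_task)]

# Toy intermediate features for two joints: teacher sees clean skeletons, student sees corrupted ones.
teacher = [[1.0, 0.0], [1.0, 0.0]]
student = [[1.0, 0.0], [0.0, 1.0]]
loss = graph_matching_loss(student, teacher)
revised = revise_gradient([-1.0, 0.0], [1.0, 1.0])
```

In this toy case the student's joints are orthogonal while the teacher's are identical, so the loss is nonzero, and the revised distillation gradient ends up orthogonal to the task gradient. Only the student side of such a setup would be used at inference, as the abstract states.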

Similar articles

A 3DCNN-Based Knowledge Distillation Framework for Human Activity Recognition.

J Imaging. 2023 Apr 14;9(4):82. doi: 10.3390/jimaging9040082.

Feedback Graph Convolutional Network for Skeleton-Based Action Recognition.

IEEE Trans Image Process. 2022;31:164-175. doi: 10.1109/TIP.2021.3129117. Epub 2021 Dec 2.

Multi-scale and attention enhanced graph convolution network for skeleton-based violence action recognition.

Front Neurorobot. 2022 Dec 15;16:1091361. doi: 10.3389/fnbot.2022.1091361. eCollection 2022.

Enhanced Adjacency Matrix-Based Lightweight Graph Convolution Network for Action Recognition.

Sensors (Basel). 2023 Jul 14;23(14):6397. doi: 10.3390/s23146397.

MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos.

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3522-3538. doi: 10.1109/TPAMI.2022.3177813. Epub 2023 Feb 3.

Adversarial Attack on Skeleton-Based Human Action Recognition.

IEEE Trans Neural Netw Learn Syst. 2022 Apr;33(4):1609-1622. doi: 10.1109/TNNLS.2020.3043002. Epub 2022 Apr 4.

SMAM: Self and Mutual Adaptive Matching for Skeleton-Based Few-Shot Action Recognition.

IEEE Trans Image Process. 2023;32:392-402. doi: 10.1109/TIP.2022.3226410. Epub 2022 Dec 28.

Action-Attending Graphic Neural Network.

IEEE Trans Image Process. 2018 Mar 14. doi: 10.1109/TIP.2018.2815744.

Frameless Graph Knowledge Distillation.

IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):8125-8139. doi: 10.1109/TNNLS.2024.3442379. Epub 2025 May 2.

Cited by

Spatial-Temporal Heatmap Masked Autoencoder for Skeleton-Based Action Recognition.

Sensors (Basel). 2025 May 16;25(10):3146. doi: 10.3390/s25103146.

The use of artificial intelligence-based Siamese neural network in personalized guidance for sports dance teaching.

Sci Rep. 2025 Apr 9;15(1):12112. doi: 10.1038/s41598-025-96462-0.

A Survey on 3D Skeleton-Based Action Recognition Using Learning Method.

Cyborg Bionic Syst. 2024 May 16;5:0100. doi: 10.34133/cbsystems.0100. eCollection 2024.
