Structural Knowledge Distillation for Efficient Skeleton-Based Action Recognition.

Author information

Bian Cunling, Feng Wei, Wan Liang, Wang Song

Publication information

IEEE Trans Image Process. 2021;30:2963-2976. doi: 10.1109/TIP.2021.3056895. Epub 2021 Feb 17.

Abstract

Skeleton data have been extensively used for action recognition since they can robustly accommodate dynamic circumstances and complex backgrounds. To guarantee action-recognition performance, advanced but time-consuming algorithms are usually preferred for obtaining more accurate and complete skeletons from the scene. However, this may not be acceptable in time- and resource-constrained applications. In this paper, we explore the feasibility of using low-quality skeletons, which can be quickly and easily estimated from the scene, for action recognition. Because low-quality skeletons inevitably degrade action-recognition accuracy, we propose a structural knowledge distillation scheme to minimize this degradation and improve the recognition model's robustness to uncontrollable skeleton corruptions. More specifically, a teacher that observes high-quality skeletons obtained from a scene is used to help train a student that only sees low-quality skeletons generated from the same scene. At inference time, only the student network is deployed to process low-quality skeletons. In the proposed network, a graph matching loss distills the graph structural knowledge at an intermediate representation level. We also propose a new gradient revision strategy to seek a balance between mimicking the teacher model and directly improving the student model's accuracy. Experiments are conducted on the Kinetics400, NTU RGB+D and Penn action recognition datasets, and the comparison results demonstrate the effectiveness of our scheme.
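
The abstract names two mechanisms, a graph matching loss and a gradient revision strategy, without giving their formulas. The following is only a rough sketch of how such components could look, not the authors' formulation. It assumes that the "graph" is a pairwise cosine-affinity matrix over per-joint feature vectors, and that "gradient revision" is a projection rule that removes the part of the distillation gradient conflicting with the task gradient. The function names `affinity`, `graph_matching_loss`, and `revise_gradient` are hypothetical.

```python
import math


def affinity(features):
    """Pairwise cosine similarity between per-joint feature vectors.

    Assumption: the structural 'graph' is modelled as this affinity matrix.
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv + 1e-12)  # small epsilon avoids division by zero

    return [[cos(u, v) for v in features] for u in features]


def graph_matching_loss(student_feats, teacher_feats):
    """Mean squared difference between student and teacher affinity graphs."""
    a_s = affinity(student_feats)
    a_t = affinity(teacher_feats)
    n = len(a_s)
    total = sum((a_s[i][j] - a_t[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


def revise_gradient(g_task, g_distill):
    """Combine task and distillation gradients (hypothetical projection rule).

    If the distillation gradient points against the task gradient
    (negative dot product), its conflicting component is projected out
    before the two gradients are summed.
    """
    dot = sum(t * d for t, d in zip(g_task, g_distill))
    if dot < 0:
        norm_sq = sum(t * t for t in g_task)  # > 0 whenever dot < 0
        g_distill = [d - dot / norm_sq * t for d, t in zip(g_distill, g_task)]
    return [t + d for t, d in zip(g_task, g_distill)]


# Toy intermediate features for three joints (illustrative values only).
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
loss = graph_matching_loss(student, teacher)
update = revise_gradient([1.0, 0.0], [-1.0, 1.0])
```

In this sketch the loss is zero when the student reproduces the teacher's joint-to-joint relations, even if the raw feature magnitudes differ, which is one way to read "structural" knowledge. The projection step keeps the distillation signal from undoing progress on the classification objective, which is one possible interpretation of balancing teacher mimicry against student accuracy.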
