School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164, China.
Sensors (Basel). 2020 Sep 11;20(18):5180. doi: 10.3390/s20185180.
In contemporary research on human action recognition, most methods consider the movement features of each joint separately. However, they ignore the fact that a human action is the result of the integrated, cooperative movement of all joints. To address this problem, this paper proposes two action feature representations: the Motion Collaborative Spatio-Temporal Vector (MCSTV) and the Motion Spatio-Temporal Map (MSTM). MCSTV comprehensively accounts for the integrity and cooperativity of motion among the joints: it accumulates the weighted motion vectors of the limbs into a new vector that characterizes the movement features of the whole action. To describe the action more comprehensively and accurately, we extract the key motion energy through key-information extraction based on inter-frame energy fluctuation, project the energy onto three orthogonal axes, and stitch the projections in temporal order to construct the MSTM. To combine the advantages of MSTM and MCSTV, we propose Multi-Target Subspace Learning (MTSL), which projects MSTM and MCSTV into a common subspace so that they complement each other. Results on MSR-Action3D and UTD-MHAD show that our method achieves higher recognition accuracy than most existing human action recognition algorithms.
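The weighted accumulation of limb motion vectors described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the joint indices, weights, and the use of simple inter-frame differences as motion vectors are all assumptions for the example.

```python
import numpy as np

def mcstv_sketch(skeleton, limb_joints, weights):
    """Hypothetical MCSTV-style accumulation.

    skeleton:    (T, J, 3) array of 3D joint positions over T frames
    limb_joints: indices of the limb joints to accumulate
    weights:     per-joint weights (same length as limb_joints)
    Returns a (T-1, 3) collaborative motion vector per frame transition.
    """
    motion = np.diff(skeleton, axis=0)        # (T-1, J, 3) inter-frame motion vectors
    limbs = motion[:, limb_joints, :]         # keep only the selected limb joints
    w = np.asarray(weights)[None, :, None]    # broadcast weights over frames and coords
    return (limbs * w).sum(axis=1)            # weighted sum -> one vector per frame step

# Toy example: 4 frames, 5 joints, 3 hypothetical limb joints
rng = np.random.default_rng(0)
skel = rng.standard_normal((4, 5, 3))
v = mcstv_sketch(skel, limb_joints=[1, 2, 3], weights=[0.5, 0.3, 0.2])
print(v.shape)  # (3, 3): one accumulated vector per inter-frame step
```

Stacking these per-frame vectors over time yields a compact temporal descriptor of how the limbs move together, which is the intuition behind treating the action as cooperative rather than per-joint.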