Department of Computer Science and Engineering, Kyung Hee University, Global Campus, Yongin 17104, Korea.
Sensors (Basel). 2019 Apr 2;19(7):1599. doi: 10.3390/s19071599.
Human action recognition plays a significant role in the research community because of its emerging applications. A variety of approaches have been proposed to address this problem; however, several issues remain open. In action recognition, effectively extracting and aggregating spatiotemporal information is vital for describing a video. In this research, we propose a novel approach that recognizes human actions by combining deep spatial features with handcrafted spatiotemporal features. First, we extract deep spatial features using a state-of-the-art deep convolutional network, Inception-Resnet-v2. Second, we introduce a novel handcrafted feature descriptor, the Weber's law based Volume Local Gradient Ternary Pattern (WVLGTP), which captures spatiotemporal features and also incorporates shape information through a gradient operation. Furthermore, a Weber's law based threshold and a ternary pattern built on an adaptive local threshold are presented to handle noisy center pixel values effectively. In addition, a multi-resolution variant of WVLGTP based on an averaging scheme is presented. The two sets of extracted features are then concatenated and fed to a Support Vector Machine for classification. Finally, extensive experimental analysis shows that the proposed method outperforms state-of-the-art approaches in terms of accuracy.
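To make the adaptive-threshold idea concrete, the following is a minimal illustrative sketch, not the WVLGTP descriptor itself. The abstract does not give the descriptor's formulas, so everything here is an assumption: the sketch works on raw intensities in a single 3x3 patch (the actual descriptor operates on gradients over a video volume), the threshold is taken as `alpha * center` in the spirit of Weber's law, `alpha = 0.1` is a hypothetical constant, and the function names are invented for illustration. Splitting the ternary code into two binary codes follows common local ternary pattern practice.

```python
def weber_ternary_codes(patch, alpha=0.1):
    """Ternary-code the 8 neighbours of the centre pixel of a 3x3 patch.

    Assumption: the threshold scales with the centre intensity
    (t = alpha * centre), mirroring Weber's law, where the
    just-noticeable difference is proportional to the reference stimulus.
    """
    center = patch[1][1]
    t = alpha * center
    # Neighbours visited clockwise, starting at the top-left corner.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = []
    for r, c in order:
        diff = patch[r][c] - center
        if diff > t:
            codes.append(1)    # clearly brighter than the centre
        elif diff < -t:
            codes.append(-1)   # clearly darker than the centre
        else:
            codes.append(0)    # within the tolerance band
    return codes


def split_ternary(codes):
    """Split a ternary code into upper/lower binary patterns as integers."""
    upper = sum(1 << i for i, c in enumerate(codes) if c == 1)
    lower = sum(1 << i for i, c in enumerate(codes) if c == -1)
    return upper, lower


def fuse_features(deep_features, handcrafted_features):
    """Concatenate two feature vectors into one fused vector."""
    return list(deep_features) + list(handcrafted_features)


# Hypothetical 3x3 intensity patch with centre value 100 (threshold 10).
patch = [
    [100, 115, 90],
    [105, 100, 80],
    [120, 100, 111],
]
codes = weber_ternary_codes(patch)
upper, lower = split_ternary(codes)
fused = fuse_features([0.2, 0.7], [upper, lower])
```

Because the tolerance band widens with the center intensity, small fluctuations around a bright center pixel map to 0 rather than flipping the code, which is one plausible reading of how an adaptive threshold limits the effect of a noisy center value. In a full pipeline, the fused vector would be histogrammed per video and passed to an SVM classifier.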