Venkatachalam Shanmuga, Nair Harideep, Zeng Ming, Tan Cathy Shunwen, Mengshoel Ole J, Shen John Paul
Department of ECE, Carnegie Mellon University, Pittsburgh, PA, United States.
Department of ECE, Anderson School of Management, University of California, Los Angeles, Los Angeles, CA, United States.
Front Big Data. 2022 Aug 30;5:879389. doi: 10.3389/fdata.2022.879389. eCollection 2022.
Human Activity Recognition (HAR) is a prominent application in mobile computing and Internet of Things (IoT) that aims to detect human activities based on multimodal sensor signals generated as a result of diverse body movements. Human physical activities are typically composed of simple actions (such as "arm up", "arm down", "arm curl", etc.), referred to as semantic features. Such abstract semantic features, in contrast to high-level activities ("walking", "sitting", etc.) and low-level signals (raw sensor readings), can be developed manually to assist activity recognition. Although effective, this manual approach relies heavily on human domain expertise and is not scalable. In this paper, we address this limitation by proposing a machine learning method, SemNet, based on deep belief networks. SemNet automatically constructs semantic features representative of the axial bodily movements. Experimental results show that SemNet outperforms baseline approaches and is capable of learning features that highly correlate with manually defined semantic attributes. Furthermore, our experiments using a different model, namely deep convolutional LSTM, on household activities illustrate the broader applicability of semantic attribute interpretation to diverse deep neural network approaches. These empirical results not only demonstrate that such a deep learning technique is semantically meaningful and superior to its handcrafted counterpart, but also provide a better understanding of the deep learning methods that are used for Human Activity Recognition.