IEEE Trans Cybern. 2021 Dec;51(12):5859-5870. doi: 10.1109/TCYB.2019.2960481. Epub 2021 Dec 22.
Automatic human activity recognition is an integral part of any interactive application involving humans (e.g., human-robot interaction systems). One of the main challenges for activity recognition is the diversity in the way individuals perform activities. Changes in environmental factors (e.g., illumination, complex backgrounds, human body shapes, and viewpoint) intensify this challenge. In addition, robots need to interpret different types of activities for seamless interaction with humans. Some activities are short, quick, and simple (e.g., sitting), while others are detailed or complex and spread over a long span of time (e.g., rinsing the mouth). In this article, we recognize activities from skeleton data using graphical models in a sequence-labeling framework. We propose a new structured prediction strategy based on probabilistic graphical models (PGMs) to recognize both types of activities (i.e., complex and simple). These activity types often span very different subspaces of the space of all possible activities, which requires different model parameterizations. To deal with these parameterization and structural breaks across models, a category-switching scheme is proposed that switches between the models based on the activity type. For parameter optimization, we use a distributed structured prediction technique to implement our model in a distributed setting. The method is tested on three widely used datasets (CAD-60, UT-Kinect, and Florence 3-D) that cover both activity types. The results illustrate that our proposed method is able to recognize both simple and complex activities, whereas previous work concentrated on only one of these two main types.
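The abstract does not specify the form of the graphical models or the switching rule, so the following is only a minimal illustrative sketch of category switching in a sequence-labeling setting, not the authors' method. It assumes one linear-chain model per activity category, decoded with Viterbi, and a placeholder switch that picks the category by sequence length. All names (`MODELS`, `viterbi`, `unary_scores`, `recognize`, `max_simple_len`), the toy weights, and the length-based rule are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi(unary: Sequence[Sequence[float]],
            trans: Sequence[Sequence[float]]) -> Tuple[float, List[int]]:
    """Return the best score and label path of a linear-chain model (log-domain scores)."""
    n_labels = len(trans)
    score = list(unary[0])
    back: List[List[int]] = []
    for t in range(1, len(unary)):
        new_score, ptr = [], []
        for j in range(n_labels):
            # Best previous label i for current label j.
            best_i = max(range(n_labels), key=lambda i: score[i] + trans[i][j])
            new_score.append(score[best_i] + trans[best_i][j] + unary[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    last = max(range(n_labels), key=lambda j: score[j])
    path = [last]
    for ptr in reversed(back):  # follow back-pointers from the last frame
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path


def unary_scores(frames: Sequence[Sequence[float]],
                 weights: Sequence[Sequence[float]]) -> List[List[float]]:
    """Per-frame label scores as dot products of label weights and frame features."""
    return [[sum(w * x for w, x in zip(row, frame)) for row in weights]
            for frame in frames]


# Hypothetical toy parameters: one separately parameterized model per category.
MODELS: Dict[str, dict] = {
    "simple": {
        "labels": ["sit", "stand"],
        "weights": [[1.0, 0.0], [0.0, 1.0]],
        "trans": [[0.5, -0.5], [-0.5, 0.5]],
    },
    "complex": {
        "labels": ["reach", "rinse", "spit"],
        "weights": [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
        "trans": [[0.5, 0.0, -1.0], [-1.0, 0.5, 0.0], [-1.0, -1.0, 0.5]],
    },
}


def recognize(frames: Sequence[Sequence[float]],
              max_simple_len: int = 3) -> Tuple[str, List[str]]:
    """Switch to a category model, then label every frame with that model.

    ASSUMPTION: the length threshold is a stand-in for a real switching rule.
    """
    category = "simple" if len(frames) <= max_simple_len else "complex"
    model = MODELS[category]
    _, path = viterbi(unary_scores(frames, model["weights"]), model["trans"])
    return category, [model["labels"][i] for i in path]


if __name__ == "__main__":
    print(recognize([[0.9, 0.1], [0.8, 0.2]]))
    print(recognize([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]))
```

The point of the sketch is only the structure: each category keeps its own parameters and label set, and a switch decides which model decodes a given skeleton sequence. A real system would use features derived from skeleton joints and a learned switching criterion.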