A Human Activity Recognition System Based on Dynamic Clustering of Skeleton Data.

Affiliation

The BioRobotics Institute, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio, 34, 56026 Pontedera (PI), Italy.

Publication information

Sensors (Basel). 2017 May 11;17(5):1100. doi: 10.3390/s17051100.

DOI:10.3390/s17051100

PMID:28492486

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5470490/

Abstract

Human activity recognition is an important area of computer vision, with a wide range of applications that includes ambient assisted living. This paper presents an activity recognition system based on skeleton data extracted from a depth camera. The system uses machine learning techniques to classify actions, each described by a small set of basic postures. The training phase creates several models, related to the number of clustered postures, by means of a multiclass Support Vector Machine (SVM) trained with Sequential Minimal Optimization (SMO). The classification phase adopts the X-means algorithm to find the optimal number of clusters dynamically. The contribution of the paper is twofold. First, it performs activity recognition using features based on a small number of informative postures, extracted independently from each activity instance; second, it assesses the minimum number of frames needed for adequate classification. The system is evaluated on two publicly available datasets, the Cornell Activity Dataset (CAD-60) and the Telecommunication Systems Team (TST) Fall detection dataset. The number of clusters needed to model each instance ranges from two to four. The proposed approach reaches excellent performance using only about 4 s of input data (~100 frames) and outperforms the state of the art on the CAD-60 dataset when it uses approximately 500 frames. The results are promising for testing in real-world contexts.
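
The idea of describing one activity instance with two to four posture clusters can be illustrated with a minimal sketch. This is not the authors' implementation and it is not X-means proper: as a simplifying assumption, it runs plain k-means for each candidate cluster count and keeps the count with the best BIC-style score (spherical Gaussian model). The function names (`kmeans`, `bic_score`, `choose_posture_count`), the tuple-based frame representation, and the default range of 2 to 4 are all hypothetical choices made for illustration.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct frames as initial centers
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    # Recompute labels so they always match the returned centers.
    labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
              for p in points]
    return centers, labels


def bic_score(points, centers, labels):
    """BIC-style score under a shared spherical Gaussian; higher is better."""
    n, d, k = len(points), len(points[0]), len(centers)
    sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    var = max(sse / (max(n - k, 1) * d), 1e-12)  # per-dimension variance
    counts = [labels.count(j) for j in range(k)]
    loglik = sum(c * math.log(c / n) for c in counts if c > 0)
    loglik -= 0.5 * n * d * math.log(2 * math.pi * var)
    loglik -= 0.5 * sse / var
    n_params = k * (d + 1)  # (k - 1) priors + k * d means + 1 variance
    return loglik - 0.5 * n_params * math.log(n)


def choose_posture_count(points, k_min=2, k_max=4):
    """Return (k, centers) for the candidate k with the best score."""
    best = None
    for k in range(k_min, k_max + 1):
        centers, labels = kmeans(points, k)
        score = bic_score(points, centers, labels)
        if best is None or score > best[0]:
            best = (score, k, centers)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic stand-in for skeleton features: 3 postures, 30 frames each.
    rng = random.Random(1)
    frames = [(cx + rng.gauss(0, 0.05), cy + rng.gauss(0, 0.05))
              for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]
    k, centers = choose_posture_count(frames)
    print(k, centers)
```

In a pipeline like the one the abstract describes, the resulting cluster centers would presumably feed a feature vector for the SVM model trained for that cluster count; that step is outside this sketch.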
