Escuela de Ingeniería Eléctrica, Pontificia Universidad Católica de Valparaíso, Av. Brasil 2147, Valparaíso 2362804, Chile.
School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK.
Sensors (Basel). 2023 Jan 26;23(3):1400. doi: 10.3390/s23031400.
Recently, the scientific community has placed great emphasis on human activity recognition, especially in health and elderly care. Practical applications already exist that recognize activities and unusual conditions using body-worn sensors such as wrist-worn devices or neck pendants. These relatively simple devices may be prone to errors, may be uncomfortable to wear, may be forgotten or left unworn, and cannot detect more subtle conditions such as incorrect postures. Other proposed methods therefore rely on images and videos to perform human activity recognition, even in open spaces and with multiple people. However, the greater size and complexity of image data call for recent machine learning and deep learning techniques. This paper presents an approach based on deep learning with attention for recognizing activities from multiple frames. Features are extracted by estimating the pose of the human skeleton, and classification is performed by a neural network based on Bidirectional Encoder Representations from Transformers (BERT). The algorithm was trained on the public UP-Fall dataset, using artificial data generated with a Generative Adversarial Network (GAN) to obtain a more balanced training set, and was evaluated on real data, where it outperformed other activity recognition methods that use the same dataset.
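The pipeline described above (per-frame skeleton features, attention across multiple frames, then classification) can be illustrated with a minimal sketch. This is not the authors' implementation. It is a single untrained self-attention layer in plain Python, without the positional embeddings, multiple heads, layer normalization, or stacked layers of a real BERT-style encoder. The feature size (17 keypoints with x and y coordinates), the window length, the number of classes, and all function names are assumptions chosen for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_params(feature_dim, model_dim, num_classes, seed=0):
    """Hypothetical parameter container for the sketch (untrained)."""
    rng = random.Random(seed)
    return {
        "wq": random_matrix(model_dim, feature_dim, rng),
        "wk": random_matrix(model_dim, feature_dim, rng),
        "wv": random_matrix(model_dim, feature_dim, rng),
        "w_out": random_matrix(num_classes, model_dim, rng),
    }


def self_attention(frames, wq, wk, wv):
    """Single-head scaled dot-product attention.

    Each frame attends to every frame in the window, earlier and later,
    which is the bidirectional property of an encoder.
    """
    queries = [matvec(wq, f) for f in frames]
    keys = [matvec(wk, f) for f in frames]
    values = [matvec(wv, f) for f in frames]
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return attended


def classify_window(frames, params):
    """Attend over the frames, mean-pool over time, return class probabilities."""
    attended = self_attention(frames, params["wq"], params["wk"], params["wv"])
    pooled = [sum(column) / len(attended) for column in zip(*attended)]
    return softmax(matvec(params["w_out"], pooled))


# Illustrative sizes (assumptions): 17 keypoints * (x, y) = 34 features per frame,
# a window of 8 consecutive frames, and 11 activity classes.
FEATURE_DIM, WINDOW, NUM_CLASSES = 34, 8, 11
demo_rng = random.Random(1)
window = [[demo_rng.random() for _ in range(FEATURE_DIM)] for _ in range(WINDOW)]
params = init_params(FEATURE_DIM, model_dim=16, num_classes=NUM_CLASSES)
probabilities = classify_window(window, params)
```

In this sketch `window` stands in for the output of a pose estimator applied to consecutive video frames. With trained weights, the index of the largest entry in `probabilities` would be the predicted activity; with the random weights used here, the output is only a valid probability distribution.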