Electrical Engineering Department, École de Technologies Supérieure, Montreal, QC H3C 1K3, Canada.
Sensors (Basel). 2020 Sep 1;20(17):4946. doi: 10.3390/s20174946.
In human action recognition, existing work mainly relies on RGB, depth, skeleton and infrared data. While these methods have the benefit of being non-invasive, they can only be used within limited setups, are prone to issues such as occlusion and often need substantial computational resources. In this work, we address human action recognition through inertial sensor signals, which have many practical applications in fields such as sports analysis and human-machine interfaces. To that end, we propose a new learning framework built around a 1D-CNN architecture, which we validate by achieving very competitive results on the publicly available UTD-MHAD dataset. Moreover, the proposed method provides some answers to two of the greatest challenges currently faced by action recognition algorithms: (1) the recognition of high-level activities and (2) the reduction of their computational cost so that they can run on embedded devices. Finally, this paper also investigates the tractability of the features throughout the proposed framework, both in time and duration, as we believe it could play an important role in future work to make the solution more intelligible, hardware-friendly and accurate.
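To make the idea of a 1D-CNN over inertial signals concrete, the following is a minimal NumPy sketch of a forward pass, not the architecture proposed in the paper. Everything in it is an assumption for illustration: the window of 6 channels by 128 samples, the two convolutional layers, their filter counts and kernel sizes, the global average pooling, the 27 output classes and the random, untrained weights. The names `conv1d`, `init_params` and `forward` are hypothetical.

```python
import numpy as np


def conv1d(x, w, b):
    """Valid 1D convolution (cross-correlation) along the time axis.

    x: (c_in, T) signal, w: (c_out, c_in, K) filters, b: (c_out,) biases.
    Returns an array of shape (c_out, T - K + 1).
    """
    k = w.shape[2]
    # windows has shape (c_in, T - K + 1, K): one length-K slice per time step.
    windows = np.lib.stride_tricks.sliding_window_view(x, k, axis=1)
    return np.einsum("itk,oik->ot", windows, w) + b[:, None]


def init_params(rng, c_in=6, n_classes=27):
    """Random, untrained weights; layer sizes are illustrative assumptions."""
    return {
        "w1": rng.normal(0.0, 0.1, size=(16, c_in, 5)),
        "b1": np.zeros(16),
        "w2": rng.normal(0.0, 0.1, size=(32, 16, 5)),
        "b2": np.zeros(32),
        "w_out": rng.normal(0.0, 0.1, size=(n_classes, 32)),
        "b_out": np.zeros(n_classes),
    }


def forward(x, params):
    """Two conv + ReLU layers, global average pooling over time, linear head."""
    h = np.maximum(conv1d(x, params["w1"], params["b1"]), 0.0)
    h = np.maximum(conv1d(h, params["w2"], params["b2"]), 0.0)
    pooled = h.mean(axis=1)  # (32,): one value per feature channel
    return params["w_out"] @ pooled + params["b_out"]  # class logits


rng = np.random.default_rng(0)
params = init_params(rng)
# Hypothetical window: 6 inertial channels (assumed 3-axis accelerometer
# plus 3-axis gyroscope) over 128 time samples.
window = rng.normal(size=(6, 128))
logits = forward(window, params)
predicted_class = int(np.argmax(logits))
```

Because the filters slide only along the time axis, the parameter count depends on the kernel length and channel counts rather than on the window length, which is one reason 1D convolutions are often considered for embedded targets.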