School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA 30332, USA.
Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA.
Sensors (Basel). 2021 Dec 13;21(24):8337. doi: 10.3390/s21248337.
Supervised training of human activity recognition (HAR) systems based on body-worn inertial measurement units (IMUs) is often constrained by the small amounts of labeled sample data that are typically available. Systems such as IMUTube have been introduced that employ cross-modality transfer approaches to convert videos of activities of interest into virtual IMU data. We demonstrate for the first time how such large-scale virtual IMU datasets can be used to train HAR systems that are substantially more complex than the state of the art, where complexity is measured by the number of model parameters that can be trained robustly. Our models contain components dedicated to capturing the aspects of IMU data that are relevant for activity recognition, which increases the number of trainable parameters by a factor of 1100 compared to state-of-the-art model architectures. We evaluate the new model architecture on the challenging task of analyzing free-weight gym exercises, specifically on classifying 13 dumbbell exercises. We collected around 41 h of virtual IMU data by applying IMUTube to exercise videos available on YouTube. The proposed model is trained on this large amount of virtual IMU data and calibrated with a mere 36 min of real IMU data. Evaluated on a real IMU dataset, the trained model improves on state-of-the-art convolutional HAR models by 20% absolute F1 score.
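As a rough illustration of the figures quoted in the abstract, the following Python sketch works out the ratio of virtual to real training data (41 h versus 36 min) and shows what an absolute F1 gain means. The abstract does not say how F1 is averaged across the 13 classes, so the macro average used here is an assumption, and the function names and per-class counts are hypothetical, chosen only for illustration.

```python
def f1_score(tp, fp, fn):
    """F1 for one class from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(per_class_counts):
    """Unweighted mean of per-class F1 (assumed averaging; not stated in the abstract)."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


def absolute_gain(baseline_f1, new_f1):
    """Absolute improvement: the plain difference between two F1 scores."""
    return new_f1 - baseline_f1


# Data volumes as stated in the abstract.
VIRTUAL_HOURS = 41
REAL_MINUTES = 36
virtual_to_real_ratio = VIRTUAL_HOURS * 60 / REAL_MINUTES  # roughly 68x more virtual data

if __name__ == "__main__":
    print(f"virtual : real data ratio = {virtual_to_real_ratio:.1f}")

    # Hypothetical (tp, fp, fn) counts for two classes, for illustration only.
    counts = [(8, 2, 2), (5, 5, 5)]
    print(f"macro F1 on toy counts = {macro_f1(counts):.2f}")

    # Hypothetical scores: an absolute gain of 0.20 means the score itself rises by 0.20.
    print(f"absolute gain = {absolute_gain(0.50, 0.70):.2f}")
```

An absolute gain is a difference in the score itself, not a percentage of the baseline. With the hypothetical values above, moving from 0.50 to 0.70 is a 0.20 absolute gain but a 40% relative one.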