Abdellatef Essam, Al-Makhlasawy Rasha M, Shalaby Wafaa A
Department of Electrical Engineering, Faculty of Engineering, Sinai University, El-Arish, 45511, Egypt.
Electronics Research Institute, Joseph Tito St, El Nozha, P.O. Box: 12622, Cairo, Cairo, Egypt.
Sci Rep. 2025 Feb 27;15(1):7004. doi: 10.1038/s41598-025-90307-6.
Human Activity Recognition (HAR) plays a critical role in fields such as healthcare, sports, and human-computer interaction. However, achieving high accuracy and robustness remains a challenge, particularly with noisy sensor data from accelerometers and gyroscopes. This paper introduces HARCNN, a novel approach that uses Convolutional Neural Networks (CNNs) to extract hierarchical spatial and temporal features from raw sensor data and thereby improve activity recognition performance. The HARCNN model comprises 10 convolutional blocks, referred to as "ConvBlk," each integrating a convolutional layer, a ReLU activation function, and a batch normalization layer. The outputs of three block pairs ("ConvBlk_3 and ConvBlk_4," "ConvBlk_6 and ConvBlk_7," and "ConvBlk_9 and ConvBlk_10") are fused by depth concatenation, and the concatenated outputs are then passed through a 2 × 2 max-pooling layer with a stride of 2. The HARCNN framework is evaluated using accuracy, precision, sensitivity, and F-score, which reflect the model's ability to correctly classify and differentiate between human activities, and its performance is compared with traditional pre-trained CNNs and other state-of-the-art techniques. Through its feature extraction and optimized learning strategies, the proposed model achieves accuracies of 97.87%, 99.12%, 96.58%, and 98.51% on the UCI-HAR, KU-HAR, WISDM, and HMDB51 datasets, respectively. The comparison underscores the model's robustness, highlighting reductions in false positives and false negatives, which are crucial for real-world applications where reliable predictions are essential. Experiments were conducted with window sizes of 50 ms, 100 ms, 200 ms, 500 ms, 1 s, and 2 s. The results indicate that the proposed method achieves high accuracy and reliability across these window sizes, adapting to varying temporal granularities without significant loss of performance, which makes it well suited for deployment in diverse HAR scenarios. Notably, the best results were obtained with a window size of 200 ms.
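To make the window sizes concrete, the following minimal sketch converts each duration into a sample count and cuts a signal into fixed-length windows. It is an illustration only, not the authors' preprocessing: the 100 Hz sampling rate, the non-overlapping step, and the helper names `window_length_in_samples` and `sliding_windows` are all assumptions introduced here.

```python
def window_length_in_samples(window_ms, sampling_rate_hz):
    """Convert a window duration in milliseconds to a whole number of samples."""
    return max(1, round(window_ms * sampling_rate_hz / 1000))


def sliding_windows(signal, window_len, step):
    """Split a list of samples into fixed-length windows, dropping any short tail."""
    return [
        signal[start:start + window_len]
        for start in range(0, len(signal) - window_len + 1, step)
    ]


# Assumed sampling rate, for illustration only; the abstract does not state one.
SAMPLING_RATE_HZ = 100
# Window sizes listed in the abstract, expressed in milliseconds.
WINDOW_SIZES_MS = [50, 100, 200, 500, 1000, 2000]

samples_per_window = {
    ms: window_length_in_samples(ms, SAMPLING_RATE_HZ) for ms in WINDOW_SIZES_MS
}

# One second of a synthetic single-axis signal at the assumed rate.
signal = list(range(SAMPLING_RATE_HZ))

# Non-overlapping 200 ms windows (an assumed step; overlap is a design choice).
window_len = samples_per_window[200]
windows = sliding_windows(signal, window_len, step=window_len)
```

Under these assumptions a 200 ms window holds 20 samples, so one second of data yields five windows. A shorter step would produce overlapping windows and more training examples from the same recording.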
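The fusion step described in the abstract (depth concatenation of a block pair followed by 2 × 2 max pooling with a stride of 2) can be sketched with plain nested lists. This is a shape-level illustration under stated assumptions, not the HARCNN implementation: the channel counts, the 4 × 4 spatial size, and the helper names `depth_concat`, `max_pool_2x2`, and `make_channel` are hypothetical.

```python
def depth_concat(a, b):
    """Concatenate two feature maps along the channel axis.

    A feature map is a list of channels; each channel is a list of rows.
    Both inputs must share the same spatial size.
    """
    if (len(a[0]), len(a[0][0])) != (len(b[0]), len(b[0][0])):
        raise ValueError("spatial sizes must match for depth concatenation")
    return a + b


def max_pool_2x2(feature_map):
    """Apply 2 x 2 max pooling with a stride of 2 to every channel."""
    pooled = []
    for channel in feature_map:
        rows, cols = len(channel), len(channel[0])
        pooled.append([
            [
                max(channel[r][c], channel[r][c + 1],
                    channel[r + 1][c], channel[r + 1][c + 1])
                for c in range(0, cols - 1, 2)
            ]
            for r in range(0, rows - 1, 2)
        ])
    return pooled


def make_channel(offset, size=4):
    """Build a size x size channel filled with consecutive integers."""
    return [[offset + r * size + c for c in range(size)] for r in range(size)]


# Hypothetical outputs of two adjacent blocks: 2 and 3 channels, each 4 x 4.
block_a = [make_channel(0), make_channel(100)]
block_b = [make_channel(200), make_channel(300), make_channel(400)]

fused = depth_concat(block_a, block_b)   # 5 channels, each 4 x 4
pooled = max_pool_2x2(fused)             # 5 channels, each 2 x 2
```

Depth concatenation adds the channel counts of the two blocks and leaves the spatial size unchanged, while the pooling step halves each spatial dimension and leaves the channel count unchanged.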