Enhancing Human Activity Recognition through Integrated Multimodal Analysis: A Focus on RGB Imaging, Skeletal Tracking, and Pose Estimation.

Affiliations

Department of Creative Technologies, Air University, Islamabad 44000, Pakistan.

Department of Biomedical Engineering, College of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Republic of Korea.

Publication information

Sensors (Basel). 2024 Jul 17;24(14):4646. doi: 10.3390/s24144646.

Abstract

Human activity recognition (HAR) is pivotal in advancing applications ranging from healthcare monitoring to interactive gaming. Traditional HAR systems, primarily relying on single data sources, face limitations in capturing the full spectrum of human activities. This study introduces a comprehensive approach to HAR by integrating two critical modalities: RGB imaging and advanced pose estimation features. Our methodology leverages the strengths of each modality to overcome the drawbacks of unimodal systems, providing a richer and more accurate representation of activities. We propose a two-stream network that processes skeletal and RGB data in parallel, enhanced by pose estimation techniques for refined feature extraction. The integration of these modalities is facilitated through advanced fusion algorithms, significantly improving recognition accuracy. Extensive experiments conducted on the UTD multimodal human action dataset (UTD MHAD) demonstrate that the proposed approach exceeds the performance of existing state-of-the-art algorithms, yielding improved outcomes. This study not only sets a new benchmark for HAR systems but also highlights the importance of feature engineering in capturing the complexity of human movements and the integration of optimal features. Our findings pave the way for more sophisticated, reliable, and applicable HAR systems in real-world scenarios.
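The abstract does not say which fusion algorithm combines the two streams. As a rough illustration of the general idea only, the sketch below shows score-level (late) fusion: each stream produces per-class scores, the scores are turned into probabilities, and a weighted average picks the final class. The function names, the `rgb_weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(rgb_logits, skeleton_logits, rgb_weight=0.5):
    """Weighted average of the two streams' class probabilities.

    rgb_weight is an assumed tuning knob: 1.0 trusts only the RGB stream,
    0.0 trusts only the skeleton stream.
    """
    if len(rgb_logits) != len(skeleton_logits):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    p_rgb = softmax(rgb_logits)
    p_skeleton = softmax(skeleton_logits)
    return [
        rgb_weight * a + (1.0 - rgb_weight) * b
        for a, b in zip(p_rgb, p_skeleton)
    ]


def predict(fused_probs):
    """Return the index of the most probable class."""
    return max(range(len(fused_probs)), key=fused_probs.__getitem__)


if __name__ == "__main__":
    # Made-up scores for three activity classes, one list per stream.
    fused = late_fusion([2.0, 0.5, 0.1], [0.2, 0.3, 3.0], rgb_weight=0.5)
    print(fused, predict(fused))
```

Averaging probabilities rather than raw scores keeps the two streams on a comparable scale. A learned fusion layer over intermediate features is another common design, and the paper may well use something of that kind.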

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d9fd/11280841/9691eba33440/sensors-24-04646-g001.jpg