
Enhancing Human Activity Recognition through Integrated Multimodal Analysis: A Focus on RGB Imaging, Skeletal Tracking, and Pose Estimation.

Affiliations

Department of Creative Technologies, Air University, Islamabad 44000, Pakistan.

Department of Biomedical Engineering, College of IT Convergence, Gachon University, 1342 Seongnamdaero, Sujeong-gu, Seongnam-si 13120, Republic of Korea.

Publication

Sensors (Basel). 2024 Jul 17;24(14):4646. doi: 10.3390/s24144646.

DOI: 10.3390/s24144646
PMID: 39066043
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11280841/
Abstract

Human activity recognition (HAR) is pivotal in advancing applications ranging from healthcare monitoring to interactive gaming. Traditional HAR systems, primarily relying on single data sources, face limitations in capturing the full spectrum of human activities. This study introduces a comprehensive approach to HAR by integrating two critical modalities: RGB imaging and advanced pose estimation features. Our methodology leverages the strengths of each modality to overcome the drawbacks of unimodal systems, providing a richer and more accurate representation of activities. We propose a two-stream network that processes skeletal and RGB data in parallel, enhanced by pose estimation techniques for refined feature extraction. The integration of these modalities is facilitated through advanced fusion algorithms, significantly improving recognition accuracy. Extensive experiments conducted on the UTD multimodal human action dataset (UTD MHAD) demonstrate that the proposed approach exceeds the performance of existing state-of-the-art algorithms, yielding improved outcomes. This study not only sets a new benchmark for HAR systems but also highlights the importance of feature engineering in capturing the complexity of human movements and the integration of optimal features. Our findings pave the way for more sophisticated, reliable, and applicable HAR systems in real-world scenarios.
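The abstract describes a two-stream design: skeletal and RGB data are processed in parallel and then combined by a fusion step before classification. A minimal NumPy sketch of that late-fusion idea is below. This is not the paper's actual architecture: the feature extractors are trivial stand-ins for the CNN/pose streams, and all function names, dimensions, and the random linear classifier are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def rgb_features(clip):
    """Stand-in for the RGB stream: average-pool a (T, H, W, 3) clip
    into a 4x4 spatial grid per channel -> 48-dim feature vector."""
    T, H, W, C = clip.shape
    pooled = clip.reshape(T, 4, H // 4, 4, W // 4, C).mean(axis=(0, 2, 4))
    return pooled.ravel()

def skeleton_features(joints):
    """Stand-in for the pose/skeleton stream: per-joint mean position
    plus mean frame-to-frame motion over a (T, J, 3) joint sequence."""
    mean_pos = joints.mean(axis=0)                          # (J, 3)
    motion = np.abs(np.diff(joints, axis=0)).mean(axis=0)   # (J, 3)
    return np.concatenate([mean_pos.ravel(), motion.ravel()])

def fuse_and_score(clip, joints, W_fuse, b):
    """Late fusion: concatenate both streams' features, then apply a
    linear classifier and softmax over activity classes."""
    z = np.concatenate([rgb_features(clip), skeleton_features(joints)])
    logits = W_fuse @ z + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Toy inputs: a 16-frame 32x32 RGB clip and a 20-joint skeleton sequence.
clip = rng.random((16, 32, 32, 3))
joints = rng.random((16, 20, 3))
n_classes = 27                       # UTD-MHAD covers 27 actions
feat_dim = 48 + 20 * 3 * 2           # RGB features + skeleton features
W_fuse = rng.standard_normal((n_classes, feat_dim)) * 0.01
b = np.zeros(n_classes)

probs = fuse_and_score(clip, joints, W_fuse, b)
print(probs.shape)  # (27,) — one probability per action class
```

Concatenation followed by a shared classifier is only one fusion strategy; the point of the sketch is the data flow, with each modality contributing its own feature vector before a single decision is made.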


[Figures 1–11 (sensors-24-04646-g001 through g011) are available in the PMC version of the article, PMC11280841.]

Similar Articles

1. Enhancing Human Activity Recognition through Integrated Multimodal Analysis: A Focus on RGB Imaging, Skeletal Tracking, and Pose Estimation.
   Sensors (Basel). 2024 Jul 17;24(14):4646. doi: 10.3390/s24144646.
2. MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos.
   IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3522-3538. doi: 10.1109/TPAMI.2022.3177813. Epub 2023 Feb 3.
3. A union of deep learning and swarm-based optimization for 3D human action recognition.
   Sci Rep. 2022 Mar 31;12(1):5494. doi: 10.1038/s41598-022-09293-8.
4. Dynamic Edge Convolutional Neural Network for Skeleton-Based Human Action Recognition.
   Sensors (Basel). 2023 Jan 10;23(2):778. doi: 10.3390/s23020778.
5. Deep Wavelet Convolutional Neural Networks for Multimodal Human Activity Recognition Using Wearable Inertial Sensors.
   Sensors (Basel). 2023 Dec 9;23(24):9721. doi: 10.3390/s23249721.
6. An improved human activity recognition technique based on convolutional neural network.
   Sci Rep. 2023 Dec 19;13(1):22581. doi: 10.1038/s41598-023-49739-1.
7. Human Action Recognition From Various Data Modalities: A Review.
   IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3200-3225. doi: 10.1109/TPAMI.2022.3183112. Epub 2023 Feb 3.
8. A Hierarchical Learning Approach for Human Action Recognition.
   Sensors (Basel). 2020 Sep 1;20(17):4946. doi: 10.3390/s20174946.
9. Learning a Tracking and Estimation Integrated Graphical Model for Human Pose Tracking.
   IEEE Trans Neural Netw Learn Syst. 2015 Dec;26(12):3176-86. doi: 10.1109/TNNLS.2015.2411287. Epub 2015 Mar 27.
10. Ambient intelligence-based multimodal human action recognition for autonomous systems.
   ISA Trans. 2023 Jan;132:94-108. doi: 10.1016/j.isatra.2022.10.034. Epub 2022 Nov 1.

Cited By

1. A Sliding Window-Based CNN-BiGRU Approach for Human Skeletal Pose Estimation Using mmWave Radar.
   Sensors (Basel). 2025 Feb 11;25(4):1070. doi: 10.3390/s25041070.
2. A Two-Stream Method for Human Action Recognition Using Facial Action Cues.
   Sensors (Basel). 2024 Oct 23;24(21):6817. doi: 10.3390/s24216817.

References

1. A joint multi-modal learning method for early-stage knee osteoarthritis disease classification.
   Heliyon. 2023 Apr 13;9(4):e15461. doi: 10.1016/j.heliyon.2023.e15461. eCollection 2023 Apr.
2. Skeleton-Based Fall Detection with Multiple Inertial Sensors Using Spatial-Temporal Graph Convolutional Networks.
   Sensors (Basel). 2023 Feb 14;23(4):2153. doi: 10.3390/s23042153.
3. Multi-scale and attention enhanced graph convolution network for skeleton-based violence action recognition.
   Front Neurorobot. 2022 Dec 15;16:1091361. doi: 10.3389/fnbot.2022.1091361. eCollection 2022.
4. A Parallel Multi-Modal Factorized Bilinear Pooling Fusion Method Based on the Semi-Tensor Product for Emotion Recognition.
   Entropy (Basel). 2022 Dec 16;24(12):1836. doi: 10.3390/e24121836.
5. Multimodal Feature Fusion Method for Unbalanced Sample Data in Social Network Public Opinion.
   Sensors (Basel). 2022 Jul 25;22(15):5528. doi: 10.3390/s22155528.
6. BEMD-3DCNN-based method for COVID-19 detection.
   Comput Biol Med. 2022 Mar;142:105188. doi: 10.1016/j.compbiomed.2021.105188. Epub 2021 Dec 30.
7. Surgical workflow recognition with 3DCNN for Sleeve Gastrectomy.
   Int J Comput Assist Radiol Surg. 2021 Nov;16(11):2029-2036. doi: 10.1007/s11548-021-02473-3. Epub 2021 Aug 20.
8. AR3D: Attention Residual 3D Network for Human Action Recognition.
   Sensors (Basel). 2021 Feb 28;21(5):1656. doi: 10.3390/s21051656.
9. Compressing 3DCNNs based on tensor train decomposition.
   Neural Netw. 2020 Nov;131:215-230. doi: 10.1016/j.neunet.2020.07.028. Epub 2020 Aug 7.
10. A Unified Deep Framework for Joint 3D Pose Estimation and Action Recognition from a Single RGB Camera.
   Sensors (Basel). 2020 Mar 25;20(7):1825. doi: 10.3390/s20071825.