

Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos

Publication Info

IEEE Trans Pattern Anal Mach Intell. 2018 May;40(5):1045-1058. doi: 10.1109/TPAMI.2017.2691321. Epub 2017 Apr 5.

DOI: 10.1109/TPAMI.2017.2691321
PMID: 28391189
Abstract

Single modality action recognition on RGB or depth sequences has been extensively explored recently. It is generally accepted that each of these two modalities has different strengths and limitations for the task of action recognition. Therefore, analysis of the RGB+D videos can help us to better study the complementary properties of these two types of modalities and achieve higher levels of performance. In this paper, we propose a new deep autoencoder based shared-specific feature factorization network to separate input multimodal signals into a hierarchy of components. Further, based on the structure of the features, a structured sparsity learning machine is proposed which utilizes mixed norms to apply regularization within components and group selection between them for better classification performance. Our experimental results show the effectiveness of our cross-modality feature analysis framework by achieving state-of-the-art accuracy for action classification on five challenging benchmark datasets.
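The structured sparsity learning machine described above combines two mixed-norm effects: regularization within each factorized component and group selection between components. The between-group behaviour corresponds to the well-known l2,1 (group-lasso) penalty, whose proximal operator is block soft-thresholding: small-norm groups are zeroed out entirely while surviving groups are uniformly shrunk. Below is a minimal NumPy sketch of that proximal step — an illustration of the general technique, not the authors' implementation; the function name and group encoding are hypothetical.

```python
import numpy as np

def prox_group_l21(w, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 (the l2,1 mixed norm).

    Each group of indices in `groups` is block soft-thresholded:
    groups whose l2 norm is below `lam` are set to zero (group
    selection), others are scaled toward zero (within-group shrinkage).
    """
    w = w.astype(float).copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        # Scale factor max(0, 1 - lam/||w_g||); zero when the group is weak.
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        w[g] = scale * w[g]
    return w

# Example: two feature "components" of three dimensions each.
w = np.array([0.1, 0.1, 0.1, 3.0, 4.0, 0.0])
out = prox_group_l21(w, [[0, 1, 2], [3, 4, 5]], lam=0.5)
# First (weak) component is eliminated; second is shrunk uniformly.
```

Applied inside a proximal-gradient loop over the factorized shared/specific components, this single step realizes the "group selection between them" that the abstract attributes to the mixed-norm regularizer.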


Similar Articles

1. Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos.
IEEE Trans Pattern Anal Mach Intell. 2018 May;40(5):1045-1058. doi: 10.1109/TPAMI.2017.2691321. Epub 2017 Apr 5.
2. Discriminative Relational Representation Learning for RGB-D Action Recognition.
IEEE Trans Image Process. 2016 Jun;25(6):2856-2865. doi: 10.1109/TIP.2016.2556940. Epub 2016 Apr 20.
3. Deep Multimodal Fusion Autoencoder for Saliency Prediction of RGB-D Images.
Comput Intell Neurosci. 2021 May 5;2021:6610997. doi: 10.1155/2021/6610997. eCollection 2021.
4. Multimodal Multipart Learning for Action Recognition in Depth Videos.
IEEE Trans Pattern Anal Mach Intell. 2016 Oct;38(10):2123-9. doi: 10.1109/TPAMI.2015.2505295. Epub 2015 Dec 3.
5. MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos.
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3522-3538. doi: 10.1109/TPAMI.2022.3177813. Epub 2023 Feb 3.
6. Multiple/single-view human action recognition via part-induced multitask structural learning.
IEEE Trans Cybern. 2015 Jun;45(6):1194-208. doi: 10.1109/TCYB.2014.2347057. Epub 2014 Aug 27.
7. Unsupervised Joint Feature Learning and Encoding for RGB-D Scene Labeling.
IEEE Trans Image Process. 2015 Nov;24(11):4459-73. doi: 10.1109/TIP.2015.2465133. Epub 2015 Aug 11.
8. RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory.
Sensors (Basel). 2019 Jan 27;19(3):529. doi: 10.3390/s19030529.
9. Robust action recognition via borrowing information across video modalities.
IEEE Trans Image Process. 2015 Feb;24(2):709-23. doi: 10.1109/TIP.2014.2385591. Epub 2014 Dec 23.
10. Learning with Privileged Information via Adversarial Discriminative Modality Distillation.
IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2581-2593. doi: 10.1109/TPAMI.2019.2929038. Epub 2019 Jul 16.

Cited By

1. A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities.
Sensors (Basel). 2025 Jun 27;25(13):4028. doi: 10.3390/s25134028.
2. Human Action Recognition: A Taxonomy-Based Survey, Updates, and Opportunities.
Sensors (Basel). 2023 Feb 15;23(4):2182. doi: 10.3390/s23042182.
3. Dance Movement Recognition Based on Multimodal Environmental Monitoring Data.
J Environ Public Health. 2022 Jul 19;2022:1568930. doi: 10.1155/2022/1568930. eCollection 2022.
4. A union of deep learning and swarm-based optimization for 3D human action recognition.
Sci Rep. 2022 Mar 31;12(1):5494. doi: 10.1038/s41598-022-09293-8.
5. Action Recognition Using Close-Up of Maximum Activation and ETRI-Activity3D LivingLab Dataset.
Sensors (Basel). 2021 Oct 12;21(20):6774. doi: 10.3390/s21206774.
6. RGB-D Data-Based Action Recognition: A Review.
Sensors (Basel). 2021 Jun 21;21(12):4246. doi: 10.3390/s21124246.
7. Complex Human Action Recognition Using a Hierarchical Feature Reduction and Deep Learning-Based Method.
SN Comput Sci. 2021;2(2):94. doi: 10.1007/s42979-021-00484-0. Epub 2021 Feb 13.
8. Activity Recognition for Ambient Assisted Living with Videos, Inertial Units and Ambient Sensors.
Sensors (Basel). 2021 Jan 24;21(3):768. doi: 10.3390/s21030768.
9. A Hybrid Network for Large-Scale Action Recognition from RGB and Depth Modalities.
Sensors (Basel). 2020 Jun 10;20(11):3305. doi: 10.3390/s20113305.
10. C-MHAD: Continuous Multimodal Human Action Dataset of Simultaneous Video and Inertial Sensing.
Sensors (Basel). 2020 May 20;20(10):2905. doi: 10.3390/s20102905.