
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2684-2701. doi: 10.1109/TPAMI.2019.2916873. Epub 2019 May 14.

DOI: 10.1109/TPAMI.2019.2916873
PMID: 31095476
Abstract

Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, realistic number of distinct class categories, diversity in camera views, varied environmental conditions, and variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames. This dataset contains 120 different action classes including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset, and show the advantage of applying deep learning methods for 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework is proposed for this task, which yields promising results for recognition of the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding.
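As an illustration of how samples in this dataset family are indexed, the publicly documented NTU RGB+D naming convention encodes the collection setup, camera, performer (subject), replication, and action class directly in each sample's file name (e.g. `S018C002P045R001A100`). The helper below is a minimal sketch written for this page, not code from the paper; the function name `parse_ntu_name` and the returned field names are illustrative choices:

```python
import re

# Pattern for NTU RGB+D sample names such as "S018C002P045R001A100":
# S = collection setup, C = camera ID, P = performer (subject) ID,
# R = replication number, A = action class label.
_NTU_NAME = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")

def parse_ntu_name(name: str) -> dict:
    """Parse an NTU RGB+D-style sample name into its metadata fields."""
    m = _NTU_NAME.match(name)
    if m is None:
        raise ValueError(f"not an NTU-style sample name: {name!r}")
    setup, camera, performer, replication, action = map(int, m.groups())
    return {
        "setup": setup,            # recording setup (location/height/distance)
        "camera": camera,          # which of the concurrent camera views
        "performer": performer,    # subject ID (used for cross-subject splits)
        "replication": replication,
        "action": action,          # 1..120 action class label
    }
```

Because the subject ID is recoverable from the name alone, evaluation splits such as cross-subject protocols can be built by filtering on the `performer` field without opening any data files.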


Similar Articles

1
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding.
IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2684-2701. doi: 10.1109/TPAMI.2019.2916873. Epub 2019 May 14.
2
Multipe/single-view human action recognition via part-induced multitask structural learning.
IEEE Trans Cybern. 2015 Jun;45(6):1194-208. doi: 10.1109/TCYB.2014.2347057. Epub 2014 Aug 27.
3
Desktop Action Recognition From First-Person Point-of-View.
IEEE Trans Cybern. 2019 May;49(5):1616-1628. doi: 10.1109/TCYB.2018.2806381. Epub 2018 Feb 27.
4
Learning a Deep Model for Human Action Recognition from Novel Viewpoints.
IEEE Trans Pattern Anal Mach Intell. 2018 Mar;40(3):667-681. doi: 10.1109/TPAMI.2017.2691768. Epub 2017 Apr 6.
5
MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos.
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3522-3538. doi: 10.1109/TPAMI.2022.3177813. Epub 2023 Feb 3.
6
A general framework for tracking multiple people from a moving camera.
IEEE Trans Pattern Anal Mach Intell. 2013 Jul;35(7):1577-91. doi: 10.1109/TPAMI.2012.248.
7
Explicit modeling of human-object interactions in realistic videos.
IEEE Trans Pattern Anal Mach Intell. 2013 Apr;35(4):835-48. doi: 10.1109/TPAMI.2012.175.
8
Deeply Learned View-Invariant Features for Cross-View Action Recognition.
IEEE Trans Image Process. 2017 Jun;26(6):3028-3037. doi: 10.1109/TIP.2017.2696786. Epub 2017 Apr 24.
9
Discovering motion primitives for unsupervised grouping and one-shot learning of human actions, gestures, and expressions.
IEEE Trans Pattern Anal Mach Intell. 2013 Jul;35(7):1635-48. doi: 10.1109/TPAMI.2012.253.
10
Multiview Semantic Representation for Visual Recognition.
IEEE Trans Cybern. 2020 May;50(5):2038-2049. doi: 10.1109/TCYB.2018.2875728. Epub 2018 Nov 6.

Cited By

1
AI-Driven Tai Chi mastery using deep learning framework for movement assessment and personalized training.
Sci Rep. 2025 Aug 28;15(1):31700. doi: 10.1038/s41598-025-17187-8.
2
A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities.
Sensors (Basel). 2025 Jun 27;25(13):4028. doi: 10.3390/s25134028.
3
Gait Recognition via Enhanced Visual-Audio Ensemble Learning with Decision Support Methods.
Sensors (Basel). 2025 Jun 18;25(12):3794. doi: 10.3390/s25123794.
4
A Structured and Methodological Review on Multi-View Human Activity Recognition for Ambient Assisted Living.
J Imaging. 2025 Jun 3;11(6):182. doi: 10.3390/jimaging11060182.
5
Semantics-Assisted Training Graph Convolution Network for Skeleton-Based Action Recognition.
Sensors (Basel). 2025 Mar 15;25(6):1841. doi: 10.3390/s25061841.
6
Machine Learning for Human Activity Recognition: State-of-the-Art Techniques and Emerging Trends.
J Imaging. 2025 Mar 20;11(3):91. doi: 10.3390/jimaging11030091.
7
Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition.
Sensors (Basel). 2025 Feb 28;25(5):1521. doi: 10.3390/s25051521.
8
Two-stream spatio-temporal GCN-transformer networks for skeleton-based action recognition.
Sci Rep. 2025 Feb 10;15(1):4982. doi: 10.1038/s41598-025-87752-8.
9
Safety After Dark: A Privacy Compliant and Real-Time Edge Computing Intelligent Video Analytics for Safer Public Transportation.
Sensors (Basel). 2024 Dec 19;24(24):8102. doi: 10.3390/s24248102.
10
Multi-Level Feature Fusion in CNN-Based Human Action Recognition: A Case Study on EfficientNet-B7.
J Imaging. 2024 Dec 12;10(12):320. doi: 10.3390/jimaging10120320.