A Hybrid Network for Large-Scale Action Recognition from RGB and Depth Modalities.

Affiliations

School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.

Advanced Multimedia Research Lab, University of Wollongong, NSW 2522, Australia.

Publication information

Sensors (Basel). 2020 Jun 10;20(11):3305. doi: 10.3390/s20113305.

DOI:10.3390/s20113305
PMID:32532007
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7308905/
Abstract

The paper presents a novel hybrid network for large-scale action recognition from multiple modalities. The network is built upon the proposed weighted dynamic images. It effectively leverages the strengths of the emerging Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches to specifically address the challenges that occur in large-scale action recognition and are not fully dealt with by the state-of-the-art methods. Specifically, the proposed hybrid network consists of a CNN based component and an RNN based component. Features extracted by the two components are fused through canonical correlation analysis and then fed to a linear Support Vector Machine (SVM) for classification. The proposed network achieved state-of-the-art results on the ChaLearn LAP IsoGD, NTU RGB+D and Multi-modal & Multi-view & Interactive (M²I) datasets and outperformed existing methods by a large margin (over 10 percentage points in some cases).
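The fusion step named in the abstract can be illustrated with a minimal NumPy sketch of canonical correlation analysis (CCA). This is not the authors' implementation: the abstract does not say how the projected features are combined, so concatenating the two projections is an assumption, as are the function names `cca_fuse` and `_inv_sqrt`, the regularisation constant, the feature dimensions, and the synthetic data standing in for CNN and RNN features. The linear SVM that would consume the fused features is left out.

```python
import numpy as np


def _inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def cca_fuse(x, y, k, reg=1e-3):
    """Project x and y onto their top-k canonical directions and concatenate.

    x: (n_samples, dx) features from one component (hypothetically the CNN).
    y: (n_samples, dy) features from the other (hypothetically the RNN).
    Returns the fused (n_samples, 2k) features and the k canonical correlations.
    """
    n = x.shape[0]
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Regularised covariance and cross-covariance estimates.
    sxx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    syy = yc.T @ yc / (n - 1) + reg * np.eye(y.shape[1])
    sxy = xc.T @ yc / (n - 1)
    sxx_is, syy_is = _inv_sqrt(sxx), _inv_sqrt(syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    u, s, vt = np.linalg.svd(sxx_is @ sxy @ syy_is)
    wx = sxx_is @ u[:, :k]
    wy = syy_is @ vt.T[:, :k]
    # Assumed fusion rule: concatenate the two projected views.
    fused = np.hstack([xc @ wx, yc @ wy])
    return fused, s[:k]


# Synthetic stand-ins for the two feature sets, sharing a 2-D latent signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
cnn_feats = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(200, 6))
rnn_feats = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))

fused, corrs = cca_fuse(cnn_feats, rnn_feats, k=2)
print(fused.shape, corrs)
```

In a full pipeline the rows of `fused` would be passed to a linear classifier. Summing the two projections instead of concatenating them is another common fusion choice and would halve the fused dimension.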

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c019/7308905/6883610fcd08/sensors-20-03305-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c019/7308905/8c1f444908f3/sensors-20-03305-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c019/7308905/de6d2a43755c/sensors-20-03305-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c019/7308905/df57dfcd7b69/sensors-20-03305-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c019/7308905/f8f72cbcc0b1/sensors-20-03305-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c019/7308905/35f36b129d0d/sensors-20-03305-g006.jpg

Cited by

1. An Efficient Human Instance-Guided Framework for Video Action Recognition. Sensors (Basel). 2021 Dec 12;21(24):8309. doi: 10.3390/s21248309.
2. TUHAD: Taekwondo Unit Technique Human Action Dataset with Key Frame-Based CNN Action Recognition. Sensors (Basel). 2020 Aug 28;20(17):4871. doi: 10.3390/s20174871.
