Suppr 超能文献


Deep Fusion of Skeleton Spatial-Temporal and Dynamic Information for Action Recognition

Authors

Gao Song, Zhang Dingzhuo, Tang Zhaoming, Wang Hongyan

Affiliations

Aviation Maintenance NCO Academy, Air Force Engineering University, Xinyang 464007, China.

College of Information Engineering, Dalian University, Dalian 116622, China.

Publication

Sensors (Basel). 2024 Nov 28;24(23):7609. doi: 10.3390/s24237609.

DOI: 10.3390/s24237609
PMID: 39686146
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11645088/
Abstract

Focusing on the issue of the low recognition rates achieved by traditional deep-information-based action recognition algorithms, an action recognition approach was developed based on skeleton spatial-temporal and dynamic features combined with a two-stream convolutional neural network (TS-CNN). Firstly, the skeleton's three-dimensional coordinate system was transformed to obtain coordinate information related to relative joint positions. Subsequently, this relevant joint information was encoded as a color texture map to construct the spatial-temporal feature descriptor of the skeleton. Furthermore, physical structure constraints of the human body were considered to enhance class differences. Additionally, the speed information for each joint was estimated and encoded as a color texture map to achieve the skeleton motion feature descriptor. The resulting spatial-temporal and dynamic features were further enhanced using motion saliency and morphology operators to improve their expression ability. Finally, these enhanced skeleton spatial-temporal and dynamic features were deeply fused via TS-CNN for implementing action recognition. Numerous results from experiments conducted on the publicly available datasets NTU RGB-D, Northwestern-UCLA, and UTD-MHAD demonstrate that the recognition rates achieved via the developed approach are 86.25%, 87.37%, and 93.75%, respectively, indicating that the approach can effectively improve the accuracy of action recognition in complex environments compared to state-of-the-art algorithms.
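The two feature descriptors the abstract describes — relative joint positions and per-joint velocities, each min-max normalized into an RGB "color texture map" — can be sketched as below. This is a minimal illustration, not the paper's exact method: the function names, the choice of reference joint, and the plain per-axis normalization are assumptions, and the saliency/morphology enhancement and TS-CNN fusion steps are omitted.

```python
import numpy as np

def spatial_texture_map(seq, ref_joint=0):
    """Encode a skeleton sequence as a spatial-temporal color texture map.

    seq: (T, J, 3) array of joint xyz coordinates over T frames.
    Joint positions are taken relative to a reference joint (assumed here
    to be joint 0, e.g. the hip), then each axis is min-max normalized to
    [0, 255] so that x/y/z map to the R/G/B channels of a (T, J) image.
    """
    rel = seq - seq[:, ref_joint:ref_joint + 1, :]        # relative joint positions
    lo = rel.min(axis=(0, 1), keepdims=True)
    hi = rel.max(axis=(0, 1), keepdims=True)
    norm = (rel - lo) / np.maximum(hi - lo, 1e-8)          # per-axis min-max scaling
    return (norm * 255.0).astype(np.uint8)                 # (T, J, 3) uint8 image

def dynamic_texture_map(seq):
    """Encode per-joint speed (frame-to-frame displacement) the same way."""
    vel = np.diff(seq, axis=0)                             # (T-1, J, 3) velocities
    lo = vel.min(axis=(0, 1), keepdims=True)
    hi = vel.max(axis=(0, 1), keepdims=True)
    norm = (vel - lo) / np.maximum(hi - lo, 1e-8)
    return (norm * 255.0).astype(np.uint8)                 # (T-1, J, 3) uint8 image
```

In a two-stream setup, the spatial map would feed one CNN stream and the dynamic map the other, with their features fused before classification.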


Figures (PMC11645088):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/4cc02bfc4414/sensors-24-07609-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/457b4900ef7b/sensors-24-07609-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/67a926d33e81/sensors-24-07609-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/3d244221c96c/sensors-24-07609-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/758fd56c3c88/sensors-24-07609-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/e1dc3a61b3e4/sensors-24-07609-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/9a8512ff3898/sensors-24-07609-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/00afea57f9f6/sensors-24-07609-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/8887ec1f43bb/sensors-24-07609-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/018d/11645088/cc02c094fb28/sensors-24-07609-g010.jpg

Similar Articles

1. Deep Fusion of Skeleton Spatial-Temporal and Dynamic Information for Action Recognition. Sensors (Basel). 2024 Nov 28;24(23):7609. doi: 10.3390/s24237609.
2. Using Direct Acyclic Graphs to Enhance Skeleton-Based Action Recognition with a Linear-Map Convolution Neural Network. Sensors (Basel). 2021 Apr 29;21(9):3112. doi: 10.3390/s21093112.
3. Multi-Modality Adaptive Feature Fusion Graph Convolutional Network for Skeleton-Based Action Recognition. Sensors (Basel). 2023 Jun 7;23(12):5414. doi: 10.3390/s23125414.
4. Multi-scale and attention enhanced graph convolution network for skeleton-based violence action recognition. Front Neurorobot. 2022 Dec 15;16:1091361. doi: 10.3389/fnbot.2022.1091361. eCollection 2022.
5. MSST-RT: Multi-Stream Spatial-Temporal Relative Transformer for Skeleton-Based Action Recognition. Sensors (Basel). 2021 Aug 7;21(16):5339. doi: 10.3390/s21165339.
6. Dynamic Edge Convolutional Neural Network for Skeleton-Based Human Action Recognition. Sensors (Basel). 2023 Jan 10;23(2):778. doi: 10.3390/s23020778.
7. Feedback Graph Convolutional Network for Skeleton-Based Action Recognition. IEEE Trans Image Process. 2022;31:164-175. doi: 10.1109/TIP.2021.3129117. Epub 2021 Dec 2.
8. A discriminative multi-modal adaptation neural network model for video action recognition. Neural Netw. 2025 May;185:107114. doi: 10.1016/j.neunet.2024.107114. Epub 2025 Jan 3.
9. TFC-GCN: Lightweight Temporal Feature Cross-Extraction Graph Convolutional Network for Skeleton-Based Action Recognition. Sensors (Basel). 2023 Jun 15;23(12):5593. doi: 10.3390/s23125593.
10. Two-stream spatio-temporal GCN-transformer networks for skeleton-based action recognition. Sci Rep. 2025 Feb 10;15(1):4982. doi: 10.1038/s41598-025-87752-8.

References Cited in This Article

1. Exploring 3D Human Action Recognition Using STACOG on Multi-View Depth Motion Maps Sequences. Sensors (Basel). 2021 May 24;21(11):3642. doi: 10.3390/s21113642.
2. NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding. IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2684-2701. doi: 10.1109/TPAMI.2019.2916873. Epub 2019 May 14.
3. Utilising the Intel RealSense Camera for Measuring Health Outcomes in Clinical Research. J Med Syst. 2018 Feb 5;42(3):53. doi: 10.1007/s10916-018-0905-x.
4. Latent Max-Margin Multitask Learning With Skelets for 3-D Action Recognition. IEEE Trans Cybern. 2017 Feb;47(2):439-448. doi: 10.1109/TCYB.2016.2519448. Epub 2016 Feb 2.