German I. Parisi, Cornelius Weber, Stefan Wermter
Department of Informatics, Knowledge Technology Institute, University of Hamburg, Hamburg, Germany.
Front Neurorobot. 2015 Jun 9;9:3. doi: 10.3389/fnbot.2015.00003. eCollection 2015.
The visual recognition of complex, articulated human movements is fundamental for a wide range of artificial systems oriented toward human-robot communication, action classification, and action-driven perception. These challenging tasks typically involve processing large amounts of visual information together with learning-based mechanisms that generalize over a set of training actions and classify new samples. To operate in natural environments, a crucial property is the efficient and robust recognition of actions even under noisy conditions caused by, for instance, systematic sensor errors and temporarily occluded persons. Studies of the mammalian visual system and its remarkable ability to process biological motion suggest separate neural pathways for the distinct processing of pose and motion features at multiple levels, and the subsequent integration of these visual cues for action perception. We present a neurobiologically motivated approach to noise-tolerant action recognition in real time. Our model consists of self-organizing Growing When Required (GWR) networks that obtain progressively generalized representations of sensory inputs and learn inherent spatio-temporal dependencies. During training, the GWR networks dynamically change their topological structure to better match the input space. We first extract pose and motion features from video sequences and then cluster actions in terms of prototypical pose-motion trajectories. Multi-cue trajectories from matching action frames are subsequently combined to provide action dynamics in the joint feature space. Experiments show that our approach outperforms previous results on a dataset of full-body actions captured with a depth sensor, and ranks among the best results on a public benchmark of domestic daily actions.
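The Growing When Required mechanism the abstract refers to can be sketched as follows. This is a minimal, illustrative implementation following the standard GWR algorithm (Marsland et al.), not the paper's own code: all parameter values, the Gaussian activity function, and the simplified exponential habituation rule are assumptions chosen for clarity.

```python
import numpy as np

class GWR:
    """Minimal Growing When Required network sketch: nodes are prototype
    vectors, edges link co-activated nodes, and a new node is inserted
    whenever the best-matching node represents the input poorly despite
    having already fired often (i.e., being well trained)."""

    def __init__(self, dim, a_T=0.85, h_T=0.1, eps_b=0.2, eps_n=0.05, max_age=50):
        rng = np.random.default_rng(0)
        self.W = [rng.standard_normal(dim), rng.standard_normal(dim)]  # prototypes
        self.h = [1.0, 1.0]            # firing counters (habituation), decay toward 0
        self.edges = {}                # (i, j) with i < j -> edge age
        self.a_T, self.h_T = a_T, h_T  # activity / firing thresholds for growth
        self.eps_b, self.eps_n = eps_b, eps_n
        self.max_age = max_age

    def _edge(self, i, j):
        return (min(i, j), max(i, j))

    def train_step(self, x):
        # 1. Find the two best-matching nodes and connect them.
        d = [np.linalg.norm(x - w) for w in self.W]
        b, s = (int(k) for k in np.argsort(d)[:2])
        self.edges[self._edge(b, s)] = 0
        # 2. Activity of the winner; grow a node if the input is matched
        #    poorly (low activity) by a habituated (well-trained) winner.
        a = np.exp(-d[b])
        if a < self.a_T and self.h[b] < self.h_T:
            r = len(self.W)
            self.W.append((self.W[b] + x) / 2.0)  # new node between winner and input
            self.h.append(1.0)
            del self.edges[self._edge(b, s)]
            self.edges[self._edge(b, r)] = 0
            self.edges[self._edge(s, r)] = 0
        else:
            # 3. Otherwise adapt the winner and its topological neighbors.
            self.W[b] = self.W[b] + self.eps_b * self.h[b] * (x - self.W[b])
            for (i, j) in list(self.edges):
                if b in (i, j):
                    n = j if i == b else i
                    self.W[n] = self.W[n] + self.eps_n * self.h[n] * (x - self.W[n])
        # 4. Habituate the winner (simplified exponential decay) and age its
        #    edges, pruning connections that have grown too old.
        self.h[b] = max(0.01, 0.9 * self.h[b])
        for e in list(self.edges):
            if b in e:
                self.edges[e] += 1
                if self.edges[e] > self.max_age:
                    del self.edges[e]
```

Fed a stream of feature vectors, such a network inserts nodes where the input distribution is dense, so its topology tracks the input space as the abstract describes; the model in the paper applies this learning to pose and motion features and to their combined trajectories.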