Department of Electrical Engineering and Computer Science, University of Central Florida, Orlando, USA.
IEEE Trans Pattern Anal Mach Intell. 2013 Jul;35(7):1635-48. doi: 10.1109/TPAMI.2012.253.
This paper proposes a novel representation of articulated human actions, gestures, and facial expressions. The main goals of the proposed approach are: 1) to enable recognition using very few examples, i.e., one-shot or k-shot learning, and 2) to organize unlabeled datasets meaningfully through unsupervised clustering. The proposed representation is obtained by automatically discovering high-level subactions, or motion primitives, via hierarchical clustering of observed optical flow in a four-dimensional space of spatial position and motion flow. In contrast to state-of-the-art representations such as bag of video words, the proposed method is completely unsupervised yet yields a meaningful representation conducive to visual interpretation and textual labeling. Each primitive action depicts an atomic subaction, such as the directional motion of a limb or the torso, and is represented by a mixture of four-dimensional Gaussian distributions. For one-shot and k-shot learning, the sequence of primitive labels discovered in a test video is assigned using KL divergence; this sequence can then be represented as a string and matched against similar strings from training videos. The same sequence can also be collapsed into a histogram of primitives or used to learn a hidden Markov model representing each class. We have performed extensive experiments on recognition by one-shot and k-shot learning, as well as unsupervised action clustering, on six human action and gesture datasets, a composite dataset, and a facial expression database. These experiments confirm the validity and discriminative power of the proposed representation.
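The following is a minimal sketch, not the authors' implementation, of the pipeline the abstract describes: optical flow is turned into four-dimensional (x, y, u, v) points, primitives are modeled as a mixture of 4-D Gaussians (a flat GMM stands in here for the paper's hierarchical clustering), test frames are labeled by KL divergence against the primitive Gaussians, and a video becomes a label string or a primitive histogram. All function names, the choice of scikit-learn, and the parameter values are assumptions for illustration only.

```python
# Hypothetical sketch of the primitive-discovery and labeling pipeline;
# not the authors' code.
import numpy as np
from sklearn.mixture import GaussianMixture


def flow_to_features(flow, xs, ys):
    """Stack pixel positions and optical-flow vectors into 4-D points (x, y, u, v)."""
    u, v = flow[..., 0].ravel(), flow[..., 1].ravel()
    return np.stack([xs.ravel(), ys.ravel(), u, v], axis=1)


def discover_primitives(train_features, n_primitives=20, seed=0):
    """Fit a mixture of 4-D Gaussians to pooled training flow features.

    The paper uses hierarchical clustering; a flat GMM is used here for brevity.
    """
    gmm = GaussianMixture(n_components=n_primitives, covariance_type="full",
                          random_state=seed)
    gmm.fit(train_features)
    return gmm


def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL divergence KL(N0 || N1) between two multivariate Gaussians."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))


def label_frame(frame_features, gmm):
    """Assign the primitive whose Gaussian is closest in KL divergence to this frame's flow."""
    mu0 = frame_features.mean(axis=0)
    cov0 = np.cov(frame_features, rowvar=False) + 1e-6 * np.eye(frame_features.shape[1])
    kls = [gaussian_kl(mu0, cov0, gmm.means_[k], gmm.covariances_[k])
           for k in range(gmm.n_components)]
    return int(np.argmin(kls))


def video_to_string(frames_features, gmm):
    """A video becomes a string of primitive labels, usable for string matching."""
    return [label_frame(f, gmm) for f in frames_features]


def video_to_histogram(labels, n_primitives):
    """Collapse a label sequence into a normalized histogram of primitives."""
    hist = np.bincount(labels, minlength=n_primitives).astype(float)
    return hist / max(hist.sum(), 1.0)
```

For one-shot or k-shot recognition, a test video's label string (or histogram) would then be matched against those of the few labeled training videos, or the sequences could be used to train a hidden Markov model per class.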