Learning Spatio-Temporal Representations for Action Recognition: A Genetic Programming Approach.

Publication information

IEEE Trans Cybern. 2016 Jan;46(1):158-70. doi: 10.1109/TCYB.2015.2399172. Epub 2015 Feb 13.

DOI:10.1109/TCYB.2015.2399172

Abstract

Extracting discriminative and robust features from video sequences is the first and most critical step in human action recognition. In this paper, instead of using handcrafted features, we automatically learn spatio-temporal motion features for action recognition. This is achieved via an evolutionary method, i.e., genetic programming (GP), which evolves the motion feature descriptor on a population of primitive 3D operators (e.g., 3D-Gabor and wavelet). In this way, scale- and shift-invariant features can be effectively extracted from both color and optical flow sequences. We intend to learn data-adaptive descriptors with multiple layers for different datasets. This design makes full use of the knowledge to mimic the physical structure of the human visual cortex for action recognition and simultaneously reduces the GP search space, which effectively accelerates convergence to optimal solutions. In our evolutionary architecture, the average cross-validation classification error, which is calculated by a support-vector-machine classifier on the training set, is adopted as the evaluation criterion for the GP fitness function. After the entire evolution procedure finishes, the best-so-far solution selected by GP is regarded as the (near-)optimal action descriptor obtained. The GP-evolving feature extraction method is evaluated on four popular action datasets, namely KTH, HMDB51, UCF YouTube, and Hollywood2. Experimental results show that our method significantly outperforms other types of features, either hand-designed or machine-learned.
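The fitness criterion described in the abstract (average cross-validation classification error on the training set, with the best-so-far candidate kept) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: a nearest-centroid classifier stands in for the SVM so that the example needs only the standard library, toy 1-D signals replace video data, and the names `cv_fitness`, `mean_level`, and `temporal_energy` are hypothetical. Picking the lowest-error candidate from a fixed list stands in for GP evolution; no crossover or mutation is performed.

```python
def nearest_centroid_error(train, test):
    """Error rate of a nearest-centroid classifier (a stand-in, NOT an SVM)."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(x):
        # Label of the centroid with the smallest squared Euclidean distance.
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

    return sum(1 for x, y in test if predict(x) != y) / len(test)


def cv_fitness(descriptor, samples, labels, k=5):
    """Average k-fold cross-validation error of a candidate descriptor."""
    data = [(descriptor(s), y) for s, y in zip(samples, labels)]
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        errors.append(nearest_centroid_error(train, folds[i]))
    return sum(errors) / k


# Two hypothetical candidate descriptors (placeholders for evolved operator trees).
def mean_level(signal):
    return [sum(signal) / len(signal)]


def temporal_energy(signal):
    diffs = [abs(b - a) for a, b in zip(signal, signal[1:])]
    return [sum(diffs) / len(diffs)]


# Toy training set: class 0 is static, class 1 oscillates around the same level.
static = [[float(c)] * 8 for c in range(10)]
moving = [[c + 1.0, c - 1.0] * 4 for c in range(10)]
samples = static + moving
labels = [0] * 10 + [1] * 10

# Keep the candidate with the lowest fitness value (lower error is better).
candidates = [mean_level, temporal_energy]
best = min(candidates, key=lambda d: cv_fitness(d, samples, labels))
print(best.__name__, cv_fitness(best, samples, labels))
```

In this toy setting the two classes share the same mean level, so only the descriptor that captures temporal change separates them. With real data, a library SVM and a GP framework would replace the stand-ins used here.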

Similar articles

Spatio-temporal Laplacian pyramid coding for action recognition.

IEEE Trans Cybern. 2014 Jun;44(6):817-27. doi: 10.1109/TCYB.2013.2273174. Epub 2013 Jul 31.

Learning Human Actions by Combining Global Dynamics and Local Appearance.

IEEE Trans Pattern Anal Mach Intell. 2014 Dec;36(12):2466-82. doi: 10.1109/TPAMI.2014.2329301.

Action Spotting and Recognition Based on a Spatiotemporal Orientation Analysis.

IEEE Trans Pattern Anal Mach Intell. 2013 Mar;35(3):527-40. doi: 10.1109/TPAMI.2012.141.

Dynamic Spatio-Temporal Bag of Expressions (D-STBoE) Model for Human Action Recognition.

Sensors (Basel). 2019 Jun 21;19(12):2790. doi: 10.3390/s19122790.

Evaluation of color spatio-temporal interest points for human action recognition.

IEEE Trans Image Process. 2014 Apr;23(4):1569-80. doi: 10.1109/TIP.2014.2302677.

Robust video content analysis schemes for human action recognition.

Sci Prog. 2021 Apr-Jun;104(2):368504211005480. doi: 10.1177/00368504211005480.

Learning discriminative key poses for action recognition.

IEEE Trans Cybern. 2013 Dec;43(6):1860-70. doi: 10.1109/TSMCB.2012.2231959.

Discovering motion primitives for unsupervised grouping and one-shot learning of human actions, gestures, and expressions.

IEEE Trans Pattern Anal Mach Intell. 2013 Jul;35(7):1635-48. doi: 10.1109/TPAMI.2012.253.

Learning sparse representations for human action recognition.

IEEE Trans Pattern Anal Mach Intell. 2012 Aug;34(8):1576-88. doi: 10.1109/TPAMI.2011.253.

Cited by

A Comprehensive Methodological Survey of Human Activity Recognition Across Diverse Data Modalities.

Sensors (Basel). 2025 Jun 27;25(13):4028. doi: 10.3390/s25134028.

Human Action Recognition: A Taxonomy-Based Survey, Updates, and Opportunities.

Sensors (Basel). 2023 Feb 15;23(4):2182. doi: 10.3390/s23042182.

AttendAffectNet-Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-Attention.

Sensors (Basel). 2021 Dec 14;21(24):8356. doi: 10.3390/s21248356.

Robust video content analysis schemes for human action recognition.

Sci Prog. 2021 Apr-Jun;104(2):368504211005480. doi: 10.1177/00368504211005480.

Histogram of Oriented Gradient-Based Fusion of Features for Human Action Recognition in Action Video Sequences.

Sensors (Basel). 2020 Dec 18;20(24):7299. doi: 10.3390/s20247299.
