Video event recognition using kernel methods with multilevel temporal alignment.

Author information

Xu Dong, Chang Shih-Fu

Affiliation

School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Blk N4, Singapore.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1985-97. doi: 10.1109/TPAMI.2008.129.

DOI:10.1109/TPAMI.2008.129

PMID:18787246

Abstract

In this work, we systematically study the problem of event recognition in unconstrained news video sequences. We adopt the discriminative kernel-based method for which video clip similarity plays an important role. First, we represent a video clip as a bag of orderless descriptors extracted from all of the constituent frames and apply the earth mover's distance (EMD) to integrate similarities among frames from two clips. Observing that a video clip is usually comprised of multiple subclips corresponding to event evolution over time, we further build a multilevel temporal pyramid. At each pyramid level, we integrate the information from different subclips with Integer-value-constrained EMD to explicitly align the subclips. By fusing the information from the different pyramid levels, we develop Temporally Aligned Pyramid Matching (TAPM) for measuring video similarity. We conduct comprehensive experiments on the TRECVID 2005 corpus, which contains more than 6,800 clips. Our experiments demonstrate that 1) the TAPM multilevel method clearly outperforms single-level EMD (SLEMD) and 2) SLEMD outperforms keyframe and multiframe-based detection methods by a large margin. In addition, we conduct in-depth investigation of various aspects of the proposed techniques such as weight selection in SLEMD, sensitivity to temporal clustering, the effect of temporal alignment, and possible approaches for speed up. Extensive analysis of the results also reveals intuitive interpretation of video event recognition through video subclip alignment at different levels.
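
The subclip alignment idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each subclip is reduced to a toy 2-D vector, subclip distances are plain Euclidean (the abstract describes EMD over frame descriptors instead), both clips have the same number of subclips with uniform weights so that an integer-value-constrained EMD reduces to a one-to-one assignment, and the kernel takes the form exp(-d / scale) over a weighted sum of per-level distances. All function names are hypothetical.

```python
from itertools import permutations
from math import dist, exp


def pairwise_distances(units_p, units_q):
    """Euclidean distance between every unit of clip P and every unit of clip Q."""
    return [[dist(a, b) for b in units_q] for a in units_p]


def aligned_distance(cost):
    """Mean cost of the cheapest one-to-one matching, plus the matching itself.

    Brute force over all permutations, so only practical for a few subclips.
    """
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[i][perm[i]] for i in range(n)))
    return sum(cost[i][best[i]] for i in range(n)) / n, best


def fixed_order_distance(cost):
    """Mean cost when subclip i of P is always compared with subclip i of Q."""
    return sum(cost[i][i] for i in range(len(cost))) / len(cost)


def fused_kernel(level_distances, level_weights, scale=1.0):
    """exp(-d / scale), where d is a weighted sum of per-level distances.

    The fusion rule and the scale parameter are assumptions for illustration.
    """
    d = sum(w * x for w, x in zip(level_weights, level_distances))
    return exp(-d / scale)


# Toy clips: two subclips each, showing the same two stages in opposite order.
clip_p = [(0.0, 0.0), (1.0, 0.0)]
clip_q = [(1.0, 0.0), (0.0, 0.0)]

cost = pairwise_distances(clip_p, clip_q)
d_aligned, matching = aligned_distance(cost)
d_fixed = fixed_order_distance(cost)

print("aligned distance:", d_aligned, "matching:", matching)
print("fixed-order distance:", d_fixed)
print("kernel (aligned):", fused_kernel([d_aligned], [1.0]))
print("kernel (fixed order):", fused_kernel([d_fixed], [1.0]))
```

In this toy case the two clips contain the same stages in a different temporal order, so the aligned distance is lower than the fixed-order distance and the resulting kernel value is higher. A real implementation would replace the brute-force search with a proper assignment or transportation solver.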

Similar articles

Video event recognition using kernel methods with multilevel temporal alignment.

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1985-97. doi: 10.1109/TPAMI.2008.129.

Segmenting, modeling, and matching video clips containing multiple moving objects.

IEEE Trans Pattern Anal Mach Intell. 2007 Mar;29(3):477-91. doi: 10.1109/TPAMI.2007.57.

Robust object matching for persistent tracking with heterogeneous features.

IEEE Trans Pattern Anal Mach Intell. 2007 May;29(5):824-39. doi: 10.1109/TPAMI.2007.1052.

An efficient Earth Mover's Distance algorithm for robust histogram comparison.

IEEE Trans Pattern Anal Mach Intell. 2007 May;29(5):840-53. doi: 10.1109/TPAMI.2007.1058.

Space-time adaptation for patch-based image sequence restoration.

IEEE Trans Pattern Anal Mach Intell. 2007 Jun;29(6):1096-102. doi: 10.1109/TPAMI.2007.1064.

Design of multimodal dissimilarity spaces for retrieval of video documents.

IEEE Trans Pattern Anal Mach Intell. 2008 Sep;30(9):1520-33. doi: 10.1109/TPAMI.2007.70801.

Alignment of continuous video onto 3D point clouds.

IEEE Trans Pattern Anal Mach Intell. 2005 Aug;27(8):1305-18. doi: 10.1109/TPAMI.2005.152.

Overlapping events with application to image sequences.

IEEE Trans Pattern Anal Mach Intell. 2006 Oct;28(10):1707-12. doi: 10.1109/TPAMI.2006.199.

Statistical analysis of dynamic actions.

IEEE Trans Pattern Anal Mach Intell. 2006 Sep;28(9):1530-5. doi: 10.1109/TPAMI.2006.194.

Infinite hidden Markov models for unusual-event detection in video.

IEEE Trans Image Process. 2008 May;17(5):811-22. doi: 10.1109/TIP.2008.919359.

Cited by

Physical Activity Recognition Based on Motion in Images Acquired by a Wearable Camera.

Neurocomputing (Amst). 2011 Jun 1;74(12-13):2184-2192. doi: 10.1016/j.neucom.2011.02.014.
