Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition.

Affiliation

Center for Automation Research, University of Maryland, College Park, College Park, MD 20742, USA.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2011 Nov;33(11):2273-86. doi: 10.1109/TPAMI.2011.52.

Abstract

In this paper, we examine image and video-based recognition applications where the underlying models have a special structure: the linear subspace structure. We discuss how commonly used parametric models for videos and image sets can be described using the unified framework of Grassmann and Stiefel manifolds. We first show that the parameters of linear dynamic models are finite-dimensional linear subspaces of appropriate dimensions. Unordered image sets, treated as samples from a finite-dimensional linear subspace, naturally fall under this framework. We show that inference over subspaces can be naturally cast as an inference problem on the Grassmann manifold. To perform recognition using subspace-based models, we need tools from the Riemannian geometry of the Grassmann manifold. This involves a study of the geometric properties of the space, appropriate definitions of Riemannian metrics, and the definition of geodesics. Further, we derive statistical models of inter- and intraclass variations that respect the geometry of the space. We apply techniques such as intrinsic and extrinsic statistics to enable maximum-likelihood classification. We also provide algorithms for unsupervised clustering derived from the geometry of the manifold. Finally, we demonstrate the improved performance of these methods in a wide variety of vision applications such as activity recognition, video-based face recognition, object recognition from image sets, and activity-based video clustering.
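
To make the idea of comparing subspaces as points on the Grassmann manifold concrete, the following is a minimal sketch, not the paper's own procedure. It assumes each image set or model is summarized by a matrix whose columns span the subspace, and it uses the standard arc-length distance computed from principal angles. The function names (`orthonormal_basis`, `principal_angles`, `grassmann_geodesic_distance`, `nearest_exemplar_label`) are hypothetical, and the nearest-exemplar rule is a simple stand-in for the maximum-likelihood classifiers the abstract mentions.

```python
import numpy as np


def orthonormal_basis(A):
    """Return an n x k matrix with orthonormal columns spanning col(A) (thin QR).

    Assumes A has full column rank.
    """
    Q, _ = np.linalg.qr(A)
    return Q


def principal_angles(Y1, Y2):
    """Principal angles (radians) between the subspaces spanned by Y1 and Y2.

    Y1 and Y2 must have orthonormal columns. The singular values of
    Y1^T Y2 are the cosines of the principal angles.
    """
    cosines = np.linalg.svd(Y1.T @ Y2, compute_uv=False)
    # Clip to guard against round-off pushing values slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def grassmann_geodesic_distance(A, B):
    """Arc-length distance: the 2-norm of the vector of principal angles."""
    Y1, Y2 = orthonormal_basis(A), orthonormal_basis(B)
    return float(np.linalg.norm(principal_angles(Y1, Y2)))


def nearest_exemplar_label(A, exemplars):
    """Pick the label whose exemplar subspace is closest to span(A).

    `exemplars` maps a label to a basis matrix. This is an illustrative
    simplification, not the statistical classifier described in the paper.
    """
    return min(exemplars, key=lambda lbl: grassmann_geodesic_distance(A, exemplars[lbl]))


if __name__ == "__main__":
    t = 0.3
    # Two planes in R^3 that share the x-axis and are tilted by angle t.
    A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
    B = np.array([[1.0, 0.0], [0.0, np.cos(t)], [0.0, np.sin(t)]])
    print(grassmann_geodesic_distance(A, B))
```

Because the distance depends only on the spans, replacing `A` by `A @ M` for any invertible 2 x 2 matrix `M` leaves the result unchanged up to round-off, which is the property that makes the Grassmann manifold, rather than the space of basis matrices, the natural setting for such comparisons.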
