Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117576.
IEEE Trans Image Process. 2012 Feb;21(2):778-88. doi: 10.1109/TIP.2011.2163521. Epub 2011 Aug 4.
Histograms are widely used for feature representation in image and video content analysis. However, because the summarization process is orderless, histograms generally lack spatial information, which may degrade their discriminative capability in visual classification tasks. Although several research efforts have attempted to encode spatial context into histograms, how to extend these encodings to higher-order spatial context remains an open problem. In this paper, we propose a general histogram contextualization method that efficiently encodes higher-order spatial context. The method is based on the co-occurrence of local visual homogeneity patterns and can therefore generate more discriminative histogram representations while remaining compact and robust. We also investigate how to extend histogram contextualization to multiple modalities of context. We show that the proposed method extends naturally to combine temporal and spatial context, which facilitates video content analysis. In addition, we introduce a method that combines cross-feature context with spatial context by means of random forests. Comprehensive experiments on face image classification and human activity recognition tasks demonstrate the superiority of the proposed histogram contextualization method over existing encoding methods.
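The claim that orderless histograms lose spatial information can be illustrated with a small sketch. The example below is not the paper's contextualization method; it only assumes a simple pairwise co-occurrence of horizontally adjacent labels as a stand-in for first-order spatial context. The function names (`plain_histogram`, `cooccurrence_histogram`) and the two toy label maps are hypothetical and chosen for illustration.

```python
import numpy as np


def plain_histogram(img, n_bins):
    """Orderless histogram: counts of each label, ignoring pixel position."""
    return np.bincount(img.ravel(), minlength=n_bins)


def cooccurrence_histogram(img, n_bins):
    """Counts of (label, right-neighbor label) pairs.

    Assumption: horizontal adjacency is used as a minimal form of
    pairwise spatial context; this is not the method from the paper.
    """
    left = img[:, :-1].ravel()
    right = img[:, 1:].ravel()
    pair_index = left * n_bins + right  # flatten each pair to one bin index
    counts = np.bincount(pair_index, minlength=n_bins * n_bins)
    return counts.reshape(n_bins, n_bins)


# Two hypothetical 4x4 label maps with identical label counts.
stripes = np.tile(np.array([0, 0, 1, 1]), (4, 1))  # left half 0, right half 1
checker = np.indices((4, 4)).sum(axis=0) % 2       # checkerboard pattern

# The plain histograms cannot tell the two layouts apart.
same_plain = np.array_equal(plain_histogram(stripes, 2),
                            plain_histogram(checker, 2))

# The co-occurrence histograms differ, so spatial layout is captured.
co_stripes = cooccurrence_histogram(stripes, 2)
co_checker = cooccurrence_histogram(checker, 2)
```

Both label maps contain eight pixels of each label, so `same_plain` is true, while `co_stripes` and `co_checker` differ. Note that the pairwise table already has `n_bins ** 2` entries, and a naive extension to k-tuples would need `n_bins ** k`, which suggests why compact encoding of higher-order context is the hard part.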