Statistical Visual Computing Laboratory, Department of Electrical and Computer Engineering, University of California, San Diego, EBU 1, Room 5512, 9500 Gilman Drive, La Jolla, CA 92093-0407, USA.
IEEE Trans Pattern Anal Mach Intell. 2012 May;34(5):902-17. doi: 10.1109/TPAMI.2011.175.
A novel framework for context modeling, based on the probability of co-occurrence of objects and scenes, is proposed. The model is simple and builds upon the availability of robust appearance classifiers. Images are represented by their posterior probabilities with respect to a set of contextual models, built upon the bag-of-features image representation, through two layers of probabilistic modeling. The first layer represents the image in a semantic space, where each dimension encodes an appearance-based posterior probability with respect to a concept. Because classifying image patches is inherently ambiguous, this representation suffers from a certain amount of contextual noise. The second layer enables robust inference in the presence of this noise by modeling the distribution of each concept in the semantic space. A thorough and systematic experimental evaluation of the proposed context models is presented. It shows that they capture the contextual “gist” of natural images. Scene classification experiments show that contextual classifiers outperform their appearance-based counterparts, irrespective of the precise choice and accuracy of the latter. The effectiveness of the proposed approach is further demonstrated through a comparison to existing approaches to scene classification and image retrieval on benchmark data sets. In all cases, the proposed approach achieves superior results.
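The two-layer structure described in the abstract can be illustrated with a minimal sketch. The abstract does not state how patch posteriors are pooled or which distribution family models each concept in the semantic space, so the choices here are assumptions made purely for illustration: patch posteriors are averaged into one image-level vector, and each concept is modeled by a single Dirichlet fitted by moment matching. All function names are hypothetical.

```python
import math
import numpy as np

def semantic_vector(patch_posteriors):
    """Layer 1 (assumed pooling): average patch-level concept posteriors
    into one image-level point on the probability simplex."""
    p = np.asarray(patch_posteriors, dtype=float)
    v = p.mean(axis=0)
    return v / v.sum()

def fit_dirichlet_moments(vectors, eps=1e-6):
    """Fit one Dirichlet to a concept's training semantic vectors by
    moment matching (an illustrative choice, not taken from the abstract)."""
    x = np.clip(np.asarray(vectors, dtype=float), eps, None)
    x = x / x.sum(axis=1, keepdims=True)
    mean = x.mean(axis=0)
    var = x.var(axis=0) + eps
    # For a Dirichlet, var_i = m_i (1 - m_i) / (s + 1), so each dimension
    # gives an estimate of the precision s; the median is used for stability.
    s = max(float(np.median(mean * (1.0 - mean) / var - 1.0)), eps)
    return mean * s

def dirichlet_logpdf(x, alpha, eps=1e-6):
    """Log-density of a Dirichlet with parameters alpha at simplex point x."""
    x = np.clip(np.asarray(x, dtype=float), eps, None)
    x = x / x.sum()
    log_norm = math.lgamma(float(alpha.sum())) - sum(math.lgamma(float(a)) for a in alpha)
    return log_norm + float(np.sum((alpha - 1.0) * np.log(x)))

def contextual_posterior(x, alphas, priors=None):
    """Layer 2: posterior over concepts given the semantic vector x,
    obtained from the per-concept densities via Bayes' rule."""
    k = len(alphas)
    priors = np.full(k, 1.0 / k) if priors is None else np.asarray(priors, dtype=float)
    log_post = np.array([dirichlet_logpdf(x, a) for a in alphas]) + np.log(priors)
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()
```

In use, one would fit `fit_dirichlet_moments` on the semantic vectors of the training images of each concept, then call `contextual_posterior(semantic_vector(patch_posteriors), alphas)` on a new image. The resulting vector of contextual posteriors plays the role of the image representation described in the abstract.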