Multi-scale spatial concatenations of local features in natural scenes and scene classification.

Affiliation

Brain and Behavior Discovery Institute, Georgia Regents University, Augusta, Georgia, United States of America.

Publication information

PLoS One. 2013 Sep 30;8(9):e76393. doi: 10.1371/journal.pone.0076393. eCollection 2013.

Abstract

How does the visual system encode natural scenes? What are the basic structures of natural scenes? In current models of scene perception, there are two broad feature representations, global and local representations. Both representations are useful and have some successes; however, many observations on human scene perception seem to point to an intermediate-level representation. In this paper, we proposed natural scene structures, i.e., multi-scale spatial concatenations of local features, as an intermediate-level representation of natural scenes. To compile the natural scene structures, we first sampled a large number of multi-scale circular scene patches in a hexagonal configuration. We then performed independent component analysis on the patches and classified the independent components into a set of clusters using the K-means method. Finally, we obtained a set of natural scene structures, each of which is characterized by a set of dominant clusters of independent components. We examined a range of statistics of the natural scene structures, compiled from two widely used datasets of natural scenes, and modeled their spatial arrangements at larger spatial scales using adjacency matrices. We found that the natural scene structures include a full range of concatenations of visual features in natural scenes, and can be used to encode spatial information at various scales. We then selected a set of natural scene structures with high information, and used the occurring frequencies and the eigenvalues of the adjacency matrices to classify scenes in the datasets. We found that the performance of this model is comparable to or better than the state-of-the-art models on the two datasets. These results suggest that the natural scene structures are a useful intermediate-level representation of visual scenes for our understanding of natural scene perception.
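
The last step described in the abstract, classifying scenes from the occurrence frequencies of natural scene structures and the eigenvalues of their adjacency matrices, can be illustrated with a minimal sketch. The abstract does not give the exact construction, so everything here is an assumption made for illustration: the function name `scene_features`, the input being a 2D grid of integer structure labels, and the adjacency matrix being built by counting horizontally and vertically neighboring label pairs. It is not the authors' implementation.

```python
import numpy as np


def scene_features(labels, n_structures):
    """Build a feature vector from a grid of structure labels (hypothetical helper).

    labels: 2D integer array; labels[i, j] is the id of the natural scene
            structure assigned to grid position (i, j). This grid layout is
            an assumption, not taken from the paper.
    n_structures: total number of distinct structure ids.
    """
    labels = np.asarray(labels)

    # Occurrence frequency of each structure in the scene.
    freq = np.bincount(labels.ravel(), minlength=n_structures) / labels.size

    # Count how often structure a sits next to structure b
    # (right-hand and lower neighbors on the grid).
    adj = np.zeros((n_structures, n_structures))
    np.add.at(adj, (labels[:, :-1].ravel(), labels[:, 1:].ravel()), 1)
    np.add.at(adj, (labels[:-1, :].ravel(), labels[1:, :].ravel()), 1)

    # Symmetrize so each neighboring pair is counted in both directions.
    adj = adj + adj.T

    # Eigenvalues of the symmetric adjacency matrix, in ascending order.
    eig = np.linalg.eigvalsh(adj)

    return np.concatenate([freq, eig])


# Toy 2x2 scene with two structure types arranged as a checkerboard.
features = scene_features([[0, 1], [1, 0]], n_structures=2)
print(features)
```

A vector like `features` could then be passed to any standard classifier. The abstract does not say which classifier the authors used, so that choice is left open here.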

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f861/3787016/9284f3f4003a/pone.0076393.g001.jpg
