Yu Chen-Ping, Samaras Dimitris, Zelinsky Gregory J
Department of Computer Science, Stony Brook University, Stony Brook, NY, USA.
Department of Computer Science, Stony Brook University, Stony Brook, NY, USA; Department of Psychology, Stony Brook University, Stony Brook, NY, USA.
J Vis. 2014 Jun 5;14(7):4. doi: 10.1167/14.7.4.
We introduce the proto-object model of visual clutter perception. This unsupervised model segments an image into superpixels, then merges neighboring superpixels that share a common color cluster to obtain proto-objects, defined here as spatially extended regions of coherent features. Clutter is estimated by simply counting the number of proto-objects. We tested this model using 90 images of realistic scenes that were ranked by observers from least to most cluttered. Comparing this behaviorally obtained ranking to a ranking based on the model clutter estimates, we found a significant correlation between the two (Spearman's ρ = 0.814, p < 0.001). We also found that the proto-object model was highly robust to changes in its parameters and was generalizable to unseen images. We compared the proto-object model to six other models of clutter perception and demonstrated that it outperformed each, in some cases dramatically. Importantly, we also showed that the proto-object model was a better predictor of clutter perception than an actual count of the number of objects in the scenes, suggesting that the set size of a scene may be better described by proto-objects than objects. We conclude that the success of the proto-object model is due in part to its use of an intermediate level of visual representation (one between features and objects) and that this is evidence for the potential importance of a proto-object representation in many common visual percepts and tasks.
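The merge-and-count step and the rank comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that superpixel segmentation and color clustering have already been done elsewhere, so the inputs are a per-pixel grid of superpixel ids and a mapping from each superpixel to a color cluster id. The names `count_proto_objects`, `average_ranks`, and `spearman_rho` are hypothetical, and treating two superpixels as neighbors when they share a 4-connected pixel border is an assumption made for this example.

```python
def count_proto_objects(superpixel_map, cluster_of):
    """Merge adjacent superpixels with the same color cluster and count the results.

    superpixel_map: list of rows, each a list of superpixel ids (one per pixel).
    cluster_of: dict mapping superpixel id -> color cluster id.
    Both inputs are assumed to come from earlier steps not shown here.
    """
    # Union-find over superpixel ids; each starts as its own region.
    parent = {sp: sp for row in superpixel_map for sp in row}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    rows, cols = len(superpixel_map), len(superpixel_map[0])
    for r in range(rows):
        for c in range(cols):
            a = superpixel_map[r][c]
            # Look right and down so each 4-connected pixel pair is seen once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    b = superpixel_map[nr][nc]
                    if a != b and cluster_of[a] == cluster_of[b]:
                        root_a, root_b = find(a), find(b)
                        if root_a != root_b:
                            parent[root_b] = root_a
    # The clutter estimate in this sketch is the number of merged regions.
    return len({find(sp) for sp in parent})


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

In this sketch, one would call `count_proto_objects` once per image to get a clutter estimate, then pass the list of estimates and the list of observer rank positions to `spearman_rho`. The significance test reported in the abstract is not covered here.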