Zhao Qi, Koch Christof
Computation and Neural Systems, California Institute of Technology, Pasadena, CA, USA.
J Vis. 2011 Mar 10;11(3):9. doi: 10.1167/11.3.9.
Inspired by the primate visual system, computational saliency models decompose visual input into a set of feature maps across spatial scales in a number of pre-specified channels. The outputs of these feature maps are summed to yield the final saliency map. Here we use a least-squares technique to learn the weights associated with these maps from subjects freely fixating natural scenes drawn from four recent eye-tracking data sets. Depending on the data set, the weights can be quite different, with the face and orientation channels usually more important than the color and intensity channels. Inter-subject differences are negligible. We also model a bias toward fixating the center of images and consider both time-varying and constant factors that contribute to this bias. To compensate for the shortcomings of the standard performance measure (area under the ROC curve), we use two additional metrics to assess performance comprehensively. Although our model retains the basic structure of the standard saliency model, it outperforms several state-of-the-art saliency algorithms. Furthermore, the simple structure makes the results applicable to numerous studies in psychophysics and physiology and makes the model extremely easy to implement in real-world applications.
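The weight-learning step can be illustrated with a minimal sketch, assuming each channel yields one feature map per image and the target is a fixation density map of the same size. The function names, the four-channel layout, and the synthetic data below are hypothetical and serve only as illustration. The abstract does not say whether the fit is constrained (for example, to non-negative weights), so this sketch uses ordinary unconstrained least squares.

```python
import numpy as np

def learn_channel_weights(feature_maps, fixation_map):
    """Fit one weight per channel by ordinary least squares.

    feature_maps: array of shape (n_channels, H, W)
    fixation_map: array of shape (H, W), e.g. a smoothed fixation density
    """
    n_channels = feature_maps.shape[0]
    # Each pixel is one observation; each channel is one regressor.
    A = feature_maps.reshape(n_channels, -1).T   # (H*W, n_channels)
    b = fixation_map.ravel()                     # (H*W,)
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)
    return weights

def saliency_map(feature_maps, weights):
    """Weighted sum of the channel maps, giving an (H, W) saliency map."""
    return np.tensordot(weights, feature_maps, axes=1)

# Synthetic stand-ins for four channels (say color, intensity,
# orientation, face); real feature maps would come from the model.
rng = np.random.default_rng(0)
maps = rng.random((4, 32, 32))
true_w = np.array([0.1, 0.2, 0.3, 0.4])          # assumed ground truth
target = saliency_map(maps, true_w)              # noise-free target

w = learn_channel_weights(maps, target)
```

Because the synthetic target is an exact linear combination of the maps, the fit recovers the assumed weights. With real fixation data the residual is nonzero, and the weights would be fitted across many images at once by stacking their pixels into the same system.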