Panqu Wang, Garrison W. Cottrell
Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA,
Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA,
J Vis. 2017 Apr 1;17(4):9. doi: 10.1167/17.4.9.
What are the roles of central and peripheral vision in human scene recognition? Larson and Loschky (2009) showed that peripheral vision contributes more than central vision in obtaining maximum scene recognition accuracy. However, central vision is more efficient for scene recognition than peripheral vision, based on the amount of visual area needed for accurate recognition. In this study, we model and explain the results of Larson and Loschky (2009) using a neurocomputational modeling approach. We show that the advantage of peripheral vision in scene recognition, as well as the efficiency advantage for central vision, can be replicated using state-of-the-art deep neural network models. In addition, we propose and provide support for the hypothesis that the peripheral advantage comes from the inherent usefulness of peripheral features. This result is consistent with data presented by Thibaut, Tran, Szaffarczyk, and Boucart (2014), who showed that patients with central vision loss can still categorize natural scenes efficiently. Furthermore, by using a deep mixture-of-experts model ("The Deep Model," or TDM) that receives central and peripheral visual information on separate channels simultaneously, we show that the peripheral advantage emerges naturally in the learning process: When trained to categorize scenes, the model weights the peripheral pathway more than the central pathway. As we have seen in our previous modeling work, learning creates a transform that spreads different scene categories into different regions in representational space. Finally, we visualize the features for the two pathways, and find that different preferences for scene categories emerge for the two pathways during the training process.
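The two-pathway mixture-of-experts idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' TDM implementation: the feature dimension `D`, class count `C`, and linear "experts" are hypothetical stand-ins for the deep convolutional pathways, and the gate is a single learned linear layer over the concatenated pathway features. It shows only the forward combination step, in which a learned gate weights the central and peripheral experts' class probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes: D-dimensional pathway features, C scene categories.
D, C = 128, 10

# Each "expert" is a linear classifier over its own pathway's features.
W_central = rng.normal(0, 0.1, (D, C))
W_periph = rng.normal(0, 0.1, (D, C))
# The gating network maps the concatenated features to one weight per pathway.
W_gate = rng.normal(0, 0.1, (2 * D, 2))

def forward(central_feats, periph_feats):
    """Mix the two pathways' class probabilities with learned gate weights."""
    probs_c = softmax(central_feats @ W_central)
    probs_p = softmax(periph_feats @ W_periph)
    gate = softmax(np.concatenate([central_feats, periph_feats], axis=-1) @ W_gate)
    # gate[:, 0] weights the central expert, gate[:, 1] the peripheral one;
    # training would adjust W_gate, and the abstract's claim is that the
    # peripheral weight grows larger during learning.
    mixed = gate[:, :1] * probs_c + gate[:, 1:] * probs_p
    return mixed, gate

probs, gate = forward(rng.normal(size=(4, D)), rng.normal(size=(4, D)))
```

Because each expert's output and the gate are both softmax-normalized, the mixed output remains a valid probability distribution over scene categories for every input.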