Comparing Object Recognition in Humans and Deep Convolutional Neural Networks: An Eye Tracking Study.

Author information

van Dyck Leonard Elia, Kwitt Roland, Denzler Sebastian Jochen, Gruber Walter Roland

Affiliations

Department of Psychology, University of Salzburg, Salzburg, Austria.

Center for Cognitive Neuroscience, University of Salzburg, Salzburg, Austria.

Publication information

Front Neurosci. 2021 Oct 6;15:750639. doi: 10.3389/fnins.2021.750639. eCollection 2021.

Abstract

Deep convolutional neural networks (DCNNs) and the ventral visual pathway share vast architectural and functional similarities in visual challenges such as object recognition. Recent insights have demonstrated that both hierarchical cascades can be compared in terms of both exerted behavior and underlying activation. However, these approaches ignore key differences in spatial priorities of information processing. In this proof-of-concept study, we demonstrate a comparison of human observers (n = 45) and three feedforward DCNNs through eye tracking and saliency maps. The results reveal fundamentally different resolutions in both visualization methods that need to be considered for an insightful comparison. Moreover, we provide evidence that a DCNN with biologically plausible receptive field sizes shows higher agreement with human viewing behavior than a standard ResNet architecture. We find that image-specific factors such as category, animacy, arousal, and valence have a direct link to the agreement of spatial object recognition priorities in humans and DCNNs, while other measures such as difficulty and general image properties do not. With this approach, we try to open up new perspectives at the intersection of biological and computer vision research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a862/8526843/d65cc00da926/fnins-15-750639-g001.jpg
