Visual number sense for real-world scenes shared by deep neural networks and humans.

Authors

Wencheng Wu, Ge Yingxi, Zuo Zhentao, Chen Lin, Qin Xu, Zuxiang Liu

Affiliations

AHU-IAI AI Joint Laboratory, Anhui University, Hefei, 230601, China.

Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230088, China.

Publication information

Heliyon. 2023 Jul 24;9(8):e18517. doi: 10.1016/j.heliyon.2023.e18517. eCollection 2023 Aug.

Abstract

Recently, visual number sense has been identified from deep neural networks (DNNs). However, whether DNNs have the same capacity for real-world scenes, rather than the simple geometric figures that are often tested, is unclear. In this study, we explore the number perception of scenes using AlexNet and find that numerosity can be represented by the pattern of group activation of the category layer units. The global activation of these units increases with the number of objects in the scene, and the variations in their activation decrease accordingly. By decoding the numerosity from this pattern, we reveal that the embedding coefficient of a scene determines the likelihood of potential objects to contribute to numerical perception. This was demonstrated by the more optimized performance for pictures with relatively high embedding coefficients in both DNNs and humans. This study for the first time shows that a distinct feature in visual environments, revealed by DNNs, can modulate human perception, supported by a group-coding mechanism.
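The two group-level quantities mentioned in the abstract, the global activation of the category layer units and the variation in their activation, can be illustrated with a minimal sketch. It assumes that "global activation" is the mean across units and "variation" is the population standard deviation; the abstract does not specify the exact statistics. The function name `group_activation_summary` and the toy vectors are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def group_activation_summary(activations):
    """Return (global_activation, variation) for one image's unit activations.

    Assumption: global activation = mean across units, variation =
    population standard deviation across units. The abstract does not
    state which statistics the authors used.
    """
    values = [float(a) for a in activations]
    if not values:
        raise ValueError("activations must be non-empty")
    return mean(values), pstdev(values)


# Hand-made toy vectors (NOT data from the study), used only to show the call.
toy_scene_a = [0.0, 0.0, 4.0, 0.0]
toy_scene_b = [2.0, 3.0, 2.0, 3.0]

for name, vec in (("toy_scene_a", toy_scene_a), ("toy_scene_b", toy_scene_b)):
    global_act, variation = group_activation_summary(vec)
    print(f"{name}: global activation={global_act:.3f}, variation={variation:.3f}")
```

In practice the input vector would be the category layer output of AlexNet for a single scene image; how those activations are extracted is outside the scope of this sketch.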

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63e3/10407052/aa5cf8d5567e/gr1.jpg
