Mukherjee, Kushin; Rogers, Timothy T.
Department of Psychology & Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA.
Mem Cognit. 2025 Jan;53(1):219-241. doi: 10.3758/s13421-024-01580-1. Epub 2024 May 30.
Early in life and without special training, human beings discern resemblance between abstract visual stimuli, such as drawings, and the real-world objects they represent. We used this capacity for visual abstraction as a tool for evaluating deep neural networks (DNNs) as models of human visual perception. Contrasting five contemporary DNNs, we evaluated how well each explains human similarity judgments among line drawings of recognizable and novel objects. For object sketches, human judgments were dominated by semantic category information; DNN representations contributed little additional information. In contrast, such features explained significant unique variance in the perceived similarity of abstract drawings. In both cases, a vision transformer trained to blend representations of images and their natural language descriptions showed the greatest ability to explain human perceptual similarity, an observation consistent with contemporary views of semantic representation and processing in the human mind and brain. Together, the results suggest that the building blocks of visual similarity may arise within systems that learn to use visual information, not for specific classification, but in service of generating semantic representations of objects.