Department of Psychology, Humboldt-Universität zu Berlin.
Department of Psychology, University of Milano-Bicocca.
Psychol Rev. 2023 Jul;130(4):896-934. doi: 10.1037/rev0000392. Epub 2022 Oct 6.
Quantitative, data-driven models of mental representations have long enjoyed popularity and success in psychology (e.g., distributional semantic models in the language domain) but have largely been missing for the visual domain. To overcome this, we present ViSpa (Vision Spaces), high-dimensional vector spaces that include vision-based representations of naturalistic images as well as concept prototypes. These vectors are derived directly from visual stimuli through a deep convolutional neural network trained to classify images and allow us to compute vision-based similarity scores between any pair of images and/or concept prototypes. We successfully evaluate these similarities against human behavioral data in a series of large-scale studies, including offline judgments (visual similarity judgments for the referents of word pairs in Study 1 and for image pairs in Study 2, and typicality judgments for images given a label in Study 3) as well as online processing times and error rates in a discrimination task (Study 4) and a priming task (Study 5) with naturalistic image material. ViSpa similarities predict behavioral data across all tasks, which renders ViSpa a theoretically appealing model for vision-based representations and a valuable research tool for data analysis and the construction of experimental material: ViSpa allows for precise control over experimental material consisting of images and/or words denoting imageable concepts and introduces a specifically vision-based similarity measure for word pairs. To make ViSpa available to a wide audience, this article (a) includes (video) tutorials on how to use ViSpa in R and (b) presents a user-friendly web interface at http://vispa.fritzguenther.de. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
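The core idea of the abstract, deriving vectors from a classification DNN, averaging them into concept prototypes, and scoring pairs by vector similarity, can be sketched conceptually. The snippet below is a minimal illustration, not the authors' actual pipeline: the toy four-dimensional vectors stand in for real DNN activations, and the function names (`cosine_similarity`, `concept_prototype`) are hypothetical; cosine similarity is a common choice for such vector spaces, though the paper's exact metric is not specified here.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (range -1..1)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def concept_prototype(image_vectors):
    """A concept prototype, illustrated here as the mean of the
    feature vectors of all images depicting that concept."""
    return np.mean(image_vectors, axis=0)

# Toy stand-ins for DNN activation vectors of individual images.
dog_images = np.array([[0.9, 0.1, 0.2, 0.0],
                       [0.8, 0.2, 0.1, 0.1]])
cat_image = np.array([0.7, 0.3, 0.2, 0.1])

# Image-to-prototype similarity, as used to predict e.g. typicality.
dog_proto = concept_prototype(dog_images)
sim = cosine_similarity(dog_proto, cat_image)
print(sim)
```

In the same spirit, a word-pair similarity would compare two concept prototypes, and an image-pair similarity would compare two individual image vectors.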