Large language models predict human sensory judgments across six modalities.

Affiliations

Department of Psychology, Princeton University, Princeton, USA.

Department of Computer Science, Princeton University, Princeton, USA.

Publication information

Sci Rep. 2024 Sep 13;14(1):21445. doi: 10.1038/s41598-024-72071-1.

DOI:10.1038/s41598-024-72071-1
PMID:39271909
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11399123/
Abstract

Determining the extent to which the perceptual world can be recovered from language is a longstanding problem in philosophy and cognitive science. We show that state-of-the-art large language models can unlock new insights into this problem by providing a lower bound on the amount of perceptual information that can be extracted from language. Specifically, we elicit pairwise similarity judgments from GPT models across six psychophysical datasets. We show that the judgments are significantly correlated with human data across all domains, recovering well-known representations like the color wheel and pitch spiral. Surprisingly, we find that a model (GPT-4) co-trained on vision and language does not necessarily lead to improvements specific to the visual modality, and provides highly correlated predictions with human data irrespective of whether direct visual input is provided or purely textual descriptors. To study the impact of specific languages, we also apply the models to a multilingual color-naming task. We find that GPT-4 replicates cross-linguistic variation in English and Russian illuminating the interaction of language and perception.
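
The comparison described in the abstract, pairwise similarity judgments from a model set against human judgments of the same stimuli, can be illustrated with a minimal sketch. The abstract does not state which statistic or data layout was used, so the Pearson correlation over the off-diagonal upper triangle of each similarity matrix is an assumption here, and the function names and toy ratings are hypothetical.

```python
import math
from itertools import combinations


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix.

    Each unordered stimulus pair is counted once and self-similarity
    (the diagonal) is ignored.
    """
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def similarity_agreement(human, model):
    """Correlate two similarity matrices over their unique stimulus pairs."""
    return pearson(upper_triangle(human), upper_triangle(model))


# Hypothetical toy ratings for three stimuli (e.g. red, orange, blue),
# on an assumed 0-1 similarity scale. These are not data from the paper.
human = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
model = [
    [1.0, 0.7, 0.2],
    [0.7, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]

r = similarity_agreement(human, model)
print(f"model-human correlation: {r:.3f}")
```

Restricting the correlation to the upper triangle avoids double-counting symmetric pairs and keeps the trivially perfect diagonal from inflating the score.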

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd10/11399123/6e03b314a577/41598_2024_72071_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd10/11399123/244fea52d33c/41598_2024_72071_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd10/11399123/7ffd45b3c3c4/41598_2024_72071_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd10/11399123/b1f555cbb480/41598_2024_72071_Fig3_HTML.jpg
