Gromov-Wasserstein unsupervised alignment reveals structural correspondences between the color similarity structures of humans and large language models.

Affiliations

Department of Bioengineering, Imperial College London, London, UK.

School of Psychological Sciences, Monash University, Melbourne, Australia.

Publication information

Sci Rep. 2024 Jul 10;14(1):15917. doi: 10.1038/s41598-024-65604-1.

DOI: 10.1038/s41598-024-65604-1
PMID: 38987348
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11237038/
Abstract

Large Language Models (LLMs), such as the Generative Pre-trained Transformer (GPT), have shown remarkable performance in various cognitive tasks. However, it remains unclear whether these models can accurately infer human perceptual representations. Previous research has addressed this question by quantifying correlations between the similarity response patterns of humans and LLMs. Correlation provides a measure of similarity, but it relies on pre-defined item labels and does not distinguish category-level from item-level similarity, so it falls short of characterizing the detailed structural correspondence between humans and LLMs. To assess their structural equivalence in more detail, we propose the use of an unsupervised alignment method based on Gromov-Wasserstein optimal transport (GWOT). GWOT allows similarity structures to be compared without relying on pre-defined label correspondences and can reveal fine-grained structural similarities and differences that simple correlation analysis may not detect. Using a large dataset of similarity judgments of 93 colors, we compared the color similarity structures of humans (color-neurotypical and color-atypical participants) and two GPT models (GPT-3.5 and GPT-4). Our results show that the similarity structure of color-neurotypical participants can be remarkably well aligned with that of GPT-4 and, to a lesser extent, with that of GPT-3.5. These results contribute to methodological advances in comparing LLMs with human perception, and highlight the potential of unsupervised alignment methods to reveal detailed structural correspondences.
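The idea of aligning two similarity structures without using item labels can be illustrated with a small sketch. This is not the authors' pipeline: GWOT optimizes over soft transport plans, whereas the toy below restricts the Gromov-Wasserstein objective to one-to-one matchings and brute-forces it, which is only feasible for a handful of items. The six synthetic "items", the names `D_human` and `D_model`, and the helper functions are all hypothetical and chosen purely for illustration.

```python
import itertools
import numpy as np

def gw_distortion(D1, D2, perm):
    """Squared mismatch between D1[i, k] and D2[perm[i], perm[k]] over all pairs."""
    perm = np.asarray(perm)
    return float(np.sum((D1 - D2[np.ix_(perm, perm)]) ** 2))

def align_unsupervised(D1, D2):
    """Brute-force the one-to-one matching with the lowest distortion.

    Stand-in for a Gromov-Wasserstein solver; tractable only for tiny n.
    """
    n = D1.shape[0]
    best = min(itertools.permutations(range(n)),
               key=lambda p: gw_distortion(D1, D2, p))
    return np.array(best)

def upper_triangle_corr(A, B):
    """Pearson correlation of the off-diagonal upper triangles (label-dependent)."""
    iu = np.triu_indices(A.shape[0], k=1)
    return float(np.corrcoef(A[iu], B[iu])[0, 1])

rng = np.random.default_rng(0)
n = 6                                    # toy size, not the 93 colors of the study
points = rng.normal(size=(n, 3))         # hypothetical item embeddings
D_human = np.linalg.norm(points[:, None] - points[None, :], axis=-1)

# Same structure, but with the item labels hidden behind an unknown relabelling.
shuffle = rng.permutation(n)
D_model = D_human[np.ix_(shuffle, shuffle)]
true_match = np.argsort(shuffle)         # ground truth, used only for evaluation

found = align_unsupervised(D_human, D_model)
matching_rate = float(np.mean(found == true_match))
distortion = gw_distortion(D_human, D_model, found)

# Correlation computed with mismatched labels versus after the alignment.
corr_raw = upper_triangle_corr(D_human, D_model)
corr_aligned = upper_triangle_corr(D_human, D_model[np.ix_(found, found)])

print("matching rate:", matching_rate)
print("correlation before / after alignment:", corr_raw, corr_aligned)
```

The matching is found from the distance matrices alone; the true correspondence enters only when scoring the result. The correlation, by contrast, depends on which rows are assumed to refer to the same item, so it changes once the recovered matching is applied.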

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee35/11237038/f54133285013/41598_2024_65604_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee35/11237038/77880a84e293/41598_2024_65604_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee35/11237038/5b3867ad7848/41598_2024_65604_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee35/11237038/a83f88b1ffd3/41598_2024_65604_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee35/11237038/5d105e97eb9c/41598_2024_65604_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee35/11237038/bb029640f201/41598_2024_65604_Fig6_HTML.jpg
