Neural correlates of word representation vectors in natural language processing models: Evidence from representational similarity analysis of event-related brain potentials.

Affiliations

Center for Neuroscience, University of California, Davis, California, USA.

Department of Linguistics, University of California, Davis, California, USA.

Publication information

Psychophysiology. 2022 Mar;59(3):e13976. doi: 10.1111/psyp.13976. Epub 2021 Nov 24.

Abstract

Natural language processing models based on machine learning (ML-NLP models) have been developed to solve practical problems, such as interpreting an Internet search query. These models are not intended to reflect human language comprehension mechanisms, and the word representations used by ML-NLP models and human brains might therefore be quite different. However, because ML-NLP models are trained with the same kinds of inputs that humans must process, and they must solve many of the same computational problems as the human brain, ML-NLP models and human brains may end up with similar word representations. To distinguish between these hypotheses, we used representational similarity analysis to compare the representational geometry of word representations in two ML-NLP models with the representational geometry of the human brain, as indexed with event-related potentials (ERPs). Participants listened to stories while the electroencephalogram was recorded. We extracted averaged ERPs for each of the 100 words that occurred most frequently in the stories, and we calculated the similarity of the neural response for each pair of words. We compared this 100 × 100 similarity matrix to the 100 × 100 similarity matrix for the word pairs according to two ML-NLP models. We found significant representational similarity between the neural data and each ML-NLP model, beginning within 250 ms of word onset. These results indicate that ML-NLP systems that are designed to solve practical technology problems have a representational geometry that is correlated with that of the human brain, presumably because both are influenced by the structural properties and statistics of language.
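The comparison described in the abstract can be illustrated with a minimal sketch of representational similarity analysis. Everything specific here is an assumption for illustration, not the authors' pipeline: the data are synthetic, the neural similarity is taken as Pearson correlation between per-word ERP feature vectors, the model similarity as cosine similarity between word vectors, and the two matrices are compared with a rank correlation over their upper triangles. The abstract does not state which measures were actually used.

```python
import numpy as np


def correlation_similarity(patterns):
    """Pairwise Pearson correlation between rows (one row per word)."""
    return np.corrcoef(patterns)


def cosine_similarity(vectors):
    """Pairwise cosine similarity between rows (one row per word)."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries: one value per unordered word pair."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def ordinal_ranks(values):
    """Ranks 0..n-1. Assumes no ties, which holds for continuous synthetic data."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def rsa_score(similarity_a, similarity_b):
    """Rank correlation between the word-pair similarities of two matrices."""
    ranks_a = ordinal_ranks(upper_triangle(similarity_a))
    ranks_b = ordinal_ranks(upper_triangle(similarity_b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


# Synthetic stand-ins (assumptions): 100 words, 50-dimensional word vectors,
# and 32 ERP features per word (for example, channels at one time point).
rng = np.random.default_rng(0)
n_words, n_dims, n_features = 100, 50, 32
embeddings = rng.normal(size=(n_words, n_dims))
projection = rng.normal(size=(n_dims, n_features))
erp_patterns = embeddings @ projection + rng.normal(scale=5.0, size=(n_words, n_features))

neural_similarity = correlation_similarity(erp_patterns)  # 100 x 100
model_similarity = cosine_similarity(embeddings)          # 100 x 100
score = rsa_score(neural_similarity, model_similarity)
print(neural_similarity.shape, model_similarity.shape, round(score, 3))
```

Only the upper triangle is compared because both matrices are symmetric and their diagonals are trivially 1. Repeating the neural step at successive time points after word onset would give a time course of representational similarity, which is the kind of result the abstract reports.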
