Lexical semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network.

Authors

Carina Kauf, Greta Tuckute, Roger Levy, Jacob Andreas, Evelina Fedorenko

Affiliations

Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology.

McGovern Institute for Brain Research, Massachusetts Institute of Technology.

Publication information

bioRxiv. 2023 May 6:2023.05.05.539646. doi: 10.1101/2023.05.05.539646.

DOI: 10.1101/2023.05.05.539646

PMID: 37205405

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10187317/

Abstract

Representations from artificial neural network (ANN) language models have been shown to predict human brain activity in the language network. To understand what aspects of linguistic stimuli contribute to ANN-to-brain similarity, we used an fMRI dataset of responses to n = 627 naturalistic English sentences (Pereira et al., 2018) and systematically manipulated the stimuli for which ANN representations were extracted. In particular, we i) perturbed sentences' word order, ii) removed different subsets of words, or iii) replaced sentences with other sentences of varying semantic similarity. We found that the lexical-semantic content of the sentence (largely carried by content words) rather than the sentence's syntactic form (conveyed via word order or function words) is primarily responsible for the ANN-to-brain similarity. In follow-up analyses, we found that perturbation manipulations that adversely affect brain predictivity also lead to more divergent representations in the ANN's embedding space and decrease the ANN's ability to predict upcoming tokens in those stimuli. Further, results are robust to whether the mapping model is trained on intact or perturbed stimuli, and to whether the ANN sentence representations are conditioned on the same linguistic context that humans saw. The critical result, that lexical-semantic content is the main contributor to the similarity between ANN representations and neural ones, aligns with the idea that the goal of the human language system is to extract meaning from linguistic strings. Finally, this work highlights the strength of systematic experimental manipulations for evaluating how close we are to accurate and generalizable models of the human language network.
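The first two manipulations named in the abstract (perturbing word order and removing subsets of words) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the whitespace tokenization, and the tiny `FUNCTION_WORDS` list are all assumptions made for illustration, and the abstract does not specify how word classes were actually defined.

```python
import random

# Hypothetical, deliberately small function-word list (an assumption for
# illustration; the abstract does not define the word classes used).
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "at", "to", "and", "or",
    "is", "are", "was", "were", "be", "by", "with", "that", "it",
}


def _is_function_word(token: str) -> bool:
    """Check a whitespace token against the list, ignoring case and edge punctuation."""
    return token.lower().strip(".,;:!?") in FUNCTION_WORDS


def scramble_word_order(sentence: str, seed: int = 0) -> str:
    """Shuffle the words of a sentence: same words, different order."""
    words = sentence.split()
    random.Random(seed).shuffle(words)  # seeded for reproducibility
    return " ".join(words)


def keep_content_words(sentence: str) -> str:
    """Drop function words, keeping the remaining words in their original order."""
    return " ".join(w for w in sentence.split() if not _is_function_word(w))


def keep_function_words(sentence: str) -> str:
    """Drop content words, keeping only the function words."""
    return " ".join(w for w in sentence.split() if _is_function_word(w))


if __name__ == "__main__":
    s = "The bird sat on the branch."
    print(scramble_word_order(s))
    print(keep_content_words(s))   # bird sat branch.
    print(keep_function_words(s))  # The on the
```

Under the abstract's finding, ANN representations of the content-word-only variant would be expected to retain more brain predictivity than those of the function-word-only variant; the sketch only produces the stimuli and does not test that claim.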


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b844/10187317/4b96da6b7947/nihpp-2023.05.05.539646v1-f0001.jpg

Published version

Lexical-Semantic Content, Not Syntactic Structure, Is the Main Contributor to ANN-Brain Similarity of fMRI Responses in the Language Network.

Neurobiol Lang (Camb). 2024 Apr 1;5(1):7-42. doi: 10.1162/nol_a_00116. eCollection 2024.

