Kauf Carina, Tuckute Greta, Levy Roger, Andreas Jacob, Fedorenko Evelina
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology.
McGovern Institute for Brain Research, Massachusetts Institute of Technology.
bioRxiv. 2023 May 6:2023.05.05.539646. doi: 10.1101/2023.05.05.539646.
Representations from artificial neural network (ANN) language models have been shown to predict human brain activity in the language network. To understand what aspects of linguistic stimuli contribute to ANN-to-brain similarity, we used an fMRI dataset of responses to n=627 naturalistic English sentences (Pereira et al., 2018) and systematically manipulated the stimuli for which ANN representations were extracted. In particular, we i) perturbed sentences' word order, ii) removed different subsets of words, or iii) replaced sentences with other sentences of varying semantic similarity. We found that the lexical-semantic content of the sentence (largely carried by content words), rather than the sentence's syntactic form (conveyed via word order or function words), is primarily responsible for the ANN-to-brain similarity. In follow-up analyses, we found that perturbation manipulations that adversely affect brain predictivity also lead to more divergent representations in the ANN's embedding space and decrease the ANN's ability to predict upcoming tokens in those stimuli. Further, results are robust to whether the mapping model is trained on intact or perturbed stimuli, and to whether the ANN sentence representations are conditioned on the same linguistic context that humans saw. The critical result, that lexical-semantic content is the main contributor to the similarity between ANN representations and neural ones, aligns with the idea that the goal of the human language system is to extract meaning from linguistic strings. Finally, this work highlights the strength of systematic experimental manipulations for evaluating how close we are to accurate and generalizable models of the human language network.
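The first two stimulus manipulations named in the abstract (perturbing word order and removing subsets of words) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the fixed random seed, and the tiny `FUNCTION_WORDS` list are all assumptions made for illustration, and a real analysis would more plausibly separate content from function words using part-of-speech tags.

```python
import random

# Hypothetical, deliberately tiny function-word list (an assumption for this
# sketch); it is not the word classification used in the study.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "to", "and",
    "is", "are", "was", "by", "with",
}


def scramble_word_order(sentence, seed=0):
    """Return the sentence with the same words in a randomly permuted order."""
    words = sentence.split()
    rng = random.Random(seed)  # local generator so the result is reproducible
    rng.shuffle(words)
    return " ".join(words)


def keep_content_words(sentence):
    """Drop every word found in FUNCTION_WORDS, keeping the original order."""
    return " ".join(
        w for w in sentence.split() if w.lower() not in FUNCTION_WORDS
    )


def keep_function_words(sentence):
    """Keep only the words found in FUNCTION_WORDS, in the original order."""
    return " ".join(
        w for w in sentence.split() if w.lower() in FUNCTION_WORDS
    )


if __name__ == "__main__":
    example = "The apple is on the table"
    print(scramble_word_order(example))
    print(keep_content_words(example))
    print(keep_function_words(example))
```

In a setup like the one the abstract describes, each perturbed string would then be passed to the language model in place of the intact sentence, and the resulting representations would be compared on how well they predict the fMRI responses.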