Sassenhagen Jona, Fiebach Christian J
Department of Psychology, Goethe University Frankfurt, Germany.
Brain Imaging Center, Goethe University Frankfurt, Germany.
Neurobiol Lang (Camb). 2020 Mar 1;1(1):54-76. doi: 10.1162/nol_a_00003. eCollection 2020.
How is semantic information stored in the human mind and brain? Some philosophers and cognitive scientists argue for vectorial representations of concepts, where the meaning of a word is represented as its position in a high-dimensional neural state space. At the intersection of natural language processing and artificial intelligence, a class of very successful distributional word vector models has developed that can account for classic EEG findings of language, that is, the ease versus difficulty of integrating a word with its sentence context. However, models of semantics have to account not only for context-based word processing, but should also describe how word meaning is represented. Here, we investigate whether distributional vector representations of word meaning can model brain activity induced by words presented without context. Using EEG activity (event-related brain potentials) collected while participants in two experiments (English and German) read isolated words, we encoded and decoded word vectors taken from the family of prediction-based Word2vec algorithms. We found that, first, the position of a word in vector space allows the prediction of the pattern of corresponding neural activity over time, in particular during a time window of 300 to 500 ms after word onset. Second, distributional models perform better than a human-created taxonomic baseline model (WordNet), and this holds for several distinct vector-based models. Third, multiple latent semantic dimensions of word meaning can be decoded from brain activity. Combined, these results suggest that empiricist, prediction-based vectorial representations of meaning are a viable candidate for the representational architecture of human semantic knowledge.
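As a rough illustration of the encoding analysis summarized above, the following sketch fits a cross-validated ridge regression from word vectors to EEG features and scores it on held-out words. It is a minimal sketch under stated assumptions, not the authors' pipeline: the data are simulated, and the function names (`ridge_fit`, `cv_encoding_score`, `simulate`), the choice of ridge regression, the penalty `alpha`, and the per-word correlation score are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X'X + alpha*I)^-1 X'Y."""
    n_dims = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_dims), X.T @ Y)


def cv_encoding_score(vectors, eeg, alpha=1.0, n_folds=5, seed=0):
    """Mean correlation between predicted and observed EEG for held-out words.

    vectors: (n_words, n_dims) word embeddings (hypothetical input)
    eeg:     (n_words, n_features) EEG features per word, e.g. flattened
             channel-by-time amplitudes (hypothetical input)
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(vectors)), n_folds)
    scores = []
    for test_idx in folds:
        train_mask = np.ones(len(vectors), dtype=bool)
        train_mask[test_idx] = False
        X_train, Y_train = vectors[train_mask], eeg[train_mask]
        # Centre with training-set statistics only, to avoid leakage.
        x_mean, y_mean = X_train.mean(axis=0), Y_train.mean(axis=0)
        W = ridge_fit(X_train - x_mean, Y_train - y_mean, alpha)
        pred = (vectors[test_idx] - x_mean) @ W + y_mean
        # One correlation per held-out word, computed across EEG features.
        for p, o in zip(pred, eeg[test_idx]):
            scores.append(np.corrcoef(p, o)[0, 1])
    return float(np.mean(scores))


def simulate(n_words=200, n_dims=10, n_features=30, noise=1.0, seed=0):
    """Simulated data: EEG is a noisy linear function of the word vectors."""
    rng = np.random.default_rng(seed)
    vectors = rng.standard_normal((n_words, n_dims))
    true_map = rng.standard_normal((n_dims, n_features))
    eeg = vectors @ true_map + noise * rng.standard_normal((n_words, n_features))
    return vectors, eeg


if __name__ == "__main__":
    vectors, eeg = simulate()
    score = cv_encoding_score(vectors, eeg)
    # Control: break the word-to-vector assignment by shuffling rows.
    shuffled = vectors[np.random.default_rng(1).permutation(len(vectors))]
    control = cv_encoding_score(shuffled, eeg)
    print(f"encoding score: {score:.3f}, shuffled control: {control:.3f}")
```

The shuffled control keeps the vectors and the EEG data unchanged but destroys their pairing, so any remaining score reflects chance. Decoding, in the sense used above, would run the same regression in the opposite direction, predicting vector dimensions from EEG features.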