Max Planck Institute for Human Cognitive and Brain Sciences, Cognition and Plasticity Research Group, Leipzig, Germany.
Queen Mary University of London, London, United Kingdom.
PLoS One. 2024 Nov 22;19(11):e0291099. doi: 10.1371/journal.pone.0291099. eCollection 2024.
Knowledge about personally familiar people and places is extremely rich and varied, involving pieces of semantic information connected in unpredictable ways through past autobiographical memories. In this work, we investigate whether we can capture brain processing of personally familiar people and places using subject-specific memories, after transforming them into vectorial semantic representations using language models. First, we asked participants to provide us with the names of the closest people and places in their lives. Then we collected open-ended answers to a questionnaire aimed at capturing various facets of declarative knowledge. We recorded EEG data from the same participants while they read the names and subsequently mentally visualized their referents. As a control set of stimuli, we also recorded evoked responses to a matched set of famous people and places. We then created original semantic representations for the individual entities using language models. For personally familiar entities, we used the text of the answers to the questionnaire. For famous entities, we employed their Wikipedia pages, which reflect shared declarative knowledge about them. Through whole-scalp time-resolved and searchlight encoding analyses, we found that we could capture how the brain processes one's closest people and places from person-specific questionnaire answers, and famous people and places from their Wikipedia pages. Overall encoding performance was significant in a large time window (200-800ms). Using a spatio-temporal EEG searchlight, we found that we could predict brain responses significantly better than chance earlier (200-500ms) in bilateral temporo-parietal electrodes and later (500-700ms) in frontal and posterior central electrodes. We also found that XLM, a contextualized (or large) language model, provided superior encoding scores compared with a simpler static language model such as word2vec.
Overall, these results indicate that language models can capture subject-specific semantic representations as they are processed in the human brain, by exploiting small-scale distributional lexical data.
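The encoding analysis described above can be sketched as a regression from language-model embeddings to EEG responses, evaluated out-of-sample at each timepoint. The sketch below is a minimal illustration, not the authors' actual pipeline: the data are synthetic stand-ins, and all array sizes, the ridge penalty, and the leave-one-out scheme are assumptions for demonstration only.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)

# Hypothetical sizes: 32 entities, 50-dim embeddings, EEG of
# 64 channels x 100 timepoints per entity (all illustrative).
n_entities, n_dims, n_channels, n_times = 32, 50, 64, 100
embeddings = rng.standard_normal((n_entities, n_dims))

# Synthetic EEG that partly depends on the embeddings, so the
# encoding model has a real signal to recover.
proj = rng.standard_normal((n_dims, n_channels * n_times))
eeg = embeddings @ proj + 5.0 * rng.standard_normal((n_entities, n_channels * n_times))
eeg = eeg.reshape(n_entities, n_channels, n_times)

def encoding_score(X, Y):
    """Leave-one-entity-out ridge encoding: predict the channel
    pattern of the held-out entity from its embedding, and return
    the mean Pearson r between predicted and observed patterns."""
    preds = np.zeros_like(Y)
    for train, test in LeaveOneOut().split(X):
        model = Ridge(alpha=1.0).fit(X[train], Y[train])
        preds[test] = model.predict(X[test])
    rs = [np.corrcoef(preds[i], Y[i])[0, 1] for i in range(len(Y))]
    return float(np.mean(rs))

# Time-resolved scores: one encoding fit per sampled timepoint.
scores = [encoding_score(embeddings, eeg[:, :, t]) for t in range(0, n_times, 20)]
```

In a real analysis, significance of these per-timepoint scores against chance would be assessed with a permutation or bootstrap test across participants, and the searchlight variant would restrict `Y` to small spatio-temporal neighborhoods of electrodes.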