Strong Prediction: Language Model Surprisal Explains Multiple N400 Effects.

Authors

Michaelov James A, Bardolph Megan D, Van Petten Cyma K, Bergen Benjamin K, Coulson Seana

Affiliations

Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA.

Department of Psychology, Binghamton University, State University of New York, Binghamton, NY, USA.

Publication information

Neurobiol Lang (Camb). 2024 Apr 1;5(1):107-135. doi: 10.1162/nol_a_00105. eCollection 2024.

Abstract

Theoretical accounts of the N400 are divided as to whether the amplitude of the N400 response to a stimulus reflects the extent to which the stimulus was predicted, the extent to which the stimulus is semantically similar to its preceding context, or both. We use state-of-the-art machine learning tools to investigate which of these three accounts is best supported by the evidence. GPT-3, a neural language model trained to compute the conditional probability of any word based on the words that precede it, was used to operationalize contextual predictability. In particular, we used an information-theoretic construct known as surprisal (the negative logarithm of the conditional probability). Contextual semantic similarity was operationalized by using two high-quality co-occurrence-derived vector-based meaning representations for words: GloVe and fastText. The cosine between the vector representation of the sentence frame and final word was used to derive contextual cosine similarity estimates. A series of regression models were constructed, where these variables, along with cloze probability and plausibility ratings, were used to predict single trial N400 amplitudes recorded from healthy adults as they read sentences whose final word varied in its predictability, plausibility, and semantic relationship to the likeliest sentence completion. Statistical model comparison indicated GPT-3 surprisal provided the best account of N400 amplitude and suggested that apparently disparate N400 effects of expectancy, plausibility, and contextual semantic similarity can be reduced to variation in the predictability of words. The results are argued to support predictive coding in the human language network.
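The two predictors defined in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the probability and the word vectors below are made-up toy values rather than GPT-3, GloVe, or fastText output, the logarithm base (2, giving bits) is an assumption, and averaging word vectors to represent the sentence frame is an assumption, since the abstract does not say how the frame vector was built. The function names are hypothetical.

```python
import math


def surprisal(prob, base=2.0):
    """Surprisal = negative log of a word's conditional probability.

    The base is an assumption; base 2 expresses the result in bits.
    """
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(prob) / math.log(base)


def mean_vector(vectors):
    """Average a list of equal-length vectors (assumed frame representation)."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy values only: a hypothetical conditional probability for a final word
# and hypothetical 3-dimensional embeddings for the frame words.
p_final_word = 0.25
frame_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
final_word_vector = [1.0, 1.0, 0.0]

print(surprisal(p_final_word))
print(cosine_similarity(mean_vector(frame_vectors), final_word_vector))
```

A less probable final word yields a larger surprisal, and a final word whose vector points in a direction closer to the frame vector yields a cosine closer to 1. These are the quantities that, per the abstract, were entered alongside cloze probability and plausibility ratings into the regression models.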
