Department of Fundamental Neuroscience, Faculty of Medicine, University of Geneva, Geneva, Switzerland.
Swiss National Centre of Competence in Research "Evolving Language" (NCCR EvolvingLanguage), Geneva, Switzerland.
PLoS Biol. 2023 Mar 22;21(3):e3002046. doi: 10.1371/journal.pbio.3002046. eCollection 2023 Mar.
Understanding speech requires mapping fleeting and often ambiguous soundwaves to meaning. While humans are known to exploit contextual knowledge to facilitate this process, how internal knowledge is deployed online remains an open question. Here, we present a model that extracts multiple levels of information from continuous speech online. The model applies linguistic and nonlinguistic knowledge to speech processing by periodically generating top-down predictions and incorporating bottom-up incoming evidence in a nested temporal hierarchy. We show that a nonlinguistic context level provides semantic predictions informed by sensory inputs, which are crucial for disambiguating among multiple meanings of the same word. The explicit knowledge hierarchy of the model enables a more holistic account of the neurophysiological responses to speech than lexical predictions generated by a neural network language model (GPT-2). We also show that hierarchical predictions reduce peripheral processing by minimizing uncertainty and prediction error. With this proof-of-concept model, we demonstrate that the deployment of hierarchical predictions is a possible strategy for the brain to dynamically utilize structured knowledge and make sense of the speech input.
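The disambiguation principle described in the abstract — a top-down contextual prediction combined with ambiguous bottom-up acoustic evidence — can be illustrated with a minimal Bayesian sketch. This is not the authors' model; the candidate meanings, prior, and likelihood values below are invented for illustration only.

```python
# Illustrative sketch (NOT the paper's model): a context-level prior
# combined with bottom-up acoustic evidence via Bayes' rule to
# disambiguate a homophone. All numbers here are made up.

def posterior(prior, likelihood):
    """Combine a top-down prior with a bottom-up likelihood (Bayes' rule)."""
    unnorm = {w: prior[w] * likelihood[w] for w in prior}
    z = sum(unnorm.values())
    return {w: p / z for w, p in unnorm.items()}

# An ambiguous sound like /bank/: the acoustics alone cannot decide
# between the two candidate meanings.
acoustic_likelihood = {"bank_river": 0.5, "bank_money": 0.5}

# A nonlinguistic context level (e.g., a conversation about finance)
# biases the semantic prediction before the word is fully heard.
finance_context_prior = {"bank_river": 0.1, "bank_money": 0.9}

print(posterior(finance_context_prior, acoustic_likelihood))
# {'bank_river': 0.1, 'bank_money': 0.9}
```

With a flat likelihood, the posterior simply inherits the contextual prior, showing how top-down predictions can resolve ambiguity that the sensory signal leaves open; stronger acoustic evidence would pull the posterior away from the prior.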