Olasagasti Itsaso, Giraud Anne-Lise
Department of Basic Neuroscience, University of Geneva, Geneva, Switzerland.
Elife. 2020 Mar 30;9:e44516. doi: 10.7554/eLife.44516.
Speech perception presumably arises from internal models of how specific sensory features are associated with speech sounds. These features change constantly (e.g. different speakers, articulation modes), and listeners need to recalibrate their internal models by appropriately weighing new versus old evidence. Models of speech recalibration classically ignore this volatility. The effect of volatility in tasks where sensory cues were associated with arbitrary experimenter-defined categories was well described by models that continuously adapt the learning rate while keeping a single representation of the category. Using neurocomputational modelling, we show that recalibration of speech sound categories is better described by representing the latter at different time scales. We illustrate our proposal by modeling fast recalibration of speech sounds after experiencing the McGurk effect. We propose that working representations of speech categories are driven both by their current environment and by their long-term memory representations.
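The core idea of the abstract can be sketched with a toy delta-rule model. This is an illustrative simplification, not the paper's actual model: the function names, the two learning rates, and the mixing weight are all assumptions. A fast-updating trace tracks the current environment while a slow trace stands in for long-term memory, and the working category representation mixes the two.

```python
# Illustrative sketch (NOT the published model): one speech sound
# category's sensory-feature estimate maintained at two time scales.
# fast_lr, slow_lr, and the mixing weight w are arbitrary assumptions.

def update(trace, observation, learning_rate):
    """Delta-rule update of a category's feature estimate."""
    return trace + learning_rate * (observation - trace)

def recalibrate(observations, fast_lr=0.5, slow_lr=0.01, w=0.7, init=0.0):
    """Track a category mean with a fast and a slow trace; the working
    representation is a weighted mix of the two."""
    fast = slow = init
    working = []
    for x in observations:
        fast = update(fast, x, fast_lr)   # adapts within a few trials
        slow = update(slow, x, slow_lr)   # drifts slowly (long-term memory)
        working.append(w * fast + (1 - w) * slow)
    return working

# A sudden shift in the sensory feature (e.g. a new speaker, or the
# audiovisual conflict of a McGurk stimulus): the fast trace recalibrates
# quickly, while the slow trace keeps the working representation anchored
# near its long-term value.
obs = [0.0] * 20 + [1.0] * 5
trajectory = recalibrate(obs)
```

With a single learning rate the estimate would either chase every fluctuation or adapt too slowly; the two-time-scale mix gives fast recalibration without discarding the stable long-term representation.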