Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; School of Communication Sciences and Disorders, University of Memphis, Memphis, TN, USA; Division of Clinical Neuroscience, School of Medicine, Hearing Sciences - Scottish Section, University of Nottingham, Glasgow, Scotland, UK.
Department of Speech, Language and Hearing Sciences, Indiana University, Bloomington, IN, USA; Program in Neuroscience, Indiana University, Bloomington, IN, USA.
Neuroimage. 2023 Apr 1;269:119899. doi: 10.1016/j.neuroimage.2023.119899. Epub 2023 Jan 28.
The brain transforms continuous acoustic events into discrete category representations to downsample the speech signal for our perceptual-cognitive systems. Such phonetic categories are highly malleable, and their percepts can change depending on surrounding stimulus context. Previous work suggests that this acoustic-phonetic mapping and perceptual warping of speech emerge in the brain no earlier than auditory cortex. Here, we examined whether these auditory-category phenomena inherent to speech perception occur even earlier in the human brain, at the level of the auditory brainstem. We recorded speech-evoked frequency following responses (FFRs) during a task designed to induce more/less warping of listeners' perceptual categories depending on stimulus presentation order of a speech continuum (random, forward, backward directions). We used a novel clustered stimulus paradigm to rapidly record the high trial counts needed for FFRs concurrent with active behavioral tasks. We found that serial stimulus order caused perceptual shifts (hysteresis) near listeners' category boundary, confirming that identical speech tokens are perceived differentially depending on stimulus context. Critically, we further show that neural FFRs during active (but not passive) listening are enhanced for prototypical vs. category-ambiguous tokens and are biased in the direction of listeners' phonetic label even for acoustically identical speech stimuli. These effects were observed neither in the stimulus acoustics nor in model FFRs generated via a computational model of cochlear and auditory nerve transduction, confirming a central origin to the effects. Our data reveal that FFRs carry category-level information and suggest that top-down processing actively shapes the neural encoding and categorization of speech at subcortical levels. These findings suggest the acoustic-phonetic mapping and perceptual warping in speech perception occur surprisingly early along the auditory neuroaxis, which might aid understanding by reducing ambiguity inherent to the speech signal.
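The behavioral hysteresis described above can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes the category boundary is taken as the 50% crossover of an identification function (found here by linear interpolation), and the seven-token continuum, the response proportions, and the function name `category_boundary` are all hypothetical values chosen for illustration, including the direction of the shift.

```python
from typing import Sequence


def category_boundary(tokens: Sequence[float], p_label_b: Sequence[float]) -> float:
    """Return the continuum position where identification crosses 50%.

    tokens    : token positions along the continuum (ascending)
    p_label_b : proportion of trials on which each token received label "B"

    Assumption: the boundary is the first upward 50% crossing, located by
    linear interpolation between the two tokens that bracket it.
    """
    for i in range(len(tokens) - 1):
        lo, hi = p_label_b[i], p_label_b[i + 1]
        if lo < 0.5 <= hi:
            frac = (0.5 - lo) / (hi - lo)
            return tokens[i] + frac * (tokens[i + 1] - tokens[i])
    raise ValueError("identification function never crosses 50%")


# Hypothetical identification data (NOT from the study): the same seven
# tokens presented in forward vs. backward serial order.
tokens = [1, 2, 3, 4, 5, 6, 7]
forward = [0.0, 0.0, 0.1, 0.3, 0.7, 1.0, 1.0]
backward = [0.0, 0.1, 0.3, 0.7, 0.9, 1.0, 1.0]

b_fwd = category_boundary(tokens, forward)
b_bwd = category_boundary(tokens, backward)

# A nonzero difference means the same tokens near the boundary were labeled
# differently depending on presentation order, i.e. a context-dependent shift.
shift = b_fwd - b_bwd
print(f"forward boundary:  {b_fwd:.2f}")
print(f"backward boundary: {b_bwd:.2f}")
print(f"boundary shift:    {shift:.2f} tokens")
```

A fitted sigmoid would normally replace the interpolation step when responses are noisy; the interpolation is used here only to keep the sketch dependency-free.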