Institute for Systems Research, University of Maryland, College Park, MD 20742, USA.
Department of Psychiatry, Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA.
Curr Biol. 2018 Dec 17;28(24):3976-3983.e5. doi: 10.1016/j.cub.2018.10.042. Epub 2018 Nov 29.
During speech perception, a central task of the auditory cortex is to analyze complex acoustic patterns to allow detection of the words that encode a linguistic message [1]. It is generally thought that this process includes at least one intermediate, phonetic, level of representations [2-6], localized bilaterally in the superior temporal lobe [7-9]. Phonetic representations reflect a transition from acoustic to linguistic information, classifying acoustic patterns into linguistically meaningful units, which can serve as input to mechanisms that access abstract word representations [10, 11]. While recent research has identified neural signals arising from successful recognition of individual words in continuous speech [12-15], no explicit neurophysiological signal has been found demonstrating the transition from acoustic and/or phonetic to symbolic, lexical representations. Here, we report a response reflecting the incremental integration of phonetic information for word identification, dominantly localized to the left temporal lobe. The short response latency, approximately 114 ms relative to phoneme onset, suggests that phonetic information is used for lexical processing as soon as it becomes available. Responses also tracked word boundaries, confirming previous reports of immediate lexical segmentation [16, 17]. These new results were further investigated using a cocktail-party paradigm [18, 19] in which participants listened to a mix of two talkers, attending to one and ignoring the other. Analysis indicates neural lexical processing of only the attended, but not the unattended, speech stream. Thus, while responses to acoustic features reflect attention through selective amplification of attended speech, responses consistent with a lexical processing model reveal categorically selective processing.