Department of Neurology, Rose F. Kennedy Center, Albert Einstein College of Medicine, Room 322, 1300 Morris Park Avenue, Bronx, NY 10461, USA; Department of Neuroscience, Rose F. Kennedy Center, Albert Einstein College of Medicine, Room 322, 1300 Morris Park Avenue, Bronx, NY 10461, USA.
Hear Res. 2013 Nov;305:57-73. doi: 10.1016/j.heares.2013.05.013. Epub 2013 Jun 18.
Successful categorization of phonemes in speech requires that the brain analyze the acoustic signal along both spectral and temporal dimensions. Neural encoding of the stimulus amplitude envelope is critical for parsing the speech stream into syllabic units. Encoding of voice onset time (VOT) and place of articulation (POA), cues necessary for determining phonemic identity, occurs within shorter time frames. An unresolved question is whether the neural representation of speech is based on processing mechanisms that are unique to humans and shaped by learning and experience, or is based on rules governing general auditory processing that are also present in non-human animals. This question was examined by comparing the neural activity elicited by speech and other complex vocalizations in primary auditory cortex of macaques, which are limited vocal learners, with that in Heschl's gyrus, the putative location of primary auditory cortex in humans. Entrainment to the amplitude envelope is specific neither to humans nor to human speech. VOT is represented by responses time-locked to consonant release and voicing onset in both humans and monkeys. Temporal representation of VOT is observed both for isolated syllables and for syllables embedded in the more naturalistic context of running speech. The fundamental frequency of male speakers is represented by more rapid neural activity phase-locked to the glottal pulsation rate in both humans and monkeys. In both species, the differential representation of stop consonants varying in their POA can be predicted by the relationship between the frequency selectivity of neurons and the onset spectra of the speech sounds. These findings indicate that the neurophysiology of primary auditory cortex is similar in monkeys and humans despite their vastly different experience with human speech, and that Heschl's gyrus is engaged in general auditory, and not language-specific, processing.
This article is part of a Special Issue entitled "Communication Sounds and the Brain: New Directions and Perspectives".