Department of Neuroscience, Max Planck Institute, Frankfurt, Germany.
Department of Psychology, New York University, New York, NY, USA.
Nat Rev Neurosci. 2020 Jun;21(6):322-334. doi: 10.1038/s41583-020-0304-4. Epub 2020 May 6.
The recognition of spoken language has typically been studied by focusing on either words or their constituent elements (for example, low-level features or phonemes). More recently, the 'temporal mesoscale' of speech has been explored, specifically regularities in the envelope of the acoustic signal that correlate with syllabic information and that play a central role in production and perception processes. The temporal structure of speech at this scale is remarkably stable across languages, with a preferred range of rhythmicity of 2-8 Hz. Importantly, this rhythmicity is required by the processes underlying the construction of intelligible speech. Much current work focuses on audio-motor interactions in speech, highlighting behavioural and neural evidence that demonstrates how properties of perceptual and motor systems, and the relation between them, can underlie the mesoscale speech rhythms. The data invite the hypothesis that the speech motor cortex is best modelled as a neural oscillator, a conjecture that aligns well with current proposals highlighting the fundamental role of neural oscillations in perception and cognition. The findings also cast motor theories of speech in a different light, placing new mechanistic constraints on accounts of the action-perception interface.