Leong Victoria, Stone Michael A, Turner Richard E, Goswami Usha
Centre for Neuroscience in Education, Department of Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
Auditory Perception Group, Department of Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
J Acoust Soc Am. 2014 Jul;136(1):366-81. doi: 10.1121/1.4883366.
Prosodic rhythm in speech [the alternation of "Strong" (S) and "weak" (w) syllables] is cued, among other things, by slow rates of amplitude modulation (AM) within the speech envelope. However, it is unclear exactly which envelope modulation rates and statistics are the most important for the rhythm percept. Here, the hypothesis that the phase relationship between "Stress" rate (∼2 Hz) and "Syllable" rate (∼4 Hz) AMs provides a perceptual cue for speech rhythm is tested. In a rhythm judgment task, adult listeners identified AM tone-vocoded nursery rhyme sentences that carried either trochaic (S-w) or iambic (w-S) patterning. Manipulation of listeners' rhythm perception was attempted by parametrically phase-shifting the Stress AM and Syllable AM in the vocoder. It was expected that a π radian phase-shift (half a cycle) would reverse the perceived rhythm pattern (i.e., trochaic → iambic) whereas a 2π radian shift (full cycle) would retain the perceived rhythm pattern (i.e., trochaic → trochaic). The results confirmed these predictions. Listeners' judgments of rhythm systematically followed Stress-Syllable AM phase-shifts, but were unaffected by phase-shifts between the Syllable AM and the Sub-beat AM (∼14 Hz) in a control condition. It is concluded that the Stress-Syllable AM phase relationship is an envelope-based modulation statistic that supports speech rhythm perception.
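The π versus 2π prediction can be illustrated with a toy model. The sketch below is not the vocoder or the AM decomposition used in the study; it assumes, purely for illustration, that the envelope is the product of two raised-cosine modulators at exactly 2 Hz and 4 Hz, and that the taller of the two syllable peaks within one stress cycle is the "Strong" syllable. The function names (`modulator`, `envelope`, `pattern`) are hypothetical.

```python
import math

STRESS_HZ = 2.0    # approximate "Stress" rate from the abstract
SYLLABLE_HZ = 4.0  # approximate "Syllable" rate from the abstract


def modulator(rate_hz, t, phase=0.0):
    """Raised-cosine amplitude modulator in [0, 1] at time t (seconds)."""
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * rate_hz * t + phase))


def envelope(t, stress_shift=0.0):
    """Toy envelope (assumption): product of Stress and Syllable modulators.

    stress_shift (radians) is applied to the Stress AM only, which changes
    its phase relative to the Syllable AM.
    """
    return modulator(STRESS_HZ, t, stress_shift) * modulator(SYLLABLE_HZ, t)


def syllable_peak_heights(stress_shift):
    """Envelope height at the two syllable peaks in one 0.5 s stress cycle."""
    peak_times = [0.0, 0.25]  # the 4 Hz Syllable AM peaks every 0.25 s
    return [round(envelope(t, stress_shift), 6) for t in peak_times]


def pattern(stress_shift):
    """Label the cycle trochaic (S-w) or iambic (w-S) from peak heights."""
    first, second = syllable_peak_heights(stress_shift)
    return "S-w" if first > second else "w-S"


if __name__ == "__main__":
    for label, shift in [("0", 0.0), ("pi", math.pi), ("2*pi", 2.0 * math.pi)]:
        print(f"shift {label}: {pattern(shift)}")
```

In this toy model an unshifted Stress AM gives S-w, a π shift moves the stress peak onto the second syllable and gives w-S, and a 2π shift reproduces the original S-w pattern, which mirrors the direction of the predictions stated above.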