Department of Psychology, The University of Reading, Reading RG6 6AL, United Kingdom.
J Acoust Soc Am. 2011 Nov;130(5):2777-88. doi: 10.1121/1.3641399.
Three experiments measured constancy in speech perception, using natural-speech messages or noise-band vocoder versions of them. The eight vocoder bands had center frequencies equally spaced on a logarithmic scale and the shapes of the corresponding "auditory" filters. Consequently, the bands had the temporal envelopes that arise in these auditory filters when the speech is played. The "sir" or "stir" test-words were distinguished by degrees of amplitude modulation, and played in the context: "next you'll get _ to click on." Listeners identified test-words appropriately, even in the vocoder conditions, where the speech had a "noise-like" quality. Constancy was assessed by comparing the identification of test-words with low or high levels of room reflections across conditions where the context had either a low or a high level of reflections. Constancy was obtained with both the natural and the vocoded speech, indicating that the effect arises through temporal-envelope processing. Two further experiments assessed perceptual weighting of the different bands, both in the test-word and in the context. The resulting weighting functions both increase monotonically with frequency, following the spectral characteristics of the test-word's [s]. It is suggested that these two weighting functions are similar because they both come about through the perceptual grouping of the test-word's bands.
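The band layout described in the abstract (eight center frequencies equally spaced on a log axis) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `log_spaced_centers` and the 250 to 8000 Hz range are assumptions, since the abstract does not state the band edges or the filter design.

```python
import math


def log_spaced_centers(f_low, f_high, n_bands=8):
    """Return n_bands center frequencies (Hz) equally spaced on a log axis.

    Hypothetical helper: equal log spacing means each center is a constant
    multiple of the one below it.
    """
    if n_bands < 2:
        raise ValueError("need at least two bands")
    if not 0 < f_low < f_high:
        raise ValueError("require 0 < f_low < f_high")
    # Constant frequency ratio between neighbouring bands.
    ratio = (f_high / f_low) ** (1.0 / (n_bands - 1))
    return [f_low * ratio ** k for k in range(n_bands)]


# Assumed frequency range, chosen only for illustration.
centers = log_spaced_centers(250.0, 8000.0)
for k, fc in enumerate(centers, start=1):
    print(f"band {k}: {fc:.1f} Hz")
```

In a full noise-band vocoder, each band's temporal envelope would then modulate a noise carrier filtered to the same band. That step is omitted here because the abstract gives no filter parameters.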