Ito M, Tsuchida J, Yano M
Wako Research Center, Honda R&D Co, Ltd, Saitama, Japan.
J Acoust Soc Am. 2001 Aug;110(2):1141-9. doi: 10.1121/1.1384908.
The formant hypothesis of vowel perception, which holds that the lowest two or three formant frequencies are the essential cues to vowel quality, is widely accepted. It has been argued, however, that formant frequencies are not sufficient and that the whole spectral shape is necessary for perception. Three psychophysical experiments were performed to study this question. In the first experiment, the first or second formant peak of the stimuli was suppressed as much as possible while otherwise maintaining the original spectral shape. Responses to these stimuli were not radically different from those to the unsuppressed control. In the second experiment, F2-suppressed stimuli were used whose amplitude ratios of high- to low-frequency components were systematically changed. The results indicate that changes in this ratio can affect perceived vowel quality, especially perceived place of articulation. In the third experiment, full-formant stimuli were used whose amplitude ratios were changed from the original while F2 was kept constant. The results suggest that the amplitude ratio is as effective as, or more effective than, F2 as a cue for place of articulation. We conclude that formant frequencies are not exclusive cues and that the whole spectral shape can be crucial for vowel perception.
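As an illustration of the quantity manipulated in the second and third experiments, the following is a minimal sketch of how a high- to low-frequency amplitude ratio might be measured and shifted for a harmonic spectrum. It is not the authors' stimulus-generation procedure, which the abstract does not describe. The function names, the 1500 Hz band boundary, and the toy harmonic amplitudes are all assumptions made for the example.

```python
import math


def band_ratio_db(harmonics, cutoff_hz):
    """Return the high-band to low-band energy ratio in dB.

    harmonics: list of (frequency_hz, linear_amplitude) pairs.
    cutoff_hz: assumed boundary between the "low" and "high" bands.
    """
    low = sum(a * a for f, a in harmonics if f < cutoff_hz)
    high = sum(a * a for f, a in harmonics if f >= cutoff_hz)
    if low == 0 or high == 0:
        raise ValueError("both bands need nonzero energy")
    return 10.0 * math.log10(high / low)


def shift_band_ratio(harmonics, cutoff_hz, delta_db):
    """Scale every component at or above cutoff_hz by delta_db.

    Frequencies are left untouched, so spectral peak locations stay
    where they were; only the balance between the bands changes.
    """
    gain = 10.0 ** (delta_db / 20.0)
    return [(f, a * gain if f >= cutoff_hz else a) for f, a in harmonics]


# Toy spectrum (hypothetical values, not data from the study).
toy = [(250.0, 1.0), (500.0, 0.5), (2000.0, 0.25), (2500.0, 0.25)]
cutoff = 1500.0  # assumed band boundary in Hz

before = band_ratio_db(toy, cutoff)
boosted = shift_band_ratio(toy, cutoff, 6.0)
after = band_ratio_db(boosted, cutoff)
print(f"ratio before: {before:.2f} dB, after: {after:.2f} dB")
```

Because only amplitudes are rescaled, the component frequencies are identical before and after the shift. This mirrors the logic of the third experiment, where the amplitude ratio was varied while F2 was held constant.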