Kewley-Port D, Zheng Y
Department of Speech and Hearing Sciences, Indiana University, Bloomington 47405, USA.
J Acoust Soc Am. 1999 Nov;106(5):2945-58. doi: 10.1121/1.428134.
Thresholds for formant frequency discrimination have been established under optimal listening conditions. In normal conversation, the ability to discriminate formant frequency is probably substantially degraded. The purpose of the present study was to change the listening procedures in several substantial ways from optimal towards more ordinary listening conditions, including a higher level of stimulus uncertainty, longer phonetic contexts, and the addition of a sentence identification task. Four vowels synthesized from a female talker were presented in isolation or in the phonetic context of /bVd/ syllables, three-word phrases, or nine-word sentences. In the first experiment, formant resolution was estimated under medium stimulus uncertainty for three levels of phonetic context. Some undesirable training effects were observed, which led to a new protocol for the second experiment designed to reduce this problem and to manipulate both the length of the phonetic context and the difficulty of the simultaneous sentence identification task. Similar results were obtained in both experiments. The effect of phonetic context on formant discrimination diminished as the context lengthened, such that no difference was found between vowels embedded in phrase and sentence contexts. The addition of a challenging sentence identification task to the discrimination task did not degrade performance further, and a stable pattern of formant discrimination in sentences emerged. This norm for the resolution of vowel formants under these more ordinary listening conditions was shown to be nearly constant at 0.28 barks. Analysis of vowel spaces from 16 American English talkers determined that the closest vowels were, on average, 0.56 barks apart, that is, a factor of 2 larger than the norm obtained in these vowel formant discrimination tasks.
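To make the bark-scale figures concrete, the following is a minimal Python sketch relating the reported 0.28-bark discrimination norm to frequency differences in hertz. It assumes Traunmüller's (1990) Hz-to-bark approximation; the abstract does not specify which conversion the authors used, and the F2 value below is a hypothetical illustration, not a stimulus from the experiments.

```python
def hz_to_bark(f_hz: float) -> float:
    """Traunmueller's (1990) approximation of the bark scale.
    Assumed here; the study may have used a different conversion."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def bark_distance(f1_hz: float, f2_hz: float) -> float:
    """Distance in barks between two formant frequencies."""
    return abs(hz_to_bark(f1_hz) - hz_to_bark(f2_hz))

# Illustrative F2 value for a female vowel (hypothetical, not from the paper).
f2 = 2000.0

# Find the Hz increment corresponding to the 0.28-bark discrimination
# norm reported for the more ordinary listening conditions.
delta = 0.0
while bark_distance(f2, f2 + delta) < 0.28:
    delta += 1.0
print(f"~{delta:.0f} Hz at F2 = {f2:.0f} Hz corresponds to 0.28 barks")

# The closest vowels across 16 talkers averaged 0.56 barks apart,
# a factor of 2 above the 0.28-bark discrimination norm.
print(f"Margin between vowel spacing and norm: {0.56 / 0.28:.1f}x")
```

Under this approximation, 0.28 barks near an F2 of 2000 Hz works out to roughly 85 Hz, which gives a feel for the frequency resolution the norm implies; the script also prints the factor-of-2 margin between that norm and the 0.56-bark average spacing of the closest vowels.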