Assmann P F
School of Human Development, University of Texas at Dallas, Richardson 75083.
J Acoust Soc Am. 1995 Jan;97(1):575-84. doi: 10.1121/1.412281.
When two voices compete for the attention of the listener, the spectral peaks that define the formants of one voice can be intermittently obscured or distorted by formants of the other voice. However, formant peaks vary slowly and continuously in frequency and time, providing a basis for tracking through regions of overlap. Three experiments investigated the ability of listeners to exploit formant-pattern continuity to segregate pairs of synthesized vowels that were presented simultaneously and monaurally. Experiments 1 and 2 examined the effects of introducing one member of the pair with formant-frequency transitions that specified syllable-initial glides /w/ or /j/. Identification accuracy was generally higher in conditions where glides were present. Gliding formants provided smaller benefits than a two-semitone difference in fundamental frequency between the vowels. Experiment 3 found larger effects of formant transitions specifying initial or final /l/. Overall, formant transitions did not make it easier to identify the vowel to which they were linked; instead, they helped by making the competing vowel more identifiable. One explanation for improvement in the glide conditions is a formant-tracking process that groups together the formants of each voice using the Gestalt principle of good continuation. However, this account predicts improvement for both vowels, which was generally not observed. An alternative explanation is suggested by models that apply a brief, sliding temporal window to determine which region of the signal provides the strongest evidence of each vowel constituent. The formant transition region may provide a time interval during which the competing steady-state vowel is perceptually more prominent.
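For readers unfamiliar with the unit, the size of the two-semitone difference in fundamental frequency mentioned above can be worked out with a short sketch. It assumes the standard equal-tempered definition of a semitone (a frequency ratio of 2^(1/12)); the 100 Hz base value and the function names are purely illustrative assumptions and are not taken from the study.

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency ratio for an interval given in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def shifted_f0(base_hz: float, semitones: float) -> float:
    """Fundamental frequency that lies `semitones` above `base_hz`."""
    return base_hz * semitone_ratio(semitones)


# Hypothetical base F0, chosen only to illustrate the arithmetic.
base_f0_hz = 100.0
second_f0_hz = shifted_f0(base_f0_hz, 2)

print(f"ratio for 2 semitones: {semitone_ratio(2):.4f}")
print(f"F0 pair: {base_f0_hz:.1f} Hz and {second_f0_hz:.1f} Hz")
```

Under these assumptions, a two-semitone separation corresponds to a difference of roughly 12% in fundamental frequency between the two vowels.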