Andruski J E, Nearey T M
Department of Linguistics, University of Alberta, Edmonton, Canada.
J Acoust Soc Am. 1992 Jan;91(1):390-410. doi: 10.1121/1.402781.
It has been suggested [e.g., Strange et al., J. Acoust. Soc. Am. 74, 695-705 (1983); Verbrugge and Rakerd, Language Speech 29, 39-57 (1986)] that the temporal margins of vowels in consonantal contexts, consisting mainly of the rapid CV and VC transitions of CVCs, contain dynamic cues to vowel identity that are not available in isolated vowels and that may be perceptually superior in some circumstances to cues that are inherent to the vowels proper. However, this study shows that vowel-inherent formant targets and cues to vowel-inherent spectral change (measured from nucleus to offglide sections of the vowel itself) persist in the margins of /bVb/ syllables, confirming a hypothesis of Nearey and Assmann [J. Acoust. Soc. Am. 80, 1297-1308 (1986)]. Experiments were conducted to test whether listeners might be using such vowel-inherent, rather than coarticulatory, information to identify the vowels. In the first experiment, perceptual tests using "hybrid silent center" syllables (i.e., syllables which contain only brief initial and final portions of the original syllable, and in which speaker identity changes from the initial to the final portion) show that listeners' error rates and confusion matrices for vowels in /bVb/ syllables are very similar to those for isolated vowels. These results suggest that listeners are using essentially the same type of information in essentially the same way to identify both kinds of stimuli. Statistical pattern recognition models confirm the relative robustness of nucleus and vocalic offglide cues and can predict listeners' error patterns reasonably well in all experimental conditions, though performance for /bVb/ syllables is somewhat worse than for isolated vowels.
The second experiment involves the use of simplified synthetic stimuli, lacking consonantal transitions, which are shown to provide information that is nearly equivalent phonetically to that of the natural silent center /bVb/ syllables (from which the target measurements were extracted). Although no conclusions are drawn about other contexts, for speakers of Western Canadian English coarticulatory cues appear to play at best a minor role in the perception of vowels in /bVb/ context, while vowel-inherent factors dominate listeners' perception.