Bakkum M J, Plomp R, Pols L C
Department of Oto-Rhino-Laryngology, Free University Hospital, Amsterdam, The Netherlands.
J Acoust Soc Am. 1993 Oct;94(4):1989-2004. doi: 10.1121/1.407502.
An objective analysis has been performed on all 15 Dutch vowels pronounced in /hVt/ words by nine native Dutch, nine non-native, and six deaf males. Spectral representations of the vowel segments were created by determining the mean output levels of a bank of 16 filters (90-7200 Hz), with 1/3-oct bandwidths and logarithmic spacing of their center frequencies. The adequacy of the objective analysis is determined by the extent to which spectral information provides an accurate description of pronunciation quality. Spectral distances between the 24 utterances of each monophthong agree rather well with subjective distances obtained by listeners in an elaborate paired-comparisons experiment. For the various monophthongs, the correlation coefficients are within the range 0.63 to 0.88; averaging across all 12 monophthongs of each speaker results in a coefficient of 0.94. Furthermore, it appeared that the objective spectral analysis is as reliable as a subjective assessment by magnitude estimation by two to three listeners. Using principal components analysis (PCA), the number of dimensions by which the vowel spectra are described can be reduced. For the various monophthongs the range of the correlation coefficients between subjective distances and objective distances in a two-dimensional PCA subspace is 0.30-0.93. The three groups of speakers can still be distinguished in this subspace. In the extreme case of the deaf speakers all vowels are strongly "neutralized," whereas the different vowels of the native speakers are well separated, especially after speaker normalization; results are less clear for the non-natives.
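The comparison described in the abstract (spectral distances between the 24 utterances of a vowel, correlated with listener distances, first in the full 16-band space and then in a two-dimensional PCA subspace) can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: the abstract does not state the distance metric, so Euclidean distance is used; the band levels are treated as dB values; PCA is done with a simple power iteration; and all data, including the "subjective" distances, are synthetic placeholders generated only so the code runs. The printed correlations therefore say nothing about the reported results.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_distances(vectors):
    """Distances for all unordered pairs (i < j), in a fixed order."""
    n = len(vectors)
    return [euclidean(vectors[i], vectors[j])
            for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def principal_components(vectors, n_components=2, iterations=500, seed=0):
    """Return (mean-centered data, leading principal axes) via power iteration."""
    n, d = len(vectors), len(vectors[0])
    means = [sum(v[k] for v in vectors) / n for k in range(d)]
    centered = [[v[k] - means[k] for k in range(d)] for v in vectors]
    cov = [[sum(v[i] * v[j] for v in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    axes = []
    for _component in range(n_components):
        vec = [rng.gauss(0.0, 1.0) for _k in range(d)]
        for _step in range(iterations):
            vec = [dot(row, vec) for row in cov]
            # Remove the directions already found, then renormalize.
            for axis in axes:
                proj = dot(vec, axis)
                vec = [x - proj * a for x, a in zip(vec, axis)]
            norm = math.sqrt(dot(vec, vec))
            vec = [x / norm for x in vec]
        axes.append(vec)
    return centered, axes


# Synthetic placeholder data: 24 utterances x 16 band levels (dB, assumed).
rng = random.Random(1)
N_UTTERANCES, N_BANDS = 24, 16
factor_a = [math.sin(k / 3.0) for k in range(N_BANDS)]
factor_b = [math.cos(k / 5.0) for k in range(N_BANDS)]
spectra = []
for _utt in range(N_UTTERANCES):
    a, b = rng.gauss(0, 6), rng.gauss(0, 3)
    spectra.append([60 + a * fa + b * fb + rng.gauss(0, 1)
                    for fa, fb in zip(factor_a, factor_b)])

# Objective distances in the full 16-band space.
objective = pairwise_distances(spectra)

# Placeholder "listener" distances: objective distances plus noise.
subjective = [d + rng.gauss(0, 2) for d in objective]

# Objective distances after projection onto the first two principal axes.
centered, axes = principal_components(spectra)
scores = [[dot(v, axis) for axis in axes] for v in centered]
objective_2d = pairwise_distances(scores)

print("pairs compared:", len(objective))
print("r (16 bands vs placeholder subjective):", round(pearson(objective, subjective), 3))
print("r (2-D PCA vs placeholder subjective):", round(pearson(objective_2d, subjective), 3))
```

Because the principal axes are orthonormal, a distance measured in the two-dimensional subspace can never exceed the corresponding full-space distance. This is one way to see why reducing dimensions may discard information that listeners use, in line with the wider correlation range the abstract reports for the PCA subspace.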