Forrest K, Weismer G, Milenkovic P, Dougall R N
Speech Motor Control Laboratories, Waisman Center, University of Wisconsin, Madison.
J Acoust Soc Am. 1988 Jul;84(1):115-23. doi: 10.1121/1.396977.
A statistical procedure for classifying word-initial voiceless obstruents is described. The data set to which the analysis was applied consisted of monosyllabic words starting with a voiceless obstruent. Each word was repeated six times in the carrier phrase "I can say ___ again" by each of ten speakers. Fast Fourier transforms (FFTs), using a 20-ms Hamming window, were calculated every 10 ms from the onset of the obstruent through the third cycle of the following vowel. Each FFT was treated as a random probability distribution from which the first four moments (mean, variance, skewness, and kurtosis) were computed. Moments were calculated from linear and Bark-transformed spectra. Data were pooled across vowel contexts for speakers of a given gender and input to a discriminant analysis. Using the moments calculated from the linear spectra, 92% of the voiceless stops were classified correctly when dynamic aspects of the stop were included. Even more important, the model constructed from the males' data correctly classified about 94% of the voiceless stops produced by the female speakers. Classification of the voiceless fricatives when all places of articulation were included in the analysis did not exceed 80% correct when the moments from either the linear or Bark-transformed scales were used. However, classification of only the voiceless sibilants was 98% correct when the moments from the Bark-transformed spectra were used. As with the stops, the classification model held across gender.
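The moment computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the power spectrum (rather than the amplitude spectrum) is normalized to sum to 1, that kurtosis is reported as excess kurtosis, and that frames run over the whole input signal; locating the obstruent onset and the third vowel cycle is left out. The function names `spectral_moments` and `frame_moments` are hypothetical, and only the linear frequency scale is shown.

```python
import numpy as np


def spectral_moments(frame, sample_rate):
    """Return (mean, variance, skewness, kurtosis) of one frame's spectrum.

    Assumption: the power spectrum is normalized to sum to 1 and treated
    as a probability distribution over frequency in Hz.
    """
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    p = power / power.sum()              # probabilities over frequency bins
    mean = np.sum(freqs * p)             # first moment (spectral centroid)
    centered = freqs - mean
    variance = np.sum(centered ** 2 * p)
    skewness = np.sum(centered ** 3 * p) / variance ** 1.5
    # Assumption: excess kurtosis (3 subtracted so a Gaussian shape gives 0).
    kurtosis = np.sum(centered ** 4 * p) / variance ** 2 - 3.0
    return mean, variance, skewness, kurtosis


def frame_moments(signal, sample_rate, win_ms=20.0, hop_ms=10.0):
    """Compute the four moments for 20-ms frames taken every 10 ms."""
    win = int(round(sample_rate * win_ms / 1000.0))
    hop = int(round(sample_rate * hop_ms / 1000.0))
    starts = range(0, len(signal) - win + 1, hop)
    return np.array(
        [spectral_moments(signal[s:s + win], sample_rate) for s in starts]
    )
```

Each row of the returned array holds the four moments for one frame. Stacking rows from successive frames would give one possible representation of the "dynamic aspects" mentioned in the abstract, which could then be passed to a discriminant analysis.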