Kewley-Port D
J Acoust Soc Am. 1983 Jan;73(1):322-35. doi: 10.1121/1.388813.
Running spectral displays derived from linear prediction analysis were used to examine the initial 40 ms of stop-vowel CV syllables for possible acoustic correlates to place of articulation. Known spectral and temporal properties associated with the stop consonant release gesture were used to define a set of three time-varying features observable in the visual displays. Judges identified place of articulation using these proposed features from running spectra of the syllables /b, d, g/ paired with eight vowels produced by three talkers. Average correct identification of place was 88%; identification was better for the male talkers (92%) than for the one female talker (78%). Post hoc analyses suggested, however, that simple rules could be incorporated in the feature definitions to account for differences in vocal tract size. The nature of the information contained in linear prediction running spectra was analyzed further to take account of known properties of the peripheral auditory system. The three proposed time-varying features were shown to be displayed robustly in auditory filtered running spectra. The advantages of describing acoustic correlates for place from the dynamically varying temporal and spectral information in running spectra are discussed with regard to the static template matching approach advocated recently by Blumstein and Stevens [J. Acoust. Soc. Am. 66, 1001-1017 (1979)].
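As a rough illustration of what a linear prediction running spectrum is, the sketch below computes a sequence of LPC spectral envelopes over the first 40 ms of a signal. It is a minimal sketch, not the procedure used in the study: the 20 ms Hamming window, 5 ms frame step, prediction order of 14, FFT size, and the function names `lpc_coefficients` and `running_spectra` are all assumptions chosen for illustration, since the abstract does not state the analysis parameters.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC solved with the Levinson-Durbin recursion.

    Returns (a, err), where a[0] == 1 and A(z) = sum_j a[j] z^-j is the
    inverse filter, and err is the residual prediction error energy.
    """
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Tiny regularization so a silent frame does not divide by zero.
    r[0] = r[0] * (1.0 + 1e-9) + 1e-12
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def running_spectra(signal, fs, onset=0, span_ms=40.0, win_ms=20.0,
                    step_ms=5.0, order=14, nfft=512):
    """LPC envelopes (dB, arbitrary reference) for frames starting every
    step_ms within the first span_ms after `onset` (a sample index).

    All default values are illustrative assumptions.
    """
    win_len = int(round(win_ms * fs / 1000.0))
    step = int(round(step_ms * fs / 1000.0))
    n_frames = int(span_ms // step_ms)
    window = np.hamming(win_len)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    spectra = np.empty((n_frames, freqs.size))
    for i in range(n_frames):
        start = onset + i * step
        frame = signal[start:start + win_len]
        if frame.size < win_len:
            raise ValueError("signal too short for requested analysis span")
        a, err = lpc_coefficients(frame * window, order)
        inverse_filter = np.fft.rfft(a, nfft)  # A(e^{jw}) on the FFT grid
        spectra[i] = 10.0 * np.log10(err) - 20.0 * np.log10(np.abs(inverse_filter))
    return freqs, spectra
```

Calling `running_spectra(signal, fs)` on a digitized syllable returns one smoothed envelope per frame; stacking the rows in time order gives the kind of display in which spectral peaks can be followed from the release burst into the vowel. With these assumed defaults the signal must extend at least 55 ms past `onset`.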