Khalil Iskarous
Haskins Laboratories, 300 George Street, New Haven, CT 06511, USA.
J Phon. 2010 Jul 1;38(3):375-387. doi: 10.1016/j.wocn.2010.03.002.
The area function of the vocal tract in all of its spatial detail is not directly computable from the speech signal. But is partial, yet phonetically distinctive, information about articulation recoverable from the acoustic signal that arrives at the listener's ear? The answer to this question is important for phonetics, because different theories of speech perception predict different answers. Some theories assume that recovery of articulatory information must be possible, while others assume that it is impossible. However, neither type of theory provides firm evidence showing that distinctive articulatory information is or is not extractable from the acoustic signal. The present study focuses on vowel gestures and examines whether linguistically significant information, such as constriction location, constriction degree, and rounding, is contained in the speech signal, and whether such information is recoverable from formant parameters. Perturbation theory and linear prediction were combined, in a manner similar to that of Mokhtari (1998) [Mokhtari, P. (1998). An acoustic-phonetic and articulatory study of speech-speaker dichotomy. Doctoral dissertation, University of New South Wales], to assess the accuracy of recovery of information about vowel constrictions. Distinctive constriction information estimated from the speech signal for ten American English vowels was compared to constriction information derived from simultaneously collected X-ray microbeam articulatory data for 39 speakers [Westbury (1994). X-ray microbeam speech production database user's handbook. University of Wisconsin, Madison, WI]. The recovery of distinctive articulatory information relies on a novel technique that uses formant frequencies and amplitudes, and does not depend on a principal components analysis of the articulatory data, as most other inversion techniques do.
These results provide evidence that distinctive articulatory information for vowels can be recovered from the acoustic signal.
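The abstract does not spell out the perturbation-theory machinery, but the classical textbook version for a uniform tube gives the intuition behind relating constrictions to formant shifts. The sketch below is a minimal illustration of that classical result, not the paper's actual formant-frequency-and-amplitude technique: for a tube closed at the glottis and open at the lips, the sensitivity of formant n to a small constriction at distance x from the glottis is proportional to cos(2·k_n·x) with k_n = (2n−1)π/(2L), positive where a constriction raises the formant (pressure antinodes) and negative where it lowers it (velocity antinodes). The tube length L is an assumed illustrative value.

```python
import numpy as np

def sensitivity(n, x, L=0.175):
    """Classical perturbation-theory sensitivity of formant n (1-indexed)
    to a small constriction at distance x (metres) from the glottis, for a
    uniform tube closed at the glottis and open at the lips.

    Positive: a constriction at x raises formant n (x is near a pressure
    antinode). Negative: a constriction at x lowers it (velocity antinode).
    """
    k = (2 * n - 1) * np.pi / (2 * L)  # wavenumber of the n-th resonance
    return np.cos(2 * k * x)

L = 0.175  # assumed vocal tract length in metres

# Pharyngeal constriction (near the glottis, as in /a/) raises F1:
print(sensitivity(1, 0.0, L))  # -> 1.0

# Lip constriction (rounding, as in /u/) lowers F1 and F2 alike:
print(sensitivity(1, L, L))    # -> -1.0
print(sensitivity(2, L, L))    # -> -1.0
```

Matching the sign pattern of observed formant deviations against these sensitivity functions is what, in spirit, lets constriction location be read off the acoustics; the paper's contribution is doing this robustly from formant frequencies and amplitudes on real speech.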