King Simon, Frankel Joe, Livescu Karen, McDermott Erik, Richmond Korin, Wester Mirjam
Centre for Speech Technology Research, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, United Kingdom.
J Acoust Soc Am. 2007 Feb;121(2):723-42. doi: 10.1121/1.2404622.
Although much is known about how speech is produced, and research into speech production has yielded measured articulatory data, feature systems of various kinds, and numerous models, current mainstream approaches to automatic speech recognition almost totally ignore this knowledge. Representations of speech production allow simple explanations for many phenomena observed in speech that cannot easily be analyzed from either the acoustic signal or a phonetic transcription alone. This article surveys a growing body of work in which such representations are used to improve automatic speech recognition.