Liberman M
Department of Computer and Information Science, University of Pennsylvania, Philadelphia 19104, USA.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9928-31. doi: 10.1073/pnas.92.22.9928.
Computer speech synthesis has reached a high level of performance, with increasingly sophisticated models of linguistic structure, low error rates in text analysis, and high intelligibility in synthesis from phonemic input. Mass market applications are beginning to appear. However, the results are still not good enough for the ubiquitous application that such technology will eventually have. A number of alternative directions of current research aim at the ultimate goal of fully natural synthetic speech. One especially promising trend is the systematic optimization of large synthesis systems with respect to formal criteria of evaluation. Speech recognition has progressed rapidly in the past decade through such approaches, and it seems likely that their application in synthesis will produce similar improvements.