MEMBER, IEEE, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598.
IEEE Trans Pattern Anal Mach Intell. 1983 Feb;5(2):179-90. doi: 10.1109/tpami.1983.4767370.
Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two decoding methods, one appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks. To illustrate the usefulness of the methods described, we review a number of decoding results that have been obtained with them.
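As a brief sketch of what the maximum likelihood decoding formulation usually amounts to, the decoder picks the word string that is most probable given the acoustic evidence. The symbols below ($W$ for a candidate word string, $A$ for the observed acoustic data, $\hat{W}$ for the decoder output) are notation assumed here for illustration and are not taken from the abstract.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid A)
        \;=\; \arg\max_{W} \frac{P(W)\, P(A \mid W)}{P(A)}
        \;=\; \arg\max_{W} P(W)\, P(A \mid W)
```

The last step holds because $P(A)$ does not depend on $W$. Under this reading, $P(W)$ is supplied by a statistical language model and $P(A \mid W)$ by a statistical model of how the speech is produced and observed, which is why the formulation requires such models.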