Similar articles

1
Training and search methods for speech recognition.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9964-9. doi: 10.1073/pnas.92.22.9964.
2
State of the art in continuous speech recognition.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9956-63. doi: 10.1073/pnas.92.22.9956.
3
The roles of language processing in a spoken language interface.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9970-6. doi: 10.1073/pnas.92.22.9970.
4
A commercial large-vocabulary discrete speech recognition system: DragonDictate.
Lang Speech. 1992 Jan-Jun;35(Pt 1-2):237-46. doi: 10.1177/002383099203500218.
5
Segmenting speech using dynamic programming.
J Acoust Soc Am. 1981 May;69(5):1430-8. doi: 10.1121/1.385826.
6
Structural design of hidden Markov model speech recognizer using multivalued phonetic features: comparison with segmental speech units.
J Acoust Soc Am. 1992 Dec;92(6):3058-67. doi: 10.1121/1.404202.
7
Deployment of human-machine dialogue systems.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10017-22. doi: 10.1073/pnas.92.22.10017.
8
Probabilistic independence networks for hidden Markov probability models.
Neural Comput. 1997 Feb 15;9(2):227-69. doi: 10.1162/neco.1997.9.2.227.
9
Hidden Markov models for speech and signal recognition.
Electroencephalogr Clin Neurophysiol Suppl. 1996;45:137-52.
10
Toward the ultimate synthesis/recognition system.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10040-5. doi: 10.1073/pnas.92.22.10040.

Cited by

1
Models of natural language understanding.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9977-82. doi: 10.1073/pnas.92.22.9977.
2
State of the art in continuous speech recognition.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9956-63. doi: 10.1073/pnas.92.22.9956.
3
Speech recognition technology: a critique.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9953-5. doi: 10.1073/pnas.92.22.9953.
4
New trends in natural language processing: statistical natural language processing.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10052-9. doi: 10.1073/pnas.92.22.10052.

References

1
State of the art in continuous speech recognition.
Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9956-63. doi: 10.1073/pnas.92.22.9956.

Training and search methods for speech recognition.

Author

Jelinek F

Affiliation

IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA.

Publication

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9964-9. doi: 10.1073/pnas.92.22.9964.

DOI: 10.1073/pnas.92.22.9964
PMID: 7479810
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC40719/

Abstract

Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
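
As an illustration of the search step mentioned in the abstract, here is a minimal Python sketch of Viterbi decoding over a toy discrete HMM. It is not the paper's implementation. For brevity it attaches output distributions to states rather than to transitions as the abstract describes, and it finds the most probable state path for one model rather than searching over utterances weighted by a language model. Every name and number in it (`viterbi`, `to_log`, the states `S1`/`S2`, the acoustic indices `x`/`y`/`z`, and the probability tables) is a hypothetical chosen for the example.

```python
import math


def to_log(table):
    """Convert a (possibly nested) dict of nonzero probabilities to log-probabilities."""
    return {
        key: to_log(value) if isinstance(value, dict) else math.log(value)
        for key, value in table.items()
    }


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM.

    Assumption: outputs are emitted by states, not by transitions.
    """
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Hypothetical toy model: two states, three acoustic indices.
states = ["S1", "S2"]
log_start = to_log({"S1": 0.6, "S2": 0.4})
log_trans = to_log({
    "S1": {"S1": 0.7, "S2": 0.3},
    "S2": {"S1": 0.4, "S2": 0.6},
})
log_emit = to_log({
    "S1": {"x": 0.5, "y": 0.4, "z": 0.1},
    "S2": {"x": 0.1, "y": 0.3, "z": 0.6},
})

log_prob, path = viterbi(["x", "y", "z"], states, log_start, log_trans, log_emit)
print(path)                          # ['S1', 'S1', 'S2']
print(round(math.exp(log_prob), 5))  # 0.01512
```

The sketch works in log-probabilities so that products of many small factors become sums, which avoids numerical underflow on long index strings. In a recognizer of the kind the abstract outlines, the path score would additionally be combined with the language model probability of the hypothesized utterance.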
