State of the art in continuous speech recognition.

Author information

Makhoul J, Schwartz R

Affiliation

BBN Systems and Technologies, Cambridge, MA 02138, USA.

Publication information

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9956-63. doi: 10.1073/pnas.92.22.9956.

DOI: 10.1073/pnas.92.22.9956

PMID: 7479809

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC40718/

Abstract

In the past decade, tremendous advances in the state of the art of automatic speech recognition by machine have taken place. A reduction in the word error rate by more than a factor of 5 and an increase in recognition speeds by several orders of magnitude (brought about by a combination of faster recognition search algorithms and more powerful computers), have combined to make high-accuracy, speaker-independent, continuous speech recognition for large vocabularies possible in real time, on off-the-shelf workstations, without the aid of special hardware. These advances promise to make speech recognition technology readily available to the general public. This paper focuses on the speech recognition advances made through better speech modeling techniques, chiefly through more accurate mathematical modeling of speech sounds.

Similar articles

Toward the ultimate synthesis/recognition system.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10040-5. doi: 10.1073/pnas.92.22.10040.

Deployment of human-machine dialogue systems.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10017-22. doi: 10.1073/pnas.92.22.10017.

Scientific bases of human-machine communication by voice.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9914-20. doi: 10.1073/pnas.92.22.9914.

Commercial applications of speech interface technology: an industry at the threshold.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10007-10. doi: 10.1073/pnas.92.22.10007.

Training and search methods for speech recognition.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9964-9. doi: 10.1073/pnas.92.22.9964.

Speech recognition technology: a critique.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9953-5. doi: 10.1073/pnas.92.22.9953.

Voice-processing technologies--their application in telecommunications.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9991-8. doi: 10.1073/pnas.92.22.9991.

The role of voice input for human-machine communication.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9921-7. doi: 10.1073/pnas.92.22.9921.

Models of speech synthesis.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9932-7. doi: 10.1073/pnas.92.22.9932.

Cited by

Ti3C2Tx Composite Aerogels Enable Pressure Sensors for Dialect Speech Recognition Assisted by Deep Learning.

Nanomicro Lett. 2024 Dec 30;17(1):101. doi: 10.1007/s40820-024-01605-z.

Retrospective Analysis of Clinical Performance of an Estonian Speech Recognition System for Radiology: Effects of Different Acoustic and Language Models.

J Digit Imaging. 2018 Oct;31(5):615-621. doi: 10.1007/s10278-018-0085-8.

Analysis of Documentation Speed Using Web-Based Medical Speech Recognition Technology: Randomized Controlled Trial.

J Med Internet Res. 2015 Nov 3;17(11):e247. doi: 10.2196/jmir.5072.

[Speech recognition: impact on workflow and report availability].

Radiologe. 2005 Aug;45(8):735-42. doi: 10.1007/s00117-005-1253-7.

Continuous speech recognition for clinicians.

J Am Med Inform Assoc. 1999 May-Jun;6(3):195-204. doi: 10.1136/jamia.1999.0060195.

Integration of speech with natural language understanding.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9983-8. doi: 10.1073/pnas.92.22.9983.

Models of natural language understanding.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9977-82. doi: 10.1073/pnas.92.22.9977.

Training and search methods for speech recognition.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9964-9. doi: 10.1073/pnas.92.22.9964.

Speech recognition technology: a critique.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9953-5. doi: 10.1073/pnas.92.22.9953.

Speech technology in 2001: new research directions.

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10046-51. doi: 10.1073/pnas.92.22.10046.
