Continuous speech recognition for clinicians.

Authors

Zafar A, Overhage J M, McDonald C J

Affiliation

Indiana University, Regenstrief Institute for Health Care, Indianapolis 46202-2859, USA.

Publication

J Am Med Inform Assoc. 1999 May-Jun;6(3):195-204. doi: 10.1136/jamia.1999.0060195.

DOI:10.1136/jamia.1999.0060195

PMID:10332653

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC61360/

Abstract

The current generation of continuous speech recognition systems claims to offer high accuracy (greater than 95 percent) speech recognition at natural speech rates (150 words per minute) on low-cost (under $2000) platforms. This paper presents a state-of-the-technology summary, along with insights the authors have gained through testing one such product extensively and other products superficially. The authors have identified a number of issues that are important in managing accuracy and usability. First, for efficient recognition users must start with a dictionary containing the phonetic spellings of all words they anticipate using. The authors dictated 50 discharge summaries using one inexpensive internal medicine dictionary ($30) and found that they needed to add an additional 400 terms to get recognition rates of 98 percent. However, if they used either of two more expensive and extensive commercial medical vocabularies ($349 and $695), they did not need to add terms to get a 98 percent recognition rate. Second, users must speak clearly and continuously, distinctly pronouncing all syllables. Users must also correct errors as they occur, because accuracy improves with error correction by at least 5 percent over two weeks. Users may find it difficult to train the system to recognize certain terms, regardless of the amount of training, and appropriate substitutions must be created. For example, the authors had to substitute "twice a day" for "bid" when using the less expensive dictionary, but not when using the other two dictionaries. From trials they conducted in settings ranging from an emergency room to hospital wards and clinicians' offices, they learned that ambient noise has minimal effect. Finally, they found that a minimal "usable" hardware configuration (which keeps up with dictation) comprises a 300-MHz Pentium processor with 128 MB of RAM and a "speech quality" sound card (e.g., SoundBlaster, $99). 
Anything less powerful will result in the system lagging behind the speaking rate. The authors obtained 97 percent accuracy with just 30 minutes of training when using the latest edition of one of the speech recognition systems supplemented by a commercial medical dictionary. This technology has advanced considerably in recent years and is now a serious contender to replace some or all of the increasingly expensive alternative methods of dictation with human transcription.

Similar articles

Comparison of voice-automated transcription and human transcription in generating pathology reports.

Arch Pathol Lab Med. 2003 Jun;127(6):721-5. doi: 10.5858/2003-127-721-COVTAH.

Evaluation of VoiceType Dictation for Windows for the radiologist.

Med Prog Technol. 1996;21(4):177-80.

Comparative evaluation of three continuous speech recognition software packages in the generation of medical reports.

J Am Med Inform Assoc. 2000 Sep-Oct;7(5):462-8. doi: 10.1136/jamia.2000.0070462.

Speech-to-text: the next revelation for recording data.

Radiol Manage. 1997 Nov-Dec;19(6):50-3.

Speech recognition systems. Are they up to the task?

Health Devices. 2002 Feb;31(2):65-71.

[Computer-assisted speech recognition in diagnostic pathology. Development of the DragonDictate-30 K system for documentation].

Pathologe. 1995 Nov;16(6):439-42.

Automated speech recognition for time recording in out-of-hospital emergency medicine-an experimental approach.

Resuscitation. 2004 Feb;60(2):205-12. doi: 10.1016/j.resuscitation.2003.10.006.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

A continuous-speech interface to a decision support system: I. Techniques to accommodate for misrecognized input.

J Am Med Inform Assoc. 1995 Jan-Feb;2(1):36-45. doi: 10.1136/jamia.1995.95202546.

Cited by

Evaluation and comparison of errors on nursing notes created by online and offline speech recognition technology and handwritten: an interventional study.

BMC Med Inform Decis Mak. 2022 Apr 8;22(1):96. doi: 10.1186/s12911-022-01835-4.

A systematic comparison of contemporary automatic speech recognition engines for conversational clinical speech.

AMIA Annu Symp Proc. 2018 Dec 5;2018:683-689. eCollection 2018.

Speech recognition for clinical documentation from 1990 to 2018: a systematic review.

J Am Med Inform Assoc. 2019 Apr 1;26(4):324-338. doi: 10.1093/jamia/ocy179.

Sharing Annotated Audio Recordings of Clinic Visits With Patients-Development of the Open Recording Automated Logging System (ORALS): Study Protocol.

JMIR Res Protoc. 2017 Jul 6;6(7):e121. doi: 10.2196/resprot.7735.

Gestonurse: a robotic surgical nurse for handling surgical instruments in the operating room.

J Robot Surg. 2012 Mar;6(1):53-63. doi: 10.1007/s11701-011-0325-0. Epub 2011 Nov 27.

Real-Time Captioning for Improving Informed Consent: Patient and Physician Benefits.

Reg Anesth Pain Med. 2016 Jan-Feb;41(1):65-8. doi: 10.1097/AAP.0000000000000347.

Towards spoken clinical-question answering: evaluating and adapting automatic speech-recognition systems for spoken clinical questions.

J Am Med Inform Assoc. 2011 Sep-Oct;18(5):625-30. doi: 10.1136/amiajnl-2010-000071. Epub 2011 Jun 24.

An Evidence-Based Adoption of Technology Model for Remote Monitoring of Elders' Daily Activities.

Ageing Int. 2010 Sep 23;36(1):66-81. doi: 10.1007/s12126-010-9073-0.

Lessons learned from implementation of voice recognition for documentation in the military electronic health record system.

Perspect Health Inf Manag. 2010 Jan 1;7(Winter):1e.

Voice recognition dictation: radiologist as transcriptionist.

J Digit Imaging. 2008 Dec;21(4):384-9. doi: 10.1007/s10278-007-9039-2.
