Kodish-Wachs Jodi, Agassi Emin, Kenny Patrick, Overhage J Marc
Cerner Corporation, Malvern, PA.
AMIA Annu Symp Proc. 2018 Dec 5;2018:683-689. eCollection 2018.
Conversations, especially those between a clinician and a patient, are important sources of data to support clinical care. To date, clinicians have acted as the sensor that captures these data and records them in the medical record. Automatic speech recognition (ASR) engines have advanced to support continuous speech, to work independently of the speaker, and to deliver continuously improving performance. Near-human levels of performance have been reported for several ASR engines. We undertook a systematic comparison of selected ASR engines for clinical conversational speech. Using audio recorded from unscripted clinical scenarios with two microphones, we evaluated eight ASR engines using word error rate (WER) and the precision, recall, and F1 scores for concept extraction. We found a wide range of word error rates across the ASR engines, with values ranging from 34% to 65%, all falling short of the rates achieved for other conversational speech. Recall for health concepts also ranged from 22% to 74%. Concept recall rates match or exceed expectations given the measured word error rates, suggesting that vocabulary is not the dominant issue.
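The two evaluation metrics named in the abstract can be illustrated with a short sketch. This is not the paper's actual scoring pipeline; it is a minimal, standard implementation of word error rate (word-level Levenshtein distance divided by reference length) and of precision/recall/F1 over sets of extracted concepts, with made-up example inputs.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])  # substitution
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)


def concept_prf(gold, extracted):
    """Precision, recall, and F1 over sets of extracted health concepts."""
    tp = len(gold & extracted)
    p = tp / len(extracted) if extracted else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Hypothetical example: two substitution errors in a five-word reference.
print(wer("the patient reports chest pain",
          "the patient report chess pain"))        # 0.4
print(concept_prf({"chest pain", "hypertension"},
                  {"chest pain"}))                 # (1.0, 0.5, 0.666...)
```

Note that the two metrics can diverge, which is the abstract's point: a transcript with a high WER can still yield good concept recall if the errors fall on non-clinical words.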