Department of Informatics, Donald Bren School of Informatics and Computer Science, University of California, Irvine, Irvine, California, USA.
School of Medicine, University of California, Irvine, Irvine, California, USA.
J Am Med Inform Assoc. 2023 Mar 16;30(4):703-711. doi: 10.1093/jamia/ocad001.
OBJECTIVES: Ambient clinical documentation technology uses automatic speech recognition (ASR) and natural language processing (NLP) to turn patient-clinician conversations into clinical documentation. It is a promising approach to reducing clinician burden and improving documentation quality. However, the performance of current-generation ASR remains inadequately validated. In this study, we investigated the impact of non-lexical conversational sounds (NLCS) on ASR performance. NLCS, such as Mm-hm and Uh-uh, are commonly used to convey important information in clinical conversations; for example, Mm-hm can serve as a "yes" response from the patient to the clinician's question "are you allergic to antibiotics?"

MATERIALS AND METHODS: We evaluated 2 contemporary ASR engines, Google Speech-to-Text Clinical Conversation ("Google ASR") and Amazon Transcribe Medical ("Amazon ASR"), both of which have language models specifically tailored to clinical conversations. The empirical data came from 36 primary care encounters. We conducted a series of quantitative and qualitative analyses to examine the word error rate (WER) and the potential impact of misrecognized NLCS on the quality of clinical documentation.

RESULTS: Of the 135 647 spoken words in the evaluation data, 3284 (2.4%) were NLCS. Among these NLCS, 76 (0.06% of total words, 2.3% of all NLCS) were used to convey clinically relevant information. The overall WER across all spoken words was 11.8% for Google ASR and 12.8% for Amazon ASR. However, both engines performed poorly on NLCS: the WER across frequently used NLCS was 40.8% (Google) and 57.2% (Amazon), and among the NLCS that conveyed clinically relevant information, 94.7% and 98.7%, respectively.

DISCUSSION AND CONCLUSION: Current ASR solutions are not capable of properly recognizing NLCS, particularly those that convey clinically relevant information. Although the volume of NLCS in our evaluation data was small (2.4% of the total corpus, and 0.06% for NLCS that conveyed clinically relevant information), incorrect recognition of them could introduce inaccuracies into clinical documentation and create new patient safety risks.
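The paper does not publish its scoring code, but the WER metric it reports is conventionally defined as WER = (S + D + I) / N, where S, D, and I are the substitutions, deletions, and insertions in a minimum-edit-distance alignment of hypothesis against reference, and N is the number of reference words. Below is a minimal sketch of that standard calculation in Python; the function name and the example utterance are illustrative, not taken from the study.

```python
def word_error_rate(reference: list[str], hypothesis: list[str]) -> float:
    """Compute WER = (S + D + I) / N via word-level Levenshtein alignment,
    where N is the number of words in the reference transcript."""
    n, m = len(reference), len(hypothesis)
    # dp[i][j] = minimum edits to turn reference[:i] into hypothesis[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i  # i deletions
    for j in range(m + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution or match
            )
    return dp[n][m] / n if n else 0.0

# Hypothetical example of an NLCS misrecognition: the affirming "mm-hm"
# is dropped entirely from the ASR output.
ref = "mm-hm i am allergic to penicillin".split()
hyp = "i am allergic to penicillin".split()
print(f"WER = {word_error_rate(ref, hyp):.1%}")  # 1 deletion / 6 words = 16.7%
```

As the example suggests, a dropped NLCS costs only one word of WER yet can invert the clinical meaning of the exchange, which is why the paper reports NLCS-specific error rates separately from the overall WER.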