Decoding disparities: evaluating automatic speech recognition system performance in transcribing Black and White patient verbal communication with nurses in home healthcare.

Authors

Zolnoori Maryam, Vergez Sasha, Xu Zidu, Esmaeili Elyas, Zolnour Ali, Anne Briggs Krystal, Scroggins Jihye Kim, Hosseini Ebrahimabad Seyed Farid, Noble James M, Topaz Maxim, Bakken Suzanne, Bowles Kathryn H, Spens Ian, Onorato Nicole, Sridharan Sridevi, McDonald Margaret V

Affiliations

Columbia University Irving Medical Center, New York, NY 10032, United States.

School of Nursing, Columbia University, New York, NY 10032, United States.

Publication information

JAMIA Open. 2024 Dec 10;7(4):ooae130. doi: 10.1093/jamiaopen/ooae130. eCollection 2024 Dec.

DOI:10.1093/jamiaopen/ooae130
PMID:39659993
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11631515/
Abstract

OBJECTIVES

As artificial intelligence evolves, integrating speech processing into home healthcare (HHC) workflows is increasingly feasible. Audio-recorded communications enhance risk identification models, with automatic speech recognition (ASR) systems as a key component. This study evaluates the transcription accuracy and equity of 4 ASR systems: Amazon Web Services (AWS) General, AWS Medical, Whisper, and Wave2Vec. It examines how accurately each system transcribes patient-nurse communication in US HHC, focusing on speech from Black and White English-speaking patients.

MATERIALS AND METHODS

We analyzed audio recordings of patient-nurse encounters from 35 patients (16 Black and 19 White) in a New York City-based HHC service. Overall, 860 utterances were available for study, including 475 drawn from Black patients and 385 from White patients. Automatic speech recognition performance was measured using word error rate (WER), benchmarked against a manual gold standard. Disparities were assessed by comparing ASR performance across racial groups using the linguistic inquiry and word count (LIWC) tool, focusing on 10 linguistic dimensions, as well as specific speech elements including repetition, filler words, and proper nouns (medical and nonmedical terms).
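For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between the ASR output and the reference transcript, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the authors' evaluation pipeline; the function name, the lowercase/whitespace tokenization, and the example sentences are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Tokenization here (lowercase + whitespace split) is an assumption;
    the abstract does not describe the normalization the study used.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical utterance, not taken from the study data:
# one deleted word ("my") and one substituted word ("morning" -> "mourning")
# give 2 errors over 6 reference words.
example = word_error_rate("i took my pills this morning",
                          "i took pills this mourning")
```

Because the denominator is the reference length, WER can exceed 100% when the ASR output contains many inserted words.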

RESULTS

The average age of participants was 67.8 years (SD = 14.4). Communication lasted an average of 15 minutes (range: 11-21 minutes) with a median of 1186 words per patient. Of 860 total utterances, 475 were from Black patients and 385 from White patients. Amazon Web Services General had the highest accuracy, with a median WER of 39%. However, all systems showed reduced accuracy for Black patients, with significant discrepancies in LIWC dimensions such as "Affect," "Social," and "Drives." Amazon Web Services Medical performed best for medical terms, though all systems have difficulties with filler words, repetition, and nonmedical terms, with AWS General showing the lowest error rates at 65%, 64%, and 53%, respectively.
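The group-level comparison reported above rests on summarizing per-utterance WER within each patient group. A minimal sketch of that kind of summary follows; it is not the authors' analysis code, the function name and group labels are hypothetical, and the numeric values are made up purely to show the mechanics.

```python
from collections import defaultdict
from statistics import median


def median_wer_by_group(records):
    """Return the median WER for each group.

    `records` is an iterable of (group_label, wer) pairs, one per utterance.
    """
    by_group = defaultdict(list)
    for group, wer in records:
        by_group[group].append(wer)
    return {group: median(values) for group, values in by_group.items()}


# Made-up illustrative values, NOT results from the study.
toy_records = [
    ("group_a", 0.10), ("group_a", 0.30), ("group_a", 0.20),
    ("group_b", 0.40), ("group_b", 0.60), ("group_b", 0.50),
]
summary = median_wer_by_group(toy_records)
```

The median is used here because the abstract reports median WER; a difference between group medians would still need a significance test, which the abstract does not name.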

DISCUSSION

While AWS systems demonstrated superior accuracy, significant disparities by race highlight the need for more diverse training datasets and improved dialect sensitivity. Addressing these disparities is critical for ensuring equitable ASR performance in HHC settings and enhancing risk prediction models through audio-recorded communication.
