Enhancing Patient-Physician Communication: Simulating African American Vernacular English in Medical Diagnostics with Large Language Models.

Author information

Lee Yeawon, Chang Chia-Hsuan, Yang Christopher C

Affiliations

Drexel University, Philadelphia, PA 19104 USA.

Yale University, New Haven, CT 06510 USA.

Publication information

J Healthc Inform Res. 2025 Mar 11;9(2):119-153. doi: 10.1007/s41666-025-00194-9. eCollection 2025 Jun.

Abstract

Effective communication is crucial in reducing health disparities. However, linguistic differences, such as the use of African American Vernacular English (AAVE), can lead to communication gaps between patients and physicians, negatively affecting care and outcomes. This study examines whether large language models (LLMs), specifically GPT-4 and Llama 3.3, can replicate AAVE in simulated clinical dialogues to improve cultural sensitivity. We tested four prompt types (BaseP, DemoP, LingP, and CompP) using United States Medical Licensing Examination (USMLE) case simulations. Statistical analyses of the models' outputs showed a significant difference among prompt types for both GPT-4 (F(2,70) = 6.218, p = 0.003) and Llama 3.3 (F(2,70) = 12.124, p < 0.001), indicating that including demographic information and/or explicit AAVE cues influences each model's output. Combining demographic and linguistic cues (CompP) yielded the highest mean AAVE feature counts (e.g., 9.83 for GPT-4 vs. 16.06 for Llama 3.3), although neither model fully captured the diversity of AAVE. Moreover, simply mentioning African American demographics triggered extra informal forms, suggesting built-in stereotypes or biases in both models. Overall, these findings highlight the promise of LLMs for culturally sensitive healthcare communication, while underscoring the need for continued refinement to address stereotypes and more accurately represent diverse linguistic styles.
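As a quick consistency check on the reported statistics, the sketch below recomputes the upper-tail probabilities, assuming the two test statistics are F values with (2, 70) degrees of freedom; the abstract does not name the test, so this is an assumption. The helper name `f_sf_dfn2` is hypothetical, and the closed form it uses holds only when the numerator degrees of freedom equal 2.

```python
def f_sf_dfn2(f_value: float, dfd: int) -> float:
    """Upper-tail probability P(F > f_value) for an F(2, dfd) distribution.

    Closed form valid only for numerator df = 2:
        sf(x) = (1 + 2x / dfd) ** (-dfd / 2)
    """
    return (1.0 + 2.0 * f_value / dfd) ** (-dfd / 2.0)


# Statistics as reported in the abstract; treating them as F values is an assumption.
reported = {"GPT-4": 6.218, "Llama 3.3": 12.124}

# Denominator degrees of freedom taken from the reported (2, 70).
p_values = {model: f_sf_dfn2(f, 70) for model, f in reported.items()}

for model, p in p_values.items():
    print(f"{model}: p = {p:.5f}")
```

Under that assumption, the recomputed values come out at roughly 0.003 for GPT-4 and below 0.001 for Llama 3.3, in line with the figures stated in the abstract.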

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s41666-025-00194-9.
