Evaluation of the accuracy and safety of machine translation of patient-specific discharge instructions: a comparative analysis.

Author information

Kong Marianna, Fernandez Alicia, Bains Jaskaran, Milisavljevic Ana, Brooks Katherine C, Shanmugam Akash, Avilez Leslie, Li Junhong, Honcharov Vladyslav, Yang Andersen, Khoong Elaine C

Affiliations

Department of Family and Community Medicine, University of California San Francisco, San Francisco, California, USA.

Division of General Internal Medicine at Zuckerberg San Francisco General Hospital, University of California San Francisco, San Francisco, California, USA.

Publication information

BMJ Qual Saf. 2025 Jul 9. doi: 10.1136/bmjqs-2024-018384.

Abstract

INTRODUCTION

Machine translation of patient-specific information could mitigate language barriers if sufficiently accurate and non-harmful and may be particularly useful in healthcare encounters when professional translators are not readily available. We evaluated the translation accuracy and potential for harm of ChatGPT-4 and Google Translate in translating from English to Spanish, Chinese and Russian.

METHODS

We used ChatGPT-4 and Google Translate to translate 50 sets (316 sentences) of deidentified, patient-specific, clinician free-text emergency department instructions into Spanish, Chinese and Russian. These were then back-translated into English by professional translators and double-coded by physicians for accuracy and potential for clinical harm.

RESULTS

At the sentence level, we found that both tools were ≥90% accurate in translating English to Spanish (accuracy: GPT 97%, Google Translate 96%) and English to Chinese (accuracy: GPT 95%; Google Translate 90%); neither tool performed as well in translating English to Russian (accuracy: GPT 89%; Google Translate 80%). At the instruction set level, 16%, 24% and 56% of Spanish, Chinese and Russian GPT-translated instruction sets contained at least one inaccuracy. For Google Translate, 24%, 56% and 66% of Spanish, Chinese and Russian translations contained at least one inaccuracy. The potential for harm due to inaccurate translations was ≤1% for both tools in all languages at the sentence level and ≤6% at the instruction set level. GPT was significantly more accurate than Google Translate in Chinese and Russian at the sentence level; the potential for harm was similar.
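
The gap between sentence-level accuracy and set-level inaccuracy can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not the study's method: it assumes, purely for illustration, that sentence errors are independent and that every instruction set has the average length (316 sentences / 50 sets). The function and variable names are hypothetical.

```python
# Illustrative only: the independence assumption and all names below are
# assumptions for this sketch, not the study's analysis.

N_SENTENCES = 316
N_SETS = 50
AVG_SENTENCES_PER_SET = N_SENTENCES / N_SETS  # about 6.3 sentences per set

# Sentence-level accuracy as reported in the abstract.
SENTENCE_ACCURACY = {
    ("GPT", "Spanish"): 0.97,
    ("GPT", "Chinese"): 0.95,
    ("GPT", "Russian"): 0.89,
    ("Google Translate", "Spanish"): 0.96,
    ("Google Translate", "Chinese"): 0.90,
    ("Google Translate", "Russian"): 0.80,
}

# Share of instruction sets with at least one inaccuracy, as reported.
REPORTED_SET_INACCURACY = {
    ("GPT", "Spanish"): 0.16,
    ("GPT", "Chinese"): 0.24,
    ("GPT", "Russian"): 0.56,
    ("Google Translate", "Spanish"): 0.24,
    ("Google Translate", "Chinese"): 0.56,
    ("Google Translate", "Russian"): 0.66,
}


def expected_set_inaccuracy(sentence_accuracy, sentences_per_set=AVG_SENTENCES_PER_SET):
    """Probability that a set contains at least one inaccurate sentence,
    assuming independent errors and equal-length sets."""
    return 1.0 - sentence_accuracy ** sentences_per_set


if __name__ == "__main__":
    for (tool, language), accuracy in SENTENCE_ACCURACY.items():
        estimate = expected_set_inaccuracy(accuracy)
        reported = REPORTED_SET_INACCURACY[(tool, language)]
        print(f"{tool:16s} {language:8s} "
              f"estimated {estimate:.0%}  reported {reported:.0%}")
```

Under these assumptions, a tool that is 97% accurate per sentence would still be expected to produce at least one inaccuracy in roughly one in six instruction sets, which is in the neighborhood of the reported 16%. The estimates do not match the reported figures exactly, as expected when set lengths vary and errors are not independent.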

CONCLUSION

These results support the potential of machine translation tools to mitigate gaps in translation services for low-stakes written communication from English to Spanish, while also strengthening the case for caution and for professional oversight in non-low-risk communication. Further research is needed to evaluate machine translation for other languages and more technical content.
