Accuracy of Spanish and English-generated ChatGPT responses to commonly asked patient questions about labor epidurals: a survey-based study among bilingual obstetric anesthesia experts.

Authors

Gonzalez Fiol Antonio, Mootz Allison A, He Zili, Delgado Carlos, Ortiz Vilma, Reale Sharon C

Affiliations

Department of Anesthesiology, Yale School of Medicine, New Haven, CT, United States.

Department of Anesthesiology, University of Texas Southwestern Medical Center & Parkland Memorial Hospital, Dallas, TX, United States.

Publication information

Int J Obstet Anesth. 2025 Feb;61:104290. doi: 10.1016/j.ijoa.2024.104290. Epub 2024 Nov 6.

DOI:10.1016/j.ijoa.2024.104290
PMID:39579604
Abstract

BACKGROUND

Large language models (LLMs), of which ChatGPT is the most well known, are now available to patients to seek medical advice in various languages. However, the accuracy of the information utilized to train these models remains unknown.

METHODS

Ten commonly asked questions regarding labor epidurals were translated from English to Spanish, and all 20 questions were entered into ChatGPT version 3.5. The answers were transcribed. A survey was then sent to 10 bilingual fellowship-trained obstetric anesthesiologists to assess the accuracy of these answers utilizing a 5-point Likert scale.

RESULTS

Overall, the accuracy scores for the ChatGPT-generated answers in Spanish were lower than for the English answers, with a median score of 34 (IQR 33-36.5) versus 40.5 (IQR 39-44.3), respectively (P value 0.02). Answers to two questions were scored significantly lower: "Do epidurals prolong labor?" (2 (IQR 2-2.5) versus 4 (IQR 4-4.5), P value 0.03) and "Do epidurals increase the risk of needing cesarean delivery?" (3 (IQR 2-4) versus 4 (IQR 4-5), P value 0.03). There was strong agreement that answers to the question "Do epidurals cause autism?" were accurate in both Spanish and English.
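As a reading aid for the "median (IQR)" figures, the sketch below shows one way such summaries can be computed from 5-point Likert ratings. It assumes that the overall scores are each rater's ratings summed across questions (an inference from the reported medians exceeding 5, not something the abstract states). The ratings, the function names `rater_totals` and `median_iqr`, and the choice of inclusive quartiles are all hypothetical; none of this is the study's data or its statistical procedure.

```python
from statistics import median, quantiles


def rater_totals(ratings):
    """Sum each rater's per-question Likert ratings (1-5) into one total."""
    for row in ratings:
        if any(not 1 <= r <= 5 for r in row):
            raise ValueError("Likert ratings must lie in 1..5")
    return [sum(row) for row in ratings]


def median_iqr(values):
    """Return (median, Q1, Q3); quartiles use the inclusive method (an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical ratings, NOT study data: 4 raters x 3 questions.
hypothetical_ratings = [
    [4, 5, 3],
    [3, 4, 4],
    [5, 5, 4],
    [2, 3, 4],
]

totals = rater_totals(hypothetical_ratings)
print(totals)              # one summed score per rater
print(median_iqr(totals))  # (median, Q1, Q3) of those totals
```

Different quartile conventions give slightly different IQR bounds on small samples, so a reproduction would need the method the authors actually used.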

CONCLUSION

ChatGPT-generated answers in Spanish to ten questions about labor epidurals scored lower for accuracy than answers generated in English, particularly regarding the effect of labor epidurals on labor course and mode of delivery. This disparity in ChatGPT-generated information may extend already-known health inequities among non-English-speaking patients and perpetuate misinformation.

Similar articles

1
Artificial intelligence chatbots versus traditional medical resources for patient education on "Labor Epidurals": an evaluation of accuracy, emotional tone, and readability.
Int J Obstet Anesth. 2025 Feb;61:104302. doi: 10.1016/j.ijoa.2024.104302. Epub 2024 Nov 26.
2
Readability, quality and accuracy of generative artificial intelligence chatbots for commonly asked questions about labor epidurals: a comparison of ChatGPT and Bard.
Int J Obstet Anesth. 2025 Feb;61:104317. doi: 10.1016/j.ijoa.2024.104317. Epub 2024 Dec 20.
3
The Accuracy of ChatGPT-Generated Responses in Answering Commonly Asked Patient Questions About Labor Epidurals: Correspondence.
Anesth Analg. 2024 Jun 1;138(6):e37. doi: 10.1213/ANE.0000000000006978. Epub 2024 May 20.
4
Ensuring Accuracy and Equity in Vaccination Information From ChatGPT and CDC: Mixed-Methods Cross-Language Evaluation.
JMIR Form Res. 2024 Oct 30;8:e60939. doi: 10.2196/60939.
5
Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study.
J Med Internet Res. 2024 Apr 17;26:e56655. doi: 10.2196/56655.
6
Evidence-Based Potential of Generative Artificial Intelligence Large Language Models on Dental Avulsion: ChatGPT Versus Gemini.
Dent Traumatol. 2025 Apr;41(2):178-186. doi: 10.1111/edt.12999. Epub 2024 Nov 2.
7
A comparative study of English and Japanese ChatGPT responses to anaesthesia-related medical questions.
BJA Open. 2024 Jun 14;10:100296. doi: 10.1016/j.bjao.2024.100296. eCollection 2024 Jun.
8
Proficiency, Clarity, and Objectivity of Large Language Models Versus Specialists' Knowledge on COVID-19's Impacts in Pregnancy: Cross-Sectional Pilot Study.
JMIR Form Res. 2025 Feb 5;9:e56126. doi: 10.2196/56126.
9
Evaluating ChatGPT's Multilingual Performance in Clinical Nutrition Advice Using Synthetic Medical Text: Insights from Central Asia.
J Nutr. 2025 Mar;155(3):729-735. doi: 10.1016/j.tjnut.2024.12.018. Epub 2024 Dec 26.

Cited by

1
Harnessing Generative Artificial Intelligence in Pediatric Anesthesia: Enhancing Learning, Patient Care, and Family Communication.
Paediatr Anaesth. 2025 Sep;35(9):691-694. doi: 10.1111/pan.70005. Epub 2025 Jun 24.