
Putting ChatGPT's Medical Advice to the (Turing) Test: Survey Study.

Author Information

Nov Oded, Singh Nina, Mann Devin

Affiliations

Department of Technology Management, Tandon School of Engineering, New York University, New York, NY, United States.

Department of Population Health, Grossman School of Medicine, New York University, New York, NY, United States.

Publication Information

JMIR Med Educ. 2023 Jul 10;9:e46939. doi: 10.2196/46939.

Abstract

BACKGROUND

Chatbots are being piloted to draft responses to patient questions, but patients' ability to distinguish between provider and chatbot responses and patients' trust in chatbots' functions are not well established.

OBJECTIVE

This study aimed to assess the feasibility of using ChatGPT (Chat Generative Pre-trained Transformer) or a similar artificial intelligence-based chatbot for patient-provider communication.

METHODS

A survey study was conducted in January 2023. Ten representative, nonadministrative patient-provider interactions were extracted from the electronic health record. Patients' questions were entered into ChatGPT with a request for the chatbot to respond using approximately the same word count as the human provider's response. In the survey, each patient question was followed by a provider- or ChatGPT-generated response. Participants were informed that 5 responses were provider generated and 5 were chatbot generated. Participants were asked, and financially incentivized, to correctly identify the response source. Participants were also asked about their trust in chatbots' functions in patient-provider communication, using a Likert scale from 1 to 5.

RESULTS

A US-representative sample of 430 study participants aged 18 and older was recruited on Prolific, a crowdsourcing platform for academic studies. In all, 426 participants filled out the full survey. After removing participants who spent less than 3 minutes on the survey, 392 respondents remained. Overall, 53.3% (209/392) of the analyzed respondents were women, and the average age was 47.1 (range 18-91) years. The correct classification of responses ranged from 49% (192/392) to 85.7% (336/392) across questions. On average, chatbot responses were identified correctly in 65.5% (1284/1960) of the cases, and human provider responses were identified correctly in 65.1% (1276/1960) of the cases. On average, participants' trust in chatbots' functions was weakly positive (mean Likert score 3.4 out of 5), with lower trust as the health-related complexity of the task in the questions increased.
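As a minimal sketch (an illustration of the arithmetic, not part of the published analysis), the aggregate identification rates follow from the 392 analyzed respondents each judging 5 chatbot and 5 provider responses, giving 392 × 5 = 1960 judgments per source:

    # Illustrative Python sketch: reproduces the aggregate percentages reported in the Results
    respondents = 392                        # respondents retained after quality filtering
    judgments_per_source = respondents * 5   # 5 chatbot and 5 provider responses each -> 1960
    chatbot_correct = 1284                   # chatbot responses identified correctly
    provider_correct = 1276                  # provider responses identified correctly
    print(f"Chatbot:  {chatbot_correct / judgments_per_source:.1%}")    # ~65.5%
    print(f"Provider: {provider_correct / judgments_per_source:.1%}")   # ~65.1%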

CONCLUSIONS

ChatGPT responses to patient questions were weakly distinguishable from provider responses. Laypeople appear to trust the use of chatbots to answer lower-risk health questions. It is important to continue studying patient-chatbot interaction as chatbots move from administrative to more clinical roles in health care.

