Examining Real-World Medication Consultations and Drug-Herb Interactions: ChatGPT Performance Evaluation.

Authors

Hsu Hsing-Yu, Hsu Kai-Cheng, Hou Shih-Yen, Wu Ching-Lung, Hsieh Yow-Wen, Cheng Yih-Dih

Affiliations

Department of Pharmacy, China Medical University Hospital, Taichung, Taiwan.

Graduate Institute of Clinical Pharmacy, College of Medicine, National Taiwan University, Taipei, Taiwan.

Publication information

JMIR Med Educ. 2023 Aug 21;9:e48433. doi: 10.2196/48433.

DOI:10.2196/48433

PMID:37561097

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10477918/

Abstract

BACKGROUND

Since its release by OpenAI, ChatGPT has garnered significant attention for its strong capability in handling natural language tasks and its user-friendly interface.

OBJECTIVE

A prospective analysis is required to evaluate the accuracy and appropriateness of medication consultation responses generated by ChatGPT.

METHODS

A prospective cross-sectional study was conducted by the pharmacy department of a medical center in Taiwan. The test data set comprised retrospective medication consultation questions collected from February 1, 2023, to February 28, 2023, along with common questions about drug-herb interactions. Two distinct sets of questions were tested: real-world medication consultation questions and common questions about interactions between traditional Chinese and Western medicines. We used the conventional double-review mechanism. The appropriateness of each response from ChatGPT was assessed by 2 experienced pharmacists. In the event of a discrepancy between the assessments, a third pharmacist stepped in to make the final decision.
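The double-review mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual tooling: the boolean encoding of a rating, the names `final_rating` and `appropriateness_rate`, and the callable standing in for the third pharmacist are all hypothetical.

```python
from typing import Callable, Iterable

# Hypothetical encoding: True = appropriate, False = inappropriate.
Rating = bool


def final_rating(first: Rating, second: Rating,
                 third_reviewer: Callable[[], Rating]) -> Rating:
    """Return the final rating for one ChatGPT response."""
    # If the two primary reviewers agree, their shared rating stands.
    if first == second:
        return first
    # On a discrepancy, the third reviewer makes the final decision.
    return third_reviewer()


def appropriateness_rate(finals: Iterable[Rating]) -> float:
    """Share of responses whose final rating is 'appropriate'."""
    finals = list(finals)
    if not finals:
        raise ValueError("no ratings supplied")
    return sum(finals) / len(finals)
```

The third reviewer is passed as a callable so that it is consulted only when the first two assessments differ, which mirrors the escalation step in the text.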

RESULTS

Of 293 real-world medication consultation questions, a random selection of 80 was used to evaluate ChatGPT's performance. ChatGPT exhibited a higher appropriateness rate in responding to public medication consultation questions compared to those asked by health care providers in a hospital setting (31/51, 61% vs 20/51, 39%; P=.01).

CONCLUSIONS

The findings from this study suggest that ChatGPT could potentially be used for answering basic medication consultation questions. Our analysis of the erroneous information allowed us to identify potential medical risks associated with certain questions; this problem deserves our close attention.
