Performance of ChatGPT-3.5 and ChatGPT-4 in the Taiwan National Pharmacist Licensing Examination: Comparative Evaluation Study.

Authors

Wang Ying-Mei, Shen Hung-Wei, Chen Tzeng-Ji, Chiang Shu-Chiung, Lin Ting-Guan

Affiliations

Department of Medical Education and Research, Taipei Veterans General Hospital Hsinchu Branch, 81, Section 1, Zhongfeng Road, Zhudong, Hsinchu, 310, Taiwan, 886 03-5962134 ext 127.

Department of Pharmacy, Taipei Veterans General Hospital Hsinchu Branch, Hsinchu, Taiwan.

Publication information

JMIR Med Educ. 2025 Jan 17;11:e56850. doi: 10.2196/56850.

DOI:10.2196/56850
PMID:39864950
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11769692/
Abstract

BACKGROUND

OpenAI released versions ChatGPT-3.5 and GPT-4 between 2022 and 2023. GPT-3.5 has demonstrated proficiency in various examinations, particularly the United States Medical Licensing Examination. However, GPT-4 has more advanced capabilities.

OBJECTIVE

This study aims to examine the efficacy of GPT-3.5 and GPT-4 within the Taiwan National Pharmacist Licensing Examination and to ascertain their utility and potential application in clinical pharmacy and education.

METHODS

The pharmacist examination in Taiwan consists of 2 stages: basic subjects and clinical subjects. In this study, exam questions were manually fed into the GPT-3.5 and GPT-4 models, and their responses were recorded; graphic-based questions were excluded. This study encompassed three steps: (1) determining the answering accuracy of GPT-3.5 and GPT-4, (2) categorizing question types and observing differences in model performance across these categories, and (3) comparing model performance on calculation and situational questions. Microsoft Excel and R software were used for statistical analyses.
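
The comparison described in steps (1) to (3) can be sketched roughly as follows. This is a minimal illustration, not the authors' analysis: the abstract does not say which statistical test was used, so the exact McNemar test for paired outcomes is an assumption here, and the graded answers below are hypothetical toy data rather than results from the study. The function names `accuracy` and `mcnemar_exact` are likewise invented for this sketch.

```python
from math import comb


def accuracy(graded):
    """Fraction of questions answered correctly (graded is a list of booleans)."""
    return sum(graded) / len(graded)


def mcnemar_exact(model_a, model_b):
    """Two-sided exact McNemar test on paired correct/incorrect outcomes.

    Assumption: both models answered the same questions in the same order.
    Only discordant pairs (one model right, the other wrong) enter the test.
    """
    a_only = sum(1 for x, y in zip(model_a, model_b) if x and not y)
    b_only = sum(1 for x, y in zip(model_a, model_b) if y and not x)
    n = a_only + b_only
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(a_only, b_only)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided p-value
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy grading for 10 questions (NOT data from the study)
gpt35 = [True, False, False, True, False, True, False, False, True, False]
gpt4 = [True, True, True, True, False, True, True, True, True, True]

print(accuracy(gpt35), accuracy(gpt4), mcnemar_exact(gpt35, gpt4))
```

A paired test is a natural fit when both models answer the same questions, but an unpaired comparison of proportions (for example, a chi-square test) would be an equally plausible reading of the abstract.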

RESULTS

GPT-4 achieved an accuracy rate of 72.9%, overshadowing GPT-3.5, which achieved 59.1% (P<.001). In the basic subjects category, GPT-4 significantly outperformed GPT-3.5 (73.4% vs 53.2%; P<.001). However, in clinical subjects, only minor differences in accuracy were observed. Specifically, GPT-4 outperformed GPT-3.5 in the calculation and situational questions.

CONCLUSIONS

This study demonstrates that GPT-4 outperforms GPT-3.5 in the Taiwan National Pharmacist Licensing Examination, particularly in basic subjects. While GPT-4 shows potential for use in clinical practice and pharmacy education, its limitations warrant caution. Future research should focus on refining prompts, improving model stability, integrating medical databases, and designing questions that better assess student competence and minimize guessing.

