Department of Pharmacy, Abashiri Kosei General Hospital, Abashiri, Japan.
Graduate School of Health Sciences, Hokkaido University, Sapporo, Japan.
J Educ Eval Health Prof. 2024;21:4. doi: 10.3352/jeehp.2024.21.4. Epub 2024 Feb 28.
The objective of this study was to assess the performance of ChatGPT (GPT-4) on all items, including those with diagrams, in the Japanese National License Examination for Pharmacists (JNLEP) and compare it with the previous GPT-3.5 model’s performance.
This study targeted the 107th JNLEP, conducted in 2022, from which 344 items were input into the GPT-4 model. Separately, 284 items, excluding those with diagrams, were entered into the GPT-3.5 model. The answers were scored, and accuracy rates were calculated by category, by subject, and by presence or absence of diagrams. The accuracy rates were then compared with the main passing criterion (overall accuracy rate ≥62.9%).
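The accuracy calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis code: the record layout, the helper names `accuracy_by_group` and `overall_accuracy`, and the toy records are all assumptions made for the example. Only the 62.9% threshold comes from the text.

```python
from collections import defaultdict

# Main passing criterion quoted in the abstract (overall accuracy rate >= 62.9%).
PASSING_OVERALL = 0.629


def accuracy_by_group(records, key):
    """Return {group value: fraction of correct answers} for one grouping key."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        totals[record[key]] += 1
        correct[record[key]] += int(record["correct"])
    return {group: correct[group] / totals[group] for group in totals}


def overall_accuracy(records):
    """Fraction of all items answered correctly."""
    return sum(int(record["correct"]) for record in records) / len(records)


# Toy records invented for illustration only; these are NOT the study's data.
records = [
    {"has_diagram": False, "correct": True},
    {"has_diagram": False, "correct": True},
    {"has_diagram": False, "correct": False},
    {"has_diagram": True, "correct": True},
    {"has_diagram": True, "correct": False},
]

by_diagram = accuracy_by_group(records, "has_diagram")
overall = overall_accuracy(records)
meets_criterion = overall >= PASSING_OVERALL
```

The same `accuracy_by_group` helper could be reused with a hypothetical `"category"` or `"subject"` key to obtain the other breakdowns mentioned in the methods.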
GPT-4's overall accuracy rate for all items in the 107th JNLEP was 72.5%, which met all the passing criteria. For the set of items without diagrams, the accuracy rate was 80.0%, significantly higher than that of the GPT-3.5 model (43.5%). For items that included diagrams, the GPT-4 model's accuracy rate was 36.1%.
Advancements that allow GPT-4 to process images have made it possible for LLMs to answer all items in medical-related license examinations. This study’s findings confirm that ChatGPT (GPT-4) possesses sufficient knowledge to meet the passing criteria.