Analysis of Responses of GPT-4 V to the Japanese National Clinical Engineer Licensing Examination.

Affiliations

Department of Materials and Human Environmental Sciences, Faculty of Engineering, Shonan Institute of Technology, Fujisawa, Japan.

Department of Medical Informatics, School of Allied Health Science, Kitasato University, Sagamihara, Japan.

Publication information

J Med Syst. 2024 Sep 11;48(1):83. doi: 10.1007/s10916-024-02103-w.

DOI: 10.1007/s10916-024-02103-w

PMID: 39259341

Abstract

Chat Generative Pretrained Transformer (ChatGPT; OpenAI) is a state-of-the-art large language model that can simulate human-like conversations based on user input. We evaluated the performance of GPT-4V on the Japanese National Clinical Engineer Licensing Examination using 2,155 questions from 2012 to 2023. The average correct answer rate across all questions was 86.0%. Clinical medicine, basic medicine, medical materials, biological properties, and mechanical engineering each achieved a correct answer rate of ≥ 90%. Conversely, medical device safety management, electrical and electronic engineering, and extracorporeal circulation had lower correct answer rates, ranging from 64.8% to 76.5%. The correct answer rates were 55.2% for questions that included figures or tables, 85.8% for questions that required numerical calculation, 64.2% for questions that involved both figures or tables and calculation, and 31.0% for questions that required knowledge of the Japanese Industrial Standards. The low rates were attributed to ChatGPT's failure to recognize the images and its lack of knowledge of standards and laws. The study concludes that ChatGPT should be used with care because several of its explanations contain incorrect descriptions.

Similar articles

Analysis of Responses of GPT-4 V to the Japanese National Clinical Engineer Licensing Examination.

J Med Syst. 2024 Sep 11;48(1):83. doi: 10.1007/s10916-024-02103-w.

GPT-4/4V's performance on the Japanese National Medical Licensing Examination.

Med Teach. 2025 Mar;47(3):450-457. doi: 10.1080/0142159X.2024.2342545. Epub 2024 Apr 22.

Capability of GPT-4V(ision) in the Japanese National Medical Licensing Examination: Evaluation Study.

JMIR Med Educ. 2024 Mar 12;10:e54393. doi: 10.2196/54393.

Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study.

JMIR Form Res. 2023 Oct 13;7:e48023. doi: 10.2196/48023.

Performance of ChatGPT-3.5 and ChatGPT-4 in the Taiwan National Pharmacist Licensing Examination: Comparative Evaluation Study.

JMIR Med Educ. 2025 Jan 17;11:e56850. doi: 10.2196/56850.

Performance of ChatGPT-3.5 and GPT-4 in national licensing examinations for medicine, pharmacy, dentistry, and nursing: a systematic review and meta-analysis.

BMC Med Educ. 2024 Sep 16;24(1):1013. doi: 10.1186/s12909-024-05944-8.

Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis.

J Med Internet Res. 2024 Jul 25;26:e60807. doi: 10.2196/60807.

Reshaping medical education: Performance of ChatGPT on a PES medical examination.

Cardiol J. 2024;31(3):442-450. doi: 10.5603/cj.97517. Epub 2023 Oct 13.

Evaluating Chat Generative Pretrained Transformer (GPT-4o) Problem-Solving Performance in the Japan Certificate Examination for Biomedical Engineering Class 1.

Cureus. 2025 Mar 23;17(3):e81029. doi: 10.7759/cureus.81029. eCollection 2025 Mar.

ChatGPT Performs on the Chinese National Medical Licensing Examination.

J Med Syst. 2023 Aug 15;47(1):86. doi: 10.1007/s10916-023-01961-0.

Cited by

Role of Artificial Intelligence in Surgical Training by Assessing GPT-4 and GPT-4o on the Japan Surgical Board Examination With Text-Only and Image-Accompanied Questions: Performance Evaluation Study.

JMIR Med Educ. 2025 Jul 30;11:e69313. doi: 10.2196/69313.

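Evaluating Chat Generative Pretrained Transformer (GPT-4o) Problem-Solving Performance in the Japan Certificate Examination for Biomedical Engineering Class 1.

Cureus. 2025 Mar 23;17(3):e81029. doi: 10.7759/cureus.81029. eCollection 2025 Mar.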

Enhancing ophthalmology students' awareness of retinitis pigmentosa: assessing the efficacy of ChatGPT in AI-assisted teaching of rare diseases-a quasi-experimental study.

Front Med (Lausanne). 2025 Mar 18;12:1534294. doi: 10.3389/fmed.2025.1534294. eCollection 2025.

ChatGPT (GPT-4V) Performance on the Healthcare Information Technologist Examination in Japan.

Cureus. 2025 Jan 1;17(1):e76775. doi: 10.7759/cureus.76775. eCollection 2025 Jan.

References

Performance of generative pre-trained transformers (GPTs) in Certification Examination of the College of Family Physicians of Canada.

Fam Med Community Health. 2024 May 28;12(Suppl 1):e002626. doi: 10.1136/fmch-2023-002626.

Performance of ChatGPT on the National Korean Occupational Therapy Licensing Examination.

Digit Health. 2024 Feb 29;10:20552076241236635. doi: 10.1177/20552076241236635. eCollection 2024 Jan-Dec.

Performance of ChatGPT on Stage 1 of the Taiwanese medical licensing exam.

Digit Health. 2024 Feb 16;10:20552076241233144. doi: 10.1177/20552076241233144. eCollection 2024 Jan-Dec.

Performance of ChatGPT on Chinese national medical licensing examinations: a five-year examination evaluation study for physicians, pharmacists and nurses.

BMC Med Educ. 2024 Feb 14;24(1):143. doi: 10.1186/s12909-024-05125-7.

Performance of Generative Pretrained Transformer on the National Medical Licensing Examination in Japan.

PLOS Digit Health. 2024 Jan 23;3(1):e0000433. doi: 10.1371/journal.pdig.0000433. eCollection 2024 Jan.

The Performance of GPT-3.5, GPT-4, and Bard on the Japanese National Dentist Examination: A Comparison Study.

Cureus. 2023 Dec 12;15(12):e50369. doi: 10.7759/cureus.50369. eCollection 2023 Dec.

ChatGPT in Iranian medical licensing examination: evaluating the diagnostic accuracy and decision-making capabilities of an AI-based model.

BMJ Health Care Inform. 2023 Dec 11;30(1):e100815. doi: 10.1136/bmjhci-2023-100815.

Evaluation of the performance of GPT-3.5 and GPT-4 on the Polish Medical Final Examination.

Sci Rep. 2023 Nov 22;13(1):20512. doi: 10.1038/s41598-023-46995-z.

Beyond the Pass Mark: Accuracy of ChatGPT and Bing in the National Medical Licensure Examination in Japan.

JMA J. 2023 Oct 16;6(4):536-538. doi: 10.31662/jmaj.2023-0043. Epub 2023 Sep 20.

Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study.

JMIR Form Res. 2023 Oct 13;7:e48023. doi: 10.2196/48023.
