Miyazaki Yuki, Hata Masahiro, Omori Hisaki, Hirashima Atsuya, Nakagawa Yuta, Eto Mitsuhiro, Takahashi Shun, Ikeda Manabu
Department of Psychiatry, Osaka University Graduate School of Medicine, Suita, Japan.
Department of Psychiatry, Shichiyama Hospital, Sennan District, Japan.
JMIR Med Educ. 2024 Dec 24;10:e63129. doi: 10.2196/63129.
This study evaluated the performance of ChatGPT with GPT-4 Omni (GPT-4o) on the 118th Japanese Medical Licensing Examination, covering both text-only and image-based questions. The model achieved high overall accuracy, with no significant difference in performance between the two question types. Common errors involved clinical judgment and prioritization, underscoring the need for further improvement in integrating artificial intelligence into medical education and practice.