Analysis of Responses of GPT-4 V to the Japanese National Clinical Engineer Licensing Examination.

Affiliations

Department of Materials and Human Environmental Sciences, Faculty of Engineering, Shonan Institute of Technology, Fujisawa, Japan.

Department of Medical Informatics, School of Allied Health Science, Kitasato University, Sagamihara, Japan.

Publication information

J Med Syst. 2024 Sep 11;48(1):83. doi: 10.1007/s10916-024-02103-w.

Abstract

Chat Generative Pretrained Transformer (ChatGPT; OpenAI) is a state-of-the-art large language model that can simulate human-like conversations based on user input. We evaluated the performance of GPT-4 V in the Japanese National Clinical Engineer Licensing Examination using 2,155 questions from 2012 to 2023. The average correct answer rate for all questions was 86.0%. In particular, clinical medicine, basic medicine, medical materials, biological properties, and mechanical engineering achieved correct answer rates of ≥ 90%. Conversely, medical device safety management, electrical and electronic engineering, and extracorporeal circulation had low correct answer rates, ranging from 64.8% to 76.5%. The correct answer rates for questions that included figures/tables, required numerical calculation, involved both figures/tables and calculation (figure/table ∩ calculation), and required knowledge of Japanese Industrial Standards were 55.2%, 85.8%, 64.2%, and 31.0%, respectively. The low correct answer rates were attributed to ChatGPT's lack of image recognition and of knowledge of standards and laws. This study concludes that careful attention is required when using ChatGPT because several of its explanations lack the correct description.
