OpenAI's GPT-4 performs to a high degree on board-style dermatology questions.
Affiliation
Department of Dermatology, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, New Hyde Park, NY, USA.
Publication information
Int J Dermatol. 2024 Jan;63(1):73-78. doi: 10.1111/ijd.16913. Epub 2023 Dec 22.
BACKGROUND
Artificial intelligence tools such as OpenAI's GPT-4 have shown promise in medical education, but their potential in dermatology remains unexplored.
OBJECTIVES
To assess GPT-4's performance on dermatology board-style questions and determine its value as a supplementary educational tool for trainees and educators.
METHODS
This cross-sectional study evaluated GPT-4's performance on 250 random dermatology board-style questions sampled from the American Academy of Dermatology's Board Prep Plus resource. Questions were divided into five subspecialties and various difficulty levels. GPT-4 responses were compared to the correct answers and evaluated by two physicians.
RESULTS
GPT-4 achieved an overall accuracy of 75% on the 250 questions, with no significant variation by subspecialty or question difficulty. The most common errors were factual inaccuracies and misunderstandings. Responses scored high in clarity, accuracy, and relevance but frequently lacked depth and completeness.
CONCLUSION
GPT-4 performed to a high degree and showed promise as an educational adjunct in dermatology. Improvements in response depth and completeness are needed before its use as an unsupervised learning tool is established.