Applying GPT-4 to the Plastic Surgery Inservice Training Examination.

Author Affiliations

Division of Plastic Surgery, Department of Surgery, St. Louis University School of Medicine, St. Louis, MO, USA.

Department of Plastic Surgery, Rutgers New Jersey School of Medicine, Newark, NJ, USA.

Publication Information

J Plast Reconstr Aesthet Surg. 2023 Dec;87:78-82. doi: 10.1016/j.bjps.2023.09.027. Epub 2023 Sep 14.

Abstract

BACKGROUND

The recent introduction of Generative Pre-trained Transformer (GPT)-4 has demonstrated the potential to improve upon ChatGPT-3.5; GPT-4 is widely regarded as a more reliable and creative version of GPT-3.5.

OBJECTIVE

In conjunction with our prior manuscript, we sought to determine whether GPT-4 could serve as a tool for plastic surgery graduate medical education by evaluating its performance on the Plastic Surgery Inservice Training Examination (PSITE).

METHODS

Sample assessment questions from the 2022 PSITE were obtained from the American Council of Academic Plastic Surgeons website and manually entered into GPT-4. GPT-4's responses were evaluated using properties of natural coherence. Incorrect answers were stratified into the following categories: informational, logical, or explicit fallacy.
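The study entered questions into GPT-4 manually through the chat interface. As a minimal sketch of how the same workflow could be automated, assuming the OpenAI Python client, an API key in the environment, and a hypothetical question/choices format (the prompt wording and the example item below are illustrative assumptions, not the authors' protocol):

```python
# Illustrative sketch only; the study entered questions manually via the chat UI.
# Assumes the `openai` Python client and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

def ask_gpt4(question: str, choices: dict[str, str]) -> str:
    """Send one multiple-choice PSITE-style item to GPT-4 and return its raw answer."""
    options = "\n".join(f"{letter}. {text}" for letter, text in choices.items())
    prompt = (
        "Answer the following multiple-choice question and explain your reasoning.\n\n"
        f"{question}\n{options}"
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Hypothetical example item (not an actual PSITE question).
answer = ask_gpt4(
    "Which flap is most commonly used for autologous breast reconstruction?",
    {"A": "DIEP flap", "B": "Radial forearm flap", "C": "Fibula flap", "D": "Groin flap"},
)
print(answer)
```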

RESULTS

From a total of 242 questions, GPT-4 answered 187 correctly, for an accuracy of 77.3%. Logical reasoning was used in 95.0% of questions, internal information in 98.3%, and external information in 97.5%. When questions were stratified by correct versus incorrect responses, a statistically significant difference was identified in GPT-4's use of logical reasoning.
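As a check on the reported accuracy, and to illustrate how the correct-versus-incorrect comparison of logical-reasoning use might be tested, here is a brief sketch; the 2x2 counts are placeholders, since the abstract reports only overall percentages, and Fisher's exact test is an assumed choice rather than the authors' stated method:

```python
# Accuracy reported in the abstract: 187 correct out of 242 questions.
correct, total = 187, 242
print(f"Accuracy: {correct / total:.1%}")  # -> 77.3%

# Sketch of a correct-vs-incorrect comparison of logical-reasoning use.
# The counts below are PLACEHOLDERS (only the 187/55 row totals come from
# the abstract), and Fisher's exact test is an assumed choice of test.
from scipy.stats import fisher_exact

#            used logic   no logic
table = [
    [180, 7],   # correct responses (placeholder split of 187)
    [50, 5],    # incorrect responses (placeholder split of 55)
]
odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
```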

CONCLUSION

GPT-4 has been shown to be more accurate and reliable for plastic surgery resident education than GPT-3.5. Users should consider utilizing the tool to enhance their educational curricula. Those who adopt such models may be better equipped to deliver high-quality care to their patients.
