GPT-4's performance in supporting physician decision-making in nephrology multiple-choice questions.

Author information

Noda Ryunosuke, Tanabe Kenichiro, Ichikawa Daisuke, Shibagaki Yugo

Affiliations

Division of Nephrology and Hypertension, Department of Internal Medicine, St. Marianna University School of Medicine, 2-16-1 Sugao, Miyamae-ku, Kawasaki, Kanagawa, 216-8511, Japan.

Pathophysiology and Bioregulation, St. Marianna University School of Medicine, Kawasaki, Japan.

Publication information

Sci Rep. 2025 May 2;15(1):15439. doi: 10.1038/s41598-025-99774-3.

Abstract

Generative Pre-trained Transformer (GPT)-4, a versatile conversational artificial intelligence, has potential applications in medicine, but its ability to support physicians' decision-making remains unclear. We evaluated GPT-4's performance in assisting physicians with nephrology questions. Forty-five single-answer multiple-choice questions were extracted from the Core Curriculum in Nephrology articles published in the American Journal of Kidney Diseases from October 2021 to June 2023. Eight junior physicians without board certification and ten senior physicians with board certification answered these questions twice: first unaided, then with the opportunity to revise their answers based on GPT-4's outputs. GPT-4 correctly answered 77.8% of the questions. Before using GPT-4, the median (interquartile range) proportion of correct answers was 53.3% (48.3-53.3) for junior physicians and 65.6% (60.6-66.7) for senior physicians. After GPT-4 support, the median proportion of correct answers significantly increased to 72.2% (68.3-76.1) for juniors and 75.6% (73.3-80.0) for seniors (p = 0.008 and p = 0.004, respectively). The improvement was significantly greater for junior physicians (p = 0.017). However, senior physicians showed a decreased proportion of correct answers in one of the clinical categories. GPT-4 significantly improved physicians' accuracy in nephrology, especially among less experienced physicians, but may have negative impacts in specific subfields. Careful consideration is required when using GPT-4 to support physicians' decision-making.
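The percentages in the abstract can be read back as question counts, since every score is out of 45 questions. The sketch below is a reader's back-of-the-envelope check, not the authors' analysis: the helper name `implied_count` and the rounding to the nearest half question are illustrative assumptions (a median over an even number of physicians can fall between two whole counts). For example, 77.8% of 45 corresponds to 35 correct answers.

```python
# Back-of-the-envelope conversion of the abstract's percentages into
# question counts. N_QUESTIONS comes from the abstract; the helper name
# and the half-question rounding are illustrative assumptions.

N_QUESTIONS = 45


def implied_count(percent: float, n: int = N_QUESTIONS) -> float:
    """Return the number of correct answers implied by a percentage,
    rounded to the nearest half question."""
    return round(percent / 100 * n * 2) / 2


# Values as reported in the abstract.
reported = {
    "GPT-4": 77.8,
    "junior, unaided (median)": 53.3,
    "senior, unaided (median)": 65.6,
    "junior, with GPT-4 (median)": 72.2,
    "senior, with GPT-4 (median)": 75.6,
}

for label, pct in reported.items():
    print(f"{label}: {pct}% ~ {implied_count(pct)} of {N_QUESTIONS}")

# Shift between the reported medians, in percentage points.
# This is not the same quantity as the median of per-physician gains,
# which the abstract does not report.
junior_shift = round(72.2 - 53.3, 1)
senior_shift = round(75.6 - 65.6, 1)
print(f"junior median shift: {junior_shift} points")
print(f"senior median shift: {senior_shift} points")
```

The shift between medians is only a rough summary. The abstract's between-group comparison (p = 0.017) is based on the study's own analysis, which this sketch does not reproduce.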

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ff6/12048615/cb86baec4fae/41598_2025_99774_Fig1_HTML.jpg
