Large Language Models in the Diagnosis of Hand and Peripheral Nerve Injuries: An Evaluation of ChatGPT and the Isabel Differential Diagnosis Generator.

Authors

AlShenaiber Abdullah, Datta Shaishav, Mosa Adam J, Binhammer Paul A, Ing Edsel B

Affiliations

Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.

Division of Plastic, Reconstructive & Aesthetic Surgery, Department of Surgery, University of Toronto, Toronto, ON, Canada.

Publication information

J Hand Surg Glob Online. 2024 Sep 3;6(6):847-854. doi: 10.1016/j.jhsg.2024.07.011. eCollection 2024 Nov.

DOI:10.1016/j.jhsg.2024.07.011
PMID:39703593
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11652307/
Abstract

PURPOSE

Tools using artificial intelligence may help reduce missed or delayed diagnoses and improve patient care in hand surgery. This study aimed to compare and evaluate the performance of two natural language processing programs, Isabel and ChatGPT-4, in diagnosing hand and peripheral nerve injuries from a set of clinical vignettes.

METHODS

Cases from a virtual library of hand surgery case reports with no history of trauma or previous surgery were included in this study. The clinical details (age, sex, symptoms, signs, and medical history) of 16 hand cases were entered into Isabel and ChatGPT-4 to generate top 10 differential diagnosis lists. Isabel and ChatGPT-4's inclusion and median rank of the correct diagnosis within each list were compared. Two hand surgeons were then provided each list and asked to independently evaluate the performance of the two systems.

RESULTS

Isabel correctly identified 7/16 (44%) of cases with a median rank of two (interquartile range = 3). ChatGPT-4 correctly identified 14/16 (88%) of cases with a median rank of one (interquartile range = 1). Physicians one and two, respectively, preferred the lists generated by ChatGPT-4 in 12/16 (75%) and 13/16 (81%) of cases and had no preference in 2/16 (13%) of cases.

CONCLUSIONS

ChatGPT-4 had significantly greater diagnostic accuracy within our sample (P < .05) and generated higher quality differential diagnoses than Isabel. Isabel produced several inappropriate and imprecise differential diagnoses.
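
As a rough sanity check on the reported counts (7/16 for Isabel, 14/16 for ChatGPT-4), the sketch below recomputes the accuracy percentages and a two-sided Fisher exact test. The abstract does not name the statistical test the authors used, so Fisher's test is an assumption made here for illustration. Because both tools were run on the same 16 cases, a paired test such as McNemar's would arguably be a more natural choice, but it needs per-case results that the abstract does not give. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k "correct" in row 1, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )


# Counts taken from the Results section of the abstract.
n_cases = 16
isabel_correct = 7
chatgpt_correct = 14

# Assumption: the two groups are treated as independent samples.
p_value = fisher_exact_two_sided(
    isabel_correct, n_cases - isabel_correct,
    chatgpt_correct, n_cases - chatgpt_correct,
)

print(f"Isabel: {isabel_correct / n_cases:.0%}")
print(f"ChatGPT-4: {chatgpt_correct / n_cases:.0%}")
print(f"Two-sided Fisher exact p = {p_value:.4f}")
```

Under these assumptions the p-value falls below .05, which is consistent with the significance claim above, though it is not necessarily the value the authors obtained.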

CLINICAL RELEVANCE

Despite large language models' potential utility in generating medical diagnoses, physicians must continue to exercise high caution and use their clinical judgment when making diagnostic decisions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4877/11652307/698812da3753/figs1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4877/11652307/ae3e289c9348/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4877/11652307/2a8025322d57/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4877/11652307/b400ee6d99ae/gr3.jpg
