
Assessing Accuracy of Chat Generative Pre-Trained Transformer's Responses to Common Patient Questions Regarding Congenital Upper Limb Differences.

Authors

Zeller Niklaus P, Shah Ayush D, Van Heest Ann E, Bohn Deborah C

Affiliations

University of Minnesota Medical School, Minneapolis, MN.

Department of Orthopedic Surgery, University of Minnesota, Minneapolis, MN.

Publication information

J Hand Surg Glob Online. 2025 May 31;7(4):100764. doi: 10.1016/j.jhsg.2025.100764. eCollection 2025 Jul.

DOI:10.1016/j.jhsg.2025.100764
PMID:40520541
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12164003/
Abstract

PURPOSE

The purpose was to assess the ability of Chat Generative Pre-Trained Transformer (ChatGPT) 4.0 to accurately and reliably answer patients' frequently asked questions (FAQs) about congenital upper limb differences (CULDs) and their treatment options.

METHODS

Two pediatric hand surgeons were queried regarding FAQs they receive from parents about CULDs. Sixteen FAQs were input to ChatGPT-4.0 for the following conditions: (1) syndactyly, (2) polydactyly, (3) radial longitudinal deficiency, (4) thumb hypoplasia, and (5) general congenital hand differences. Two additional psychosocial care questions were queried, and all responses were graded by the surgeons using a scale of 1-4, based on the quality of the response. Independent chats were used for each question to reduce memory-retention bias with no pretraining of the software application.

RESULTS

Overall, ChatGPT provided relatively reliable, evidence-based responses to the 16 queried FAQs. In total, 164 grades were assigned to the 82 ChatGPT responses: 83 (51%) did not require any clarification, 37 (23%) required minimal clarification, 32 (20%) required moderate clarification, and 13 (8%) received an unsatisfactory rating. However, there was considerable variability in the depth of many responses. When queried on medical associations with syndactyly and polydactyly, ChatGPT provided a detailed account of associated syndromes, although there was no mention that syndromic involvement is relatively rare. Furthermore, ChatGPT recommended that the patients consult a health care provider for individualized care 81 times in 49 responses. It commonly "referred" patients to genetic counselors (n = 26, 32%), followed by pediatric orthopedic surgeons and orthopedic surgeons (n = 16, 20%), and hand surgeons (n = 9, 11%).
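
As a quick arithmetic check, the percentages reported above can be reproduced from the grade counts. The sketch below is illustrative only: the variable and function names are hypothetical, and it assumes each percentage was computed against the stated total of 164 grades and rounded to the nearest whole number. Note that the four counts as printed sum to 165, one more than the stated total; the abstract does not explain the difference.

```python
# Grade counts as reported in the RESULTS paragraph.
grade_counts = {
    "no clarification": 83,
    "minimal clarification": 37,
    "moderate clarification": 32,
    "unsatisfactory": 13,
}

# Total number of grades stated in the abstract (assumed denominator).
STATED_TOTAL = 164


def percent_of_total(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


percentages = {
    label: percent_of_total(count, STATED_TOTAL)
    for label, count in grade_counts.items()
}
counted_total = sum(grade_counts.values())

for label, pct in percentages.items():
    print(f"{label}: {grade_counts[label]} ({pct}%)")
print(f"sum of counts: {counted_total}, stated total: {STATED_TOTAL}")
```

Under these assumptions the computed values match the reported 51%, 23%, 20%, and 8%; because each figure is rounded independently, they add up to slightly more than 100%.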

CONCLUSIONS

Chat Generative Pre-Trained Transformer provided evidence-based responses not requiring clarification to a majority of FAQs about CULDs. However, there was considerable variation across the responses, and it rarely "referred" patients to hand surgeons. As new tools for patient education, ChatGPT and similar large language models should be approached cautiously when seeking information about CULDs. Responses do not consistently provide comprehensive, individualized information, and 8% of responses were misleading.

TYPE OF STUDY/LEVEL OF EVIDENCE: Economic/decision analysis IIC.
