


Understanding How ChatGPT May Become a Clinical Administrative Tool Through an Investigation on the Ability to Answer Common Patient Questions Concerning Ulnar Collateral Ligament Injuries.

Author Information

Varady Nathan H, Lu Amy Z, Mazzucco Michael, Dines Joshua S, Altchek David W, Williams Riley J, Kunze Kyle N

Affiliations

Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA.

Weill Cornell Medical College, New York, New York, USA.

Publication Information

Orthop J Sports Med. 2024 Jul 31;12(7):23259671241257516. doi: 10.1177/23259671241257516. eCollection 2024 Jul.

DOI: 10.1177/23259671241257516
PMID: 39139744
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11320692/
Abstract

BACKGROUND

The consumer availability and automated response functions of Chat Generative Pretrained Transformer 4 (ChatGPT-4), a large language model, position this application to be used for patient health queries, and it may have a role as an adjunct that minimizes administrative and clinical burden.

PURPOSE

To evaluate the ability of ChatGPT-4 to respond to patient inquiries concerning ulnar collateral ligament (UCL) injuries and compare these results with the performance of Google.

STUDY DESIGN

Cross-sectional study.

METHODS

Google Web Search was used as a benchmark, as it is the most widely used search engine worldwide and the only search engine that generates frequently asked questions (FAQs) when prompted with a query, allowing comparisons through a systematic approach. The query "ulnar collateral ligament reconstruction" was entered into Google, and the top 10 FAQs, answers, and their sources were recorded. ChatGPT-4 was prompted to perform a Google search of FAQs with the same query and to record the sources of answers for comparison. This process was then replicated to obtain 10 new questions requiring numeric instead of open-ended responses. Finally, responses were graded independently for clinical accuracy (grade 0 = inaccurate, grade 1 = somewhat accurate, grade 2 = accurate) by 2 fellowship-trained sports medicine surgeons (D.W.A., J.S.D.) blinded to the search engine and answer source.

RESULTS

ChatGPT-4 used a greater proportion of academic sources than Google to provide answers to the top 10 FAQs, although this was not statistically significant (90% vs 50%; P = .14). In terms of question overlap, 40% of the most common questions on Google and ChatGPT-4 were the same. When comparing FAQs with numeric responses, 20% of answers were completely overlapping, 30% demonstrated partial overlap, and the remaining 50% did not demonstrate any overlap. All sources used by ChatGPT-4 to answer these FAQs were academic, while only 20% of sources used by Google were academic (P = .0007). The remaining Google sources included social media (40%), medical practices (20%), single-surgeon websites (10%), and commercial websites (10%). The mean (± standard deviation) accuracy for answers given by ChatGPT-4 was significantly greater compared with Google for the top 10 FAQs (1.9 ± 0.2 vs 1.2 ± 0.6; P = .001) and top 10 questions with numeric answers (1.8 ± 0.4 vs 1.0 ± 0.8; P = .013).
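The abstract does not name the statistical test used for the source-type proportions, but both reported P values (.14 for 9/10 vs 5/10 academic sources, .0007 for 10/10 vs 2/10) are consistent with a two-sided Fisher exact test on 2×2 counts. A minimal stdlib sketch, assuming that test, reproduces them:

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact test for a 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability is <= that of the observed table.
    """
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):  # probability of the table whose top-left cell is x
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))  # smallest feasible top-left cell
    hi = min(row1, col1)            # largest feasible top-left cell
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)

# Academic sources for the top 10 FAQs: 9/10 (ChatGPT-4) vs 5/10 (Google)
print(round(fisher_exact_two_sided([[9, 1], [5, 5]]), 2))    # ≈ .14
# Academic sources for the numeric-answer FAQs: 10/10 vs 2/10
print(round(fisher_exact_two_sided([[10, 0], [2, 8]]), 4))   # ≈ .0007
```

The accuracy-score comparisons (P = .001 and .013) would depend on the test applied to the graded 0-2 responses, which the abstract does not specify, so they are not reproduced here.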

CONCLUSION

ChatGPT-4 is capable of providing responses with clinically relevant content concerning UCL injuries and reconstruction. ChatGPT-4 utilized a greater proportion of academic websites to provide responses to FAQs representative of patient inquiries compared with Google Web Search and provided significantly more accurate answers. Moving forward, ChatGPT has the potential to be used as a clinical adjunct when answering queries about UCL injuries and reconstruction, but further validation is warranted before integrated or autonomous use in clinical settings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d40f/11320692/102a5e25a4a8/10.1177_23259671241257516-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d40f/11320692/509e64a86071/10.1177_23259671241257516-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d40f/11320692/bc4654508c40/10.1177_23259671241257516-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d40f/11320692/7e39bcd7b125/10.1177_23259671241257516-fig4.jpg

Similar Articles

1
Understanding How ChatGPT May Become a Clinical Administrative Tool Through an Investigation on the Ability to Answer Common Patient Questions Concerning Ulnar Collateral Ligament Injuries.
Orthop J Sports Med. 2024 Jul 31;12(7):23259671241257516. doi: 10.1177/23259671241257516. eCollection 2024 Jul.
2
ChatGPT-4 Performs Clinical Information Retrieval Tasks Using Consistently More Trustworthy Resources Than Does Google Search for Queries Concerning the Latarjet Procedure.
Arthroscopy. 2025 Mar;41(3):588-597. doi: 10.1016/j.arthro.2024.05.025. Epub 2024 Jun 25.
3
Using a Google Web Search Analysis to Assess the Utility of ChatGPT in Total Joint Arthroplasty.
J Arthroplasty. 2023 Jul;38(7):1195-1202. doi: 10.1016/j.arth.2023.04.007. Epub 2023 Apr 10.
4
Do ChatGPT and Google differ in answers to commonly asked patient questions regarding total shoulder and total elbow arthroplasty?
J Shoulder Elbow Surg. 2024 Aug;33(8):e429-e437. doi: 10.1016/j.jse.2023.11.014. Epub 2024 Jan 3.
5
How Does ChatGPT Use Source Information Compared With Google? A Text Network Analysis of Online Health Information.
Clin Orthop Relat Res. 2024 Apr 1;482(4):578-588. doi: 10.1097/CORR.0000000000002995. Epub 2024 Mar 1.
6
ChatGPT-4 Generates More Accurate and Complete Responses to Common Patient Questions About Anterior Cruciate Ligament Reconstruction Than Google's Search Engine.
Arthrosc Sports Med Rehabil. 2024 Apr 9;6(3):100939. doi: 10.1016/j.asmr.2024.100939. eCollection 2024 Jun.
7
Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery.
Semin Ophthalmol. 2024 Aug;39(6):472-479. doi: 10.1080/08820538.2024.2326058. Epub 2024 Mar 22.
8
Evaluation of Patient Education Materials From Large-Language Artificial Intelligence Models on Carpal Tunnel Release.
Hand (N Y). 2024 Apr 25:15589447241247332. doi: 10.1177/15589447241247332.
9
Using Google web search to analyze and evaluate the application of ChatGPT in femoroacetabular impingement syndrome.
Front Public Health. 2024 May 31;12:1412063. doi: 10.3389/fpubh.2024.1412063. eCollection 2024.
10
Assessing the Accuracy of Information on Medication Abortion: A Comparative Analysis of ChatGPT and Google Bard AI.
Cureus. 2024 Jan 2;16(1):e51544. doi: 10.7759/cureus.51544. eCollection 2024 Jan.

Cited By

1
Exploring ChatGPT's Efficacy in Orthopaedic Arthroplasty Questions Compared to Adult Reconstruction Surgeons.
Arthroplast Today. 2025 Jul 14;34:101772. doi: 10.1016/j.artd.2025.101772. eCollection 2025 Aug.

References

1
Leveraging large language models for generating responses to patient messages-a subjective analysis.
J Am Med Inform Assoc. 2024 May 20;31(6):1367-1379. doi: 10.1093/jamia/ocae052.
2
Exploring the potential of ChatGPT as a supplementary tool for providing orthopaedic information.
Knee Surg Sports Traumatol Arthrosc. 2023 Nov;31(11):5190-5198. doi: 10.1007/s00167-023-07529-2. Epub 2023 Aug 8.
3
Creation and Adoption of Large Language Models in Medicine.
JAMA. 2023 Sep 5;330(9):866-869. doi: 10.1001/jama.2023.14217.
4
Enhancing Triage Efficiency and Accuracy in Emergency Rooms for Patients with Metastatic Prostate Cancer: A Retrospective Analysis of Artificial Intelligence-Assisted Triage Using ChatGPT 4.0.
Cancers (Basel). 2023 Jul 22;15(14):3717. doi: 10.3390/cancers15143717.
5
Large language models in medicine.
Nat Med. 2023 Aug;29(8):1930-1940. doi: 10.1038/s41591-023-02448-8. Epub 2023 Jul 17.
6
Assessing ChatGPT Responses to Common Patient Questions Regarding Total Hip Arthroplasty.
J Bone Joint Surg Am. 2023 Oct 4;105(19):1519-1526. doi: 10.2106/JBJS.23.00209. Epub 2023 Jul 17.
7
Artificial intelligence in orthopaedics: can Chat Generative Pre-trained Transformer (ChatGPT) pass Section 1 of the Fellowship of the Royal College of Surgeons (Trauma & Orthopaedics) examination?
Postgrad Med J. 2023 Sep 21;99(1176):1110-1114. doi: 10.1093/postmj/qgad053.
8
Algorithmic bias and research integrity; the role of nonhuman authors in shaping scientific knowledge with respect to artificial intelligence: a perspective.
Int J Surg. 2023 Oct 1;109(10):2987-2990. doi: 10.1097/JS9.0000000000000552.
9
What's all the chatter about?
Bone Joint J. 2023 Jun 1;105-B(6):587-589. doi: 10.1302/0301-620X.105B6.BJJ-2023-0156.
10
Can Artificial Intelligence Pass the American Board of Orthopaedic Surgery Examination? Orthopaedic Residents Versus ChatGPT.
Clin Orthop Relat Res. 2023 Aug 1;481(8):1623-1630. doi: 10.1097/CORR.0000000000002704. Epub 2023 May 23.