Quantitative evaluation of GPT-4's performance on US and Chinese osteoarthritis treatment guideline interpretation and orthopaedic case consultation.

Author Information

Li Juntan, Gao Xiang, Dou Tianxu, Gao Yuyang, Li Xu, Zhu Wannan

Affiliations

Jinzhou Medical University, Jinzhou, Liaoning, China.

The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China.

Publication Information

BMJ Open. 2024 Dec 30;14(12):e082344. doi: 10.1136/bmjopen-2023-082344.

DOI:10.1136/bmjopen-2023-082344
PMID:39806703
Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11749315/
Abstract

OBJECTIVES

To evaluate GPT-4's performance in interpreting osteoarthritis (OA) treatment guidelines from the USA and China, and to assess its ability to diagnose and manage orthopaedic cases.

SETTING

The study was conducted using publicly available OA treatment guidelines and simulated orthopaedic case scenarios.

PARTICIPANTS

No human participants were involved. The evaluation focused on GPT-4's responses to clinical guidelines and case questions, assessed by two orthopaedic specialists.

OUTCOMES

Primary outcomes included the accuracy and completeness of GPT-4's responses to guideline-based queries and case scenarios. Metrics included the correct match rate, completeness score and stratification of case responses into predefined tiers of correctness.

RESULTS

In interpreting the American Academy of Orthopaedic Surgeons and Chinese OA guidelines, GPT-4 achieved a correct match rate of 46.4% and complete agreement with all score-2 recommendations. The accuracy score for guideline interpretation was 4.3±1.6 (95% CI 3.9 to 4.7), and the completeness score was 2.8±0.6 (95% CI 2.5 to 3.1). For case-based questions, GPT-4 demonstrated high performance, with over 88% of responses rated as comprehensive.

CONCLUSIONS

GPT-4 demonstrates promising capabilities as an auxiliary tool in orthopaedic clinical practice and patient education, with high levels of accuracy and completeness in guideline interpretation and clinical case analysis. However, further validation is necessary to establish its utility in real-world clinical settings.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbea/11749315/2ee69592628d/bmjopen-14-12-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbea/11749315/3bbc12bf011a/bmjopen-14-12-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbea/11749315/5eb19be83a02/bmjopen-14-12-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbea/11749315/2f075e4217be/bmjopen-14-12-g004.jpg

Similar Articles

1. Quantitative evaluation of GPT-4's performance on US and Chinese osteoarthritis treatment guideline interpretation and orthopaedic case consultation.
BMJ Open. 2024 Dec 30;14(12):e082344. doi: 10.1136/bmjopen-2023-082344.

2. The Rapid Development of Artificial Intelligence: GPT-4's Performance on Orthopedic Surgery Board Questions.
Orthopedics. 2024 Mar-Apr;47(2):e85-e89. doi: 10.3928/01477447-20230922-05. Epub 2023 Sep 27.

3. What can GPT-4 do for Diagnosing Rare Eye Diseases? A Pilot Study.
Ophthalmol Ther. 2023 Dec;12(6):3395-3402. doi: 10.1007/s40123-023-00789-8. Epub 2023 Sep 1.

4. Comparing Artificial Intelligence-Generated and Clinician-Created Personalized Self-Management Guidance for Patients With Knee Osteoarthritis: Blinded Observational Study.
J Med Internet Res. 2025 May 7;27:e67830. doi: 10.2196/67830.

5. Assessing GPT-4's accuracy in answering clinical pharmacological questions on pain therapy.
Br J Clin Pharmacol. 2025 Aug;91(8):2294-2303. doi: 10.1002/bcp.70036. Epub 2025 Mar 11.

6. Comparative analysis of GPT-4-based ChatGPT's diagnostic performance with radiologists using real-world radiology reports of brain tumors.
Eur Radiol. 2025 Apr;35(4):1938-1947. doi: 10.1007/s00330-024-11032-8. Epub 2024 Aug 28.

7. Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study.
J Med Internet Res. 2024 Apr 17;26:e56655. doi: 10.2196/56655.

8. The Potential of GPT-4 as a Support Tool for Pharmacists: Analytical Study Using the Japanese National Examination for Pharmacists.
JMIR Med Educ. 2023 Oct 30;9:e48452. doi: 10.2196/48452.

9. Performance Evaluation of the Generative Pre-trained Transformer (GPT-4) on the Family Medicine In-Training Examination.
J Am Board Fam Med. 2024 Oct 25;37(4):528-582. doi: 10.3122/jabfm.2023.230433R1.

10. Evaluating ChatGPT-4's Accuracy in Identifying Final Diagnoses Within Differential Diagnoses Compared With Those of Physicians: Experimental Study for Diagnostic Cases.
JMIR Form Res. 2024 Jun 26;8:e59267. doi: 10.2196/59267.

Cited By

1. Large Language Models in Medical Diagnostics: Scoping Review With Bibliometric Analysis.
J Med Internet Res. 2025 Jun 9;27:e72062. doi: 10.2196/72062.

References

1. ChatGPT outperforms crowd workers for text-annotation tasks.
Proc Natl Acad Sci U S A. 2023 Jul 25;120(30):e2305016120. doi: 10.1073/pnas.2305016120. Epub 2023 Jul 18.

2. Evaluating GPT4 on Impressions Generation in Radiology Reports.
Radiology. 2023 Jun;307(5):e231259. doi: 10.1148/radiol.231259.

3. ChatGPT and the clinical informatics board examination: the end of unproctored maintenance of certification?
J Am Med Inform Assoc. 2023 Aug 18;30(9):1558-1560. doi: 10.1093/jamia/ocad104.

4. Performance of ChatGPT, GPT-4, and Google Bard on a Neurosurgery Oral Boards Preparation Question Bank.
Neurosurgery. 2023 Nov 1;93(5):1090-1098. doi: 10.1227/neu.0000000000002551. Epub 2023 Jun 12.

5. GPT-4 accuracy and completeness against International Consensus Statement on Allergy and Rhinology: Rhinosinusitis.
Int Forum Allergy Rhinol. 2023 Dec;13(12):2231-2234. doi: 10.1002/alr.23201. Epub 2023 Jun 16.

6. ChatGPT in medicine: an overview of its applications, advantages, limitations, future prospects, and ethical considerations.
Front Artif Intell. 2023 May 4;6:1169595. doi: 10.3389/frai.2023.1169595. eCollection 2023.

7. Translating radiology reports into plain language using ChatGPT and GPT-4 with prompt learning: results, limitations, and potential.
Vis Comput Ind Biomed Art. 2023 May 18;6(1):9. doi: 10.1186/s42492-023-00136-5.

8. GPT-4 in Radiology: Improvements in Advanced Reasoning.
Radiology. 2023 Jun;307(5):e230987. doi: 10.1148/radiol.230987. Epub 2023 May 16.

9. Can ChatGPT/GPT-4 assist surgeons in confronting patients with Mpox and handling future epidemics?
Int J Surg. 2023 Aug 1;109(8):2544-2548. doi: 10.1097/JS9.0000000000000453.

10. Artificial Intelligence in Sports Medicine: Could GPT-4 Make Human Doctors Obsolete?
Ann Biomed Eng. 2023 Aug;51(8):1658-1662. doi: 10.1007/s10439-023-03213-1. Epub 2023 Apr 25.