Quantitative evaluation of GPT-4's performance on US and Chinese osteoarthritis treatment guideline interpretation and orthopaedic case consultation.

Authors

Li Juntan, Gao Xiang, Dou Tianxu, Gao Yuyang, Li Xu, Zhu Wannan

Affiliations

Jinzhou Medical University, Jinzhou, Liaoning, China.

The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China.

Publication information

BMJ Open. 2024 Dec 30;14(12):e082344. doi: 10.1136/bmjopen-2023-082344.

DOI: 10.1136/bmjopen-2023-082344

PMID: 39806703

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11749315/

Abstract

OBJECTIVES

To evaluate GPT-4's performance in interpreting osteoarthritis (OA) treatment guidelines from the USA and China, and to assess its ability to diagnose and manage orthopaedic cases.

SETTING

The study was conducted using publicly available OA treatment guidelines and simulated orthopaedic case scenarios.

PARTICIPANTS

No human participants were involved. The evaluation focused on GPT-4's responses to clinical guidelines and case questions, assessed by two orthopaedic specialists.

OUTCOMES

Primary outcomes included the accuracy and completeness of GPT-4's responses to guideline-based queries and case scenarios. Metrics included the correct match rate, completeness score and stratification of case responses into predefined tiers of correctness.

RESULTS

In interpreting the American Academy of Orthopaedic Surgeons and Chinese OA guidelines, GPT-4 achieved a correct match rate of 46.4% and complete agreement with all score-2 recommendations. The accuracy score for guideline interpretation was 4.3±1.6 (95% CI 3.9 to 4.7), and the completeness score was 2.8±0.6 (95% CI 2.5 to 3.1). For case-based questions, GPT-4 demonstrated high performance, with over 88% of responses rated as comprehensive.
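
As an illustration of how summary figures of this kind (a match rate, and a mean ± SD with a 95% CI) are typically derived from rater scores, here is a minimal sketch. It assumes a normal-approximation interval and uses made-up scores and labels; the abstract does not report the raw data, the sample sizes or the exact method used for the intervals, so none of the values below are the study's own.

```python
import math
import statistics


def mean_sd_ci(scores, z=1.96):
    """Return (mean, sample SD, (lower, upper)) for a list of scores.

    Uses the normal approximation mean +/- z * SD / sqrt(n); this is an
    assumed method, not necessarily the one used in the paper.
    """
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD with n - 1 in the denominator
    half_width = z * sd / math.sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)


def match_rate(model_labels, guideline_labels):
    """Fraction of items where the model's rating equals the guideline's."""
    if len(model_labels) != len(guideline_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(m == g for m, g in zip(model_labels, guideline_labels))
    return matches / len(guideline_labels)


if __name__ == "__main__":
    # Hypothetical accuracy scores for eight responses (illustration only).
    example_scores = [5, 4, 3, 5, 2, 6, 4, 5]
    mean, sd, (lower, upper) = mean_sd_ci(example_scores)
    print(f"{mean:.1f} ± {sd:.1f} (95% CI {lower:.1f} to {upper:.1f})")

    # Hypothetical recommendation strengths (illustration only).
    model = ["strong", "moderate", "limited", "strong"]
    guideline = ["strong", "moderate", "moderate", "strong"]
    print(f"correct match rate: {match_rate(model, guideline):.1%}")
```

With small samples, a t-distribution multiplier in place of the fixed 1.96 would give a somewhat wider interval.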

CONCLUSIONS

GPT-4 demonstrates promising capabilities as an auxiliary tool in orthopaedic clinical practice and patient education, with high levels of accuracy and completeness in guideline interpretation and clinical case analysis. However, further validation is necessary to establish its utility in real-world clinical settings.
