Ayık Gökhan, Ercan Niyazi, Demirtaş Yunus, Yıldırım Tuğrul, Çakmak Gökhan
Yüksek İhtisas Üniversitesi Ortopedi ve Travmatoloji Anabilim Dalı, 06530 Çankaya, Ankara, Türkiye.
Jt Dis Relat Surg. 2025 Jan 2;36(1):193-199. doi: 10.52312/jdrs.2025.1961. Epub 2024 Dec 18.
This study aimed to evaluate the responses provided by ChatGPT-4o to the most frequently asked questions by patients regarding hip arthroscopy.
In this cross-sectional survey study, a new Google account without a search history was created to identify the 20 most frequently asked questions about hip arthroscopy via Google. These questions were posed to a new ChatGPT-4o account on June 1, 2024, and the responses were recorded. Ten orthopedic surgeons specializing in sports surgery rated the responses on relevance, accuracy, clarity, and completeness using a rating scale from 1 to 5, with 1 being the worst and 5 being the best. Interrater reliability was assessed via the intraclass correlation coefficient (ICC).
The lowest score given by the surgeons for any response was 4/5 in every subcategory. The highest mean scores were in accuracy and clarity, followed by relevance, with completeness receiving the lowest scores. The overall mean score was 4.49±0.16. Interrater reliability showed insufficient overall agreement (ICC=0.004, p=0.383), with the highest agreement in clarity (ICC=0.039, p=0.131) and the lowest in accuracy (ICC=-0.019, p=0.688).
The study confirms our hypothesis that ChatGPT-4o provides above-average-quality responses to frequently asked questions about hip arthroscopy, as evidenced by the high scores in relevance, accuracy, clarity, and completeness. However, it remains advisable to consult orthopedic specialists, with ChatGPT's suggestions serving only as supplementary input in the final decision-making process.