ChatGPT efficacy for answering musculoskeletal anatomy questions: a study evaluating quality and consistency between raters and timepoints.

Affiliations

School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, 54124, Greece.

Department of Anatomy, Clinical Radiologist University of Crete, Crete, Greece.

Publication information

Surg Radiol Anat. 2024 Nov;46(11):1885-1890. doi: 10.1007/s00276-024-03477-9. Epub 2024 Sep 12.

Abstract

PURPOSE

There is increasing interest in the use of digital platforms such as ChatGPT for anatomy education. This study aims to evaluate the efficacy of ChatGPT in providing accurate and consistent responses to questions focusing on musculoskeletal anatomy across various time points (hours and days).

METHODS

Six anatomy-related questions were posed to ChatGPT 3.5 at 4 different timepoints. All answers were rated blindly by 3 expert raters for quality on a 5-point Likert scale. A difference of 0 or 1 points in Likert scores between raters was considered agreement, and the same difference between timepoints was considered consistent, indicating good reproducibility.
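The 0-or-1-point rule described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The abstract does not say how the three raters' scores were combined before timepoints were compared, so taking the mean per timepoint is an assumption here, as is reading "difference" as the gap between the highest and lowest score. The function names and all ratings are hypothetical.

```python
from statistics import mean


def within_one_point(scores):
    """Return True when the highest and lowest scores differ by at most 1 point."""
    return max(scores) - min(scores) <= 1


def summarize_question(ratings):
    """Apply the 0-or-1-point rule to one question.

    ratings: one list per timepoint, each holding the raters' Likert scores (1-5).
    Returns (per-timepoint rater agreement, consistency across timepoints).
    """
    # Agreement: do the raters fall within 1 point of each other at each timepoint?
    rater_agreement = [within_one_point(scores) for scores in ratings]
    # Assumption: each timepoint is summarized by the mean of its rater scores.
    timepoint_means = [mean(scores) for scores in ratings]
    # Consistency: do the timepoint summaries fall within 1 point of each other?
    consistent = within_one_point(timepoint_means)
    return rater_agreement, consistent


# Hypothetical ratings for two questions: 4 timepoints x 3 raters each.
stable_question = [[4, 5, 4], [4, 4, 5], [3, 4, 4], [5, 4, 4]]
unstable_question = [[5, 4, 5], [2, 1, 2], [4, 4, 3], [1, 3, 2]]

print(summarize_question(stable_question))
print(summarize_question(unstable_question))
```

For the hypothetical stable question, the raters agree at every timepoint and the answer counts as consistent. For the unstable one, the raters disagree at the last timepoint and the timepoint means spread over more than 1 point, so it is not consistent.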

RESULTS

There was significant variation in the quality of the answers, which ranged from extremely good to very poor. Consistency between timepoints also varied. Answers were rated as good quality (≥ 3 on the Likert scale) in 50% of cases (3/6) and as consistent in 66.6% of cases (4/6). The low-quality answers contained significant mistakes, conflicting data, or missing information.

CONCLUSION

At the time of writing, the quality and consistency of ChatGPT v3.5 answers are variable, which limits its utility as an independent and reliable resource for learning musculoskeletal anatomy. Validating information by reviewing the anatomical literature is highly recommended.
