Comparison of ChatGPT versions in informing patients with rotator cuff injuries.

Author Information

Günay Ali Eray, Özer Alper, Yazıcı Alparslan, Sayer Gökhan

Affiliations

Department of Orthopedics and Traumatology, Kayseri City Training and Research Hospital, Kayseri, Turkey.

Department of Orthopedics and Traumatology, Develi State Hospital, Kayseri, Turkey.

Publication Information

JSES Int. 2024 May 6;8(5):1016-1018. doi: 10.1016/j.jseint.2024.04.016. eCollection 2024 Sep.

Abstract

BACKGROUND

The aim of this study is to evaluate whether Chat Generative Pretrained Transformer (ChatGPT) can be recommended as a resource for informing patients planning rotator cuff repairs, and to assess the differences between ChatGPT 3.5 and 4.0 versions in terms of information content and readability.

METHODS

In August 2023, 13 questions commonly asked by patients with rotator cuff disease were posed to the ChatGPT 3.5 and ChatGPT 4 programs by 3 surgeons experienced in rotator cuff surgery, using computers with different internet protocol (IP) addresses. After converting the answers from both versions into text, the quality and readability of the answers were examined.

RESULTS

The average Journal of the American Medical Association score for both versions was 0, and the average DISCERN score was 61.6. A statistically significant and strong correlation was found between ChatGPT 3.5 and 4.0 DISCERN scores. There was excellent agreement in DISCERN scores for both versions among the 3 evaluators. ChatGPT 3.5 was found to be less readable than ChatGPT 4.0.

CONCLUSION

The information provided by the ChatGPT conversational system was evaluated as of high quality, but there were significant shortcomings in terms of reliability due to the lack of citations. Despite the ChatGPT 4.0 version having higher readability scores, both versions were considered difficult to read.
