Wright Benjamin M, Bodnar Michael S, Moore Andrew D, Maseda Meghan C, Kucharik Michael P, Diaz Connor C, Schmidt Christian M, Mir Hassan R
Morsani College of Medicine, University of South Florida, Tampa, Florida, USA.
Department of Orthopaedic Surgery, University of South Florida, Tampa, Florida, USA.
Bone Jt Open. 2024 Feb 15;5(2):139-146. doi: 10.1302/2633-1462.52.BJO-2023-0113.R1.
While internet search engines have long been patients' primary source for answers to health questions, artificial intelligence large language models such as ChatGPT are trending towards becoming the new primary source. The purpose of this study was to determine whether ChatGPT can answer patient questions about total hip arthroplasty (THA) and total knee arthroplasty (TKA) with consistent accuracy, comprehensiveness, and easy readability.
We posed the 20 most Google-searched questions about THA and TKA, plus ten additional postoperative questions, to ChatGPT. Each question was asked twice to evaluate for consistency in quality. Following each response, we responded with, "Please explain so it is easier to understand," to evaluate ChatGPT's ability to reduce response reading grade level, measured as Flesch-Kincaid Grade Level (FKGL). Five resident physicians rated the 120 responses on 1 to 5 accuracy and comprehensiveness scales. Additionally, they answered a "yes" or "no" question regarding acceptability. Mean scores were calculated for each question, and responses were deemed acceptable if ≥ four raters answered "yes."
The mean accuracy and comprehensiveness scores were 4.26 (95% confidence interval (CI) 4.19 to 4.33) and 3.79 (95% CI 3.69 to 3.89), respectively. Of all responses, 59.2% (71/120; 95% CI 50.0% to 67.7%) were acceptable. ChatGPT was consistent when asked the same question twice, with no significant difference in accuracy (t = 0.821; p = 0.415), comprehensiveness (t = 1.387; p = 0.171), acceptability (χ² = 1.832; p = 0.176), or FKGL (t = 0.264; p = 0.793). The FKGL was significantly lower (t = 2.204; p = 0.029) for the simplified responses (11.14; 95% CI 10.57 to 11.71) than for the original responses (12.15; 95% CI 11.45 to 12.85).
ChatGPT answered THA and TKA patient questions with accuracy comparable to that previously reported for websites and with adequate comprehensiveness, but with limited acceptability as a sole information source. ChatGPT has potential for answering patient questions about THA and TKA, but needs improvement.