Evaluating the accuracy and relevance of ChatGPT responses to frequently asked questions regarding total knee replacement.

Authors

Zhang Siyuan, Liau Zi Qiang Glen, Tan Kian Loong Melvin, Chua Wei Liang

Affiliation

Department of Orthopaedic Surgery, National University Health System, Level 11, NUHS Tower Block, 1E Kent Ridge Road, Singapore, 119228, Singapore.

Publication information

Knee Surg Relat Res. 2024 Apr 2;36(1):15. doi: 10.1186/s43019-024-00218-5.

Abstract

BACKGROUND

Chat Generative Pretrained Transformer (ChatGPT), a generative artificial intelligence chatbot, may have broad applications in healthcare delivery and patient education due to its ability to provide human-like responses to a wide range of patient queries. However, there is limited evidence regarding its ability to provide reliable and useful information on orthopaedic procedures. This study seeks to evaluate the accuracy and relevance of responses provided by ChatGPT to frequently asked questions (FAQs) regarding total knee replacement (TKR).

METHODS

A list of 50 clinically relevant FAQs regarding TKR was collated. Each question was individually entered as a prompt to ChatGPT (version 3.5), and the first response generated was recorded. Two independent orthopaedic surgeons reviewed the responses and graded each on a Likert scale for factual accuracy and relevance. Using preset thresholds on the Likert scale, the responses were then classified as accurate or inaccurate and as relevant or irrelevant.
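The grading-and-threshold step can be illustrated with a minimal Python sketch. It is not the authors' analysis: the abstract does not state the threshold values or how the two surgeons' grades were combined, so the cut-off of 4.0, the averaging of the two grades, and the toy data below are all assumptions made purely for illustration.

```python
from statistics import mean

# Hypothetical cut-off: the abstract mentions "preset thresholds"
# but does not say what they were.
ACCURACY_THRESHOLD = 4.0


def classify(grades_by_question, threshold=ACCURACY_THRESHOLD):
    """Average each question's rater grades and flag whether it meets the threshold.

    Averaging the two raters is an assumption; the abstract does not
    describe how the grades were combined.
    """
    results = []
    for grades in grades_by_question:
        avg = mean(grades)
        results.append((avg, avg >= threshold))
    return results


def summarize(results):
    """Return (count meeting threshold, total, percentage, overall mean grade)."""
    total = len(results)
    passed = sum(1 for _, ok in results if ok)
    overall = mean(avg for avg, _ in results)
    return passed, total, 100.0 * passed / total, overall


# Made-up grades from two raters for four questions (NOT the study's data).
toy_grades = [(5, 5), (4, 5), (3, 4), (5, 4)]
passed, total, pct, overall = summarize(classify(toy_grades))
print(f"{passed}/{total} ({pct:.0f}%) meet the threshold; mean grade {overall:.2f}/5")
```

The same two functions could be applied separately to accuracy grades and relevance grades, each with its own threshold.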

RESULTS

Most responses were accurate, and all were relevant. Of the 50 ChatGPT responses, 44 (88%) were classified as accurate, with a mean Likert grade of 4.6/5 for factual accuracy. All 50 (100%) were classified as relevant, with a mean Likert grade of 4.9/5 for relevance.

CONCLUSION

ChatGPT performed well in providing accurate and relevant responses to FAQs regarding TKR, demonstrating great potential as a tool for patient education. However, it is not infallible and can occasionally provide inaccurate medical information. Patients and clinicians intending to utilize this technology should be mindful of its limitations and ensure adequate supervision and verification of information provided.
