Quality and Agreement With Scientific Consensus of ChatGPT Information Regarding Corneal Transplantation and Fuchs Dystrophy.

Affiliations

Morgan State University, Baltimore, MD.

Harvard Medical School, Boston, MA.

Publication information

Cornea. 2024 Jun 1;43(6):746-750. doi: 10.1097/ICO.0000000000003439. Epub 2023 Nov 28.

Abstract

PURPOSE

ChatGPT is a source of information commonly used by patients and clinicians. However, it can be prone to error and requires validation. We sought to assess the quality and accuracy of information regarding corneal transplantation and Fuchs dystrophy from 2 iterations of ChatGPT, and whether its answers improve over time.

METHODS

A total of 10 corneal specialists collaborated to assess the algorithm's responses to 10 commonly asked questions related to endothelial keratoplasty and Fuchs dystrophy. These questions were posed to both ChatGPT-3.5 and its newer generation, GPT-4. Assessments tested the quality, safety, accuracy, and bias of the information. Chi-squared tests, Fisher exact tests, and regression analyses were conducted.
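The abstract names Fisher exact tests among the analyses but does not report the underlying counts or the software used. As an illustration only, the sketch below computes a two-sided Fisher exact p-value for a 2x2 table using the Python standard library. The function name `fisher_exact_two_sided` and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (NOT study data): rows are two model versions,
# columns are "against consensus" and "consistent with consensus".
hypothetical_table = [[7, 13], [1, 19]]
p_value = fisher_exact_two_sided(7, 13, 1, 19)
print(f"two-sided p = {p_value:.4f}")
```

A library routine such as `scipy.stats.fisher_exact` would normally be preferred in practice. The hand-written version is shown only to make explicit that the test sums hypergeometric probabilities over tables with the same margins.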

RESULTS

We analyzed 180 valid responses. On a 1 (A+) to 5 (F) scale, the average score given by all specialists across questions was 2.5 for ChatGPT-3.5 and 1.4 for GPT-4, a significant improvement (P < 0.0001). Most responses by both ChatGPT-3.5 (61%) and GPT-4 (89%) used correct facts, a proportion that significantly improved across iterations (P < 0.00001). Approximately a third (35%) of responses from ChatGPT-3.5 were considered against the scientific consensus, a notable rate of error that decreased to only 5% of answers from GPT-4 (P < 0.00001).

CONCLUSIONS

The quality of ChatGPT responses significantly improved between versions 3.5 and 4, and the odds of providing information against the scientific consensus decreased. However, the technology is still capable of producing inaccurate statements. Corneal specialists are uniquely positioned to help users discern the veracity and applicability of such information.
