Performance of ChatGPT in Diagnosis of Corneal Eye Diseases.

Author Information

Delsoz Mohammad, Madadi Yeganeh, Munir Wuqaas M, Tamm Brendan, Mehravaran Shiva, Soleimani Mohammad, Djalilian Ali, Yousefi Siamak

Affiliations

Hamilton Eye Institute, Department of Ophthalmology, University of Tennessee Health Science Center, Memphis, TN, USA.

Department of Ophthalmology and Visual Sciences, University of Maryland School of Medicine, Baltimore, MD, USA.

Publication Information

medRxiv. 2023 Aug 28:2023.08.25.23294635. doi: 10.1101/2023.08.25.23294635.

Abstract

INTRODUCTION

To assess the capabilities of ChatGPT-4.0 and ChatGPT-3.5 in diagnosing corneal eye diseases from case reports and to compare their performance with that of human experts.

METHODS

We randomly selected 20 cases of corneal diseases, including corneal infections, dystrophies, degenerations, and injuries, from a publicly accessible online database from the University of Iowa. We then input the text of each case description into ChatGPT-4.0 and ChatGPT-3.5 and asked each model for a provisional diagnosis. Finally, we evaluated the responses against the correct diagnoses, compared them with the diagnoses of three cornea specialists (human experts), and assessed interobserver agreement.
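For readers who want to reproduce the querying step, the protocol amounts to sending each case description to the model and recording its answer. The sketch below assumes the OpenAI Python client; the system prompt and the model identifiers (gpt-4, gpt-3.5-turbo) are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch of the querying step; the prompt wording and model names
# are assumptions, not the study's exact configuration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def provisional_diagnosis(case_text: str, model: str = "gpt-4") -> str:
    """Send one case description and return the model's provisional diagnosis."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are a cornea specialist. State a provisional diagnosis."},
            {"role": "user", "content": case_text},
        ],
    )
    return response.choices[0].message.content

# Run the same case through both model versions for comparison:
# dx_gpt4 = provisional_diagnosis(case_text, model="gpt-4")
# dx_gpt35 = provisional_diagnosis(case_text, model="gpt-3.5-turbo")
```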

RESULTS

The provisional diagnosis accuracy of ChatGPT-4.0 was 85% (17 of 20 cases correct), while the accuracy of ChatGPT-3.5 was 60% (12 of 20 cases correct). The accuracies of the three cornea specialists were 100% (20 cases), 90% (18 cases), and 90% (18 cases), respectively. The interobserver agreement between ChatGPT-4.0 and ChatGPT-3.5 was 65% (13 cases), while the interobserver agreements between ChatGPT-4.0 and the three cornea specialists were 85% (17 cases), 80% (16 cases), and 75% (15 cases), respectively. In contrast, the interobserver agreement between ChatGPT-3.5 and each of the three cornea specialists was 60% (12 cases).
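The reported percentages are simple per-case match rates over the 20 cases. A minimal sketch of the arithmetic, using placeholder labels rather than the study's data:

```python
# Illustrative metric computation; the label lists a caller would pass in
# are placeholders, not data from the study.

def accuracy(predicted: list[str], truth: list[str]) -> float:
    """Fraction of cases where the provisional diagnosis matches the reference."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

def pairwise_agreement(rater_a: list[str], rater_b: list[str]) -> float:
    """Fraction of cases where two raters give the same diagnosis."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

# With 20 cases, 17 correct answers give 17/20 = 0.85 (the 85% reported for
# ChatGPT-4.0), and 13 shared diagnoses give 13/20 = 0.65 (the 65% agreement).
```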

CONCLUSIONS

The diagnostic accuracy of ChatGPT-4.0 across a range of corneal conditions was markedly higher than that of ChatGPT-3.5, a promising result for potential clinical integration.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51f3/10500623/8d9c5eaa3896/nihpp-2023.08.25.23294635v1-f0001.jpg
