Madadi Yeganeh, Delsoz Mohammad, Lao Priscilla A, Fong Joseph W, Hollingsworth T J, Kahook Malik Y, Yousefi Siamak
Department of Ophthalmology (YM, MD, PAL, JWF, TJH, SY), University of Tennessee Health Science Center, Memphis, Tennessee; Department of Ophthalmology (MYK), University of Colorado School of Medicine, Aurora, Colorado; and Department of Genetics, Genomics, and Informatics (SY), University of Tennessee Health Science Center, Memphis, Tennessee.
J Neuroophthalmol. 2024 Oct 10;45(3):301-306. doi: 10.1097/WNO.0000000000002274.
BACKGROUND: To evaluate the accuracy of Chat Generative Pre-Trained Transformer (ChatGPT), a large language model (LLM), in assisting with the diagnosis of neuro-ophthalmic diseases based on case reports.

METHODS: We selected 22 different case reports of neuro-ophthalmic disorders from a publicly available online database. These cases included a wide range of chronic and acute diseases commonly seen by neuro-ophthalmologists. We inserted each case as a new prompt into ChatGPT (GPT-3.5 and GPT-4) and asked for the most likely diagnosis. We then presented the same information to 2 neuro-ophthalmologists and recorded their diagnoses, followed by comparing the responses from both versions of ChatGPT.

RESULTS: GPT-3.5, GPT-4, and the 2 neuro-ophthalmologists were correct in 13 (59%), 18 (82%), 19 (86%), and 19 (86%) of the 22 cases, respectively. The agreements between the various diagnostic sources were as follows: GPT-3.5 and GPT-4, 13 (59%); GPT-3.5 and the first neuro-ophthalmologist, 12 (55%); GPT-3.5 and the second neuro-ophthalmologist, 12 (55%); GPT-4 and the first neuro-ophthalmologist, 17 (77%); GPT-4 and the second neuro-ophthalmologist, 16 (73%); and the first and second neuro-ophthalmologists, 17 (77%).

CONCLUSIONS: The accuracy of GPT-3.5 and GPT-4 in diagnosing patients with neuro-ophthalmic disorders was 59% and 82%, respectively. With further development, GPT-4 may have the potential to be used in clinical care settings to assist clinicians in providing accurate diagnoses. The applicability of using LLMs like ChatGPT in clinical settings that lack access to subspecialty-trained neuro-ophthalmologists deserves further research.
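The accuracy and agreement figures in the RESULTS section reduce to simple case-by-case matching: accuracy is the fraction of cases where a source's diagnosis matches the reference diagnosis, and agreement is the fraction of cases where two sources give the same diagnosis. A minimal sketch of that computation is below; the diagnosis lists are hypothetical placeholders, not the study's actual case data.

```python
# Sketch of the accuracy/agreement metrics described in RESULTS.
# The 5-case lists below are hypothetical (the study used 22 cases).

def accuracy(predictions, reference):
    """Fraction of cases where the predicted diagnosis matches the reference."""
    return sum(p == r for p, r in zip(predictions, reference)) / len(reference)

def agreement(source_a, source_b):
    """Fraction of cases where two diagnostic sources give the same diagnosis."""
    return sum(a == b for a, b in zip(source_a, source_b)) / len(source_a)

reference = ["A", "B", "C", "D", "E"]  # ground-truth diagnoses per case
gpt4      = ["A", "B", "C", "D", "X"]  # hypothetical model responses
expert    = ["A", "B", "C", "Y", "E"]  # hypothetical clinician responses

print(f"GPT-4 accuracy:  {accuracy(gpt4, reference):.0%}")   # 80%
print(f"Expert accuracy: {accuracy(expert, reference):.0%}")  # 80%
print(f"Agreement:       {agreement(gpt4, expert):.0%}")      # 60%
```

Applied to the study's 22 cases, the same matching yields, for example, 18/22 = 82% accuracy for GPT-4 and 17/22 = 77% agreement between GPT-4 and the first neuro-ophthalmologist.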