Hu Xiaoyan, Ran An Ran, Nguyen Truong X, Szeto Simon, Yam Jason C, Chan Carmen K M, Cheung Carol Y
Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China.
Hong Kong Eye Hospital, Hong Kong SAR, China.
Ophthalmol Ther. 2023 Dec;12(6):3395-3402. doi: 10.1007/s40123-023-00789-8. Epub 2023 Sep 1.
Generative pretrained transformer-4 (GPT-4) has attracted widespread public attention, and its potential has been evaluated extensively in many areas. However, investigation of GPT-4's use in medicine, especially in ophthalmology, is still limited. This study aims to evaluate GPT-4's capability to identify rare ophthalmic diseases in three simulated scenarios for different end-users: patients, family physicians, and junior ophthalmologists.
We selected ten treatable rare ophthalmic disease cases from the publicly available EyeRounds service. We gradually increased the amount of information fed into GPT-4 to simulate a patient, a family physician, and a junior ophthalmologist using GPT-4. Senior ophthalmologists (more than 10 years of experience) evaluated GPT-4's responses on two aspects: suitability (appropriate or inappropriate) and accuracy (right or wrong).
Of the 30 responses, 83.3% were graded "appropriate" by senior ophthalmologists: seven (70%), ten (100%), and eight (80%) in the simulated patient, family physician, and junior ophthalmologist scenarios, respectively. Measured against the ground truth, however, GPT-4 gave no "right" responses in the simulated patient scenario, where it generally listed only several possible diseases. By contrast, 50% of GPT-4's responses were "right" in the simulated family physician scenario, and the "right" rate rose to 90% in the simulated junior ophthalmologist scenario.
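The reported percentages can be checked with a short calculation. The sketch below is illustrative only: the "appropriate" counts are taken from the text, while the "right" counts (0, 5, and 9) are assumptions derived from the stated rates and ten cases per scenario, and all variable and function names are hypothetical.

```python
# Sanity check of the reported rates, assuming ten cases per scenario.
CASES_PER_SCENARIO = 10

# "Appropriate" counts as stated in the abstract.
appropriate = {"patient": 7, "family physician": 10, "junior ophthalmologist": 8}

# "Right" counts are assumptions back-calculated from the stated rates
# (no "right" responses, 50%, and 90% of ten cases).
right = {"patient": 0, "family physician": 5, "junior ophthalmologist": 9}


def rate(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


total_responses = CASES_PER_SCENARIO * len(appropriate)
overall_appropriate = rate(sum(appropriate.values()), total_responses)
right_rates = {name: rate(n, CASES_PER_SCENARIO) for name, n in right.items()}

print(f"Overall appropriate: {overall_appropriate}% of {total_responses} responses")
for name, pct in right_rates.items():
    print(f"{name}: {pct}% right")
```

The overall "appropriate" figure is the pooled count across the three scenarios (25 of 30), not an average of separately rounded percentages.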
To our knowledge, this is the first proof-of-concept study to evaluate GPT-4's capacity to identify rare eye diseases in simulated scenarios involving patients, family physicians, and junior ophthalmologists. The results indicate that GPT-4 has the potential to serve as a consultation aid that gives patients and family physicians referral suggestions, and as an assistive tool that helps junior ophthalmologists diagnose rare eye diseases. However, GPT-4 should be approached with caution, and its output requires verification and careful referral in clinical settings.