Wang Meng, Lin Tian, Lin Aidi, Yu Kai, Peng Yuanyuan, Wang Lianyu, Chen Cheng, Zou Ke, Liang Huiyu, Chen Man, Yao Xue, Zhang Meiqin, Huang Binwei, Zheng Chaoxin, Zhang Peixin, Chen Wei, Luo Yilong, Chen Yifan, Xia Honghe, Shi Tingkun, Zhang Qi, Guo Jinming, Chen Xiaolin, Wang Jingcheng, Tham Yih Chung, Liu Dianbo, Wong Wendy, Thakur Sahil, Fenner Beau J, Fang Danqi, Liu Siying, Liu Qingyun, Huang Yuqiang, Zeng Hongqiang, Meng Yanda, Zhou Yukun, Jiang Zehua, Qiu Minghui, Zhang Changqing, Chen Xinjian, Wang Sophia Y, Lee Cecilia S, Sobrin Lucia, Cheung Carol Y, Pang Chi Pui, Keane Pearse A, Cheng Ching-Yu, Chen Haoyu, Fu Huazhu
Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
Nat Commun. 2025 Jul 1;16(1):5528. doi: 10.1038/s41467-025-60577-9.
Previous foundation models for fundus images were pre-trained on a limited number of disease categories and a limited knowledge base. Here we introduce RetiZero, a vision-language model that incorporates knowledge from over 400 fundus diseases. The model is pre-trained on 341,896 fundus images with accompanying text descriptions gathered from diverse sources across multiple ethnicities and countries. RetiZero demonstrates exceptional performance across various downstream tasks, including zero-shot disease recognition, image-to-image retrieval, clinical diagnosis assistance, few-shot fine-tuning, and cross-domain disease identification. In zero-shot scenarios, it achieves Top-5 accuracies of 0.843 for 15 diseases and 0.756 for 52 diseases; for image-to-image retrieval, it scores 0.950 and 0.886, respectively. Notably, RetiZero's Top-3 zero-shot performance exceeds the average diagnostic accuracy of 19 ophthalmologists from Singapore, China, and the United States. The model particularly enhances clinicians' ability to diagnose rare fundus conditions, highlighting its potential value for integration into clinical settings where diverse eye diseases are encountered.
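The abstract reports Top-5 and Top-3 zero-shot accuracies but does not say how they are computed. The following is a minimal sketch, assuming the common CLIP-style convention in which each image embedding is compared by cosine similarity against one text embedding per disease, and a prediction counts as correct if the true disease is among the k most similar. The function names, the array shapes, and the toy data are hypothetical and are not taken from RetiZero's code.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def zero_shot_top_k_accuracy(image_emb, text_emb, labels, k=5):
    """Fraction of images whose true class is among the k most similar prompts.

    image_emb: (n_images, d) array of image embeddings (assumed input).
    text_emb:  (n_classes, d) array, one embedding per disease prompt (assumed input).
    labels:    (n_images,) integer class index for each image.
    """
    img = l2_normalize(np.asarray(image_emb, dtype=float))
    txt = l2_normalize(np.asarray(text_emb, dtype=float))
    sims = img @ txt.T                        # (n_images, n_classes) cosine similarities
    k = min(k, sims.shape[1])                 # k cannot exceed the number of classes
    top_k = np.argsort(-sims, axis=1)[:, :k]  # indices of the k highest-scoring classes
    hits = (top_k == np.asarray(labels)[:, None]).any(axis=1)
    return float(hits.mean())


# Toy usage with random embeddings (illustrative only, not real data).
rng = np.random.default_rng(0)
toy_images = rng.normal(size=(8, 16))
toy_prompts = rng.normal(size=(15, 16))
toy_labels = rng.integers(0, 15, size=8)
toy_score = zero_shot_top_k_accuracy(toy_images, toy_prompts, toy_labels, k=5)
```

Under this convention, Top-k accuracy can only stay the same or rise as k grows, so Top-3 and Top-5 figures are not directly comparable with each other.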