Discriminative, generative artificial intelligence, and foundation models in retina imaging.

Author information

Ruamviboonsuk Paisan, Arjkongharn Niracha, Vongsa Nattaporn, Pakaymaskul Pawin, Kaothanthong Natsuda

Affiliations

Department of Ophthalmology, College of Medicine, Rangsit University, Bangkok, Thailand.

Sirindhorn International Institute of Technology, Thammasat University, Bangkok, Thailand.

Publication information

Taiwan J Ophthalmol. 2024 Nov 28;14(4):473-485. doi: 10.4103/tjo.TJO-D-24-00064. eCollection 2024 Oct-Dec.

Abstract

Recent advances in artificial intelligence (AI) for retinal imaging fall into two major categories: discriminative and generative AI. For discriminative tasks, conventional convolutional neural networks (CNNs) remain the major AI technique. Vision transformers (ViTs), inspired by the transformer architecture in natural language processing, have emerged as a useful technique for discriminating retinal images. Compared with conventional CNNs, ViTs can attain excellent results when pretrained at sufficient scale and transferred to specific tasks with fewer images. Many studies found that ViTs outperformed CNNs on common tasks such as diabetic retinopathy screening on color fundus photographs (CFP) and segmentation of retinal fluid on optical coherence tomography (OCT) images. The generative adversarial network (GAN) is the main technique for generative AI in retinal imaging. Novel images generated by GANs can be used to train AI models when datasets are imbalanced or inadequate. Foundation models are another recent advance in retinal imaging. They are pretrained on huge datasets, such as millions of CFP and OCT images, and fine-tuned for downstream tasks with much smaller datasets. One foundation model, RETFound, was self-supervised and was found to discriminate many eye and systemic diseases better than supervised models. Large language models are foundation models that may be applied to text-related tasks, such as reports of retinal angiography. Whereas AI technology moves forward fast, real-world use of AI models moves slowly, widening the gap between development and deployment. Strong evidence that AI models can prevent visual loss may be required to close this gap.

Graphical abstract
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3f39/11717344/554d217d50ff/TJO-14-473-g001.jpg
