Ruamviboonsuk Paisan, Arjkongharn Niracha, Vongsa Nattaporn, Pakaymaskul Pawin, Kaothanthong Natsuda
Department of Ophthalmology, College of Medicine, Rangsit University, Bangkok, Thailand.
Sirindhorn International Institute of Technology, Thammasat University, Bangkok, Thailand.
Taiwan J Ophthalmol. 2024 Nov 28;14(4):473-485. doi: 10.4103/tjo.TJO-D-24-00064. eCollection 2024 Oct-Dec.
Recent advances in artificial intelligence (AI) for retinal imaging fall into two major categories: discriminative and generative AI. For discriminative tasks, conventional convolutional neural networks (CNNs) remain the major AI technique. Vision transformers (ViTs), inspired by the transformer architecture in natural language processing, have emerged as a useful technique for discriminating retinal images. Compared with conventional CNNs, ViTs can attain excellent results when pretrained at sufficient scale and then transferred to specific tasks with fewer images. Many studies have reported better performance of ViTs than CNNs on common tasks such as diabetic retinopathy screening on color fundus photographs (CFP) and segmentation of retinal fluid on optical coherence tomography (OCT) images. The generative adversarial network (GAN) is the main technique for generative AI in retinal imaging. Novel images generated by GANs can be used to train AI models when datasets are imbalanced or inadequate. Foundation models are another recent advance in retinal imaging. They are pretrained with huge datasets, such as millions of CFP and OCT images, and fine-tuned for downstream tasks with much smaller datasets. RETFound, a foundation model pretrained with self-supervised learning, was found to discriminate many eye and systemic diseases better than supervised models. Large language models are foundation models that may be applied to text-related tasks, such as reports of retinal angiography. While AI technology moves forward quickly, real-world use of AI models moves slowly, widening the gap between development and deployment. Strong evidence that AI models can prevent visual loss may be required to close this gap.
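The abstract notes that ViTs pretrained at sufficient scale can be transferred to specific retinal tasks with relatively few images. As a rough illustration of that transfer-learning recipe (not the authors' method), the following is a minimal sketch assuming the torch, torchvision, and timm libraries and a hypothetical folder of graded fundus photographs ("fundus_train", one subfolder per diabetic retinopathy grade); the model name, dataset path, and hyperparameters are illustrative assumptions only.

```python
# Minimal sketch: fine-tuning an ImageNet-pretrained ViT for diabetic retinopathy
# grading on color fundus photographs. Library choices, the dataset path, and all
# hyperparameters are illustrative assumptions, not details taken from the article.
import torch
import timm
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Standard ImageNet-style preprocessing for 224x224 ViT input.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder of fundus photographs, one subfolder per DR grade (0-4).
train_set = datasets.ImageFolder("fundus_train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=16, shuffle=True, num_workers=2)

# Load a ViT pretrained at scale and replace its head with a 5-class classifier.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=5)
model.to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
criterion = torch.nn.CrossEntropyLoss()

# Short fine-tuning loop; in practice one would add validation, early stopping,
# and class re-weighting for imbalanced DR grade distributions.
model.train()
for epoch in range(3):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.4f}")
```

The same fine-tuning pattern applies to retinal foundation models such as RETFound: the pretrained backbone is loaded, a small task-specific head is attached, and training proceeds on a much smaller labeled dataset.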