SynthEye: Investigating the Impact of Synthetic Data on Artificial Intelligence-assisted Gene Diagnosis of Inherited Retinal Disease.

Authors

Veturi Yoga Advaith, Woof William, Lazebnik Teddy, Moghul Ismail, Woodward-Court Peter, Wagner Siegfried K, Cabral de Guimarães Thales Antonio, Daich Varela Malena, Liefers Bart, Patel Praveen J, Beck Stephan, Webster Andrew R, Mahroo Omar, Keane Pearse A, Michaelides Michel, Balaskas Konstantinos, Pontikos Nikolas

Affiliations

University College London Institute of Ophthalmology, University College London, London, UK.

Moorfields Eye Hospital, London, UK.

Publication information

Ophthalmol Sci. 2022 Nov 22;3(2):100258. doi: 10.1016/j.xops.2022.100258. eCollection 2023 Jun.

DOI:10.1016/j.xops.2022.100258

PMID:36685715

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9852957/

Abstract

PURPOSE

Rare disease diagnosis is challenging in medical image-based artificial intelligence due to a natural class imbalance in datasets, leading to biased prediction models. Inherited retinal diseases (IRDs) are a research domain that particularly faces this issue. This study investigates the applicability of synthetic data in improving artificial intelligence-enabled diagnosis of IRDs using generative adversarial networks (GANs).

DESIGN

Diagnostic study of gene-labeled fundus autofluorescence (FAF) IRD images using deep learning.

PARTICIPANTS

Moorfields Eye Hospital (MEH) dataset of 15 692 FAF images obtained from 1800 patients with confirmed genetic diagnosis of 1 of 36 IRD genes.

METHODS

A StyleGAN2 model is trained on the IRD dataset to generate 512 × 512 resolution images. Convolutional neural networks are trained for classification using different synthetically augmented datasets, including real IRD images plus 1800 and 3600 synthetic images, and a fully rebalanced dataset. We also perform an experiment with only synthetic data. All models are compared against a baseline convolutional neural network trained only on real data.
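To make the idea of a "fully rebalanced dataset" concrete, here is a minimal sketch that computes how many synthetic images each gene class would need to match the largest class. The top-up rule, the function name `synthetic_needed`, and the example class counts are all assumptions for illustration; the abstract does not state the rebalancing rule the authors used.

```python
from collections import Counter


def synthetic_needed(labels):
    """Per class, the number of synthetic images needed to reach the size
    of the largest class (an assumed rebalancing rule, not the paper's)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {gene: target - n for gene, n in counts.items()}


# Hypothetical gene labels for real images; not the MEH class distribution.
real_labels = ["ABCA4"] * 5 + ["USH2A"] * 3 + ["RPGR"] * 1
print(synthetic_needed(real_labels))
```

Under this rule the majority class receives no synthetic images, and rarer classes receive progressively more.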

MAIN OUTCOME MEASURES

We evaluated synthetic data quality using a Visual Turing Test conducted with 4 ophthalmologists from MEH. Synthetic and real images were compared using feature space visualization, similarity analysis to detect memorized images, and Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) score for no-reference-based quality evaluation. Convolutional neural network diagnostic performance was determined on a held-out test set using the area under the receiver operating characteristic curve (AUROC) and Cohen's Kappa (κ).
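The two performance metrics named above can be sketched in plain Python. This is an illustrative implementation of the standard definitions, not the authors' evaluation code. The AUROC shown is the binary case; how the study averaged AUROC across the 36 gene classes is not specified in the abstract. The function names and toy inputs are hypothetical.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions, corrected
    for the agreement expected by chance. Undefined if chance agreement is 1."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)


def binary_auroc(y_true, scores):
    """Binary AUROC as the probability that a positive example is scored
    above a negative one, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy inputs for illustration only.
print(cohen_kappa(["A", "A", "B", "B"], ["A", "B", "B", "B"]))
print(binary_auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Kappa is informative here because, with strongly imbalanced classes, raw accuracy can look high even when a model mostly predicts the common genes.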

RESULTS

An average true recognition rate of 63% and a fake recognition rate of 47% were obtained from the Visual Turing Test. Thus, a considerable proportion of the synthetic images were classified as real by clinical experts. Similarity analysis showed that the synthetic images were not copies of the real images, indicating that the GAN was able to generalize. However, BRISQUE score analysis indicated that synthetic images were of significantly lower quality overall than real images (P < 0.05). Comparing the rebalanced model (RB) with the baseline (R), no significant change in the average AUROC and κ was found (R-AUROC = 0.86 [0.85-0.88], RB-AUROC = 0.88 [0.86-0.89], R-κ = 0.51 [0.49-0.53], and RB-κ = 0.52 [0.50-0.54]). The synthetic data trained model (S) achieved performance similar to the baseline (S-AUROC = 0.86 [0.85-0.87], S-κ = 0.48 [0.46-0.50]).
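The Visual Turing Test rates can be read as two per-category accuracies. The sketch below assumes that the true recognition rate is the share of real images judged real and that the fake recognition rate is the share of synthetic images judged synthetic. The abstract does not spell out these definitions, and the function name and toy responses are hypothetical.

```python
def recognition_rates(is_real, judged_real):
    """Return (true_recognition_rate, fake_recognition_rate) from paired
    ground-truth flags and grader judgements (assumed definitions)."""
    real_judgements = [j for r, j in zip(is_real, judged_real) if r]
    fake_judgements = [j for r, j in zip(is_real, judged_real) if not r]
    true_rate = sum(real_judgements) / len(real_judgements)
    fake_rate = sum(not j for j in fake_judgements) / len(fake_judgements)
    return true_rate, fake_rate


# Toy responses from one hypothetical grader; not data from the study.
is_real = [True, True, False, False]
judged_real = [True, False, True, False]
print(recognition_rates(is_real, judged_real))
```

On this reading, a fake recognition rate near 50% means graders did little better than chance at spotting synthetic images.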

CONCLUSIONS

Synthetic generation of realistic IRD FAF images is feasible. Synthetic data augmentation does not deliver improvements in classification performance. However, synthetic data alone deliver performance similar to that of real data and hence may be useful as a proxy for real data. Proprietary or commercial disclosure may be found after the references.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51d6/9852957/48fec8786767/gr1.jpg
