Singapore Eye Research Institute, Singapore National Eye Centre, Singapore.
SERI-NTU Advanced Ocular Engineering (STANCE), Singapore, Singapore.
JAMA Ophthalmol. 2022 Oct 1;140(10):974-981. doi: 10.1001/jamaophthalmol.2022.3375.
IMPORTANCE: Deep learning (DL) networks require large data sets for training, which can be challenging to collect clinically. Generative models could be used to generate large numbers of synthetic optical coherence tomography (OCT) images to train such DL networks for glaucoma detection.
OBJECTIVE: To assess whether generative models can synthesize circumpapillary optic nerve head OCT images of normal and glaucomatous eyes and determine the usability of synthetic images for training DL models for glaucoma detection.
DESIGN, SETTING, AND PARTICIPANTS: Progressively growing generative adversarial network models were trained to generate circumpapillary OCT scans. Image gradeability and authenticity were evaluated on a clinical set of 100 real and 100 synthetic images by 2 clinical experts. DL networks for glaucoma detection were trained with real or synthetic images and evaluated on independent internal and external test data sets of 140 and 300 real images, respectively.
MAIN OUTCOMES AND MEASURES: Evaluations of the clinical set between the experts were compared. Glaucoma detection performance of the DL networks was assessed using area under the curve (AUC) analysis. Class activation maps provided visualizations of the regions contributing to the respective classifications.
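As an illustration of the AUC analysis mentioned above, the following is a minimal sketch that computes an AUC from classifier scores and attaches a 95% CI. The abstract does not state how its confidence intervals were obtained; the percentile bootstrap used here is an assumption, and the labels, scores, and function names are hypothetical toy values rather than study data.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one of the classes; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Hypothetical toy data: 1 = glaucoma, 0 = normal.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.70, 0.90, 0.60]

point_estimate = auc(labels, scores)
ci_low, ci_high = bootstrap_ci(labels, scores)
print(f"AUC = {point_estimate:.4f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

With only eight cases the interval is very wide; test sets of the size reported in the abstract (140 and 300 images) would give much tighter bounds.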
RESULTS: A total of 990 normal and 862 glaucomatous eyes were analyzed. Evaluations of the clinical set were similar for gradeability (expert 1: 92.0%; expert 2: 93.0%) and authenticity (expert 1: 51.8%; expert 2: 51.3%). The best-performing DL network trained on synthetic images had AUCs of 0.97 (95% CI, 0.95-0.99) on the internal test data set and 0.90 (95% CI, 0.87-0.93) on the external test data set, compared with AUCs of 0.96 (95% CI, 0.94-0.99) on the internal test data set and 0.84 (95% CI, 0.80-0.87) on the external test data set for the network trained with real images. The AUC of the network trained on synthetic images increased as the size of the synthetic training data set increased. Class activation maps showed that the regions of the synthetic images contributing to glaucoma detection were generally similar to those of real images.
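For readers unfamiliar with class activation maps, the sketch below shows one common formulation: a weighted sum of the final convolutional feature maps, using the classifier weights for the class of interest. The abstract does not say which variant the authors used, so this formulation is an assumption, and the feature maps, weights, and function names are hypothetical toy values.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) for a single class."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]
    return cam

def normalize(cam):
    """Rescale a map to [0, 1] so it can be overlaid as a heat map."""
    flat = [v for row in cam for v in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        return [[0.0] * len(row) for row in cam]
    return [[(v - lowest) / span for v in row] for row in cam]

# Hypothetical toy input: two 2 x 2 feature maps and their class weights.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, -1.0]

cam = class_activation_map(feature_maps, class_weights)
heat = normalize(cam)
print(heat)
```

In practice the normalized map is upsampled to the input image size and overlaid on the OCT scan, so that high values mark the regions that contributed most to the predicted class.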
CONCLUSIONS AND RELEVANCE: DL networks trained with synthetic OCT images for glaucoma detection performed comparably with networks trained with real images. These results suggest potential use of generative models in the training of DL networks and as a means of sharing data across institutions without raising patient confidentiality concerns.