Habra Oussama, Gallardo Mathias, Meyer Zu Westram Till, De Zanet Sandro, Jaggi Damian, Zinkernagel Martin, Wolf Sebastian, Sznitman Raphael
Department for Ophthalmology, Inselspital, University Hospital, University of Bern, Bern, Switzerland.
AIMI, ARTORG Center, University of Bern, Bern, Switzerland.
Ophthalmologica. 2022;245(6):516-527. doi: 10.1159/000527345. Epub 2022 Oct 10.
INTRODUCTION: In this retrospective cohort study, we aimed to evaluate the performance of, and gain insight into the behavior of, an artificial intelligence (AI) algorithm for detecting retinal fluid in spectral-domain optical coherence tomography (OCT) volume scans from a large cohort of patients with neovascular age-related macular degeneration (AMD) and diabetic macular edema (DME).
METHODS: A total of 3,981 OCT volumes from 374 patients with AMD and 11,501 OCT volumes from 811 patients with DME were acquired with a Heidelberg Spectralis OCT device (Heidelberg Engineering Inc., Heidelberg, Germany) between 2013 and 2021. Masked reading center graders annotated each OCT volume for the presence or absence of intraretinal fluid (IRF) and subretinal fluid (SRF); these annotations served as the ground truth. We evaluated a previously published AI algorithm on these volumes, both as separate detectors for IRF and SRF and as a combined fluid detector (IRF and/or SRF). We analyzed the sources of disagreement between annotation and prediction and their relationship to central retinal thickness. We computed the mean area under the receiver operating characteristic curve (AUC), the mean area under the precision-recall curve (AP), accuracy, sensitivity, specificity, and precision.
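The volume-level metrics named above can be illustrated with a minimal sketch. It assumes one binary ground-truth label and one detector score per OCT volume; the function names, the 0.5 threshold, and the toy data are hypothetical and are not taken from the published algorithm. AP is left out for brevity.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels (truthy = fluid present)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif not truth and pred:
            fp += 1
        elif not truth and not pred:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Divide safely; an undefined metric is reported as NaN."""
    return numerator / denominator if denominator else math.nan


def threshold_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and precision from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true-positive rate
        "specificity": _ratio(tn, tn + fp),   # true-negative rate
        "precision": _ratio(tp, tp + fp),     # positive predictive value
    }


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive volume outscores a negative one.

    Ties count as half a win. This pairwise form is quadratic in the number
    of volumes, which is fine for an illustration but slow for large cohorts.
    """
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    if not pos or not neg:
        return math.nan
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six volumes, label 1 = fluid present.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.1, 0.5, 0.3]
predictions = [s >= 0.5 for s in scores]  # assumed decision threshold

print(threshold_metrics(labels, predictions))
print(roc_auc(labels, scores))
```

AUC and AP are computed from the continuous scores, whereas accuracy, sensitivity, specificity, and precision depend on the chosen decision threshold.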
RESULTS: In the AMD and DME cohorts, respectively, the AUC was 0.92 and 0.98 for IRF and 0.98 and 0.99 for SRF, and the AP was 0.89 and 1.00 for IRF and 0.97 and 0.93 for SRF. For IRF, the accuracy, specificity, and sensitivity were 0.87, 0.88, and 0.84 in the AMD cohort and 0.93, 0.95, and 0.93 in the DME cohort; for SRF, they were 0.93, 0.93, and 0.93 in the AMD cohort and 0.95, 0.95, and 0.95 in the DME cohort. For detecting any fluid, the AUC was 0.95 and 0.98, and the accuracy, specificity, and sensitivity were 0.89, 0.93, and 0.90 in the AMD cohort and 0.95, 0.88, and 0.93 in the DME cohort. False positives occurred in the presence of retinal shadow artifacts and strong retinal deformation. False negatives were due to small hyporeflective areas combined with poor image quality. The combined detector correctly classified more OCT volumes than the single detectors for IRF and SRF: 89.0% versus 81.6% in the AMD cohort and 93.1% versus 88.6% in the DME cohort.
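One way to read the comparison between the combined and single detectors is sketched below. It rests on two assumptions that the abstract does not spell out: the combined detector flags a volume when either single detector does (a logical OR), and a volume counts as correct for the single detectors only when both the IRF and the SRF predictions match their annotations. All names and data are hypothetical.

```python
def any_fluid(irf, srf):
    """Combine per-volume IRF and SRF flags into an 'IRF and/or SRF' flag."""
    return [bool(i) or bool(s) for i, s in zip(irf, srf)]


def fraction_correct_combined(irf_true, srf_true, irf_pred, srf_pred):
    """Share of volumes whose any-fluid prediction matches the any-fluid label."""
    truth = any_fluid(irf_true, srf_true)
    pred = any_fluid(irf_pred, srf_pred)
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def fraction_correct_both_single(irf_true, srf_true, irf_pred, srf_pred):
    """Share of volumes where the IRF and the SRF prediction are both correct."""
    rows = list(zip(irf_true, srf_true, irf_pred, srf_pred))
    both = sum(
        bool(it) == bool(ip) and bool(st) == bool(sp) for it, st, ip, sp in rows
    )
    return both / len(rows)


# Toy data (hypothetical): four volumes, 1 = fluid present.
irf_true = [1, 0, 0, 1]
srf_true = [0, 1, 0, 1]
irf_pred = [0, 0, 0, 1]
srf_pred = [1, 1, 1, 1]

print(fraction_correct_combined(irf_true, srf_true, irf_pred, srf_pred))
print(fraction_correct_both_single(irf_true, srf_true, irf_pred, srf_pred))
```

Under these assumptions, every volume on which both single detectors are correct is also correct for the combined detector, so the combined fraction can never be the lower of the two. The first toy volume shows where the gap comes from: the fluid type is confused, but the presence of fluid is still detected.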
DISCUSSION/CONCLUSION: The AI-based fluid detector achieves high performance for retinal fluid detection in a very large dataset dedicated to AMD and DME. Combining single detectors provides better fluid detection accuracy than considering the single detectors separately. The observed independence of the single detectors ensures that the detectors learned features particular to IRF and SRF.