Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200, Copenhagen N, Denmark.
Australian E-Health Research Centre, CSIRO, Perth, Australia.
Sci Rep. 2021 May 6;11(1):9704. doi: 10.1038/s41598-021-89225-0.
Diabetic retinopathy (DR) is a leading cause of blindness and affects millions of people worldwide. Early detection and timely checkups are key to reducing the risk of blindness, and automated grading of DR is a cost-effective way to ensure both. Deep learning methods, specifically those based on convolutional neural networks (CNNs), produce state-of-the-art performance in DR detection. Whilst many CNN-based methods have been proposed, the image features they extract have not been compared for clinical relevance. Here we first adopt a CNN visualization strategy to discover the inherent image features involved in the CNN's decision-making process. We then critically analyze those features with respect to commonly known pathologies, namely microaneurysms, hemorrhages and exudates, and to other ocular components. We also critically analyze different CNNs by examining which image features they pick up while learning to predict, and we assess the clinical relevance of those features. The experiments are run on publicly available fundus datasets (EyePACS and DIARETDB1) and achieve an accuracy of 89% to 95% for disease-level grading of DR, with AUC, sensitivity and specificity of 95% to 98%, 74% to 86%, and 93% to 97%, respectively. Whilst different CNNs produce consistent classification results, the rate of disagreement between models in the image features they pick up can be as high as 70%.
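The reported quantities can be made concrete with a small sketch. The code below is not the authors' pipeline. It assumes a binary disease/no-disease labelling (1 = DR present) for accuracy, sensitivity and specificity. Because the abstract does not say how feature disagreement between models is measured, it also assumes one plausible definition: one minus the Jaccard overlap of the pixel sets that two models highlight. The function names `binary_metrics` and `feature_disagreement` and all sample values are hypothetical.

```python
from typing import Dict, Sequence, Set, Tuple

Pixel = Tuple[int, int]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels (1 = DR present)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of diseased eyes that the model detects.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of healthy eyes that the model clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def feature_disagreement(salient_a: Set[Pixel], salient_b: Set[Pixel]) -> float:
    """Assumed definition: 1 - Jaccard overlap of two models' salient pixel sets."""
    union = salient_a | salient_b
    if not union:
        return 0.0  # neither model highlights anything, so nothing to disagree on
    return 1.0 - len(salient_a & salient_b) / len(union)


if __name__ == "__main__":
    # Hypothetical labels and predictions for ten fundus images.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))

    # Hypothetical salient pixels from two CNNs on the same image.
    model_a = {(0, 0), (0, 1), (1, 1)}
    model_b = {(0, 1), (1, 1), (2, 2), (3, 3)}
    print(feature_disagreement(model_a, model_b))
```

Under this assumed definition, two models can reach identical predictions on every image while highlighting largely different regions, which is the kind of gap between classification agreement and feature agreement that the abstract describes.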