Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou, China.
The big data center, Shantou University Medical College, Shantou, China.
BMJ Open. 2022 Jul 28;12(7):e060155. doi: 10.1136/bmjopen-2021-060155.
To develop and validate a real-world screening, guideline-based deep learning (DL) system for referable diabetic retinopathy (DR) detection.
This is a multicentre platform development study based on retrospective, cross-sectional data sets. Images were labelled by two-level certificated graders as the ground truth. According to the UK DR screening guideline, a DL model based on colour retinal images with five-dimensional classifiers, namely image quality, retinopathy, maculopathy gradability, maculopathy and photocoagulation, was developed. Referable decisions were generated by integrating the output of all classifiers and reported at the image, eye and patient level. The performance of the DL system was compared with that of DR experts.
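The integration step described above can be illustrated with a minimal sketch. The abstract does not state the exact integration rule, so everything here is an assumption for illustration: the `ImageResult` fields, the rule that an image is referable when it is ungradable, shows retinopathy of grade R2 or higher, or shows maculopathy, and the rule that an eye or patient is referable when any of its images is.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageResult:
    """Hypothetical container for the five classifier outputs of one image."""
    patient_id: str
    eye: str                 # "L" or "R"
    gradable: bool           # image quality classifier
    retinopathy: int         # 0-3, standing for R0-R3 (assumed encoding)
    macula_gradable: bool    # maculopathy gradability classifier
    maculopathy: bool        # True stands for M1 (assumed encoding)
    photocoagulation: bool   # recorded, but not used by this assumed rule


def image_referable(r: ImageResult) -> bool:
    """Assumed referral rule, not the rule used in the study."""
    if not r.gradable or not r.macula_gradable:
        return True          # ungradable images are referred
    return r.retinopathy >= 2 or r.maculopathy


def aggregate(results):
    """Roll image-level decisions up to eye and patient level (any-positive rule)."""
    eye_level = defaultdict(bool)
    patient_level = defaultdict(bool)
    for r in results:
        flag = image_referable(r)
        eye_level[(r.patient_id, r.eye)] |= flag
        patient_level[r.patient_id] |= flag
    return dict(eye_level), dict(patient_level)


results = [
    ImageResult("p1", "L", True, 0, True, False, False),
    ImageResult("p1", "R", True, 2, True, False, False),
    ImageResult("p2", "L", True, 1, True, False, False),
]
eyes, patients = aggregate(results)
```

Under these assumed rules, a single referable image is enough to refer the eye and the patient, which favours sensitivity over specificity at the higher levels.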
Data came from DR screening programmes at three hospitals and the Lifeline Express Diabetic Retinopathy Screening Program in China.
83 465 images of 39 836 eyes from 21 716 patients were annotated, of which 53 211 images were used as the development set and 30 254 images were used as the external validation set, split based on centre and period.
Accuracy, F1 score, sensitivity, specificity, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), Cohen's unweighted κ and Gwet's AC1 were calculated to evaluate the performance of the DL algorithm.
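For readers less familiar with the two agreement statistics, the sketch below computes Cohen's unweighted κ and Gwet's AC1 for two binary raters using their standard definitions. Both take the form (p_o - p_e) / (1 - p_e) and differ only in the chance-agreement term p_e. The function name `agreement` and the toy labels are hypothetical, and this is not the authors' evaluation code.

```python
def agreement(a, b):
    """Return (Cohen's kappa, Gwet's AC1) for two equal-length lists of 0/1 labels."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items on which the raters agree.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Proportion of positive labels given by each rater.
    p_a = sum(a) / n
    p_b = sum(b) / n
    # Cohen: chance agreement from each rater's own marginals.
    pe_cohen = p_a * p_b + (1 - p_a) * (1 - p_b)
    # Gwet: chance agreement from the mean positive rate (two categories).
    pi = (p_a + p_b) / 2
    pe_gwet = 2 * pi * (1 - pi)
    # Kappa is undefined when both raters use a single identical category.
    kappa = float("nan") if pe_cohen == 1 else (p_o - pe_cohen) / (1 - pe_cohen)
    ac1 = (p_o - pe_gwet) / (1 - pe_gwet)
    return kappa, ac1


# Toy example: 10 images, raters disagree on one, positives are rare.
rater_1 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
rater_2 = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
kappa, ac1 = agreement(rater_1, rater_2)
```

In this toy case the raters agree on 9 of 10 items, yet κ is noticeably lower than AC1 because the positive class is rare. That sensitivity of κ to class prevalence is a common reason for reporting AC1 alongside it.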
In the external validation set, the five classifiers achieved an accuracy of 0.915-0.980, F1 score of 0.682-0.966, sensitivity of 0.917-0.978, specificity of 0.907-0.981, AUROC of 0.9639-0.9944 and AUPRC of 0.7504-0.9949. Referable DR at the three levels was detected with an accuracy of 0.918-0.967, F1 score of 0.822-0.918, sensitivity of 0.970-0.971, specificity of 0.905-0.967, AUROC of 0.9848-0.9931 and AUPRC of 0.9527-0.9760. With reference to the ground truth, the DL system's performance in detecting referable lesions (Cohen's κ: 0.86-0.93; Gwet's AC1: 0.89-0.94) was comparable to that of three DR experts (Cohen's κ: 0.89-0.96; Gwet's AC1: 0.91-0.97).
The automatic DL system for detection of referable DR based on the UK guideline could achieve high accuracy in multidimensional classifications. It is suitable for large-scale, real-world DR screening.