Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs.

Author information

Google Inc, Mountain View, California.

Google Inc, Mountain View, California; Department of Computer Science, University of Texas, Austin.

Publication information

JAMA. 2016 Dec 13;316(22):2402-2410. doi: 10.1001/jama.2016.17216.

DOI:10.1001/jama.2016.17216

PMID:27898976

Abstract

IMPORTANCE: Deep learning is a family of computational methods that allow an algorithm to program itself by learning from a large set of examples that demonstrate the desired behavior, removing the need to specify rules explicitly. Application of these methods to medical imaging requires further assessment and validation.

OBJECTIVE: To apply deep learning to create an algorithm for automated detection of diabetic retinopathy and diabetic macular edema in retinal fundus photographs.

DESIGN AND SETTING: A specific type of neural network optimized for image classification called a deep convolutional neural network was trained using a retrospective development data set of 128 175 retinal images, which were graded 3 to 7 times for diabetic retinopathy, diabetic macular edema, and image gradability by a panel of 54 US licensed ophthalmologists and ophthalmology senior residents between May and December 2015. The resultant algorithm was validated in January and February 2016 using 2 separate data sets, both graded by at least 7 US board-certified ophthalmologists with high intragrader consistency.

EXPOSURE: Deep learning-trained algorithm.

MAIN OUTCOMES AND MEASURES: The sensitivity and specificity of the algorithm for detecting referable diabetic retinopathy (RDR), defined as moderate and worse diabetic retinopathy, referable diabetic macular edema, or both, were generated based on the reference standard of the majority decision of the ophthalmologist panel. The algorithm was evaluated at 2 operating points selected from the development set, one selected for high specificity and another for high sensitivity.

RESULTS: The EyePACS-1 data set consisted of 9963 images from 4997 patients (mean age, 54.4 years; 62.2% women; prevalence of RDR, 683/8878 fully gradable images [7.8%]); the Messidor-2 data set had 1748 images from 874 patients (mean age, 57.6 years; 42.6% women; prevalence of RDR, 254/1745 fully gradable images [14.6%]). For detecting RDR, the algorithm had an area under the receiver operating curve of 0.991 (95% CI, 0.988-0.993) for EyePACS-1 and 0.990 (95% CI, 0.986-0.995) for Messidor-2. Using the first operating cut point with high specificity, for EyePACS-1, the sensitivity was 90.3% (95% CI, 87.5%-92.7%) and the specificity was 98.1% (95% CI, 97.8%-98.5%). For Messidor-2, the sensitivity was 87.0% (95% CI, 81.1%-91.0%) and the specificity was 98.5% (95% CI, 97.7%-99.1%). Using a second operating point with high sensitivity in the development set, for EyePACS-1 the sensitivity was 97.5% and specificity was 93.4% and for Messidor-2 the sensitivity was 96.1% and specificity was 93.9%.

CONCLUSIONS AND RELEVANCE: In this evaluation of retinal fundus photographs from adults with diabetes, an algorithm based on deep machine learning had high sensitivity and specificity for detecting referable diabetic retinopathy. Further research is necessary to determine the feasibility of applying this algorithm in the clinical setting and to determine whether use of the algorithm could lead to improved care and outcomes compared with current ophthalmologic assessment.
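
The evaluation procedure described in the abstract (a majority-decision reference standard, then sensitivity and specificity at operating points chosen on the development set) can be illustrated with a minimal sketch. The scores and grades below are made-up toy values, not study data. The function names, the rule that a tie among an even number of graders counts as not referable, and the convention that a score at or above the threshold is a positive call are all assumptions made for illustration, not details taken from the study.

```python
from typing import List, Sequence, Tuple


def majority_reference(grades: Sequence[Sequence[int]]) -> List[int]:
    """Reference label per image: 1 if more than half of its graders
    marked it referable, else 0 (ties count as 0; an assumption)."""
    return [1 if 2 * sum(g) > len(g) else 0 for g in grades]


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is a positive call."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def pick_threshold_for_specificity(
    scores: Sequence[float], labels: Sequence[int], target: float
) -> float:
    """Lowest candidate threshold whose specificity reaches the target."""
    for t in sorted(set(scores)):
        _, spec = sensitivity_specificity(scores, labels, t)
        if spec >= target:
            return t
    raise ValueError("no threshold reaches the target specificity")


def pick_threshold_for_sensitivity(
    scores: Sequence[float], labels: Sequence[int], target: float
) -> float:
    """Highest candidate threshold whose sensitivity reaches the target."""
    for t in sorted(set(scores), reverse=True):
        sens, _ = sensitivity_specificity(scores, labels, t)
        if sens >= target:
            return t
    raise ValueError("no threshold reaches the target sensitivity")


# Toy development set: three hypothetical graders per image (1 = referable).
dev_grades = [[1, 1, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0], [1, 0, 1], [0, 0, 0]]
dev_scores = [0.9, 0.1, 0.8, 0.6, 0.4, 0.2]  # hypothetical algorithm outputs
dev_labels = majority_reference(dev_grades)

# Two operating points chosen on the development set only.
high_spec_t = pick_threshold_for_specificity(dev_scores, dev_labels, target=1.0)
high_sens_t = pick_threshold_for_sensitivity(dev_scores, dev_labels, target=1.0)

for name, t in [("high specificity", high_spec_t), ("high sensitivity", high_sens_t)]:
    sens, spec = sensitivity_specificity(dev_scores, dev_labels, t)
    print(f"{name}: threshold={t}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this sketch the thresholds are fixed on the development data and would then be applied unchanged to separate validation images, mirroring the abstract's statement that the operating points were selected from the development set.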

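
The area under the receiver operating curve reported in the results summarizes performance across all possible thresholds rather than at a single operating point. One common way to compute it is as the fraction of (referable, non-referable) image pairs in which the referable image receives the higher score, with ties counted as one half. The sketch below uses that pairwise definition on made-up toy values. The function name `auc_pairwise` is hypothetical, and nothing here is claimed to be the computation used in the study.

```python
from typing import Sequence


def auc_pairwise(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve as the probability that a randomly chosen
    positive image scores higher than a randomly chosen negative image
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values only: hypothetical algorithm scores and reference labels.
toy_scores = [0.9, 0.1, 0.8, 0.6, 0.4, 0.2]
toy_labels = [1, 0, 1, 0, 1, 0]
toy_auc = auc_pairwise(toy_scores, toy_labels)
print(f"AUC on toy data: {toy_auc:.3f}")
```

The double loop is quadratic in the number of images, which is fine for a small illustration. For data sets of the size described in the abstract, a rank-based implementation would be the more practical choice.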