From the Singapore Eye Research Institute (C.V., R.P.N., Z.T., J.L.L., S.S., S.T., D.S.W.T., T.Y.W., D.M.); Duke-NUS Medical School (R.P.N., J.L.L., S.S., S.T., T.Y.W., D.M.); Institute of High Performance Computing (X.X., Y.L.), Agency for Science, Technology and Research (A*STAR); Singapore National Eye Centre (J.L.L., S.S., S.T., D.S.W.T., T.Y.W., D.M.); University of California, Berkeley (L.M.); Departments of Ophthalmology and Neurology (N.J.N., V.B.), Emory University School of Medicine, Atlanta, GA; and Copenhagen University Hospital (D.M.), Denmark.
Neurology. 2021 Jul 27;97(4):e369-e377. doi: 10.1212/WNL.0000000000012226. Epub 2021 May 19.
To evaluate the performance of a deep learning system (DLS) in classifying the severity of papilledema associated with increased intracranial pressure on standard retinal fundus photographs.
A DLS was trained to automatically classify papilledema severity in 965 patients (2,103 mydriatic fundus photographs), representing a multiethnic cohort of patients with confirmed elevated intracranial pressure. Training was performed on 1,052 photographs with mild/moderate papilledema (MP) and 1,051 photographs with severe papilledema (SP), as classified by a panel of experts. The performance of the DLS and that of 3 independent neuro-ophthalmologists were tested in 111 patients (214 photographs, 92 with MP and 122 with SP) by calculating the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. Kappa agreement scores were calculated between the DLS and each of the 3 graders and among the 3 graders.
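The evaluation measures named above can be illustrated with a minimal sketch. It is not the authors' code: it assumes binary labels (1 = SP, 0 = MP), uses made-up toy labels rather than study data, and uses Cohen's kappa for pairwise agreement, although the abstract does not state which kappa statistic was used. The function names `binary_metrics` and `cohen_kappa` are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels.

    Assumption: 1 = severe papilledema (SP), 0 = mild/moderate (MP).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # SP called SP
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # MP called MP
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # MP called SP
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # SP called MP
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa between two raters labelling the same items.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement from each rater's marginal label frequencies.
    p_e = sum((list(a).count(l) / n) * (list(b).count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    reference = [1, 1, 1, 1, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 1]
    print(binary_metrics(reference, predicted))
    print(cohen_kappa(reference, predicted))
```

Agreement among all 3 graders at once would need a multi-rater statistic such as Fleiss' kappa, which this sketch does not cover.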
The DLS successfully discriminated between photographs of MP and SP, with an AUC of 0.93 (95% confidence interval [CI] 0.89-0.96) and an accuracy, sensitivity, and specificity of 87.9%, 91.8%, and 86.2%, respectively. This performance was comparable with that of the 3 neuro-ophthalmologists (84.1%, 91.8%, and 73.9%; p = 0.19, p = 1, and p = 0.09, respectively). Misclassification by the DLS was mainly observed for moderate papilledema (Frisén grade 3). The agreement score between the DLS and the neuro-ophthalmologists' evaluation was 0.62 (95% CI 0.57-0.68), whereas the intergrader agreement among the 3 neuro-ophthalmologists was 0.54 (95% CI 0.47-0.62).
Our DLS accurately classified the severity of papilledema on an independent set of mydriatic fundus photographs, achieving performance comparable with that of independent neuro-ophthalmologists.
This study provides Class II evidence that a DLS using mydriatic retinal fundus photographs accurately classified the severity of papilledema in patients with a diagnosis of increased intracranial pressure.