Grading of diabetic retinopathy using a pre-segmenting deep learning classification model: Validation of an automated algorithm.

Authors

Similié Dyllan Edson, Andersen Jakob K H, Dinesen Sebastian, Savarimuthu Thiusius R, Grauslund Jakob

Affiliations

Department of Ophthalmology, Odense University Hospital, Odense, Denmark.

The Maersk Mc-Kinney Moeller Institute, SDU Robotics, University of Southern Denmark, Odense, Denmark.

Publication

Acta Ophthalmol. 2025 Mar;103(2):215-221. doi: 10.1111/aos.16781. Epub 2024 Oct 19.

Abstract

PURPOSE

To validate the performance of autonomous diabetic retinopathy (DR) grading by comparing a human grader and a self-developed deep-learning (DL) algorithm with gold-standard evaluation.

METHODS

We included 500 six-field retinal images graded by an expert ophthalmologist (gold standard) according to the International Clinical Diabetic Retinopathy Disease Severity Scale, represented as DR levels 0-4 (97, 100, 100, 103 and 100 images, respectively). Weighted kappa was calculated to measure DR classification agreement with the gold standard for (1) a certified human grader without assistance, (2) the certified human grader with assistance from a DL algorithm and (3) the DL algorithm operating autonomously (Models 1-3). Using any DR (level 0 vs. 1-4) as the cutoff, we calculated sensitivity, specificity, and positive and negative predictive values (PPV and NPV). Finally, we assessed lesion discrepancies between Model 3 and the gold standard.
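The agreement and accuracy measures named above can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not say whether linear or quadratic kappa weights were used, so the `power` parameter (1 = linear, 2 = quadratic) is an assumption, and the function names and toy grades are hypothetical.

```python
def weighted_kappa(a, b, n_levels=5, power=2):
    """Cohen's weighted kappa for two lists of ordinal grades in 0..n_levels-1.

    power=1 gives linear weights, power=2 quadratic weights (assumed here;
    the abstract does not state which scheme was used).
    """
    n = len(a)
    # Observed joint proportions of (grader A, grader B) grades.
    obs = [[0.0] * n_levels for _ in range(n_levels)]
    for x, y in zip(a, b):
        obs[x][y] += 1.0 / n
    row = [sum(obs[i]) for i in range(n_levels)]
    col = [sum(obs[i][j] for i in range(n_levels)) for j in range(n_levels)]
    num = den = 0.0
    for i in range(n_levels):
        for j in range(n_levels):
            # Disagreement weight grows with the distance between grades.
            w = (abs(i - j) / (n_levels - 1)) ** power
            num += w * obs[i][j]          # observed weighted disagreement
            den += w * row[i] * col[j]    # disagreement expected by chance
    # Undefined (division by zero) if both graders use one single grade.
    return 1.0 - num / den


def any_dr_sensitivity_specificity(gold, pred):
    """Binarize at 'any DR' (level 0 vs. 1-4) and return (sensitivity, specificity)."""
    tp = sum(1 for g, p in zip(gold, pred) if g >= 1 and p >= 1)
    fn = sum(1 for g, p in zip(gold, pred) if g >= 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p >= 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy grades for illustration only; these are not study data.
gold = [0, 0, 1, 2, 3, 4]
pred = [0, 1, 1, 2, 2, 0]
kappa = weighted_kappa(gold, pred)
sens, spec = any_dr_sensitivity_specificity(gold, pred)
print(f"kappa={kappa:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Weighted kappa penalizes a level 0 vs. level 4 disagreement more heavily than a disagreement between adjacent levels, which suits an ordinal severity scale.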

RESULTS

Compared with the gold standard, weighted kappa for Models 1-3 was 0.88, 0.89 and 0.72, sensitivities were 95%, 94% and 78% and specificities were 82%, 84% and 81%. Extrapolating to a real-world DR prevalence of 23.8%, the PPVs were 63%, 64% and 57% and the NPVs were 98%, 98% and 92%. Discrepancies between the gold standard and Model 3 were mainly incorrect detection of artefacts (n = 49), missed microaneurysms (n = 26) and inconsistencies between the segmentation and classification (n = 51).
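The abstract does not describe how the extrapolation was carried out. A common approach, assumed here, is to apply Bayes' rule to the sensitivity, specificity and target prevalence. The sketch below uses the rounded percentages reported above, so its results can differ from the published PPV and NPV by about a percentage point; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) at a given prevalence via Bayes' rule (assumed method)."""
    tp = sensitivity * prevalence                # diseased, test positive
    fn = (1.0 - sensitivity) * prevalence        # diseased, test negative
    tn = specificity * (1.0 - prevalence)        # healthy, test negative
    fp = (1.0 - specificity) * (1.0 - prevalence)  # healthy, test positive
    return tp / (tp + fp), tn / (tn + fn)


PREVALENCE = 0.238  # real-world DR prevalence quoted in the abstract

# (sensitivity, specificity) per model, as reported (rounded) in the abstract.
reported = {
    "Model 1": (0.95, 0.82),
    "Model 2": (0.94, 0.84),
    "Model 3": (0.78, 0.81),
}

for name, (sens, spec) in reported.items():
    ppv, npv = predictive_values(sens, spec, PREVALENCE)
    print(f"{name}: PPV={ppv:.1%} NPV={npv:.1%}")
```

Because most of a 23.8%-prevalence population has no DR, even a moderate specificity produces many false positives relative to true positives, which is why the PPV stays low while the NPV stays high.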

CONCLUSION

While the autonomous DL algorithm for DR classification only performed on par with a human grader for some measures in a high-risk population, extrapolations to a real-world population demonstrated an excellent 92% NPV, which could make it clinically feasible to use autonomously to identify non-DR patients.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d729/11810534/fae29e3c00a6/AOS-103-215-g002.jpg