Alasmari Hind, Amoudi Ghada, Alghamdi Hanan
Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia.
Diagnostics (Basel). 2025 Sep 10;15(18):2301. doi: 10.3390/diagnostics15182301.
Background/Objectives: Glaucoma is an eye disease caused by increased intraocular pressure (IOP) that damages the optic nerve head (ONH), leading to vision loss and irreversible blindness. Glaucoma is the second leading cause of blindness worldwide, and the number of people affected is increasing each year, expected to reach 111.8 million by 2040. This escalating trend is alarming given the shortage of ophthalmology specialists relative to the population. This study proposes an explainable end-to-end pipeline for automated glaucoma diagnosis from fundus images and evaluates the performance of Vision Transformers (ViTs) relative to traditional CNN-based models. Methods: The proposed system uses three datasets: REFUGE, ORIGA, and G1020. It begins with YOLOv11 for object detection of the optic disc. The optic disc (OD) and optic cup (OC) are then segmented using U-Net with ResNet50, VGG16, and MobileNetV2 backbones, as well as MaskFormer with a Swin-Base backbone. Glaucoma is classified based on the vertical cup-to-disc ratio (vCDR). Results: MaskFormer outperformed all other models on every segmentation metric, achieving IoU OD, IoU OC, DSC OD, and DSC OC scores of 88.29%, 91.09%, 93.83%, and 93.71%, respectively. For classification, it achieved an accuracy of 84.03% and an F1-score of 84.56%. Conclusions: By relying on the interpretable vCDR biomarker, the proposed framework enhances transparency and aligns with the principles of explainable AI, offering a trustworthy solution for glaucoma screening. Our findings show that Vision Transformers are a promising approach for achieving high segmentation performance alongside explainable, biomarker-driven diagnosis.
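The vCDR-based classification step described in the abstract can be illustrated with a short sketch: given binary segmentation masks for the optic disc and optic cup, the vCDR is the ratio of their vertical extents. The helper names and the 0.6 screening threshold below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def vertical_extent(mask: np.ndarray) -> int:
    """Height in pixels of the foreground region along the vertical axis."""
    rows = np.any(mask > 0, axis=1)      # which image rows contain foreground
    idx = np.flatnonzero(rows)
    return 0 if idx.size == 0 else int(idx[-1] - idx[0] + 1)

def vcdr(cup_mask: np.ndarray, disc_mask: np.ndarray) -> float:
    """Vertical cup-to-disc ratio from binary OC and OD segmentation masks."""
    disc_h = vertical_extent(disc_mask)
    if disc_h == 0:
        raise ValueError("empty optic disc mask")
    return vertical_extent(cup_mask) / disc_h

# Toy masks: disc spans 100 rows, cup spans 65 rows -> vCDR = 0.65
disc = np.zeros((200, 200), dtype=np.uint8)
disc[50:150, 60:140] = 1
cup = np.zeros_like(disc)
cup[70:135, 80:120] = 1
ratio = vcdr(cup, disc)

# A common screening heuristic (an assumption here, not necessarily the
# paper's exact cutoff) flags eyes with vCDR above ~0.6 as glaucoma-suspect.
suspect = ratio > 0.6
```

In the pipeline described above, the masks would come from the MaskFormer (or U-Net) segmentation stage rather than being constructed by hand; the diagnosis then reduces to this single interpretable ratio, which is what makes the approach transparent.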