Vision Transformer-based recognition of diabetic retinopathy grade.

Affiliations

School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China.

School of Traditional Chinese Medicine, Jinan University, Guangzhou, China.

Publication information

Med Phys. 2021 Dec;48(12):7850-7863. doi: 10.1002/mp.15312. Epub 2021 Nov 16.

DOI: 10.1002/mp.15312
PMID: 34693536

Abstract

BACKGROUND

In natural language processing, Transformers are recognized as state-of-the-art models that, unlike typical convolutional neural networks (CNNs), do not rely on convolution layers. Instead, Transformers use multi-head attention as the main building block; applied to images, this lets them capture long-range contextual relations between image pixels. CNNs have recently dominated deep learning solutions for diabetic retinopathy grade recognition. Motivated by the advantages of Transformers, we propose a Transformer-based method suited to recognizing the grade of diabetic retinopathy.

PURPOSE

The purposes of this work are to demonstrate that (i) the pure attention mechanism is suitable for diabetic retinopathy grade recognition and (ii) Transformers can replace traditional CNNs for diabetic retinopathy grade recognition.

METHODS

This paper proposes a Vision Transformer-based method to recognize the grade of diabetic retinopathy. Fundus images are divided into non-overlapping patches, which are flattened into a sequence and passed through a linear embedding, with a positional embedding added to preserve positional information. The resulting sequence is fed into several multi-head attention layers to produce the final representation. In the classification stage, the first token of the sequence is passed to a softmax classification layer to produce the recognition output.
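The pipeline described in the Methods paragraph can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the image size, patch size, embedding width, depth, number of heads, and the choice of five output grades are all assumptions made for the example, the weights are random, and layer normalization and the feed-forward sublayers of a full Vision Transformer are omitted for brevity.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into non-overlapping, flattened patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)               # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)  # one row per patch

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)       # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, wo, heads):
    """Scaled dot-product self-attention over a (tokens, dim) sequence."""
    n, d = x.shape
    dh = d // heads
    def split(t):                                  # (n, d) -> (heads, n, dh)
        return t.reshape(n, heads, dh).transpose(1, 0, 2)
    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)   # (heads, n, n)
    out = softmax(scores) @ v                          # (heads, n, dh)
    out = out.transpose(1, 0, 2).reshape(n, d)         # concatenate heads
    return out @ wo

def init_params(rng, image_size=64, patch=16, channels=3, dim=32, depth=2, classes=5):
    """Random weights; every size here is an illustrative assumption."""
    n = (image_size // patch) ** 2
    s = 0.02
    return {
        "embed": rng.normal(0, s, (patch * patch * channels, dim)),  # linear embedding
        "cls": rng.normal(0, s, (1, dim)),                           # first (class) token
        "pos": rng.normal(0, s, (n + 1, dim)),                       # positional embedding
        "layers": [tuple(rng.normal(0, s, (dim, dim)) for _ in range(4))
                   for _ in range(depth)],
        "head": rng.normal(0, s, (dim, classes)),                    # classification layer
    }

def vit_forward(image, params, patch=16, heads=4):
    tokens = patchify(image, patch) @ params["embed"]   # flatten + linear embedding
    tokens = np.vstack([params["cls"], tokens])         # prepend the first token
    tokens = tokens + params["pos"]                     # add positional information
    for layer in params["layers"]:                      # attention layers with residuals
        tokens = tokens + multi_head_self_attention(tokens, *layer, heads=heads)
    return softmax(tokens[0] @ params["head"])          # classify from the first token

rng = np.random.default_rng(0)
params = init_params(rng)
image = rng.random((64, 64, 3))        # stand-in for a preprocessed fundus image
probs = vit_forward(image, params)     # one probability per assumed grade
```

With untrained weights the output probabilities are meaningless; the sketch only shows how data flows from patches to a per-grade distribution.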

RESULTS

The dataset used for training and testing consists of fundus images of different resolutions, each divided into patches. We compare our method against current CNNs and extreme learning machines and obtain competitive performance. Specifically, the proposed deep learning architecture attains an accuracy of 91.4%, a specificity of 0.977 (95% confidence interval (CI): 0.951-1), a precision of 0.928 (95% CI: 0.852-1), a sensitivity of 0.926 (95% CI: 0.863-0.989), a quadratic weighted kappa score of 0.935, and an area under the curve (AUC) of 0.986.
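Of the metrics reported above, the quadratic weighted kappa is probably the least familiar. The sketch below uses the standard definition, with disagreement weights (i - j)^2 / (k - 1)^2 over k ordered grades. The abstract does not say how the authors computed the score, so the function name, the integer label encoding, and the toy labels are assumptions for illustration only.

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Agreement between two integer gradings, penalizing large grade gaps more."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("at least two classes are required")
    n = len(y_true)
    k = num_classes

    # Observed confusion matrix: rows are true grades, columns are predicted grades.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2          # quadratic disagreement weight
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j] / n  # expected count under chance
    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when all labels fall in a single class")
    return 1.0 - weighted_observed / weighted_expected

# Toy labels (not data from the paper): identical gradings give perfect agreement.
perfect = quadratic_weighted_kappa([0, 1, 2, 3, 4], [0, 1, 2, 3, 4], 5)
# Chance-level agreement on two classes gives zero.
chance = quadratic_weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], 2)
```

A score of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. In practice, scikit-learn's `cohen_kappa_score` with `weights="quadratic"` computes the same quantity.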

CONCLUSION

Comparative experiments against current methods show that our model is competitive and indicate that an attention mechanism based on a Vision Transformer model is promising for the diabetic retinopathy grade recognition task.
