Multi-Modal Retinal Image Classification With Modality-Specific Attention Network.

Publication information

IEEE Trans Med Imaging. 2021 Jun;40(6):1591-1602. doi: 10.1109/TMI.2021.3059956. Epub 2021 Jun 1.

DOI:10.1109/TMI.2021.3059956

Abstract

Recently, automatic diagnostic approaches have been widely used to classify ocular diseases. Most of these approaches are based on a single imaging modality (e.g., fundus photography or optical coherence tomography (OCT)), which usually reflects the oculopathy only to a certain extent, and they neglect the modality-specific information available across different imaging modalities. This paper proposes a novel modality-specific attention network (MSAN) for multi-modal retinal image classification, which can effectively utilize the modality-specific diagnostic features from fundus and OCT images. The MSAN comprises two attention modules that extract the modality-specific features from fundus and OCT images, respectively. Specifically, for the fundus image, ophthalmologists need to observe local and global pathologies at multiple scales (e.g., from microaneurysms at the micrometer level and the optic disc at the millimeter level to blood vessels throughout the whole eye). Therefore, we propose a multi-scale attention module to extract both the local and global features from fundus images. Moreover, OCT images contain large background regions that are meaningless for diagnosis. Thus, a region-guided attention module is proposed to encode the retinal layer-related features and ignore the background in OCT images. Finally, we fuse the modality-specific features to form a multi-modal feature and train the multi-modal retinal image classification network. The fusion of modality-specific features allows the model to combine the advantages of the fundus and OCT modalities for a more accurate diagnosis. Experimental results on a clinically acquired multi-modal retinal image (fundus and OCT) dataset demonstrate that our MSAN outperforms other well-known single-modal and multi-modal retinal image classification methods.
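The data flow described in the abstract (two modality-specific branches, feature fusion, then classification) can be illustrated with a small pure-Python sketch. This is not the authors' MSAN: the abstract does not specify the internals of either attention module, so everything below is an assumption made for illustration. Here the multi-scale branch is approximated by softmax-weighted average pooling at several grid sizes, the region-guided branch by a masked average over a given retinal-layer mask, and fusion by simple concatenation. All function names (`multi_scale_attention`, `region_guided_attention`, `fuse`, `classify`) and the toy inputs are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def avg_pool(grid, cells):
    """Average-pool a square 2D list into a cells x cells grid.

    Assumes the side length of `grid` is divisible by `cells`.
    """
    step = len(grid) // cells
    pooled = []
    for r in range(cells):
        row = []
        for c in range(cells):
            block = [grid[i][j]
                     for i in range(r * step, (r + 1) * step)
                     for j in range(c * step, (c + 1) * step)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def multi_scale_attention(fundus, scales=(1, 2, 4)):
    """Assumed stand-in for the fundus branch: one descriptor per scale.

    A coarse grid (1 cell) summarizes global context; finer grids keep
    local detail. Softmax weights act as attention over spatial cells.
    """
    features = []
    for cells in scales:
        flat = [v for row in avg_pool(fundus, cells) for v in row]
        weights = softmax(flat)
        features.append(sum(w * v for w, v in zip(weights, flat)))
    return features


def region_guided_attention(oct_image, region_mask):
    """Assumed stand-in for the OCT branch: masked average.

    Pixels where the mask is 0 (background) contribute nothing.
    """
    num = sum(p * m
              for prow, mrow in zip(oct_image, region_mask)
              for p, m in zip(prow, mrow))
    den = sum(m for mrow in region_mask for m in mrow)
    return [num / den if den else 0.0]


def fuse(fundus_features, oct_features):
    """Fusion by concatenation (an assumption; the abstract only says 'fuse')."""
    return fundus_features + oct_features


def classify(features, weights, biases):
    """Linear layer followed by softmax over classes."""
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Toy inputs: a 4x4 "fundus" map and a 4x4 "OCT" map whose top and
# bottom rows are background.
fundus = [[0.1 * (i + j) for j in range(4)] for i in range(4)]
oct_image = [[9.0] * 4, [1.0] * 4, [1.0] * 4, [9.0] * 4]
retina_mask = [[0] * 4, [1] * 4, [1] * 4, [0] * 4]

fused = fuse(multi_scale_attention(fundus),
             region_guided_attention(oct_image, retina_mask))
# Hypothetical, untrained weights for a two-class classifier.
probs = classify(fused, [[0.5] * 4, [-0.5] * 4], [0.0, 0.0])
print(len(fused), probs)
```

In this toy setup the masked average ignores the bright background rows of the OCT map entirely, which mirrors the stated motivation for the region-guided module. A real implementation would learn both the attention weights and the classifier from data.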

Similar articles

The retinal disease screening study: retrospective comparison of nonmydriatic fundus photography and three-dimensional optical coherence tomography for detection of retinal irregularities.

Invest Ophthalmol Vis Sci. 2013 Aug 21;54(8):5694-700. doi: 10.1167/iovs.13-12043.

Deep Ensemble Learning Based Objective Grading of Macular Edema by Extracting Clinically Significant Findings from Fused Retinal Imaging Modalities.

Sensors (Basel). 2019 Jul 5;19(13):2970. doi: 10.3390/s19132970.

GAMMA challenge: Glaucoma grAding from Multi-Modality imAges.

Med Image Anal. 2023 Dec;90:102938. doi: 10.1016/j.media.2023.102938. Epub 2023 Sep 18.

Using a dual-stream attention neural network to characterize mild cognitive impairment based on retinal images.

Comput Biol Med. 2023 Nov;166:107411. doi: 10.1016/j.compbiomed.2023.107411. Epub 2023 Sep 9.

Attention to Lesion: Lesion-Aware Convolutional Neural Network for Retinal Optical Coherence Tomography Image Classification.

IEEE Trans Med Imaging. 2019 Aug;38(8):1959-1970. doi: 10.1109/TMI.2019.2898414. Epub 2019 Feb 8.

Cross-modal attention network for retinal disease classification based on multi-modal images.

Biomed Opt Express. 2024 May 14;15(6):3699-3714. doi: 10.1364/BOE.516764. eCollection 2024 Jun 1.

Ocular fundus reference images from optical coherence tomography.

Comput Med Imaging Graph. 2014 Jul;38(5):381-9. doi: 10.1016/j.compmedimag.2014.02.003. Epub 2014 Feb 22.

Simultaneous fundus imaging and optical coherence tomography of the mouse retina.

Invest Ophthalmol Vis Sci. 2007 Mar;48(3):1283-9. doi: 10.1167/iovs.06-0732.

Assessment of patient specific information in the wild on fundus photography and optical coherence tomography.

Sci Rep. 2021 Apr 21;11(1):8621. doi: 10.1038/s41598-021-86577-5.

Cited by

Multimodal Integration in Health Care: Development With Applications in Disease Management.

J Med Internet Res. 2025 Aug 21;27:e76557. doi: 10.2196/76557.

A Novel Foundation Model-Based Framework for Multimodal Retinal Age Prediction.

IEEE J Transl Eng Health Med. 2025 Jun 4;13:299-309. doi: 10.1109/JTEHM.2025.3576596. eCollection 2025.

Optimizing deep learning models for glaucoma screening with vision transformers for resource efficiency and the pie augmentation method.

PLoS One. 2025 Mar 21;20(3):e0314111. doi: 10.1371/journal.pone.0314111. eCollection 2025.

Deep learning for retinal vessel segmentation: a systematic review of techniques and applications.

Med Biol Eng Comput. 2025 Feb 18. doi: 10.1007/s11517-025-03324-y.

In-depth analysis of research hotspots and emerging trends in AI for retinal diseases over the past decade.

Front Med (Lausanne). 2024 Nov 20;11:1489139. doi: 10.3389/fmed.2024.1489139. eCollection 2024.

Advances and prospects of multi-modal ophthalmic artificial intelligence based on deep learning: a review.

Eye Vis (Lond). 2024 Oct 1;11(1):38. doi: 10.1186/s40662-024-00405-1.

Cross-modal attention network for retinal disease classification based on multi-modal images.

Biomed Opt Express. 2024 May 14;15(6):3699-3714. doi: 10.1364/BOE.516764. eCollection 2024 Jun 1.

Glaucoma detection model by exploiting multi-region and multi-scan-pattern OCT images with dynamical region score.

Biomed Opt Express. 2024 Feb 2;15(3):1370-1392. doi: 10.1364/BOE.512138. eCollection 2024 Mar 1.

Multi-Scale-Denoising Residual Convolutional Network for Retinal Disease Classification Using OCT.

Sensors (Basel). 2023 Dec 27;24(1):150. doi: 10.3390/s24010150.

A deep-learning based system using multi-modal data for diagnosing gastric neoplasms in real-time (with video).

Gastric Cancer. 2023 Mar;26(2):275-285. doi: 10.1007/s10120-022-01358-x. Epub 2022 Dec 15.
