Liu Zirong, Hu Yan, Qiu Zhongxi, Niu Yanyan, Zhou Dan, Li Xiaoling, Shen Junyong, Jiang Hongyang, Li Heng, Liu Jiang
School of Ophthalmology and Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China.
Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China.
Biomed Opt Express. 2024 May 14;15(6):3699-3714. doi: 10.1364/BOE.516764. eCollection 2024 Jun 1.
Multi-modal eye disease screening improves diagnostic accuracy by providing lesion information from different sources. However, existing multi-modal automatic diagnosis methods tend to focus on modality-specific features and ignore the spatial correlation between images. This paper proposes a cross-modal retinal disease diagnosis network (CRD-Net) that extracts correlated features from images of different modalities to aid the diagnosis of multiple retinal diseases. Specifically, the model introduces a cross-modal attention (CMA) module that queries and adaptively attends to lesion-relevant features across the different modal images. In addition, we propose multiple loss functions to fuse features with modality correlation and train a multi-modal retinal image classification network for more accurate diagnosis. Experimental evaluation on three publicly available datasets shows that CRD-Net outperforms existing single-modal and multi-modal methods.
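The abstract does not specify the internals of the CMA module, so the following is only a minimal sketch of generic scaled dot-product cross-attention, the common mechanism by which features from one modality "query" features from another. It should not be read as the authors' implementation: the function names, the single-head design, the absence of learned projections, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(query_feats, context_feats):
    """Attend from one modality (queries) to another (keys and values).

    Hypothetical helper: each query vector receives a weighted sum of the
    context vectors, with weights given by scaled dot-product similarity.
    A real module would add learned query/key/value projections.
    """
    dim = len(query_feats[0])
    scale = math.sqrt(dim)
    attended, weights = [], []
    for q in query_feats:
        # Similarity between this query and every context vector.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale
                  for k in context_feats]
        w = softmax(scores)
        # Weighted sum of context vectors, one output dimension at a time.
        out = [sum(wj * v[d] for wj, v in zip(w, context_feats))
               for d in range(dim)]
        weights.append(w)
        attended.append(out)
    return attended, weights


# Toy feature vectors for two unspecified modalities (made-up values).
modality_a_feats = [[1.0, 0.0], [0.0, 1.0]]
modality_b_feats = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

attended, weights = cross_modal_attention(modality_a_feats, modality_b_feats)
```

In this sketch each row of `weights` sums to 1 and shows how strongly a feature from the first modality attends to each feature of the second. The fused representation in `attended` could then be passed to a classifier, which is one plausible reading of how cross-modal features support diagnosis.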