Tang Chaosheng, Wei Mingyang, Sun Junding, Wang Shuihua, Zhang Yudong
School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, PR China.
School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK.
J King Saud Univ Comput Inf Sci. 2023 Jul;35(7):101618. doi: 10.1016/j.jksuci.2023.101618. Epub 2023 Jun 24.
Alzheimer's disease (AD) is a devastating degenerative disease that commonly occurs in the elderly. Early detection can protect patients from further damage and is therefore crucial in treating AD. Over the past few decades, neuroimaging has been demonstrated to be a critical diagnostic tool for AD, and fusing features from different neuroimaging modalities can enhance diagnostic performance. Most previous studies of multimodal feature fusion have simply concatenated the high-level features that neural networks extract from the various neuroimaging images. A major shortcoming of these studies is that they overlook the low-level feature interactions between modalities during the feature extraction stage, resulting in suboptimal performance in AD diagnosis. In this paper, we develop a dual-branch vision transformer with cross-attention and graph pooling, named CsAGP, which enables multi-level feature interactions between the inputs to learn a shared feature representation. Specifically, we first construct a new cross-attention fusion module (CAFM), which processes MRI and PET images through two independent branches of differing computational complexity. The resulting features are fused solely through the cross-attention mechanism, so that each modality enhances the other. After that, a concise Reshape-Pooling-Reshape (RPR) framework, based on a graph pooling algorithm, is developed for token selection to reduce token redundancy in the proposed model. Extensive experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database demonstrate that the proposed method obtains 99.04%, 97.43%, 98.57%, and 98.72% accuracy for the classification of AD vs. CN, AD vs. MCI, CN vs. MCI, and AD vs. CN vs. MCI, respectively.
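
To make the fusion step concrete, the following is a minimal PyTorch sketch of cross-attention between two modality token streams, where each branch's queries attend over the other branch's keys and values. The module name CrossAttentionFusion, the dimensions, and the residual/LayerNorm arrangement are illustrative assumptions for this sketch, not the authors' exact CAFM design.

# Hedged sketch of cross-attention fusion between two modality branches.
# All names, dimensions, and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Fuse MRI and PET token sequences: each branch attends to the other."""
    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        # Queries come from one modality; keys/values come from the other.
        self.mri_to_pet = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.pet_to_mri = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_mri = nn.LayerNorm(dim)
        self.norm_pet = nn.LayerNorm(dim)

    def forward(self, mri_tokens, pet_tokens):
        # MRI queries attend over PET keys/values, and vice versa;
        # residual connections preserve each branch's own information.
        mri_attn, _ = self.mri_to_pet(mri_tokens, pet_tokens, pet_tokens)
        pet_attn, _ = self.pet_to_mri(pet_tokens, mri_tokens, mri_tokens)
        return (self.norm_mri(mri_tokens + mri_attn),
                self.norm_pet(pet_tokens + pet_attn))

# Toy usage: batch of 2, 196 tokens per modality, embedding dim 256.
mri = torch.randn(2, 196, 256)
pet = torch.randn(2, 196, 256)
fused_mri, fused_pet = CrossAttentionFusion()(mri, pet)
print(fused_mri.shape, fused_pet.shape)  # torch.Size([2, 196, 256]) each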
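
The abstract does not detail the RPR pooling operator. The sketch below assumes a score-based top-k token selection in the spirit of learnable graph pooling methods such as gPool/SAGPool, which matches the stated goal of reducing token redundancy; the module name TopKTokenPooling and the keep ratio are hypothetical, and the authors' actual RPR framework may differ.

# Hedged sketch of token-redundancy reduction via score-based top-k pooling.
# This follows the general idea of learnable graph pooling (e.g., gPool);
# it is not the authors' exact RPR operator.
import torch
import torch.nn as nn

class TopKTokenPooling(nn.Module):
    """Score each token, keep the top-k, and gate them by their scores."""
    def __init__(self, dim: int, keep_ratio: float = 0.5):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # learnable per-token importance
        self.keep_ratio = keep_ratio

    def forward(self, tokens):           # tokens: (batch, n_tokens, dim)
        b, n, d = tokens.shape
        k = max(1, int(n * self.keep_ratio))
        scores = self.score(tokens).squeeze(-1)      # (b, n) importance scores
        topk = scores.topk(k, dim=1).indices         # indices of kept tokens
        idx = topk.unsqueeze(-1).expand(-1, -1, d)   # (b, k, d) gather index
        kept = tokens.gather(1, idx)                 # selected tokens
        gate = torch.sigmoid(scores.gather(1, topk)).unsqueeze(-1)
        return kept * gate                           # (b, k, d), score-gated

# Toy usage: halve 196 tokens down to 98.
tokens = torch.randn(2, 196, 256)
pooled = TopKTokenPooling(256)(tokens)
print(pooled.shape)  # torch.Size([2, 98, 256])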