Teng Xiangze, Li Xiang, Wei Benzheng
Center for Medical Artificial Intelligence, Shandong University of Traditional Chinese Medicine, Qingdao, China.
Qingdao Academy of Chinese Medical Sciences, Shandong University of Traditional Chinese Medicine, Qingdao, China.
Front Comput Neurosci. 2025 Jul 16;19:1604399. doi: 10.3389/fncom.2025.1604399. eCollection 2025.
Parkinson's disease (PD) is a complex neurodegenerative disorder with a high rate of misdiagnosis, which makes early and accurate diagnosis critical. Although existing computer-aided diagnostic systems integrate clinical assessment scales with neuroimaging data, they typically rely on superficial feature concatenation, which fails to capture the deep inter-modal dependencies essential for effective multimodal fusion. To address this limitation, we propose ModFus-PD, a multimodal fusion framework that combines contrastive learning with cross-modal attention. Contrastive learning aligns heterogeneous modalities such as imaging and clinical text, while the cross-modal attention mechanism exploits semantic interactions between them to enhance feature fusion. The framework comprises three key components: (1) a contrastive learning-based feature alignment module that projects MRI data and clinical text prompts into a unified embedding space via pretrained image and text encoders; (2) a bidirectional cross-modal attention module in which textual semantics guide MRI feature refinement for improved sensitivity to PD-related brain regions, while MRI features simultaneously enhance the contextual understanding of clinical text; (3) a hierarchical classification module that integrates the fused representations through two fully connected layers to produce final PD classification probabilities. In experiments on the PPMI dataset, ModFus-PD achieves an accuracy of 0.903, an AUC of 0.892, and an F1 score of 0.840, surpassing several state-of-the-art baselines. These results support the effectiveness of the cross-modal fusion strategy, which enables interpretable and reliable diagnostic support and holds promise for future clinical translation.
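The three components described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a CLIP-style symmetric InfoNCE loss for the alignment step, single-head scaled dot-product attention without learned projections for the bidirectional attention step, and a ReLU hidden layer between the two fully connected layers. All function names, the temperature of 0.07, the embedding size, and the random inputs standing in for pretrained encoder outputs are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def log_softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def contrastive_alignment_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired (image, text) embeddings.

    Row i of img_emb and row i of txt_emb are treated as the positive pair;
    every other row in the batch acts as a negative. (Assumed loss form.)
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature            # (B, B) cosine similarities
    idx = np.arange(logits.shape[0])
    img_to_txt = log_softmax(logits, axis=1)[idx, idx]
    txt_to_img = log_softmax(logits, axis=0)[idx, idx]
    return -0.5 * (img_to_txt.mean() + txt_to_img.mean())

def cross_attention(queries, context):
    """Each query token attends over all context tokens (single head)."""
    d = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(d), axis=-1)
    return weights @ context

def bidirectional_fusion(img_tokens, txt_tokens):
    """Text guides MRI features and MRI features inform text features."""
    img_refined = img_tokens + cross_attention(img_tokens, txt_tokens)
    txt_refined = txt_tokens + cross_attention(txt_tokens, img_tokens)
    # Mean-pool each modality and concatenate (hypothetical pooling choice).
    return np.concatenate([img_refined.mean(axis=0), txt_refined.mean(axis=0)])

def classify(fused, w1, b1, w2, b2):
    """Two fully connected layers followed by softmax over {control, PD}."""
    hidden = np.maximum(0.0, fused @ w1 + b1)     # ReLU (assumed activation)
    return softmax(hidden @ w2 + b2)

# Random stand-ins for encoder outputs: 8 MRI tokens, 5 text tokens, d = 16.
rng = np.random.default_rng(0)
d = 16
img_tokens = rng.normal(size=(8, d))
txt_tokens = rng.normal(size=(5, d))

fused = bidirectional_fusion(img_tokens, txt_tokens)
w1, b1 = rng.normal(size=(2 * d, 32)) * 0.1, np.zeros(32)
w2, b2 = rng.normal(size=(32, 2)) * 0.1, np.zeros(2)
probs = classify(fused, w1, b1, w2, b2)

# Alignment loss on a batch of 4 matched pairs versus 4 unrelated pairs.
batch = rng.normal(size=(4, d))
loss_aligned = contrastive_alignment_loss(batch, batch)
loss_random = contrastive_alignment_loss(batch, rng.normal(size=(4, d)))
```

In this sketch, perfectly matched image and text embeddings give a much lower alignment loss than unrelated pairs, which is the behaviour the alignment module relies on. The fused vector has twice the embedding dimension because the two pooled modality vectors are concatenated before classification.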