Zhang Kaixuan, Dong Shuqi, Shi Peifeng, Hu Dingcan, Gao Geng, Yang Jinlin, Gan Tao, Rao Nini
Brain-Computer Interface & Brain-Inspired Intelligence Key Laboratory of Sichuan Province, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, Sichuan, China.
Digestive Endoscopic Center of West China Hospital, Sichuan University, Chengdu, 610017, Sichuan, China.
Comput Med Imaging Graph. 2026 Feb;128:102699. doi: 10.1016/j.compmedimag.2026.102699. Epub 2026 Jan 5.
Survival prediction using whole slide images (WSIs) and bulk gene expression data is a key task in computational pathology, essential for automated risk assessment and personalized treatment planning. However, integrating WSIs with genomic features is challenging due to inconsistent modality granularity, semantic disparity, and the lack of personalized fusion. We propose GenoPath-MCA, a novel multimodal framework that models dense cross-modal interactions between histopathology and gene expression data. A masked co-attention mechanism aligns features across modalities, and the Multimodal Masked Cross-Attention Module (M2CAM) jointly captures high-order image-gene and gene-gene relationships for enhanced semantic fusion. To address patient-level heterogeneity, we develop a Dynamic Modality Weight Adjustment Strategy (DMWAS) that adaptively modulates fusion weights according to the discriminative relevance of each modality. Additionally, an importance-guided patch selection strategy filters redundant visual inputs, reducing computational cost while preserving critical context. Experiments on public multimodal cancer survival datasets demonstrate that GenoPath-MCA significantly outperforms existing methods in concordance index and robustness. Visualizations of multimodal attention maps validate the biological interpretability and clinical potential of our approach.
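The two core ideas in the abstract, masked cross-attention between gene and patch tokens and per-patient adaptive fusion weights, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: all shapes, the mask construction, and the `dynamic_modality_fusion` gating are illustrative assumptions standing in for M2CAM and DMWAS.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_cross_attention(queries, keys, values, mask):
    """Scaled dot-product cross-attention with a boolean pair mask.

    queries: (Nq, d) e.g. gene-group tokens
    keys/values: (Nk, d) e.g. selected WSI patch tokens
    mask: (Nq, Nk) True where a gene-patch pair may attend (assumption:
    in the paper this would be the masked co-attention pattern).
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    scores = np.where(mask, scores, -1e9)   # suppress masked pairs
    attn = softmax(scores, axis=-1)
    return attn @ values, attn

def dynamic_modality_fusion(img_feat, gene_feat, w_logits):
    # Hypothetical stand-in for DMWAS: per-patient logits are normalized
    # into fusion weights so the more discriminative modality dominates.
    w = softmax(w_logits)
    return w[0] * img_feat + w[1] * gene_feat

rng = np.random.default_rng(0)
genes = rng.normal(size=(4, 8))     # 4 gene-group tokens, dim 8
patches = rng.normal(size=(6, 8))   # 6 patch tokens after selection
mask = np.ones((4, 6), dtype=bool)  # trivially allow all pairs here
fused_genes, attn = masked_cross_attention(genes, patches, patches, mask)
patient_repr = dynamic_modality_fusion(
    fused_genes.mean(axis=0), genes.mean(axis=0), np.array([1.2, 0.3]))
```

Each attention row is a distribution over patches, so gene tokens aggregate image evidence; the fused vector would then feed a survival head (e.g. a Cox or discrete-hazard layer) in a full model.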