School of Computer Science and Technology, East China Normal University, Shanghai 200062, China.
School of Data Science, The Chinese University of Hong Kong, Shenzhen; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen 518172, China.
Neural Netw. 2024 Dec;180:106670. doi: 10.1016/j.neunet.2024.106670. Epub 2024 Sep 6.
Owing to the limitations of medical imaging technology and the diversity of tumor signals, radiologists must rely on medical images of multiple modalities for tumor segmentation and diagnosis. This has driven the development of multimodal learning for medical image segmentation. However, redundancy among modalities creates challenges for existing subtraction-based joint learning methods, such as misjudging the importance of modalities, ignoring modality-specific information, and increasing the cognitive load. These thorny issues ultimately decrease segmentation accuracy and increase the risk of overfitting. This paper presents the complementary information mutual learning (CIML) framework, which can mathematically model and address the negative impact of inter-modal redundant information. CIML adopts the idea of addition and removes inter-modal redundant information through inductive bias-driven task decomposition and message passing-based redundancy filtering. CIML first decomposes the multimodal segmentation task into multiple subtasks based on expert prior knowledge, minimizing the information dependence between modalities. Furthermore, CIML introduces a scheme in which each modality extracts information from the other modalities additively through message passing. To ensure that the extracted information is non-redundant, redundancy filtering is reformulated as complementary information learning, inspired by the variational information bottleneck. The complementary information learning procedure can be solved efficiently by variational inference and cross-modal spatial attention. Numerical results on the verification task and standard benchmarks indicate that CIML efficiently removes redundant information between modalities, outperforming SOTA methods in validation accuracy and segmentation quality. Notably, message-passing-based redundancy filtering allows neural network visualization techniques to reveal the knowledge relationships among modalities, which reflects the interpretability of the framework.
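The sketch below illustrates the core mechanism described above: a target modality queries a source modality via cross-modal spatial attention, the passed message is compressed with a variational-information-bottleneck-style Gaussian posterior and KL penalty, and the filtered message is fused additively with the target features. This is a minimal, hedged reading of the abstract, not the authors' released code; the module name, tensor shapes, diagonal Gaussian parameterization, and two-branch usage example are illustrative assumptions.

```python
# Minimal sketch of message passing with cross-modal spatial attention and a
# VIB-style regularizer, in the spirit of the complementary information
# learning described in the abstract. Layer choices and shapes are assumptions.
import torch
import torch.nn as nn


class CrossModalSpatialAttention(nn.Module):
    """Lets a target modality query complementary features from a source modality."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, kernel_size=1)  # from target modality
        self.key = nn.Conv2d(channels, channels, kernel_size=1)    # from source modality
        self.value = nn.Conv2d(channels, channels, kernel_size=1)  # message content
        # Heads parameterizing a diagonal Gaussian over the passed message (VIB-style).
        self.mu_head = nn.Conv2d(channels, channels, kernel_size=1)
        self.logvar_head = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, target_feat, source_feat):
        b, c, h, w = target_feat.shape
        q = self.query(target_feat).flatten(2).transpose(1, 2)   # (B, HW, C)
        k = self.key(source_feat).flatten(2)                     # (B, C, HW)
        v = self.value(source_feat).flatten(2).transpose(1, 2)   # (B, HW, C)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)           # spatial attention map
        msg = (attn @ v).transpose(1, 2).reshape(b, c, h, w)     # attended message

        mu = self.mu_head(msg)
        logvar = self.logvar_head(msg)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        # KL(q(z|x) || N(0, I)) acts as the redundancy-filtering penalty:
        # information the target subtask does not need is compressed away.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).mean()
        # Additive fusion: the target keeps its own features and adds the filtered message.
        return target_feat + z, kl


if __name__ == "__main__":
    attn = CrossModalSpatialAttention(channels=32)
    t1 = torch.randn(2, 32, 64, 64)   # e.g. features from one modality branch
    t2 = torch.randn(2, 32, 64, 64)   # e.g. features from another modality branch
    fused, kl_penalty = attn(t1, t2)
    print(fused.shape, kl_penalty.item())
```

Under these assumptions, the additive fusion preserves each modality's own features while the KL term discourages the message from carrying information the target branch already has, which is one way to realize the "extract only complementary information" idea stated in the abstract.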