IEEE Trans Image Process. 2022;31:5134-5149. doi: 10.1109/TIP.2022.3193288. Epub 2022 Aug 2.
Owing to the limitations of imaging sensors, it is challenging to obtain a medical image that simultaneously contains functional metabolic information and structural tissue details. Multimodal medical image fusion, an effective way to merge the complementary information of different modalities, has become a significant technique for facilitating clinical diagnosis and surgical navigation. With their powerful feature representation ability, deep learning (DL)-based methods have improved fusion results but still fall short of satisfactory performance. Specifically, existing DL-based methods generally depend on convolutional operations, which extract local patterns well but have limited capability in preserving global context information. To compensate for this defect and achieve accurate fusion, we propose a novel unsupervised method that fuses multimodal medical images via a multiscale adaptive Transformer, termed MATR. Instead of directly employing vanilla convolution, we introduce an adaptive convolution that modulates the convolutional kernel based on the global complementary context. To further model long-range dependencies, an adaptive Transformer is employed to enhance the global semantic extraction capability. The network is designed in a multiscale fashion so that useful multimodal information can be adequately acquired across different scales. Moreover, an objective function composed of a structural loss and a region mutual information loss is devised to enforce information preservation at both the structural level and the feature level. Extensive experiments on a mainstream database demonstrate that the proposed method outperforms other representative and state-of-the-art methods in terms of both visual quality and quantitative evaluation. We also extend the proposed method to other biomedical image fusion tasks, and the promising fusion results illustrate that MATR has good generalization capability. The code of the proposed method is available at https://github.com/tthinking/MATR.
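For readers who want a concrete picture of the "adaptive convolution" idea mentioned in the abstract, the following is a minimal, hypothetical PyTorch sketch in which a convolution's output is modulated by a global context vector pooled from the input. It is not the authors' implementation (the official code is at https://github.com/tthinking/MATR); the class name `AdaptiveConv2d`, the pooling-plus-MLP modulation scheme, and all layer sizes are illustrative assumptions only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv2d(nn.Module):
    """Convolution whose output channels are rescaled by factors predicted
    from the globally pooled input, so the effective kernel adapts to the
    content of each input (a dynamic-convolution-style interpretation)."""

    def __init__(self, in_ch, out_ch, kernel_size=3, reduction=4):
        super().__init__()
        self.padding = kernel_size // 2
        # Base (static) kernel shared across all inputs.
        self.weight = nn.Parameter(
            torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        hidden = max(in_ch // reduction, 4)
        # Small MLP mapping the global context vector to per-channel scales.
        self.context_mlp = nn.Sequential(
            nn.Linear(in_ch, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, out_ch),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b = x.size(0)
        # Global context: spatial average pooling over the whole feature map.
        context = x.mean(dim=(2, 3))            # (B, in_ch)
        scale = self.context_mlp(context)        # (B, out_ch)
        out = F.conv2d(x, self.weight, padding=self.padding)
        # Modulate each output channel by its input-dependent factor.
        return out * scale.view(b, -1, 1, 1)

if __name__ == "__main__":
    # Toy usage: extract features from a 2-channel stack of co-registered
    # source images (e.g., MRI and PET luminance), purely for illustration.
    layer = AdaptiveConv2d(in_ch=2, out_ch=16)
    feats = layer(torch.randn(1, 2, 256, 256))
    print(feats.shape)  # torch.Size([1, 16, 256, 256])
```

Per-channel rescaling driven by a pooled context vector is only one simple way to make a kernel input-dependent; the paper's actual adaptive convolution and its multiscale Transformer blocks should be taken from the linked repository.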