Yan Shouang, Wang Chengyan, Chen Weibo, Lyu Jun
School of Computer and Control Engineering, Yantai University, Yantai, China.
Human Phenome Institute, Fudan University, Shanghai, China.
Front Oncol. 2022 Aug 8;12:942511. doi: 10.3389/fonc.2022.942511. eCollection 2022.
Medical image-to-image translation is considered a new research direction with many potential applications in the medical field. The field is currently dominated by two families of models: the supervised Pix2Pix and the unsupervised cycle-consistency generative adversarial network (GAN). However, existing methods still have two shortcomings: 1) Pix2Pix requires paired and pixel-aligned images, which are difficult to acquire, while the optimal output of the cycle-consistency model may not be unique; 2) both remain deficient in capturing global features and modeling long-distance interactions, which are critical for regions with complex anatomical structures. We propose a Swin Transformer-based GAN for Multi-Modal Medical Image Translation, named MMTrans. Specifically, MMTrans consists of a generator, a registration network, and a discriminator. The Swin Transformer-based generator produces images that preserve the content of the source modality images while carrying the style information of the target modality images. The encoder part of the registration network, also based on the Swin Transformer, is used to predict deformable vector fields. The convolution-based discriminator determines whether target modality images come from the generator or are real images. Extensive experiments conducted on a public dataset and clinical datasets showed that our network outperforms other advanced medical image translation methods on both aligned and unpaired datasets and has great potential for clinical application.
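To make the three-component layout described in the abstract concrete, the following is a minimal sketch, not the authors' code: a generator standing in for the Swin Transformer-based translator, a registration network that predicts a deformable vector field used to warp the translated image toward the real target image, and a convolutional discriminator. The `SwinBlockStub` module and all layer sizes are placeholder assumptions for illustration; the real MMTrans uses Swin Transformer blocks (shifted-window self-attention) and a more elaborate encoder-decoder.

```python
# Hypothetical sketch of the MMTrans pipeline shape (generator + registration
# network + discriminator). Placeholder modules only; not the published model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwinBlockStub(nn.Module):
    """Stand-in for a Swin Transformer stage (windowed self-attention)."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.GELU())
    def forward(self, x):
        return x + self.body(x)

class Generator(nn.Module):
    """Translates a source-modality image into the target modality."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Conv2d(1, ch, 3, padding=1)
        self.swin = SwinBlockStub(ch)
        self.dec = nn.Conv2d(ch, 1, 3, padding=1)
    def forward(self, x):
        return self.dec(self.swin(self.enc(x)))

class RegistrationNet(nn.Module):
    """Predicts a 2-channel deformable vector field from the translated and real target images."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), SwinBlockStub(ch), nn.Conv2d(ch, 2, 3, padding=1))
    def forward(self, fake_tgt, real_tgt):
        return self.net(torch.cat([fake_tgt, real_tgt], dim=1))

def warp(img, flow):
    """Applies the predicted displacement field with grid_sample."""
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1).to(img)
    grid = base + flow.permute(0, 2, 3, 1)
    return F.grid_sample(img, grid, align_corners=True)

class Discriminator(nn.Module):
    """Convolutional (PatchGAN-style) discriminator on target-modality images."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, 1, 4, stride=2, padding=1))
    def forward(self, x):
        return self.net(x)

# One forward pass: translate, predict the deformation field, warp, and score.
src, tgt = torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64)
G, R, D = Generator(), RegistrationNet(), Discriminator()
fake_tgt = G(src)
flow = R(fake_tgt, tgt)
warped = warp(fake_tgt, flow)
score = D(warped)
```

The registration step is what relaxes the pixel-alignment requirement of Pix2Pix: the generator output is warped toward the (possibly misaligned) real target image before losses are computed, so training can proceed on imperfectly registered pairs.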