Yan Shouang, Wang Chengyan, Chen Weibo, Lyu Jun
School of Computer and Control Engineering, Yantai University, Yantai, China.
Human Phenome Institute, Fudan University, Shanghai, China.
Front Oncol. 2022 Aug 8;12:942511. doi: 10.3389/fonc.2022.942511. eCollection 2022.
Medical image-to-image translation is considered a new research direction with many potential applications in the medical field. The field is currently dominated by two families of models: the supervised Pix2Pix and the unsupervised cycle-consistency generative adversarial network (GAN). However, existing methods still have two shortcomings: 1) Pix2Pix requires paired and pixel-aligned images, which are difficult to acquire, while the optimal output of the cycle-consistency model may not be unique; 2) both remain deficient in capturing global features and modeling long-distance interactions, which are critical for regions with complex anatomical structures. We propose a Swin Transformer-based GAN for Multi-Modal Medical Image Translation, named MMTrans. Specifically, MMTrans consists of a generator, a registration network, and a discriminator. The Swin Transformer-based generator produces images that preserve the content of the source modality images while carrying the style information of the target modality images. The encoder part of the registration network, also based on the Swin Transformer, is used to predict deformable vector fields. The convolution-based discriminator determines whether target modality images come from the generator or are real images. Extensive experiments conducted on a public dataset and clinical datasets showed that our network outperforms other advanced medical image translation methods on both aligned and unpaired datasets and has great potential for clinical application.
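To make the three-component layout described in the abstract concrete, the following is a minimal sketch, not the authors' code: a generator standing in for the Swin Transformer-based translator, a registration network that predicts a deformable vector field used to warp the translated image toward the real target image, and a convolutional discriminator. The `SwinBlockStub` module and all layer sizes are placeholder assumptions for illustration; the real MMTrans uses Swin Transformer blocks (shifted-window self-attention) and a more elaborate encoder-decoder.

```python
# Hypothetical sketch of the MMTrans pipeline shape (generator + registration
# network + discriminator). Placeholder modules only; not the published model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwinBlockStub(nn.Module):
    """Stand-in for a Swin Transformer stage (windowed self-attention)."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.GELU())
    def forward(self, x):
        return x + self.body(x)

class Generator(nn.Module):
    """Translates a source-modality image into the target modality."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Conv2d(1, ch, 3, padding=1)
        self.swin = SwinBlockStub(ch)
        self.dec = nn.Conv2d(ch, 1, 3, padding=1)
    def forward(self, x):
        return self.dec(self.swin(self.enc(x)))

class RegistrationNet(nn.Module):
    """Predicts a 2-channel deformable vector field from the translated and real target images."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), SwinBlockStub(ch), nn.Conv2d(ch, 2, 3, padding=1))
    def forward(self, fake_tgt, real_tgt):
        return self.net(torch.cat([fake_tgt, real_tgt], dim=1))

def warp(img, flow):
    """Applies the predicted displacement field with grid_sample."""
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).unsqueeze(0).expand(n, -1, -1, -1).to(img)
    grid = base + flow.permute(0, 2, 3, 1)
    return F.grid_sample(img, grid, align_corners=True)

class Discriminator(nn.Module):
    """Convolutional (PatchGAN-style) discriminator on target-modality images."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, 1, 4, stride=2, padding=1))
    def forward(self, x):
        return self.net(x)

# One forward pass: translate, predict the deformation field, warp, and score.
src, tgt = torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64)
G, R, D = Generator(), RegistrationNet(), Discriminator()
fake_tgt = G(src)
flow = R(fake_tgt, tgt)
warped = warp(fake_tgt, flow)
score = D(warped)
```

The registration step is what relaxes the pixel-alignment requirement of Pix2Pix: the generator output is warped toward the (possibly misaligned) real target image before losses are computed, so training can proceed on imperfectly registered pairs.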