IEEE Trans Med Imaging. 2023 Nov;42(11):3395-3407. doi: 10.1109/TMI.2023.3288001. Epub 2023 Oct 27.
Cross-modality magnetic resonance (MR) image synthesis can be used to generate missing modalities from given ones. Existing (supervised learning) methods often require a large amount of paired multi-modal data to train an effective synthesis model. However, it is often challenging to obtain sufficient paired data for supervised training. In reality, we often have only a small amount of paired data but a large amount of unpaired data. To take advantage of both paired and unpaired data, in this paper, we propose a Multi-scale Transformer Network (MT-Net) with edge-aware pre-training for cross-modality MR image synthesis. Specifically, an Edge-preserving Masked AutoEncoder (Edge-MAE) is first pre-trained in a self-supervised manner to simultaneously perform 1) image imputation for randomly masked patches in each image and 2) whole edge map estimation, which effectively learns both contextual and structural information. In addition, a novel patch-wise loss is proposed to enhance the performance of Edge-MAE by treating different masked patches differently according to the difficulties of their respective imputations. Based on this pre-training, in the subsequent fine-tuning stage, a Dual-scale Selective Fusion (DSF) module is designed (in our MT-Net) to synthesize missing-modality images by integrating multi-scale features extracted from the encoder of the pre-trained Edge-MAE. Furthermore, this pre-trained encoder is also employed to extract high-level features from the synthesized image and the corresponding ground-truth image, which are required to be similar (consistent) during training. Experimental results show that our MT-Net achieves performance comparable to competing methods even when using only 70% of all available paired data. Our code will be released at https://github.com/lyhkevin/MT-Net.
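The idea of treating masked patches differently according to imputation difficulty can be illustrated with a small sketch. The weighting scheme below (per-patch reconstruction error used as a difficulty proxy and normalized per image) is an assumption for illustration only, not the exact formulation of the paper's patch-wise loss, and `patchwise_weighted_loss` is a hypothetical name.

```python
import numpy as np

def patchwise_weighted_loss(pred, target, mask):
    """Difficulty-weighted patch-wise reconstruction loss (illustrative sketch).

    pred, target: arrays of shape (B, N, D) -- B images, N patches, D values per patch.
    mask: boolean array of shape (B, N), True where a patch was masked out.
    Patches with larger reconstruction error (harder imputations) receive
    larger weights; unmasked patches are excluded from the loss.
    """
    per_patch = ((pred - target) ** 2).mean(axis=-1)       # per-patch MSE, shape (B, N)
    w = per_patch * mask                                   # difficulty proxy, masked patches only
    w = w / (w.sum(axis=1, keepdims=True) + 1e-8)          # normalize weights per image
    return float((w * per_patch).sum() / pred.shape[0])    # average over the batch
```

A perfectly reconstructed image contributes zero loss, while images whose masked patches are harder to impute concentrate weight on their worst patches.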