Department of Radiology, Mayo Clinic, Rochester, MN, USA.
J Imaging Inform Med. 2024 Jun;37(3):1228-1238. doi: 10.1007/s10278-024-00976-4. Epub 2024 Feb 16.
We evaluated the impact of training set size on generative adversarial networks (GANs) that synthesize brain MRI sequences. We compared three sets of GANs trained to generate pre-contrast T1 (gT1) from post-contrast T1 and FLAIR (gFLAIR) from T2. The baseline models were trained on 135 cases; for this study, we used the same model architecture but a larger cohort of 1251 cases and two stopping rules: an early checkpoint (early models) and a checkpoint after 50 epochs (late models). We tested all models on an independent dataset of 485 newly diagnosed gliomas. We compared the generated MRIs with the original ones using the structural similarity index (SSI) and mean squared error (MSE). We simulated scenarios in which the original T1, the original FLAIR, or both were missing, and used the synthesized versions, together with the original post-contrast T1 and T2, as inputs to a segmentation model. We compared the segmentations using the Dice similarity coefficient (DSC) for the contrast-enhancing area, the non-enhancing area, and the whole lesion. For the baseline, early, and late models on the test set, the median SSI for gT1 was .957, .918, and .947, respectively, and the median MSE was .006, .014, and .008. For gFLAIR, the median SSI was .924, .908, and .915, and the median MSE was .016, .016, and .019. The DSC ranged from .625 to .955, .420 to .952, and .610 to .954, respectively. Overall, GANs trained on a relatively small cohort performed similarly to those trained on a cohort roughly ten times larger, making them a viable option for rare diseases or institutions with limited resources.
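As an illustration of the evaluation metrics named above, the following is a minimal sketch of how MSE between an original and a generated image, and DSC per tumor region, might be computed with NumPy. It is not the authors' code. The function names, the label values (1 for contrast-enhancing, 2 for non-enhancing), and the choice to score two empty masks as 1.0 are all assumptions made for this example; the abstract does not state how intensities were scaled or how labels were encoded.

```python
import numpy as np


def mean_squared_error(original, generated):
    """Mean squared voxel-wise difference between two images of equal shape."""
    original = np.asarray(original, dtype=np.float64)
    generated = np.asarray(generated, dtype=np.float64)
    if original.shape != generated.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((original - generated) ** 2))


def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|) for binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = int(a.sum()) + int(b.sum())
    if total == 0:
        # Assumption for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * int(np.logical_and(a, b).sum()) / total


def dice_by_region(labels_a, labels_b, enhancing=1, non_enhancing=2):
    """DSC for each region; label values are hypothetical, not from the paper."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    return {
        "enhancing": dice_coefficient(labels_a == enhancing, labels_b == enhancing),
        "non_enhancing": dice_coefficient(
            labels_a == non_enhancing, labels_b == non_enhancing
        ),
        # Whole lesion is taken here as any non-background (non-zero) label.
        "whole_lesion": dice_coefficient(labels_a > 0, labels_b > 0),
    }


# Toy 3x3 label maps: segmentation from original inputs vs. synthesized inputs.
labels_ref = np.array([[0, 1, 1], [0, 2, 2], [0, 0, 0]])
labels_syn = np.array([[0, 1, 0], [0, 2, 2], [0, 0, 2]])
scores = dice_by_region(labels_ref, labels_syn)

# Toy 2x2 images with intensities assumed to lie in [0, 1].
toy_mse = mean_squared_error([[0.0, 0.5], [1.0, 1.0]], [[0.0, 0.5], [1.0, 0.0]])
```

The SSI is omitted from this sketch because it depends on window and dynamic-range settings that the abstract does not report; a library implementation such as `skimage.metrics.structural_similarity` from scikit-image could be used for it, with those settings chosen explicitly.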