Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
Department of Computational and Applied Mathematics, Rice University, Houston, Texas, USA.
Med Phys. 2024 Jul;51(7):4898-4906. doi: 10.1002/mp.17059. Epub 2024 Apr 19.
Magnetic resonance imaging (MRI) scans are known to suffer from a variety of acquisition artifacts as well as equipment-based variations that impact image appearance and segmentation performance. It is still unclear whether a direct relationship exists between magnetic resonance (MR) image quality metrics (IQMs) (e.g., signal-to-noise ratio, contrast-to-noise ratio) and segmentation accuracy.
Deep learning (DL) approaches have shown significant promise for automated segmentation of brain tumors on MRI but depend on the quality of input training images. We sought to evaluate the relationship between IQMs of input training images and DL-based brain tumor segmentation accuracy toward developing more generalizable models for multi-institutional data.
We trained a 3D DenseNet model on the BraTS 2020 cohorts to segment tumor subregions on MRI: enhancing tumor (ET), peritumoral edema, and necrotic and non-enhancing tumor. Performance was quantified via the 5-fold cross-validated Dice coefficient. MRI scans were evaluated with the open-source quality control tool MRQy to yield 13 IQMs per scan. The Pearson correlation coefficient was computed between whole tumor (WT) Dice values and IQM measures in the training cohorts to identify the quality measures most correlated with segmentation performance. Each selected IQM was used to group MRI scans as "better" quality (BQ) or "worse" quality (WQ) via relative thresholding. Segmentation performance was re-evaluated for the DenseNet model when (i) training on BQ images with validation on WQ images and (ii) training on WQ images with validation on BQ images. Trends were further validated on independent test sets derived from the BraTS 2021 training cohorts.
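The three quantitative steps described above (Dice overlap, Pearson correlation between Dice and an IQM, and splitting scans into BQ and WQ groups) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, and the abstract does not state how the relative threshold was chosen, so the median split used here is an assumption made only for illustration. Whether a higher value means better quality depends on the IQM (for example, lower inhomogeneity is usually better), so that direction is left as a parameter.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / total)


def pearson_r(x, y):
    """Pearson correlation between per-scan Dice values and one IQM."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


def split_by_iqm(iqm_values, higher_is_better=True):
    """Return boolean masks (better, worse) for one IQM.

    ASSUMPTION: the threshold is the cohort median; the source only
    says "relative thresholding" and does not give the rule.
    """
    values = np.asarray(iqm_values, dtype=float)
    above = values >= np.median(values)
    better = above if higher_is_better else ~above
    return better, ~better
```

In use, one would compute a WT Dice value per scan, correlate those values against each of the 13 IQMs with `pearson_r`, and then apply `split_by_iqm` to the most correlated IQMs to form the BQ and WQ training and validation groups.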
For this study, multimodal MRI scans from the BraTS 2020 training cohorts were used to train the segmentation model, which was then validated on independent test sets derived from the BraTS 2021 cohort. Among the selected IQMs, models trained on BQ images as defined by inhomogeneity measurements (coefficient of variance, coefficient of joint variation, coefficient of variation of the foreground patch) and models trained on WQ images as defined by the noise measurement peak signal-to-noise ratio (SNR) yielded significantly improved tumor segmentation accuracy compared with their inverse models.
Our results suggest that a significant correlation may exist between specific MR IQMs and DenseNet-based brain tumor segmentation performance. Selecting MRI scans for model training based on IQMs may yield models that are more accurate and generalize better to unseen validation data.