Marchetto Elisa, Eichhorn Hannah, Gallichan Daniel, Schnabel Julia A, Ganz Melanie
Bernard and Irene Schwartz Center for Biomedical Imaging, Dept. of Radiology, NYU School of Medicine, NY, USA.
Center for Advanced Imaging Innovation and Research (CAIR), Dept. of Radiology, NYU School of Medicine, NY, USA.
ArXiv. 2024 Dec 24:arXiv:2412.18389v1.
Reliable image quality assessment is crucial for evaluating new motion correction methods for magnetic resonance imaging. In this work, we compare the performance of commonly used reference-based and reference-free image quality metrics on a unique dataset with real motion artifacts. We further analyze the image quality metrics' robustness to typical pre-processing techniques.
We compared five reference-based and five reference-free image quality metrics on data acquired with and without intentional motion (2D and 3D sequences). The metrics were recalculated seven times with varying pre-processing steps. The anonymized images were rated by radiologists and radiographers on a 1-5 Likert scale. Spearman correlation coefficients were computed to assess the relationship between image quality metrics and observer scores.
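The correlation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `average_ranks` and `spearman` and the example numbers are hypothetical, and the coefficient is computed as the Pearson correlation of tie-averaged ranks, which is the standard definition of Spearman's rho.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run [start, end] over all entries tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: one metric value and one observer score (1-5) per image.
metric_values = [0.95, 0.90, 0.80, 0.72, 0.60, 0.55]
observer_scores = [5, 5, 4, 3, 2, 1]
rho = spearman(metric_values, observer_scores)
print(f"Spearman rho = {rho:.3f}")
```

Rank-based correlation suits this setting because Likert scores are ordinal and contain many ties, so only the ordering of the images is compared, not the numeric spacing between scores.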
All reference-based image quality metrics showed strong correlation with observer assessments, with minor performance variations across sequences. Among the reference-free metrics, Average Edge Strength gave the most promising results, consistently showing stronger correlations across all sequences than the other reference-free metrics. Overall, correlations were strongest when percentile normalization was applied and the metric values were restricted to the skull-stripped brain region. In contrast, correlations were weaker when no brain mask was applied and when min-max or no normalization was used.
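To make the two pre-processing choices above concrete, the following sketch normalizes intensities by percentiles taken inside a brain mask and then evaluates a metric only over masked voxels. It rests on several assumptions not stated in the abstract: the 1st and 99th percentiles, the clipping to [0, 1], the function names, and the use of mean squared error as a simple stand-in for a reference-based metric are all illustrative choices.

```python
import numpy as np


def percentile_normalize(image, mask, lower=1.0, upper=99.0):
    """Rescale so the chosen percentiles inside the mask map to 0 and 1.

    The percentile values are assumptions made for this sketch.
    """
    lo, hi = np.percentile(image[mask], [lower, upper])
    if hi <= lo:
        raise ValueError("Masked region has no intensity range to normalize.")
    # Clipping limits the influence of outlier intensities (assumed choice).
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)


def masked_mse(image, reference, mask):
    """Mean squared error computed only over voxels where mask is True."""
    diff = image[mask] - reference[mask]
    return float(np.mean(diff ** 2))


# Synthetic data standing in for a motion-free reference and a degraded image.
rng = np.random.default_rng(0)
reference = rng.random((16, 16))
degraded = reference + 0.1 * rng.standard_normal((16, 16))

# Hypothetical brain mask: a central square instead of a real skull-stripping result.
brain_mask = np.zeros((16, 16), dtype=bool)
brain_mask[4:12, 4:12] = True

ref_norm = percentile_normalize(reference, brain_mask)
deg_norm = percentile_normalize(degraded, brain_mask)
print(f"Masked MSE after percentile normalization: {masked_mse(deg_norm, ref_norm, brain_mask):.4f}")
```

Unlike min-max scaling, percentile-based scaling does not depend on the single brightest or darkest voxel, and restricting the computation to the mask keeps background and non-brain tissue from contributing to the metric value.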
Reference-based metrics reliably correlate with radiological evaluation across different sequences and datasets. Pre-processing steps, particularly normalization and brain masking, significantly influence the correlation values. Future research should focus on refining pre-processing techniques and exploring machine learning approaches for automated image quality evaluation.