Acibadem Mehmet Ali Aydinlar University, School of Medicine, Department of Radiology, Istanbul, 34457, Turkey.
Cumhuriyet University, School of Medicine, Sivas, 581407, Turkey.
Eur J Radiol. 2023 Aug;165:110924. doi: 10.1016/j.ejrad.2023.110924. Epub 2023 Jun 11.
Although systems such as Prostate Imaging Quality (PI-QUAL) have been proposed for quality assessment, visual evaluations by human readers remain somewhat inconsistent, particularly among less-experienced readers.
This study aimed to assess the feasibility of deep learning (DL) for the automated assessment of image quality in bi-parametric prostate MRI scans and to compare its performance with that of less-experienced readers.
We used bi-parametric prostate MRI scans from the PI-CAI dataset. Image quality was rated on a 3-point Likert scale (poor, moderate, excellent). Three expert readers established the ground-truth labels for the development set (n = 500) and the testing set (n = 100). We trained a 3D DL model on the development set using probabilistic prostate masks and an ordinal loss function. Four less-experienced readers scored the testing set for performance comparison.
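The abstract does not say which ordinal loss was used. As an illustration only, the sketch below shows one common formulation for a 3-point scale: the label is encoded as cumulative binary targets ("better than poor", "better than moderate"), and the loss is the mean binary cross-entropy over those two threshold outputs. The function names, the encoding of poor/moderate/excellent as 0/1/2, and the choice of this particular formulation are all assumptions, not details from the study.

```python
import math

def ordinal_targets(label, num_classes=3):
    """Encode an ordinal label as cumulative binary targets.

    Assumed encoding (not from the source): 0 = poor, 1 = moderate, 2 = excellent.
    Label k becomes [k > 0, k > 1, ...], e.g. 1 -> [1.0, 0.0].
    """
    return [1.0 if label > t else 0.0 for t in range(num_classes - 1)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ordinal_loss(logits, label, num_classes=3):
    """Mean binary cross-entropy over the K-1 threshold logits.

    This is one common ordinal formulation, used here as a stand-in;
    the study's actual loss may differ.
    """
    targets = ordinal_targets(label, num_classes)
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps))
    return total / len(targets)

def predict(logits):
    """Predicted class = number of thresholds whose probability exceeds 0.5."""
    return sum(1 for z in logits if sigmoid(z) > 0.5)
```

With this encoding, a prediction that is off by two categories is penalised on both thresholds, while a prediction that is off by one is penalised on only one, which is the property that distinguishes an ordinal loss from plain categorical cross-entropy.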
The kappa scores between the DL model and the expert consensus for T2W images and ADC maps were 0.42 and 0.61, representing moderate and good levels of agreement, respectively. The kappa scores between the less-experienced readers and the expert consensus ranged from 0.39 to 0.56 (fair to moderate) for T2W images and from 0.39 to 0.62 (fair to good) for ADC maps.
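To make the agreement figures concrete, the following is a minimal sketch of unweighted Cohen's kappa between two sets of ratings, together with a helper that maps a kappa value to a verbal band. The abstract does not state whether the reported kappa was weighted or unweighted, nor which interpretation scale was used; the cut-offs below are a commonly used convention that happens to match the labels reported above, and both function names are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal frequencies, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

def agreement_band(kappa):
    """Map kappa to a verbal band (assumed cut-offs, not taken from the source)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"
```

For example, `agreement_band(0.42)` returns `"moderate"` and `agreement_band(0.61)` returns `"good"`, in line with the descriptions given for the T2W and ADC results.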
DL can offer performance comparable to that of less-experienced readers when assessing image quality in bi-parametric prostate MRI, making it a viable option for an automated quality assessment tool. We suggest that DL models trained on more representative datasets, annotated by a larger group of experts, could yield reliable image quality assessment and potentially replace or assist visual evaluations by human readers.