Department of Radiology, Mayo Clinic, Rochester, Minnesota, USA.
Med Phys. 2022 Jan;49(1):70-83. doi: 10.1002/mp.15362. Epub 2021 Dec 1.
Conventional model observers (MOs) in CT are often limited to a uniform background or to a varying background that is random and can be modeled analytically. It is unclear whether these conventional MOs can be readily generalized to predict human observer performance in clinical CT tasks that involve realistic anatomical backgrounds. Deep-learning-based model observers (DL-MOs) have recently been developed but have not been validated for challenging low-contrast diagnostic tasks in abdominal CT. We therefore sought to validate a DL-MO for a low-contrast hepatic metastasis localization task.
We adapted our recently developed DL-MO framework to the liver metastasis localization task. Our previously validated projection-domain lesion- and noise-insertion techniques were applied to archived patient projection data to synthesize realistic lesion-present and low-dose abdominal CT exams. Ten experimental conditions were generated, covering different lesion sizes and contrasts, radiation dose levels, and image reconstruction types. Each condition included 100 trials generated from a patient cohort of 7 cases. Each trial was presented as a liver image patch (160×160×5 voxels). DL-MO performance was calculated for each condition and compared with human observer performance, which was obtained from three subspecialized radiologists in an observer study. The performance of the DL-MO and the radiologists was measured by the area under the localization receiver-operating-characteristic (LROC) curve. The generalization performance of the DL-MO was estimated with repeated twofold cross-validation over the same set of trials used in the human observer study. A multi-slice channelized Hotelling observer (CHO) was compared with the DL-MO across the same experimental conditions.
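As an illustration of the figure of merit named above, the following is a minimal sketch of a nonparametric (Wilcoxon-type) estimate of the area under the LROC curve. The abstract does not state which estimator was used or how trials were split into lesion-present and lesion-absent cases, so the estimator, the function name `lroc_auc`, and the toy inputs are all assumptions made for illustration.

```python
import numpy as np

def lroc_auc(present_scores, localized_correctly, absent_scores):
    """Empirical area under the LROC curve (hypothetical helper).

    present_scores      : confidence ratings on lesion-present trials
    localized_correctly : True where the lesion was localized correctly
    absent_scores       : confidence ratings on lesion-absent trials
    """
    present = np.asarray(present_scores, dtype=float)
    correct = np.asarray(localized_correctly, dtype=bool)
    absent = np.asarray(absent_scores, dtype=float)

    # Compare every lesion-present rating with every lesion-absent rating.
    diff = present[:, None] - absent[None, :]
    # A pair scores 1 if the present trial is rated higher, 0.5 on a tie.
    wins = (diff > 0).astype(float) + 0.5 * (diff == 0)
    # Pairs only count when the lesion was also localized correctly.
    wins *= correct[:, None]
    return float(wins.mean())

# Toy ratings, invented for illustration only (not study data).
auc = lroc_auc([0.9, 0.8, 0.3], [True, False, True], [0.1, 0.5])
```

Under this definition a trial with an incorrect localization contributes nothing regardless of its rating, which is what distinguishes the LROC area from the ordinary ROC area.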
The performance of the DL-MO was highly correlated with that of the radiologists (Pearson's correlation coefficient: 0.987; 95% CI: [0.942, 0.997]). The performance level of the DL-MO was comparable to that of the grouped radiologists: the mean performance difference was -3.3%. The CHO performed worse than the grouped radiologists even before any internal noise was added. The correlation between the CHO and the radiologists was weaker (Pearson's correlation coefficient: 0.812; 95% CI: [0.378, 0.955]), and the corresponding performance bias (-29.5%) was statistically significant.
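The confidence intervals on the correlation coefficients can be sanity-checked with the Fisher z-transformation. The sketch below assumes that method and takes n = 10, one point per experimental condition. The abstract does not say how the intervals were actually computed, and the helper name `fisher_ci` is hypothetical.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson r via Fisher's z (assumed method)."""
    z = math.atanh(r)              # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper

# Reported correlations, with n = 10 conditions assumed.
dl_mo_ci = fisher_ci(0.987, 10)
cho_ci = fisher_ci(0.812, 10)
```

With these assumptions the intervals come out close to, but not exactly equal to, the reported ones, so the authors may have used a different interval method.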
The presented study demonstrated the potential of using the DL-MO for image quality assessment in patient abdominal CT tasks.