Department of Radiology, Mayo Clinic, Rochester, MN, 55920, USA.
Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55920, USA.
Med Phys. 2017 Oct;44(10):e339-e352. doi: 10.1002/mp.12345.
To estimate and compare, using common datasets, the diagnostic performance of image-based denoising techniques or iterative reconstruction algorithms for the task of detecting hepatic metastases.
Datasets from contrast-enhanced CT scans of the liver were provided to participants in an NIH-, AAPM- and Mayo Clinic-sponsored Low Dose CT Grand Challenge. Training data included full-dose and quarter-dose scans of the ACR CT accreditation phantom and 10 patient examinations; both images and projections were provided in the training data. Projection data were supplied in a vendor-neutral standardized format (DICOM-CT-PD). Twenty quarter-dose patient datasets were provided to each participant for testing the performance of their technique. Images were provided to sites intending to perform denoising in the image domain. Fully preprocessed projection data and statistical noise maps were provided to sites intending to perform iterative reconstruction. Upon return of the denoised or iteratively reconstructed quarter-dose images, randomized, blinded evaluation of the cases was performed using a Latin Square study design by 11 senior radiology residents or fellows, who marked the locations of identified hepatic metastases. Markings were scored against reference locations of clinically or pathologically demonstrated metastases to determine a per-lesion normalized score and a per-case normalized score (a faculty abdominal radiologist established the reference location using clinical and pathological information). Scores increased for correct detections; scores decreased for missed or incorrect detections. The winner for the competition was the entry that produced the highest total score (mean of the per-lesion and per-case normalized score). Reader confidence was used to compute a Jackknife alternative free-response receiver operating characteristic (JAFROC) figure of merit, which was used for breaking ties.
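The scoring arithmetic described above can be illustrated with a minimal sketch. The abstract states only that scores rose for correct detections, fell for missed or incorrect detections, and that the total score was the mean of the per-lesion and per-case normalized scores. The unit weights (+1 and -1), the normalization constant `max_points`, and the names `Tally`, `normalized_score`, and `total_score` are assumptions made for illustration, not the challenge's actual scoring rules.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Reader markings for one submission (hypothetical structure)."""
    correct: int     # marks that match a reference metastasis
    missed: int      # reference metastases that received no mark
    incorrect: int   # marks that match no reference metastasis
    max_points: int  # assumed normalizer: best achievable raw score


def normalized_score(t: Tally) -> float:
    """Return a percentage score; assumes +1 per correct detection
    and -1 per missed or incorrect detection (illustrative weights)."""
    if t.max_points <= 0:
        raise ValueError("max_points must be positive")
    raw = t.correct - t.missed - t.incorrect
    return 100.0 * raw / t.max_points


def total_score(per_lesion_pct: float, per_case_pct: float) -> float:
    """Total score as stated in the abstract: the mean of the
    per-lesion and per-case normalized scores."""
    return (per_lesion_pct + per_case_pct) / 2.0


# Made-up numbers, not challenge data.
per_lesion = normalized_score(Tally(correct=6, missed=3, incorrect=5, max_points=10))
per_case = normalized_score(Tally(correct=7, missed=2, incorrect=1, max_points=10))
print(per_lesion, per_case, total_score(per_lesion, per_case))
```

Under these assumed weights, penalties can outweigh credits, so a normalized score can be negative, which is consistent with the negative per-lesion scores reported in the results.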
A total of 103 participants from 90 sites and 26 countries registered to participate. Training data were shared with the 77 sites that completed the data sharing agreements. Subsequently, 41 sites downloaded the 20 test cases, which included only the 25% dose data (CTDIvol = 3.0 ± 1.8 mGy, SSDE = 3.5 ± 1.3 mGy). Twenty-two sites submitted results for evaluation. One site provided binary images and one site provided images with severe artifacts; cases from these two sites were excluded from review and the participants were removed from the challenge, leaving 20 sites. The mean (range) per-lesion and per-case normalized scores were -24.2% (-75.8%, 3%) and 47% (10%, 70%), respectively. Compared to reader results for commercially reconstructed quarter-dose images with no noise reduction, 11 of the 20 sites showed a numeric improvement in the mean JAFROC figure of merit. Notably, two sites performed comparably to the reader results for full-dose commercial images. The study was not designed for these comparisons, so wide confidence intervals surrounded these figures of merit and the results should be used only to motivate future testing.
Infrastructure and methodology were developed to rapidly estimate observer performance for liver metastasis detection in low-dose CT examinations of the liver after either image-based denoising or iterative reconstruction. The results demonstrated large differences in detection and classification performance between noise reduction methods, although the majority of methods provided some improvement in performance relative to the commercial quarter-dose images with no noise reduction applied.