Department of Radiology, Division of Abdominal Imaging, Massachusetts General Hospital, 55 Fruit St, Boston, MA 02114.
AJR Am J Roentgenol. 2021 Jul;217(1):141-151. doi: 10.2214/AJR.20.24199. Epub 2020 Sep 9.
PI-RADS version 2.1 (v2.1) modifications primarily address transition zone (TZ) interpretation. The revisions also impact peripheral zone (PZ) interpretation, which has received less attention. The purpose of this study was to compare interobserver agreement of PI-RADS version 2 (v2) and v2.1 in the prostate PZ and TZ and perform a pilot comparison of their diagnostic performance in the two zones. Six radiologists with varying experience retrospectively assessed 80 prostate lesions (40 PZ, 40 TZ) on MRI in separate sessions for PI-RADS v2 and v2.1. Interobserver agreement was assessed using Conger kappa (κ). For 50 lesions with pathology data, average AUC for detecting clinically significant cancer was compared between versions using multireader multicase statistical methods. Error variance and covariance results informed post hoc power analysis. Interobserver agreement for PI-RADS category 4 or greater was higher for version 2.1 (κ = 0.64) than version 2 (κ = 0.51) in the PZ, but similar for version 2 (κ = 0.64) and version 2.1 (κ = 0.60) in the TZ. The PI-RADS v2.1 DWI descriptor "linear/wedge-shaped" had higher agreement than its predecessor version 2 descriptor "indistinct hypointense" (κ = 0.52 vs κ = 0.18) and yielded 14 more true-negative versus five more false-negative interpretations. The ADC signal descriptor "markedly hypointense," for which only version 2.1 provides a specific definition, had lower agreement in version 2.1 (κ = 0.26) than version 2 (κ = 0.52). Modified TZ T2-weighted category 2 descriptors in version 2.1 had fair agreement (κ = 0.21), and agreement for PI-RADS category 2 in the TZ was lower in version 2.1 (κ = 0.31) than version 2 (κ = 0.57). DWI upgraded a TZ lesion category from 2 to 3 in four patients, detecting two additional cancers. 
Average AUC was not different between versions 2 and 2.1 for the PZ (AUC, 0.81 vs 0.85; p = .24) or the TZ (AUC, 0.69 vs 0.69; p = .94), though among experienced readers AUC was higher for version 2.1 than version 2 for the PZ (0.91 vs 0.82; p = .001). Overall performance comparison had sufficient power (0.8) to detect a 0.085 difference in AUC. Interobserver agreement improved using PI-RADS v2.1 in the PZ but not the TZ. Diagnostic performance improved using version 2.1 only in the PZ for experienced readers. Specific version 2.1 modifications yielded mixed results. The impact of PI-RADS v2.1 in the PZ is notable given the emphasis on version 2.1 TZ modifications. The findings suggest areas in which additional modification could further improve interobserver agreement and performance.
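As an illustration of the agreement statistic named above, the following is a minimal sketch of a multirater Conger kappa, assuming every reader rates every lesion and the ratings are categorical (for example, PI-RADS category 4 or greater versus less than 4). The function name `conger_kappa` and the toy ratings are hypothetical and are not taken from the study; the study's own software and data are not described here.

```python
from itertools import combinations


def conger_kappa(ratings):
    """Conger's multirater kappa for a complete lesions-by-readers table.

    ratings[i][g] is the category reader g assigned to lesion i.
    Observed and chance agreement are both averaged over all reader pairs,
    with chance agreement built from each reader's own marginal proportions.
    """
    n_lesions = len(ratings)
    n_readers = len(ratings[0])
    categories = sorted({c for row in ratings for c in row})
    pairs = list(combinations(range(n_readers), 2))

    # Observed agreement: mean over reader pairs of the fraction of lesions
    # on which the two readers chose the same category.
    p_obs = sum(
        sum(row[g] == row[h] for row in ratings) / n_lesions for g, h in pairs
    ) / len(pairs)

    # Marginal category proportions for each reader.
    marginals = [
        {k: sum(row[g] == k for row in ratings) / n_lesions for k in categories}
        for g in range(n_readers)
    ]

    # Chance agreement: mean over reader pairs of the probability that two
    # readers with those marginals would agree by chance.
    p_exp = sum(
        sum(marginals[g][k] * marginals[h][k] for k in categories)
        for g, h in pairs
    ) / len(pairs)

    if p_exp == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_obs - p_exp) / (1.0 - p_exp)


# Made-up example: 4 lesions, 2 readers, 1 = "category 4 or greater".
toy = [[1, 1], [1, 0], [0, 0], [0, 0]]
kappa = conger_kappa(toy)
```

With two readers this reduces to Cohen kappa; with more readers it differs from Fleiss kappa in that chance agreement uses reader-specific marginals rather than pooled ones.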