Departments of Radiology, Epidemiology and Biostatistics, and Medicine, Memorial Sloan-Kettering Cancer Center, 1275 York Ave, New York, NY 10021.
Radiology. 2013 Nov;269(2):451-9. doi: 10.1148/radiology.13122665. Epub 2013 Jul 3.
To assess variability of computed tomographic (CT) measurements of lesions of various sizes and margin sharpness in several organs taken by readers with different levels of experience, as would be found in routine clinical practice.
In this institutional review board-approved, HIPAA-compliant retrospective study, 17 radiologists with varying levels of experience independently obtained bidimensional orthogonal axial measurements of 80 lymph nodes, 120 pulmonary lesions, and 120 hepatic lesions, categorized by size and margin sharpness. Repeat measurements were performed 2 or more weeks later. Intraclass correlation coefficients and Bland-Altman plots were used to assess intra- and interobserver variability.
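The Bland-Altman analysis mentioned above can be sketched as follows. This is a minimal illustration, not the study's own code: it assumes percent differences are taken relative to the mean of each measurement pair and that the 95% limits of agreement are computed as bias ± 1.96 standard deviations, neither of which the abstract states. The function names and the measurements are made up for the example.

```python
from statistics import mean, stdev


def percent_differences(first, second):
    """Percent difference of each repeat measurement, relative to the pair mean.

    Assumption: the abstract does not say which denominator the study used.
    """
    return [100.0 * (b - a) / ((a + b) / 2.0) for a, b in zip(first, second)]


def limits_of_agreement(diffs, z=1.96):
    """Return (lower limit, bias, upper limit) as bias -/+ z * sample SD."""
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias - spread, bias, bias + spread


# Hypothetical short-axis measurements in mm (not data from the study).
first_read = [8.0, 9.5, 12.0, 15.5, 21.0, 24.0]
second_read = [7.6, 9.9, 11.8, 15.0, 21.4, 23.5]

lower, bias, upper = limits_of_agreement(percent_differences(first_read, second_read))
print(f"bias {bias:.1f}%, 95% limits of agreement {lower:.1f}% to {upper:.1f}%")
```

Limits computed this way are centered on the bias rather than on zero, so they can be asymmetric around zero, as the limits reported in the abstract are.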
For long- and short-axis measurements, respectively, overall intraobserver agreement rates were 0.957 (95% confidence interval [CI]: 0.947, 0.966) and 0.945 (95% CI: 0.933, 0.955); interobserver agreement rates were 0.954 (95% CI: 0.943, 0.963) and 0.941 (95% CI: 0.929, 0.951). Both intra- and interobserver agreement differed by lesion size, margin sharpness, location, and reader experience. Agreement ranged from 0.847-0.886 for lesions 20 mm or larger versus 0.745-0.785 for lesions smaller than 10 mm, 0.961-0.975 for smooth margins versus 0.924-0.942 for irregular margins, 0.955-0.970 for lung lesions versus 0.884-0.940 for lymph nodes, and 0.950-0.970 for attending radiologists versus 0.928-0.945 for fellows. Measurement variability decreased with increasing lesion size; 95% limits of agreement for short-axis measurements were -11.6% to 6.7% for lesions smaller than 10 mm versus -6.2% to 4.7% for lesions 20 mm or larger.
Overall intra- and interobserver variability rates were similar; in clinical practice, serial CT measurements can be safely performed by different radiologists. Smooth margins, larger lesion size, and greater reader experience were associated with more consistent measurements. Depending on lesion size, increases of 4%-6% or greater in long axis and 5%-7% or greater in short axis, and decreases of 6%-10% or greater in long axis and 6%-12% or greater in short axis, at CT can be considered true changes rather than measurement variation, with 95% confidence.
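As a rough illustration of how the thresholds above might be applied to a pair of serial measurements, consider the sketch below. It rests on two assumptions: the abstract does not give the exact threshold for each lesion size, so the most conservative end of each quoted range is used, and percent change is computed relative to the baseline measurement. `THRESHOLDS`, `percent_change`, and `classify_change` are hypothetical names introduced only for this example.

```python
# Conservative ends of the ranges quoted in the abstract, in percent change.
# Assumption: size-specific thresholds are not given, so one value per axis is used.
THRESHOLDS = {
    "long": {"increase": 6.0, "decrease": -10.0},
    "short": {"increase": 7.0, "decrease": -12.0},
}


def percent_change(baseline_mm, followup_mm):
    """Percent change from baseline (assumed denominator: the baseline value)."""
    if baseline_mm <= 0:
        raise ValueError("baseline measurement must be positive")
    return 100.0 * (followup_mm - baseline_mm) / baseline_mm


def classify_change(baseline_mm, followup_mm, axis="long"):
    """Label a serial change as a true change or as within measurement variability."""
    limits = THRESHOLDS[axis]
    change = percent_change(baseline_mm, followup_mm)
    if change >= limits["increase"]:
        return "true increase"
    if change <= limits["decrease"]:
        return "true decrease"
    return "within measurement variability"


print(classify_change(20.0, 22.0, axis="long"))   # +10% long axis
print(classify_change(20.0, 19.0, axis="long"))   # -5% long axis
print(classify_change(20.0, 17.0, axis="short"))  # -15% short axis
```

Using the outer end of each range errs toward labeling a change as measurement variability; a size-specific lookup would need the per-size values from the full article.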