Department of Radiology, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands.
Eur J Radiol. 2012 Oct;81(10):2543-9. doi: 10.1016/j.ejrad.2011.12.026. Epub 2012 Jan 20.
This study evaluates intra- and interobserver variability of automatic diameter and volume measurements of colorectal liver metastases (CRLM) before and after chemotherapy and its influence on response classification.
Pre- and post-chemotherapy CT scans of 33 patients with 138 CRLM were evaluated. Two observers measured all metastases three times on the pre- and post-chemotherapy CT scans, using three different techniques: manual diameter (MD), automatic diameter (AD) and automatic volume (AV). RECIST 1.0 criteria were used to define response classification. For each technique, we assessed intra- and interobserver reliability by determining the intraclass correlation coefficient (α-level 0.05). Intra-observer agreement was estimated by the coefficient of variation (%). For inter-observer agreement, the relative measurement error (%) was calculated using Bland-Altman analysis. In addition, we compared agreement in response classification by calculating kappa scores (κ) and estimating the proportion of discordant classifications between observers for each method (%).
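The two variability estimates named above can be illustrated with a minimal sketch. It assumes the variance coefficient is the usual coefficient of variation (sample standard deviation divided by the mean) and that the relative measurement error is a Bland-Altman analysis of percentage differences; the abstract does not give the exact formulas, so both are assumptions. The function names and the diameters are hypothetical and are not study data.

```python
from statistics import mean, stdev


def coefficient_of_variation(repeats):
    """Within-observer coefficient of variation (%) for repeated
    measurements of a single lesion (sample SD divided by the mean)."""
    return 100.0 * stdev(repeats) / mean(repeats)


def bland_altman_relative(obs_a, obs_b):
    """Bland-Altman analysis on relative differences (%).

    Each difference is expressed relative to the mean of the paired
    measurements. Returns the mean relative difference (bias) and the
    95% limits of agreement (bias +/- 1.96 SD).
    """
    rel = [100.0 * (a - b) / ((a + b) / 2.0) for a, b in zip(obs_a, obs_b)]
    bias = mean(rel)
    sd = stdev(rel)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical diameters in mm (illustration only, not study data).
repeats = [20.0, 21.0, 19.0]          # one observer, one lesion, three reads
cv = coefficient_of_variation(repeats)

obs_a = [20.0, 35.0, 12.0, 48.0]      # observer A, four lesions
obs_b = [21.0, 34.0, 12.5, 47.0]      # observer B, same four lesions
bias, lower, upper = bland_altman_relative(obs_a, obs_b)

print(f"intra-observer CV: {cv:.2f}%")
print(f"inter-observer bias: {bias:.2f}% (limits {lower:.2f}% to {upper:.2f}%)")
```

In a full analysis the coefficient of variation would be computed per lesion and then summarized across all lesions; how the study pooled these values is not stated in the abstract.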
Intra-observer variability was 6.05%, 4.28% and 12.72% for MD, AD and AV, respectively. Inter-observer variability was 4.23%, 2.02% and 14.86% for MD, AD and AV, respectively. Chemotherapy marginally affected these estimates. Agreement in response classification did not improve using AD or AV (MD κ=0.653, AD κ=0.548, AV κ=0.548) and substantial discordance between observers was observed with all three methods (MD 17.8%, AD 22.2%, AV 22.2%).
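To make the link between measurement variability and response classification concrete, the following sketch classifies hypothetical patients with the RECIST 1.0 diameter thresholds (at least a 30% decrease in the sum of longest diameters for partial response, at least a 20% increase for progressive disease) and then computes Cohen's kappa and the percentage of discordant classifications between two observers. It is a simplification under stated assumptions: complete response, new lesions and the nadir reference for progression are ignored, the unweighted form of kappa is assumed, and all names and numbers are hypothetical.

```python
from collections import Counter


def recist_response(baseline_sum, followup_sum):
    """Classify response from the relative change in the sum of longest
    diameters (simplified RECIST 1.0; complete response is omitted)."""
    change = (followup_sum - baseline_sum) / baseline_sum
    if change <= -0.30:
        return "PR"   # partial response
    if change >= 0.20:
        return "PD"   # progressive disease
    return "SD"       # stable disease


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two raters over the same cases."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / n**2
    if expected == 1.0:
        return 1.0    # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


def discordance(labels_a, labels_b):
    """Percentage of cases classified differently by the two raters."""
    return 100.0 * sum(a != b for a, b in zip(labels_a, labels_b)) / len(labels_a)


# Hypothetical (baseline, follow-up) diameter sums in mm per patient.
observer_1 = [(100, 65), (80, 78), (50, 62), (120, 85), (60, 58)]
observer_2 = [(102, 66), (81, 80), (51, 60), (118, 82), (61, 59)]

labels_1 = [recist_response(b, f) for b, f in observer_1]
labels_2 = [recist_response(b, f) for b, f in observer_2]

print(labels_1, labels_2)
print(f"kappa = {cohen_kappa(labels_1, labels_2):.3f}")
print(f"discordance = {discordance(labels_1, labels_2):.1f}%")
```

In this toy data the two observers differ by only a few millimetres per patient, yet two of the five patients fall on opposite sides of a threshold, which shows how low measurement variability can coexist with discordant classifications.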
Semi-automatic software allows repeatable and reproducible measurement of both the diameter and the volume of CRLM, but does not reduce variability in response classification.