Bashir Usman, Azad Gurdip, Siddique Muhammad Musib, Dhillon Saana, Patel Nikheel, Bassett Paul, Landau David, Goh Vicky, Cook Gary
Cancer Imaging Department, Division of Imaging Sciences and Biomedical Engineering, King's College London, London, SE1 7EH, UK.
Stats Consultancy Ltd, 40 Longwood Lane, Amersham, Bucks, HP7 9EN, UK.
EJNMMI Res. 2017 Dec;7(1):60. doi: 10.1186/s13550-017-0310-3. Epub 2017 Jul 26.
Measures of tumour heterogeneity derived from 18-fluoro-2-deoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) scans are increasingly reported as potential biomarkers of non-small cell lung cancer (NSCLC) for classification and prognostication. Several segmentation algorithms have been used to delineate tumours, but their effects on the reproducibility and on the predictive and prognostic capability of the derived parameters have not been evaluated. The purpose of our study was to retrospectively compare segmentation algorithms in terms of the inter-observer reproducibility and prognostic capability of texture parameters derived from 18F-FDG PET/CT images of NSCLC. Fifty-three NSCLC patients (mean age 65.8 years; 31 males) underwent pre-chemoradiotherapy 18F-FDG PET/CT scans. Three readers segmented tumours using freehand (FH), 40% of maximum intensity threshold (40P), and fuzzy locally adaptive Bayesian (FLAB) algorithms. The intraclass correlation coefficient (ICC) was used to measure the inter-observer variability of the texture features derived with each of the three segmentation algorithms. For each segmentation algorithm, univariate Cox regression was used to test 12 commonly reported texture features as predictors of overall survival (OS). Model quality was compared across segmentation algorithms using the Akaike information criterion (AIC).
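As an illustration of the 40P approach and of one of the texture features named above, the following is a minimal sketch, not the study's implementation. It assumes the threshold is applied to the whole toy volume (in practice it would normally be restricted to a reader-defined region around the tumour), and the function names, the 64-bin histogram, and the synthetic uptake values are all hypothetical choices made for the example.

```python
import numpy as np

def segment_40p(uptake, fraction=0.40):
    """Boolean mask of voxels at or above `fraction` of the maximum uptake.

    Assumption: the maximum is taken over the whole array passed in.
    """
    uptake = np.asarray(uptake, dtype=float)
    return uptake >= fraction * uptake.max()

def first_order_entropy(values, n_bins=64):
    """Shannon entropy (in bits) of the intensity histogram of `values`.

    Assumption: a fixed number of equal-width bins; the bin count is illustrative.
    """
    counts, _ = np.histogram(values, bins=n_bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Synthetic uptake volume: low background with a 4x4x4 high-uptake "tumour".
rng = np.random.default_rng(0)
volume = rng.uniform(0.5, 2.0, size=(16, 16, 16))
volume[6:10, 6:10, 6:10] = rng.uniform(6.0, 12.0, size=(4, 4, 4))

mask = segment_40p(volume)                      # voxels kept by the 40% threshold
entropy = first_order_entropy(volume[mask])     # feature computed inside the mask
print(int(mask.sum()), round(entropy, 3))
```

Because the threshold depends only on the maximum voxel value, different readers who enclose the same hottest voxel obtain the same mask, which is one plausible reason a fixed-threshold method might be more reproducible than freehand contouring.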
40P was the most reproducible algorithm (median ICC 0.9; interquartile range [IQR] 0.85-0.92) compared with FLAB (median ICC 0.83; IQR 0.77-0.86) and FH (median ICC 0.77; IQR 0.7-0.85). On univariate Cox regression analysis, 2 of the 12 variables, i.e. first-order entropy and grey-level co-occurrence matrix (GLCM) entropy, were significantly associated with OS when 40P was used; with FH and with FLAB, only 1 variable, i.e. first-order entropy, was significantly associated with OS. For each tested variable, survival models for all three segmentation algorithms were of similar quality, exhibiting comparable AIC values with overlapping 95% confidence intervals (CIs).
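For readers unfamiliar with the ICC, the sketch below shows how an inter-observer ICC for a single texture feature could be computed from a tumours-by-readers table. The abstract does not state which ICC form was used; this example assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name and the feature values are hypothetical and are not data from the study.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` has one row per tumour and one column per reader.
    Assumption: this ICC form is chosen for illustration only.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape                      # n tumours, k readers
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between tumours
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between readers
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Made-up entropy values for five tumours, each segmented by three readers.
feature = [
    [4.10, 4.15, 4.08],
    [5.32, 5.30, 5.41],
    [3.75, 3.80, 3.70],
    [4.90, 4.85, 4.95],
    [5.05, 5.10, 5.00],
]
print(round(icc_2_1(feature), 3))
```

Repeating this calculation for each texture feature and each segmentation algorithm would give one ICC per feature, from which a median and IQR per algorithm can be summarised.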
Compared with both FLAB and FH, segmentation with 40P yields superior inter-observer reproducibility of texture features. Survival models generated by all three segmentation algorithms are of at least equivalent utility. Our findings suggest that a segmentation algorithm using a threshold of 40% of maximum intensity is acceptable for texture analysis of 18F-FDG PET in NSCLC.