Chalkidou Anastasia, O'Doherty Michael J, Marsden Paul K
Division of Imaging Sciences and Biomedical Engineering, King's College London, 4th Floor, Lambeth Wing, St. Thomas' Hospital, London SE1 7EH, United Kingdom.
PLoS One. 2015 May 4;10(5):e0124165. doi: 10.1371/journal.pone.0124165. eCollection 2015.
A number of recent publications have proposed that a family of image-derived indices, called texture features, can predict clinical outcome in patients with cancer. However, the investigation of multiple indices on a single data set can lead to significant inflation of type-I errors. We report a systematic review of the type-I error inflation in such studies and review the evidence regarding associations between patient outcome and texture features derived from positron emission tomography (PET) or computed tomography (CT) images.
To identify studies, PubMed and Scopus were searched (January 2000 to September 2013) using combinations of the keywords texture, prognostic, predictive and cancer. Studies were divided into three categories according to the source of the type-I error inflation and whether an independent validation dataset was used. For each study, the true type-I error probability and the adjusted level of significance were estimated using the optimum cut-off approach correction and the Benjamini-Hochberg method. To demonstrate the variable selection bias in these studies explicitly, we re-analyzed data from one of the published studies, substituting 100 random variables for the original image-derived indices. The significance of the random variables as potential predictors of outcome was then examined using the analysis methods applied in the identified studies.
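As an illustration of the two quantities named above, the following minimal Python sketch computes the textbook family-wise type-I error probability for m independent tests, 1 - (1 - alpha)^m, and applies the standard Benjamini-Hochberg step-up rule to a list of p-values. The function names and the example p-values are hypothetical, and the independence formula is an assumption made for illustration; the abstract does not state the exact estimator used in the review.

```python
def familywise_error(alpha, m):
    """Probability of at least one false positive across m independent tests.

    Assumes independence between tests (an illustrative simplification).
    """
    return 1.0 - (1.0 - alpha) ** m


def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking hypotheses rejected at false discovery rate q."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for ten candidate indices tested on one data set.
    example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
                 0.060, 0.074, 0.205, 0.212, 0.216]
    print("Family-wise error for 10 tests:", familywise_error(0.05, 10))
    print("Rejected under BH:", benjamini_hochberg(example_p))
```

In this hypothetical example, five of the ten p-values fall below an unadjusted 0.05 threshold, but only the two smallest survive the Benjamini-Hochberg adjustment.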
Fifteen studies were identified. After appropriate statistical corrections were applied, the average type-I error probability was estimated at 76% (range: 34-99%), and the majority of published results did not reach statistical significance. Only 3 of the 15 studies used a validation dataset. Of the 100 random variables examined, 10% proved to be significant predictors of survival when subjected to ROC and multiple hypothesis testing analysis.
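The effect of testing random variables can be sketched with a small simulation. The Python example below is a simplified stand-in, not the analysis used in the study: it assumes a binary outcome and a 2x2 chi-square test instead of ROC and survival analysis, and all names and parameters (`simulate`, `min_group`, the cohort size of 40) are hypothetical. For each pure-noise variable it compares a single pre-specified median split with an "optimum" cut-off chosen as the split giving the smallest p-value.

```python
import math
import random


def chi2_p_value(a, b, c, d):
    """p-value of a 2x2 chi-square test (1 d.o.f., no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a degenerate table carries no evidence
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 d.o.f.: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def split_p_values(values, outcomes, min_group=5):
    """Map each split position k to the p-value of 'lowest k' vs 'the rest'."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    sorted_outcomes = [outcomes[i] for i in order]
    total_events = sum(sorted_outcomes)
    result = {}
    for k in range(min_group, n - min_group + 1):
        low_events = sum(sorted_outcomes[:k])
        high_events = total_events - low_events
        result[k] = chi2_p_value(low_events, k - low_events,
                                 high_events, (n - k) - high_events)
    return result


def simulate(n_patients=40, n_variables=100, alpha=0.05, seed=0):
    """Fraction of noise variables called significant: (median split, optimum split)."""
    rng = random.Random(seed)
    outcomes = [rng.randint(0, 1) for _ in range(n_patients)]
    fixed_hits = 0
    optimum_hits = 0
    for _ in range(n_variables):
        values = [rng.random() for _ in range(n_patients)]  # unrelated to outcome
        p_by_split = split_p_values(values, outcomes)
        if p_by_split[n_patients // 2] < alpha:   # one pre-specified cut-off
            fixed_hits += 1
        if min(p_by_split.values()) < alpha:      # cut-off picked after the fact
            optimum_hits += 1
    return fixed_hits / n_variables, optimum_hits / n_variables


if __name__ == "__main__":
    fixed, optimum = simulate()
    print("Significant with median cut-off: ", fixed)
    print("Significant with optimum cut-off:", optimum)
```

Because the median split is one of the candidate cut-offs, the optimum cut-off can never flag fewer variables than the fixed one; the gap between the two fractions is the selection bias that a correction for the optimum cut-off approach is meant to remove. The fractions produced by this sketch are not expected to match the 10% reported above, which came from the study's own data and methods.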
We found insufficient evidence to support a relationship between PET or CT texture features and patient survival. Further fit-for-purpose validation of these image-derived biomarkers should be supported by appropriate biological and statistical evidence before their association with patient outcome is investigated in prospective studies.