Li Xiao, Guindani Michele, Ng Chaan S, Hobbs Brian P
Department of Biostatistics, The University of Texas Health Science Center at Houston, Houston, USA.
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, USA.
J Appl Stat. 2018;46(2):230-246. doi: 10.1080/02664763.2018.1473348. Epub 2018 May 15.
The emerging field of cancer radiomics endeavors to characterize intrinsic patterns of tumor phenotypes and surrogate markers of response by transforming medical images into objects that yield quantifiable summary statistics to which regression and machine learning algorithms may be applied for statistical interrogation. Recent literature has identified clinicopathological associations based on textural features derived from gray-level co-occurrence matrices (GLCMs), which facilitate evaluations of gray-level spatial dependence within a delineated region of interest. GLCM-derived features, however, tend to contribute highly redundant information. Moreover, when reporting selected feature sets, investigators often fail to adjust for multiplicities and commonly fail to convey the predictive power of their findings. This article presents a Bayesian probabilistic modeling framework for the GLCM as a multivariate object and describes its application within a cancer detection context based on computed tomography. The methodology, which circumvents processing steps and avoids evaluations of reductive and highly correlated feature sets, uses a latent Gaussian Markov random field structure to characterize spatial dependencies among GLCM cells and facilitates classification via predictive probability. Correctly predicting the underlying pathology of 81% of the adrenal lesions in our case study, the proposed method outperformed current practices, which achieved a maximum accuracy of only 59%. Simulations and theory are presented to further elucidate this comparison and to ascertain the utility of applying multivariate Gaussian spatial processes to GLCM objects.
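For readers unfamiliar with the GLCM, the following is a minimal sketch of how such a matrix can be tabulated from a quantized image. It is not the authors' Bayesian model, only an illustration of the object that the model treats as multivariate data. The function name `glcm`, its parameters (`levels`, `offset`, `symmetric`), and the 4-level toy image are assumptions introduced here for illustration.

```python
import numpy as np

def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Count co-occurrences of gray levels for pixel pairs separated by `offset`.

    `image` must hold integer gray levels in the range [0, levels).
    `offset` is a hypothetical (row shift, column shift) between paired pixels.
    """
    img = np.asarray(image)
    dr, dc = offset
    rows, cols = img.shape
    counts = np.zeros((levels, levels), dtype=np.int64)

    # Keep only reference pixels whose neighbor at (r + dr, c + dc) is in bounds.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    ref = img[r0:r1, c0:c1]
    nbr = img[r0 + dr:r1 + dr, c0 + dc:c1 + dc]

    # Unbuffered accumulation so repeated (i, j) pairs are all counted.
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1)

    if symmetric:
        # Count each pair in both directions, so cell (i, j) equals cell (j, i).
        counts = counts + counts.T
    return counts

# Toy 4 x 4 region of interest quantized to 4 gray levels (illustrative data only).
example = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1],
                    [0, 2, 2, 2],
                    [2, 2, 3, 3]])

P = glcm(example, levels=4)   # horizontal neighbors, symmetric counts
P_norm = P / P.sum()          # relative frequencies of gray-level pairs
```

Conventional radiomics pipelines reduce a matrix such as `P_norm` to scalar texture summaries (contrast, energy, and so on). The framework described in the abstract instead models the cells of the matrix jointly.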