Liu Xueli, Minin Vladimir, Huang Yunda, Seligson David B, Horvath Steve
Department of Biostatistics, School of Public Health, UCLA, Los Angeles, California, USA.
J Biopharm Stat. 2004 Aug;14(3):671-85. doi: 10.1081/BIP-200025657.
Tissue microarrays (TMAs) are a new high-throughput tool for studying protein expression patterns in tissues and are increasingly used to evaluate the diagnostic and prognostic importance of biomarkers. TMA data are challenging to analyze: covariates are highly skewed, non-normal, and may be highly correlated. We present statistical methods for relating TMA data to censored time-to-event data. We review methods for evaluating the predictive power of Cox regression models and show how to test whether biomarker data contain predictive information above and beyond standard pathology covariates. We use nonparametric bootstrap methods to validate model fitting indices such as the concordance index. We also present data mining methods for characterizing high-risk patients with simple biomarker rules. Since researchers in the TMA community routinely dichotomize biomarker expression values, survival trees are a natural choice. We also use bump hunting (the patient rule induction method), which we adapt for use with survival data. The proposed methods are applied to a kidney cancer tissue microarray data set.
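As an illustration of the concordance index and its bootstrap assessment mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a fixed risk score per patient (for example, a Cox linear predictor computed elsewhere) and uses a simple percentile bootstrap of the index. A full validation would typically refit the model in each resample, which is omitted here. The function names `concordance_index` and `bootstrap_c_index` and the toy data are hypothetical.

```python
import random


def concordance_index(times, events, risk):
    """Harrell-style C: share of usable pairs in which the patient with the
    higher risk score has the shorter observed survival time.
    events[i] is 1 for an observed event and 0 for a censored time."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            # Usable pair: i has an observed event strictly before j's time.
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk scores count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def bootstrap_c_index(times, events, risk, n_boot=200, seed=0):
    """Percentile bootstrap interval (2.5%, 97.5%) for the C index,
    resampling patients with replacement (assumed simplification)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(concordance_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [risk[k] for k in idx],
            ))
        except ValueError:
            continue  # skip resamples that contain no usable pairs
    if not stats:
        raise ValueError("no bootstrap resample had usable pairs")
    stats.sort()
    lower = stats[int(0.025 * (len(stats) - 1))]
    upper = stats[int(0.975 * (len(stats) - 1))]
    return lower, upper


# Toy data (invented for illustration only): follow-up in months,
# event indicator, and a hypothetical risk score per patient.
toy_times = [5, 8, 12, 15, 20, 24, 30, 36]
toy_events = [1, 1, 0, 1, 1, 0, 1, 0]
toy_risk = [0.9, 0.7, 0.8, 0.5, 0.6, 0.2, 0.3, 0.1]

c_hat = concordance_index(toy_times, toy_events, toy_risk)
ci_low, ci_high = bootstrap_c_index(toy_times, toy_events, toy_risk)
```

A C index of 0.5 corresponds to a risk score with no discriminative ability and 1.0 to perfect ranking of patients by survival time. The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but slow for large cohorts.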