Department of Statistics, Stanford University, Stanford, CA 94305, USA.
Stat Methods Med Res. 2010 Feb;19(1):29-51. doi: 10.1177/0962280209105024. Epub 2009 Aug 4.
In recent years, breakthroughs in biomedical technology have led to a wealth of data in which the number of features (for instance, genes on which expression measurements are available) exceeds the number of observations (e.g. patients). Sometimes survival outcomes are also available for those same observations. In this case, one might be interested in (a) identifying features that are associated with survival (in a univariate sense), and (b) developing a multivariate model for the relationship between the features and survival that can be used to predict survival in a new observation. Because of the high dimensionality of these data, most classical statistical methods for survival analysis cannot be applied directly. Here, we review a number of methods from the literature that address these two problems.