Salerno Stephen, Li Yi
Department of Biostatistics, University of Michigan, Ann Arbor, United States, 48109.
Annu Rev Stat Appl. 2023 Mar;10(1):25-49. doi: 10.1146/annurev-statistics-032921-022127. Epub 2022 Oct 6.
In the era of precision medicine, time-to-event outcomes such as time to death or progression are routinely collected alongside high-throughput covariates. These high-dimensional data defy classical survival regression models, which are either infeasible to fit or prone to poor predictive performance because of overfitting. To overcome this, recent emphasis has been placed on developing novel approaches for feature selection and survival prognostication. We review cutting-edge methods that handle survival outcome data with high-dimensional predictors, highlighting recent innovations in machine learning approaches for survival prediction. We cover the statistical intuitions and principles behind these methods and conclude with extensions to more complex settings in which competing events are observed. We illustrate these methods with applications to the Boston Lung Cancer Survival Cohort study, one of the largest cancer epidemiology cohorts investigating the complex mechanisms of lung cancer.