Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, 80045, Colorado, USA.
Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, 80045, Colorado, USA.
BMC Med Res Methodol. 2021 Oct 17;21(1):216. doi: 10.1186/s12874-021-01375-x.
Risk prediction models for time-to-event outcomes play a vital role in personalized decision-making. A patient's biomarker values, such as medical lab results, are often measured over time, but traditional prediction models ignore their longitudinal nature and use only baseline information. Dynamic prediction incorporates longitudinal information to produce updated survival predictions during follow-up. Existing methods for dynamic prediction include joint modeling, which often suffers from computational complexity and poor performance under misspecification, and landmarking, which has a straightforward implementation but typically relies on a proportional hazards model. Random survival forests (RSF), a machine learning algorithm for time-to-event outcomes, can capture complex relationships between the predictors and survival without requiring prior specification and have been shown to have superior predictive performance.
We propose an alternative approach for dynamic prediction using random survival forests in a landmarking framework. With a simulation study, we compared the predictive performance of our proposed method with Cox landmarking and joint modeling in situations where the proportional hazards assumption does not hold and the longitudinal marker(s) have a complex relationship with the survival outcome. We illustrated the use of the RSF landmark approach in two clinical applications to assess the performance of various RSF model building decisions and to demonstrate its use in obtaining dynamic predictions.
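As a rough illustration of the landmarking step, the sketch below builds the data set for a single landmark time: it keeps only subjects still at risk at the landmark, summarizes each subject's biomarker history by its last observed value, and censors follow-up at a prediction horizon. The last-value summary, the horizon censoring, and all names (`Subject`, `LandmarkRow`, `build_landmark_rows`) are assumptions made for illustration and are not taken from the paper; the resulting rows would then be passed to a random survival forest implementation, which is not shown here.

```python
from typing import List, NamedTuple, Tuple


class Subject(NamedTuple):
    subject_id: str
    event_time: float                         # time of event or censoring
    event: bool                               # True if the event was observed
    measurements: List[Tuple[float, float]]   # (measurement time, biomarker value)


class LandmarkRow(NamedTuple):
    subject_id: str
    last_value: float      # most recent biomarker value at or before the landmark
    residual_time: float   # follow-up time measured from the landmark
    event: bool            # event indicator within the horizon


def build_landmark_rows(
    subjects: List[Subject], landmark: float, horizon: float
) -> List[LandmarkRow]:
    """Build one landmark data set (hypothetical helper, illustration only)."""
    rows = []
    for s in subjects:
        # Only subjects still at risk at the landmark time are included.
        if s.event_time <= landmark:
            continue
        # Use only information available up to the landmark time.
        past = [(t, v) for t, v in s.measurements if t <= landmark]
        if not past:
            continue  # assumption: skip subjects with no measurement yet
        last_value = max(past, key=lambda tv: tv[0])[1]
        # Censor follow-up administratively at landmark + horizon.
        if s.event_time > landmark + horizon:
            residual_time, event = horizon, False
        else:
            residual_time, event = s.event_time - landmark, s.event
        rows.append(LandmarkRow(s.subject_id, last_value, residual_time, event))
    return rows


# Made-up example data, not from the study.
demo_subjects = [
    Subject("a", 5.0, True, [(0.0, 1.0), (1.0, 1.5), (3.0, 2.0)]),
    Subject("b", 1.5, True, [(0.0, 0.8)]),        # event before the landmark
    Subject("c", 10.0, False, [(0.0, 1.2), (2.0, 1.1), (4.0, 0.9)]),
    Subject("d", 3.0, False, [(2.5, 0.7)]),       # no measurement by the landmark
]
demo_rows = build_landmark_rows(demo_subjects, landmark=2.0, horizon=5.0)
```

Repeating this construction at several landmark times and fitting a separate model to each data set is one way the predictions could be updated during follow-up; richer summaries of the marker history could replace the last observed value.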
In simulation studies, RSF landmarking outperformed joint modeling and Cox landmarking when a complex relationship between the survival and longitudinal marker processes was present. It was also useful in application when there were several predictors of unknown clinical relevance and multiple longitudinal biomarkers. Individualized dynamic predictions can be obtained from this method, and the variable importance metric is useful for examining how the predictive power of variables changes over time. In addition, RSF landmarking is easily implemented in standard software and, under the suggested specifications, requires less computation time than joint modeling.
RSF landmarking is a nonparametric, machine learning alternative to current methods for obtaining dynamic predictions when complex or unknown relationships are present. It requires little upfront decision-making, has comparable predictive performance, and has preferable computational speed.