Linden Ariel, Yarnold Paul R
Linden Consulting Group, LLC, Ann Arbor, MI, USA.
Division of General Medicine, Medical School, University of Michigan, Ann Arbor, MI, USA.
J Eval Clin Pract. 2017 Dec;23(6):1299-1308. doi: 10.1111/jep.12779. Epub 2017 Jul 3.
RATIONALE, AIMS, AND OBJECTIVES: Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability.
METHOD: Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves.
RESULTS: The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high or relatively low incidence of the outcome over time.
CONCLUSIONS: Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework.
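As a rough illustration of the integrated Brier score mentioned above, the following Python sketch averages the squared difference between each individual's observed survival status and the predicted survival probability, then integrates that error over a time grid. It is a simplified sketch, not the authors' procedure: it assumes every event time is observed (no censoring), whereas the censored data studied in the paper would require additional weighting. The function names, the toy event times, and the two toy prediction rules are hypothetical.

```python
from typing import Callable, Sequence


def brier_score(t: float, event_times: Sequence[float],
                surv_probs: Sequence[float]) -> float:
    """Brier score at time t, ASSUMING no censoring (simplification).

    surv_probs[i] is the predicted probability that individual i
    is still event-free at time t.
    """
    total = 0.0
    for event_time, prob in zip(event_times, surv_probs):
        observed = 1.0 if event_time > t else 0.0  # 1 = still event-free at t
        total += (observed - prob) ** 2
    return total / len(event_times)


def integrated_brier_score(grid: Sequence[float],
                           event_times: Sequence[float],
                           predict_survival: Callable[[int, float], float]) -> float:
    """Trapezoidal integral of the Brier score over grid, divided by its span.

    predict_survival(i, t) is a hypothetical callback returning the
    predicted survival probability of individual i at time t.
    """
    n = len(event_times)
    scores = [
        brier_score(t, event_times, [predict_survival(i, t) for i in range(n)])
        for t in grid
    ]
    area = sum(
        (grid[k + 1] - grid[k]) * (scores[k] + scores[k + 1]) / 2.0
        for k in range(len(grid) - 1)
    )
    return area / (grid[-1] - grid[0])


# Toy data (hypothetical): four fully observed event times.
event_times = [2.0, 4.0, 6.0, 8.0]
grid = [float(t) for t in range(10)]  # time points 0, 1, ..., 9

# An uninformative rule that always predicts a survival probability of 0.5.
ibs_uninformative = integrated_brier_score(grid, event_times, lambda i, t: 0.5)

# A rule that knows each event time exactly (best possible case).
ibs_perfect = integrated_brier_score(
    grid, event_times, lambda i, t: 1.0 if event_times[i] > t else 0.0
)
```

Lower values indicate better predictions: in this toy setup the rule that knows every event time scores 0, while the constant 0.5 rule scores 0.25 at every time point and therefore 0.25 overall.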