Department of Statistical Science, University College London, London, UK.
Stat Med. 2012 May 20;31(11-12):1150-61. doi: 10.1002/sim.4371. Epub 2011 Oct 14.
Prognostic models for survival outcomes are often developed by fitting standard survival regression models, such as the Cox proportional hazards model, to representative datasets. However, these models can be unreliable if the datasets contain few events, which may be the case if either the disease or the event of interest is rare. Specific problems include predictions that are too extreme, and poor discrimination between low-risk and high-risk patients. The objective of this paper is to evaluate three existing penalised methods that have been proposed to improve predictive accuracy. In particular, ridge, lasso and the garotte, which use penalised maximum likelihood to shrink coefficient estimates and in some cases omit predictors entirely, are assessed using simulated data derived from two clinical datasets. The predictions obtained using these methods are compared with those from Cox models fitted using standard maximum likelihood. The simulation results suggest that Cox models fitted using maximum likelihood can perform poorly when there are few events, and that significant improvements are possible by taking a penalised modelling approach. The ridge method generally performed the best, although lasso is recommended if variable selection is required.
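To make the idea of shrinkage through penalised maximum likelihood concrete, the following is a minimal sketch of a ridge-penalised Cox fit. It is an illustration only and is not the authors' code. It assumes no tied event times, a penalty of the form (lam / 2) * sum(beta**2) subtracted from the log partial likelihood, and plain Newton-Raphson updates without step-halving. The names `cox_partial_loglik`, `fit_ridge_cox` and `simulate` are hypothetical, as is the simulated dataset. Setting `lam=0` gives the standard maximum likelihood fit.

```python
import numpy as np


def cox_partial_loglik(beta, X, event):
    """Log partial likelihood, gradient and Hessian of a Cox model.

    Assumes rows are sorted by decreasing time with no tied times, so the
    risk set of row i is rows 0..i and risk-set sums are cumulative sums.
    """
    eta = X @ beta
    w = np.exp(eta - eta.max())  # shifted for numerical stability
    s0 = np.cumsum(w)
    s1 = np.cumsum(w[:, None] * X, axis=0)
    s2 = np.cumsum(w[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
    xbar = s1 / s0[:, None]  # risk-set weighted covariate means
    d = event.astype(float)  # only observed events contribute
    loglik = np.sum(d * (eta - eta.max() - np.log(s0)))
    grad = (d[:, None] * (X - xbar)).sum(axis=0)
    cov = s2 / s0[:, None, None] - xbar[:, :, None] * xbar[:, None, :]
    hess = -(d[:, None, None] * cov).sum(axis=0)
    return loglik, grad, hess


def fit_ridge_cox(time, event, X, lam, n_iter=50, tol=1e-10):
    """Maximise loglik(beta) - (lam / 2) * sum(beta**2) by Newton-Raphson."""
    order = np.argsort(-time)  # decreasing time
    X, event = X[order], event[order]
    p = X.shape[1]
    beta = np.zeros(p)
    for _ in range(n_iter):
        _, grad, hess = cox_partial_loglik(beta, X, event)
        g = grad - lam * beta          # penalised gradient
        H = hess - lam * np.eye(p)     # penalised Hessian
        step = np.linalg.solve(H, g)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def simulate(n=200, seed=0):
    """Hypothetical data: exponential survival times with random censoring."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    true_beta = np.array([0.8, -0.5, 0.0])
    t_event = rng.exponential(1.0 / np.exp(X @ true_beta))
    t_cens = rng.exponential(2.0, size=n)
    time = np.minimum(t_event, t_cens)
    event = t_event <= t_cens
    return time, event, X


if __name__ == "__main__":
    time, event, X = simulate()
    print("maximum likelihood:", fit_ridge_cox(time, event, X, lam=0.0))
    print("ridge, lam=20:     ", fit_ridge_cox(time, event, X, lam=20.0))
```

In practice the penalty weight would be chosen by cross-validation rather than fixed in advance. The lasso penalty is not differentiable at zero, so it needs a different optimiser (for example coordinate descent) than the Newton updates sketched here.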