Thackham Mark, Ma Jun
Department of Statistics, Macquarie University, Sydney, Australia.
J Appl Stat. 2019 Oct 31;47(9):1511-1528. doi: 10.1080/02664763.2019.1681946. eCollection 2020.
Including time-varying covariates is a popular extension of the Cox model and a suitable approach for dealing with non-proportional hazards. However, partial likelihood (PL) estimation of this model has three shortcomings: (i) estimated regression coefficients can be less accurate in small samples with heavy censoring; (ii) the baseline hazard is not directly estimated; and (iii) a covariance matrix for both the regression coefficients and the baseline hazard is not easily produced. We address these by developing a maximum likelihood (ML) approach that jointly estimates the regression coefficients and the baseline hazard using constrained optimisation, which ensures the latter's non-negativity. We demonstrate the asymptotic properties of these estimates. Simulations show that they are more accurate than PL estimates in small samples and that our method produces smoother baseline hazard estimates than the Breslow estimator. Finally, we apply our method to two examples, including an important real-world financial example estimating time to default for retail home loans. We demonstrate that using our ML estimate of the baseline hazard can give much clearer corroborating evidence of the 'humped hazard', whereby the risk of loan default rises to a peak and later falls.
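To make the idea of jointly estimating the regression coefficient and a non-negative baseline hazard concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a piecewise-constant baseline hazard on a fixed grid of bins, a single time-varying covariate, and data in counting-process form `(start, stop, event, x)`. The full log-likelihood is maximised by alternating a closed-form hazard update, which is non-negative by construction, with a guarded Newton step for the coefficient. All names (`fit`, `update_hazard`, `update_beta`, `rows`, `edges`) and the toy data are hypothetical.

```python
import bisect
import math

def cum_hazard(h, start, stop, edges):
    """Integral of the piecewise-constant baseline hazard over (start, stop]."""
    total = 0.0
    for k, hk in enumerate(h):
        overlap = min(stop, edges[k + 1]) - max(start, edges[k])
        total += hk * max(0.0, overlap)
    return total

def bin_index(t, edges):
    """Index k of the bin (edges[k], edges[k+1]] containing time t."""
    return bisect.bisect_left(edges, t) - 1

def log_lik(beta, h, rows, edges):
    """Full log-likelihood: event terms minus covariate-weighted cumulative hazard."""
    total = 0.0
    for start, stop, event, x in rows:
        total -= math.exp(beta * x) * cum_hazard(h, start, stop, edges)
        if event:
            total += math.log(h[bin_index(stop, edges)]) + beta * x
    return total

def update_hazard(beta, rows, edges):
    """Closed-form maximiser in h for fixed beta: events / weighted exposure (>= 0)."""
    n_bins = len(edges) - 1
    events = [0.0] * n_bins
    exposure = [0.0] * n_bins
    for start, stop, event, x in rows:
        w = math.exp(beta * x)
        for k in range(n_bins):
            overlap = min(stop, edges[k + 1]) - max(start, edges[k])
            exposure[k] += w * max(0.0, overlap)
        if event:
            events[bin_index(stop, edges)] += 1.0
    return [d / e if e > 0 else 0.0 for d, e in zip(events, exposure)]

def update_beta(beta, h, rows, edges):
    """One Newton step in beta for fixed h, halved until the likelihood does not drop."""
    score, info = 0.0, 0.0
    for start, stop, event, x in rows:
        w = math.exp(beta * x) * cum_hazard(h, start, stop, edges)
        score += (x if event else 0.0) - x * w
        info += x * x * w
    if info == 0.0:
        return beta
    step = score / info
    base = log_lik(beta, h, rows, edges)
    while log_lik(beta + step, h, rows, edges) < base and abs(step) > 1e-12:
        step /= 2.0
    return beta + step

def fit(rows, edges, n_iter=200, tol=1e-10):
    """Alternate the two updates, starting from beta = 0."""
    beta = 0.0
    h = update_hazard(beta, rows, edges)
    for _ in range(n_iter):
        new_beta = update_beta(beta, h, rows, edges)
        h = update_hazard(new_beta, rows, edges)
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta, h

# Hypothetical toy data: (start, stop, event, x); x may change between a subject's rows.
rows = [
    (0.0, 2.0, 0, 0.0), (2.0, 5.0, 1, 1.0),   # covariate switches at t=2, event at t=5
    (0.0, 3.0, 1, 0.0),
    (0.0, 1.0, 0, 0.0), (1.0, 4.0, 1, 1.0),
    (0.0, 6.0, 0, 0.0),                        # censored at t=6
    (0.0, 2.0, 0, 0.0), (2.0, 6.0, 0, 1.0),
    (0.0, 1.5, 1, 1.0),
    (0.0, 5.0, 1, 0.0),
]
edges = [0.0, 2.0, 4.0, 6.0]                   # assumed bin grid for the baseline hazard

beta_hat, h_hat = fit(rows, edges)
print("beta:", beta_hat, "baseline hazard per bin:", h_hat)
```

In this simplified setting the non-negativity constraint never becomes active, because each hazard update is a ratio of non-negative quantities. A smoothed or otherwise penalised baseline hazard would generally need an explicitly constrained optimiser.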