Department of Health Sciences, University of Leicester, Adrian Building, University Road, Leicester, LE1 7RH, UK.
Stat Med. 2012 Dec 30;31(30):4456-71. doi: 10.1002/sim.5644. Epub 2012 Oct 4.
The joint modelling of longitudinal and survival data is a highly active area of biostatistical research. The submodel for the longitudinal biomarker usually takes the form of a linear mixed effects model. We describe a flexible parametric approach for the survival submodel that models the log baseline cumulative hazard using restricted cubic splines. This approach overcomes limitations of standard parametric choices for the survival submodel, which can lack the flexibility to effectively capture the shape of the underlying hazard function. Numerical integration techniques, such as Gauss-Hermite quadrature, are usually required to evaluate both the cumulative hazard and the overall joint likelihood; however, by using a flexible parametric model, the cumulative hazard has an analytically tractable form, providing considerable computational benefits. We conduct an extensive simulation study to assess the proposed model, comparing it with a B-spline formulation, illustrating insensitivity of parameter estimates to the baseline cumulative hazard function specification. Furthermore, we compare non-adaptive and fully adaptive quadrature, showing the superiority of adaptive quadrature in evaluating the joint likelihood. We also describe a useful technique to simulate survival times from complex baseline hazard functions and illustrate the methods using an example data set investigating the association between longitudinal prothrombin index and survival of patients with liver cirrhosis, showing greater flexibility and improved stability with fewer parameters under the proposed model compared with the B-spline approach. We provide user-friendly Stata software.
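The claim that the cumulative hazard is analytically tractable can be illustrated with a minimal sketch. It assumes a common restricted cubic spline basis on log time (linear beyond the boundary knots) and models only the baseline, omitting covariates, the longitudinal association and the random effects. The function names, knot positions and coefficient values are hypothetical and are not taken from the paper or its Stata software.

```python
import math


def rcs_basis(x, knots):
    """Restricted cubic spline basis evaluated at x = log(t).

    `knots` is an ordered list whose first and last entries are the
    boundary knots. Returns the linear term plus one term per interior knot.
    """
    k_min, k_max = knots[0], knots[-1]

    def pos3(u):
        # Truncated cubic: (u)_+^3
        return max(u, 0.0) ** 3

    basis = [x]
    for k in knots[1:-1]:
        lam = (k_max - k) / (k_max - k_min)
        basis.append(
            pos3(x - k) - lam * pos3(x - k_min) - (1.0 - lam) * pos3(x - k_max)
        )
    return basis


def log_cum_hazard(t, gamma, knots):
    """log H(t) = gamma[0] + sum_j gamma[j + 1] * basis_j(log t)."""
    basis = rcs_basis(math.log(t), knots)
    return gamma[0] + sum(g * v for g, v in zip(gamma[1:], basis))


def cum_hazard(t, gamma, knots):
    """Closed-form cumulative hazard: no numerical integration of h(t)."""
    return math.exp(log_cum_hazard(t, gamma, knots))


def survival(t, gamma, knots):
    """S(t) = exp(-H(t))."""
    return math.exp(-cum_hazard(t, gamma, knots))


# Hypothetical knots on the log-time scale and illustrative coefficients.
example_knots = [0.0, 1.0, 2.0]
example_gamma = [-1.0, 1.5, 0.1]
example_surv = [survival(t, example_gamma, example_knots) for t in (0.5, 2.0, 5.0)]
```

With no interior knots, or with all spline coefficients set to zero, the sketch reduces to a Weibull model with H(t) = exp(gamma[0]) * t ** gamma[1]. Because H(t) is available directly, quadrature in a model of this kind is needed only for the integral over the random effects in the joint likelihood.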