NIH/NIAID/BRB, 5601 Fishers Lane, Rockville, MD 20852, USA.
Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA.
Comput Methods Programs Biomed. 2021 Jul;206:106115. doi: 10.1016/j.cmpb.2021.106115. Epub 2021 Apr 28.
Background and objective: With the recent surge in the availability of large biomedical databases, mostly derived from electronic health records, the need for scalable marginal survival models that can be fitted quickly could not be more timely. Clustering adds computational complexity, especially when the number of clusters is large. Marginalizing a conditional survival model can violate the proportional hazards assumption for some frailty distributions, which breaks the connection to the conditional model. While theoretical connections between proportional hazards and accelerated failure time models exist, a computational framework that produces both from either the marginal or the conditional perspective is lacking. Our objective is to provide fast, scalable bridged-survival models within a unified framework from which the effects and standard errors for the conditional hazard ratio, the marginal hazard ratio, the conditional acceleration factor, and the marginal acceleration factor can be estimated and related to one another in a transparent fashion. Methods: We formulate a Weibull parametric frailty likelihood for clustered survival times that can directly estimate the four estimands. Under a nonlinear mixed model specification with positive stable frailties powered by Gaussian quadrature, we put forth a novel closed form of the integrated likelihood that lowers the computational threshold for fitting these models. The method is illustrated on a real dataset, generated from electronic health records, examining tooth loss.
Results: Our novel closed form of the integrated likelihood substantially lowered the computational threshold for fitting these models: by a factor of 12 relative to the R package parfm (36 min compared to 3 min), and by a factor of 2400 relative to Gaussian quadrature in SAS (4.6 days compared to 3 min). Moreover, the four estimands are connected by simple relationships among the parameters, and the proportional hazards assumption is preserved for the marginal model. Our framework provides a flow of analysis that enables fitting any or all of the four perspective-parameterization combinations. Conclusions: We see potential usefulness in our framework of bridged parametric survival models fitted with the Static-Stirling closed-form likelihood. Bridged-survival models provide insight into subject-specific and population-level survival effects when the relation between them is transparent. SAS and R code, along with implementation details on a pseudo dataset, are provided.
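To make the "simple relationships" among the four estimands concrete, the sketch below uses the textbook result for a Weibull conditional hazard combined with a positive stable frailty. It is not the authors' SAS or R code, and the parameterization is an assumption: conditional cumulative hazard u * lam * t**rho * exp(beta * x), frailty u with Laplace transform exp(-s**alpha) for 0 < alpha <= 1, and an acceleration factor defined so that S(t | x = 1) = S(AF * t | x = 0). The names `bridged_effects` and `marginal_survival` are hypothetical. Under these assumptions the marginal model is again Weibull with shape alpha * rho, the marginal log hazard ratio is alpha * beta, and the acceleration factor is identical from the conditional and marginal perspectives.

```python
import math
from dataclasses import dataclass


@dataclass
class BridgedEffects:
    conditional_hr: float  # subject-specific hazard ratio
    marginal_hr: float     # population-averaged hazard ratio
    conditional_af: float  # subject-specific acceleration factor
    marginal_af: float     # population-averaged acceleration factor


def bridged_effects(beta: float, rho: float, alpha: float) -> BridgedEffects:
    """Map conditional Weibull PH parameters to the four estimands.

    Assumed parameterization (illustrative, not taken from the paper):
    conditional cumulative hazard u * lam * t**rho * exp(beta * x),
    positive stable frailty u with index alpha in (0, 1].
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if rho <= 0.0:
        raise ValueError("rho must be positive")
    conditional_hr = math.exp(beta)
    # Marginalizing over the frailty attenuates the log hazard ratio by alpha.
    marginal_hr = math.exp(alpha * beta)
    # Weibull PH and AFT forms coincide: time is rescaled by exp(beta / rho).
    conditional_af = math.exp(beta / rho)
    # Marginal shape is alpha * rho, so alpha cancels: (alpha*beta)/(alpha*rho).
    marginal_af = math.exp((alpha * beta) / (alpha * rho))
    return BridgedEffects(conditional_hr, marginal_hr, conditional_af, marginal_af)


def marginal_survival(t: float, x: float, lam: float, rho: float,
                      beta: float, alpha: float) -> float:
    """Marginal survival exp(-(lam * t**rho * exp(beta * x))**alpha)."""
    cumulative_hazard = lam * t ** rho * math.exp(beta * x)
    return math.exp(-(cumulative_hazard ** alpha))


# Illustrative values only; they are not estimates from the tooth-loss data.
effects = bridged_effects(beta=0.8, rho=1.5, alpha=0.6)
print(effects)
```

With these illustrative inputs the marginal hazard ratio is closer to 1 than the conditional one, while the two acceleration factors agree, which is one way to read the abstract's statement that the estimands are linked transparently.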