Florida State University, FL, USA.
University of Texas at Austin, TX, USA.
Biometrics. 2022 Sep;78(3):880-893. doi: 10.1111/biom.13478. Epub 2021 Apr 29.
Popular parametric and semiparametric hazards regression models for clustered survival data are inadequate when the unknown effects of covariates and clustering are complex. This calls for a flexible modeling framework that yields efficient survival prediction. Moreover, in survival studies of time to occurrence of an asymptomatic event, survival times are typically interval censored between consecutive clinical inspections. In this article, we propose a robust semiparametric model for clustered interval-censored survival data built on soft Bayesian additive regression trees, or SBART (Linero and Yang, 2018), a Bayesian ensemble learning method that combines multiple sparse (soft) decision trees to attain excellent predictive accuracy. We develop a novel semiparametric hazards regression model in which the hazard function is the product of a parametric baseline hazard function and a nonparametric component that uses SBART to incorporate clustering, unknown functional forms of the main effects, and interaction effects of various covariates. Our methodology applies to left-censored, right-censored, and interval-censored survival data, and is implemented using a data augmentation scheme that allows existing Bayesian backfitting algorithms to be used. We illustrate the practical implementation and advantages of our method via simulation studies and an analysis of a prostate cancer surgery study in which dependence on the experience and skill level of the physicians leads to clustering of survival times. We conclude by discussing our method's applicability in studies involving high-dimensional data with complex underlying associations.
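As a rough illustration of the product-form hazard and the three censoring types named in the abstract, the sketch below computes the likelihood contribution of a single censored observation. It is a minimal sketch under stated assumptions, not the authors' implementation: the Weibull baseline, the exponential link exp(r(x)), the default parameter values, and every function name are hypothetical, and `log_risk` is a plain number standing in for the value an SBART fit would supply for one subject and cluster.

```python
import math


def weibull_cum_hazard(t, shape, scale):
    """Cumulative baseline hazard H0(t) = (t / scale) ** shape (assumed Weibull form)."""
    return (t / scale) ** shape


def survival(t, log_risk, shape, scale):
    """S(t | x) = exp(-H0(t) * exp(r(x))), which follows from h(t | x) = h0(t) * exp(r(x))."""
    if math.isinf(t):
        return 0.0  # the event is assumed to occur eventually
    return math.exp(-weibull_cum_hazard(t, shape, scale) * math.exp(log_risk))


def censored_likelihood(left, right, log_risk, shape=1.5, scale=10.0):
    """P(left < T <= right | x) = S(left) - S(right).

    left-censored:     left = 0, right = R
    right-censored:    left = L, right = infinity
    interval-censored: 0 < left < right < infinity
    """
    return survival(left, log_risk, shape, scale) - survival(right, log_risk, shape, scale)


if __name__ == "__main__":
    r = 0.3  # hypothetical log-risk value, standing in for an SBART output
    print("left-censored at 5:        ", censored_likelihood(0.0, 5.0, r))
    print("interval-censored in (5, 8]:", censored_likelihood(5.0, 8.0, r))
    print("right-censored at 8:       ", censored_likelihood(8.0, float("inf"), r))
```

The three intervals in the usage example partition the time axis, so their contributions sum to one. Treating all three censoring types as the same difference of survival functions is what lets a single likelihood expression cover them.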