Porndumnernsawat Patcharaporn, Frank Till D, Ingsrisawang Lily
Department of Mathematics and Computer Science, Faculty of Science and Technology, Rajamangala University of Technology Krungthep, Bangkok, Thailand.
Department of Psychological Sciences, University of Connecticut, Storrs, USA.
Sci Rep. 2025 Apr 8;15(1):12017. doi: 10.1038/s41598-025-96198-x.
This study compared the performance of Bayesian multivariate survival tree approaches built on three models: an extended Cox proportional hazards model with a gamma frailty term, and two shared gamma frailty models with exponential and Weibull baseline hazard functions, respectively. A simulation study evaluated the impact of the baseline hazard function, number of clusters (200, 500, 1000), cluster size (5, 10, 20), and right-censoring rate (10%, 50%, 80%) on classification performance. We generated 90 clustered survival datasets with correlated failure times and 50 covariates at the cluster and individual levels. Each dataset was resampled 1000 times, with 70% of the clusters randomly selected as the training set and the remaining 30% as the test set. The Bayesian multivariate survival tree approach based on the shared gamma frailty model with a Weibull baseline hazard achieved the highest accuracy. For all three models, accuracy tended to increase with cluster size and with the number of clusters, and decreased monotonically as the censoring rate increased. In conclusion, we recommend the Bayesian multivariate survival tree approach constructed from the shared gamma frailty model with a Weibull baseline hazard function.
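The simulation design described in the abstract (clustered failure times correlated through a shared gamma frailty, a Weibull baseline hazard, right censoring, and a 70/30 split made at the cluster level) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the parameter values (`frailty_var`, `lam`, `rho`, `censor_rate`), the coefficient vector, the use of standard normal individual-level covariates only, and the exponential censoring mechanism are all assumptions chosen for illustration. Failure times are drawn by inverting the conditional cumulative hazard $H_{ij}(t) = w_i \lambda t^{\rho} \exp(x_{ij}^{\top}\beta)$, where $w_i$ is the frailty of cluster $i$.

```python
import numpy as np


def simulate_clustered_survival(n_clusters=200, cluster_size=5, n_covariates=50,
                                frailty_var=0.5, lam=0.1, rho=1.5,
                                censor_rate=0.05, seed=0):
    """Simulate clustered survival data with a shared gamma frailty and a
    Weibull baseline hazard. All default values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n = n_clusters * cluster_size
    cluster = np.repeat(np.arange(n_clusters), cluster_size)

    # One frailty per cluster, gamma distributed with mean 1 and
    # variance frailty_var; sharing it induces within-cluster correlation.
    w = rng.gamma(shape=1.0 / frailty_var, scale=frailty_var, size=n_clusters)

    # Hypothetical covariates: only the first three affect the hazard.
    X = rng.normal(size=(n, n_covariates))
    beta = np.zeros(n_covariates)
    beta[:3] = [0.5, -0.5, 0.25]

    # Inverse-transform sampling: if E ~ Exp(1), then
    # T = (E / (w * lam * exp(x'beta)))**(1/rho) has the target hazard.
    rate = w[cluster] * lam * np.exp(X @ beta)
    t_event = (rng.standard_exponential(size=n) / rate) ** (1.0 / rho)

    # Assumed independent exponential censoring; a larger censor_rate
    # yields a higher proportion of censored observations.
    t_cens = rng.exponential(scale=1.0 / censor_rate, size=n)
    time = np.minimum(t_event, t_cens)
    event = (t_event <= t_cens).astype(int)
    return cluster, X, time, event


def split_by_cluster(cluster, train_frac=0.7, seed=0):
    """Assign whole clusters to the training or test set, so that no
    cluster contributes observations to both."""
    rng = np.random.default_rng(seed)
    ids = np.unique(cluster)
    n_train = int(round(train_frac * ids.size))
    train_ids = rng.choice(ids, size=n_train, replace=False)
    train_mask = np.isin(cluster, train_ids)
    return train_mask, ~train_mask


if __name__ == "__main__":
    cluster, X, time, event = simulate_clustered_survival()
    train, test = split_by_cluster(cluster)
    print("observed censoring proportion:", 1.0 - event.mean())
    print("training rows:", train.sum(), "test rows:", test.sum())
```

Repeating `split_by_cluster` with different seeds would mimic the repeated resampling in the study. Splitting by cluster rather than by row keeps correlated observations together, so test-set accuracy is not inflated by frailty information leaking from the training set.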