A comparison of the conditional inference survival forest model to random survival forests based on a simulation study as well as on two applications with time-to-event data.

Author information

Nasejje Justine B, Mwambi Henry, Dheda Keertan, Lesosky Maia

Affiliations

School of Statistics, Mathematics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa.

Division of Pulmonology and UCT Lung Institute, Department of Medicine, University of Cape Town, Cape Town, South Africa.

Publication information

BMC Med Res Methodol. 2017 Jul 28;17(1):115. doi: 10.1186/s12874-017-0383-8.

Abstract

BACKGROUND

Random survival forest (RSF) models have been identified as alternatives to the Cox proportional hazards model for analysing time-to-event data. These methods, however, have been criticised for a bias towards covariates with many split-points, and conditional inference forests for time-to-event data have been suggested in response. Conditional inference forests (CIF) are known to correct this bias in RSF models by separating the selection of the covariate to split on from the search for the best split point for the selected covariate.
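The two-step idea described above can be illustrated with a toy simulation. The sketch below is not the authors' method or any library's implementation: it assumes uncensored exponential survival times, uses a squared-error split criterion in place of the log-rank statistic, and uses a simple permutation test in place of the conditional inference framework. All function names are hypothetical. Both covariates are pure noise, so an unbiased procedure should pick each about half the time.

```python
import random


def best_split_gain(x, y):
    """Largest squared-error reduction over all cut points of x (exhaustive search)."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    total = sum(y)
    best = 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no valid cut between equal covariate values
        right_sum = total - left_sum
        # Between-group sum of squares for the split {first i} vs {remaining n - i}
        gain = left_sum ** 2 / i + right_sum ** 2 / (n - i) - total ** 2 / n
        best = max(best, gain)
    return best


def permutation_p_value(x, y, n_perm, rng):
    """Permutation p-value for a global association between x and y."""
    mean_x = sum(x) / len(x)
    centered = [xi - mean_x for xi in x]

    def stat(values):
        return abs(sum(c * v for c, v in zip(centered, values)))

    observed = stat(y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def simulate(n_reps=300, n=40, n_perm=99, seed=1):
    """Return how often the many-split-point noise covariate is selected."""
    rng = random.Random(seed)
    exhaustive_picks = 0
    test_picks = 0
    for _ in range(n_reps):
        y = [rng.expovariate(1.0) for _ in range(n)]      # uncensored times (assumption)
        x_binary = [rng.randint(0, 1) for _ in range(n)]  # one possible split point
        x_many = [rng.random() for _ in range(n)]         # up to n - 1 split points
        # One-step rule: best split over all covariates and cut points jointly
        if best_split_gain(x_many, y) > best_split_gain(x_binary, y):
            exhaustive_picks += 1
        # Two-step rule: choose the covariate by p-value; split search would follow
        p_many = permutation_p_value(x_many, y, n_perm, rng)
        p_binary = permutation_p_value(x_binary, y, n_perm, rng)
        if p_many < p_binary or (p_many == p_binary and rng.random() < 0.5):
            test_picks += 1
    return exhaustive_picks / n_reps, test_picks / n_reps


exhaustive_rate, test_rate = simulate()
print("exhaustive search picks the many-split covariate:", exhaustive_rate)
print("test-based selection picks the many-split covariate:", test_rate)
```

Under these assumptions the exhaustive search should favour the covariate with many split points, because it takes a maximum over many candidate cuts, while the test-based selection should stay close to one half.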

METHODS

In this study, we compare the random survival forest model to the conditional inference forest (CIF) model using twenty-two simulated time-to-event datasets. We also analysed two real time-to-event datasets. The first dataset is based on the survival of children under five years of age in Uganda and consists of categorical covariates, most of which have more than two levels (many split-points). The second dataset is based on the survival of patients with extensively drug-resistant tuberculosis (XDR TB) and consists mainly of categorical covariates with two levels (few split-points).

RESULTS

Based on bootstrap cross-validated estimates of the integrated Brier score, the study findings indicate that the conditional inference forest model is superior to random survival forest models in analysing time-to-event data that consist of covariates with many split-points. However, conditional inference forests perform comparably to random survival forest models in analysing time-to-event data consisting of covariates with fewer split-points.
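For readers unfamiliar with the evaluation metric, the following is a minimal sketch of an integrated Brier score for right-censored data, using inverse-probability-of-censoring weights from a Kaplan-Meier estimate of the censoring distribution. It is an illustration only, not the code used in the study: the function names, the toy data, and the `predict(i, t)` interface (predicted survival probability of subject `i` at time `t`) are assumptions, tied observation times are assumed absent, and the bootstrap cross-validation step is omitted.

```python
import math


def kaplan_meier_censoring(times, events):
    """Kaplan-Meier estimate G of the censoring distribution (assumes no tied times)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    steps, g, at_risk = [], 1.0, len(times)
    for i in order:
        if events[i] == 0:  # a censored observation is an "event" for G
            g *= 1.0 - 1.0 / at_risk
        at_risk -= 1
        steps.append((times[i], g))

    def G(t, just_before=False):
        value = 1.0
        for time, g_after in steps:
            if time < t or (time == t and not just_before):
                value = g_after
            else:
                break
        return value

    return G


def brier_score(t, times, events, predict, G):
    """Censoring-weighted Brier score at time t."""
    total = 0.0
    for i, (T, d) in enumerate(zip(times, events)):
        s = predict(i, t)  # predicted probability of surviving beyond t
        if T <= t and d == 1:
            total += s ** 2 / G(T, just_before=True)
        elif T > t:
            total += (1.0 - s) ** 2 / G(t)
        # subjects censored before t contribute zero
    return total / len(times)


def integrated_brier_score(times, events, predict, grid):
    """Trapezoidal average of the Brier score over an increasing time grid."""
    G = kaplan_meier_censoring(times, events)
    scores = [brier_score(t, times, events, predict, G) for t in grid]
    area = sum(
        (grid[k + 1] - grid[k]) * (scores[k] + scores[k + 1]) / 2.0
        for k in range(len(grid) - 1)
    )
    return area / (grid[-1] - grid[0])


# Toy data (hypothetical): event indicator 1 = event observed, 0 = censored
times = [2.0, 3.5, 5.0, 6.5, 8.0]
events = [1, 0, 1, 1, 0]
grid = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ibs = integrated_brier_score(times, events, lambda i, t: math.exp(-0.15 * t), grid)
print("integrated Brier score:", ibs)
```

Lower values indicate better predictions. The grid is kept below the largest observed time so that the censoring weights stay positive.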

CONCLUSION

Although survival forests are promising methods for analysing time-to-event data, it is important to identify the best forest model for a given analysis based on the nature of the covariates in the dataset in question.
