Weibull Regression and Machine Learning Survival Models: Methodology, Comparison, and Application to Biomedical Data Related to Cardiac Surgery.

Authors

Cavalcante Thalytta, Ospina Raydonal, Leiva Víctor, Cabezas Xavier, Martin-Barreiro Carlos

Affiliations

Department of Statistics, CASTLab, Universidade Federal de Pernambuco, Recife 50670-901, Brazil.

Department of Statistics, IME, Universidade Federal da Bahia, Salvador 40170-110, Brazil.

Publication information

Biology (Basel). 2023 Mar 13;12(3):442. doi: 10.3390/biology12030442.

Abstract

In this article, we present a comparative study of two models that researchers can use to analyze survival data: (i) the Weibull regression model and (ii) the random survival forest (RSF) model. The models are compared in terms of the error rate, predictive performance as measured by the Harrell C-index, and the identification of the variables relevant for survival prediction. A statistical analysis of a data set from the Heart Institute of the University of São Paulo, Brazil, has been carried out. In the study, the length of stay in the operating room of patients undergoing cardiac surgery was used as the response variable. The results show that the RSF model has a lower error rate on the training and testing data sets, at 23.55% and 20.31%, respectively, than the Weibull model, which has an error rate of 23.82%. Regarding the Harrell C-index, we obtain the values 0.76 and 0.79 for the RSF model (training and testing data, respectively) and 0.76 for the Weibull model. After the selection procedure, the Weibull model retains the variables associated with the type of protocol and the type of patient, both statistically significant at the 5% level. The RSF model selects age, type of patient, and type of protocol as the relevant variables for prediction. We employ the randomForestSRC package of the R software to perform our data analysis and computational experiments. The proposal that we present has many applications in biology and medicine, which are discussed in the conclusions of this work.
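The two performance measures quoted in the abstract are closely linked: the reported figures are consistent with the error rate being one minus the Harrell C-index (for example, 1 − 0.2355 ≈ 0.76), which is how randomForestSRC defines its error rate. The sketch below is a minimal pure-Python illustration of Harrell's C-index for right-censored data. It is not the authors' code (they used R), and the function name `harrell_c_index`, the toy data, and the simplified handling of tied times are assumptions made here for illustration only.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times:       observed times (event or censoring time)
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means a shorter predicted time to event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs tied in time are skipped for simplicity
        # Let a be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk had the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study): five subjects, one censored.
times = [5, 8, 12, 15, 20]
events = [1, 1, 0, 1, 1]
risk = [0.9, 0.4, 0.6, 0.5, 0.1]

c_index = harrell_c_index(times, events, risk)
error_rate = 1.0 - c_index
print(f"C-index = {c_index:.2f}, error rate = {error_rate:.2%}")
```

In this toy example 6 of the 8 comparable pairs are ranked correctly, so the C-index is 0.75 and the corresponding error rate is 25%. A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking.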
