Mansoor Hend, Elgendy Islam Y, Segal Richard, Bavry Anthony A, Bian Jiang
Department of Health Services Research, University of Florida, College of Public Health, Gainesville, FL, USA.
Division of Cardiovascular Medicine, Department of Medicine, Gainesville, FL, USA.
Heart Lung. 2017 Nov-Dec;46(6):405-411. doi: 10.1016/j.hrtlng.2017.09.003. Epub 2017 Oct 6.
Studies have shown that mortality due to ST-elevation myocardial infarction (STEMI) is higher in women than in men. The purpose of this study was to develop and validate prediction models for all-cause in-hospital mortality in women admitted with STEMI using logistic regression and random forest, and to compare the performance and validity of the different models.
Data from the National Inpatient Sample (NIS) for the years 2011-2013 were used to identify women admitted with STEMI. The main outcome was all-cause in-hospital mortality. Patients were divided into development and validation cohorts. Trained models were internally validated using 20% of the 2012 data and externally validated using the 2011 and 2013 NIS data.
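The cohort design described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that the remaining 80% of the 2012 data formed the development cohort, and the record layout (`year`, `died`), the function name `split_cohorts`, the random seed, and the toy records are all hypothetical.

```python
import random


def split_cohorts(records, dev_year=2012, holdout_fraction=0.2, seed=0):
    """Split records into development, internal and external validation cohorts.

    Assumption: records from dev_year are randomly divided, with
    holdout_fraction held out for internal validation and the rest used
    for model development. Every other year is an external cohort.
    """
    rng = random.Random(seed)

    # Records from the development year, shuffled reproducibly.
    dev_year_records = [r for r in records if r["year"] == dev_year]
    rng.shuffle(dev_year_records)

    n_holdout = round(len(dev_year_records) * holdout_fraction)
    internal = dev_year_records[:n_holdout]
    development = dev_year_records[n_holdout:]

    # One external validation cohort per remaining year.
    external = {}
    for r in records:
        if r["year"] != dev_year:
            external.setdefault(r["year"], []).append(r)

    return development, internal, external


# Toy records only; these are not NIS data.
records = [{"id": i, "year": 2011 + i % 3, "died": i % 7 == 0} for i in range(300)]
development, internal, external = split_cohorts(records)
```

Keeping the external years entirely separate from the development year means the 2011 and 2013 cohorts test whether a model transfers across data years, not only across patients within one year.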
Three main models were developed and compared: multivariate logistic regression, a full random forest model, and a reduced random forest model. In the multivariate logistic regression, 11 variables were included in the final model based on backward elimination. The full random forest model contained 32 variables, and the reduced model contained 17 variables selected based on individual variable importance. In the internal validation cohort, the C-index was 0.84, 0.81, and 0.80 for the multivariate logistic regression, full random forest, and reduced random forest models, respectively. The models showed good stability in the external validation cohorts, with a C-index for the logistic regression, full random forest, and reduced random forest models of 0.84, 0.85, and 0.81 for 2011, and 0.82, 0.81, and 0.81 for 2013, respectively.
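For a binary outcome such as in-hospital death, the C-index reported above is the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, which is equivalent to the area under the ROC curve. The sketch below computes it by direct pair counting. The function name `c_index` and the toy values are hypothetical and do not come from the study.

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome.

    Counts every (event, non-event) pair and returns the fraction in which
    the event received the higher predicted risk. Tied predictions count
    as half a concordant pair.
    """
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy example: 1 = died in hospital, 0 = survived; risk = predicted probability.
died = [1, 0, 1, 0, 0]
risk = [0.9, 0.2, 0.4, 0.5, 0.1]
toy_c = c_index(died, risk)  # 5 of the 6 (death, survivor) pairs are concordant
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect discrimination, so the reported values of 0.80 to 0.85 sit well above chance for all three models. The pairwise loop is quadratic in cohort size and is meant only to make the definition concrete.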
Random forest was comparable to logistic regression in predicting in-hospital mortality in women with STEMI, and can be a useful and accurate tool in clinical practice.