Enhancing Cardiovascular Risk Prediction: Development of an Advanced Xgboost Model with Hospital-Level Random Effects.

Authors

Dong Tim, Oronti Iyabosola Busola, Sinha Shubhra, Freitas Alberto, Zhai Bing, Chan Jeremy, Fudulu Daniel P, Caputo Massimo, Angelini Gianni D

Affiliations

Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol BS2 8HW, UK.

Statistics and Risk Unit (AS&RU), Department of Statistics, School of Engineering, University of Warwick, Coventry CV4 7AL, UK.

Publication

Bioengineering (Basel). 2024 Oct 18;11(10):1039. doi: 10.3390/bioengineering11101039.

Abstract

BACKGROUND

Ensemble tree-based models such as Xgboost are highly prognostic in cardiovascular medicine, as measured by the Clinical Effectiveness Metric (CEM). However, their ability to handle correlated data, such as outcomes that cluster within hospitals, is limited.

OBJECTIVES

This work aims to develop a binary-outcome mixed-effects Xgboost (BME) model that integrates random effects at the hospital level. To ascertain how well the model handles correlated data in cardiovascular outcomes, we assess its performance and compare it with fixed-effects Xgboost and traditional logistic regression models.
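One generic way to write a binary-outcome model with a hospital-level random intercept is sketched below. The abstract does not state the authors' exact formulation or estimation procedure, so the symbols here ($f$, $b_j$, $\sigma_b^2$) are illustrative assumptions rather than the published specification.

```latex
% Patient i in hospital j, binary outcome y_{ij}, covariates x_{ij}.
% f is the fixed-effects predictor; b_j is the hospital random intercept.
\begin{aligned}
\operatorname{logit}\,\Pr\!\left(y_{ij} = 1 \mid x_{ij}, b_j\right) &= f(x_{ij}) + b_j, \\
b_j &\sim \mathcal{N}\!\left(0, \sigma_b^{2}\right), \qquad j = 1, \dots, 42 .
\end{aligned}
```

Under this reading, a fixed-effects Xgboost model corresponds to setting $b_j = 0$ with $f$ a boosted tree ensemble, and a mixed-effects logistic regression corresponds to the linear choice $f(x_{ij}) = x_{ij}^{\top}\beta$.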

METHODS

A total of 227,087 patients over 17 years of age who underwent cardiac surgery at 42 UK hospitals between 1 January 2012 and 31 March 2019 were included. The dataset was split into two cohorts: training/validation (n = 157,196; 2012-2016) and holdout (n = 69,891; 2017-2019). The outcome variable was 30-day mortality, with hospital as the clustering variable. Logistic regression, mixed-effects logistic regression, Xgboost, and binary-outcome mixed-effects Xgboost (BME) models were fitted to both standardized and unstandardized datasets across a range of sample sizes, and the estimated prediction power metrics were compared to identify the best approach.
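The temporal split described above can be illustrated with a minimal sketch. The record layout, the example rows, and the exact cut-off date (31 December 2016) are assumptions made for illustration; the abstract only gives the year ranges and cohort sizes.

```python
from datetime import date

# Hypothetical records: (patient_id, hospital_id, surgery_date, died_within_30_days)
records = [
    (1, "H01", date(2012, 1, 1), 0),
    (2, "H01", date(2014, 6, 15), 1),
    (3, "H02", date(2016, 12, 31), 0),
    (4, "H02", date(2017, 1, 1), 0),
    (5, "H03", date(2019, 3, 31), 1),
]

# Assumed cut-off: surgeries up to the end of 2016 go to training/validation.
CUTOFF = date(2016, 12, 31)


def temporal_split(rows, cutoff=CUTOFF):
    """Split rows by surgery date into (train_val, holdout) cohorts."""
    train_val = [r for r in rows if r[2] <= cutoff]
    holdout = [r for r in rows if r[2] > cutoff]
    return train_val, holdout


train_val, holdout = temporal_split(records)

# The cohort sizes reported in the abstract add up to the full cohort.
assert 157_196 + 69_891 == 227_087
```

Splitting by time rather than at random means the holdout cohort contains only surgeries performed after every training case, which is closer to how a risk model would be used prospectively.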

RESULTS

The exploratory analysis found high variability in hospital-related mortality across datasets, which supported the adoption of mixed-effects models. Unstandardized Xgboost BME showed marked improvements in prediction power over the Xgboost model at small sample sizes, but the performance differences decreased as dataset size increased. Generalized linear models (glms) and generalized linear mixed-effects models (glmers) showed a similar pattern, and the Xgboost models also excelled at larger sample sizes.
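As a rough illustration of what between-hospital variability in mortality means, the sketch below computes observed mortality proportions per hospital and their spread. The data are invented and the summary statistic (population standard deviation of raw proportions) is an assumption; the abstract does not say how the authors quantified variability.

```python
from collections import defaultdict
from statistics import pstdev


def hospital_mortality_rates(rows):
    """Return {hospital_id: observed mortality proportion}.

    rows is an iterable of (hospital_id, died) pairs, with died in {0, 1}.
    """
    deaths = defaultdict(int)
    counts = defaultdict(int)
    for hospital_id, died in rows:
        counts[hospital_id] += 1
        deaths[hospital_id] += died
    return {h: deaths[h] / counts[h] for h in counts}


# Hypothetical toy data: three hospitals, four patients each.
rows = [
    ("H01", 0), ("H01", 0), ("H01", 1), ("H01", 0),
    ("H02", 0), ("H02", 0), ("H02", 0), ("H02", 0),
    ("H03", 1), ("H03", 1), ("H03", 0), ("H03", 0),
]

rates = hospital_mortality_rates(rows)
# Spread of the raw hospital-level proportions.
spread = pstdev(list(rates.values()))
```

Raw proportions like these are noisy when a hospital contributes few cases, which is one reason a random-effects model, which pools information across hospitals, may be preferable to treating each hospital separately.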

CONCLUSIONS

These findings suggest that integrating mixed effects into machine learning models can enhance their performance on datasets where the sample size is small.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c9c/11505330/821b4cf11440/bioengineering-11-01039-g001.jpg
