Predicting COVID-19 mortality risk in Toronto, Canada: a comparison of tree-based and regression-based machine learning methods.

Author information

Department of Community Health and Epidemiology, Faculty of Medicine, Dalhousie University, 5790 University Avenue, Halifax, B3H 1V7, NS, Canada.

Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado, 80045, USA.

Publication information

BMC Med Res Methodol. 2021 Nov 27;21(1):267. doi: 10.1186/s12874-021-01441-4.

DOI: 10.1186/s12874-021-01441-4

PMID: 34837951

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8627169/

Abstract

BACKGROUND

Coronavirus disease (COVID-19) presents an unprecedented threat to global health. Accurately predicting the mortality risk among infected individuals is crucial for prioritizing medical care and mitigating the burden on the healthcare system. The present study aimed to assess the accuracy of machine learning methods in predicting COVID-19 mortality risk.

METHODS

We compared the performance of classification tree, random forest (RF), extreme gradient boosting (XGBoost), logistic regression, generalized additive model (GAM) and linear discriminant analysis (LDA) to predict the mortality risk among 49,216 COVID-19 positive cases in Toronto, Canada, reported from March 1 to December 10, 2020. We used repeated split-sample validation and k-steps-ahead forecasting validation. Predictive models were estimated using training samples, and predictive accuracy of the methods for the testing samples was assessed using the area under the receiver operating characteristic curve, Brier's score, calibration intercept and calibration slope.
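The four accuracy measures named above can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names are hypothetical, and the calibration intercept and slope are obtained here by jointly fitting a logistic model of the outcome on the logit of the predicted risk, which is one common convention (the abstract does not state which convention the study used). The Newton-Raphson fit assumes the outcomes are not perfectly separated by the predictions.

```python
import math

def auc(y_true, y_prob):
    """AUC as the probability that a random death is ranked above a random survivor."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # Mann-Whitney formulation; tied scores count as half a concordant pair.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def logit(p, eps=1e-12):
    """Log-odds of p, clipped away from 0 and 1 to keep the logarithm finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def calibration(y_true, y_prob, max_iter=100, tol=1e-10):
    """Fit logit P(y = 1) = a + b * logit(p) by Newton-Raphson; return (a, b).

    Assumption: intercept a and slope b are estimated jointly.
    """
    x = [logit(p) for p in y_prob]
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(max_iter):
        mu = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        w = [m * (1.0 - m) for m in mu]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(y - m for y, m in zip(y_true, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(y_true, mu, x))
        # Entries of the 2x2 information matrix.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Toy data (made up for illustration, not from the study).
y_toy = [0, 0, 1, 1]
p_toy = [0.1, 0.4, 0.35, 0.8]
print("AUC:", auc(y_toy, p_toy))
print("Brier score:", brier_score(y_toy, p_toy))
```

An intercept near 0 and a slope near 1 indicate well-calibrated predictions; a slope below 1 suggests the predicted risks are too extreme, and a slope above 1 suggests they are not extreme enough.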

RESULTS

XGBoost was highly discriminative, with an AUC of 0.9669, and outperformed conventional tree-based methods (classification tree and RF) in predicting COVID-19 mortality risk. Regression-based methods (logistic, GAM and LASSO) performed comparably to XGBoost, with slightly lower AUCs and higher Brier's scores.

CONCLUSIONS

XGBoost offers superior performance over conventional tree-based methods and a minor improvement over regression-based methods for predicting COVID-19 mortality risk in the study population.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6245/8627629/451a4ad37f8a/12874_2021_1441_Fig1_HTML.jpg
