Uncovering Clinical Risk Factors and Predicting Severe COVID-19 Cases Using UK Biobank Data: Machine Learning Approach.

Affiliations

School of Biomedical Sciences, The Chinese University of Hong Kong, Hong Kong, China.

KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and The Chinese University of Hong Kong, Kunming, China.

Publication information

JMIR Public Health Surveill. 2021 Sep 30;7(9):e29544. doi: 10.2196/29544.

DOI:10.2196/29544
PMID:34591027
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8485986/
Abstract

BACKGROUND

COVID-19 is a major public health concern. Given the extent of the pandemic, it is urgent to identify risk factors associated with disease severity. More accurate prediction of those at risk of developing severe infections is of high clinical importance.

OBJECTIVE

Based on the UK Biobank (UKBB), we aimed to build machine learning models to predict the risk of developing severe or fatal infections, and uncover major risk factors involved.

METHODS

We first restricted the analysis to infected individuals (n=7846), then performed analysis at a population level, considering those with no known infection as controls (n=465,728 controls). Hospitalization was used as a proxy for severity. A total of 97 clinical variables (collected prior to the COVID-19 outbreak) covering demographic variables, comorbidities, blood measurements (eg, hematological/liver/renal function/metabolic parameters), anthropometric measures, and other risk factors (eg, smoking/drinking) were included as predictors. We also constructed a simplified (lite) prediction model using 27 covariates that can be more easily obtained (demographic and comorbidity data). XGBoost (gradient-boosted trees) was used for prediction, and predictive performance was assessed by cross-validation. Variable importance was quantified by Shapley values (ShapVal), permutation importance (PermImp), and accuracy gain. Shapley dependency and interaction plots were used to evaluate the pattern of relationships between risk factors and outcomes.
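
To make the Shapley-value ranking above concrete, here is a minimal sketch of an exact Shapley computation for a single individual. Everything in it is an assumption made for illustration: `toy_risk` is a hypothetical three-feature scoring function (not the study's XGBoost model), the coefficients and feature values are invented, and "absent" features are simply set to one reference individual. Tree-based tools typically use faster algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    A feature outside the coalition is replaced by its baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Coalition members take the individual's values; the rest stay at baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features):
    """Hypothetical risk score: additive terms plus one age x WHR interaction."""
    age, n_drugs, whr = features
    return 0.03 * age + 0.10 * n_drugs + 2.0 * whr + 0.01 * age * whr


# Invented values: one individual and one reference individual.
individual = [70, 5, 1.0]   # age, number of drugs taken, waist-to-hip ratio
reference = [55, 2, 0.9]

phi = shapley_values(toy_risk, individual, reference)
for name, contribution in zip(["age", "cnt_tx", "WHR"], phi):
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the individual's score and the reference score, which is the property that makes Shapley values convenient for ranking predictors. A purely additive feature (here the drug count) receives exactly its own term, while the interaction term is shared between age and WHR.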

RESULTS

A total of 2386 severe and 477 fatal cases were identified. For analyses within infected individuals (n=7846), our prediction model achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.723 (95% CI 0.711-0.736) and 0.814 (95% CI 0.791-0.838) for severe and fatal infections, respectively. The top 5 contributing factors (sorted by ShapVal) for severity were age, number of drugs taken (cnt_tx), cystatin C (reflecting renal function), waist-to-hip ratio (WHR), and Townsend deprivation index (TDI). For mortality, the top features were age, testosterone, cnt_tx, waist circumference (WC), and red cell distribution width. For analyses involving the whole UKBB population, AUCs for severity and fatality were 0.696 (95% CI 0.684-0.708) and 0.825 (95% CI 0.802-0.848), respectively. The same top 5 risk factors were identified for both outcomes, namely, age, cnt_tx, WC, WHR, and TDI. Apart from the above, age, cystatin C, TDI, and cnt_tx were among the top 10 across all 4 analyses. Other diseases top ranked by ShapVal or PermImp were type 2 diabetes mellitus (T2DM), coronary artery disease, atrial fibrillation, and dementia, among others. For the "lite" models, predictive performances were broadly similar, with estimated AUCs of 0.716, 0.818, 0.696, and 0.830, respectively. The top ranked variables were similar to above, including age, cnt_tx, WC, sex (male), and T2DM.
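
For readers unfamiliar with how an AUC-ROC and its 95% CI can be obtained, the following is a minimal sketch on synthetic data. The abstract does not state how the reported intervals were derived; the percentile bootstrap used here is an assumption chosen for illustration, and the labels, scores, and function names are all hypothetical.

```python
import random


def auc_roc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as one half. Labels must be 0 or 1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (resampling individuals)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: cases tend to receive higher predicted risk than non-cases.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(120)]
scores = [0.8 * y + rng.gauss(0.0, 1.0) for y in labels]

auc = auc_roc(labels, scores)
ci_low, ci_high = bootstrap_ci(labels, scores)
print(f"AUC-ROC {auc:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

The pairwise definition is quadratic in sample size, which is fine for a sketch but too slow for a cohort of several hundred thousand; a rank-based formula or a library routine would be the usual choice at that scale.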

CONCLUSIONS

We identified numerous baseline clinical risk factors for severe/fatal infection by XGBoost. For example, age, central obesity, impaired renal function, multiple comorbidities, and cardiometabolic abnormalities may predispose to poorer outcomes. The prediction models may be useful at a population level to identify those susceptible to developing severe/fatal infections, facilitating targeted prevention strategies. A risk-prediction tool is also available online. Further replications in independent cohorts are required to verify our findings.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac26/8485986/ab345e3b5416/publichealth_v7i9e29544_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac26/8485986/6724637f0c57/publichealth_v7i9e29544_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac26/8485986/b5130b722afe/publichealth_v7i9e29544_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac26/8485986/48e4d9a2961d/publichealth_v7i9e29544_fig4.jpg
