
Extracting relevant predictive variables for COVID-19 severity prognosis: An exhaustive comparison of feature selection techniques.

Affiliations

Basque Center for Applied Mathematics (BCAM), Bilbao, Basque Country, Spain.

Department of Electronic Technology, University of the Basque Country (UPV/EHU), Leioa, Basque Country, Spain.

Publication information

PLoS One. 2023 Apr 13;18(4):e0284150. doi: 10.1371/journal.pone.0284150. eCollection 2023.

DOI:10.1371/journal.pone.0284150
PMID:37053151
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10101453/
Abstract

With the COVID-19 pandemic having caused unprecedented numbers of infections and deaths, large research efforts have been undertaken to increase our understanding of the disease and of the factors that determine its diverse clinical evolutions. Here we focused on a fully data-driven exploration of which factors (clinical or otherwise) were most informative for predicting SARS-CoV-2 pneumonia severity via machine learning (ML). In particular, feature selection (FS) techniques, designed to reduce the dimensionality of data, allowed us to characterize which of our variables were the most useful for ML prognosis. We conducted a multi-centre clinical study enrolling n = 1548 patients hospitalized due to SARS-CoV-2 pneumonia, of whom 792, 238, and 598 experienced low-, medium- and high-severity evolutions, respectively. Up to 106 patient-specific clinical variables were collected at admission, although 14 of them had to be discarded for containing ⩾60% missing values. Together with 7 socioeconomic attributes and 32 exposures to air pollution (chronic and acute), these became d = 148 features after variable encoding. We addressed this ordinal classification problem both as an ML classification task and as a regression task. Two imputation techniques for missing data were explored, along with a total of 166 unique FS algorithm configurations: 46 filters, 100 wrappers and 20 embedded methods. Of these, 21 setups achieved satisfactory bootstrap stability (⩾0.70) with reasonable computation times: 16 filters, 2 wrappers, and 3 embedded methods. The feature subsets selected by the different techniques showed only modest Jaccard similarity to one another. However, they consistently pointed to the importance of certain explanatory variables, namely: the patient's C-reactive protein (CRP), pneumonia severity index (PSI), respiratory rate (RR) and oxygen levels (saturation SpO2 and the quotients SpO2/RR and arterial SatO2/FiO2), the neutrophil-to-lymphocyte ratio (NLR) and, to a certain extent, the neutrophil and lymphocyte counts separately, lactate dehydrogenase (LDH), and procalcitonin (PCT) levels in blood. A remarkable agreement was found a posteriori between our strategy and independent clinical research investigating risk factors for COVID-19 severity. Hence, these findings stress the suitability of this type of fully data-driven approach for knowledge extraction, as a complement to clinical perspectives.
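The abstract relies on two quantities: the Jaccard similarity between the feature subsets chosen by different techniques, and the bootstrap stability of a single technique. The sketch below illustrates both on synthetic data. It is not the authors' pipeline: the abstract does not say which stability estimator was used, so mean pairwise Jaccard over bootstrap resamples is assumed here as one common choice, and the correlation-based filter, the function names and the synthetic feature names are all hypothetical.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two feature subsets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty subsets are identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(subsets):
    """Average Jaccard similarity over all unordered pairs of subsets."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def top_k_filter(columns, y, k):
    """Toy filter (an assumption): keep the k features with largest |correlation| to y."""
    ranked = sorted(columns, key=lambda name: abs(pearson(columns[name], y)), reverse=True)
    return set(ranked[:k])


def bootstrap_stability(columns, y, k, n_boot=30, seed=0):
    """Run the filter on n_boot resamples (with replacement) and score agreement."""
    rng = random.Random(seed)
    n = len(y)
    subsets = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        cols_b = {name: [vals[i] for i in idx] for name, vals in columns.items()}
        y_b = [y[i] for i in idx]
        subsets.append(top_k_filter(cols_b, y_b, k))
    return mean_pairwise_jaccard(subsets)


def make_synthetic(n=200, seed=1):
    """Synthetic 3-level severity target with 2 informative and 3 pure-noise features."""
    rng = random.Random(seed)
    y = [rng.choice([0, 1, 2]) for _ in range(n)]
    columns = {
        "informative_0": [v + rng.gauss(0, 0.3) for v in y],
        "informative_1": [v + rng.gauss(0, 0.3) for v in y],
        "noise_0": [rng.gauss(0, 1) for _ in range(n)],
        "noise_1": [rng.gauss(0, 1) for _ in range(n)],
        "noise_2": [rng.gauss(0, 1) for _ in range(n)],
    }
    return columns, y


if __name__ == "__main__":
    # Two subsets sharing 2 of 4 distinct features: similarity 2/4.
    print(jaccard({"CRP", "PSI", "LDH"}, {"CRP", "PSI", "NLR"}))  # 0.5
    columns, y = make_synthetic()
    print(bootstrap_stability(columns, y, k=2))
```

Under this definition, a configuration would pass a 0.70 threshold like the one quoted in the abstract only if the subsets it selects on different resamples overlap heavily. A modest Jaccard similarity across techniques, as the abstract reports, is compatible with each technique being stable on its own.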

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10101453/cf0ae782b79f/pone.0284150.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10101453/c0090a7f65b5/pone.0284150.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10101453/5d4a2265fbf7/pone.0284150.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10101453/64b92eabb7ec/pone.0284150.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aaa1/10101453/8c0ee1191408/pone.0284150.g005.jpg


Cited by

1
Comparative analysis of feature selection techniques for COVID-19 dataset.
Sci Rep. 2024 Aug 11;14(1):18627. doi: 10.1038/s41598-024-69209-6.
2
Obtaining patient phenotypes in SARS-CoV-2 pneumonia, and their association with clinical severity and mortality.
Pneumonia (Nathan). 2024 Jun 25;16(1):12. doi: 10.1186/s41479-024-00132-0.
