Comparative analysis of feature selection techniques for COVID-19 dataset.

Affiliations

Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran.

Hearing Sciences, Mental Health and Clinical Neurosciences, School of Medicine, National Institute for Health and Care Research (NIHR) Nottingham Biomedical Research Center, University of Nottingham, Nottingham, UK.

Publication information

Sci Rep. 2024 Aug 11;14(1):18627. doi: 10.1038/s41598-024-69209-6.

DOI:10.1038/s41598-024-69209-6
PMID:39128991
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11317481/

Abstract

In the context of early disease detection, machine learning (ML) has emerged as a vital tool. Feature selection (FS) algorithms play a crucial role in ensuring the accuracy of predictive models by identifying the most influential variables. This study, focusing on a retrospective cohort of 4778 COVID-19 patients from Iran, explores the performance of various FS methods, including filter, embedded, and hybrid approaches, in predicting mortality outcomes. The researchers leveraged 115 routine clinical, laboratory, and demographic features and employed 13 ML models to assess the effectiveness of these FS methods based on classification accuracy, predictive accuracy, and statistical tests. The results indicate that a Hybrid Boruta-VI model combined with the Random Forest algorithm demonstrated superior performance, achieving an accuracy of 0.89, an F1 score of 0.76, and an AUC value of 0.95 on test data. Key variables identified as important predictors of adverse outcomes include age, oxygen saturation levels, albumin levels, neutrophil counts, platelet levels, and markers of kidney function. These findings highlight the potential of advanced FS techniques and ML models in enhancing early disease detection and informing clinical decision-making.
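The abstract reports three test-set metrics (accuracy, F1 score, and AUC). As a reading aid, the following is a minimal sketch of how these quantities are usually computed for a binary outcome such as mortality. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores above a random negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died, 0 = survived; scores are model probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.55, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))   # 0.9375
```

Accuracy and F1 depend on the chosen threshold, while AUC depends only on how the scores rank positive cases against negative ones. This is why a model can show a high AUC alongside a noticeably lower F1 when the outcome classes are imbalanced.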

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/eab64e2b9eaf/41598_2024_69209_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/ce0ec29c4c44/41598_2024_69209_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/b9bd4a7d8d88/41598_2024_69209_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/a464c50d728f/41598_2024_69209_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/e9fd5186a78f/41598_2024_69209_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/e933a40a326d/41598_2024_69209_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/abb3b520991b/41598_2024_69209_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13d5/11317481/fa191fd1bc64/41598_2024_69209_Fig8_HTML.jpg

Similar articles

1
Comparative analysis of feature selection techniques for COVID-19 dataset.
Sci Rep. 2024 Aug 11;14(1):18627. doi: 10.1038/s41598-024-69209-6.
2
Machine learning algorithms for predicting COVID-19 mortality in Ethiopia.
BMC Public Health. 2024 Jun 28;24(1):1728. doi: 10.1186/s12889-024-19196-0.
3
A new hybrid ensemble machine-learning model for severity risk assessment and post-COVID prediction system.
Math Biosci Eng. 2022 Apr 13;19(6):6102-6123. doi: 10.3934/mbe.2022285.
4
Comparing machine learning algorithms for predicting COVID-19 mortality.
BMC Med Inform Decis Mak. 2022 Jan 4;22(1):2. doi: 10.1186/s12911-021-01742-0.
5
Using Automated Machine Learning to Predict the Mortality of Patients With COVID-19: Prediction Model Development Study.
J Med Internet Res. 2021 Feb 26;23(2):e23458. doi: 10.2196/23458.
6
Can Predictive Modeling Tools Identify Patients at High Risk of Prolonged Opioid Use After ACL Reconstruction?
Clin Orthop Relat Res. 2020 Jul;478(7):0-1618. doi: 10.1097/CORR.0000000000001251.
7
Learning From Past Respiratory Infections to Predict COVID-19 Outcomes: Retrospective Study.
J Med Internet Res. 2021 Feb 22;23(2):e23026. doi: 10.2196/23026.
8
Application of machine learning models based on decision trees in classifying the factors affecting mortality of COVID-19 patients in Hamadan, Iran.
BMC Med Inform Decis Mak. 2022 Jul 24;22(1):192. doi: 10.1186/s12911-022-01939-x.
9
The Development and Validation of Simplified Machine Learning Algorithms to Predict Prognosis of Hospitalized Patients With COVID-19: Multicenter, Retrospective Study.
J Med Internet Res. 2022 Jan 21;24(1):e31549. doi: 10.2196/31549.
10
Comparing machine learning algorithms to predict COVID-19 mortality using a dataset including chest computed tomography severity score data.
Sci Rep. 2023 Jul 13;13(1):11343. doi: 10.1038/s41598-023-38133-6.

References cited in this article

1
Extracting relevant predictive variables for COVID-19 severity prognosis: An exhaustive comparison of feature selection techniques.
PLoS One. 2023 Apr 13;18(4):e0284150. doi: 10.1371/journal.pone.0284150. eCollection 2023.
2
A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction.
Front Bioinform. 2022 Jun 27;2:927312. doi: 10.3389/fbinf.2022.927312. eCollection 2022.
3
Handling class imbalance in COVID-19 chest X-ray images classification: Using SMOTE and weighted loss.
Appl Soft Comput. 2022 Nov;129:109588. doi: 10.1016/j.asoc.2022.109588. Epub 2022 Aug 29.
4
Epidemiology of COVID-19 in Tehran, Iran: A Cohort Study of Clinical Profile, Risk Factors, and Outcomes.
Biomed Res Int. 2022 May 10;2022:2350063. doi: 10.1155/2022/2350063. eCollection 2022.
5
The role of blood urea nitrogen to serum albumin ratio in the prediction of severity and 30-day mortality in patients with COVID-19.
Health Sci Rep. 2022 May 6;5(3):e606. doi: 10.1002/hsr2.606. eCollection 2022 May.
6
A new COVID-19 intubation prediction strategy using an intelligent feature selection and K-NN method.
Inform Med Unlocked. 2022;28:100825. doi: 10.1016/j.imu.2021.100825. Epub 2021 Dec 28.
7
COVID-19 early detection for imbalanced or low number of data using a regularized cost-sensitive CapsNet.
Sci Rep. 2021 Sep 16;11(1):18478. doi: 10.1038/s41598-021-97901-4.
8
Risk factors analysis of COVID-19 patients with ARDS and prediction based on machine learning.
Sci Rep. 2021 Feb 3;11(1):2933. doi: 10.1038/s41598-021-82492-x.
9
Predicting mortality risk in patients with COVID-19 using machine learning to help medical decision-making.
Smart Health (Amst). 2021 Apr;20:100178. doi: 10.1016/j.smhl.2020.100178. Epub 2021 Jan 16.
10
The chronic kidney disease and acute kidney injury involvement in COVID-19 pandemic: A systematic review and meta-analysis.
PLoS One. 2021 Jan 5;16(1):e0244779. doi: 10.1371/journal.pone.0244779. eCollection 2021.