COVID-19 from symptoms to prediction: A statistical and machine learning approach.

Affiliations

Department of Information System, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia.

School of Built Environment, Engineering, and Computing, Leeds Beckett University, Leeds, LS6 3QR, UK.

Publication information

Comput Biol Med. 2024 Nov;182:109211. doi: 10.1016/j.compbiomed.2024.109211. Epub 2024 Sep 28.

Abstract

During the COVID-19 pandemic, the analysis of patient data has become a cornerstone for developing effective public health strategies. This study uses a dataset of over 10,000 anonymized patient records from various leading medical institutions to predict COVID-19 patient age groups with a suite of statistical and machine learning techniques. First, extensive statistical tests, including ANOVA and t-tests, were used to assess relationships among demographic and symptomatic variables. The study then applied six machine learning models (Decision Tree, Naïve Bayes, K-Nearest Neighbors (KNN), Gradient Boosted Trees, Support Vector Machine, and Random Forest), with rigorous data preprocessing to enhance model accuracy. Further improvements were sought through three ensemble methods: bagging, boosting, and stacking. Our findings indicate strong associations between key symptoms and patient age groups, with ensemble methods significantly enhancing model accuracy. Specifically, stacking with Random Forest as the meta-learner achieved the highest accuracy (0.7054). Stacking also notably improved the performance of K-Nearest Neighbors (from 0.529 to 0.63) and Naïve Bayes (from 0.554 to 0.622), making it the most successful prediction method. The study aimed to understand the number of symptoms identified in COVID-19 patients and their association with different age groups. The results can assist doctors and higher authorities in improving treatment strategies. In addition, decisions made during a pandemic, such as resource allocation, medicine availability, vaccine development, and treatment strategies, can be tailored to specific age groups. Integrating these predictive models into clinical settings could support real-time public health responses and targeted intervention strategies.
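
The stacking setup described in the abstract, with several base classifiers whose predictions feed a Random Forest meta-learner, can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption for illustration: the synthetic data from `make_classification` stands in for the patient records, the three classes stand in for age groups, only three of the six base models are included, and the hyperparameters and the helper name `fit_and_score` are hypothetical. It is not the authors' pipeline and will not reproduce the reported accuracies.

```python
# Illustrative sketch only: synthetic data replaces the patient records, and
# all hyperparameters are assumptions rather than the study's settings.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def fit_and_score(random_state=0):
    """Fit a stacking ensemble and return (fitted model, test accuracy)."""
    # Hypothetical stand-in data: 10 numeric features, 3 classes ("age groups").
    X, y = make_classification(
        n_samples=600,
        n_features=10,
        n_informative=5,
        n_classes=3,
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    # Base learners: a subset of the models named in the abstract.
    base_learners = [
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=random_state)),
        ("nb", GaussianNB()),
        # KNN is distance-based, so features are standardized first.
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ]

    # The meta-learner is trained on out-of-fold predictions of the base
    # learners (5-fold cross-validation), which limits label leakage.
    model = StackingClassifier(
        estimators=base_learners,
        final_estimator=RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
        cv=5,
    )
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


model, accuracy = fit_and_score()
print(f"Stacking test accuracy on synthetic data: {accuracy:.3f}")
```

Comparing this score with the accuracy of each base learner fitted alone on the same split would mirror the before-and-after comparison the abstract reports for KNN and Naïve Bayes.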
