Comparing machine learning algorithms to predict COVID‑19 mortality using a dataset including chest computed tomography severity score data.

Affiliations

Department of Medical Physics, Ilam University of Medical Sciences, Ilam, Iran.

Department of Midwifery, Ilam University of Medical Sciences, Ilam, Iran.

Publication information

Sci Rep. 2023 Jul 13;13(1):11343. doi: 10.1038/s41598-023-38133-6.

Abstract

Since the beginning of the COVID-19 pandemic, new non-invasive digital technologies such as artificial intelligence (AI) have been introduced to predict mortality in COVID-19 patients. The prognostic performance of machine learning (ML)-based models for predicting clinical outcomes of COVID-19 patients has mainly been evaluated using demographics, risk factors, clinical manifestations, and laboratory results. Little is known about the prognostic role of imaging manifestations in combination with demographic, clinical, and laboratory predictors. The purpose of the present study is to develop an efficient ML prognostic model based on a more comprehensive dataset that includes the chest CT severity score (CT-SS). Fifty-five primary features in six main classes were retrospectively reviewed for 6854 suspected cases. The chi-square test of independence was used to determine the most important features for predicting mortality in COVID-19 patients. The most relevant predictors were used to train and test the ML algorithms. The predictive models were developed using eight ML algorithms: the J48 decision tree (J48), support vector machine (SVM), multi-layer perceptron (MLP), k-nearest neighbors (k-NN), Naïve Bayes (NB), logistic regression (LR), random forest (RF), and eXtreme gradient boosting (XGBoost). The performance of the predictive models was evaluated using accuracy, precision, sensitivity, specificity, and area under the ROC curve (AUC). After the exclusion criteria were applied, the final sample comprised 815 RT-PCR-positive patients, of whom 54.85% were male; the mean age of the study population was 57.22 ± 16.76 years. The RF algorithm performed best, with an accuracy of 97.2%, a sensitivity of 100%, a precision of 94.8%, a specificity of 94.5%, an F1-score of 97.3%, and an AUC of 99.9%. The other ML algorithms, with AUCs ranging from 81.2% to 93.9%, also performed well in predicting COVID-19 mortality. The results showed that timely and accurate risk stratification of COVID-19 patients could be performed using ML-based predictive models fed by routine data. The proposed algorithm, trained on the more comprehensive dataset including CT-SS, could efficiently predict the mortality of COVID-19 patients. This could allow high-risk patients to be targeted promptly on admission, hospital resources to be used optimally, and patients' probability of survival to be increased.
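
The abstract names two computations that are easy to illustrate: the chi-square test of independence used for feature selection, and the confusion-matrix metrics used for evaluation. The following is a minimal sketch of both in plain Python, not the authors' pipeline. The function names and all counts are hypothetical and are not data from the study. The sketch also assumes a binary feature (a 2x2 table), no continuity correction, and a 0.05 significance level, none of which is specified in the abstract.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 contingency table [[a, b], [c, d]], e.g. rows = feature present
    or absent, columns = died or survived."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / denom


def classification_metrics(tp, fp, tn, fn):
    """Evaluation metrics derived from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on the positive (deceased) class
    specificity = tn / (tn + fp)   # recall on the negative (survived) class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-square critical value, 1 degree of freedom

statistic = chi_square_2x2(30, 10, 20, 40)
feature_is_relevant = statistic > CRITICAL_VALUE_DF1_ALPHA_05

metrics = classification_metrics(tp=50, fp=5, tn=45, fn=0)
```

A feature whose statistic exceeds the critical value would be kept as a candidate predictor under these assumptions. AUC is omitted here because it requires ranked prediction scores rather than the four confusion-matrix counts.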


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0387/10345104/18c1953316e6/41598_2023_38133_Fig1_HTML.jpg
