On the explainability of hospitalization prediction on a large COVID-19 patient dataset.

Affiliations

IBM Research Europe.

IBM GBS Germany.

Publication information

AMIA Annu Symp Proc. 2022 Feb 21;2021:526-535. eCollection 2021.

PMID:35308959

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8861733/

Abstract

We develop several AI models to predict hospitalization in a large cohort (over 110k) of US patients who tested positive for COVID-19, with data collected from March 2020 to February 2021. The models range from Random Forest to Neural Network (NN) and Time Convolutional NN, and the two data modalities (tabular and time-dependent) are combined at different stages (early vs. model fusion). Despite strong class imbalance, the models reach average precision of 0.96-0.98 (0.75-0.85), recall of 0.96-0.98 (0.74-0.85), and F-score of 0.97-0.98 (0.79-0.83) on the non-hospitalized (hospitalized) class. Performance does not drop significantly even when selected lists of features are removed to study how the models adapt to different scenarios. However, a systematic study of the SHAP feature importance values for the developed models across these scenarios shows large variability across models and use cases. This calls for more complete studies of several explainability methods before they are adopted in high-stakes scenarios.
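The abstract reports precision, recall, and F-score separately for the non-hospitalized and the hospitalized class, which matters because overall accuracy can look good on imbalanced data even when the minority class is predicted poorly. The following is a minimal sketch of that per-class computation, not the authors' evaluation code. The function name `per_class_metrics` and the toy labels are assumptions made purely for illustration and are unrelated to the study's data.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, F-score), treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Made-up, imbalanced toy labels (0 = non-hospitalized, 1 = hospitalized).
# These are NOT taken from the paper; they only illustrate the calculation.
y_true = [0] * 18 + [1] * 2
y_pred = [0] * 17 + [1] + [1, 0]

# Overall accuracy hides the gap between the two classes.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

non_hosp = per_class_metrics(y_true, y_pred, label=0)
hosp = per_class_metrics(y_true, y_pred, label=1)

print(f"accuracy: {accuracy:.2f}")
print("non-hospitalized (P, R, F):", tuple(round(v, 3) for v in non_hosp))
print("hospitalized     (P, R, F):", tuple(round(v, 3) for v in hosp))
```

In this toy example the accuracy is 0.90, while the minority (hospitalized) class scores only 0.5 on all three metrics, which is why figures such as those in the abstract are quoted per class.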

