Lawrence Livermore National Laboratory, 7000 East Ave, Livermore, CA, 94550, USA.
ProMedica Health System, Inc, 3103 Executive Pkwy, Toledo, OH, 43606, USA.
Sci Rep. 2021 Oct 1;11(1):19543. doi: 10.1038/s41598-021-98071-z.
The combination of machine learning (ML) and electronic health records (EHR) data may improve outcomes of hospitalized COVID-19 patients through better risk stratification and patient outcome prediction. However, in resource-constrained environments the clinical utility of such data-driven predictive tools may be limited by the cost or unavailability of certain laboratory tests. We leveraged EHR data to develop an ML-based tool for predicting adverse outcomes that optimizes clinical utility under a given cost structure. We further gained insight into the decision-making process of the ML models through an explainable AI tool. This cohort study used deidentified EHR data from COVID-19 patients in the ProMedica Health System in northwest Ohio and southeastern Michigan. We tested the performance of various ML approaches for predicting either increasing ventilatory support or mortality. We performed a post hoc analysis to obtain optimal feature sets under various budget constraints. We demonstrate that it is possible to achieve a significant reduction in cost at the expense of a small reduction in predictive performance. For example, when predicting ventilation, a 43% reduction in cost can be achieved with only a 3% reduction in performance. Similarly, when predicting mortality, a 50% reduction in cost can be achieved with only a 1% reduction in performance. This study presents a quick, accurate, and cost-effective method to evaluate the risk of deterioration for patients with SARS-CoV-2 infection at the time of clinical evaluation.
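The idea of choosing a feature set under a budget constraint can be illustrated with a minimal sketch. The abstract does not say how the optimal feature sets were obtained, so the exhaustive search below is an assumption and not the authors' method. The function name `best_subset_under_budget`, the feature names, the per-test costs, and the additive `toy_score` are all hypothetical placeholders; in practice the score would be a validated performance metric of a model trained on the chosen features.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical per-feature acquisition costs (arbitrary units, not from the study).
TOY_COSTS: Dict[str, float] = {"age": 0, "spo2": 1, "crp": 10, "d_dimer": 25}

# Hypothetical performance gain contributed by each feature (made up for illustration).
TOY_GAINS: Dict[str, float] = {"age": 0.10, "spo2": 0.15, "crp": 0.05, "d_dimer": 0.06}


def toy_score(features: FrozenSet[str]) -> float:
    """Stand-in for a model's validation performance on a feature subset."""
    return 0.5 + sum(TOY_GAINS[f] for f in features)


def best_subset_under_budget(
    costs: Dict[str, float],
    score: Callable[[FrozenSet[str]], float],
    budget: float,
) -> Tuple[FrozenSet[str], float]:
    """Return the highest-scoring feature subset whose total cost fits the budget.

    Exhaustive search over all subsets, so only practical for a small
    number of candidate features.
    """
    names = sorted(costs)
    best_set: FrozenSet[str] = frozenset()
    best_score = score(best_set)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            if sum(costs[f] for f in combo) > budget:
                continue  # subset is too expensive, skip it
            candidate = frozenset(combo)
            s = score(candidate)
            if s > best_score:
                best_set, best_score = candidate, s
    return best_set, best_score


# Compare the unconstrained optimum with a tighter budget.
full_set, full_score = best_subset_under_budget(TOY_COSTS, toy_score, budget=36)
cheap_set, cheap_score = best_subset_under_budget(TOY_COSTS, toy_score, budget=11)
```

With these made-up numbers, tightening the budget from 36 to 11 drops the most expensive test while giving up only a small part of the toy score, which is the kind of cost versus performance trade-off the abstract reports. For a realistic number of laboratory tests, a greedy or other heuristic search would likely be needed in place of full enumeration.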