Parchure Prathamesh, Joshi Himanshu, Dharmarajan Kavita, Freeman Robert, Reich David L, Mazumdar Madhu, Timsina Prem, Kia Arash
Institute for Healthcare Delivery Science, Icahn School of Medicine at Mount Sinai, New York, New York, USA.
Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York, USA.
BMJ Support Palliat Care. 2020 Sep 22. doi: 10.1136/bmjspcare-2020-002602.
To develop and validate a model for prediction of near-term in-hospital mortality among patients with COVID-19 by application of a machine learning (ML) algorithm on time-series inpatient data from electronic health records.
The cohort comprised 567 patients with COVID-19 at a large acute care healthcare system between 10 February 2020 and 7 April 2020, each observed until death or discharge. A random forest (RF) model was developed on a randomly drawn 70% of the cohort (the training set), and its performance was evaluated on the remaining 30% (the test set). The outcome variable was in-hospital mortality within 20-84 hours of the time of prediction. Input features included patients' vital signs, laboratory data and ECG results.
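The split and outcome definition described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names `split_patients` and `label_prediction`, the fixed seed, the patient-level split, and the inclusive handling of the 20-84 hour window are all hypothetical choices that the abstract does not specify.

```python
import random
from datetime import datetime

def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Randomly assign whole patients to a training or a test set.

    Assumption: the split is made per patient, so that all time points
    of one patient land in the same set.
    """
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)          # fixed seed only for reproducibility
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

def label_prediction(prediction_time, death_time, lower_h=20, upper_h=84):
    """Return 1 if death falls 20-84 hours after the prediction time, else 0.

    Assumption: both window bounds are inclusive; a patient discharged
    alive is passed as death_time=None and is always labelled 0.
    """
    if death_time is None:
        return 0
    hours = (death_time - prediction_time).total_seconds() / 3600
    return int(lower_h <= hours <= upper_h)

# A cohort of 567 placeholder patient identifiers, split 70/30.
train_ids, test_ids = split_patients(range(567))

# A prediction made 48 hours before death falls inside the window.
example_label = label_prediction(datetime(2020, 3, 1, 8), datetime(2020, 3, 3, 8))
```

Under this sketch, each prediction time point for a patient receives its own label, so one patient can contribute both negative and positive examples as the time of death approaches.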
Patients had a median age of 60.2 years (IQR 26.2 years); 54.1% were men. In-hospital mortality rate was 17.0% and overall median time to death was 6.5 days (range 1.3-23.0 days). In the test set, the RF classifier yielded a sensitivity of 87.8% (95% CI: 78.2% to 94.3%), specificity of 60.6% (95% CI: 55.2% to 65.8%), accuracy of 65.5% (95% CI: 60.7% to 70.0%), area under the receiver operating characteristic curve of 85.5% (95% CI: 80.8% to 90.2%) and area under the precision recall curve of 64.4% (95% CI: 53.5% to 75.3%).
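For readers less familiar with these measures, the following minimal sketch shows how sensitivity, specificity, accuracy and precision follow from the four cells of a confusion matrix. The function name `classification_metrics` is hypothetical and the counts are invented for illustration; they are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # deaths correctly flagged
        "specificity": tn / (tn + fp),            # survivors correctly cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),              # flagged cases that were deaths
    }

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=40, fp=80, tn=120, fn=10)
```

With counts like these, a classifier can combine high sensitivity with modest specificity, and precision stays well below sensitivity because the positive class is the minority. The two area-under-curve measures summarise the same trade-offs across all decision thresholds rather than at a single one.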
Our ML-based approach can be used to analyse electronic health record data and reliably predict near-term mortality. Using such a model in hospitals could help improve care, thereby better aligning clinical decisions with prognosis in critically ill patients with COVID-19.