PERSIMUNE Center of Excellence, Rigshospitalet, Copenhagen, Denmark.
Department of Hematology, Rigshospitalet, Copenhagen, Denmark.
Sci Rep. 2022 Aug 16;12(1):13879. doi: 10.1038/s41598-022-17953-y.
Interpretable risk assessment of SARS-CoV-2 positive patients can help clinicians implement precision medicine. Here we trained a machine learning model to predict mortality within 12 weeks of a first positive SARS-CoV-2 test. Using data on 33,938 confirmed SARS-CoV-2 cases in eastern Denmark, we considered 2,723 variables extracted from electronic health records (EHR), including demographics, diagnoses, medications, laboratory test results and vital parameters. A discrete-time framework for survival modelling enabled us to predict personalized survival curves and explain individual risk factors. On the test set, the model achieved a weighted concordance index of 0.95 and an area under the precision-recall curve of 0.71. Age, sex, number of medications, previous hospitalizations and lymphocyte counts were identified as top mortality risk factors. Our explainable survival model developed on EHR data also revealed temporal dynamics of the 22 selected risk factors. Upon further validation, this model may allow direct reporting of personalized survival probabilities in routine care.
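To illustrate how a discrete-time framework can yield a personalized survival curve, the following is a minimal sketch and not the authors' implementation. It assumes weekly intervals over the 12-week horizon, a hypothetical person-period encoding (one row per patient per week at risk), and made-up per-week hazard values standing in for the output of some fitted classifier. The function names `expand_person_period` and `survival_curve` are illustrative only.

```python
from itertools import accumulate
from operator import mul


def expand_person_period(followup_week, died, horizon=12):
    """Expand one patient into (week, event) rows, one per week at risk.

    followup_week: 1-indexed week in which the patient died or was censored.
    died: True if follow-up ended in death, False if censored.
    Rows beyond `horizon` are dropped, so a death after the horizon
    contributes only event=0 rows (assumed administrative censoring).
    """
    last_week = min(followup_week, horizon)
    return [
        (week, int(died and week == followup_week))
        for week in range(1, last_week + 1)
    ]


def survival_curve(hazards):
    """Turn per-interval hazards into survival probabilities.

    hazards[k] is P(death in interval k+1 | alive at its start), e.g. the
    predicted probability from a binary classifier trained on person-period
    rows. Survival is the running product S(k) = prod_{j<=k} (1 - h_j).
    """
    if any(not 0.0 <= h <= 1.0 for h in hazards):
        raise ValueError("each hazard must lie in [0, 1]")
    return list(accumulate((1.0 - h for h in hazards), mul))


if __name__ == "__main__":
    # Hypothetical hazards for one patient over four weeks (illustrative values).
    example_hazards = [0.02, 0.05, 0.03, 0.01]
    for week, s in enumerate(survival_curve(example_hazards), start=1):
        print(f"week {week}: survival probability {s:.3f}")
```

Because each interval's hazard is predicted separately, the resulting curve can be reported per patient, and the contribution of each variable can be examined at each time step, which is the property the abstract refers to as temporal dynamics of risk factors.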