Lentzen Manuel, Linden Thomas, Veeranki Sai, Madan Sumit, Kramer Diether, Leodolter Werner, Frohlich Holger
IEEE J Biomed Health Inform. 2023 Sep;27(9):4548-4558. doi: 10.1109/JBHI.2023.3288768. Epub 2023 Sep 6.
In situations like the COVID-19 pandemic, healthcare systems come under enormous pressure and can rapidly collapse under the burden of the crisis. Machine learning (ML) based risk models could ease this burden by identifying patients at high risk of severe disease progression. Electronic Health Records (EHRs) are a crucial source of information for developing such models because EHRs capture routinely collected healthcare data. However, EHR data are challenging for training ML models because they contain irregularly timestamped diagnosis, prescription, and procedure codes. Transformer-based models are promising for such data. We extended the previously published Med-BERT model by including age, sex, medications, quantitative clinical measures, and state information. After pre-training on approximately 988 million EHRs from 3.5 million patients, we developed models to predict the risk of Acute Respiratory Manifestations (ARM) using the medical history of 80,211 COVID-19 patients. Compared to Random Forests, XGBoost, and RETAIN, our transformer-based models more accurately forecast the risk of developing ARM after COVID-19 infection. We used Integrated Gradients and Bayesian networks to understand the links between the most important features of our model. Finally, we evaluated adapting our model to Austrian in-patient data. Our study highlights the promise of predictive transformer-based models for precision medicine.
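The abstract names Integrated Gradients as one of the methods used to interpret the model. As a rough illustration of that attribution technique only, the sketch below applies it to a toy logistic risk model rather than to the transformer described in the paper. The weights, the input vector, the all-zero baseline, and the function names are hypothetical and chosen purely for demonstration. The path integral is approximated with a midpoint Riemann sum.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    """Toy risk model (assumption): logistic regression over a feature vector."""
    return sigmoid(np.dot(x, w) + b)

def model_grad(x, w, b):
    """Analytic gradient of the toy model's output with respect to the input x."""
    p = model(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate Integrated Gradients along the straight path baseline -> x."""
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Interpolated inputs, shape (steps, n_features).
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.array([model_grad(p, w, b) for p in path])
    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical weights and input, for illustration only.
w = np.array([0.8, -0.5, 1.2])
b = -0.3
x = np.array([1.0, 2.0, 0.5])
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w, b)
print(attributions)
```

A useful sanity check for any Integrated Gradients implementation is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline.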