Herrero-Zazo Maria, Fitzgerald Tomas, Taylor Vince, Street Helen, Chaudhry Afzal N, Bradley John R, Birney Ewan, Keevil Victoria L
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK.
Department of Medicine for the Elderly, Addenbrooke's Hospital, Cambridge University Hospitals NHS Foundation Trust, Hills Road, Cambridge CB2 0QQ, UK.
iScience. 2022 Dec 24;26(1):105876. doi: 10.1016/j.isci.2022.105876. eCollection 2023 Jan 20.
Electronic Health Records (EHR) data can provide novel insights into inpatient trajectories. Blood tests and vital signs from de-identified patients' hospital admission episodes (AE) were represented as multivariate time-series (MVTS) and used to train unsupervised Hidden Markov Models (HMM), which assigned each AE day to one of 17 states. All HMM states were clinically interpreted based on their patterns of MVTS variables and their relationships with clinical information. Visualization differentiated patients progressing toward stable states from those remaining at risk of inpatient mortality (IM). Chi-square tests confirmed these relationships (two states were associated with IM; 12 states with ≥1 diagnosis). Logistic Regression and Random Forest (RF) models trained on MVTS data predicted IM better than those trained on HMM states, although results were comparable (best RF model AUC-ROC: MVTS data = 0.85; HMM states = 0.79). Machine learning (ML) models extracted clinically interpretable signals from hospital data. The potential of ML to develop decision-support tools for EHR systems warrants investigation.
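The chi-square association between an HMM state and IM can be illustrated with a minimal sketch. The counts below are invented toy values, not results from the study, and the function name `chi_square_2x2` is hypothetical. The abstract does not say how episodes were counted or whether any correction was applied, so this sketch assumes a plain Pearson chi-square test on a 2x2 table with no continuity correction.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 contingency table.

    table[i][j] holds observed counts, e.g. rows = admission episode
    visited a given HMM state (yes / no), columns = inpatient
    mortality (yes / no). Returns (statistic, p_value) with 1 degree
    of freedom and no continuity correction (an assumption).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of state and outcome
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Toy counts (hypothetical, for illustration only)
toy_table = [
    [10, 20],  # episodes that visited the state: died, survived
    [30, 40],  # episodes that never visited it:  died, survived
]
stat, p = chi_square_2x2(toy_table)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

With 17 states each tested against IM and against diagnoses, some adjustment for multiple comparisons would normally be considered; the abstract does not state which, if any, was used.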