Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, United States of America.
Department of Pharmaceutical Outcomes and Policy, University of Florida College of Pharmacy, Gainesville, FL, United States of America.
PLoS One. 2023 Oct 20;18(10):e0292888. doi: 10.1371/journal.pone.0292888. eCollection 2023.
OBJECTIVE: This study aimed to develop and validate predictive models using electronic health record (EHR) data to determine whether hospitalized COVID-19-positive patients would be admitted to alternative medical care or discharged home.
METHODS: We conducted a retrospective cohort study using deidentified data from the University of Florida Health Integrated Data Repository. The study included 1,578 adult patients (≥18 years) who tested positive for COVID-19 while hospitalized, comprising 960 (60.8%) female patients with a mean (SD) age of 51.86 (18.49) years and 618 (39.2%) male patients with a mean (SD) age of 54.35 (18.48) years. The machine learning (ML) models were trained with cross-validation to assess their performance in predicting patient disposition.
RESULTS: We developed and validated six supervised ML-based prediction models (logistic regression, Gaussian Naïve Bayes, k-nearest neighbors, decision tree, random forest, and support vector machine classifier) to predict patient discharge status. The models were evaluated on the area under the receiver operating characteristic curve (ROC-AUC), precision, accuracy, F1 score, and Brier score. The random forest classifier achieved the highest AUC (0.72), with an accuracy of 0.84. Logistic regression (accuracy: 0.85, AUC: 0.71), k-nearest neighbors (accuracy: 0.84, AUC: 0.63), decision tree (accuracy: 0.84, AUC: 0.61), Gaussian Naïve Bayes (accuracy: 0.84, AUC: 0.66), and support vector machine classifier (accuracy: 0.84, AUC: 0.67) also demonstrated valuable predictive capabilities.
SIGNIFICANCE: This study's findings are crucial for efficiently allocating healthcare resources during pandemics like COVID-19. By harnessing ML techniques and EHR data, we can create predictive tools to identify patients at greater risk of severe symptoms based on their medical histories. The models developed here serve as a foundation for expanding the toolkit available to healthcare professionals and organizations. Additionally, explainable ML methods, such as Shapley Additive Explanations, aid in uncovering underlying data features that inform healthcare decision-making processes.
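As a rough illustration of the evaluation metrics named in the abstract (accuracy, precision, F1 score, Brier score, and ROC-AUC), the following minimal sketch computes each of them in plain Python. The labels, the predicted probabilities, the 0.5 decision threshold, and all function names are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positives divided by all predicted positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if tp + fp else 0.0


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positives divided by all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if p + r else 0.0


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative
    (ties count as one half). Assumes both classes are present."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 to turn probabilities into class labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print("accuracy :", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print("Brier    :", brier_score(y_true, y_prob))
print("ROC-AUC  :", roc_auc(y_true, y_prob))
```

Accuracy, precision, and F1 depend on the chosen threshold, whereas the Brier score and ROC-AUC are computed from the predicted probabilities directly, which is one reason the two groups of metrics can rank models differently.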