Predicting the risk of emergency admission with machine learning: Development and validation using linked electronic health records.

Affiliations

Deep Medicine, Oxford Martin School, Oxford, United Kingdom.

The George Institute for Global Health, University of Oxford, Oxford, United Kingdom.

Publication information

PLoS Med. 2018 Nov 20;15(11):e1002695. doi: 10.1371/journal.pmed.1002695. eCollection 2018 Nov.

DOI:10.1371/journal.pmed.1002695
PMID:30458006
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6245681/
Abstract

BACKGROUND

Emergency admissions are a major source of healthcare spending. We aimed to derive, validate, and compare conventional and machine learning models for prediction of the first emergency admission. Machine learning methods are capable of capturing complex interactions that are likely to be present when predicting less specific outcomes, such as this one.

METHODS AND FINDINGS

We used longitudinal data from linked electronic health records of 4.6 million patients aged 18-100 years from 389 practices across England between 1985 and 2015. The population was divided into a derivation cohort (80%, 3.75 million patients from 300 general practices) and a validation cohort (20%, 0.88 million patients from 89 general practices) from geographically distinct regions with different risk levels. We first replicated a previously reported Cox proportional hazards (CPH) model for prediction of the risk of the first emergency admission up to 24 months after baseline. This reference model was then compared with 2 machine learning models, random forest (RF) and gradient boosting classifier (GBC). The initial set of predictors for all models included 43 variables, including patient demographics, lifestyle factors, laboratory tests, currently prescribed medications, selected morbidities, and previous emergency admissions. We then added 13 more variables (marital status, prior general practice visits, and 11 additional morbidities), and also enriched all variables by incorporating temporal information whenever possible (e.g., time since first diagnosis). We also varied the prediction windows to 12, 36, 48, and 60 months after baseline and compared model performances. For internal validation, we used 5-fold cross-validation. When the initial set of variables was used, GBC outperformed RF and CPH, with an area under the receiver operating characteristic curve (AUC) of 0.779 (95% CI 0.777, 0.781), compared to 0.752 (95% CI 0.751, 0.753) and 0.740 (95% CI 0.739, 0.741), respectively. In external validation, we observed an AUC of 0.796, 0.736, and 0.736 for GBC, RF, and CPH, respectively. The addition of temporal information improved AUC across all models.
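To make the idea of temporal enrichment concrete, the following is a minimal sketch of how a binary morbidity flag might be turned into a "time since first diagnosis" feature. The abstract does not describe the authors' feature pipeline, so the function name, the record layout, and the example dates are all hypothetical and chosen only for illustration.

```python
from datetime import date


def years_since_first_diagnosis(diagnosis_dates, baseline):
    """Years from the earliest diagnosis on or before baseline to baseline.

    Returns None when no diagnosis is recorded by the baseline date.
    (Hypothetical helper, not taken from the paper.)
    """
    prior = [d for d in diagnosis_dates if d <= baseline]
    if not prior:
        return None
    return (baseline - min(prior)).days / 365.25


# Hypothetical patient: dates on which a condition was coded in the record.
baseline = date(2010, 1, 1)
diabetes_codes = [date(2004, 1, 1), date(2007, 6, 15), date(2011, 3, 2)]

# Conventional predictor: has the condition been recorded by baseline?
binary_feature = any(d <= baseline for d in diabetes_codes)

# Temporally enriched predictor: how long has the condition been present?
temporal_feature = years_since_first_diagnosis(diabetes_codes, baseline)

print(binary_feature, round(temporal_feature, 2))
```

Codes dated after the baseline are ignored so that the feature uses only information available at prediction time.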
In internal validation, the AUC rose to 0.848 (95% CI 0.847, 0.849), 0.825 (95% CI 0.824, 0.826), and 0.805 (95% CI 0.804, 0.806) for GBC, RF, and CPH, respectively, while the AUC in external validation rose to 0.826, 0.810, and 0.788, respectively. This enhancement also resulted in robust predictions for longer time horizons, with AUC values remaining at similar levels across all models. Overall, compared to the baseline reference CPH model, the final GBC model showed a 10.8% higher AUC (0.848 compared to 0.740) for prediction of risk of emergency admission within 24 months. GBC also showed the best calibration throughout the risk spectrum. Despite the wide range of variables included in models, our study was still limited by the number of variables included; inclusion of more variables could have further improved model performances.
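As a quick arithmetic check on the headline comparison, the sketch below computes the gain from 0.740 to 0.848 in two ways. The reported 10.8% matches the absolute difference in AUC (0.108, or 10.8 percentage points); expressed relative to the baseline of 0.740, the same gain would be about 14.6%. The variable names are illustrative, and only the two AUC values come from the abstract.

```python
# AUC values reported in the abstract (internal validation, 24-month window).
auc_cph_baseline = 0.740  # reference CPH model, initial 43 variables
auc_gbc_final = 0.848     # GBC with added variables and temporal information

# Absolute gain: difference on the AUC scale.
absolute_gain = auc_gbc_final - auc_cph_baseline

# Relative gain: difference as a fraction of the baseline AUC.
relative_gain = absolute_gain / auc_cph_baseline

print(f"absolute gain: {absolute_gain:.3f} ({absolute_gain * 100:.1f} percentage points)")
print(f"relative gain: {relative_gain * 100:.1f}%")
```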

CONCLUSIONS

The use of machine learning and addition of temporal information led to substantially improved discrimination and calibration for predicting the risk of emergency admission. Model performance remained stable across a range of prediction time windows and when externally validated. These findings support the potential of incorporating machine learning models into electronic health records to inform care and service planning.
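Discrimination and calibration measure different things, and a small worked example may help separate them. The sketch below computes AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one, and builds a simple calibration table by comparing mean predicted risk with the observed event rate in equal-sized risk bins. This is a generic illustration on invented toy data, not the evaluation code used in the study; the function names and the choice of two bins are assumptions.

```python
def auc(labels, scores):
    """AUC as P(score of random positive > score of random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_table(labels, scores, n_bins=2):
    """Return (mean predicted risk, observed event rate) for each risk-ordered bin."""
    ranked = sorted(zip(scores, labels))
    size = len(ranked) // n_bins
    if size == 0:
        raise ValueError("more bins than observations")
    rows = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any remainder.
        end = len(ranked) if b == n_bins - 1 else start + size
        chunk = ranked[start:end]
        mean_pred = sum(s for s, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed))
    return rows


# Toy data, invented for illustration only (1 = emergency admission).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print("AUC:", auc(labels, scores))
for mean_pred, observed in calibration_table(labels, scores):
    print(f"predicted {mean_pred:.2f} vs observed {observed:.2f}")
```

In this toy case the predicted and observed rates agree in both bins while the AUC is well below 1, which shows that good calibration does not by itself imply strong discrimination.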

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7251/6245681/16a74860eb4a/pmed.1002695.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7251/6245681/abe191d6338a/pmed.1002695.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7251/6245681/867c4d7192b6/pmed.1002695.g002.jpg
