Luo Gang, Nau Claudia L, Crawford William W, Schatz Michael, Zeiger Robert S, Rozema Emily, Koebnick Corinna
Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States.
Department of Research & Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States.
JMIR Med Inform. 2020 Nov 9;8(11):e22689. doi: 10.2196/22689.
Asthma causes numerous hospital encounters annually, including emergency department visits and hospitalizations. To improve patient outcomes and reduce the number of these encounters, predictive models are widely used to prospectively identify high-risk patients with asthma for preventive care via care management. However, previous models are not accurate enough to serve this purpose well. Following our modeling guideline of checking extensive candidate features, we recently built a machine learning model on Intermountain Healthcare data to predict asthma-related hospital encounters in patients with asthma. Although this model is more accurate than previous models, whether our modeling guideline generalizes to other health care systems remains unknown.
This study aims to assess the generalizability of our modeling guideline to Kaiser Permanente Southern California (KPSC).
The patient cohort included a random sample of 70.00% (397,858/568,369) of patients with asthma who were enrolled in a KPSC health plan for any duration between 2015 and 2018. We produced a machine learning model via a secondary analysis of 987,506 KPSC data instances from 2012 to 2017, checking 337 candidate features, to predict asthma-related hospital encounters in the following 12-month period in patients with asthma.
Our model reached an area under the receiver operating characteristic curve of 0.820. When the cutoff point for binary classification was placed at the top 10.00% (20,474/204,744) of patients with asthma having the largest predicted risk, our model achieved an accuracy of 90.08% (184,435/204,744), a sensitivity of 51.90% (2259/4353), and a specificity of 90.91% (182,176/200,391).
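The reported fractions are enough to rebuild the confusion matrix at the top-10.00% cutoff and to recompute each metric. The following is a minimal sketch of that arithmetic, not the authors' code; the function name metrics_from_counts and the variable names are illustrative assumptions, and the only inputs are the counts quoted above.

```python
def metrics_from_counts(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "total": total,
        "flagged": tp + fp,                # patients above the risk cutoff
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),     # true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
    }


# Counts read off the reported fractions (hypothetical variable names):
# sensitivity 2259/4353 and specificity 182,176/200,391.
tp = 2259                  # high-risk patients correctly flagged
fn = 4353 - 2259           # high-risk patients missed
tn = 182176                # low-risk patients correctly not flagged
fp = 200391 - 182176       # low-risk patients flagged

m = metrics_from_counts(tp, fn, tn, fp)
for name in ("accuracy", "sensitivity", "specificity"):
    print(f"{name}: {100 * m[name]:.2f}%")
print(f"flagged: {m['flagged']} of {m['total']}")
```

The flagged count recovered this way (true positives plus false positives) matches the 20,474 patients in the top 10.00%, which confirms that the three reported fractions are mutually consistent.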
Our modeling guideline exhibited acceptable generalizability to KPSC and resulted in a model that is more accurate than those formerly built by others. After further enhancement, our model could be used to guide asthma care management.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/resprot.5039.