Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.
Heart and Vascular Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.
JAMA Netw Open. 2020 Jan 3;3(1):e1918962. doi: 10.1001/jamanetworkopen.2019.18962.
IMPORTANCE: Accurate risk stratification of patients with heart failure (HF) is critical to deploy targeted interventions aimed at improving patients' quality of life and outcomes.
OBJECTIVE: To compare machine learning approaches with traditional logistic regression in predicting key outcomes in patients with HF and evaluate the added value of augmenting claims-based predictive models with electronic medical record (EMR)-derived information.
DESIGN, SETTING, AND PARTICIPANTS: A prognostic study with a 1-year follow-up period was conducted including 9502 Medicare-enrolled patients with HF from 2 health care provider networks in Boston, Massachusetts ("providers" includes physicians, clinicians, other health care professionals, and their institutions that comprise the networks). The study was performed from January 1, 2007, to December 31, 2014; data were analyzed from January 1 to December 31, 2018.
MAIN OUTCOMES AND MEASURES: All-cause mortality, HF hospitalization, top cost decile, and home days loss greater than 25% were modeled using logistic regression, least absolute shrinkage and selection operator (LASSO) regression, classification and regression trees, random forests, and gradient-boosted modeling (GBM). All models were trained using data from network 1 and tested in network 2. After the best-performing modeling approach was selected based on discrimination, Brier score, and calibration, areas under the precision-recall curve (AUPRCs) and net benefit estimates from decision curves were calculated to compare claims-only predictors with claims + EMR predictors.
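As an illustration of the evaluation measures named above, the following is a minimal sketch of the C statistic (discrimination), the Brier score, and the net benefit at a single threshold probability, using their standard definitions. The function names and the four-patient toy data are hypothetical and are not taken from the study; the abstract does not describe how the authors implemented these calculations.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def c_statistic(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Share of (event, non-event) pairs in which the event has the higher
    predicted probability; ties count as one half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Decision-curve net benefit at one threshold probability:
    TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y_true)
    flagged = [(y, p) for y, p in zip(y_true, y_prob) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


# Hypothetical toy data (NOT study data): 4 patients, 2 with the outcome.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]

print("C statistic:", c_statistic(outcomes, predicted))
print("Brier score:", brier_score(outcomes, predicted))
print("Net benefit at 0.3:", net_benefit(outcomes, predicted, 0.3))
```

A decision curve repeats the net benefit calculation over a range of threshold probabilities, which is how two models can be compared "at various threshold probabilities".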
RESULTS: A total of 9502 patients with HF with a mean (SD) age of 78 (8) years were included: 6113 from network 1 (training set) and 3389 from network 2 (testing set). Gradient-boosted modeling consistently provided the highest discrimination, lowest Brier scores, and good calibration across all 4 outcomes; however, logistic regression had generally similar performance (C statistics for logistic regression based on claims-only predictors: mortality, 0.724; 95% CI, 0.705-0.744; HF hospitalization, 0.707; 95% CI, 0.676-0.737; high cost, 0.734; 95% CI, 0.703-0.764; and home days loss, 0.781; 95% CI, 0.764-0.798; C statistics for GBM: mortality, 0.727; 95% CI, 0.708-0.747; HF hospitalization, 0.745; 95% CI, 0.718-0.772; high cost, 0.733; 95% CI, 0.703-0.763; and home days loss, 0.790; 95% CI, 0.773-0.807). Higher AUPRCs were obtained for claims + EMR vs claims-only GBMs predicting mortality (0.484 vs 0.423), HF hospitalization (0.413 vs 0.403), and home days loss (0.575 vs 0.521) but not cost (0.249 vs 0.252). The net benefit for claims + EMR vs claims-only GBMs was higher at various threshold probabilities for the mortality and home days loss outcomes but similar for the other 2 outcomes.
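To make the "generally similar performance" comparison concrete, the short sketch below tabulates the difference between the reported GBM and logistic regression C statistics for each outcome. The values are copied from the results above; the dictionary keys are illustrative labels, and the sketch assumes the GBM values refer to claims-only models, which the abstract does not state explicitly.

```python
# C statistics as reported in the abstract (point estimates only).
logistic_c = {
    "mortality": 0.724,
    "hf_hospitalization": 0.707,
    "high_cost": 0.734,
    "home_days_loss": 0.781,
}
gbm_c = {
    "mortality": 0.727,
    "hf_hospitalization": 0.745,
    "high_cost": 0.733,
    "home_days_loss": 0.790,
}

# Absolute difference (GBM minus logistic regression), rounded to 3 decimals.
c_differences = {
    outcome: round(gbm_c[outcome] - logistic_c[outcome], 3)
    for outcome in logistic_c
}

for outcome, diff in c_differences.items():
    print(f"{outcome}: {diff:+.3f}")
```

On these point estimates the largest gap is for HF hospitalization (0.038); the other three differences are below 0.01 in absolute value.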
CONCLUSIONS AND RELEVANCE: Machine learning methods offered only limited improvement over traditional logistic regression in predicting key HF outcomes. Inclusion of additional predictors from EMRs to claims-based models appeared to improve prediction for some, but not all, outcomes.