Bowers Anne, Drake Chelsea, Makarkin Alexi E, Monzyk Robert, Maity Biswajit, Telle Andrew
Evernorth Health, Inc, St. Louis, MO, United States.
JMIR AI. 2023 Feb 20;2:e42253. doi: 10.2196/42253.
Machine learning (ML) can offer greater precision and sensitivity than provider recommendations alone in predicting Medicare patient end of life and potential need for palliative services. However, earlier ML research on older community-dwelling Medicare beneficiaries has provided insufficient exploration of key model feature impacts and the role of the social determinants of health.
This study describes the development of a binary classification ML model predicting 1-year mortality among Medicare Advantage plan members aged ≥65 years (N=318,774) and further examines the top features of the predictive model.
A light gradient-boosted trees model configuration was selected based on 5-fold cross-validation. The model was trained with 80% of cases (n=255,020) using randomized feature generation periods, with 20% (n=63,754) reserved as a holdout for validation. The final algorithm used 907 feature inputs extracted primarily from claims and administrative data capturing patient diagnoses, service utilization, demographics, and census tract-based social determinants index measures.
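The reported sample sizes can be checked with a short sketch of an 80/20 split followed by a 5-fold partition of the training portion. This is only an illustration under stated assumptions: the abstract does not say how members were assigned to the holdout (for example, whether the split was stratified), so the simple random shuffle, the seed, the rounding-down of the holdout size, and the helper names `split_cohort` and `make_folds` are all hypothetical. The randomized feature generation periods are not modeled here.

```python
import random

N_TOTAL = 318_774      # cohort size reported in the abstract
HOLDOUT_DIVISOR = 5    # a 20% holdout is one fifth of the cohort
N_FOLDS = 5            # 5-fold cross-validation, as reported


def split_cohort(n_total, seed=0):
    """Shuffle member indices and reserve one fifth (rounded down) as holdout.

    Assumption: plain random assignment; the seed is arbitrary.
    """
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    n_holdout = n_total // HOLDOUT_DIVISOR
    return indices[n_holdout:], indices[:n_holdout]


def make_folds(train_indices, n_folds=N_FOLDS):
    """Partition the training indices into n_folds near-equal, disjoint folds."""
    return [train_indices[i::n_folds] for i in range(n_folds)]


train, holdout = split_cohort(N_TOTAL)
folds = make_folds(train)

print(len(train), len(holdout))    # sizes of the training and holdout sets
print([len(f) for f in folds])     # size of each cross-validation fold
```

Under these assumptions the split yields 255,020 training cases and 63,754 holdout cases, matching the counts in the abstract, with each cross-validation fold holding 51,004 training cases.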
The total sample had an actual mortality prevalence of 3.9% in the 2018 outcome period. The final model correctly predicted 44.2% of patient deaths among the top 1% of highest-risk members (area under the curve [AUC]=0.84; 95% CI 0.83-0.85) versus 24.0% predicted by the model iteration using only age, gender, and select high-risk utilization features (AUC=0.74; 95% CI 0.73-0.74). The most important algorithm features included patient demographics, diagnoses, pharmacy utilization, mean costs, and certain social determinants of health.
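The two kinds of metric reported here can be illustrated with a minimal sketch. It assumes the 44.2% figure is the share of members in the top 1% of predicted risk who died; with a 3.9% prevalence, the top 1% cannot contain 44.2% of all deaths, because that would be about 1.7% of the sample. The function names and the toy data are hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) formula rather than any method confirmed by the authors.

```python
def precision_at_top_fraction(labels, scores, fraction):
    """Share of true positives among the highest-scoring `fraction` of cases."""
    k = max(1, int(len(scores) * fraction))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in top) / k


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every case tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data (hypothetical): 1 = died within the outcome period, 0 = survived.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

print(precision_at_top_fraction(toy_labels, toy_scores, 0.5))  # 0.5
print(roc_auc(toy_labels, toy_scores))                         # 0.75
```

Read this way, a 44.2% death rate in the top 1% against a 3.9% baseline corresponds to a lift of roughly 11, compared with roughly 6 for the reduced model at 24.0%.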
The final ML model better predicts Medicare Advantage member end of life using a variety of routinely collected data and supports earlier patient identification for palliative care.