Lo-Ciganic Wei-Hsuan, Donohue Julie M, Hulsey Eric G, Barnes Susan, Li Yuan, Kuza Courtney C, Yang Qingnan, Buchanich Jeanine, Huang James L, Mair Christina, Wilson Debbie L, Gellad Walid F
Department of Pharmaceutical Outcomes & Policy, College of Pharmacy, University of Florida, Gainesville, FL, United States of America.
Center for Drug Evaluation and Safety (CoDES), University of Florida, Gainesville, FL, United States of America.
PLoS One. 2021 Mar 18;16(3):e0248360. doi: 10.1371/journal.pone.0248360. eCollection 2021.
Health system data incompletely capture the social risk factors for drug overdose. This study aimed to improve the accuracy of a machine-learning algorithm to predict opioid overdose risk by integrating human services and criminal justice data with health claims data to capture the social determinants of overdose risk. This prognostic study included Medicaid beneficiaries (n = 237,259) in Allegheny County, Pennsylvania enrolled between 2015 and 2018, randomly divided into training, testing, and validation samples. We measured 290 potential predictors (239 derived from Medicaid claims data) in 30-day periods, beginning with the first observed Medicaid enrollment date during the study period. Using a gradient boosting machine, we predicted a composite outcome (i.e., fatal or nonfatal opioid overdose constructed using medical examiner and claims data) in the subsequent month. We compared the prediction performance of a Medicaid claims-only model with that of a model integrating human services and criminal justice data with Medicaid claims (i.e., the integrated model) using several metrics (e.g., C-statistic, number needed to evaluate [NNE] to identify one overdose). Beneficiaries were stratified into risk-score decile subgroups. The samples (training = 79,087, testing = 79,086, validation = 79,086) had similar characteristics (age = 38±18 years, female = 56%, white = 48%, at least one overdose during the study period = 1.7%). Using the validation sample, the integrated model slightly improved on the Medicaid claims-only model (C-statistic = 0.885; 95% CI = 0.877-0.892 vs. C-statistic = 0.871; 95% CI = 0.863-0.878), with small corresponding improvements in the NNE and positive predictive value. Nine of the top 30 most important predictors in the integrated model were human services and criminal justice variables. Using the integrated model, approximately 70% of individuals with overdoses were members of the top risk decile (overdose rate in the subsequent month = 47/10,000 beneficiaries). Few individuals in the bottom 9 deciles had overdose episodes (0-12/10,000). Machine-learning algorithms integrating claims with social service and criminal justice data modestly improved opioid overdose prediction among Medicaid beneficiaries in a large U.S. county heavily affected by the opioid crisis.
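The evaluation metrics named in the abstract (C-statistic, NNE, and overdose rates by risk-score decile) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy scores and labels, and the choice to define NNE as the number of flagged individuals per true overdose at a given score threshold (the reciprocal of positive predictive value) are assumptions made here for illustration.

```python
from typing import List, Sequence, Tuple


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores so they share an average rank.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U for the cases, scaled to the 0-1 range.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def number_needed_to_evaluate(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> float:
    """Flagged individuals per true case at the threshold, i.e. 1 / PPV."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    true_pos = sum(flagged)
    if true_pos == 0:
        return float("inf")  # nobody flagged is a case
    return len(flagged) / true_pos


def decile_rates(
    scores: Sequence[float], labels: Sequence[int], per: int = 10_000
) -> List[Tuple[int, int, float]]:
    """Return (decile, group size, events per `per`); decile 1 is the highest risk."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    out = []
    for d in range(10):
        members = order[d * n // 10:(d + 1) * n // 10]
        if not members:
            continue  # fewer than 10 observations can leave a decile empty
        events = sum(labels[i] for i in members)
        out.append((d + 1, len(members), per * events / len(members)))
    return out


# Synthetic toy data (hypothetical, not from the study): 20 people, 3 events.
toy_scores = [i / 20 for i in range(20)]
toy_labels = [0] * 20
for idx in (10, 18, 19):
    toy_labels[idx] = 1
```

Calling `decile_rates(toy_scores, toy_labels)` on the toy data places two of the three events in decile 1, which mirrors how the abstract reports the share of overdoses captured by the top risk decile. As illustrative arithmetic only (not a figure reported in the abstract): if every member of a group with 47 overdoses per 10,000 were flagged, the positive predictive value would be 0.0047 and the NNE about 213.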