Saint John's High School, 378 Main Street, Shrewsbury, MA 01545, United States.
Drug Alcohol Depend. 2020 Mar 1;208:107839. doi: 10.1016/j.drugalcdep.2020.107839. Epub 2020 Jan 15.
Opioid Use Disorder (OUD), defined as a physical or psychological reliance on opioids, is a public health epidemic. Identifying adults likely to develop OUD can help public health officials in planning effective intervention strategies. The aim of this paper is to develop a machine learning approach to predict adults at risk for OUD and to identify interactions between various characteristics that increase this risk.
In this approach, a data set was curated using the responses from the 2016 edition of the National Survey on Drug Use and Health (NSDUH). Using this data set, tree-based classifiers (decision tree and random forest) were trained, while employing downsampling to handle class imbalance. Predictions from the tree-based classifiers were also compared to the results from a logistic regression model. The results from the three classifiers were then interpreted synergistically to highlight individual characteristics and their interplay that pose a risk for OUD.
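The downsampling step mentioned above can be illustrated with a minimal sketch. It assumes downsampling means randomly discarding majority-class records until both classes are the same size; the abstract does not specify the exact procedure. The function name `downsample_majority` and the toy records are hypothetical and are not drawn from NSDUH.

```python
import random
from collections import Counter


def downsample_majority(rows, labels, seed=0):
    """Return a class-balanced subset by randomly dropping majority-class rows.

    Hypothetical helper: the abstract does not describe the authors' exact procedure.
    """
    rng = random.Random(seed)

    # Group the records by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is reduced to the size of the smallest class.
    n_min = min(len(members) for members in by_class.values())

    balanced = []
    for label, members in by_class.items():
        for row in rng.sample(members, n_min):
            balanced.append((row, label))

    # Shuffle so the classes are interleaved rather than grouped.
    rng.shuffle(balanced)
    new_rows = [row for row, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_rows, new_labels


# Toy data (not NSDUH): 1,000 records with 1% positives, mirroring the reported prevalence.
rows = [{"id": i} for i in range(1000)]
labels = [1 if i < 10 else 0 for i in range(1000)]

X_bal, y_bal = downsample_majority(rows, labels, seed=42)
print(Counter(y_bal))
```

A classifier trained on the balanced subset sees positives and negatives equally often, which keeps it from defaulting to the majority class. The cost is that most majority-class records go unused in any single training run.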
Random forest discriminated adults at risk for OUD well, with an average area under the receiver operating characteristic curve (AUC) above 0.89, even though the prevalence of OUD was only about 1%. It showed a slight improvement over logistic regression. Logistic regression identified statistically significant characteristics, while random forest ranked the predictors in order of their contribution to OUD prediction. Early initiation of marijuana (before 18 years) emerged as the dominant predictor. Decision trees revealed that early marijuana initiation especially increased the risk if individuals: (i) were between 18-34 years of age, or (ii) had incomes less than $49,000, or (iii) were of Hispanic and White heritage, or (iv) were on probation, or (v) lived in neighborhoods with easy access to drugs.
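To show why AUC is a sensible metric at roughly 1% prevalence, the sketch below computes it from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and all scores are illustrative assumptions, not values from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Small illustrative example: three of the four positive/negative pairs are ordered correctly.
auc_small = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc_small)  # 0.75

# At 1% prevalence, a model that scores everyone identically is 99% "accurate"
# when it predicts all negatives, yet its AUC is only 0.5 (chance level).
y_rare = [1] + [0] * 99
constant_scores = [0.0] * 100
accuracy_all_negative = sum(1 for y in y_rare if y == 0) / len(y_rare)
auc_constant = roc_auc(y_rare, constant_scores)
print(accuracy_all_negative, auc_constant)  # 0.99 0.5
```

Because AUC depends only on how positives are ranked relative to negatives, it is not inflated by the large share of negatives the way raw accuracy is.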
Machine learning can accurately predict adults at risk for OUD and identify interactions among the factors that heighten this risk. Curbing early initiation of marijuana may be an effective prevention strategy against opioid addiction, especially in high-risk groups.