Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX, USA.
School of Behavioral and Brain Sciences, University of Texas at Dallas, Richardson, TX, USA.
Drug Alcohol Depend. 2022 Jul 1;236:109476. doi: 10.1016/j.drugalcdep.2022.109476. Epub 2022 Apr 29.
The prevalence of cannabis use disorder (CUD) has been increasing and is expected to increase further as more jurisdictions legalize cannabis. To help address this public health concern, a model is needed that predicts, for an adolescent or young adult cannabis user, the personalized risk of developing CUD in adulthood. However, no such model has been built using nationally representative longitudinal data.
We use a novel Bayesian learning approach and data from Add Health (n = 8712), a nationally representative longitudinal study, to build logistic regression models using four different regularization priors: lasso, ridge, horseshoe, and t. The models are compared by their prediction performance on unseen data via 5-fold cross-validation (CV). We assess model discrimination using the area under the curve (AUC) and calibration by comparing the expected (E) and observed (O) numbers of CUD cases. We also externally validate the final model on independent test data from Add Health (n = 570).
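The two performance measures named above can be illustrated with a minimal sketch. It assumes only that a fitted model has produced a predicted CUD probability for each person; the function names `auc` and `expected_over_observed` and the toy data are hypothetical and are not taken from the study. AUC is computed here as the fraction of case/non-case pairs in which the case receives the higher predicted risk, and E/O as the sum of predicted probabilities divided by the number of observed cases.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Fraction of (case, non-case) pairs ranked correctly; ties count as 1/2."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_cases = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for p in cases:
        for q in non_cases:
            if p > q:
                score += 1.0
            elif p == q:
                score += 0.5
    return score / (len(cases) * len(non_cases))


def expected_over_observed(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """E/O ratio: expected cases (sum of predicted risks) over observed cases."""
    observed = sum(y_true)
    if observed == 0:
        raise ValueError("no observed cases; E/O is undefined")
    return sum(y_prob) / observed


# Toy data for illustration only (not from Add Health).
y = [1, 0, 1, 0, 0, 1, 0, 0]                       # 1 = developed CUD
p = [0.8, 0.3, 0.6, 0.4, 0.1, 0.35, 0.5, 0.2]      # predicted risks

auc_value = auc(y, p)
eo_value = expected_over_observed(y, p)
print(f"AUC = {auc_value:.3f}, E/O = {eo_value:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. An E/O ratio near 1 means the model predicts about as many cases as were observed; values above 1 indicate overprediction and values below 1 underprediction. In a cross-validated setting, these functions would be applied to the held-out predictions of each fold.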
Our final model uses the lasso prior and has seven predictors: biological sex; scores on the personality traits of neuroticism, openness, and conscientiousness; and measures of adverse childhood experiences, delinquency, and peer cannabis use. It shows good discrimination and calibration, with an AUC of 0.69 and an E/O of 0.95 under 5-fold CV, and an AUC of 0.71 and an E/O of 1.10 on the validation data.
This externally validated model may help in identifying adolescent or young adult cannabis users at high risk of developing CUD in adulthood.