Department of Behavioural Science and Health, University College London, London, UK.
SPECTRUM Consortium, London, UK.
Nicotine Tob Res. 2023 Jun 9;25(7):1330-1339. doi: 10.1093/ntr/ntad051.
Smoking lapses after the quit date often lead to full relapse. To inform the development of real-time, tailored lapse prevention support, we used observational data from a popular smoking cessation app to develop supervised machine learning algorithms to distinguish lapse from non-lapse reports.
We used data from app users with ≥20 unprompted data entries, which included information about craving severity, mood, activity, social context, and lapse incidence. A series of group-level supervised machine learning algorithms (eg, Random Forest, XGBoost) were trained and tested. Their ability to classify lapses for out-of-sample (1) observations and (2) individuals was evaluated. Next, a series of individual-level and hybrid algorithms were trained and tested.
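The two out-of-sample evaluations described above differ only in what is held out: single entries or whole users. The following is a minimal sketch of that distinction, not the study's pipeline; the field names (`user`, `craving`, `lapse`), the 20% test fraction, and the toy data are all assumptions made for illustration.

```python
import random

def split_by_observation(entries, test_fraction=0.2, seed=0):
    """Hold out a random share of entries; a user may appear on both sides."""
    rng = random.Random(seed)
    shuffled = entries[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def split_by_individual(entries, test_fraction=0.2, seed=0):
    """Hold out whole users, so no test user contributes any training entry."""
    rng = random.Random(seed)
    users = sorted({e["user"] for e in entries})
    rng.shuffle(users)
    n_test = max(1, int(len(users) * test_fraction))
    test_users = set(users[:n_test])
    train = [e for e in entries if e["user"] not in test_users]
    test = [e for e in entries if e["user"] in test_users]
    return train, test

# Toy data: 10 hypothetical users with 20 entries each (values are arbitrary).
entries = [
    {"user": u, "craving": (u + i) % 5, "lapse": int((u + i) % 13 == 0)}
    for u in range(10)
    for i in range(20)
]

obs_train, obs_test = split_by_observation(entries)
ind_train, ind_test = split_by_individual(entries)
```

A model scored on `obs_test` has usually already seen other entries from the same users, whereas one scored on `ind_test` has not, which is one plausible reason the two evaluations can give different results.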
Participants (N = 791) provided 37 002 data entries (7.6% lapses). The best-performing group-level algorithm had an area under the receiver operating characteristic curve (AUC) of 0.969 (95% confidence interval [CI] = 0.961 to 0.978). Its ability to classify lapses for out-of-sample individuals ranged from poor to excellent (AUC = 0.482-1.000). Individual-level algorithms could be constructed for 39/791 participants with sufficient data, with a median AUC of 0.938 (range: 0.518-1.000). Hybrid algorithms could be constructed for 184/791 participants and had a median AUC of 0.825 (range: 0.375-1.000).
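For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen lapse entry receives a higher predicted score than a randomly chosen non-lapse entry, so 0.5 corresponds to chance and values below 0.5 to worse-than-chance ranking. The sketch below computes it by direct pairwise comparison; the function name and the example scores are illustrative and are not taken from the study.

```python
def auc(labels, scores):
    """AUC as the share of (lapse, non-lapse) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when only one
    class is present, because the AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three of the four (lapse, non-lapse) pairs are ordered correctly.
mixed = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# A user whose held-out entries contain no lapses cannot be scored.
no_lapses = auc([0, 0, 0], [0.2, 0.1, 0.3])
```

Here `mixed` evaluates to 0.75 and `no_lapses` to `None`. The second case shows why a per-individual AUC needs both lapse and non-lapse entries in the data being scored.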
Using unprompted app data appeared feasible for constructing a high-performing group-level lapse classification algorithm, but its performance was variable when applied to unseen individuals. Algorithms trained on each individual's dataset, in addition to hybrid algorithms trained on the group plus a proportion of each individual's data, had improved performance but could only be constructed for a minority of participants.
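One way to picture the hybrid design is as a training set that combines every other user's entries with an initial share of the target user's own entries, with the remainder held out for testing. The sketch below is written under that assumption; the abstract does not state the proportion used or how entries were ordered, so `proportion=0.2`, the chronological ordering, and the field names are hypothetical.

```python
def hybrid_split(entries, target_user, proportion=0.2):
    """Build a hybrid train/test split for one user (illustrative only).

    Train: all entries from other users plus the first `proportion`
    of the target user's entries. Test: the target user's remaining entries.
    """
    own = [e for e in entries if e["user"] == target_user]
    others = [e for e in entries if e["user"] != target_user]
    n_own_train = int(len(own) * proportion)
    train = others + own[:n_own_train]
    test = own[n_own_train:]
    # The held-out entries must contain both outcomes to be evaluated.
    usable = len({e["lapse"] for e in test}) == 2
    return train, test, usable

# Toy data: user "a" reports one lapse (entry 7), user "b" reports none.
entries = (
    [{"user": "a", "lapse": int(i == 7)} for i in range(10)]
    + [{"user": "b", "lapse": 0} for _ in range(10)]
)

train_a, test_a, usable_a = hybrid_split(entries, "a")
train_b, test_b, usable_b = hybrid_split(entries, "b")
```

In this toy case user "a" can be evaluated, while user "b" cannot because the held-out entries contain no lapses. This illustrates one way such algorithms could end up being constructible for only some participants.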
This study used routinely collected data from a popular smartphone app to train and test a series of supervised machine learning algorithms to distinguish lapse from non-lapse events. Although a high-performing group-level algorithm was developed, it had variable performance when applied to new, unseen individuals. Individual-level and hybrid algorithms had somewhat greater performance but could not be constructed for all participants because of the lack of variability in the outcome measure. Triangulation of results with those from a prompted study design is recommended prior to intervention development, with real-world lapse prediction likely requiring a balance between unprompted and prompted app data.