Institute of Information Systems, Leuphana University Lüneburg, Lüneburg, Germany.
Center for Behavioral Health & Technology, University of Virginia School of Medicine, Charlottesville, VA, United States.
J Med Internet Res. 2020 Oct 28;22(10):e17738. doi: 10.2196/17738.
User dropout is a widespread concern in the delivery and evaluation of digital (ie, web and mobile apps) health interventions. Researchers have yet to fully realize the potential of the large amount of data generated by these technology-based programs. Of particular interest is the ability to predict who will drop out of an intervention. This may be possible through the analysis of user journey data (self-reported as well as system-generated data) produced by the path (or journey) an individual takes to navigate through a digital health intervention.
The purpose of this study is to provide a step-by-step process for the analysis of user journey data and eventually to predict dropout in the context of digital health interventions. The process is applied to data from an internet-based intervention for insomnia as a way to illustrate its use. The completion of the program is contingent upon completing 7 sequential cores, which include an initial tutorial core. Dropout is defined as not completing the seventh core.
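The dropout definition above can be sketched as a labeling rule. This is a minimal illustration, not the study's code: the `completions` records, the user IDs, and the function name `is_dropout` are hypothetical, and it assumes cores are numbered 1 to 7 with the tutorial as core 1.

```python
from typing import Dict, Set

# From the text: the program has 7 sequential cores, including the tutorial core.
TOTAL_CORES = 7


def is_dropout(completed_cores: Set[int], total_cores: int = TOTAL_CORES) -> bool:
    """Return True if the final (seventh) core was not completed."""
    return total_cores not in completed_cores


# Hypothetical completion records: user ID -> set of completed core numbers.
completions: Dict[str, Set[int]] = {
    "u1": {1, 2, 3, 4, 5, 6, 7},  # finished the program
    "u2": {1, 2, 3},              # stopped after the third core
    "u3": {1},                    # stopped after the tutorial
}

# Binary target for the prediction task: True means dropout.
labels = {user: is_dropout(cores) for user, cores in completions.items()}
```

Because the cores are sequential, checking only the final core is sufficient under this definition.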
Steps of user journey analysis, including data transformation, feature engineering, and statistical model analysis and evaluation, are presented. Dropouts were predicted based on data from 151 participants from a fully automated web-based program (Sleep Healthy Using the Internet) that delivers cognitive behavioral therapy for insomnia. Logistic regression with L1 and L2 regularization, support vector machines, and boosted decision trees were used and evaluated based on their predictive performance. Relevant features from the data are reported that predict user dropout.
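As a rough illustration of the data transformation and feature engineering steps, the sketch below turns a raw event log into one feature row per user. The event names, timestamps, and the function `build_features` are assumptions made for this example and do not reflect the actual data schema of the program.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (user_id, event_name, ISO 8601 timestamp).
events = [
    ("u1", "core_2_start", "2020-01-01T20:00:00"),
    ("u1", "core_2_end", "2020-01-01T20:45:00"),
    ("u1", "diary_entry", "2020-01-05T08:00:00"),
    ("u2", "core_2_start", "2020-01-02T21:00:00"),
]


def build_features(events, as_of):
    """Aggregate raw events into one feature dictionary per user."""
    by_user = defaultdict(list)
    for user, name, ts in events:
        by_user[user].append((datetime.fromisoformat(ts), name))

    features = {}
    for user, rows in by_user.items():
        rows.sort()  # chronological order
        latest = {name: t for t, name in rows}  # latest timestamp per event name
        last_seen = rows[-1][0]
        start, end = latest.get("core_2_start"), latest.get("core_2_end")
        features[user] = {
            "n_events": len(rows),
            # Whole days between the last logged event and the prediction time.
            "days_since_last_interaction": (as_of - last_seen).days,
            # Minutes spent on the core; None if the core was never finished.
            "core_2_minutes": (
                (end - start).total_seconds() / 60 if start and end else None
            ),
        }
    return features


features = build_features(events, as_of=datetime(2020, 1, 8))
```

Missing values such as an unfinished core are kept as `None` here; how to encode them for a given model is a separate design choice.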
Accuracy of predicting dropout (area under the curve [AUC] values) varied depending on the program core and the machine learning technique. After model evaluation, boosted decision trees achieved AUC values ranging between 0.6 and 0.9. Additional handcrafted features, including time to complete certain steps of the intervention, time to get out of bed, and days since the last interaction with the system, contributed to the prediction performance.
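For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen dropout receives a higher predicted score than a randomly chosen completer. The sketch below computes it directly from that definition; the labels and scores are made-up illustrative values, not results from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative values only: 1 = dropout, 0 = completed the program.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.5, 0.8, 0.2]
auc = roc_auc(y_true, scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported range of 0.6 to 0.9.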
The results support the feasibility and potential of analyzing user journey data to predict dropout. Theory-driven handcrafted features increased the prediction performance. The ability to predict dropout at an individual level could be used to enhance decision making for researchers and clinicians as well as inform dynamic intervention regimens.