Baee Sonia, Eberle Jeremy W, Baglione Anna N, Spears Tyler, Lewis Elijah, Wang Hongning, Funk Daniel H, Teachman Bethany, Barnes Laura E
Department of Systems and Information Engineering, University of Virginia, Charlottesville, VA, United States.
Department of Psychology, University of Virginia, Charlottesville, VA, United States.
JMIR Ment Health. 2024 Dec 20;11:e51567. doi: 10.2196/51567.
Digital mental health is a promising paradigm for individualized, patient-driven health care. For example, cognitive bias modification programs that target interpretation biases (cognitive bias modification for interpretation [CBM-I]) let people practice, on the web and without a therapist, thinking about ambiguous situations in less threatening ways. However, digital mental health interventions, including CBM-I, are often plagued by a lack of sustained engagement and by high attrition rates. New attrition detection and mitigation strategies are needed to improve these interventions.
This paper aims to identify participants at high risk of dropout during the early stages of 3 web-based trials of multisession CBM-I and to investigate which self-reported and passively detected feature sets, computed from participants' interactions with the intervention and assessments, were most informative in making this prediction.
The participants analyzed in this paper were community adults with traits such as anxiety or negative thinking about the future (Study 1: n=252, Study 2: n=326, Study 3: n=699) who had been assigned to CBM-I conditions in 3 efficacy-effectiveness trials on our team's public research website. To identify participants at high risk of dropout, we created 4 unique feature sets: self-reported baseline user characteristics (eg, demographics), self-reported user context and reactions to the program (eg, state affect), self-reported user clinical functioning (eg, mental health symptoms), and passively detected user behavior on the website (eg, time spent on a web page of CBM-I training exercises, time of day during which the exercises were completed, latency of completing the assessments, and type of device used). Then, using well-known machine learning algorithms, we investigated the feature sets as potential predictors of which participants were at high risk of not starting the second training session of a given program.
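As an illustration of the passively detected feature set described above, the following minimal Python sketch derives the four example features (time on page, time of day, assessment latency, and device type) from timestamps and a user-agent string. The function name, its arguments, the field names, and the device heuristic are all hypothetical assumptions made for this sketch; the abstract does not describe how the features were actually computed.

```python
from datetime import datetime


def passive_features(page_open, page_close, assessment_due, assessment_done, user_agent):
    """Derive illustrative passive features for one participant session.

    All argument and field names are hypothetical; they are not taken
    from the trials' actual logging schema.
    """
    # Time spent on a training page, in seconds.
    time_on_page_s = (page_close - page_open).total_seconds()

    # Hour of day (0-23) at which the exercises were started.
    hour_of_day = page_open.hour

    # Latency of completing an assessment, in hours after it became available.
    latency_h = (assessment_done - assessment_due).total_seconds() / 3600

    # Crude device classification from the user-agent string (an assumption).
    ua = user_agent.lower()
    is_mobile = any(token in ua for token in ("mobile", "android", "iphone"))
    device = "mobile" if is_mobile else "desktop"

    return {
        "time_on_page_s": time_on_page_s,
        "hour_of_day": hour_of_day,
        "assessment_latency_h": latency_h,
        "device": device,
    }


example = passive_features(
    page_open=datetime(2024, 1, 1, 20, 0, 0),
    page_close=datetime(2024, 1, 1, 20, 12, 30),
    assessment_due=datetime(2024, 1, 1, 9, 0, 0),
    assessment_done=datetime(2024, 1, 1, 21, 0, 0),
    user_agent="Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X)",
)
print(example)
```

Features of this kind would then be combined with the self-reported feature sets to form one row per participant for the dropout classifier.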
The extreme gradient boosting algorithm performed the best and identified participants at high risk with macro-F-scores of .832 (Study 1 with 146 features), .770 (Study 2 with 87 features), and .917 (Study 3 with 127 features). Features involving passive detection of user behavior contributed the most to the prediction relative to other features. The mean Gini importance scores for the passive features were as follows: .033 (95% CI .019-.047) in Study 1; .029 (95% CI .023-.035) in Study 2; and .045 (95% CI .039-.051) in Study 3. However, using all features extracted from a given study led to the best predictive performance.
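The macro-F-scores reported above are, by the standard definition, the unweighted mean of the per-class F1 scores, so the dropout and non-dropout classes count equally even when one is much rarer. The sketch below computes that metric in plain Python on made-up labels (1 = did not start the second session, 0 = did). The labels are illustrative only and are not data from the trials.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels for 8 hypothetical participants.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(macro_f1(y_true, y_pred), 3))
```

Here the dropout class has F1 = 2/3 and the non-dropout class has F1 = 0.8, so the macro average is about 0.733.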
These results suggest that using passive indicators of user behavior, alongside self-reported measures, can improve the accuracy of prediction of participants at a high risk of dropout early during multisession CBM-I programs. Furthermore, our analyses highlight the challenge of generalizability in digital health intervention studies and the need for more personalized attrition prevention strategies.