Clinical Psychology and Psychotherapy, Department of Psychology, University of Trier, Trier, Germany.
Clinical and Applied Psychology Unit, Department of Psychology, University of Sheffield, Sheffield, UK.
Psychother Res. 2023 Jul;33(6):683-695. doi: 10.1080/10503307.2022.2161432. Epub 2023 Jan 20.
Dropout from psychological interventions is associated with poor treatment outcomes and with high health, societal and economic costs. Machine learning (ML) algorithms have recently been tested in psychotherapy outcome research, but dropout predictions are usually limited by imbalanced datasets and small sample sizes. This paper aims to improve dropout prediction by comparing ML algorithms, sample sizes and resampling methods. Twenty ML algorithms were examined in twelve subsamples (drawn from a sample of N = 49,602) using four resampling methods, which were compared with each other and with no resampling. Prediction accuracy was evaluated in an independent holdout dataset using the F1-Measure. Resampling methods improved the performance of the ML algorithms. Down-sampling can be recommended, as it was the fastest method and as accurate as the other methods. A minimum sample size of N = 300 was necessary to reach the highest mean F1-Score of .51. No specific algorithm or algorithm group can be recommended. In conclusion, resampling methods could improve the accuracy of predicting dropout in psychological interventions. Down-sampling is recommended as it is the least computationally taxing method, and the training sample should contain at least 300 cases.
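As a rough illustration of the two technical ideas in the abstract, the sketch below shows random down-sampling of the majority class and the F-measure with equal weight on precision and recall (F1). It is not the authors' pipeline: the function names `down_sample` and `f1_score`, the coding of dropout as label 1, and the toy data are all assumptions made for this example.

```python
import random


def down_sample(rows, labels, seed=0):
    """Randomly discard majority-class cases until both classes are equal in size.

    Assumes binary labels coded 1 (dropout) and 0 (completer); this coding is
    an assumption of the sketch, not something stated in the abstract.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority case and an equally sized random subset of the majority.
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy training data: 2 dropouts among 10 hypothetical patients.
toy_rows = [[i] for i in range(10)]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
balanced_rows, balanced_labels = down_sample(toy_rows, toy_labels)
```

Only the training data would be down-sampled in a setup like this; the holdout set used for scoring keeps its original class distribution, in line with the abstract's evaluation on an independent holdout dataset.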