Zantvoort Kirsten, Nacke Barbara, Görlich Dennis, Hornstein Silvan, Jacobi Corinna, Funk Burkhardt
Institute of Information Systems, Leuphana University, Lüneburg, Germany.
Department of Clinical Psychology and Psychotherapy, Faculty of Psychology, Technische Universität Dresden, Dresden, Germany.
NPJ Digit Med. 2024 Dec 18;7(1):361. doi: 10.1038/s41746-024-01360-w.
Artificial intelligence promises to revolutionize mental health care, but small dataset sizes and a lack of robust methods raise concerns about the generalizability of results. To provide insights into the minimum necessary dataset size, we explore domain-specific learning curves for digital intervention dropout predictions based on 3654 users from a single study (ISRCTN13716228, 26/02/2016). Prediction performance is analyzed as a function of dataset size (N = 100-3654), feature groups (F = 2-129), and algorithm choice (from Naive Bayes to Neural Networks). The results substantiate the concern that small datasets (N ≤ 300) overestimate predictive power. For uninformative feature groups, in-sample prediction performance was negatively correlated with dataset size. Sophisticated models overfitted in small datasets but maximized holdout test results in larger datasets. While N = 500 mitigated overfitting, performance did not converge until N = 750-1500. Consequently, we propose minimum dataset sizes of N = 500-1000. As such, this study offers an empirical reference for researchers designing or interpreting AI studies on Digital Mental Health Intervention data.
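The learning-curve idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic noise (labels independent of the features, standing in for an "uninformative feature group"), the nearest-centroid classifier is a stand-in for the algorithms compared in the study, and every name and parameter (`make_noise_data`, `fit_centroids`, `learning_curve`, the sizes, `n_features=20`) is a hypothetical choice for illustration. The sketch only shows the mechanism by which in-sample accuracy on uninformative features can shrink as N grows while holdout accuracy stays near chance.

```python
import random
from statistics import mean


def make_noise_data(n, n_features, rng):
    """Synthetic 'users': Gaussian features and labels drawn independently,
    so the features carry no information about the label (assumption)."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n)]
    y = [rng.randint(0, 1) for _ in range(n)]
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid 'model': one mean feature vector per observed class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        if rows:  # skip a class that happens to be absent
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, X, y):
    """Share of rows whose predicted label matches the true label."""
    return mean(1.0 if predict(centroids, x) == t else 0.0 for x, t in zip(X, y))


def learning_curve(sizes, n_features=20, n_test=500, repeats=10, seed=0):
    """For each training size N, average in-sample and holdout accuracy
    over several independently drawn synthetic datasets."""
    rng = random.Random(seed)
    curve = []
    for n in sizes:
        in_sample, holdout = [], []
        for _ in range(repeats):
            X_train, y_train = make_noise_data(n, n_features, rng)
            X_test, y_test = make_noise_data(n_test, n_features, rng)
            model = fit_centroids(X_train, y_train)
            in_sample.append(accuracy(model, X_train, y_train))
            holdout.append(accuracy(model, X_test, y_test))
        curve.append((n, mean(in_sample), mean(holdout)))
    return curve


# Illustrative sizes only; the study itself spans N = 100-3654.
curve = learning_curve(sizes=(100, 400, 1600))
for n, acc_in, acc_out in curve:
    print(f"N={n:5d}  in-sample={acc_in:.3f}  holdout={acc_out:.3f}")
```

With real intervention data, one would replace `make_noise_data` with subsampling from the actual user table and swap in the models of interest. The comparison of in-sample against holdout performance at each N is the part that carries over.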