Zhang Jian, Liu Yujun, Zhang Chao, Chen Yilong, Hu Yao, Yang Xiujia, Liu Wentao, Zhang Wei, Liu Di, Song Huan
Mental Health Center, West China Hospital, Sichuan University, Chengdu, China.
West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China.
Digit Health. 2024 Oct 13;10:20552076241287450. doi: 10.1177/20552076241287450. eCollection 2024 Jan-Dec.
To construct models for predicting the risk of suicidal behavior among individuals with depression, focusing on the progression from no history of suicidal behavior to suicide attempts and from suicidal ideation to suicide attempts.
Based on a prospective cohort from the UK Biobank, a total of 55,139 individuals aged 50 and above with depression were enrolled in the study, among whom 29,528 exhibited suicidal behavior. They were divided into control (25,611), suicidal ideation (24,361), and suicide attempt (5,167) groups. Least absolute shrinkage and selection operator (LASSO) regression was used to identify a subset of important features for distinguishing suicidal ideation and suicide attempts. We used the Gradient Boosting Decision Tree (GBDT) algorithm with stratified 10-fold cross-validation and grid search to construct the prediction models for suicidal ideation and for suicide attempts. To address the dataset imbalance in classifying suicide attempts, we used random under-sampling. SHapley Additive exPlanations (SHAP) were used to estimate the importance of variables in the GBDT model.
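The resampling and cross-validation steps described above can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names `random_undersample` and `stratified_kfold`, the seeds, and the choice to under-sample before splitting are assumptions made for illustration (the abstract does not say whether under-sampling was applied before or within cross-validation). Only the group sizes, 25,611 controls and 5,167 suicide attempts, come from the abstract. LASSO feature selection, GBDT fitting, grid search, and SHAP would normally be handled by dedicated libraries and are not shown.

```python
import random
from collections import Counter, defaultdict


def random_undersample(labels, seed=0):
    """Return sorted indices that keep every class at the minority-class size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        # Draw n_min indices without replacement from each class.
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


def stratified_kfold(labels, n_splits=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios match the full set."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets a near-equal share.
        for pos, idx in enumerate(idxs):
            folds[pos % n_splits].append(idx)
    for k in range(n_splits):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


# Group sizes from the abstract: 25,611 controls (0) vs 5,167 suicide attempts (1).
labels = [0] * 25611 + [1] * 5167

# Assumption: balance the classes first, then split into 10 stratified folds.
balanced_idx = random_undersample(labels, seed=42)
balanced_labels = [labels[i] for i in balanced_idx]
class_counts = Counter(balanced_labels)
splits = list(stratified_kfold(balanced_labels, n_splits=10, seed=42))
```

In this sketch the majority class is reduced to the size of the minority class, and each of the 10 test folds keeps the same class ratio as the balanced sample. A model and its hyperparameter grid would then be fitted on each `train` index list and scored on the matching `test` list.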
Significant differences in sociodemographic, economic, lifestyle, and psychological factors were observed across the three groups. Each classifier performed best with 8 to 11 features. Overall, the algorithms predicting suicide attempts performed better than those predicting suicidal ideation. The GBDT classifier achieved the best performance, with AUROC values of 0.914 for suicide attempts and 0.803 for suicidal ideation. Distinct predictive factors were identified for each group: the inherent characteristics of depression were crucial in distinguishing the suicidal ideation group from controls, whereas key predictors of suicide attempts included the age of depression onset and childhood trauma events.
We established machine learning-based models for predicting suicidal behavior, particularly suicide attempts, in individuals with depression, and clarified how the predictors of suicidal ideation differ from those of suicide attempts.