Al-Srehan Hussein, Ayasrah Mohammad Nayef, Al-Rousan Ayoub Hamdan, Khasawneh Mohamad Ahmad Saleem, Gharaibeh Mahmoud
College of Education, Humanities and Social Sciences, Al Ain University, Abu Dhabi, United Arab Emirates.
Special Education, Al Balqa Applied University, Irbid University College, Department of Educational Sciences, Irbid, Jordan.
Clin Psychol Psychother. 2025 May-Jun;32(3):e70082. doi: 10.1002/cpp.70082.
This study aimed to predict suicidal ideation among youth with autism spectrum disorder (ASD) by applying machine learning techniques. A cross-sectional sample of 368 ASD-diagnosed young people (aged 18-24 years) was recruited, and 34 candidate predictors, including sociodemographic characteristics, psychiatric symptoms (e.g., anxiety problems and depressive symptoms), behavioural measures (e.g., bullying victimization and insomnia severity) and adverse childhood experiences, were assessed using standardized instruments and parent-report checklists. After listwise deletion of missing data, recursive feature elimination (RFE) with a random forest wrapper was performed to identify the five most influential predictors. Four classification algorithms (logistic regression, random forest, eXtreme Gradient Boosting [XGBoost] and support vector machine [SVM]) were then trained on a 70/30 stratified split and evaluated on the hold-out test set using area under the curve (AUC), sensitivity, specificity, positive predictive value, negative predictive value and accuracy. RFE identified anxiety problems, insomnia, bullying victimization, age and depression (PHQ-9) as the top predictors. Logistic regression achieved an AUC of 0.943 (sensitivity = 0.773, specificity = 0.957 and accuracy = 0.922), random forest an AUC of 0.948 (sensitivity = 0.727, specificity = 0.989 and accuracy = 0.939), XGBoost an AUC of 0.930 (sensitivity = 0.772, specificity = 0.989 and accuracy = 0.947) and SVM an AUC of 0.942 (sensitivity = 0.772, specificity = 0.978 and accuracy = 0.939). Across models, anxiety and insomnia emerged as the two most important risk factors. XGBoost demonstrated the best overall balance of performance metrics, with the highest accuracy (0.947), although random forest had the highest AUC (0.948).
Gradient-boosted tree models were thus shown to effectively integrate multidimensional data to predict suicidality in autistic youth, highlighting anxiety and sleep disturbances as critical targets for personalized risk assessment and prevention efforts.
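The analysis steps described in the abstract (a 70/30 stratified split, RFE with a random forest wrapper down to five predictors, several classifiers, and hold-out metrics) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It assumes scikit-learn, uses synthetic data from `make_classification` in place of the study data, and substitutes scikit-learn's `GradientBoostingClassifier` for XGBoost. Listwise deletion is omitted because the synthetic data has no missing values. Fitting the selector on the training rows only is a choice made here to avoid leakage; the abstract does not state which rows were used. All hyperparameters and the `evaluate` helper are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in: 368 rows, 34 candidate predictors, imbalanced outcome.
# None of these values come from the paper.
X, y = make_classification(
    n_samples=368, n_features=34, n_informative=5, n_redundant=0,
    weights=[0.8, 0.2], random_state=0,
)

# 70/30 split, stratified on the outcome.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# RFE with a random forest wrapper, keeping five predictors.
# Fitted on the training rows only (an assumption, see lead-in).
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=5,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)


def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def evaluate(model, X_eval, y_eval):
    """Compute the six hold-out metrics named in the abstract."""
    prob = model.predict_proba(X_eval)[:, 1]
    pred = model.predict(X_eval)
    tn, fp, fn, tp = (int(v) for v in
                      confusion_matrix(y_eval, pred, labels=[0, 1]).ravel())
    return {
        "auc": float(roc_auc_score(y_eval, prob)),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
    }


# GradientBoostingClassifier is a stand-in for XGBoost, not the same library.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "svm": SVC(probability=True, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    results[name] = evaluate(model, X_test_sel, y_test)
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

Because the data are synthetic, the printed numbers bear no relation to the values reported above. The sketch only shows how the reported metrics relate to the hold-out confusion matrix and predicted probabilities.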