Gu Yumeng, Xue Juanjuan, Xia Xiaoshuang, Guo Xiaokun, Wang Zhongyan, Wu Kun, Yue Wei, Chen Nian, Wang Lin, Li Xin
Department of Neurology, Second Hospital of Tianjin Medical University, Tianjin, 300211, China.
Department of Health and Medical & Geriatrics, Second Hospital of Tianjin Medical University, Tianjin, 300211, China.
J Psychiatr Res. 2025 Jul;187:123-133. doi: 10.1016/j.jpsychires.2025.05.015. Epub 2025 May 6.
Post-stroke depression (PSD) is a common psychiatric complication following stroke, with low clinical detection rates and delayed diagnosis. Most existing PSD prediction models suffer from incomplete data inclusion, which limits their clinical predictive value. This study aims to integrate multimodal data, including clinical characteristics, biomarkers, and neuroimaging variables, to validate the potential of machine learning models in efficiently identifying high-risk PSD patients.
This study is based on a multicenter clinical follow-up cohort of patients with acute ischemic stroke (AIS) in China, conducted from December 2020 to September 2023. Predictive factors included demographic characteristics, clinical features, and previously identified neuroimaging variables associated with PSD. The primary outcome was the occurrence of PSD within 3-6 months after stroke. The dataset was divided into a training set and a test set at a 3:1 ratio, with further validation performed using an external dataset. Four machine learning models were implemented in Python: Adaptive Boosting, Gradient Boosting Decision Tree (GBDT), Quadratic Discriminant Analysis, and Multilayer Perceptron Classifier. Their predictive performance was compared based on accuracy metrics.
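The split-and-compare workflow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn (the abstract states only that Python was used), substitutes a synthetic dataset from `make_classification` for the cohort, and leaves all hyperparameters at library defaults. The variable names and the choice of AUC as the comparison metric in this loop are likewise assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the cohort (assumption): 12 numeric predictors and a
# binary outcome with roughly one third positives. No patient data is used.
X, y = make_classification(
    n_samples=2000,
    n_features=12,
    n_informative=6,
    n_redundant=0,  # avoid collinear columns, which QDA handles poorly
    weights=[0.65, 0.35],
    random_state=0,
)

# 3:1 train/test split, stratified so both parts keep the outcome ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four model families named in the abstract, with default settings.
models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GBDT": GradientBoostingClassifier(random_state=0),
    "QDA": QuadraticDiscriminantAnalysis(),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
}

# Fit each model on the training part and score it on the held-out part.
test_auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    positive_proba = model.predict_proba(X_test)[:, 1]
    test_auc[name] = roc_auc_score(y_test, positive_proba)

for name, auc in sorted(test_auc.items(), key=lambda item: -item[1]):
    print(f"{name}: test AUC = {auc:.4f}")
```

An external validation step would reuse the fitted models on a separately collected dataset rather than on another slice of the same split.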
A total of 4298 AIS patients (mean age: 68.33 ± 8.82 years, 46.4% male) were included, among whom 1483 developed PSD. In the test dataset, the GBDT model achieved an area under the curve (AUC) of 0.8626, accuracy of 0.7833, sensitivity of 0.8085, specificity of 0.5296, and an F1-score of 0.6396, outperforming the other models. In the external validation set, the GBDT model also demonstrated superior performance, with an AUC of 0.8185, accuracy of 0.8636, sensitivity of 0.8846, specificity of 0.5285, and an F1-score of 0.6689. The most important predictors of PSD included National Institutes of Health Stroke Scale (NIHSS) score at discharge, left-sided lesions, lacunar infarcts (LIs), homocysteine (HCY) levels, and systolic blood pressure (SBP).
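For readers less familiar with the threshold-based metrics reported above, the sketch below shows how accuracy, sensitivity, specificity, and F1-score follow from a binary confusion matrix. The function name `classification_metrics` and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp/fn: positive cases predicted positive/negative.
    tn/fp: negative cases predicted negative/positive.
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)  # recall: share of positives detected
    specificity = tn / (tn + fp)  # share of negatives correctly ruled out
    precision = tp / (tp + fp)    # share of positive predictions that are right
    accuracy = (tp + tn) / total
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not from the study).
example = classification_metrics(tp=80, fp=30, tn=70, fn=20)
for metric, value in example.items():
    print(f"{metric}: {value:.4f}")
```

AUC is not included here because it depends on the ranking of predicted probabilities across all thresholds, not on a single confusion matrix.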
The machine learning model performs well in predicting PSD. Clinicians should focus on stroke patients with high NIHSS scores, left-sided lesions, LIs, elevated HCY levels, and high SBP to develop personalized and precise management and treatment strategies for high-risk PSD patients, aiming to prevent or delay PSD onset.