Xu Ruo-Fei, Liu Zhen-Jing, Ouyang Shunan, Dong Qin, Yan Wen-Jing, Xu Dong-Wu
School of Mental Health, Wenzhou Medical University, Wenzhou, China.
Zhejiang Provincial Clinical Research Centre for Mental Health, Affiliated Kangning Hospital, Wenzhou Medical University, Wenzhou, 325000, China.
BMC Psychiatry. 2025 Mar 26;25(1):286. doi: 10.1186/s12888-025-06693-8.
To develop, using machine learning, a stratified screening version of the 20-item Center for Epidemiologic Studies Depression Scale (CES-D-20) that maintains diagnostic accuracy while addressing the full scale's efficiency limitations in large-scale applications.
Data were derived from the Chinese Psychological Health Guard Project (primary sample: n = 179,877; age 9-18) and China Labor-force Dynamics Survey (validation samples across age spans). We employed a two-stage machine learning approach: first applying Recursive Feature Elimination with multiple linear regression to identify core predictive items for total depression scores, followed by logistic regression for optimizing depression classification (CES-D ≥ 16). Model performance was systematically evaluated through discrimination (ROC analysis), calibration (Brier score), and clinical utility analyses (decision curve analysis), with additional validation using random forest and support vector machine algorithms across independent samples.
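The first stage described above (recursive feature elimination around a multiple linear regression) can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the elimination rule (drop the item with the smallest absolute coefficient at each round) is an assumption about how items are ranked, and all function names (`solve`, `fit_ols`, `rfe`, `pearson`) are hypothetical helpers written for this example. The second-stage logistic regression is not shown.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y, cols):
    """Least-squares fit of y on the chosen item columns plus an intercept."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)  # beta[0] is the intercept

def rfe(X, y, n_keep):
    """Refit repeatedly, dropping the weakest item until n_keep remain."""
    cols = list(range(len(X[0])))
    while len(cols) > n_keep:
        beta = fit_ols(X, y, cols)
        weakest = min(range(len(cols)), key=lambda k: abs(beta[k + 1]))
        cols.pop(weakest)
    return cols

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Synthetic stand-in for CES-D-20 responses (20 items scored 0-3),
# driven by one latent severity factor. Not real survey data.
random.seed(0)
N_ITEMS, N_PEOPLE = 20, 400
X = []
for _ in range(N_PEOPLE):
    z = random.gauss(0, 1)
    X.append([min(3, max(0, round(1.0 + 0.8 * z + random.gauss(0, 0.7))))
              for _ in range(N_ITEMS)])
total = [sum(row) for row in X]  # full-scale total score

selected = rfe(X, total, n_keep=9)  # nine items, as in the abstract
beta = fit_ols(X, total, selected)
pred = [beta[0] + sum(b * row[c] for b, c in zip(beta[1:], selected)) for row in X]
r = pearson(pred, total)
print("retained item indices:", selected)
print("correlation with full total:", round(r, 3))
```

Because every synthetic item is on the same 0-3 scale, raw coefficients are comparable across items; with items on different scales, standardizing before ranking would be the safer choice.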
The resulting stratified screening system consists of an initial four-item rapid screening layer (encompassing emotional, cognitive, and interpersonal dimensions) for detecting probable depression (AUC = 0.982, sensitivity = 0.945, specificity = 0.926), followed by an enhanced assessment layer with five additional items. Together, these nine items enable accurate prediction of the full CES-D-20 total score (R = 0.957). This stratified approach demonstrated robust generalizability across age groups (R > 0.94, accuracy > 0.91) and time points. Calibration analyses and decision curve analyses confirmed optimal clinical utility, particularly in the critical risk threshold range (0.3-0.6).
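The evaluation quantities reported above (AUC, sensitivity, specificity, Brier score, and the net benefit that underlies decision curve analysis) can be computed as in the following sketch. The labels and predicted probabilities are made-up toy values, not study data, and the function names are hypothetical. Net benefit uses the standard definition TP/n - FP/n x pt/(1 - pt) at threshold probability pt.

```python
def sens_spec(y_true, y_prob, threshold):
    """Sensitivity and specificity when p >= threshold is called positive."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, y_prob):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def net_benefit(y_true, y_prob, pt):
    """Net benefit at threshold probability pt (one point on a decision curve)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= pt)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= pt)
    return tp / n - fp / n * pt / (1 - pt)

# Toy example: 1 = total score at or above the cutoff, 0 = below (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

sens, spec = sens_spec(y_true, y_prob, 0.5)
print("sensitivity:", sens, "specificity:", spec)
print("AUC:", auc(y_true, y_prob))
print("Brier score:", brier(y_true, y_prob))
print("net benefit at pt = 0.5:", net_benefit(y_true, y_prob, 0.5))
```

Evaluating `net_benefit` over a grid of thresholds (for example 0.3 to 0.6) and comparing it against the treat-all and treat-none strategies gives the decision curve.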
This study refines the CES-D by developing a machine learning-derived stratified screening version, an efficient and reliable approach that reduces assessment burden while maintaining excellent psychometric properties. The stratified design makes it particularly valuable for large-scale mental health screening programs, where it enables efficient risk stratification and targeted allocation of further assessment.